Remove all text nodes from XML file

I want to remove all text nodes (but not any other type of node) from an XML file. How can I do this?
Example Input:
<root>
<slideshow id="1">
<Image>hii</Image>
<ImageContent>this</ImageContent>
<Thumbnail>is</Thumbnail>
<ThumbnailContent>A</ThumbnailContent>
</slideshow>
<slideshow id="2">
<Image>hii</Image>
<ImageContent>this</ImageContent>
<Thumbnail>is</Thumbnail>
<ThumbnailContent>B</ThumbnailContent>
</slideshow>
</root>
Expected Output:
<root>
<slideshow id="1">
<Image></Image>
<ImageContent></ImageContent>
<Thumbnail></Thumbnail>
<ThumbnailContent></ThumbnailContent>
</slideshow>
<slideshow id="2">
<Image></Image>
<ImageContent></ImageContent>
<Thumbnail></Thumbnail>
<ThumbnailContent></ThumbnailContent>
</slideshow>
</root>

How about:
var doc = XDocument.Load("test.xml");
doc.DescendantNodes()
.Where(x => x.NodeType == XmlNodeType.Text ||
x.NodeType == XmlNodeType.CDATA)
.Remove();
doc.Save("clean.xml");
EDIT: The above was written before I realized that XCData derives from XText, which allows the simpler:
var doc = XDocument.Load("test.xml");
doc.DescendantNodes()
.OfType<XText>()
.Remove();
doc.Save("clean.xml");
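To try the XText approach without touching the file system, a minimal sketch could look like the following. The names TextNodeStripper and StripText are hypothetical, and the XML is passed in as a string instead of being loaded from test.xml.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class TextNodeStripper
{
    // Hypothetical helper: parses the XML, removes every text and CDATA node,
    // and returns the modified document. Elements and attributes are kept.
    public static XDocument StripText(string xml)
    {
        var doc = XDocument.Parse(xml);

        // XCData derives from XText, so OfType<XText>() matches both kinds.
        // Remove() on a sequence of nodes takes a snapshot before removing,
        // so modifying the tree while enumerating is not a problem here.
        doc.DescendantNodes().OfType<XText>().Remove();

        return doc;
    }
}
```

Calling TextNodeStripper.StripText(xml).Save("clean.xml") would then write the result to disk, as in the answer above.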

This question should help: Linq to XML - update/alter the nodes of an XML Document
You can use Linq to open the document and alter the values or remove the nodes altogether.

Related

C# Linq to Xml Sort elements inside a node

My xml file looks like this:
<Root>
<Child>
<SubChild>
<Item Sequence="2">Value2</Item>
<Item Sequence="1">Value1</Item>
<Node Sequence="1">First</Node>
<Node Sequence="3">Third</Node>
<Node Sequence="2">Second</Node>
<Url>https://url.com</Url>
</SubChild>
</Child>
</Root>
I want my result to be in this order
<Root>
<Child>
<SubChild>
<Item Sequence="1">Value1</Item>
<Item Sequence="2">Value2</Item>
<Node Sequence="1">First</Node>
<Node Sequence="2">Second</Node>
<Node Sequence="3">Third</Node>
<Url>https://url.com</Url>
</SubChild>
</Child>
</Root>
I can get to the node fine, but I am having trouble sorting the elements while maintaining their position. How can I order the Item and Node elements and still keep their place within the SubChild node? I need the Item elements first, followed by the Node elements, and then Url.
This is what I tried.
var xdoc = new XmlDocument();
xdoc.LoadXml(xmlStr);
var doc = XDocument.Parse(xdoc.OuterXml);
var subChild = doc.Descendants("Root").Descendants("Child").Descendants("SubChild");
subChild.Elements("Item").OrderBy(x => Convert.ToInt32(x.Attribute("Sequence")));
subChild.Elements("Node").OrderBy(x => Convert.ToInt32(x.Attribute("Sequence")));

Get the <SubChild> element with the XPath "/Root/Child/SubChild" (subChild). It is used for the update in the last step.
Extract the descendants of the <SubChild> element.
Order those descendants by the element's LocalName (using its index in the elementOrder array), then by Sequence.
Replace the child nodes of subChild with the ordered elements via .ReplaceNodes().
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;
var doc = XDocument.Parse(xmlStr);
var subChild = doc.XPathSelectElement("/Root/Child/SubChild");
// Desired order of the element names inside <SubChild>.
string[] elementOrder = { "Item", "Node", "Url" };
var sorted = subChild.Descendants()
.OrderBy(x => Array.IndexOf(elementOrder, x.Name.LocalName))
// Elements without a Sequence attribute (such as Url) sort as 0.
.ThenBy(x => (int?)x.Attribute("Sequence") ?? 0)
.ToList();
subChild.ReplaceNodes(sorted);

Generate xml from existing xml without one node

I want to generate an xml from existing one but remove one node by Id:
My xml is:
<PartyList>
<Party Id="1" In="true" Out="true"/>
<Party Id="2" In="true" Out="false"/>
<Party Id="3" In="true" Out="true"/>
</PartyList>
I tried to select the node with the following, but can't remove it:
xmlNode = xmlDoc.SelectSingleNode("/PartyList/Party[@Id='3']");
How can I remove it? And is there a better way using LINQ to XML?

Removing the selected element from the XmlDocument can be done as follows:
xmlNode = xmlDoc.SelectSingleNode("/PartyList/Party[@Id='3']");
xmlNode.ParentNode.RemoveChild(xmlNode);
xmlDoc.Save("path_for_the_updated_file.xml");
Or using LINQ-to-XML's XDocument :
var doc = XDocument.Load("path_to_your_xml_file.xml");
doc.Root
.Elements("Party")
.First(o => (int)o.Attribute("Id") == 3)
.Remove();
doc.Save("path_for_the_updated_file.xml");
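Note that First() throws when no element matches. If that case can occur, one possible variant is sketched below; the names PartyRemover and RemoveParty are hypothetical, and the sketch assumes every Id attribute that is present holds an integer.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class PartyRemover
{
    // Hypothetical helper: removes every <Party> child of the root whose Id
    // equals the given value and returns how many elements were removed.
    public static int RemoveParty(XDocument doc, int id)
    {
        var matches = doc.Root
            .Elements("Party")
            // The (int?) cast yields null instead of throwing when Id is missing.
            .Where(p => (int?)p.Attribute("Id") == id)
            .ToList(); // materialize first so the tree can be modified safely

        foreach (var match in matches)
        {
            match.Remove(); // detaches the element from its parent
        }

        return matches.Count;
    }
}
```

A return value of 0 means nothing matched, so the caller can decide whether that is an error.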

Select Parent XML(Entire Hierarchy) Elements based on Child element values LINQ

I have the following XML and query it by Id. How do I get the parent hierarchy?
<Child>
<Child1 Id="1">
<Child2 Id="2">
<Child3 Id="3">
<Child4 Id="4">
<Child5 Id="5"/>
<Child6 Id="6"/>
</Child4>
</Child3>
</Child2>
</Child1>
</Child>
If I query for Id = 4, how can I use LINQ to find that element's parent elements and produce the following output, keeping the hierarchy?
<Child>
<Child1 Id="1">
<Child2 Id="2">
<Child3 Id="3">
<Child4 Id="4"/>
</Child3>
</Child2>
</Child1>
</Child>
Thanks In Advance.

Assume you want just one node parent tree:
string xml = @"<Child>
<Child1 Id='1'>
<Child2 Id='2'>
<Child3 Id='3'>
<Child4 Id='4'>
<Child5 Id='5'/>
<Child6 Id='6'/>
</Child4>
</Child3>
</Child2>
</Child1>
</Child>";
TextReader tr = new StringReader(xml);
XDocument doc = XDocument.Load(tr);
IEnumerable<XElement> myList =
from el in doc.Descendants()
where (string)el.Attribute("Id") == "4" // here whatever you want
select el;
// select your hero element in some way
XElement hero = myList.FirstOrDefault();
foreach (XElement ancestor in hero.Ancestors())
{
Console.WriteLine(ancestor.Name); // rebuild your tree in a separate document, I print ;)
}
To do this for every element of the tree, run the select query without the where clause and execute the foreach loop for each element it returns.
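The loop above only prints the ancestor names. Rebuilding the hierarchy as a separate tree could be sketched as follows; AncestorChain and BuildPath are hypothetical names, and the sketch copies element names and attributes only, leaving the source document unchanged.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class AncestorChain
{
    // Hypothetical helper: returns a new tree that contains only the path
    // from the root element down to the target element.
    public static XElement BuildPath(XElement target)
    {
        // Copy the target with its attributes but without any child nodes.
        var current = new XElement(target.Name, target.Attributes());

        // Ancestors() yields the nearest parent first, so each step wraps
        // the tree built so far in a copy of the next ancestor.
        foreach (var ancestor in target.Ancestors())
        {
            current = new XElement(ancestor.Name, ancestor.Attributes(), current);
        }

        return current;
    }
}
```

With the sample document, AncestorChain.BuildPath(hero) returns a Child element whose single nested chain ends at an empty Child4 with Id 4. Check hero for null first, because FirstOrDefault returns null when nothing matches.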

Based on the sample XML provided, you could walk up the tree to find the parent node once you've found the node in question:
string xml =
@"<Child>
<Child1 Id='1'>
<Child2 Id='2'>
<Child3 Id='3'>
<Child4 Id='4'>
<Child5 Id='5'/>
<Child6 Id='6'/>
</Child4>
</Child3>
</Child2>
</Child1>
</Child>";
var doc = XDocument.Parse( xml );
// assumes there will always be an Id attribute for each node
// and there will be an Id with a value of 4
// otherwise an exception will be thrown.
XElement el = doc.Root.Descendants().First( x => x.Attribute( "Id" ).Value == "4" );
// discard all child nodes
el.RemoveNodes();
// walk up the tree to find the parent; when the
// parent is null, then the current node is the
// top most parent.
while( true )
{
if( el.Parent == null )
{
break;
}
el = el.Parent;
}

In Linq to XML there is a method called AncestorsAndSelf on XElement that
Returns a collection of elements that contain this element, and the
ancestors of this element.
But it will not transform your XML tree the way you want it.
What you want is:
For a given element, find the parent
Remove all elements from parent but the given element
Remove all elements from the given element
Something like this in Linq (no error handling):
XDocument doc = XDocument.Parse("<xml content>");
//finding element having 4 as ID for example
XElement el = doc.Descendants().First(e => e.Attribute("Id").Value == "4");
el.RemoveNodes();
XElement parent = el.Parent;
parent.RemoveNodes();
parent.Add(el);
[Edit]
doc.ToString() should then give you the result as a string.
[Edit]
Use RemoveNodes rather than RemoveAll, because RemoveAll also removes attributes.
The child nodes of the chosen element are now removed as well.

I found the following way
XElement elementNode = element.Descendants()
.FirstOrDefault(e => (string)e.Attribute("Id") == "4");
elementNode.RemoveNodes();
while (elementNode.Parent != null)
{
XElement lastNode = new XElement(elementNode);
elementNode = elementNode.Parent;
elementNode.RemoveNodes();
elementNode.DescendantsAndSelf().Last().AddFirst(lastNode);
}
Then return or print elementNode.

Linq to XDocument Group by subset

I am looking for a linq to Xdoc query to group by a subset of the XML nodes. I've only been able to get this working to return a subset of the data but I need the entire xml document passed back with only the particular nodes grouped.
<Root>
<Elementname1>
</Elementname1>
<Elementname2>
</Elementname2>
<Elementname3 attrname="test1">
<Child>
</Child>
</Elementname3>
<Elementname3 attrname="test1">
<Child>
</Child>
</Elementname3>
</Root>
This code:
var result =
from row in xDoc.Descendants("Elementname3")
group row by (string)row.Attribute("attrname") into g
select g.First();
returns:
<Elementname3 attrname="test1">
<Child></Child>
</Elementname3>
Expecting:
<Root>
<Elementname1>
</Elementname1>
<Elementname2>
</Elementname2>
<Elementname3 attrname="test1">
<Child>
</Child>
</Elementname3>
</Root>
I understand that this happens because Descendants starts at Elementname3; I'm just not sure how to expand the LINQ query so that it starts at the root node and groups as expected.

Try this:
var result = new XDocument(
new XElement("Root",
from x in doc.Root.Elements()
group x by new { x.Name, Attr = (string)x.Attribute("attrname") } into g
select g.First()
)
);

Search XML doc with LINQ

I have an xml doc similar to this:
<Root>
<MainItem ID="1">
<SubItem></SubItem>
<SubItem></SubItem>
<SubItem></SubItem>
</MainItem>
<MainItem ID="2">
<SubItem></SubItem>
<SubItem></SubItem>
<SubItem></SubItem>
</MainItem>
...
</Root>
I want to return the whole of the MainItem element based on the value of attribute ID.
So effectively if Attribute ID is equal to 2, then give me that MainItem element back.
I can't work out how to do this with LINQ.
There seems to be a load of information on google, but I just can't quite seem to find what I'm looking for.
Little help ?
TIA
:-)

It could be something like this:
XDocument doc = XDocument.Load("myxmlfile.xml");
XElement mainElement = doc.Element("Root")
.Elements("MainItem")
.First(e => (int)e.Attribute("ID") == 2);
// additional work

How about this:
// load your XML
XDocument doc = XDocument.Load(@"D:\linq.xml");
// find element which has a ID=2 value
XElement mainItem = doc.Descendants("MainItem")
.Where(mi => mi.Attribute("ID").Value == "2")
.FirstOrDefault();
if(mainItem != null)
{
// do whatever you need to do
}
Marc

I changed your XML slightly to have values:
<?xml version="1.0"?>
<Root>
<MainItem ID="1">
<SubItem>value 1</SubItem>
<SubItem>val 2</SubItem>
<SubItem></SubItem>
</MainItem>
<MainItem ID="2">
<SubItem></SubItem>
<SubItem></SubItem>
<SubItem></SubItem>
</MainItem>
</Root>
And with this LINQ:
XDocument xmlDoc = XDocument.Load(@"C:\test.xml");
var result = from mainitem in xmlDoc.Descendants("MainItem")
where mainitem.Attribute("ID").Value == "1"
select mainitem;
foreach (var subitem in result.First().Descendants())
{
Console.WriteLine(subitem.Value);
}
Console.Read();

From here: How to: Filter on an Attribute (XPath-LINQ to XML)
// LINQ to XML query
IEnumerable<XElement> list1 =
from el in items.Descendants("MainItem")
where (string)el.Attribute("ID") == "2"
select el;
// XPath expression
IEnumerable<XElement> list2 = items.XPathSelectElements(".//MainItem[@ID='2']");
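The two forms can be compared side by side in a small self-contained sketch. MainItemFinder, ByQuery and ByXPath are hypothetical names, and items is assumed to be the loaded XDocument (or any XContainer). Both methods return null when no MainItem has the requested ID.

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class MainItemFinder
{
    // LINQ to XML query: the (string) cast returns null for a missing
    // attribute, so elements without an ID are skipped instead of throwing.
    public static XElement ByQuery(XContainer items, string id) =>
        items.Descendants("MainItem")
             .FirstOrDefault(el => (string)el.Attribute("ID") == id);

    // XPath expression: the id is embedded in the expression text, so this
    // sketch assumes it contains no single quote.
    public static XElement ByXPath(XContainer items, string id) =>
        items.XPathSelectElement($".//MainItem[@ID='{id}']");
}
```

Both calls return the same element instance from the tree, so changes made through either result affect the original document.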
