C# Identify parent,child elements in XML file

I found this on the internet:
string xml = @"
<food>
<child>
<nested />
</child>
<child>
<other>
</other>
</child>
</food>
";
XmlReader rdr = XmlReader.Create(new System.IO.StringReader(xml));
while (rdr.Read())
{
if (rdr.NodeType == XmlNodeType.Element)
{
Console.WriteLine(rdr.LocalName);
}
}
The result of the above will be
food
child
nested
child
other
This works perfectly, but I need to identify which elements contain child elements.
For example, I need this output:
startParent_food
startParent_child
nested
endParent_child
startParent_child
other
endParent_child
endParent_food

You can do this with XmlReader, but it won't be particularly easy. Because XmlReader is forward-only, you can't know whether an element has children without reading further, so you'd have to buffer and track state yourself. Unless you have a good reason to use such a low-level API, I'd strongly suggest you avoid it.
This is fairly trivial with LINQ to XML:
private static void Dump(XElement element, int level)
{
var space = new string(' ', level * 4);
if (element.HasElements)
{
Console.WriteLine("{0}startParent_{1}", space, element.Name);
foreach (var child in element.Elements())
{
Dump(child, level + 1);
}
Console.WriteLine("{0}endParent_{1}", space, element.Name);
}
else
{
Console.WriteLine("{0}{1}", space, element.Name);
}
}
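For comparison, the XmlReader route can be sketched as follows. This is a minimal sketch rather than a drop-in solution: the class name ParentChildReader, the helper Frame, and the method Describe are hypothetical names chosen for illustration. It defers the decision about each element until either a child element or its end tag is seen, which is the kind of tracking mentioned above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ParentChildReader
{
    // Hypothetical helper: one entry per element that is currently open.
    private sealed class Frame
    {
        public string Name;
        public bool IsParent; // set once a child element has been seen
    }

    public static List<string> Describe(string xml)
    {
        var lines = new List<string>();
        var open = new Stack<Frame>();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // Seeing any element proves that the enclosing element is a parent.
                    if (open.Count > 0 && !open.Peek().IsParent)
                    {
                        open.Peek().IsParent = true;
                        lines.Add("startParent_" + open.Peek().Name);
                    }
                    if (reader.IsEmptyElement)
                        lines.Add(reader.LocalName); // e.g. <nested />, which has no end tag
                    else
                        open.Push(new Frame { Name = reader.LocalName });
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    // A closing element that never saw a child is a leaf.
                    Frame closed = open.Pop();
                    lines.Add(closed.IsParent ? "endParent_" + closed.Name : closed.Name);
                }
            }
        }
        return lines;
    }
}
```

Calling ParentChildReader.Describe(xml) on the sample from the question and writing each entry with Console.WriteLine should give the requested startParent_/endParent_ lines, without the indentation that Dump adds.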
If, as you imply in your comment, your actual requirement is to modify some values then you can do this without any need to process the details of the XML structure. For example, to modify the value of your nested element:
var doc = XDocument.Parse(xml);
var target = doc.Descendants("nested").Single();
target.Value = "some text";
var result = doc.ToString();
See this fiddle for a demo of both.

For checking child elements, your code will look something like below:
System.Xml.Linq.XElement _x;
_x = System.Xml.Linq.XElement.Parse(xml);
if (_x.HasElements)
{
// your req element
}
You will have to make it recursive to check all elements.
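As a sketch of that idea, LINQ to XML can also do the walk for you: DescendantsAndSelf() visits the element and everything below it, so filtering on HasElements lists every element that has child elements. The class and method names here (ParentFinder, ParentNames) are made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ParentFinder
{
    // Returns the local names of all elements that contain at least one
    // child element, in document order.
    public static List<string> ParentNames(string xml)
    {
        XElement root = XElement.Parse(xml);
        return root.DescendantsAndSelf()
                   .Where(e => e.HasElements)
                   .Select(e => e.Name.LocalName)
                   .ToList();
    }
}
```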

Related

I am having trouble moving data from an XML file into an ARRAY with CDATA node type

As per the title, I am having issues getting data from an XML file with CDATA elements into an array.
Based on my current limited understanding of how to do it, I came up with the basic working method below.
CDATA is odd, so my normal methods didn't work: my normal route of finding the nodes wasn't stopping on them, and then there is the whole CDATA issue.
XmlTextReader xmlReader = new XmlTextReader(FilePath);
while (xmlReader.Read())
{
// Position the reader on the OrderNumber node
xmlReader.ReadToFollowing("quoteNumber");
XmlReader inner = xmlReader.ReadSubtree();
while (inner.Read())
{
switch (xmlReader.NodeType)
{
case XmlNodeType.CDATA:
Globals.COData[0] = inner.Value;
break;
}
}
xmlReader.ReadToFollowing("orderNumber");
inner = xmlReader.ReadSubtree();
while (inner.Read())
{
switch (xmlReader.NodeType)
{
case XmlNodeType.CDATA:
Globals.COData[1] = inner.Value;
break;
}
}
But I have many many data elements to fetch and assume there is a better way. File looks like:
And the relevant portion:
<quoteNumber>
<![CDATA[ John Test 123]]>
</quoteNumber>
<orderNumber>
<![CDATA[ 1352738]]>
</orderNumber>
The item contained does have a closing element at file end. The entire XML is too large to post.
The XML format is not in my control.
My end goal is to get the OrderNumber and its value into an array. And the Quote number and its value. I am used to seeing <OrderNumber>123</OrderNumber> so CDATA nodes are new to me.

It's not entirely clear where you are going wrong because you don't share your complete XML, but you are not checking the return value from XmlReader.ReadToFollowing(string) from inside your Read() loop. Thus, once you read past the last <orderNumber>, you will get an exception when another <quoteNumber> is not found.
I would suggest restructuring your code as follows:
var ns = ""; // Replace with @"http://intelliquip.com/integrationS..."; the full namespace is not visible in the XML image.
var list = new List<Tuple<string, string>>(); // List of (quoteNumber, orderNumber) values.
var xmlReader = XmlReader.Create(FilePath);
while (xmlReader.ReadToFollowing("quoteNumber", ns))
{
string quoteNumber = null;
string orderNumber = null;
using (var inner = xmlReader.ReadSubtree())
{
// We need to skip the insignificant whitespace around the CDATA nodes which ReadElementContentAsString() will not do.
while (inner.Read())
{
switch (inner.NodeType)
{
case XmlNodeType.Text:
case XmlNodeType.CDATA:
quoteNumber += inner.Value;
break;
}
}
// After ReadSubtree() the reader is positioned on the </quoteNumber> element end.
}
// If the next orderNumber node is missing, ReadToFollowing() will read all the way past the next quoteNumber node.
// Use ReadToNextSibling() instead.
if (xmlReader.ReadToNextSibling("orderNumber", ns))
{
using (var inner = xmlReader.ReadSubtree())
{
while (inner.Read())
{
switch (inner.NodeType)
{
case XmlNodeType.Text:
case XmlNodeType.CDATA:
orderNumber += inner.Value;
break;
}
}
}
}
if (quoteNumber != null && orderNumber != null)
list.Add(Tuple.Create(quoteNumber, orderNumber));
else
{
// Add error handling here
}
}
Notes:
CDATA is just an alternate way of encoding an XML character data node, see What does <![CDATA[]]> in XML mean? for details. XmlReader.Value will contain the unescaped value of an XML character data node regardless of whether it is encoded as a regular text node or a CDATA node.
It is unclear from your question whether there must be exactly one <quoteNumber> node in the XML file. Because of that, I read the quote and order number pairs into a List<Tuple<string, string>>. After reading is complete, you can check how many were read and add them to Globals.COData as appropriate.
XmlReader.ReadToFollowing() returns
true if a matching element is found; otherwise false and the XmlReader is in an end of file state.
Thus its return value needs to be checked to make sure you don't try to read past the end of the file.
Your code doesn't attempt to handle situations where an <orderNumber> is missing. If one is, the code may skip all the way past the next <quoteNumber> to read its order number. To avoid this possibility I use XmlReader.ReadToNextSibling() to limit the scope of the search to <orderNumber> nodes belonging to the same parent node.
By using XmlReader.ReadToFollowing("orderNumber") you hardcode your code to assume that the orderNumber node(s) have no namespace prefix. Rather than doing that, it would be safer to explicitly indicate the namespace they are in which seems to be something like http://intelliquip.com/integrationS... where the ... portion is not shown.
I recommend using XmlReader.ReadToFollowing("orderNumber", ns) where ns is the namespace the order and quote nodes are actually in.
Microsoft has recommended XmlReader.Create() over XmlTextReader since .NET Framework 2.0, so use XmlReader.Create() instead.
The XmlReader API is rather fussy to use. If your XML files are not large you might consider loading them into an XDocument and using LINQ to XML to query it.
For instance, your XmlReader code could be rewritten as follows:
var doc = XDocument.Load(FilePath);
XNamespace ns = ""; // Replace with @"http://intelliquip.com/integrationS..."; the full namespace is not visible in the XML image.
var query = from quote in doc.Descendants(ns + "quoteNumber")
let order = quote.ElementsAfterSelf(ns + "orderNumber").FirstOrDefault()
where order != null
select Tuple.Create(quote.Value, order.Value);
var list = query.ToList();
Which looks much simpler.
You might also consider replacing the Tuple<string, string> with a proper data model such as
public class Order
{
public string QuoteNumber { get; set; }
public string OrderNumber { get; set; }
}
Demo fiddle #1 here for XmlReader and #2 here for LINQ to XML.
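Putting the last two suggestions together, a sketch of the LINQ to XML query that projects into the Order class might look like this. It assumes the caller passes the namespace the elements are really in (XNamespace.None if there is none) and trims the whitespace that surrounds the CDATA sections. OrderParser and ReadOrders are illustrative names, not part of any library.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Order
{
    public string QuoteNumber { get; set; }
    public string OrderNumber { get; set; }
}

public static class OrderParser
{
    // Pairs each quoteNumber with the first orderNumber sibling that follows it.
    public static List<Order> ReadOrders(string xml, XNamespace ns)
    {
        XDocument doc = XDocument.Parse(xml);
        var query = from quote in doc.Descendants(ns + "quoteNumber")
                    let order = quote.ElementsAfterSelf(ns + "orderNumber").FirstOrDefault()
                    where order != null
                    select new Order
                    {
                        // Value concatenates text and CDATA content; Trim removes the padding.
                        QuoteNumber = quote.Value.Trim(),
                        OrderNumber = order.Value.Trim()
                    };
        return query.ToList();
    }
}
```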

Get all child element values of specific node using XPath

I'm using XPath to read elements from an XML document. Specifically, I want to return the values of every element that is a child of a specified element (here the specified element is <SceneryType>, and these elements have single-digit values). So, for example, I want to return all of the children of <SceneryType> 1.
Here is the XML:
<MissionObjectives>
<Theme themeName="Gothic">
<SceneryType>
1
<Objective>
Do a river thing.
</Objective>
<Objective>
Get all men to the other side of the river.
</Objective>
</SceneryType>
<SceneryType>
2
<Objective>
Climb some trees!
</Objective>
<Objective>
Shoot the tree!
</Objective>
</SceneryType>
</Theme>
</MissionObjectives>
I've tried various ways of getting these child elements, but I can't figure it out. The //Objective part of my expression seems to return everything from the root, yet the iterator never runs, which seems odd. Shouldn't it loop through every element if the expression returns a node list of all the elements?
XPathDocument missionDoc = new XPathDocument(objectivesPath + "MissionObjectives" + chosenTheme + ".xml");
XPathNavigator nav = missionDoc.CreateNavigator();
foreach (Scenery scenery in world.currentWorld)
{
int sceneryType = scenery.type;
XPathExpression expr = nav.Compile($"MissionObjectives/Theme/SceneryType[text()='{sceneryType}']//Objective");
XPathNodeIterator iterator = nav.Select(expr);
while (iterator.MoveNext())
{
XPathNavigator nav2 = iterator.Current.Clone();
compatibleObjectivesList.Add(nav2.Value);
}
}
I've tried looking through Stack Overflow for similar questions but I can't seem to find anything which applies to XPath. I can't use LINQ to XML for this. Any idea how I can return all the values of the various 'Objective' nodes?
Cheers for any help!

It's much simpler to use XDocument:
var doc = XDocument.Load(objectivesPath + "MissionObjectives" + chosenTheme + ".xml");
To get the first SceneryType element (with all of its child nodes):
var node = doc.XPathSelectElement("//MissionObjectives/Theme/SceneryType[1]");
To get the second Objective node:
var node = doc.XPathSelectElement("//MissionObjectives/Theme/SceneryType/Objective[2]");
More XPath samples

For one, your xml data has carriage returns, line feeds, and white spaces in the search element's text node. Keep in mind, that an XML node can be an element, attribute, or text (among other node types). The solution below is a bit on the "long-handed" side and perhaps a little "hacky", but it should work. I wasn't certain if you wanted the child element text data or the entire child element, but I return just the child text node data (without carriage returns and line feeds). Also, while this solution DOES NOT use LINQ to XML in the strictest sense, it does use one LINQ expression.
private List<string> getSceneryTypeObjectiveTextList(string xml, int sceneryTypeId, string xpath = "/MissionObjectives/Theme/SceneryType")
{
List<string> result = null;
XmlDocument doc = null;
XmlNodeList sceneryTypeNodes = null;
try
{
doc = new XmlDocument();
doc.LoadXml(xml);
sceneryTypeNodes = doc.SelectNodes(xpath);
if (sceneryTypeNodes != null)
{
if (sceneryTypeNodes.Count > 0)
{
foreach (XmlNode sceneryTypeNode in sceneryTypeNodes)
{
if (sceneryTypeNode.HasChildNodes)
{
var textNode = from XmlNode n in sceneryTypeNode.ChildNodes
where (n.NodeType == XmlNodeType.Text && n.Value.Replace("\r", "").Replace("\n", "").Replace(" ", "") == sceneryTypeId.ToString())
select n;
if (textNode.Count() > 0)
{
XmlNodeList objectiveNodes = sceneryTypeNode.SelectNodes("Objective");
if (objectiveNodes != null)
{
result = new List<string>(objectiveNodes.Count);
foreach (XmlNode objectiveNode in objectiveNodes)
{
result.Add(objectiveNode.InnerText.Replace("\r", "").Replace("\n", "").Trim());
}
// Could break out of the iteration, here, if we know that SceneryType is always unique (i.e. - no duplicates in Element text node)
}
}
}
}
}
}
}
catch (Exception ex)
{
// Handle error
}
finally
{
}
return result;
}
private void sampleCall(string filePath, int sceneryTypeId)
{
List<string> compatibleObjectivesList = null;
try
{
compatibleObjectivesList = getSceneryTypeObjectiveTextList(File.ReadAllText(filePath), sceneryTypeId);
}
catch (Exception ex)
{
// Handle error
}
finally
{
}
}
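If XPath is a hard requirement, the whitespace problem described above can probably be handled inside the expression itself with the XPath 1.0 function normalize-space(), which trims and collapses whitespace before the comparison. The following is a sketch under that assumption. ObjectiveFinder and FindObjectives are hypothetical names, and it uses XPathNavigator rather than LINQ to XML.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class ObjectiveFinder
{
    public static List<string> FindObjectives(string xml, int sceneryType)
    {
        var result = new List<string>();
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        // normalize-space(text()) takes the first text node of SceneryType,
        // i.e. the number before the first <Objective>, with whitespace stripped.
        string xpath = "/MissionObjectives/Theme/SceneryType[normalize-space(text())='"
                       + sceneryType + "']/Objective";

        XPathNodeIterator iterator = nav.Select(xpath);
        while (iterator.MoveNext())
        {
            // The element value still carries the line breaks around the text.
            result.Add(iterator.Current.Value.Trim());
        }
        return result;
    }
}
```

The only change to the expression from the question is the predicate: text()='1' compares against the raw text node, line breaks included, so it never matches the sample document.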

LINQ: How to return all child elements?

For an application I am working on, I have to display data from an XML File. There's a few transformations being done, but eventually the end result will be displayed in a treeview. When a user then clicks on a node, I want to pop up the details in a listview.
When no node has been selected, I basically use LINQ to grab the details of the first item I encounter.
Here's a simplified version of my XML
<root>
<parent label="parent1">
<child label="child1">
<element1>data</element1>
<element2>data</element2>
...
</child>
<child label="child2">
<element1>data</element1>
<element2>data</element2>
...
</child>
</parent>
</root>
And here's the code used to grab it (after selecting the parent node that the treeview has been set to, by means of XPathSelectElement):
protected void listsSource_Selecting(object sender, LinqDataSourceSelectEventArgs e)
{
XElement rootElement = XElement.Load(MapPath(TreeSource.DataFile));
rootElement = rootElement.XPathSelectElement("//parent[@label='parent1']");
XElement parentElement;
parentElement = rootElement;
var query = (from itemElement in parentElement.Descendants("child")
select new
{
varElement1 = itemElement.Element("element1").Value,
varElement2 = itemElement.Element("element2").Value,
...
}).Take(1);
e.Result = query;
}
This works a treat, and I can read out the varElement1 and varElement2 values from there. However, when I try and implement a similar mechanism for when the user actually did select a node, I seem to run into a wall.
My approach was to use another XPathSelectElement call to get to the actual node:
parentElement = rootElement.XPathSelectElement("//child[@label='" + tvwChildren.SelectedNode.Text + "']");
But I am kind of stumped on how to now get a proper LINQ query built up to read in all elements nested under the child node. I tried using parentElement.Elements(), but that was yielding an error. I also looked at using Nodes(), but with similar results.
I suppose I could use a foreach loop to access the nodes, but then I'm not sure how to get the results into a LINQ query so I can return the same e.Result = query back.
I'm fairly new to LINQ, as you might have guessed, so any hints would be very much appreciated.

Here's the query that will give you the child element (given that there is only one child element with the specified label):
var childElement = rootNode.Descendants("child")
.Single(e=>e.Attribute("label").Value == "child1");
If more than one child element has label="child1" but those elements are under different parent elements, you can use the same approach to get the parent element first and then the child element.
Having the above, you can use this query to get all element nodes under the child node:
var elements = childElement.Descendants().Select(e=>e.Value);
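To return something closer in shape to the original query, one option is to project each nested element into a name/value pair. This is only a sketch: ChildDetails and ReadDetails are invented names, and it simply takes the first child element with a matching label. Casting the attribute to string avoids a NullReferenceException on elements that have no label attribute.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ChildDetails
{
    // Returns (element name, element value) pairs for the child with the given label,
    // or an empty list when no such child exists.
    public static List<KeyValuePair<string, string>> ReadDetails(XElement root, string label)
    {
        XElement child = root.Descendants("child")
                             .FirstOrDefault(e => (string)e.Attribute("label") == label);
        if (child == null)
            return new List<KeyValuePair<string, string>>();

        return child.Elements()
                    .Select(e => new KeyValuePair<string, string>(e.Name.LocalName, e.Value))
                    .ToList();
    }
}
```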

I think data binding is much easier in this case.
XDocument doc = XDocument.Load(filePath);
if (doc.Root == null)
{
throw new ApplicationException("invalid data");
}
tvwChildren.Source=doc;
But if you want to build the tree manually, I hope the following helps (it is not an exact solution):
XElement root = XElement.Load("Employees.xml");
TreeNode rootNode = new TreeNode(root.Name.LocalName);
treeView1.Nodes.Add(rootNode);
foreach(XElement employee in root.Elements())
{
TreeNode employeeNode = new TreeNode("Employee ID :" + employee.Attribute("employeeid").Value);
rootNode.Nodes.Add(employeeNode);
if (employee.HasElements)
{
foreach(XElement employeechild in employee.Descendants())
{
TreeNode childNode = new TreeNode(employeechild.Value);
employeeNode.Nodes.Add(childNode);
}
}
}
You can also try the ReSharper tool to create better LINQ statements. It shows the possible ones, and you can easily convert each for or foreach loop into a LINQ statement.

I'm not entirely sure I understand what you're trying to do, but it sounds like it could be this:
var data =
from p in xml.Root.Elements("parent")
where p.Attribute("label").Value == "parent1"
from c in p.Elements("child")
where c.Attribute("label").Value == "child2"
from d in c.Elements()
select d.Value;
Let me know if that helps.

Using this Xml library you can write your XPath like:
XElement child = rootElement.XPathElement(
"//parent[@label={0}]/child[@label={1}]", "parent1", "child2");

XmlReader to return node as-is without children

I'm traversing a large XML document using XmlReader and stitching it into a much smaller and more manageable XmlDocument. Along the way, I find a node that's interesting, so to move it I do this:
targetDoc.LoadXml("<result></result>");
// Some interesting code removed
using (XmlReader r = XmlReader.Create(file))
{
while (r.Read())
{
if (r.NodeType == XmlNodeType.Element)
{
if (r.Name == match)
{
// Put the node into the target document
targetDoc.FirstChild.InnerXml = r.ReadOuterXml();
return targetDoc;
}
}
}
}
This is all well and good, except I'd like to include the node without its descendants. What I'm interested in is the node itself with its attributes. The descendants are very large, bulky and uninteresting at this point. (And reading them into memory all at once will cause out-of-memory errors...)
Is there an easy way to get the text (?) of the found element with its attributes, but not its descendants, into the target document?

I don't think there's a built-in way of doing it. I think you have to read out the attributes and content yourself.
e.g.
static void Main(string[] args)
{
var xml = @"<root>
<parent a1 = 'foo' a2 = 'bar'>Some Parent text
<child a3 = 'frob' a2= 'brob'> Some Child Text
</child>
</parent>
</root>";
var file = new StringReader(xml) ;
using (XmlReader r = XmlReader.Create(file))
{
while (r.Read())
{
if (r.NodeType == XmlNodeType.Element)
{
if (r.Name == "parent")
{
var output = new StringBuilder();
var settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
using (var elementWriter = XmlWriter.Create(output, settings))
{
elementWriter.WriteStartElement(r.Name);
elementWriter.WriteAttributes(r,false);
elementWriter.WriteValue(r.ReadString());
elementWriter.WriteEndElement();
}
Console.WriteLine(output.ToString());
}
}
}
}
if (System.Diagnostics.Debugger.IsAttached)
Console.ReadLine();
}
Will produce
<parent a1="foo" a2="bar">Some Parent text</parent>
Press any key to continue . . .

You can try XmlNode.CloneNode(bool deep) method.
deep: true to recursively clone the subtree under the specified node; false to clone only the node itself.
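Note that CloneNode() needs an XmlNode, and obtaining one from the reader (for example with XmlDocument.ReadNode()) reads the whole subtree first. A sketch that avoids this builds the shallow element directly from the reader's current position. ShallowCopy and CopyElement are hypothetical names used only for this example.

```csharp
using System.Xml;

public static class ShallowCopy
{
    // Expects the reader to be positioned on an element start tag.
    // Creates an element owned by 'target' with the same name and attributes,
    // without reading any of the element's content.
    public static XmlElement CopyElement(XmlReader reader, XmlDocument target)
    {
        XmlElement element = target.CreateElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
        if (reader.MoveToFirstAttribute())
        {
            do
            {
                XmlAttribute attribute = target.CreateAttribute(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                attribute.Value = reader.Value;
                element.Attributes.Append(attribute);
            }
            while (reader.MoveToNextAttribute());
            reader.MoveToElement(); // return from the attributes to the element itself
        }
        return element;
    }
}
```

In the loop from the question, something like targetDoc.DocumentElement.AppendChild(ShallowCopy.CopyElement(r, targetDoc)) could then take the place of the ReadOuterXml() line.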

Not necessarily a great way, but you can read the string until you get to the end of the start tag, and then manually append an end tag and load that into an XmlDocument.
edit:
Thinking something like:
string xml = r.ReadOuterXml();
int firstEndTag = xml.IndexOf('>');
int lastStartTag = xml.LastIndexOf('<');
string newXml = xml.Substring(0, firstEndTag) + xml.Substring(lastStartTag);
This might not be valid at all, given that there's a large string right there. Your way might be the best. Neither are pretty, but I personally can't think of a better way, given your constraints (which is not to say that a better way doesn't exist).

Overwrite specific XML attributes

Let's say I have a file like this:
<outer>
<inner>
<nodex attr="value1">text</nodex>
<nodex attr="value2">text</nodex>
</inner>
</outer>
Basically what I want to do is, in C# (constrained to .net 2.0 here), this (pseudocode):
foreach node
if(node eq 'nodex')
update attr to newvalue
When complete, the xml file (on disk) should look like:
<outer>
<inner>
<nodex attr="newvalue1">text</nodex>
<nodex attr="newvalue2">text</nodex>
</inner>
</outer>
These two look marginally promising:
Overwrite a xml file value
Setting attributes in an XML document
But it's unclear whether or not they actually answer my question.
In the meantime, I've written this minimal case, which works:
public static void UpdateXML()
{
XmlDocument doc = new XmlDocument();
using (XmlReader reader = XmlReader.Create("XMLFile1.xml"))
{
doc.Load(reader);
XmlNodeList list = doc.GetElementsByTagName("nodex");
foreach (XmlNode node in list)
{
node.Attributes["attr"].Value = "newvalue";
}
}
using (XmlWriter writer = XmlWriter.Create("XMLFile1.xml"))
{
doc.Save(writer);
}
}

The fastest solution would be to use a loop with XmlTextReader/XmlTextWriter. That way you do not need to load the whole xml in memory.
In pseudocode:
while (reader.read)
{
if (reader.Node.Name == "nodex")
......
writer.write ...
}
You can check here for ideas.
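A rough sketch of that loop, written to stay within .NET 2.0 (no LINQ, lambdas, or var), might look like the following. The names AttributeRewriter, Rewrite, and valuePrefix are assumptions made for illustration, and the sketch copies only the common node types, so processing instructions and DTDs would be dropped.

```csharp
using System.IO;
using System.Xml;

public static class AttributeRewriter
{
    // Streams XML from input to output, prepending valuePrefix to the value of
    // attributeName on every element called elementName.
    public static void Rewrite(TextReader input, TextWriter output,
                               string elementName, string attributeName, string valuePrefix)
    {
        using (XmlReader reader = XmlReader.Create(input))
        using (XmlWriter writer = XmlWriter.Create(output))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Capture these before moving onto the attributes.
                        bool isEmpty = reader.IsEmptyElement;
                        bool isTarget = reader.LocalName == elementName;
                        writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                        if (reader.MoveToFirstAttribute())
                        {
                            do
                            {
                                string value = reader.Value;
                                if (isTarget && reader.LocalName == attributeName)
                                    value = valuePrefix + value;
                                writer.WriteAttributeString(reader.Prefix, reader.LocalName,
                                                            reader.NamespaceURI, value);
                            }
                            while (reader.MoveToNextAttribute());
                            reader.MoveToElement();
                        }
                        if (isEmpty)
                            writer.WriteEndElement(); // <x /> produces no EndElement node
                        break;
                    case XmlNodeType.EndElement:
                        writer.WriteFullEndElement();
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(reader.Value);
                        break;
                    case XmlNodeType.CDATA:
                        writer.WriteCData(reader.Value);
                        break;
                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        writer.WriteWhitespace(reader.Value);
                        break;
                    case XmlNodeType.Comment:
                        writer.WriteComment(reader.Value);
                        break;
                    // Other node types are skipped in this sketch.
                }
            }
        }
    }
}
```

Because the input is still being read while the output is written, a file-based caller would need to write to a temporary file and replace the original afterwards, rather than opening the same path for both.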

Here is a sample script that can be run from LinqPad
var x = @"<outer>
<inner>
<nodex attr=""value1"">text</nodex>
<nodex attr=""value2"">text</nodex>
</inner>
</outer>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(x);
foreach (XmlNode n in doc.SelectNodes("//nodex"))
{
n.Attributes["attr"].Value = "new" + n.Attributes["attr"].Value.ToString();
}
doc.OuterXml.Dump();

As a starting point, you could show us what you have tried. You could use XPath to select the nodes you want to modify; search for how to select a node by attribute value in XPath.
Once you have found the nodes you want to update, you can reassign the attribute value with a normal assignment.
