Get specific data from XML document - c#

I have an XML document like this:
<level1>
  <level2>
    <level3>
      <attribute1>...</attribute1>
      <attribute2>false</attribute2>
      <attribute3>...</attribute3>
    </level3>
    <level3>
      <attribute1>...</attribute1>
      <attribute2>true</attribute2>
      <attribute3>...</attribute3>
    </level3>
  </level2>
  <level2>
    <level3>
      <attribute1>...</attribute1>
      <attribute2>false</attribute2>
      ...
      ...
      ...
I'm using C#, and I want to go through all the level3 elements. For each one, I want to read attribute2, and if it says "true", print the corresponding attribute3 (some level3 elements may not have these child elements).
I load the XML into an XmlDocument.
Then I select all the level3 nodes like this:
XmlNodeList xnList = document.SelectNodes(String.Format("/level1/level2/level3"));
(document is the XmlDocument).
But from here on, I don't know exactly how to continue. I tried going through xnList with foreach, but nothing works for me.
How can I do it?
Thanks a lot

Well I'd use LINQ to XML:
var results = from level3 in doc.Descendants("level3")
              // (bool?) yields null instead of throwing when attribute2 is missing
              where (bool?) level3.Element("attribute2") == true
              select level3.Element("attribute3").Value;
foreach (string result in results)
{
    Console.WriteLine(result);
}
LINQ to XML makes all kinds of things much simpler than the XmlDocument API. Of course, the downside is that it requires .NET 3.5...
(By the way, naming elements attributeN is a bit confusing... one would expect attribute to refer to an actual XML attribute...)

You can use LINQ to XML and reading this is a good start.

You can use an XPath query. This will give you an XmlNodeList that contains all <attribute3> elements that match your requirement:
var list = document.SelectNodes("//level3[attribute2 = 'true']/attribute3");
foreach (XmlNode node in list)
{
    Console.WriteLine(node.InnerText);
}
You can split the above XPath query into three parts:
1. "//level3" selects all descendant elements named <level3>.
2. "[attribute2 = 'true']" filters the result of (1), keeping only the elements whose child element <attribute2> has the text true.
3. "/attribute3" takes the <attribute3> child node of each element in the result of (2).
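To see the three parts working together, here is a minimal, self-contained sketch. The class name XPathFilterDemo and the method name FlaggedValues are made up for illustration; only XmlDocument.LoadXml, SelectNodes and InnerText are real API.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XPathFilterDemo
{
    // Returns the text of every <attribute3> whose sibling <attribute2> equals "true".
    // level3 elements without attribute2 or attribute3 simply produce no match.
    public static List<string> FlaggedValues(string xml)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);

        var values = new List<string>();
        foreach (XmlNode node in document.SelectNodes("//level3[attribute2 = 'true']/attribute3"))
        {
            values.Add(node.InnerText);
        }
        return values;
    }
}
```

Calling FlaggedValues on a document where only one level3 has attribute2 set to true should return a single-item list with that element's attribute3 text. Because the filter lives in the XPath, no null checks are needed in the loop.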

Related

Iterate through XDocument when you don't know the structure

Is there any way to iterate through an XDocument when you don't know what the XML structure is (using C#)?
There are plenty of examples for when you know the structure, like the answers to these questions: C# - Select XML Descendants with Linq and C# Foreach XML Node
I've tried Descendants("A"), where A is the root in the example below. In my foreach, that returns one element with the root's name, whose value is all of the values concatenated into one string.
The reason I'm doing this is to anonymize certain nodes whose names I know.
The XDocuments I'm loading can be of any shape, so I've decided to just create a list that users can add these sensitive element names to.
A solution I want to avoid is having users create XPaths for sensitive fields.
The XML is also sensitive, so I can't share it online verbatim, but one example (out of 5) would look like this:
<A>
  <B>
    <C>
      <D>
        <dee>value1</dee>
        <doo>value2</doo>
        <date>value3</date>
        <time>value4</time>
      </D>
    </C>
  </B>
  <E>
    ...omitted..this doc is 5000 lines long with 500~ unique node names
  </E>
  ............
</A>
So is there a way to iterate without using Descendants?
Use .Descendants() to iterate every element.
xmlDoc.Root.Descendants()
    .ToList()
    .ForEach(e => Console.WriteLine(e.Name));
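For the anonymizing use case, a rough sketch along these lines might work. It relies on two facts: the parameterless Descendants() visits every element regardless of structure, and XElement.Value returns the concatenated text of an element and all its descendants (which is why the root's value looked like one long string). The names Anonymizer, Anonymize, sensitiveNames and mask are hypothetical, and the sketch assumes only leaf elements carry sensitive text.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class Anonymizer
{
    // Replaces the text of every leaf element whose local name is in sensitiveNames.
    // Returns the number of elements that were changed.
    public static int Anonymize(XDocument doc, ISet<string> sensitiveNames, string mask = "***")
    {
        // Materialize the matches first so the tree is not modified while it is enumerated.
        List<XElement> targets = doc.Descendants()
            .Where(e => !e.HasElements && sensitiveNames.Contains(e.Name.LocalName))
            .ToList();

        foreach (XElement element in targets)
        {
            element.Value = mask;
        }
        return targets.Count;
    }
}
```

With this shape, users only maintain a set of element names such as "dee" or "date" and never have to write XPath expressions.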
This is the way I went about it.
Calling Descendants with a name means you know the structure of the nodes beforehand. Even a parameterless call to Descendants() (which should return everything under the root) wasn't giving me what I expected.
The below code should work for any XML document, without knowing the structure.
XmlDocument doc = new XmlDocument();
doc.Load(file);
using (XmlReader reader = new XmlNodeReader(doc))
{
    while (reader.Read())
    {
        currentNodeName = reader.Name;
        // ...
    }
}
Saving "skipped" nodes in xml into array

In my code, I am downloading an xml file, and because one of the nodes is variable (both name and count of them), I use code like this:
XmlNodeList arrivals = airplanes.SelectNodes("/myXml/flights/*/arrivals");
Now what I need to do is save the names of the nodes matched by "*" into an array, an ArrayList, or something like that. Later I will need a foreach to do something with each of those node names, now saved as strings. I have tried
foreach(* in MyArrayList)
and that doesn't work. I get a number of errors there, so I assume I can't use "*" here.
Each XmlNode in the XmlNodeList has a ParentNode property; you should be able to use that to navigate back up from each arrivals node to the node matched by *.
The following LINQ query should get the names:
var names = arrivals.Cast<XmlNode>().Select(x => x.ParentNode.Name).ToList();
The Cast<XmlNode> is needed because XmlNodeList implements only the non-generic IEnumerable interface, not IEnumerable<T>.
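Put together as a self-contained sketch (the class and method names are invented for illustration, and the element names under flights in any sample you feed it are up to you):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class WildcardNames
{
    // Returns the names of the elements matched by "*" in /myXml/flights/*/arrivals.
    public static List<string> ParentNamesOfArrivals(string xml)
    {
        var airplanes = new XmlDocument();
        airplanes.LoadXml(xml);

        XmlNodeList arrivals = airplanes.SelectNodes("/myXml/flights/*/arrivals");

        // Each arrivals node sits directly under the wildcard-matched element.
        return arrivals.Cast<XmlNode>()
            .Select(x => x.ParentNode.Name)
            .ToList();
    }
}
```

The resulting List<string> can then be walked with an ordinary foreach (string name in names) loop. Children of flights that have no arrivals element are not matched by the XPath, so their names do not appear in the list.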

How to get the immediate child elements of the root element using C# and XML?

<Document>
  <Heading1>
    <text>Heading Title</text>
    <para>para1</para>
    <para>para2</para>
    <para>para3</para>
  </Heading1>
  <Heading1>
    <text>2nd Heading Title</text>
    <para>para4</para>
    <para>para5</para>
    <para>para6</para>
    <Heading2>
      <text>3rd Heading Title</text>
      <para>para4</para>
      <para>para5</para>
    </Heading2>
  </Heading1>
</Document>
This is my XML document. I want to parse this XML file using C# (4.0) and get all the Heading1 elements without using that element name in my program. For example, I don't want to use document.GetElementsByTagName("Heading1");. How can I do that?
Thanks & Regards.
Using LINQ to XML, you can do:
var headings = yourXDocument.Root.Elements();
Using Nodes() instead of Elements() will also return text nodes and comments, which is apparently not what you want.
You can access the child elements of the document or element through the Elements() method if using LINQ to XML.
XDocument doc = ...;
var query = doc.Root.Elements();
If you're using XmlDocument, this works:
var elements = doc.SelectNodes("/*/*");
That finds all child elements of the top-level element irrespective of any of their names. It's usually safer to specify the names if you know them, so that elements with unexpected names don't get returned in your list - use /Document/Heading1 to do this.
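As a side-by-side sketch of the two approaches (the class RootChildren and its method names are hypothetical; they just wrap the calls shown in the answers):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class RootChildren
{
    // LINQ to XML: Elements() returns only the direct child elements of the root.
    public static List<string> WithXDocument(string xml)
    {
        return XDocument.Parse(xml).Root.Elements()
            .Select(e => e.Name.LocalName)
            .ToList();
    }

    // XmlDocument: "/*/*" selects every element child of the top-level element.
    public static List<string> WithXmlDocument(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes("/*/*").Cast<XmlNode>()
            .Select(n => n.Name)
            .ToList();
    }
}
```

For the document in the question, both methods should report the two Heading1 elements and nothing else: the nested Heading2 is a grandchild of the root, and comments or text nodes are not elements.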

Reading an XML File and Selecting Nodes in .NET

I have a large XML file with lots and lots of tree nodes. I need to pick out one particular node (say, Diet), under which there are multiple sections.
That is, the Diet node can occur anywhere in the XML, so I need to find each Diet node, get its child elements, and save them to the DB.
Assume that Diet is not just one line; it has 10-12 entries underneath it (maybe I can get its contents using InnerXml, but I can't get the nodes line by line).
Make sure you have added a reference to "System.Xml.Linq".
Suck out all the Diet elements:
XElement wholeFile = XElement.Load(@"C:\DietSampleXML.xml");
IEnumerable<XElement> dietElements = wholeFile.Descendants("Diet");
If you set a breakpoint and hover the mouse over "dietElements" and click "Results View", you will see all the Diet elements and their inner xml.
Now iterate through dietElements to add each element and/or children to your database: "foreach (XElement x in dietElements) { ... }"
I tested this with the following xml:
<?xml version="1.0" encoding="utf-8" ?>
<TestElement>
  <Diet>
    <Name>Atkins</Name>
    <Calories>1000</Calories>
  </Diet>
  <TestElement2>
    <Diet>
      <Name>Donuts Only</Name>
      <Calories>1500</Calories>
    </Diet>
  </TestElement2>
  <TestElement3>
    <TestElement4>
      <Diet>
        <Name>Vegetarian</Name>
        <Calories>500</Calories>
      </Diet>
    </TestElement4>
  </TestElement3>
</TestElement>
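To get at the individual entries under each Diet rather than its raw inner XML, something like the following sketch could be a starting point. DietReader and ReadDiets are invented names, and the code assumes the child element names within one Diet are unique (ToDictionary throws on duplicate keys).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DietReader
{
    // Builds one dictionary per <Diet>, mapping child element name to its text.
    public static List<Dictionary<string, string>> ReadDiets(XElement wholeFile)
    {
        return wholeFile.Descendants("Diet")
            .Select(diet => diet.Elements()
                .ToDictionary(child => child.Name.LocalName, child => child.Value))
            .ToList();
    }
}
```

Each dictionary then holds entries such as Name and Calories for one Diet, which can be handed to whatever database layer you use.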
Depending on the structure of your XML file, you might try loading it into a DataSet (DataSet.ReadXml()) and see which DataTable it puts your Diet nodes into. If it parses OK, it is pretty simple to loop through the DataTable and get all your Diet node values.
I wrote a little toy app that opens XML like that, listing all the DataTables in a tree view then showing the table content in a grid. The VS project file for it is here or just an MSI to install it is here, if you want to see how a DataSet parses your XML file.
In XPath, it's just //Diet
To say more, I'd need to know more about your environment.
var doc = XDocument.Load("yourfile.xml");
var nodes = from d in doc.Descendants("Diet")
            select d;
foreach (var node in nodes)
{
    // do stuff with node
}
The pseudocode below contains the XPath statement that gets you all elements that have a Diet element as their parent. Since it produces an XmlNodeList, you can walk every node and save it to the DB. For performance, I would consider consolidating what you want to save and then saving it in one go, rather than per line (a round trip for every entry is suboptimal).
XmlNodeList list = xDoc.DocumentElement.SelectNodes("//*[parent::Diet]");
foreach (XmlNode entry in list)
{
    DAL.SaveToDatabase(entry);
}
Hope this helps,

Setting attributes in an XML document

I'm writing one of my first C# programs. Here's what I'm trying to do:
Open an XML document
Navigate to a part of the XML tree and select all child elements of type <myType>
For each <myType> element, change an attribute (so <myType id="oldValue"> would become <myType id="newValue">)
Write this modified XML document to a file.
I found the XmlDocument.SelectNodes method, which takes an XPath expression as its argument. However, it returns an XmlNodeList. I read a little bit about the difference between an XML node and an XML element, and this seems to explain why there is no XmlNode.SetAttribute method. But is there a way I can use my XPath expression to retrieve a list of XmlElement objects, so that I can loop through this list and set the id attributes for each?
(If there's some other easier way, please do let me know.)
Simply - it doesn't know if you are reading an element or attribute. Quite possibly, all you need is a cast here:
foreach (XmlElement el in doc.SelectNodes(...)) {
    el.SetAttribute(...);
}
The SelectNodes returns an XmlNodeList, but the above treats each as an XmlElement.
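As a fuller sketch of the steps in the question, assuming the XML is available as a string: the names AttributeUpdater and SetAttributeOnMatches are made up. Unlike the foreach above, which throws an InvalidCastException if the XPath matches something other than an element, this version checks the node type first.

```csharp
using System.Xml;

public static class AttributeUpdater
{
    // Sets attributeName to newValue on every element matched by xpath
    // and returns the modified document as a string.
    public static string SetAttributeOnMatches(string xml, string xpath, string attributeName, string newValue)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            // An XPath can also match attributes or text nodes, so test before using the node as an element.
            XmlElement element = node as XmlElement;
            if (element != null)
            {
                element.SetAttribute(attributeName, newValue);
            }
        }
        return doc.OuterXml;
    }
}
```

To work with files instead, doc.Load(path) and doc.Save(path) on XmlDocument take the place of LoadXml and OuterXml.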
I am a big fan of System.Xml.Linq.XDocument and the features it provides.
XDocument xDoc = XDocument.Load("FILENAME.xml");
// assuming types is the parent and myType is a bunch of nodes underneath
IEnumerable<XElement> elements = xDoc.Element("types").Elements("myType");
foreach (XElement type in elements)
{
    // option 1
    type.Attribute("id").Value = NEWVALUE;
    // option 2
    type.SetAttributeValue("id", NEWVALUE);
}
Option 1 or 2 works but I prefer 2 because if the attribute doesn't exist this'll create it.
I'm sitting at my Mac so no .NET for me...
However, I think that you can cast an XmlNode to an XmlElement via an explicit cast.
You should then be able to treat the XmlElement as an XmlNode again and get its child nodes using something like XmlNode.ChildNodes.
