Searching XML Elements by Attributes (C#)

I'm trying to check XML elements for specific attributes so that I can avoid saving duplicate element entries. The XML looks more or less like this:
<root>
<artist name="Coldplay">
<track name="yellow" artist="Coldplay" url="coldplay.com/yellow" playCount="123" />
<track name="fix you" artist="Coldplay" url="coldplay.com/fixyou" playCount="135" />
</artist>
<!-- etc. -->
</root>
Google and various search results suggest something like
[@id='foo']
but I don't know what that is, and I can't search for a collection of special characters like that without getting bizarre results. If anyone can suggest how to write the if check, I'd be much obliged! A name or a link explaining how these special characters are used in C# would also help.

It's an XPath expression. You can use XPath expressions with a variety of XML-related objects in C#.
XmlDocument xd = new XmlDocument();
xd.LoadXml( xmlString );
XmlNodeList nodes = xd.SelectNodes( "//root/artist/track[@name='yellow']" );
General Reference: http://msdn.microsoft.com/en-us/library/ms256086.aspx
XPath with LINQ: http://msdn.microsoft.com/en-us/library/bb675183.aspx

That's an XPath expression - but personally, I'd use LINQ to XML for the searching myself:
XDocument doc = XDocument.Load("test.xml");
var track = doc.Descendants("track")
.Where(t => (string) t.Attribute("id") == "foo")
.FirstOrDefault();
(Use Single, SingleOrDefault, First etc if you want to.)
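For the duplicate check the question asks about, the same LINQ to XML idea can be turned into a guard before adding an element. This is only a minimal sketch: TrackStore and AddTrackIfMissing are hypothetical names, and it assumes a track counts as a duplicate when its name attribute matches within one artist element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class TrackStore
{
    // Hypothetical helper: adds a <track> under the given <artist> only if
    // no existing track has the same name attribute. Returns true if added.
    public static bool AddTrackIfMissing(XElement artist, string name, string url)
    {
        bool exists = artist.Elements("track")
            .Any(t => (string)t.Attribute("name") == name);
        if (exists)
        {
            return false;
        }

        artist.Add(new XElement("track",
            new XAttribute("name", name),
            // Fall back to an empty string if the artist has no name attribute.
            new XAttribute("artist", (string)artist.Attribute("name") ?? ""),
            new XAttribute("url", url),
            new XAttribute("playCount", 0)));
        return true;
    }
}
```

Casting the attribute to string rather than reading .Value means a track without a name attribute compares as null instead of throwing.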

Related

Create XML document from XPaths

I have a number of XPaths from which I'd like to create an XML document, probably using the XmlDocument class and preferably utilising some existing functionality rather than building node-by-node in some kind of possibly recursive loop.
So given the 3 xpaths:
THIS/IS/FIRST/XPATH
THIS/IS/FIRST/XPATH/GOING/DEEPER
THIS/IS/SECOND/XPATH
I would like to produce:
<THIS>
<IS>
<FIRST>
<XPATH>
<GOING>
<DEEPER>
</DEEPER>
</GOING>
</XPATH>
</FIRST>
<SECOND>
<XPATH>
</XPATH>
</SECOND>
</IS>
</THIS>
I'm hoping the code is something simply along the lines of this, with XPaths being added in any order:
var doc = new XmlDocument();
doc.AddXPath("THIS/IS/FIRST/XPATH");
doc.AddXPath("THIS/IS/SECOND/XPATH");
doc.AddXPath("THIS/IS/FIRST/XPATH/GOING/DEEPER");
string result = doc.ToString();
Many thanks!
XPath's purpose is to query XML documents, not to create them, so I don't think what you are trying to achieve is possible this way.
Maybe you can get some inspiration here:
Create XML Nodes based on XPath?
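If the paths are always plain slash-separated element names, as in the three examples in the question, a small helper can walk each path and create whatever is missing. This is a rough sketch under that assumption: XmlPathBuilder and AddPath are hypothetical names (XmlDocument has no AddXPath method), and predicates, attributes, axes and namespaces are not handled.

```csharp
using System;
using System.Xml;

static class XmlPathBuilder
{
    // Hypothetical helper: walks a plain slash-separated element path and
    // creates each element that does not exist yet. Returns the last element.
    public static XmlElement AddPath(XmlDocument doc, string path)
    {
        XmlNode current = doc;
        string[] names = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string name in names)
        {
            // The indexer returns the first child element with this name, or null.
            XmlNode child = current[name];
            if (child == null)
            {
                child = doc.CreateElement(name);
                current.AppendChild(child);
            }
            current = child;
        }
        return current as XmlElement;
    }
}
```

Because existing elements are reused, the paths can be added in any order. Read the result from doc.OuterXml rather than doc.ToString(), which only returns the type name. Note that a path starting with a different root name than the one already in the document will throw, since a document can have only one root element.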

C# IEnumerable<XElement> - How to Xpath Filter on "root"?

In C#, I have an IEnumerable of XElements. All of the XElements have the same name, but different types. I would like to perform an "xpath filter" on the "root" element of each XElement.
Sample XML:
<xml>
<Location>
<Type>Airport</Type>
<Buildings></Buildings>
</Location>
<Location>
<Type>Mine</Type>
<Buildings></Buildings>
</Location>
<Location>
<Type>Airport</Type>
<Buildings></Buildings>
</Location>
</xml>
Sample C#:
var elements = xml.Elements("Location");
What I need is to get all the Buildings where the Location/Type is "Airport". What I would like to do is something like:
elements.SelectMany(el => el.XPathSelectElements(".[Type = 'Airport']/Buildings/Building"));
However, I cannot figure out the xpath syntax for filtering at the "root" of the XElement (the ".[Type" part).
What I can do is:
Add the elements to a made-up root element, and then apply my xpath filter (because Location would no longer be at the "root").
Filter the Locations using Linq eg: elements.Where(loc => loc.Element("Type").Value == "Airport")
But I would like to learn if there is an xpath way.
Can anyone point me in the right direction for the xpath syntax?
Thanks!
EDIT
The above XML is an extremely dumbed-down sample. The actual XML is tens of thousands of lines long, relatively unpredictable (a change in the source object can change thousands of lines of XML), and its schema is not fully known (on my end). Some of the structures repeat and/or nest. Therefore, using "//" is unlikely to be sufficient. Apologies for the confusion.
Try this:
var buildings = xml.XPathSelectElements("//xml/Location[Type=\"Airport\"]/Buildings");
Example:
string xmlString =
@"<xml>
<Location>
<Type>Airport</Type>
<Buildings>First airport buildings</Buildings>
</Location>
<Location>
<Type>Mine</Type>
<Buildings>Mine buildings</Buildings>
</Location>
<Location>
<Type>Airport</Type>
<Buildings>Second airport buildings</Buildings>
</Location>
</xml>";
XDocument xml = XDocument.Parse(xmlString);
var buildings =
xml.XPathSelectElements("//xml/Location[Type=\"Airport\"]/Buildings");
foreach (var b in buildings)
{
Console.WriteLine(b.Value);
}
Result:
First airport buildings
Second airport buildings
Well, it doesn't seem like what I want is possible. So instead, I went with creating a "fake root" element, added my XElement collection, and used an xpath:
var airportBlgs = new XElement("root", locations)
.XPathSelectElements( "./Location[Type='Airport']/Building" );
The fake root means I don't have to use "//", which is too broad. It's too bad this can't be done using just xpath.
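For what it's worth, XPath 1.0 does not allow a predicate directly on the abbreviated "." step, but the unabbreviated self axis does accept one. So a filter on the context element itself can be written with self:: and no fake root. A minimal sketch, assuming the Location/Type/Buildings/Building layout from the question (AirportFilter and AirportBuildings are hypothetical names):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

static class AirportFilter
{
    // For each element, self::Location[...] keeps the element only when it is
    // a Location whose Type child equals 'Airport', then steps down to Building.
    public static List<XElement> AirportBuildings(IEnumerable<XElement> locations)
    {
        return locations
            .SelectMany(el => el.XPathSelectElements(
                "self::Location[Type='Airport']/Buildings/Building"))
            .ToList();
    }
}
```

Use self::* instead of self::Location if the element name should not be part of the test.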

How to access an XML element in a single go?

I have an XML string like below:
<root>
<Test1>
<Result time="2">ProperEnding</Result>
</Test1>
<Test2></Test2>
</root>
I have to operate on these elements. Most of the time the elements are unique within their parent element. I am using XDocument. I can remember that there is a way to access an element like this.
XNode resultTest1 = GetNodes("/root//Test1//result")
But I forgot it. It is possible to access the same using linq:
doc.root.Elements.etc.etc.
But I want it using a single string as shown above. Can anybody say how to make it?
Descendants() will skip any number of intermediate levels, e.g. this will skip over root and Test1:
doc.Descendants("Result")
Also note that you can use XPath with Linq2Xml as well, e.g. XPathSelectElements
doc.XPathSelectElements("/root/Test1/Result");
You can skip intermediate levels of the hierarchy with // (or use // at the start of the xpath string to skip the root)
"/root//Result"
One caveat: XML is case sensitive, so Result and result are not the same element.
The string you're referring to ("/root//Test1//result") is an XPath expression.
You can use it with LINQ to XML classes (like XDocument) using XPathEvaluate, XPathSelectElement, and XPathSelectElements extension methods.
You can find more info about these methods on MSDN: http://msdn.microsoft.com/en-us/library/vstudio/system.xml.xpath.extensions_methods(v=vs.90).aspx
To make them work, you need using System.Xml.XPath at the top of your file and System.Xml.Linq.dll assembly referenced (which is probably already there).
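As a small illustration of those extension methods on the XML from the question, here is a sketch (ResultLookup and its method names are made up for the example). Note that the element name is spelled Result, with a capital R, to match the document.

```csharp
using System.Xml.Linq;
using System.Xml.XPath;

static class ResultLookup
{
    // Returns the text of the first matching Result element, or null if none.
    public static string GetResultText(XDocument doc)
    {
        XElement result = doc.XPathSelectElement("/root//Test1//Result");
        return (string)result; // the cast yields null for a null element
    }

    // XPathEvaluate with the XPath string() function returns a plain string;
    // it is empty when the attribute does not exist.
    public static string GetResultTime(XDocument doc)
    {
        return (string)doc.XPathEvaluate("string(/root/Test1/Result/@time)");
    }
}
```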
You can try to load your xml using XDocument:
// loads xml file with root element
XDocument xml = XDocument.Load("filename.xml");
Now you can append LINQ statements to your xml variable like this:
var retrieveSomeSpecificDataLikeListOfElementsAsAnonymousObjects = xml.Descendants("parentNodeName").Select(node => new { SomeSpecialValueYouWant = node.Element("elementNameUnderParentNode").Value }).ToList();
You can mix and do whatever you want - above is just an example.
Is this what you are looking for?
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml("YourXML");
// XML is case sensitive, so the element name must be spelled "Result"
XmlNodeList xmlNodes = xmlDocument.SelectNodes("/root/Test1/Result");

parse XDocument for attributes

I have tried to parse an XML file like this:
<books>
<book>
<attr name="Moby Dick">Moby Dick is a classic</attr>
<attr isbn="isbnNumber">123456789</attr>
</book>
</books>
How can I get the value of "123456789"? I don't really need the first attribute.
I tried reading these in a foreach loop getting XElements but I keep getting a NULL object exception.
foreach(XElement xe in XDoc.Attribute("attr"))
{
string str = xe.Attribute("isbnNumber").Value // NULL EXCEPTION HERE
}
Thanks in advance...
You could try using the XPathSelectElement() extension method [you'll need to use System.Xml.XPath to get them].
E.g.
var isbn = xDoc.XPathSelectElement("//book/attr[@isbn='isbnNumber']").Value;
PS. A good XPath tester is at: http://www.yetanotherchris.me/home/2010/6/7/online-xpath-tester.html
123456789 is actually the value of an element, not an attribute. What you want can be done like so:
XElement attr = xDoc.Descendants("attr")
.FirstOrDefault( x=>
(string)x.Attribute("isbn") == "isbnNumber"
);
string isbn = (string)attr;
You could even make it one line but this might be easier to read if you're new to LINQ to XML.
Well, I can't figure out how to respond to the individual answers..... but I implemented them both and they both work.
I am going with Reddog's answer as it is a little more straightforward and being new to LINQ it is the easiest as of now for readability.
Thanks for the responses!

XDocument or XmlDocument

I am now learning XmlDocument, but I've just run into XDocument, and when I search for the differences or benefits of each I can't find anything useful. Could you please tell me why you would use one over the other?
If you're using .NET version 3.0 or lower, you have to use XmlDocument aka the classic DOM API. Likewise you'll find there are some other APIs which will expect this.
If you get the choice, however, I would thoroughly recommend using XDocument aka LINQ to XML. It's much simpler to create documents and process them. For example, it's the difference between:
XmlDocument doc = new XmlDocument();
XmlElement root = doc.CreateElement("root");
root.SetAttribute("name", "value");
XmlElement child = doc.CreateElement("child");
child.InnerText = "text node";
root.AppendChild(child);
doc.AppendChild(root);
and
XDocument doc = new XDocument(
new XElement("root",
new XAttribute("name", "value"),
new XElement("child", "text node")));
Namespaces are pretty easy to work with in LINQ to XML, unlike any other XML API I've ever seen:
XNamespace ns = "http://somewhere.com";
XElement element = new XElement(ns + "elementName");
// etc
LINQ to XML also works really well with LINQ - its construction model allows you to build elements with sequences of sub-elements really easily:
// customers is a List<Customer>
XElement customersElement = new XElement("customers",
    customers.Select(c => new XElement("customer",
        new XAttribute("name", c.Name),
        new XAttribute("lastSeen", c.LastOrder),
        new XElement("address",
            new XAttribute("town", c.Town),
            new XAttribute("firstline", c.Address1)
            // etc
        ))));
It's all a lot more declarative, which fits in with the general LINQ style.
Now as Brannon mentioned, these are in-memory APIs rather than streaming ones (although XStreamingElement supports lazy output). XmlReader and XmlWriter are the normal ways of streaming XML in .NET, but you can mix all the APIs to some extent. For example, you can stream a large document but use LINQ to XML by positioning an XmlReader at the start of an element, reading an XElement from it and processing it, then moving on to the next element etc. There are various blog posts about this technique, here's one I found with a quick search.
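The streaming technique described above can be sketched roughly like this. XmlStreaming and StreamElements are hypothetical names; the XmlReader and XNode.ReadFrom calls are the standard ones. Only one matching element is materialised as an XElement at a time.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class XmlStreaming
{
    // Lazily yields each element with the given local name as an XElement,
    // without loading the whole document into memory.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Calling Read() after ReadFrom would skip the next sibling when elements are adjacent with no whitespace between them, which is why the two branches are kept separate.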
I am surprised none of the answers so far mentions the fact that XmlDocument provides no line information, while XDocument does (through the IXmlLineInfo interface).
This can be a critical feature in some cases (for example, if you want to report errors in an XML file, or keep track of where elements are defined in general), and you had better be aware of it before you happily start implementing with XmlDocument, only to discover later that you have to change it all.
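A minimal sketch of reading that line information (LineInfoDemo and LineOf are made-up names). One thing to be aware of: the line numbers are only recorded if the document is loaded or parsed with LoadOptions.SetLineInfo.

```csharp
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class LineInfoDemo
{
    // Returns the 1-based line on which the first element with the given
    // name starts, or -1 if there is no such element or no line info.
    public static int LineOf(string xml, string elementName)
    {
        XDocument doc = XDocument.Parse(xml, LoadOptions.SetLineInfo);
        XElement element = doc.Descendants(elementName).FirstOrDefault();
        if (element == null)
        {
            return -1;
        }

        // XObject implements IXmlLineInfo explicitly, so go through the interface.
        IXmlLineInfo info = element;
        return info.HasLineInfo() ? info.LineNumber : -1;
    }
}
```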
XmlDocument is great for developers who are familiar with the XML DOM object model. It's been around for a while, and more or less corresponds to a W3C standard. It supports manual navigation as well as XPath node selection.
XDocument powers the LINQ to XML feature in .NET 3.5. It makes heavy use of IEnumerable<> and can be easier to work with in straight C#.
Both document models require you to load the entire document into memory (unlike XmlReader for example).
As mentioned elsewhere, undoubtedly, Linq to Xml makes creation and alteration of xml documents a breeze in comparison to XmlDocument, and the XNamespace ns + "elementName" syntax makes for pleasurable reading when dealing with namespaces.
One thing worth noting for XSLT and XPath die-hards is that it is still possible to execute arbitrary XPath 1.0 expressions on LINQ to XML XNodes by including:
using System.Xml.XPath;
and then we can navigate and project data using xpath via these extension methods:
XPathSelectElement - Single Element
XPathSelectElements - Node Set
XPathEvaluate - Scalars and others
For instance, given the Xml document:
<xml>
<foo>
<baz id="1">10</baz>
<bar id="2" special="1">baa baa</bar>
<baz id="3">20</baz>
<bar id="4" />
<bar id="5" />
</foo>
<foo id="123">Text 1<moo />Text 2
</foo>
</xml>
We can evaluate:
var node = xele.XPathSelectElement("/xml/foo[@id='123']");
var nodes = xele.XPathSelectElements(
"//moo/ancestor::xml/descendant::baz[@id='1']/following-sibling::bar[not(@special='1')]");
var sum = xele.XPathEvaluate("sum(//foo[not(moo)]/baz)");
XDocument is from the LINQ to XML API, and XmlDocument is the standard DOM-style API for XML. If you know DOM well, and don't want to learn LINQ to XML, go with XmlDocument. If you're new to both, check out this page that compares the two, and pick which one you like the looks of better.
I've just started using LINQ to XML, and I love the way you create an XML document using functional construction. It's really nice. DOM is clunky in comparison.
Also, note that XDocument is supported in Xbox 360 and Windows Phone OS 7.0.
If you target them, develop for XDocument or migrate from XmlDocument.
I believe that XDocument makes a lot more object creation calls. I suspect that when you're handling a lot of XML documents, XmlDocument will be faster.
One place this happens is in managing scan data. Many scan tools output their data in XML (for obvious reasons). If you have to process a lot of these scan files, I think you'll get better performance with XmlDocument.
