Difference between XPathEvaluate on XElement or XDocument? - c#

Somewhere in a C# program, I need to get an attribute value from an xml structure. I can reach this xml structure directly as an XElement and have a simple xpath string to get the attribute. However, using XPathEvaluate, I get an empty array most of the time. (Yes, sometimes, the attribute is returned, but mostly it isn't... for the exact same XElement and xpath string...)
However, if I first convert the xml to a string and reparse it as an XDocument, I always get the attribute back. Can somebody explain this behavior? (I am using .NET 3.5.)
Code that mostly returns an empty IEnumerable:
string xpath = "/exampleRoot/exampleSection[@name='test']/@value";
XElement myXElement = RetrieveXElement();
((IEnumerable)myXElement.XPathEvaluate(xpath)).Cast<XAttribute>().FirstOrDefault().Value;
Code that does always work (I get my attribute value):
string xpath = "/exampleRoot/exampleSection[@name='test']/@value";
string myXml = RetrieveXElement().ToString();
XDocument xdoc = XDocument.Parse(myXml);
((IEnumerable)xdoc.XPathEvaluate(xpath)).Cast<XAttribute>().FirstOrDefault().Value;
With the test xml:
<exampleRoot>
<exampleSection name="test" value="2" />
<exampleSection name="test2" value="2" />
</exampleRoot>
By suggestion related to a surrounding root, I did some 'dry tests' in a test program, using the same xml structure (txtbxXml and txtbxXpath representing the xml and xpath expression described above):
// 1. XDocument Trial:
((IEnumerable)XDocument.Parse(txtbxXml.Text).XPathEvaluate(txtbxXPath.Text)).Cast<XAttribute>().FirstOrDefault().Value.ToString();
// 2. XElement trial:
((IEnumerable)XElement.Parse(txtbxXml.Text).XPathEvaluate(txtbxXPath.Text)).Cast<XAttribute>().FirstOrDefault().Value.ToString();
// 3. XElement originating from other root:
((IEnumerable)(new XElement("otherRoot", XElement.Parse(txtbxXml.Text)).Element("exampleRoot")).XPathEvaluate(txtbxXPath.Text)).Cast<XAttribute>().FirstOrDefault().Value.ToString();
Result: cases 1 and 3 produce the correct result, while case 2 throws a NullReferenceException.
If case 3 had failed and case 2 had succeeded, it would have made some sense to me, but now I don't get it...

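The problem is that an XPath expression starting with / is absolute: it is evaluated from the top of the tree that contains the node, not from the node itself. If you start with an XDocument, the top is the document and the root element exampleRoot is its child. If you start with a standalone XElement representing your exampleRoot node, there is no document above it, so the element itself is treated as the top and its children are the two exampleSection nodes; /exampleRoot then finds nothing. That also explains your case 3: there the top is otherRoot, which does have an exampleRoot child.
If you change your XPath expression to "/exampleSection[@name='test']/@value", it will work from the standalone element. If you change it to "//exampleSection[@name='test']/@value", it will work from both the XElement and the XDocument.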
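A minimal sketch of the behavior described above, using the test xml from the question. The class and helper names (XPathContextDemo, FirstAttributeValue) are made up for illustration, and the relative expression without a leading slash is an extra variant that the answer does not mention.

```csharp
using System;
using System.Collections;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathContextDemo
{
    // Hypothetical helper: returns the first attribute value matched by xpath,
    // or null when the expression matches nothing.
    public static string FirstAttributeValue(XNode context, string xpath)
    {
        var hits = (IEnumerable)context.XPathEvaluate(xpath);
        return hits.Cast<XAttribute>().Select(a => a.Value).FirstOrDefault();
    }

    public const string Xml =
        "<exampleRoot>" +
        "<exampleSection name=\"test\" value=\"2\" />" +
        "<exampleSection name=\"test2\" value=\"2\" />" +
        "</exampleRoot>";

    public const string Absolute = "/exampleRoot/exampleSection[@name='test']/@value";
    public const string Relative = "exampleSection[@name='test']/@value";
    public const string Anywhere = "//exampleSection[@name='test']/@value";

    public static void Main()
    {
        XDocument doc = XDocument.Parse(Xml);
        XElement element = XElement.Parse(Xml); // standalone: no XDocument above it

        // Absolute path: matches from the document, but not from the standalone element.
        Console.WriteLine(FirstAttributeValue(doc, Absolute) ?? "(no match)");
        Console.WriteLine(FirstAttributeValue(element, Absolute) ?? "(no match)");

        // Relative path: evaluated from the element's own children.
        Console.WriteLine(FirstAttributeValue(element, Relative) ?? "(no match)");

        // Descendant search: matches from both starting points.
        Console.WriteLine(FirstAttributeValue(doc, Anywhere) ?? "(no match)");
        Console.WriteLine(FirstAttributeValue(element, Anywhere) ?? "(no match)");
    }
}
```

Returning null instead of dereferencing FirstOrDefault().Value directly also avoids the NullReferenceException seen in case 2 of the question.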

Related

XmlDocument SelectSingleNode

On Stack Overflow there is a document explaining the use of XmlDocument and how to select a node.
C# XmlDocument SelectSingleNode without attribute
The code presented is the code I am using as follows.
XmlDocument doc = new XmlDocument();
doc.Load("C:\\FileXml.xml");
string Version = doc.DocumentElement.SelectSingleNode("/Version").InnerText;
Console.Write(Version); //I want to see 3
The Xml file is shown below "in its entirety".
<CharacterObject xmlns="http://www.w3.org/2005/Atom">
<Version>3</Version>
<Path>C:\\FilePath\FileName.xml</Path>
</CharacterObject>
The error that I am receiving is that SelectSingleNode above returns null. It returned null when I searched for CharacterObject as well. No matter what XML node I search for, SelectSingleNode returns null. This means I must be using SelectSingleNode incorrectly; I'm just not sure how.
I would like SelectSingleNode to return the node so that InnerText will return the Version information in the XML Node. I'm just having a problem in usage of reading the information from the nodes.
According to documentation on XmlDocument.DocumentElement - it is a root xml element. So in your case it is CharacterObject already.
When you call .SelectSingleNode("/CharacterObject") on it, you are requesting a CharacterObject element inside the root CharacterObject, which is not there at all.
You can simply use XmlDocument.DocumentElement.InnerText to achieve the result you want.
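This particular problem has a solution. It is caused by the default namespace declaration (the xmlns attribute) on the XML root node itself: an unprefixed name in an XPath expression only matches elements that are in no namespace. Eliminating this attribute solves my issue.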
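If the xmlns attribute cannot be removed from the file, an alternative is to keep it and register a prefix for that namespace with XmlNamespaceManager. The following is only a sketch under that assumption; the class name VersionReader and the prefix "a" are arbitrary choices, not something from the question.

```csharp
using System;
using System.Xml;

public static class VersionReader
{
    public const string SampleXml =
        "<CharacterObject xmlns=\"http://www.w3.org/2005/Atom\">" +
        "<Version>3</Version>" +
        "</CharacterObject>";

    // Hypothetical helper: returns the text of the Version element, or null if it is missing.
    public static string ReadVersion(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Map an arbitrary prefix to the document's default namespace.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("a", "http://www.w3.org/2005/Atom");

        // Every element step in the path needs the prefix.
        XmlNode node = doc.SelectSingleNode("/a:CharacterObject/a:Version", nsmgr);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(ReadVersion(SampleXml));
    }
}
```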

How to select element by index in XPath .net application

I have the following xml received from a web service
<GRID xmlns="http://schemas.datastream.net/MP_functions/MP0118_GetGridHeaderData_001_Result">
<DATA>
<R>
<D>2645</D>
<D>HJIT.HRE@RGW.COM</D>
<D>2019-09-27 10:17:36.0</D>
<D>114041</D>
<D>Awaiting Planning</D>
<D>Work Planned</D>
</R>
<R>
<D>2649</D>
<D>HJIT.HRE@RGW.COM</D>
<D>2019-09-27 10:33:24.0</D>
<D>114043</D>
<D>Awaiting Release</D>
<D>Awaiting Planning</D>
</R>
<R>
<D>2652</D>
<D>HJIT.HRE@RGW.COM</D>
<D>2019-09-27 10:36:53.0</D>
<D>114041</D>
<D>Awaiting Planning</D>
<D>Work Planned</D>
</R>
</DATA>
</GRID>
I wrote the following piece of .NET code to extract the R nodes
HttpWebResponse resp = (HttpWebResponse)Req.GetResponse();
XPathDocument xpResDoc = new XPathDocument(resp.GetResponseStream());
XPathNavigator xpNav = xpResDoc.CreateNavigator();
XmlNamespaceManager nsmgr = new XmlNamespaceManager(xpNav.NameTable);
nsmgr.AddNamespace("g2", "http://schemas.datastream.net/MP_functions/MP0118_GetGridHeaderData_001_Result");
XPathNodeIterator xpNIter = xpNav.Select("//g2:R", nsmgr); // I can successfully get the three R elements
foreach (XPathNavigator nav in xpNIter)
{
/*
Now I want to iterate through each R element and use XPATH to select each of the six D nodes by its index position.
The order of the D nodes are a known dataset and I want to build a comma separated string by concatenating the value of each D node,
which will later be appended to a CSV file along with a pre-defined header row.
*/
/* I attempted the following XPATH */
// XPathNodeIterator xpDi = nav.Select("(//D)[1]"); -- This does not work and yields a null result
}
Now I want to iterate through each R element and use XPATH to select each of the six D nodes by its index position. The order of the D nodes are a known dataset and I want to build a comma separated string by concatenating the value of each D node, which will later be appended to a CSV file along with a pre-defined header row.
I didn't want to use anything like LINQ to XML as this is part of read-only data extraction program which needs to be as lite and as performant as possible.
What is the correct way to get the D elements by index with XPATH using the XPathNavigator ?
You have a few problems here:
xpNav.Select("//g2:R", nsmgr) does not work for the XML shown in your question.
This expression selects for nodes with local name R in the http://schemas.datastream.net/MP_functions/MP0118_GetGridHeaderData_001_Result namespace -- however in your actual XML none of the nodes are in this namespace. There's a namespace declaration xmlns:dstm="http://schemas.datastream.net/MP_functions/MP0118_GetGridHeaderData_001_Result" but it's not the default namespace, so none of the nodes are actually in it, as they aren't using the dstm: prefix.
Instead, you should do xpNav.Select("//R", nsmgr) (or better yet xpNav.Select("/*/DATA/R", nsmgr)).
In your question you wrote I can successfully get the three R elements so maybe this is a typo in the question.
nav.Select("(//D)[1]"); -- This does not work and yields a null result.
I cannot reproduce this exact problem -- XPathNavigator.Select() never returns null. It will throw an exception on a malformed query, but not return null.
What I can reproduce is that this always returns the same result for every <R>, specifically the value of the first <D> element, <D>2645</D>. Demo fiddle #1 here.
The problem here is that the recursive descent operator //D selects for all nodes named D in the entire document. To select only the nodes in the current <R> element you need to restrict the scope by prefacing the XPath query with .: nav.Select("(.//D)[1]") (or better yet, nav.Select("(./D)[1]")).
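The scoping difference can be sketched on a cut-down, namespace-free version of the grid. The class name ScopeDemo, the helper FirstD, and the shortened sample data are assumptions made for this illustration only.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class ScopeDemo
{
    // Shortened stand-in for the web service response, without a namespace.
    public const string Xml =
        "<GRID><DATA>" +
        "<R><D>2645</D><D>Awaiting Planning</D></R>" +
        "<R><D>2649</D><D>Awaiting Release</D></R>" +
        "</DATA></GRID>";

    // Hypothetical helper: value of the first node matched by xpath, or null.
    public static string FirstD(XPathNavigator row, string xpath)
    {
        XPathNavigator hit = row.SelectSingleNode(xpath);
        return hit == null ? null : hit.Value;
    }

    public static void Main()
    {
        XPathNavigator nav = new XPathDocument(new StringReader(Xml)).CreateNavigator();
        foreach (XPathNavigator row in nav.Select("/GRID/DATA/R"))
        {
            // "(//D)[1]" searches from the document root, so it is the same for every row.
            // "./D[1]" is relative to the current R element.
            Console.WriteLine("{0} vs {1}", FirstD(row, "(//D)[1]"), FirstD(row, "./D[1]"));
        }
    }
}
```

For the second row the absolute query still yields the first D of the whole document (2645), while the relative query yields that row's own first D (2649).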
Incidentally, since you expect 6 child <D> nodes of <R> it will be more performant to run one single XPath query and collect all 6 into a list, rather than running 6 queries for each specific node:
var nodes = nav.Select("./D").Cast<XPathNavigator>().ToList();
You indicated that performance is important, but you are using the recursive descent operator // which can have bad performance.
From Effective Xml Part 2: How to kill the performance of an app with XPath…:
// (descendant-or-self axis)
This is a very common pattern that very often leads to serious performance problems. The way it works is that it flattens the whole subtree (the most common usage I saw is flattening the whole xml document) and then it looks for the specified elements. Now in the .NET Framework there aren’t any specific optimizations for this patterns and using it is costly...
Instead, it's better to specify the path directly.
Pulling all of the above together, your code should look something like:
//xpNav and nsmgr set up as in the question
var csvLines = xpNav.Select("/*/DATA/R", nsmgr).Cast<XPathNavigator>()
.Select(nav => string.Join(",", nav.Select("./D").Cast<XPathNavigator>()))
.ToList();
Demo fiddle #2 here.
Notes:
If the XML in your question has been incorrectly edited and the nodes <R> and <D> are really in the dstm: namespace after all, add the g2: prefix to the node names in the XPath queries like so:
var csvLines = xpNav.Select("/*/g2:DATA/g2:R", nsmgr).Cast<XPathNavigator>()
.Select(nav => string.Join(",", nav.Select("./g2:D", nsmgr).Cast<XPathNavigator>()))
.ToList();
Demo fiddle #3 here.
As an aside, you might want to check your assumption that XPathDocument will be more performant than LINQ to XML. I am not sure this will be the case.
I was on the right path, just needed to use the right method which allows to specify the namespace as seen below:
HttpWebResponse resp = (HttpWebResponse)Req.GetResponse();
XPathDocument xpResDoc = new XPathDocument(resp.GetResponseStream());
XPathNavigator xpNav = xpResDoc.CreateNavigator();
XmlNamespaceManager nsmgr = new XmlNamespaceManager(xpNav.NameTable);
nsmgr.AddNamespace("g2", "http://schemas.datastream.net/MP_functions/MP0118_GetGridHeaderData_001_Result");
XPathNodeIterator xpNIter = xpNav.Select("//g2:R", nsmgr);
foreach (XPathNavigator nav in xpNIter)
{
string r =
$"{nav.SelectSingleNode("./g2:D[1]", nsmgr).Value}," +
$"{nav.SelectSingleNode("./g2:D[2]", nsmgr).Value}," +
$"{nav.SelectSingleNode("./g2:D[3]", nsmgr).Value}," +
$"{nav.SelectSingleNode("./g2:D[4]", nsmgr).Value}," +
$"{nav.SelectSingleNode("./g2:D[5]", nsmgr).Value}," +
$"{nav.SelectSingleNode("./g2:D[6]", nsmgr).Value}";
Console.WriteLine(r);
}
// Start writing to a file stream;

How to access an XML element in a single go?

I have an XML string like below:
<root>
<Test1>
<Result time="2">ProperEnding</Result>
</Test1>
<Test2></Test2>
</root>
I have to operate on these elements. Most of the time the elements are unique within their parent element. I am using XDocument. I can remember that there is a way to access an element like this.
XNode resultTest1 = GetNodes("/root//Test1//result")
But I forgot it. It is possible to access the same using linq:
doc.root.Elements.etc.etc.
But I want it using a single string as shown above. Can anybody say how to make it?
Descendants() will skip any number level of intermediate nodes, e.g. this will skip over root and Test1:
doc.Descendants("Result")
Also note that you can use XPath with Linq2Xml as well, e.g. XPathSelectElements
doc.XPathSelectElements("/root/Test1/Result");
You can skip intermediate levels of the hierarchy with // (or use // at the start of the xpath string to skip the root)
"/root//Result"
One caveat - XML is case sensitive, so Result and result are not the same element.
The string you're referring to ("/root//Test1//result") is an XPath expression.
You can use it with LINQ to XML classes (like XDocument) using XPathEvaluate, XPathSelectElement, and XPathSelectElements extension methods.
You can find more info about these methods on MSDN: http://msdn.microsoft.com/en-us/library/vstudio/system.xml.xpath.extensions_methods(v=vs.90).aspx
To make them work, you need using System.Xml.XPath at the top of your file and System.Xml.Linq.dll assembly referenced (which is probably already there).
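As a rough sketch of how that fits together for the XML in the question (the class name SingleGoDemo and the helper FindResult are invented for this example):

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath; // brings the XPathSelectElement extension method into scope

public static class SingleGoDemo
{
    public const string Xml =
        "<root>" +
        "<Test1><Result time=\"2\">ProperEnding</Result></Test1>" +
        "<Test2></Test2>" +
        "</root>";

    // Hypothetical helper: returns the first element matched by xpath, or null.
    public static XElement FindResult(XDocument doc, string xpath)
    {
        return doc.XPathSelectElement(xpath);
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(Xml);

        XElement result = FindResult(doc, "/root/Test1/Result");
        Console.WriteLine(result.Value);                   // element text
        Console.WriteLine(result.Attribute("time").Value); // attribute value

        // Element names are case sensitive, so a lowercase "result" matches nothing.
        Console.WriteLine(FindResult(doc, "/root//Test1//result") == null);
    }
}
```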
You can try to load your xml using XDocument:
// loads xml file with root element
XDocument xml = XDocument.Load("filename.xml");
Now you can append LINQ statements to your xml variable like this:
var retrieveSomeSpecificDataLikeListOfElementsAsAnonymousObjects = xml.Descendants("parentNodeName").Select(node => new { SomeSpecialValueYouWant = node.Element("elementNameUnderParentNode").Value }).ToList();
You can mix and do whatever you want - above is just an example.
Is this what you are looking for?
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml("YourXML");
XmlNodeList xmlNodes = xmlDocument.SelectNodes("/root/Test1/Result");

xml multiple namespaces in root

I am having trouble generating an xml root. I have to match this structure as the elements of the xml use the prefixes throughout.
<ShipmentReceiptNotification
xmlns="urn:rosettanet:specification:interchange:ShipmentReceiptNotification:xsd:schema:02.02"
xmlns:dacc="urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03"
xmlns:dbpq="urn:rosettanet:specification:domain:Procurement:BookPriceQualifier:xsd:codelist:01.04"
xmlns:dccc="urn:rosettanet:specification:domain:Procurement:CreditCardClassification:xsd:codelist:01.03"
xmlns:dcrt="urn:rosettanet:specification:domain:Procurement:CustomerType:xsd:codelist:01.03"
..\..\XML\Interchange\ShipmentReceiptNotification_02_02.xsd">
if I do something like
XmlNode ShipmentReceiptNotification0Node = xmlDoc.CreateElement("ShipmentReceiptNotification", "xmlns=\"urn:rosettanet:specification:interchange:ShipmentReceiptNotification:xsd:schema:02.02\"xmlns:dacc=\"urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03\"");
I get
<ShipmentReceiptNotification
xmlns="xmlns=&quot;urn:rosettanet:specification:interchange:ShipmentReceiptNotification:xsd:schema:02.02&quot;xmlns:dacc=&quot;urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03&quot;">
The second argument of CreateElement accepts the URI of the namespace that the element being created, that is ShipmentReceiptNotification, belongs to. Not the whole bunch of xmlns attributes. This code:
XmlElement e = xmlDoc.CreateElement(
"ShipmentReceiptNotification",
"urn:rosettanet:specification:interchange:ShipmentReceiptNotification:xsd:schema:02.02");
Produces this XML:
<ShipmentReceiptNotification
xmlns="urn:rosettanet:specification:interchange:ShipmentReceiptNotification:xsd:schema:02.02" />
To produce what you want, you need to add attributes to the element. Like this:
XmlElement e = xmlDoc.CreateElement("ShipmentReceiptNotification");
e.SetAttribute("xmlns", "urn:rosettanet:specification:interchange:ShipmentReceiptNotification:xsd:schema:02.02");
e.SetAttribute("xmlns:dacc", "urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03");
Produces this XML:
<ShipmentReceiptNotification
xmlns="urn:rosettanet:specification:interchange:ShipmentReceiptNotification:xsd:schema:02.02"
xmlns:dacc="urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03" />
Note that this is the “manual” way. You should play with XmlNamespaceManager to do it “right”. However, that may be a bit more complex task which need not be necessary for your scenario.
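A variation on the manual approach, offered only as a sketch: pass the default namespace URI to CreateElement so that the element really belongs to that namespace, and add only the prefixed declarations as attributes. The class name RootNamespaceDemo and the helper BuildRoot are made up for this example, and only the dacc prefix from the question is shown.

```csharp
using System;
using System.Xml;

public static class RootNamespaceDemo
{
    public const string DefaultNs =
        "urn:rosettanet:specification:interchange:ShipmentReceiptNotification:xsd:schema:02.02";
    public const string DaccNs =
        "urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03";

    // Hypothetical helper: creates the root element and appends it to the document.
    public static XmlElement BuildRoot(XmlDocument doc)
    {
        // The second argument is the namespace URI of the element itself;
        // the xmlns="..." declaration is emitted for it on serialization.
        XmlElement root = doc.CreateElement("ShipmentReceiptNotification", DefaultNs);

        // Additional prefixes are declared as xmlns:prefix attributes.
        root.SetAttribute("xmlns:dacc", DaccNs);

        doc.AppendChild(root);
        return root;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        BuildRoot(doc);
        Console.WriteLine(doc.OuterXml);
    }
}
```

Child elements that use a prefix can then be created with the three-argument overload, for example doc.CreateElement("dacc", "SomeChild", RootNamespaceDemo.DaccNs), where SomeChild is a placeholder name.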

Converting string to xml nodes

I have a string containing XML nodes, returned from a PHP file.
It looks like
<a>1</a><b>0</b><c>4</c>..............
Now I need to find out what value each node has, i.e. a, b, c......
While loading this string into an XmlDocument I'm getting an error like "There are multiple root elements".
Is there any solution for this?
One of the basic rules for well-formed XML is that it has a single root node. With your example, you have multiple roots:
<a>1</a>
<b>0</b>
<c>4</c>
To make it well-formed you will have to make these elements a child of a single root:
<root>
<a>1</a>
<b>0</b>
<c>4</c>
</root>
An XML document that is not well-formed is not really an XML document at all and you will find that no XML parser will be able to read it!
Wrap it in a root element. E.g:
From <a>1</a><b>2</b>...
To <root><a>1</a><b>2</b>...</root>
Then compute as normal.
That is because each element is at the same level. There needs to be a "root" that encloses all of them. Wrap them in an arbitrary node, say <root>...</root>, then load the new string into the XmlDocument.
This seems like XML, but it's not valid XML. In XML, you have a root element that wraps all of the other elements.
So, if you have that string in str, do this:
str = String.Format("<root>{0}</root>", str);
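Putting the wrapping step and the lookup together might look like the sketch below. The class name FragmentReader and the helper ReadValues are hypothetical, and the sample string is the three-element fragment from the question without the elided remainder.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FragmentReader
{
    // Hypothetical helper: wraps the fragment in a root element and
    // returns a map from element name to element text.
    public static Dictionary<string, string> ReadValues(string fragment)
    {
        var doc = new XmlDocument();
        doc.LoadXml(string.Format("<root>{0}</root>", fragment));

        var values = new Dictionary<string, string>();
        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
        {
            // Skip whitespace or text between elements, if any.
            if (child.NodeType == XmlNodeType.Element)
            {
                values[child.Name] = child.InnerText;
            }
        }
        return values;
    }

    public static void Main()
    {
        foreach (var pair in ReadValues("<a>1</a><b>0</b><c>4</c>"))
        {
            Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
        }
    }
}
```

If the same element name can occur more than once in the string, a dictionary keeps only the last value, so a list of pairs would be the safer container in that case.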
