Why does XmlDocument.SelectNodes return a zero count? - C#

I am trying to select a node from this XML:
<?xml version="1.0" encoding="utf-8"?>
<LinkAnalysis>
<ImgInfo>
<Number>xyz</Number>
<ImgPath>D:\Projects\VERBALinks\VERBALinks\bin\Debug\LA_img\xyz.png</ImgPath>
</ImgInfo>
</LinkAnalysis>
using the following code:
var nodes = doc.SelectNodes(String.Format("/LinkAnalysis/ImgInfo[@Number=\"{0}\"]", "xyz"));
But it returns a count of zero. Why?

<Number> is an element, not an attribute, so your XPath expression is wrong.
Try:
String.Format("/LinkAnalysis/ImgInfo[Number/text()='{0}']", "xyz")
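As a minimal sketch of the difference, the snippet below runs an attribute-style predicate (@Number) and an element-style predicate against a trimmed copy of the XML from the question. The shortened ImgPath value is a placeholder for this illustration, and the predicate [Number='xyz'] is a shorter equivalent of the text() form for this document.

```csharp
using System;
using System.Xml;

var doc = new XmlDocument();
// Trimmed copy of the question's document; the ImgPath value is a placeholder.
doc.LoadXml(@"<LinkAnalysis>
  <ImgInfo>
    <Number>xyz</Number>
    <ImgPath>xyz.png</ImgPath>
  </ImgInfo>
</LinkAnalysis>");

// @Number looks for an attribute named Number on ImgInfo; there is none.
var byAttribute = doc.SelectNodes("/LinkAnalysis/ImgInfo[@Number='xyz']");

// Number (no @) looks for a child element named Number.
var byElement = doc.SelectNodes("/LinkAnalysis/ImgInfo[Number='xyz']");
var byText = doc.SelectNodes("/LinkAnalysis/ImgInfo[Number/text()='xyz']");

Console.WriteLine(byAttribute.Count); // 0
Console.WriteLine(byElement.Count);   // 1
Console.WriteLine(byText.Count);      // 1
```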

Related

C# XmlNode.ChildNodes counts line breaks as nodes

I have this XML:
<?xml version="1.0" encoding="UTF-8"?>
<atom:entry xmlns:atom="http://www.w3.org/2005/Atom" xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/" xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">
<atom:id>urn:uuid:00000000-0000-0000-0000-00000000000</atom:id>
<atom:title>RenamedDocument</atom:title>
<atom:updated>2012-07-13T06:14:05Z</atom:updated>
<cmisra:object xmlns:ns3="http://docs.oasis-open.org/ns/cmis/messaging/200908/">
<cmis:properties>
<cmis:propertyString propertyDefinitionId="cmis:name">
<cmis:value>RenamedDocument</cmis:value>
</cmis:propertyString>
<cmis:propertyString propertyDefinitionId="cmisma:[OSTERONE]Test">
<cmis:value>[NULL]</cmis:value>
</cmis:propertyString>
</cmis:properties>
</cmisra:object>
</atom:entry>
and I would like to get every child of cmis:properties, so I do:
XmlNodeList list = rawData.GetElementsByTagName("cmis:properties")[0].ChildNodes;
but I get 5 children. It appears that every line break (\n) is counted as a node.
How can I suppress these line breaks so that I get only the "real" children?
Thanks to juharr, I found a way using NodeType.
I just need an if condition that checks node.NodeType != XmlNodeType.Whitespace.
Thank you!
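The NodeType check can be sketched as follows. This is an illustration rather than the poster's actual code: the tiny document and the variable names are made up, and PreserveWhitespace is set to true on the assumption that the original document was loaded that way (with the default of false, XmlDocument drops whitespace-only nodes on load).

```csharp
using System;
using System.Linq;
using System.Xml;

// Assumption: whitespace is being preserved, which is what makes
// the line breaks show up as child nodes in the first place.
var doc = new XmlDocument { PreserveWhitespace = true };
doc.LoadXml("<root>\n  <a/>\n  <b/>\n</root>");

XmlNodeList all = doc.DocumentElement.ChildNodes;

// Keep everything except whitespace-only nodes.
var realChildren = all.Cast<XmlNode>()
                      .Where(n => n.NodeType != XmlNodeType.Whitespace)
                      .ToList();

Console.WriteLine(all.Count);          // 5 (whitespace, a, whitespace, b, whitespace)
Console.WriteLine(realChildren.Count); // 2
```

If only elements are of interest, all.OfType<XmlElement>() is an alternative that also skips comments and text nodes.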

C# Xpath can't get element by name

I have a document loaded into an XmlDocument with the following structure:
<?xml version="1.0" encoding="UTF-8"?>
<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0" xmlns:l="http://www.w3.org/1999/xlink">
<stylesheet type="text/css"></stylesheet>
<description>...</description>
<body>...</body>
<binary id="19317.jpg" content-type="image/jpeg">...</binary>
</FictionBook>
The following methods return null (or an empty collection if I use SelectNodes):
doc.SelectSingleNode("body");
doc.SelectSingleNode("//body");
doc.LastChild.SelectSingleNode("body");
doc.LastChild.SelectSingleNode("//body");
But this one works correctly:
doc.LastChild["body"];
Why doesn't XPath give me any results?
doc.SelectSingleNode("//body"); doesn't work because body belongs to the default namespace "http://www.gribuser.ru/xml/fictionbook/2.0" declared on the root element, and an unprefixed name in XPath only matches elements that are in no namespace. To query for it, register a prefix for that namespace:
var mgr = new XmlNamespaceManager(new NameTable());
mgr.AddNamespace("whatever", "http://www.gribuser.ru/xml/fictionbook/2.0");
var node = doc.SelectSingleNode("//whatever:body", mgr);
doc.LastChild["body"]; works because the indexer matches on the element's qualified name, which is just body here because the element has no prefix. To avoid ambiguity, you can pass the namespace explicitly:
doc.LastChild["body", "http://www.gribuser.ru/xml/fictionbook/2.0"]
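For a self-contained illustration, the sketch below loads a cut-down FictionBook document and compares three lookups. The prefix "fb" is arbitrary and chosen only for this example, as is the element content; the local-name() variant is an alternative technique that ignores namespaces altogether, not something the answer above proposes.

```csharp
using System;
using System.Xml;

const string FbNs = "http://www.gribuser.ru/xml/fictionbook/2.0";

var doc = new XmlDocument();
// Cut-down version of the question's document; element content is made up.
doc.LoadXml(@"<FictionBook xmlns=""http://www.gribuser.ru/xml/fictionbook/2.0"">
  <description>d</description>
  <body>text</body>
</FictionBook>");

// Unprefixed XPath names match only elements in no namespace.
XmlNode unqualified = doc.SelectSingleNode("//body");

// Map an arbitrary prefix to the document's default namespace.
var mgr = new XmlNamespaceManager(doc.NameTable);
mgr.AddNamespace("fb", FbNs);
XmlNode qualified = doc.SelectSingleNode("//fb:body", mgr);

// Alternative: match on the local name and ignore the namespace.
XmlNode byLocalName = doc.SelectSingleNode("//*[local-name()='body']");

Console.WriteLine(unqualified == null);   // True
Console.WriteLine(qualified.InnerText);   // text
Console.WriteLine(byLocalName.InnerText); // text
```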

Using XDocument to convert selected node to string

I have the following XML sample.
<?xml version="1.0" encoding="utf-8"?>
<GlobalResponses>
<Filters>
<FilterId>11</FilterId>
<FilterId>5</FilterId>
<FilterId>10</FilterId>
</Filters>
<Responses>
<Response>
<Name>Bob</Name>
</Response>
<Response>
<Name>Jim</Name>
</Response>
<Response>
<Name>Steve</Name>
</Response>
</Responses>
</GlobalResponses>
Using XDocument, how can I get only the <Responses> parent together with its child nodes, and convert them to a string variable? I looked at XDocument's Elements and Descendants, but calling oXDocument.Descendants("Responses").ToString(); didn't work.
Do I have to iterate over all of the XElements, checking each one and then appending it to a string variable?
Descendants returns an enumeration of XElement, so you need to select a specific element from it.
If you want to get the XML element with all of its child nodes, you can use:
// assuming that you only have one tag Responses.
oXDocument.Descendants("Responses").First().ToString();
The result is
<Responses>
<Response>
<Name>Bob</Name>
</Response>
<Response>
<Name>Jim</Name>
</Response>
<Response>
<Name>Steve</Name>
</Response>
</Responses>
If you want to get the child nodes and concatenate their values into a single string, you can use:
// Extract list of names
var names = doc.Descendants("Responses").Elements("Response").Select(x => x.Value);
// concatenate
var result = string.Join(", ", names);
The result is Bob, Jim, Steve
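The two snippets above can be combined into one runnable sketch. The document is parsed from an inline string here instead of being loaded from a file, which is an assumption made only to keep the example self-contained.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var oXDocument = XDocument.Parse(@"<GlobalResponses>
  <Filters>
    <FilterId>11</FilterId>
  </Filters>
  <Responses>
    <Response><Name>Bob</Name></Response>
    <Response><Name>Jim</Name></Response>
    <Response><Name>Steve</Name></Response>
  </Responses>
</GlobalResponses>");

// The whole <Responses> element, children included, as an XML string.
string responsesXml = oXDocument.Descendants("Responses").First().ToString();

// Just the text values, joined into one line.
string names = string.Join(", ",
    oXDocument.Descendants("Responses").Elements("Response").Select(x => x.Value));

Console.WriteLine(responsesXml);
Console.WriteLine(names); // Bob, Jim, Steve
```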
The Descendants() method takes an element name as input and returns a collection of matching elements; from those you then select the elements you are interested in.
You can use LINQ with XDocument to extract the information. For example, the following code extracts the Name element value from each Response node and prints it out:
var nodes = from response in Doc.Descendants("Response")
            select response.Element("Name").Value;
foreach (var node in nodes)
    Console.WriteLine(node);
Here Doc.Descendants("Response") fetches all the <Response> elements; response.Element("Name") then fetches the <Name> child of each <Response>, and the .Value property returns the text inside it.

Linq to XML get innertext based on attributes

How can I get the inner text of an XML node based on its attribute (of type string)?
My XML file looks like the following.
<?xml version="1.0" encoding="UTF-8" ?>
<office:document-content office:version="1.2">
<office:body>
<office:text>
<table:table table:name="Tabelle6" table:style-name="Tabelle6">
<table:table-row>
<table:table-cell table:style-name="Tabelle6.A1" office:value-type="string">
<text:p text:name="Invoice" text:style-name="P6">0001</text:p>
</table:table-cell>
</table:table-row>
</table:table>
</office:text>
</office:body>
</office:document-content>
I want to get the invoice number (0001) from this xml file.
My code looks like this
var xml = XDocument.Load(filePath);
var query = from item in xml.Elements("text:p")
where (string)item.Attribute("text:name").Value == "Invoice"
select item.Value;
If I execute this, I get an error:
The ':' character, hexadecimal value 0x3A, must not be contained in a name.
It may be relevant that content.xml is part of an extracted .odt file.
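One possible way to approach this, sketched under some assumptions: LINQ to XML does not accept "prefix:name" strings, so names are built by combining an XNamespace with the local name. The sample above omits its xmlns declarations, so the namespace URI below is the one the OpenDocument format normally binds to the text prefix; check it against the real content.xml. The inline document and its root element are made up for this illustration. The sketch also uses Descendants, because Elements on the document only looks at the root element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Assumed URI for the "text" prefix (the usual OpenDocument binding).
XNamespace text = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

// Made-up stand-in for content.xml with the namespace declared.
var xml = XDocument.Parse(@"<root xmlns:text=""urn:oasis:names:tc:opendocument:xmlns:text:1.0"">
  <text:p text:name=""Invoice"" text:style-name=""P6"">0001</text:p>
  <text:p text:style-name=""P6"">other</text:p>
</root>");

// namespace + local name yields an XName; no colon appears in the string.
var query = from item in xml.Descendants(text + "p")
            // The (string) cast yields null when the attribute is missing.
            where (string)item.Attribute(text + "name") == "Invoice"
            select item.Value;

string invoice = query.FirstOrDefault();
Console.WriteLine(invoice); // 0001
```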

How to correctly parse an XML document with arbitrary namespaces

I am trying to parse somewhat standard XML documents that use a schema called MARCXML from various sources.
Here are the first few lines of an example XML file that needs to be handled...
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
<marc:record>
<marc:leader>00925njm 22002777a 4500</marc:leader>
and one without namespace prefixes...
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
<leader>01142cam 2200301 a 4500</leader>
Key point: in order to get the XPaths to resolve further along in the program I have to go through a regex routine to add the namespaces to the NameTable (which doesn't add them by default). This seems unnecessary to me.
Regex xmlNamespace = new Regex("xmlns:(?<PREFIX>[^=]+)=\"(?<URI>[^\"]+)\"", RegexOptions.Compiled);
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xmlRecord);
XmlNamespaceManager nsMgr = new XmlNamespaceManager(xmlDoc.NameTable);
MatchCollection namespaces = xmlNamespace.Matches(xmlRecord);
foreach (Match n in namespaces)
{
    nsMgr.AddNamespace(n.Groups["PREFIX"].ToString(), n.Groups["URI"].ToString());
}
The XPath call looks something like this...
XmlNode leaderNode = xmlDoc.SelectSingleNode(".//" + LeaderNode, nsMgr);
Where LeaderNode is a configurable value and would equal "marc:leader" in the first example and "leader" in the second example.
Is there a better, more efficient way to do this? Note: suggestions for solving this using LINQ are welcome, but I would mainly like to know how to solve this using XmlDocument.
EDIT: I took GrayWizardx's advice and now have the following code...
if (LeaderNode.Contains(":"))
{
    string prefix = LeaderNode.Substring(0, LeaderNode.IndexOf(':'));
    // DocumentElement is the root element; FirstChild would be the XML declaration when one is present.
    XmlNode root = xmlDoc.DocumentElement;
    string nameSpace = root.GetNamespaceOfPrefix(prefix);
    nsMgr.AddNamespace(prefix, nameSpace);
}
Now there's no more dependency on Regex!
If you know that a given element will be in the document (for instance, the root element), you could call GetNamespaceOfPrefix on it.
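A related variation, offered as a sketch rather than as the approach from the answer: because XPath 1.0 treats an unprefixed name as "no namespace", the expression "leader" will not match the second example, where the elements sit in a default namespace. One way around this is to bind a prefix of your own choosing to the root element's NamespaceURI and always query with that prefix. The helper name FindLeader and the prefix "m" are hypothetical, and the sketch assumes the root element is in the MARCXML namespace in both variants.

```csharp
using System;
using System.Xml;

// Finds the leader element whatever prefix (if any) the document itself uses.
static XmlNode FindLeader(string xml)
{
    var doc = new XmlDocument();
    doc.LoadXml(xml);

    var nsMgr = new XmlNamespaceManager(doc.NameTable);
    // "m" is our own prefix; it need not match the prefix used in the document.
    nsMgr.AddNamespace("m", doc.DocumentElement.NamespaceURI);
    return doc.SelectSingleNode("//m:leader", nsMgr);
}

string prefixed = @"<marc:collection xmlns:marc=""http://www.loc.gov/MARC21/slim"">
  <marc:record><marc:leader>00925njm 22002777a 4500</marc:leader></marc:record>
</marc:collection>";

string unprefixed = @"<collection xmlns=""http://www.loc.gov/MARC21/slim"">
  <record><leader>01142cam 2200301 a 4500</leader></record>
</collection>";

XmlNode first = FindLeader(prefixed);
XmlNode second = FindLeader(unprefixed);

Console.WriteLine(first.InnerText);  // 00925njm 22002777a 4500
Console.WriteLine(second.InnerText); // 01142cam 2200301 a 4500
```

With this arrangement the configurable LeaderNode value could stay the same for both document styles, since the prefix in the XPath is resolved through the namespace manager and not through the document's own declarations.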
