I am trying to create an application that outputs the data found at an XPath specified by the user.
XPathDocument xmldoc = new XPathDocument(file);
XPathNavigator nav = xmldoc.CreateNavigator();
XPathNavigator result = nav.SelectSingleNode("//p");
MessageBox.Show(result.Value);
Here the variable file is the location of the XML file.
Now when I run this code on an XML file that has lots of namespaces defined, the above code throws a NullReferenceException, because the variable result is null and I am trying to access result.Value.
But when I created my own XML file
<a>
<b>
<p>abc</p>
</b>
</a>
the code runs fine.
So what I infer is that the problem arises because I am not including the namespaces in the code.
I searched and found the suggestion that the way around namespaces is to use relative XPaths such as //p, in this question:
What is a good way to find a specific value in an XML document using C#?
But the code still does not work on the original file (the one containing the namespaces).
How about:
XPathNavigator result = nav.SelectSingleNode("//*[local-name()='p']");
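A minimal sketch of why the local-name() test helps, assuming a hypothetical inline document whose default namespace is urn:example (the sample XML, the class name LocalNameDemo and its method names are illustrative and not from the question). In XPath 1.0 an unprefixed name such as p only matches elements in no namespace, whereas local-name() compares the name with the namespace ignored.

```csharp
using System.IO;
using System.Xml.XPath;

public static class LocalNameDemo
{
    // Hypothetical sample: <p> lives in the default namespace "urn:example".
    public const string SampleXml =
        "<a xmlns='urn:example'><b><p>abc</p></b></a>";

    // Returns the text of the first element with the given local name,
    // regardless of its namespace, or null when nothing matches.
    // Assumes localName contains no quote characters.
    public static string FirstValueByLocalName(string xml, string localName)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        XPathNavigator hit = nav.SelectSingleNode("//*[local-name()='" + localName + "']");
        return hit == null ? null : hit.Value;
    }

    // For comparison: reports whether the plain "//p" query finds anything.
    public static bool PlainQueryFinds(string xml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        return nav.SelectSingleNode("//p") != null;
    }
}
```

With the sample above, FirstValueByLocalName(SampleXml, "p") finds the element while PlainQueryFinds(SampleXml) does not. The trade-off is that elements with the same local name in different namespaces can no longer be told apart.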
On Stack Overflow there is a document explaining the use of XmlDocument and how to select a node.
C# XmlDocument SelectSingleNode without attribute
The code presented is the code I am using as follows.
XmlDocument doc = new XmlDocument();
doc.Load("C:\\FileXml.xml");
string Version = doc.DocumentElement.SelectSingleNode("/Version").InnerText;
Console.Write(Version); //I want to see 3
The Xml file is shown below "in its entirety".
<CharacterObject xmlns="http://www.w3.org/2005/Atom">
<Version>3</Version>
<Path>C:\\FilePath\FileName.xml</Path>
</CharacterObject>
The problem is that SelectSingleNode above returns null. It returned null when I searched for CharacterObject as well. No matter what XML node I search for, SelectSingleNode returns null. This means I must be using SelectSingleNode incorrectly, but I am not sure how.
I would like SelectSingleNode to return the node so that InnerText returns the version information in the XML node. I am just having trouble reading the information from the nodes.
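According to the documentation, XmlDocument.DocumentElement is the root XML element, so in your case it is already CharacterObject.
Note that an XPath starting with / is absolute: .SelectSingleNode("/Version") asks for a root element named Version, no matter which node you call it on, and there is no such root element. A relative path such as "Version", or the absolute "/CharacterObject", still returns null here, because the document declares a default namespace (xmlns="http://www.w3.org/2005/Atom") and an unprefixed name in XPath only matches elements in no namespace.
If you only need the value, doc.DocumentElement["Version"].InnerText returns 3. Plain DocumentElement.InnerText would return the text of Version and Path concatenated.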
This particular problem has a solution. It appears to be caused by the namespace attribute (xmlns) on the XML root node itself: removing this attribute solved my issue.
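If removing the xmlns attribute from the file is not an option, the usual alternative is to register the namespace with an XmlNamespaceManager and use a prefix in the query. The following is a sketch under assumptions: the prefix "a" is arbitrary, the class name AtomVersionDemo is hypothetical, and the sample string is a trimmed copy of the file from the question.

```csharp
using System.Xml;

public static class AtomVersionDemo
{
    // Trimmed copy of the question's file, kept inline so the sketch is self-contained.
    public const string SampleXml =
        "<CharacterObject xmlns='http://www.w3.org/2005/Atom'>" +
        "<Version>3</Version>" +
        "</CharacterObject>";

    // Reads the Version element while leaving the default namespace in place.
    public static string ReadVersion(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // Bind an arbitrary prefix ("a") to whatever namespace the root element uses.
        // Assumes the root element actually is in a namespace.
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("a", doc.DocumentElement.NamespaceURI);

        XmlNode node = doc.SelectSingleNode("/a:CharacterObject/a:Version", ns);
        return node == null ? null : node.InnerText;
    }
}
```

The prefix used in the XPath does not have to match anything in the file; only the namespace URI it is bound to matters.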
I am working on some code in C# where I want the output as the text present at some xpath from some xml-file. Now as the xml file keeps changing and so do the namespaces, I don't want to hardcode the namespaces in the code.
XmlDocument xml = new XmlDocument();
xml.Load(file);
XmlNamespaceManager nsMgr = new XmlNamespaceManager(xml.NameTable);
Now, as the namespaces keep changing, I am thinking of reading the XML and using some string operations to get the namespace prefix and URI string:
string s = System.IO.File.ReadAllText(file);
string[] s1 = new string[1];
s1[0]="xmlns:";
string[] s3 = s.Split(s1, System.StringSplitOptions.None);
foreach (string k in s3)
{
    // the prefix is everything before '='; the second argument is what I need help with
    nsMgr.AddNamespace(k.Substring(0, k.IndexOf('=')), /* need help for this */);
}
Some sample values of k after the split operation are:
"xs=\"http://www.w3.org/2001/XMLSchema\" "
"location=\"urn:x-ABC:content:location:mastering:1\" "
"entity=\"urn:x-ABC:content:identified-entities:mastering:1\" "
I need help with the second parameter of nsMgr.AddNamespace(). Also, is there a cleaner way of adding namespaces without hardcoding them?
EDIT: Clarifying what I am doing here. I am trying to write a WinForms application through which one can get the text present at some XPath in some XML. The form takes two inputs: one is the XML file location and the other is the XPath. The output should be the text at that XPath location.
For example
<a>
<b>abc
</b>
</a>
If the user searches for //b or a/b then the output should be abc.
The code works fine when there are default namespaces, but when the XML has namespaces defined I need to include them in the code. As I want to make the code generic, I cannot hardcode the namespaces.
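One way to avoid string-splitting the raw file is to let the parser report the namespace declarations. The sketch below walks every element and copies its locally declared prefixes into an XmlNamespaceManager via XPathNavigator.GetNamespacesInScope. It rests on assumptions: the class name NamespaceCollector and the stand-in prefix "def" for a default namespace are hypothetical (XPath 1.0 cannot address a default namespace without a prefix), and if one prefix is bound to different URIs in different parts of the file, the last one seen wins.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class NamespaceCollector
{
    // Hypothetical prefix used in XPath for elements in a default namespace.
    public const string DefaultPrefix = "def";

    // Registers every namespace declared anywhere in the document.
    public static XmlNamespaceManager Collect(XmlDocument doc)
    {
        XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
        XPathNodeIterator elements = doc.CreateNavigator().Select("//*");
        while (elements.MoveNext())
        {
            // Local scope: only the declarations written on this element itself.
            IDictionary<string, string> local =
                elements.Current.GetNamespacesInScope(XmlNamespaceScope.Local);
            foreach (KeyValuePair<string, string> pair in local)
            {
                if (pair.Value.Length == 0) continue; // skip xmlns="" undeclarations
                string prefix = pair.Key.Length == 0 ? DefaultPrefix : pair.Key;
                mgr.AddNamespace(prefix, pair.Value);
            }
        }
        return mgr;
    }

    // Returns the text at the user's XPath, or null when nothing matches.
    public static string ValueAt(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode(xpath, Collect(doc));
        return node == null ? null : node.InnerText;
    }
}
```

With this in place the user can type XPaths using the prefixes that appear in the file, for example //xs:b, or //def:b for a file that only declares a default namespace.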
I am working on a program in C# that edits open-document files on xml level. For example it adds rows to tables.
So I load the content.xml into an XmlDocument "doc" and traverse the xml structure.
Say I have the <table:table-row> node in an XmlNode "row" and now I want to add a <table:table-cell> node to it. So I call
XmlDocument doc = new XmlDocument();
doc.Load(filename);
...
XmlNode row = ...;
...
XmlNode cell = doc.CreateElement("table:table-cell");
row.AppendChild(cell);
...
doc.Save(filename);
The problem is that, in the file, the new node only contains
<table-cell>...</table-cell>
C# just decides to ignore what I told it to and does something else without even telling me (at first I overlooked the problem and was wondering why it didn't work although the generated xml looked okay).
From what I have gathered so far, the problem has to do with the fact that "table:" is a namespace. When I also supply a NamespaceURI to CreateElement, I get
<table:table-cell xmlns:table="THE_URI" >... - but the original document did not have this xmlns, so I don't want it either...
I tried using an XmlTextWriter and setting writer.Namespaces = false, because I thought this would suppress the output of the xmlns, but it only caused an exception: the document has some namespaces, which are forbidden if Namespaces is set to false... (wtf!? suppressing the output of xmlns seems a billion times more logical than throwing an exception if an xmlns is present...)
In some similar discussions I read that you should set cell.Name manually, but this property is read-only...
Others suggest changing it at the text-file level (that's tinkering, and it would be slow).
Can anyone give me a hint?
Every namespace should have at least one xmlns definition with a URI. This is the ultimate differentiation between two tags.
You can however have the xmlns attribute declared only once in the file (in the beginning).
See
Creating a specific XML document using namespaces in C#
The table: parts are not namespaces. They are "namespace prefixes": aliases for the actual namespace URI. They must be declared somewhere. If they are not declared at all in your source XML, then it is not namespace-well-formed XML, and you shouldn't expect to be able to process it.
Are you sure that what you have loaded is the entire XML document? Has anything been left out to make it simpler? Those parts may be the ones that contain the declaration of table:.
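To illustrate the point about prefixes, here is a hedged sketch: when the new element is created with the same prefix and namespace URI that an ancestor already declares, XmlDocument has no need to write a second xmlns attribute on it. The URI urn:example:table and the class name TableCellDemo are placeholders, not the real OpenDocument values; in a real content.xml the URI would be whatever the root element binds to table.

```csharp
using System.Xml;

public static class TableCellDemo
{
    // Placeholder document: the root declares the "table" prefix once.
    public const string SampleXml =
        "<root xmlns:table='urn:example:table'><table:table-row /></root>";

    // Appends a table-cell to the first row and returns the serialized document.
    public static string AppendCell(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode row = doc.DocumentElement.FirstChild;

        // Reuse the row's own prefix and namespace URI instead of hardcoding them.
        XmlElement cell = doc.CreateElement(row.Prefix, "table-cell", row.NamespaceURI);
        row.AppendChild(cell);
        return doc.OuterXml;
    }
}
```

Because the namespace is taken from the existing row node, nothing has to be hardcoded, and the new cell ends up in the same namespace as its siblings.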
Update
I want an expression (XPath, regular expression, or similar) that can match an XML element in a particular namespace. For example, I want to locate the value of the link element (e.g. I need the http://url within <b:link>http://url</b:link>) shown below. However, the namespace prefix varies between XML files, as shown in cases 1-3.
Considering the characters allowed in a namespace prefix (e.g. is any character allowed/valid?), could anyone provide a solution (XPath, regular expression, or similar)?
Please note that because the XML file is unknown, the namespace and prefix are unknown until runtime. Does that mean I cannot use XDocument/XmlDocument, because they require the namespace to be known in the code?
Update
Case 1
<A xmlns:b="link">
<b:link>http://url
</b:link>
</A>
Case 2
<A xmlns="link">
<link>http://url
</link>
</A>
Case 3
<A xmlns:a123="link">
<a123:link>http://url
</a123:link>
</A>
Please note that the url within the link element could be any http url, and unknown until runtime.
Update
Please mark up my question.
You need to know the namespaces you will be dealing with and register them with an XmlNamespaceManager. Here is an example:
XmlDocument doc = new XmlDocument();
doc.LoadXml("<A xmlns:b='link'><b:Books /></A>");
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("b", "link");
XmlNodeList books = doc.SelectNodes("//b:Books", nsmgr);
And if you want to do this using XDocument, which I would recommend for its brevity, here is how:
XDocument xDoc = XDocument.Parse("<A xmlns:b='link'><b:Books /></A>");
XNamespace ns = "link";
var books = xDoc.Descendants(ns + "Books");
If you do not know the namespace(s) ahead of time, see this post which shows how to query across an XDocument using only the local name. Here's an example:
XDocument xDoc = XDocument.Parse("<A xmlns:b='link'><b:Books /></A>");
var books = xDoc.Descendants().Where(e => e.Name.LocalName.ToLower() == "books");
Use an XML parser, not a regex.
That being said, you could use:
<(?:(.+?):)?Books />
And the namespace prefix would be in captured group 1.
In fact, I'd more strongly recommend you use
<(?:([^<>]+?):)?Books />
To prevent mistakes like matching over another set of XML tags (who would use <> in a namespace prefix anyway?!)
I have tons of XML files, all containing the same XML document but with different values. The structure is the same for each file.
Inside this file I have a datetime field.
What is the best, most efficient way to query these XML files? So I can retrieve for example... All files where the datetime field = today's date?
I'm using C# and .net v2. Should I be using XML objects to achieve this or text in file search routines?
Some code examples would be great... or just the general theory, anything would help, thanks...
This depends on the size of those files, and how complex the data actually is. As far as I understand the question, for this kind of XML data, using an XPath query and going through all the files might be the best approach, possibly caching the files in order to lessen the parsing overhead.
Have a look at:
XPathDocument, XmlDocument classes and XPath queries
http://support.microsoft.com/kb/317069
Something like this should do (not tested though):
// requires: using System.IO; using System.Xml; using System.Xml.XPath;
XmlNamespaceManager nsmgr = new XmlNamespaceManager(new NameTable());
// if required, add your namespace prefixes here to nsmgr
XPathExpression expression = XPathExpression.Compile("//element[@date='20090101']", nsmgr); // your query as XPath
foreach (string fileName in Directory.GetFiles("PathToXmlFiles", "*.xml")) {
    XPathDocument doc;
    // share the name table between the reader and the compiled expression
    using (XmlTextReader reader = new XmlTextReader(fileName, nsmgr.NameTable)) {
        doc = new XPathDocument(reader);
    }
    if (doc.CreateNavigator().SelectSingleNode(expression) != null) {
        // matching document found
    }
}
Note: while you can also load an XPathDocument directly from a URI/path, using the reader makes sure that the same name table is used as the one used to compile the XPath query. If a different name table were used, you would not get results from the query.
You might look into running XSL queries. See also XSLT Tutorial, XML transformation using Xslt in C#, How to query XML with an XPath expression by using Visual C#.
This question also relates to another on Stack Overflow: Parse multiple XML files with ASP.NET (C#) and return those with particular element. The accepted answer there, though, suggests using Linq.
If it is at all possible to move to C# 3.0 / .NET 3.5, LINQ-to-XML would be by far the easiest option.
With .NET 2.0, you're stuck with either XML objects or XSL.
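For comparison, a LINQ-to-XML version of the file scan might look like the sketch below (it needs .NET 3.5 or later). The element name element, the attribute name date and the yyyyMMdd value format are borrowed from the XPath example earlier in this thread and are assumptions about the real files; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class DateFileFinder
{
    // Returns the paths of all *.xml files in the directory that contain
    // at least one <element date="..."> whose date attribute equals the given value.
    public static List<string> FilesWithDate(string directory, string date)
    {
        return Directory.GetFiles(directory, "*.xml")
            .Where(path => XDocument.Load(path)
                .Descendants()
                // Compare by local name so the check also works for namespaced files.
                .Any(e => e.Name.LocalName == "element"
                          && (string)e.Attribute("date") == date))
            .ToList();
    }
}
```

Each file is still parsed in full, so for very large collections the cost is dominated by loading the documents rather than by the query itself.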