How to use XPath with XDocument? - C#

There is a similar question, but it seems that the solution didn't work out in my case: Weirdness with XDocument, XPath and namespaces
Here is the XML I am working with:
<?xml version="1.0" encoding="utf-8"?>
<Report Id="ID1" Type="Demo Report" Created="2011-01-01T01:01:01+11:00" Culture="en" xmlns="http://demo.com/2011/demo-schema">
<ReportInfo>
<Name>Demo Report</Name>
<CreatedBy>Unit Test</CreatedBy>
</ReportInfo>
</Report>
And below is the code that I thought should work, but it doesn't:
XDocument xdoc = XDocument.Load(@"C:\SampleXML.xml");
XmlNamespaceManager xnm = new XmlNamespaceManager(new NameTable());
xnm.AddNamespace(String.Empty, "http://demo.com/2011/demo-schema");
Console.WriteLine(xdoc.XPathSelectElement("/Report/ReportInfo/Name", xnm) == null);
Does anyone have any ideas?
Thanks.

If you have XDocument it is easier to use LINQ-to-XML:
var document = XDocument.Load(fileName);
var name = document.Descendants(XName.Get("Name", @"http://demo.com/2011/demo-schema")).First().Value;
If you are sure that XPath is what you need:
using System.Xml.XPath;
var document = XDocument.Load(fileName);
var namespaceManager = new XmlNamespaceManager(new NameTable());
namespaceManager.AddNamespace("empty", "http://demo.com/2011/demo-schema");
var name = document.XPathSelectElement("/empty:Report/empty:ReportInfo/empty:Name", namespaceManager).Value;

XPath 1.0, which is what MS implements, does not have the idea of a default namespace. So try this:
XDocument xdoc = XDocument.Load(@"C:\SampleXML.xml");
XmlNamespaceManager xnm = new XmlNamespaceManager(new NameTable());
xnm.AddNamespace("x", "http://demo.com/2011/demo-schema");
Console.WriteLine(xdoc.XPathSelectElement("/x:Report/x:ReportInfo/x:Name", xnm) == null);
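To make the difference concrete, here is a minimal self-contained sketch that runs both the unprefixed and the prefixed path against the same document. The XML is inlined (a trimmed copy of the question's sample) rather than loaded from a file, and the prefix "x" is an arbitrary choice; only the namespace URI has to match the document.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

// Trimmed copy of the XML from the question, inlined so the sketch is self-contained.
const string xml = @"<Report Id=""ID1"" xmlns=""http://demo.com/2011/demo-schema"">
  <ReportInfo>
    <Name>Demo Report</Name>
    <CreatedBy>Unit Test</CreatedBy>
  </ReportInfo>
</Report>";

XDocument xdoc = XDocument.Parse(xml);

var xnm = new XmlNamespaceManager(new NameTable());
// "x" is an arbitrary prefix; it does not have to appear in the document.
xnm.AddNamespace("x", "http://demo.com/2011/demo-schema");

// Unprefixed names in XPath 1.0 mean "no namespace", so nothing matches.
XElement unprefixed = xdoc.XPathSelectElement("/Report/ReportInfo/Name", xnm);

// Prefixed names are resolved through the namespace manager and match.
XElement prefixed = xdoc.XPathSelectElement("/x:Report/x:ReportInfo/x:Name", xnm);

Console.WriteLine(unprefixed == null);  // True
Console.WriteLine(prefixed?.Value);     // Demo Report
```

The prefix used in the XPath expression is local to the query, so the document itself can keep using an unprefixed default namespace.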

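You can use the example from Microsoft, which queries without a namespace:
using System.Xml.Linq;
using System.Xml.XPath;
var e = xdoc.XPathSelectElement("./Report/ReportInfo/Name");
That should do it when the elements are in no namespace. If the document declares a default namespace, as in the question, an unprefixed path like this returns null.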

To work without writing a prefix for the default namespace, I automatically expand the path.
Usage: SelectElement(xdoc.Root, "/Report/ReportInfo/Name");
private static XElement SelectElement(XElement startElement, string xpathExpression, XmlNamespaceManager namespaceManager = null) {
// XPath 1.0 does not have support for default namespace, so we have to expand our path.
if (namespaceManager == null) {
var reader = startElement.CreateReader();
namespaceManager = new XmlNamespaceManager(reader.NameTable);
}
var defaultNamespace = startElement.GetDefaultNamespace();
var defaultPrefix = namespaceManager.LookupPrefix(defaultNamespace.NamespaceName);
if (string.IsNullOrEmpty(defaultPrefix)) {
defaultPrefix = "ᆞ";
namespaceManager.AddNamespace(defaultPrefix, defaultNamespace.NamespaceName);
}
xpathExpression = AddPrefix(xpathExpression, defaultPrefix);
var selected = startElement.XPathSelectElement(xpathExpression, namespaceManager);
return selected;
}
private static string AddPrefix(string xpathExpression, string prefix) {
// Implementation notes:
// * not perfect, but it works for our use case.
// * supports: "Name~~" "~~/Name~~" "~~@Name~~" "~~[Name~~" "~~[@Name~~"
// * does not work in complex expressions like //*[local-name()="HelloWorldResult" and namespace-uri()='http://tempuri.org/']/text()
// * does not exclude strings like 'string' or function like func()
var s = Regex.Replace(xpathExpression, @"(?<a>/|\[@|@|\[|^)(?<name>\w(\w|[-])*)", "${a}${prefix}:${name}".Replace("${prefix}", prefix));
return s;
}
If anyone has a better solution to find element and attribute names, feel free to change this post.

Related

C# XML querying using XPath containing a namespace

In C# I'm struggling to understand how to query XML using XPath that includes a namespace.
XML:
<?xml version="1.0" encoding="utf-8"?>
<SomeEntity xmlns="http://www.example.com/Schemas/SomeEntity/2023/01">
<Child>
<TextValue>Some text</TextValue>
</Child>
</SomeEntity>
XPath:
/SomeEntity[@xmlns="http://www.example.com/Schemas/SomeEntity/2023/01"]/Child/TextValue
C#:
XmlNodeList nodes = doc.SelectNodes(xpath); // doc being of type XmlDocument
SelectNodes always results in an empty XmlNodeList.
What's the best way in C# to resolve XPath queries that include a namespace in this way?
I get the same result when using XDocument:
var doc = XDocument.Parse(xml);
string xpath = @"/SomeEntity[@xmlns=""http://www.example.com/Schemas/SomeEntity/2023/01""]/Child/TextValue";
var results = doc.XPathSelectElements(xpath);
It is better to use LINQ to XML.
There is no need to hardcode the default namespace. The GetDefaultNamespace() call gets it for you.
void Main()
{
XDocument xdoc = XDocument.Parse(@"<SomeEntity xmlns='http://www.example.com/Schemas/SomeEntity/2023/01'>
<Child>
<TextValue>Some text</TextValue>
</Child>
</SomeEntity>");
XNamespace ns = xdoc.Root.GetDefaultNamespace();
string TextValue = xdoc.Descendants(ns + "TextValue").FirstOrDefault()?.Value;
Console.WriteLine("TextValue='{0}'", TextValue);
}
Output
TextValue='Some text'
Querying with a condition on the xmlns attribute (@xmlns) does not work, because XPath does not expose namespace declarations as attributes. Instead, you need namespace-uri().
Pre-requisites:
Must add the XML namespace for query.
Provide the namespace prefix in the query.
For XmlDocument
string xpath = @"//se:SomeEntity[namespace-uri()='http://www.example.com/Schemas/SomeEntity/2023/01']/se:Child/se:TextValue";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
var mgr = new XmlNamespaceManager(doc.NameTable);
mgr.AddNamespace("se", "http://www.example.com/Schemas/SomeEntity/2023/01");
var nodes = doc.DocumentElement.SelectNodes(xpath, mgr);
Console.WriteLine(nodes.Count);
Console.WriteLine(nodes.Item(0).FirstChild.Value);
For XDocument
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;
string xpath = @"//se:SomeEntity[namespace-uri()='http://www.example.com/Schemas/SomeEntity/2023/01']/se:Child/se:TextValue";
var xDoc = XDocument.Parse(xml);
// XDocument has no NameTable property, so create a fresh one for the manager.
var mgr = new XmlNamespaceManager(new NameTable());
mgr.AddNamespace("se", "http://www.example.com/Schemas/SomeEntity/2023/01");
var results = xDoc.XPathSelectElements(xpath, mgr);
Console.WriteLine(results.FirstOrDefault()?.Value);

Process IMSManifest.xml with XPath and C#

I am unfamiliar with XPath and I'm looking for guidance on simply selecting three values from a file: schemaversion, title and description.
The XPath expression //title/langstring only matches the element value when I strip out the namespace information from <manifest> and <lom>.
What is the correct way to search the contents for these values?
Unit test:
[Test]
public void TitleIsNotNull()
{
var manifestManager = new ManifestManager("imsmanifest.xml");
// Code which initializes object and calls GetTitle() is encapsulated.
Assert.IsNotNullOrEmpty(manifestManager.Title);
}
System Under Test:
private string GetTitle()
{
var document = XElement.Parse(_contents);
const string XpathExpression = "//title/langstring";
return (string)document.XPathSelectElement(XpathExpression);
}
_contents (excerpted):
<?xml version="1.0" encoding="utf-8"?>
<manifest xsi:schemaLocation="http://www.imsproject.org/xsd/imscp_rootv1p1p2
imscp_rootv1p1p2.xsd
http://www.imsglobal.org/xsd/imsmd_rootv1p2p1 imsmd_rootv1p2p1.xsd
http://www.adlnet.org/xsd/adlcp_rootv1p2 adlcp_rootv1p2.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:adlcp="http://www.adlnet.org/xsd/adlcp_rootv1p2"
xmlns="http://www.imsproject.org/xsd/imscp_rootv1p1p2"
version="1.0"
identifier="ExampleIdGoesHere">
<metadata>
<schema>ADL SCORM</schema>
<schemaversion>1.2</schemaversion>
<lom xsi:schemaLocation="http://www.imsglobal.org/xsd/imsmd_rootv1p2p1
imsmd_rootv1p2p1.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.imsglobal.org/xsd/imsmd_rootv1p2p1">
<general>
<title>
<langstring xml:lang="x-none">Example title goes here.</langstring>
</title>
<description>
<langstring xml:lang="x-none">Example description goes here.</langstring>
</description>
</general>
</lom>
</metadata>
Tweaked Code Based on Steven Doggart's solution
//Revised
private string GetTitle()
{
var xmlReader = GetXmlReader();
var document = XElement.Load(xmlReader);
var xmlNamespaceManager = GetXmlNamespaceManager(xmlReader);
const string XpathExpression = "//y:title/y:langstring";
return (string)document.XPathSelectElement(XpathExpression, xmlNamespaceManager);
}
//Private Helpers
private XmlReader GetXmlReader()
{
var contents = new StringReader(_contents);
var xmlReader = XmlReader.Create(contents);
return xmlReader;
}
private XmlNamespaceManager GetXmlNamespaceManager(XmlReader xmlReader)
{
if (xmlReader.NameTable != null)
{
var xmlNamespaceManager = new XmlNamespaceManager(xmlReader.NameTable);
xmlNamespaceManager.AddNamespace("x", "http://www.imsproject.org/xsd/imscp_rootv1p1p2");
xmlNamespaceManager.AddNamespace("y", "http://www.imsglobal.org/xsd/imsmd_rootv1p2p1");
return xmlNamespaceManager;
}
return null;
}
The problem you are having is that the elements you are trying to select belong to a namespace, but you are not specifying that namespace when you select them. With XPath 1.0, there is no way to specify a default namespace: a name without a prefix always means no namespace at all. In the XML document, <manifest> declares the default namespace "http://www.imsproject.org/xsd/imscp_rootv1p1p2", which covers schemaversion, but <lom> redeclares the default namespace as "http://www.imsglobal.org/xsd/imsmd_rootv1p2p1", so title and langstring belong to that second namespace. Therefore, to select those elements, you have to register a prefix for the namespace and use it in the expression, like this:
XmlNamespaceManager namespaceManager = new XmlNamespaceManager(nameTable);
namespaceManager.AddNamespace("x", "http://www.imsglobal.org/xsd/imsmd_rootv1p2p1");
const string XpathExpression = "//x:title/x:langstring";
return (string)document.XPathSelectElement(XpathExpression, namespaceManager);
The trick, however, is getting the XmlNameTable to give to the XmlNamespaceManager. Unfortunately, the XElement class does not provide a way to get the XmlNameTable for the document, so your best bet would be to load it via an XmlReader, which can provide that, like this:
XmlReader reader = XmlReader.Create(new StringReader(_contents));
XElement document = XElement.Load(reader);
XmlNamespaceManager namespaceManager = new XmlNamespaceManager(reader.NameTable);
namespaceManager.AddNamespace("x", "http://www.imsglobal.org/xsd/imsmd_rootv1p2p1");
const string XpathExpression = "//x:title/x:langstring";
return (string)document.XPathSelectElement(XpathExpression, namespaceManager);
Alternatively, you could use XmlDocument which is slightly easier when dealing with namespaces. Or, you could also choose to use LINQ to select the element instead of XPath.
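As a rough illustration of the LINQ option, the sketch below reads the three values the question asks for (schemaversion, title and description) with LINQ to XML. It assumes a trimmed stand-in for the manifest that keeps the same two default namespaces; the variable names cp and md are made up for this example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed stand-in for the manifest in the question (same two default namespaces).
const string contents = @"<manifest xmlns=""http://www.imsproject.org/xsd/imscp_rootv1p1p2"">
  <metadata>
    <schemaversion>1.2</schemaversion>
    <lom xmlns=""http://www.imsglobal.org/xsd/imsmd_rootv1p2p1"">
      <general>
        <title><langstring>Example title goes here.</langstring></title>
        <description><langstring>Example description goes here.</langstring></description>
      </general>
    </lom>
  </metadata>
</manifest>";

// Hypothetical names: cp covers <manifest> and its children, md covers <lom> and below.
XNamespace cp = "http://www.imsproject.org/xsd/imscp_rootv1p1p2";
XNamespace md = "http://www.imsglobal.org/xsd/imsmd_rootv1p2p1";

XElement manifest = XElement.Parse(contents);

// The explicit (string) cast on XElement yields null instead of throwing when nothing matched.
string schemaVersion = (string)manifest.Descendants(cp + "schemaversion").FirstOrDefault();
string title = (string)manifest.Descendants(md + "title").Elements(md + "langstring").FirstOrDefault();
string description = (string)manifest.Descendants(md + "description").Elements(md + "langstring").FirstOrDefault();

Console.WriteLine($"{schemaVersion} | {title} | {description}");
```

No XmlNameTable or XmlNamespaceManager is involved here, because an XName already carries its namespace.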

How to Linq2Xml a webservice?

I'm calling a web service with the HttpWebRequest.Create method and reading the response with a StreamReader (the method below does this job):
private string ReadWebMethod(string address)
{
var myRequest = (HttpWebRequest)HttpWebRequest.Create(address);
myRequest.Method = "POST";
using (var responseReader = new StreamReader(myRequest.GetResponse().GetResponseStream()))
return responseReader.ReadToEnd();
}
This method works well and outputs a string like this:
<ArrayOfAppObject xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://tempuri.org/">
<AppObject>
<Name>MyApp</Name>
<Image>StoreApp.png</Image>
<Version>SM2.1.0</Version>
</AppObject>
</ArrayOfAppObject>
Now I want to look inside this string, so I use XElement to parse it (code below):
XElement x = XElement.Parse(ReadWebMethod(address));
Now, when I foreach over x.Elements("AppObject"), it doesn't return any items and the foreach is skipped.
My foreach is like this:
foreach (var item in x.Elements("AppObject"))
{
listApplication.Add(new AppObject { Image = item.Element("Image").Value, Name = item.Element("Name").Value, Version = item.Element("Version").Value });
}
I created a local string, removed the attributes after "ArrayOfAppObject" (xmlns:xsi="htt..., which I have seen called the XML namespace), and tested again. This time it works and the foreach is not skipped!
So, how can I use the XML with its namespace?
Use the XDocument class:
using System.Xml.Linq;
//...
var xml = ReadWebMethod(address);
var xdocument = System.Xml.Linq.XDocument.Parse(xml);
Update:
Your XML data defines a default namespace: xmlns="http://tempuri.org/"
You must use a full XName with that namespace to access each element:
XName theElementName = XName.Get("AppObject", "http://tempuri.org/");
// or, alternatively, the expanded-name form:
// XName theElementName = XName.Get("{http://tempuri.org/}AppObject");
Here is a sample test method:
[TestMethod]
public void ParseXmlElement()
{
XDocument xdoc = XDocument.Parse(this.mockXml);
XName appTag = XName.Get("{http://tempuri.org/}AppObject");
XName nameTag = XName.Get("{http://tempuri.org/}Name");
XName imageTag = XName.Get("{http://tempuri.org/}Image");
XElement objElement = xdoc.Root.Element(appTag);
Assert.IsNotNull(objElement, "<AppObject> not found");
Assert.AreEqual("{http://tempuri.org/}AppObject", objElement.Name);
XElement nameElement = objElement.Element(nameTag);
Assert.IsNotNull(nameElement, "<Name> not found");
Assert.AreEqual("MyApp", nameElement.Value);
XElement imageElement = objElement.Element(imageTag);
Assert.IsNotNull(imageElement, "<Image> not found");
Assert.AreEqual("StoreApp.png", imageElement.Value);
}
Or with LINQ, this way:
[TestMethod]
public void ParseXmlLinq()
{
XDocument xdoc = XDocument.Parse(this.mockXml);
XElement app = xdoc.Root.Elements()
.FirstOrDefault(e => e.Name == XName.Get("AppObject", "http://tempuri.org/"));
Assert.IsNotNull(app, "<AppObject> not found");
XElement img = app.Elements()
.FirstOrDefault(x => x.Name == XName.Get("Image", "http://tempuri.org/"));
Assert.IsNotNull(img, "<Image> not found");
Assert.AreEqual("StoreApp.png", img.Value);
}
Note that I simply mocked up a string from your XML and parsed it.

Help in parsing XML, simple string - but I can't seem to parse it

I have the following XML:
<iq xmlns="jabber:client" to="39850777771287777738178727@guest.google.com/agsXMPP" xml:lang="en" id="sub23" from="search.google.com" type="result">
<pubsub xmlns="http://jabber.org/protocol/pubsub">
<subscription subscription="subscribed" subid="5077774B57777BD77770" node="search" jid="39850777771287777738178727@guest.google.com/agsXMPP" />
</pubsub>
</iq>
I've tried parsing with LINQ to XML, but it doesn't seem to understand that these are different nodes. It groups the whole iq into a single element.
Can anyone help in parsing this using XML?
The data I want to get is the subid="5077774B57777BD77770" and the id="sub23"
Thanks!
Edit:
Here's the code I have, tried doing it in two ways:
XDocument doc = XDocument.Parse("<xml>" + iq.ToString() + "</xml>");
var results = from feed in doc.Elements("xml")
select new
{
Id = (string)feed.Element("iq").Attribute("id"),
Subid = (string)feed.Element("iq").Element("pubsub").Element("subscription").Attribute("subid")
};
and
var doc = new System.Xml.XmlDocument();
doc.LoadXml(iq.ToString());
var searchId = doc.Attributes["id"];
var subid = doc.SelectSingleNode("/pubsub/subscription").Attributes["subid"];
As Dimitre pointed out, you have a namespace issue. This will work:
using System;
using System.Xml;
namespace XMLTest
{
class Program
{
static void Main(string[] args)
{
XmlDocument doc = new XmlDocument();
XmlNamespaceManager namespaces = new XmlNamespaceManager(doc.NameTable);
namespaces.AddNamespace("ns1", "jabber:client");
namespaces.AddNamespace("ns2", "http://jabber.org/protocol/pubsub");
doc.Load("xmltest.xml");
XmlNode iqNode = doc.SelectSingleNode("/ns1:iq", namespaces);
string ID = iqNode.Attributes["id"].Value;
Console.WriteLine(ID);
XmlNode subscriptionNode = doc.SelectSingleNode("/ns1:iq/ns2:pubsub/ns2:subscription", namespaces);
string subID = subscriptionNode.Attributes["subid"].Value;
Console.WriteLine(subID);
Console.ReadLine();
}
}
}
Read this for an explanation and a complete code example of how to evaluate an XPath expression that contains location steps with nodes whose names are in a default namespace and are unprefixed in the XML document.
I'm not sure if this is what you're after, but it works:
XNamespace jabber = "jabber:client";
XNamespace pubsub = "http://jabber.org/protocol/pubsub";
string xmltext = "<iq xmlns=\"jabber:client\" to=\"39850777771287777738178727@guest.google.com/agsXMPP\" xml:lang=\"en\" id=\"sub23\" from=\"search.google.com\" type=\"result\">\n"
+ "<pubsub xmlns=\"http://jabber.org/protocol/pubsub\">\n"
+ "<subscription subscription=\"subscribed\" subid=\"5077774B57777BD77770\" node=\"search\" jid=\"39850777771287777738178727@guest.google.com/agsXMPP\" />\n"
+ "</pubsub>\n"
+ "</iq>";
XDocument xdoc = XDocument.Parse(xmltext);
var iqelem = xdoc.Element(jabber + "iq");
var id = iqelem.Attribute("id").Value;
var subselem = iqelem.Element(pubsub + "pubsub").Element(pubsub + "subscription");
var subid = subselem.Attribute("subid").Value;
Console.WriteLine("SubId = {0}\nId={1}", subid, id);
I concur with Dimitre: the "empty" xmlns namespace at the top is probably causing the problem. I sometimes strip these out with a regex if they're not used; otherwise, play around with XmlNamespaceManager as described.
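If you would rather not touch the XML or register prefixes, one further option (not taken from the answers above, so treat it as a sketch) is to match on local-name() in the XPath expression, which ignores namespaces entirely. The XML below is a shortened copy of the stanza from the question.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// Shortened copy of the stanza from the question, inlined for a self-contained sketch.
const string xml = @"<iq xmlns=""jabber:client"" id=""sub23"" type=""result"">
  <pubsub xmlns=""http://jabber.org/protocol/pubsub"">
    <subscription subscription=""subscribed"" subid=""5077774B57777BD77770"" node=""search"" />
  </pubsub>
</iq>";

XDocument doc = XDocument.Parse(xml);

// local-name() compares only the unqualified element name, so no namespace manager is needed.
XElement iq = doc.XPathSelectElement("/*[local-name()='iq']");
XElement subscription = doc.XPathSelectElement("//*[local-name()='subscription']");

// Unprefixed attributes are in no namespace, so they can be read by plain name.
string id = (string)iq?.Attribute("id");
string subid = (string)subscription?.Attribute("subid");

Console.WriteLine($"Id={id}, SubId={subid}");
```

The trade-off is that local-name() matches an element of that name in any namespace, so it is less precise than registering prefixes with a namespace manager.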

XPath and *.csproj

I am for sure missing some important detail here. I just cannot make .NET's XPath work with Visual Studio project files.
Let's load an xml document:
var doc = new XmlDocument();
doc.Load("blah/blah.csproj");
Now execute my query:
var nodes = doc.SelectNodes("//ItemGroup");
Console.WriteLine(nodes.Count); // whoops, zero
Of course, there are nodes named ItemGroup in the file. Moreover, this query works:
var nodes = doc.SelectNodes("//*/@Include");
Console.WriteLine(nodes.Count); // found some
With other documents, XPath works just fine.
I am absolutely puzzled by this. Could anyone explain to me what is going on?
You probably need to add a reference to the namespace http://schemas.microsoft.com/developer/msbuild/2003.
I had a similar problem, I wrote about it here. Do something like this:
XmlDocument xdDoc = new XmlDocument();
xdDoc.Load("blah/blah.csproj");
XmlNamespaceManager xnManager =
new XmlNamespaceManager(xdDoc.NameTable);
xnManager.AddNamespace("tu",
"http://schemas.microsoft.com/developer/msbuild/2003");
XmlNode xnRoot = xdDoc.DocumentElement;
XmlNodeList xnlPages = xnRoot.SelectNodes("//tu:ItemGroup", xnManager);
Look at the root namespace; you'll have to include an xml-namespace manager and use queries like "//x:ItemGroup", where "x" is your designated alias for the root namespace. And pass the manager into the query. For example:
XmlDocument doc = new XmlDocument();
doc.Load("my.csproj");
XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
mgr.AddNamespace("foo", doc.DocumentElement.NamespaceURI);
XmlNode firstCompile = doc.SelectSingleNode("//foo:Compile", mgr);
I posted a LINQ / Xml version over at:
http://granadacoder.wordpress.com/2012/10/11/how-to-find-references-in-a-c-project-file-csproj-using-linq-xml/
But here is the gist of it. It may not be 100% perfect......but it shows the idea.
I'm posting the code here, since I found this (original post) when searching for an answer. Then I got tired of searching and wrote my own.
using System;
using System.Linq;
using System.Xml.Linq;
string fileName = @"C:\MyFolder\MyProjectFile.csproj";
XDocument xDoc = XDocument.Load(fileName);
XNamespace ns = XNamespace.Get("http://schemas.microsoft.com/developer/msbuild/2003");
//References "By DLL (file)"
var list1 = from list in xDoc.Descendants(ns + "ItemGroup")
from item in list.Elements(ns + "Reference")
/* where item.Element(ns + "HintPath") != null */
select new
{
CsProjFileName = fileName,
ReferenceInclude = item.Attribute("Include").Value,
RefType = (item.Element(ns + "HintPath") == null) ? "CompiledDLLInGac" : "CompiledDLL",
HintPath = (item.Element(ns + "HintPath") == null) ? string.Empty : item.Element(ns + "HintPath").Value
};
foreach (var v in list1)
{
Console.WriteLine(v.ToString());
}
//References "By Project"
var list2 = from list in xDoc.Descendants(ns + "ItemGroup")
from item in list.Elements(ns + "ProjectReference")
where
item.Element(ns + "Project") != null
select new
{
CsProjFileName = fileName,
ReferenceInclude = item.Attribute("Include").Value,
RefType = "ProjectReference",
ProjectGuid = item.Element(ns + "Project").Value
};
foreach (var v in list2)
{
Console.WriteLine(v.ToString());
}
