I am importing some data from an XML file which contains a couple of namespaces. I am using nested classes to import data in different level. Unfortunately I cannot access the data in inner classes. I know the problem is regarding namespaces but I do not know how I can resolve them without modifying inner classes. Does anybody know how I can resolve namespaces in XDocument.
This is a part of that XML:
<TransXChange xmlns="http://www.transxchange.org.uk/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:apd="http://www.govtalk.gov.uk/people/AddressAndPersonalDetails"
xsi:schemaLocation="http://www.transxchange.org.uk/ http://www.transxchange.org.uk/schema/2.1/TransXChange_registration.xsd"
xml:lang="en" CreationDateTime="2004-06-09T14:20:00-05:00"
>
<Operators>
<LicensedOperator id="O1" >
<NationalOperatorCode>ABC</NationalOperatorCode>
<OperatorAddresses>
<CorrespondenceAddress>
<apd:Line>45 City Road</apd:Line>
</CorrespondenceAddress>
</OperatorAddresses>
</LicensedOperator>
</Operators>
</TransXChange>
You have to use the namespace that matches the one used in your XML, with Linq to XML you can use XNamespace for that i.e.:
XDocument doc = XDocument.Load("test.xml");
XNamespace apd = "http://www.govtalk.gov.uk/people/AddressAndPersonalDetails";
var firstHit = doc.Descendants(apd + "Line").First();
You need to tell the parser how to reference the namespace:
XmlNamespaceManager nsmgr = new XmlNamespaceManager(xmlElem.OwnerDocument.NameTable);
nsmgr.AddNamespace("apd", "http://[PATH_TO_NAMESPACE_DOC]");
That should do it
Related
How does XPath deal with XML namespaces?
If I use
/IntuitResponse/QueryResponse/Bill/Id
to parse the XML document below I get 0 nodes back.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<IntuitResponse xmlns="http://schema.intuit.com/finance/v3"
time="2016-10-14T10:48:39.109-07:00">
<QueryResponse startPosition="1" maxResults="79" totalCount="79">
<Bill domain="QBO" sparse="false">
<Id>=1</Id>
</Bill>
</QueryResponse>
</IntuitResponse>
However, I'm not specifying the namespace in the XPath (i.e. http://schema.intuit.com/finance/v3 is not a prefix of each token of the path). How can XPath know which Id I want if I don't tell it explicitly? I suppose in this case (since there is only one namespace) XPath could get away with ignoring the xmlns entirely. But if there are multiple namespaces, things could get ugly.
XPath 1.0/2.0
Defining namespaces in XPath (recommended)
XPath itself doesn't have a way to bind a namespace prefix with a namespace. Such facilities are provided by the hosting library.
It is recommended that you use those facilities and define namespace prefixes that can then be used to qualify XML element and attribute names as necessary.
Here are some of the various mechanisms which XPath hosts provide for specifying namespace prefix bindings to namespace URIs.
(OP's original XPath, /IntuitResponse/QueryResponse/Bill/Id, has been elided to /IntuitResponse/QueryResponse.)
C#:
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("i", "http://schema.intuit.com/finance/v3");
XmlNodeList nodes = el.SelectNodes(#"/i:IntuitResponse/i:QueryResponse", nsmgr);
Google Docs:
Unfortunately, IMPORTXML() does not provide a namespace prefix binding mechanism. See next section, Defeating namespaces in XPath, for how to use local-name() as a work-around.
Java (SAX):
NamespaceSupport support = new NamespaceSupport();
support.pushContext();
support.declarePrefix("i", "http://schema.intuit.com/finance/v3");
Java (XPath):
xpath.setNamespaceContext(new NamespaceContext() {
public String getNamespaceURI(String prefix) {
switch (prefix) {
case "i": return "http://schema.intuit.com/finance/v3";
// ...
}
});
Remember to call
DocumentBuilderFactory.setNamespaceAware(true).
See also:
Java XPath: Queries with default namespace xmlns
JavaScript:
See Implementing a User Defined Namespace Resolver:
function nsResolver(prefix) {
var ns = {
'i' : 'http://schema.intuit.com/finance/v3'
};
return ns[prefix] || null;
}
document.evaluate( '/i:IntuitResponse/i:QueryResponse',
document, nsResolver, XPathResult.ANY_TYPE,
null );
Note that if the default namespace has an associated namespace prefix defined, using the nsResolver() returned by Document.createNSResolver() can obviate the need for a customer nsResolver().
Perl (LibXML):
my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs('i', 'http://schema.intuit.com/finance/v3');
my #nodes = $xc->findnodes('/i:IntuitResponse/i:QueryResponse');
Python (lxml):
from lxml import etree
f = StringIO('<IntuitResponse>...</IntuitResponse>')
doc = etree.parse(f)
r = doc.xpath('/i:IntuitResponse/i:QueryResponse',
namespaces={'i':'http://schema.intuit.com/finance/v3'})
Python (ElementTree):
namespaces = {'i': 'http://schema.intuit.com/finance/v3'}
root.findall('/i:IntuitResponse/i:QueryResponse', namespaces)
Python (Scrapy):
response.selector.register_namespace('i', 'http://schema.intuit.com/finance/v3')
response.xpath('/i:IntuitResponse/i:QueryResponse').getall()
PhP:
Adapted from #Tomalak's answer using DOMDocument:
$result = new DOMDocument();
$result->loadXML($xml);
$xpath = new DOMXpath($result);
$xpath->registerNamespace("i", "http://schema.intuit.com/finance/v3");
$result = $xpath->query("/i:IntuitResponse/i:QueryResponse");
See also #IMSoP's canonical Q/A on PHP SimpleXML namespaces.
Ruby (Nokogiri):
puts doc.xpath('/i:IntuitResponse/i:QueryResponse',
'i' => "http://schema.intuit.com/finance/v3")
Note that Nokogiri supports removal of namespaces,
doc.remove_namespaces!
but see the below warnings discouraging the defeating of XML namespaces.
VBA:
xmlNS = "xmlns:i='http://schema.intuit.com/finance/v3'"
doc.setProperty "SelectionNamespaces", xmlNS
Set queryResponseElement =doc.SelectSingleNode("/i:IntuitResponse/i:QueryResponse")
VB.NET:
xmlDoc = New XmlDocument()
xmlDoc.Load("file.xml")
nsmgr = New XmlNamespaceManager(New XmlNameTable())
nsmgr.AddNamespace("i", "http://schema.intuit.com/finance/v3");
nodes = xmlDoc.DocumentElement.SelectNodes("/i:IntuitResponse/i:QueryResponse",
nsmgr)
SoapUI (doc):
declare namespace i='http://schema.intuit.com/finance/v3';
/i:IntuitResponse/i:QueryResponse
xmlstarlet:
-N i="http://schema.intuit.com/finance/v3"
XSLT:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:i="http://schema.intuit.com/finance/v3">
...
Once you've declared a namespace prefix, your XPath can be written to use it:
/i:IntuitResponse/i:QueryResponse
Defeating namespaces in XPath (not recommended)
An alternative is to write predicates that test against local-name():
/*[local-name()='IntuitResponse']/*[local-name()='QueryResponse']
Or, in XPath 2.0:
/*:IntuitResponse/*:QueryResponse
Skirting namespaces in this manner works but is not recommended because it
Under-specifies the full element/attribute name.
Fails to differentiate between element/attribute names in different
namespaces (the very purpose of namespaces). Note that this concern could be addressed by adding an additional predicate to check the namespace URI explicitly:
/*[ namespace-uri()='http://schema.intuit.com/finance/v3'
and local-name()='IntuitResponse']
/*[ namespace-uri()='http://schema.intuit.com/finance/v3'
and local-name()='QueryResponse']
Thanks to Daniel Haley for the namespace-uri() note.
Is excessively verbose.
XPath 3.0/3.1
Libraries and tools that support modern XPath 3.0/3.1 allow the specification of a namespace URI directly in an XPath expression:
/Q{http://schema.intuit.com/finance/v3}IntuitResponse/Q{http://schema.intuit.com/finance/v3}QueryResponse
While Q{http://schema.intuit.com/finance/v3} is much more verbose than using an XML namespace prefix, it has the advantage of being independent of the namespace prefix binding mechanism of the hosting library. The Q{} notation is known as Clark Notation after its originator, James Clark. The W3C XPath 3.1 EBNF grammar calls it a BracedURILiteral.
Thanks to Michael Kay for the suggestion to cover XPath 3.0/3.1's BracedURILiteral.
I use /*[name()='...'] in a google sheet to fetch some counts from Wikidata. I have a table like this
thes WD prop links items
NOM P7749 3925 3789
AAT P1014 21157 20224
and the formulas in cols links and items are
=IMPORTXML("https://query.wikidata.org/sparql?query=SELECT(COUNT(*)as?c){?item wdt:"&$B14&"[]}","//*[name()='literal']")
=IMPORTXML("https://query.wikidata.org/sparql?query=SELECT(COUNT(distinct?item)as?c){?item wdt:"&$B14&"[]}","//*[name()='literal']")
respectively. The SPARQL query happens not to have any spaces...
I saw name() used instead of local-name() in Xml Namespace breaking my xpath!, and for some reason //*:literal doesn't work.
I am struggling with parsing XPathDocument with XPathNavigator where XML document contains xmlns tag, but no namesapce prefix is used.
<?xml version="1.0" encoding="UTF-8"?>
<Order version="1" xmlns="http://example.com">
<Data>
<Id>1</Id>
<b>Aaa</b>
</Data>
</Order>
SelectSingleNode method requires namespace manager with namespace defined.
XmlNamespaceManager nsmgr = new XmlNamespaceManager(nav.NameTable);
nsmgr.AddNamespace("ns","http://example.com");
As a result I have to start using ns prefix in all XPath expressions, although the original XML file does not use prefixes, for e.g.:
nav.SelectSingleNode("/ns:Order/ns:Data/ns:Id", nsmgr)
This adds an unnecessary complexity.
Is there any way to push XPathNavigator to ignore namespace and its prefixes? Ot at least use of prefixes, as namespace definition itself exists only once, so not much effort
The XML has a default namespace. So each and every XML element is bound to it, visible or not.
It is better to use LINQ to XML.
It is a preferred XML API, and available in the .Net Framework since 2007.
c#
void Main()
{
XDocument xdoc = XDocument.Parse(#"<Order version='1' xmlns='http://example.com'>
<Data>
<Id>1</Id>
<b>Aaa</b>
</Data>
</Order>");
XNamespace ns = xdoc.Root.GetDefaultNamespace();
foreach (XElement elem in xdoc.Descendants(ns + "Data"))
{
Console.WriteLine("ID: {0}, b: {1}"
, elem.Element(ns + "Id")?.Value
, elem.Element(ns + "b")?.Value);
}
}
I have a XDocument with the following structure where I want to add a bunch of XElements.
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
<CstmrCdtTrfInitn>
<GrpHdr>
...
</GrpHdr>
<!-- loaded nodes go here -->
<CstmrCdtTrfInitn>
</Document>
The XElements have the following structure:
<PmtInf xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
...
</PmtInf>
The problem is that the namespace in child nodes is not supported at the recipients side and since it is the same as the XDocument's namespace - it is redundant. How do I avoid/remove that namespace on the child nodes?
The code that I use right now:
var childNodes = new XElement(NameSpace + "GrpHdr", ...);
XElement[] loadedNodes = ...;//Loads from a service using XElement.Load
var content = new XElement(NameSpace + "CstmrCdtTrfInitn", childNodes,loadedNodes));
When calling Save on XElement or XDocument, there is a flags enum SaveOptions that allow you to control to some extent how the document is written to XML.
The easiest way to achieve what you want (without traversing the structure to remove the redundant attributes) is to use one of these flags: OmitDuplicateNamespaces.
Remove the duplicate namespace declarations while serializing.
You can see in this fiddle that adding this flag changes my example output from this:
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
<CstmrCdtTrfInitn>
<GrpHdr />
<PmtInf xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">...</PmtInf>
</CstmrCdtTrfInitn>
</Document>
To this:
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
<CstmrCdtTrfInitn>
<GrpHdr />
<PmtInf>...</PmtInf>
</CstmrCdtTrfInitn>
</Document>
I am trying to read the entitysets within the EDMX file from Entity Framework.
The EDMX file (XML format) has the following layout:
<edmx:Edmx Version="3.0" xmlns:edmx="http://schemas.microsoft.com/ado/2009/11/edmx">
<edmx:Runtime>
<edmx:ConceptualModels>
<Schema Namespace="Model" Alias="Self" p1:UseStrongSpatialTypes="false" xmlns:annotation="http://schemas.microsoft.com/ado/2009/02/edm/annotation" xmlns:p1="http://schemas.microsoft.com/ado/2009/02/edm/annotation" xmlns="http://schemas.microsoft.com/ado/2009/11/edm">
<EntityContainer Name="EntityModel" p1:LazyLoadingEnabled="true">
<EntitySet Name="TableName" EntityType="Model.TableName" />
I am using following XPath to get all EntitySet Nodes within the EntityContainer:
/edmx:Edmx/edmx:Runtime/edmx:ConceptualModels/Schema/EntityContainer/EntitySet
but I am getting no result with this C# code:
XmlDocument xdoc = new XmlDocument("pathtoedmx");
var ns = new XmlNamespaceManager(xdoc.NameTable);
ns.AddNamespace("edmx", "http://schemas.microsoft.com/ado/2009/11/edmx");
ns.AddNamespace("annotation", "http://schemas.microsoft.com/ado/2009/02/edm/annotation");
ns.AddNamespace("p1", "http://schemas.microsoft.com/ado/2009/02/edm/annotation");
ns.AddNamespace("", "http://schemas.microsoft.com/ado/2009/11/edm");
var entitySets = xdoc.SelectNodes("/edmx:Edmx/edmx:Runtime/edmx:ConceptualModels/Schema/EntityContainer/EntitySet", ns);
Already got the XPath from this tool (http://qutoric.com/xmlquire/), because I started not trusting my own XPath skills but it tells me the same XPath I was already using.
If I remove the "/Schema/EntityContainer/EntitySet" part its finding the "/edmx:Edmx/edmx:Runtime/edmx:ConceptualModels", but not further on already tried to specify the "edmx" namespace ("edmx:/Schema") but no difference.
Hope you can help me out, already banging my head against the table. :)
Namespaces are a convention on how to combine two different XML dialects into a single document. Those prefixes really doesn't matter as long you keep your URI component exactly the same. For instance, take something like this:
ns.AddNamespace("xxx", "http://schemas.microsoft.com/ado/2009/11/edmx");
Console.WriteLine(xdoc.SelectNodes("/xxx:Edmx", ns).Count); // 1
You'll get one node because your namespace URI matched, despite your "wrong" namespace prefix.
If you have an attribute named xmlns, current element and it's children will inherits that namespace URI.
In your case, your root element doesn't have a default namespace and that's ok. But your Schemas element does have a namespace and you need to inform it. I came with this code:
// change "" to "edm"
ns.AddNamespace("edm", "http://schemas.microsoft.com/ado/2009/11/edm");
var entitySets = xdoc.SelectNodes("/edmx:Edmx/edmx:Runtime/edmx:ConceptualModels/edm:Schema/edm:EntityContainer/edm:EntitySet", ns);
I usually search the web high and low for my answers but this time I am drawing a blank. I'm using VS2005 to write code to POST xml to an API. I have classes setup in C# that I serialize into an XML document. The classes are below:
[Serializable]
[XmlRoot(Namespace = "", IsNullable = false)]
public class Request
{
public RequestIdentify Identify;
public string Method;
public string Params;
}
[Serializable]
public class RequestIdentify
{
public string StoreId;
public string Password;
}
When I serialize this I get:
<?xml version="1.0" encoding="UTF-8"?>
<Request xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Identify>
<StoreId>00</StoreId>
<Password>removed for security</Password>
</Identify>
<Method>ProductExport</Method>
<Params />
</Request>
But the API returns a "No XML sent" error.
If I send the xml directly in a string as:
string xml = #"<Request><Identify><StoreId>00</StoreId><Password>Removed for security</Password></Identify><Method>ProductExport</Method><Params /></Request>";
effectively sending this xml (without the schema info in the "Request" tag):
<Request>
<Identify>
<StoreId>00</StoreId>
<Password>Removed for security</Password>
</Identify>
<Method>ProductExport</Method>
<Params />
</Request>
It seems to recognise the XML no problem.
So my question I guess is how can I change my current classes to Serialize into XML and get the XML as in the second case? I assume I need another "parent" class to wrap around the existing ones and use InnerXml property on this "parent" or something similar but I don't know how to do this.
Apologies for the question, I've only been using C# for 3 months and I'm a trainee developer who is having to teach himself on the job!
Oh and PS I don't know why but VS2005 really does not want to let me set these classes up with private variables and then use getters and setters on public equivalents so I have them written how they are for now.
Thanks in advance :-)
UPDATE:
As with most things it's very hard to find answers if you're not sure what you need to ask or how to word it but:
Once I knew what to look for I found the answers I needed:
Removing the XML declaration:
XmlWriterSettings writerSettings = new XmlWriterSettings();
writerSettings.OmitXmlDeclaration = true;
StringWriter stringWriter = new StringWriter();
using (XmlWriter xmlWriter = XmlWriter.Create(stringWriter, writerSettings))
{
serializer.Serialize(xmlWriter, request);
}
string xmlText = stringWriter.ToString();
Removing/Setting the namespace (Thanks to above replies that helped find this one!):
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add("", "");
Thanks for your help everyone who answered or pointed me in the right direction! And yes I did find articles to read once I knew what I was asking :-) It's the first time I have come unstuck in 3 months of teaching myself so I think I'm doing pretty well...
From Rydal's blog:
The XmlDocument object by default assigns a namespace to the XML string and also includes the declaration as the first line of the XML document. I definitely do not need or use those, therefore, I need to remove them. Here is how you go about doing just that.
Removing declaration and namespaces from XML serialization