C# cannot extract element with xsi:schemaLocation attribute

Please have a look at the following XML. My goal is to extract the values in the interactor elements:
<HPRD3r xmlns="org:hprd:dtd:hprd3r">
<interactions>
<entrySet xsi:schemaLocation="net:sf:psidev:mi http://psidev.sourceforge.net/mi/rel25/src/MIF25.xsd">
<interactionList>
<interactor>
For simplicity, let's assume interactions is a direct child of root.
Set the namespace as follows,
XNamespace ns = "org:hprd:dtd:hprd3r";
The following always returns null although "entrySet" is present:
root.Element(ns+"interactions").Element(ns+"entrySet");
On the other hand,
root.Descendants(ns+"interactor");
does not return null but gives a count of zero, even though there is more than one interactor element in the file.
It seems like the problem is the xsi:schemaLocation attribute on entrySet. Could someone please explain the reasons behind the problems above and how to fix them?
Thanks
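The excerpt does not show enough to be certain of the cause, but an xsi:schemaLocation attribute does not by itself change which namespace an element belongs to. A common cause of these symptoms is that entrySet declares its own default namespace, which its descendants then inherit. The sketch below is a diagnostic under that assumption: the xmlns='net:sf:psidev:mi' declaration on entrySet, the xmlns:xsi declaration, and the interactor contents are hypothetical additions, with the namespace URI taken from the schemaLocation value above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

XNamespace hprd = "org:hprd:dtd:hprd3r";
// Assumption: entrySet declares this default namespace (not shown in the excerpt).
XNamespace mi = "net:sf:psidev:mi";

XElement root = XElement.Parse(
    "<HPRD3r xmlns='org:hprd:dtd:hprd3r'>" +
    "<interactions>" +
    "<entrySet xmlns='net:sf:psidev:mi' " +
    "xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' " +
    "xsi:schemaLocation='net:sf:psidev:mi http://psidev.sourceforge.net/mi/rel25/src/MIF25.xsd'>" +
    "<interactionList><interactor>A</interactor><interactor>B</interactor></interactionList>" +
    "</entrySet></interactions></HPRD3r>");

// Diagnostic: print the namespace each element actually lives in.
foreach (XElement e in root.DescendantsAndSelf())
    Console.WriteLine($"{e.Name.LocalName} -> {e.Name.NamespaceName}");

// Query each element with the namespace it really belongs to.
XElement entrySet = root.Element(hprd + "interactions")?.Element(mi + "entrySet");
int rightCount = root.Descendants(mi + "interactor").Count();
int wrongCount = root.Descendants(hprd + "interactor").Count();

Console.WriteLine(entrySet != null);
Console.WriteLine(rightCount);
Console.WriteLine(wrongCount);
```

Running the foreach loop against the real file shows which XNamespace to use for each element, whatever the actual declarations turn out to be.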

Related

Get all attributes with the same name

I'm using XDocument and I need to parse my XML file to retrieve all attributes with the same name, even if they belong to elements with different names.
For example, for this XML :
<document>
<person name='jame'/>
<animals>
<dog name='robert'/>
</animals>
</document>
I want to retrieve all attributes named 'name'.
Can I do that with a single XPath query, or do I need to parse every node to find those attributes?
Thanks for your help !
The XPath expression
//@name
will select all attributes called name, regardless of where they appear.
By the way, 'parsing' is something that happens to the XML document before XPath ever enters the picture. So when you say "do I need to parse every node", I think this isn't really what you mean. The entire document is typically already parsed before you run an XPath query. However, I'm not sure what you do mean instead of 'parse'. Probably something like "do I need to visit every element" to find those attributes? In which case the answer is no, unless in some vague implementation-dependent sense that doesn't make any difference to you.
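A minimal sketch of running that expression from LINQ to XML, using the sample document from the question. XPathEvaluate returns an object, so the result is cast to a sequence of XAttribute here; the variable names are illustrative only.

```csharp
using System;
using System.Collections;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

XDocument doc = XDocument.Parse(
    "<document><person name='jame'/><animals><dog name='robert'/></animals></document>");

// XPath: select every attribute called "name", wherever it appears.
var names = ((IEnumerable)doc.XPathEvaluate("//@name"))
    .Cast<XAttribute>()
    .Select(a => a.Value)
    .ToList();

// The same query without XPath, using LINQ to XML directly.
var viaLinq = doc.Descendants().Attributes("name").Select(a => a.Value).ToList();

Console.WriteLine(string.Join(", ", names));
Console.WriteLine(string.Join(", ", viaLinq));
```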

Check if element in XML exists that ends with matching string

I can see a way of searching for an element within XML by just going:
if(doc.SelectSingleNode("//mynode")==null)
But what I'm more interested in, is finding an element that matches the part of the name. Something like:
doc.SelectSingleNode ...that contains "table" in it.
So if I had a node called "AlinasTable", I want it to find that. Why it matters is because my node can inconsistently contain anything that comes before "table", like "JohnsTable" - in which case I'd want that to be returned. So something more generic.
Cheers.
You can use the contains function, as in the following XPath expression:
doc.SelectSingleNode("//*[contains(name(), 'Table')]")
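Note that XPath string comparisons are case-sensitive, so 'table' would not match "AlinasTable". Also, contains matches the text anywhere in the name. If the name must end with the text, XPath 1.0 has no ends-with function, but it can be emulated with substring and string-length. The following is a sketch with a made-up document; the element names Tablet and Other are only there to show the difference.

```csharp
using System;
using System.Xml;

var doc = new XmlDocument();
doc.LoadXml("<root><Tablet/><AlinasTable/><Other/></root>");

// contains(): matches "Table" anywhere in the element name.
XmlNode containsMatch = doc.SelectSingleNode("//*[contains(name(), 'Table')]");

// Emulated ends-with: compare the tail of the name with the suffix.
XmlNode endsWithMatch = doc.SelectSingleNode(
    "//*[substring(name(), string-length(name()) - string-length('Table') + 1) = 'Table']");

Console.WriteLine(containsMatch?.Name);
Console.WriteLine(endsWithMatch?.Name);
```

With this sample, the contains query returns Tablet (the first match in document order), while the ends-with emulation returns AlinasTable.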

Xml with Namespace but without xmlns

I am working on a program in C# that edits open-document files on xml level. For example it adds rows to tables.
So I load the content.xml into an XmlDocument "doc" and traverse the xml structure.
Say I have the <table:table-row> node in an XmlNode "row" and now I want to add a <table:table-cell> node to it. So I call
XmlDocument doc = new XmlDocument();
doc.Load(filename);
...
XmlNode row = ...;
...
XmlNode cell = doc.CreateElement("table:table-cell");
row.AppendChild(cell);
...
doc.Save(filename);
The problem is that, in the file, the new node only contains
<table-cell>...</table-cell>
C# just decides to ignore what I told it to and does something else without even telling me (at first I overlooked the problem and was wondering why it didn't work although the generated xml looked okay).
From what I have gathered so far, the problem has to do with the fact that "table:" is a namespace. When I also supply a NamespaceURI to CreateElement, I get
<table:table-cell xmlns:table="THE_URI" >... - but the original document did not have this xmlns, so I don't want it either...
I tried to use an XmlTextWriter and setting writer.Settings.Namespaces = false, because I thought, this should suppress the output of the xmlns, but it only caused an exception - the document has some namespaces, which are forbidden if Namespaces is set to false... (wtf!? suppressing the output of xmlns seems a billion times more logical than throwing an exception if an xmlns is present...)
In some similar discussions I read that you should set the cell.Name manually, but this property is read-only...
Others suggest to change it on text-file level (that's tinkering and it would be slow)
Can anyone give me a hint?
Every namespace should have at least one xmlns definition with a URI. This is the ultimate differentiation between two tags.
You can however have the xmlns attribute declared only once in the file (in the beginning).
See
Creating a specific XML document using namespaces in C#
The table: parts are not namespaces. They are "namespace prefixes". They are an alias for the actual namespace. They must be declared somewhere. If they are not declared at all in your source XML, then it is not namespace-well-formed XML, and you shouldn't expect to be able to process it.
Are you sure that what you have loaded is the entire XML document? They haven't left off parts to make it simpler? Those parts may be the ones that contain the definition of table:.
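As a sketch of how this usually plays out: if the prefix is declared on an ancestor (in OpenDocument files it is normally declared on the root element), the new element can be created with the three-argument CreateElement overload, using the URI looked up from the existing row. The tiny document below is a stand-in for content.xml, and its structure is an assumption for illustration.

```csharp
using System;
using System.Xml;

var doc = new XmlDocument();
// Stand-in for content.xml: the "table" prefix is declared once, on the root.
doc.LoadXml(
    "<table:table xmlns:table='urn:oasis:names:tc:opendocument:xmlns:table:1.0'>" +
    "<table:table-row/></table:table>");

XmlNode row = doc.DocumentElement.FirstChild;

// Look up the URI the prefix is bound to instead of hard-coding it.
string tableNs = row.GetNamespaceOfPrefix("table");

// Prefix, local name and namespace URI are given separately.
XmlElement cell = doc.CreateElement("table", "table-cell", tableNs);
row.AppendChild(cell);

Console.WriteLine(doc.OuterXml);
```

When the whole document is serialized, the declaration already in scope is reused, so the new cell should not get a second xmlns:table attribute. Serializing the cell on its own (for example with cell.OuterXml) does include the declaration, because the fragment has to be self-contained.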

Xelement.XPathSelectElement

I have the following XPath String
"(//DEAL_SETS/DEAL_SET/DEALS/DEAL/PARTIES/PARTY[ROLES/ROLE/
ROLE_DETAIL/PartyRoleType='Borrower'])
[(ROLES/PARTY_ROLE_IDENTIFIERS/PARTY_ROLE_IDENTIFIER/
PartyRoleIdentifier='1' or position() = 1)]/ROLES/ROLE/BORROWER/
RESIDENCES/RESIDENCE/ADDRESS/PostalCode[../../RESIDENCE_DETAIL/
BorrowerResidencyType='Current']"
which works when I put it in Altova XML Spy and gives me a result.
But when I pass it as-is to XElement.XPathSelectElement(XPath), it does not work. What does work is XElement.XPathSelectElement(collective, nameSpaceManager), where collective is my XPath string with a namespace prefix (say, "ns") added.
But the problem is I have to change XPath string to something like this
"(//ns:DEAL_SETS/ns:DEAL_SET/ns:DEALS/ns:DEAL/ns:PARTIES/ns:PARTY[ns:ROLES/
ns:ROLE/ns:ROLE_DETAIL/ns:PartyRoleType='Borrower'])[(ns:ROLES/
ns:PARTY_ROLE_IDENTIFIERS/ns:PARTY_ROLE_IDENTIFIER/
ns:PartyRoleIdentifier='1' or position() = 1)]/ns:ROLES/
ns:ROLE/ns:BORROWER/ns:RESIDENCES/ns:RESIDENCE/
ns:ADDRESS/ns:PostalCode[../../ns:RESIDENCE_DETAIL/
ns:BorrowerResidencyType='Current']"
Is there any way to avoid having to put the namespace prefix (ns:) on each node?
Sorry for not posting the sample XML earlier. You may replicate the PARTY tag and its elements and fill them with dummy data. PartyRoleType tells me what kind of party it is (borrower, appraiser, etc.). PartyRoleIdentifier lets me differentiate between two parties with the same PartyRoleType; for example, if there are two borrowers, PartyRoleIdentifier distinguishes them as 1 and 2.
<?xml version="1.0" encoding="utf-8"?>
<MESSAGE xmlns="http://www.example.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xxxxReferenceModelIdentifier="3.0.0.263.12" xsi:schemaLocation="http://www.example.org/schemas C:\Subversion\xxx_3_0.xsd">
<DEAL_SETS>
<DEAL_SET>
<DEALS>
<DEAL>
<PARTIES>
<PARTY>
<ROLES>
<ROLE>
<PARTY_ROLE_IDENTIFIERS>
<PARTY_ROLE_IDENTIFIER>
<PartyRoleIdentifier>1</PartyRoleIdentifier>
</PARTY_ROLE_IDENTIFIER>
</PARTY_ROLE_IDENTIFIERS>
<BORROWER>
<RESIDENCES>
<RESIDENCE>
<ADDRESS>
<PostalCode>56236</PostalCode>
</ADDRESS>
</RESIDENCE>
</RESIDENCES>
</BORROWER>
<ROLE_DETAIL>
<PartyRoleType>Borrower</PartyRoleType>
</ROLE_DETAIL>
</ROLE>
</ROLES>
</PARTY>
</PARTIES>
</DEAL>
</DEALS>
</DEAL_SET>
</DEAL_SETS>
</MESSAGE>
According to the MSDN documentation, when you call AddNamespace for your namespace manager, you can use String.Empty as the prefix for your namespace URI. That should enable you to do an XPath expression without all those ns: prefixes.
Update:
OK, I remember encountering this issue before. The behavior of empty namespace prefixes in XPath expressions is not consistent with the rest of the .NET XML suite. There is some evidence that some Microsoft folks are aware of the issue. See this forum thread. I'll look for a while more for solutions, but I'm thinking that you'll have to make do with using namespace prefixes.
Update 2:
Feel free to vote on this Microsoft Connect Bug that I've just created. I see that they have closed similar bugs in the past, but none of them focused on usability concerns before. Maybe we'll get some traction with more votes.
Update 3:
Just some clarifications, my filed bug is not and never was about compliance with XPath standards, but rather about usability. No matter the standards, users need a way to make XPath queries with namespaced XML more succinct, and that is currently not supported. As far as standards support goes, XSLT 2.0 has a way to do XPath 2.0 with empty namespace prefixes associated with a namespace URI, and XMLSpy has that capability as well. The MSDN documentation claims that .NET has XPath 2.0 support, but the veracity of that claim is under dispute. One could argue that .NET is properly implementing XPath 1.0 by disallowing the modification of what an empty prefix represents.
The currently accepted answer is wrong!
Any compliant XPath processor interprets a non-prefixed name to belong to "no-namespace".
This is the biggest FAQ in XPath:
There is no way in XPath to select elements in a default namespace, by using their un-prefixed name.
The reason is that XPath considers any unprefixed name to reside in "no namespace", so it looks for elements with that name that are in "no namespace". No such elements are found, because all the elements in the document are in a non-empty default namespace.
Here is a precise quote from the W3C XPath 1.0 specification:
"if the QName does not have a prefix, then the namespace URI is null".
There are two main solutions:
Using the XPath-related API available in your programming language, register an association between the default namespace and a string prefix (let's say "x"). Then modify your expression and replace any unprefixed someElementName with x:someElementName. In .NET, to register the association, an XmlNamespaceManager object typically has to be created, and its AddNamespace() method needs to be used for registering the association. Then, when calling the XPath-selecting method, the XmlNamespaceManager instance is passed as an additional argument.
If one wants to avoid registering the default namespace, one workaround is, instead of:
..
/a/b/c
to use:
/*[name()='a']/*[name()='b']/*[name()='c']
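Both solutions can be sketched with LINQ to XML as follows. The document is a cut-down, hypothetical version of the sample above (the text "demo" is made up), and the prefix "x" is an arbitrary choice that only has to match between AddNamespace and the expression.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

XDocument doc = XDocument.Parse(
    "<MESSAGE xmlns='http://www.example.com'>" +
    "<DEAL_SETS><DEAL_SET>demo</DEAL_SET></DEAL_SETS></MESSAGE>");

// Unprefixed names mean "no namespace" in XPath 1.0, so this finds nothing.
XElement unprefixed = doc.XPathSelectElement("/MESSAGE/DEAL_SETS/DEAL_SET");

// Solution 1: bind a prefix to the default namespace and use it in every step.
var nsManager = new XmlNamespaceManager(new NameTable());
nsManager.AddNamespace("x", "http://www.example.com");
XElement prefixed = doc.XPathSelectElement("/x:MESSAGE/x:DEAL_SETS/x:DEAL_SET", nsManager);

// Solution 2: match on name() so no namespace registration is needed.
XElement byName = doc.XPathSelectElement(
    "/*[name()='MESSAGE']/*[name()='DEAL_SETS']/*[name()='DEAL_SET']");

Console.WriteLine(unprefixed == null);
Console.WriteLine(prefixed?.Value);
Console.WriteLine(byName?.Value);
```

The name() test compares the qualified name as written in the document, so it only works when the elements carry no prefix; local-name() is the usual substitute when they might.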

Parsing XML: Colon in my element causes XPath to miss it

I have an XML document that I load in and try to search with XPath. The root node in this file is <t:Transmission xmlns:t='urn:InboundShipment'> and the file end is properly closed with </t:Transmission>.
My problem is that I cannot walk the tree without using a descendant axis. In other words, I can do: SelectSingleNode("//TransactionHeader[SHIPPERSTATE='CA']") and get a node in return. But I cannot do what should be the equivalent: SelectSingleNode("/Transmission/TransmissionBody/Transaction/TransactionHeader[SHIPPERSTATE='CA']")
If I remove the t: I can do an XPath search on /Transmission and get the whole file. With the t: in there I just get null. Or if I try SelectSingleNode("t:Transmission") I get an error with my XPath statement.
I generally do not need to query the root element, so I should be able to make do with just using the descendant axis for my searches. But the XML looks valid to me and so I'd like to know how to address this. Plus I don't want to ask the client to remove "t:" just because I don't know how to deal with it.
The "t:" is a namespace prefix, which is bound to the namespace 'urn:InboundShipment'. In order to properly handle it, you have to tell C# what the prefix is bound to. This page should explain how to use System.Xml.XmlNamespaceManager to handle the namespace.
Edit: See this answer, as well.
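A minimal sketch of that approach with XmlDocument. The element structure is inferred from the XPath in the question, and it assumes the child elements are unprefixed and in no namespace, which is consistent with //TransactionHeader already working there.

```csharp
using System;
using System.Xml;

var doc = new XmlDocument();
doc.LoadXml(
    "<t:Transmission xmlns:t='urn:InboundShipment'>" +
    "<TransmissionBody><Transaction><TransactionHeader>" +
    "<SHIPPERSTATE>CA</SHIPPERSTATE>" +
    "</TransactionHeader></Transaction></TransmissionBody>" +
    "</t:Transmission>");

// Tell the XPath engine what the "t" prefix is bound to.
var nsManager = new XmlNamespaceManager(doc.NameTable);
nsManager.AddNamespace("t", "urn:InboundShipment");

// Only the root is in the namespace, so only the root step needs the prefix.
XmlNode header = doc.SelectSingleNode(
    "/t:Transmission/TransmissionBody/Transaction/TransactionHeader[SHIPPERSTATE='CA']",
    nsManager);

Console.WriteLine(header?.Name);
```

The prefix used in the XPath expression does not have to be the same one used in the file; what matters is that it is registered against the same namespace URI.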
