I'm trying to get rid of empty namespace tags in my XML file. All of the solutions I've seen are based on creating the XML from scratch. I have various XElements constructed from a previous XML document. All I'm doing is
XElement InputNodes = XElement.Parse(InputXML);
m_Command = InputNodes.Element("Command");
and it is adding xmlns="" everywhere. This is really infuriating. Thanks for any help.
There's a post on MSDN blogs that shows how to get around this (reasonably) easily. Before outputting the XML, you'll want to execute this code:
foreach (XElement e in root.DescendantsAndSelf())
{
    if (e.Name.Namespace == string.Empty)
    {
        e.Name = ns + e.Name.LocalName;
    }
}
The alternative, as the poster mentions, is prefixing every element name with the namespace as you add it, but this seems like a nicer solution in that it's more automated and saves a bit of typing.
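A minimal, self-contained sketch of that loop, in case it helps to see it run. The class name NamespaceFixer, the method name AdoptNamespace and the urn:example:commands namespace are made up for illustration; only the loop itself comes from the snippet above.

```csharp
using System;
using System.Xml.Linq;

public static class NamespaceFixer
{
    // Moves every element that is in no namespace into the given namespace,
    // so the serializer has no reason to emit xmlns="" on it.
    public static void AdoptNamespace(XElement root, XNamespace ns)
    {
        foreach (XElement e in root.DescendantsAndSelf())
        {
            if (e.Name.Namespace == XNamespace.None)
            {
                e.Name = ns + e.Name.LocalName;
            }
        }
    }

    public static void Main()
    {
        XNamespace ns = "urn:example:commands"; // hypothetical namespace

        // "Command" and "Arg" are created without a namespace, which is
        // what triggers xmlns="" when the tree is written out.
        var root = new XElement(ns + "Root",
            new XElement("Command", new XElement("Arg", "1")));

        Console.WriteLine(root); // Command carries xmlns="" here
        AdoptNamespace(root, ns);
        Console.WriteLine(root); // all elements now share the root namespace
    }
}
```

Renaming only changes element names, not the tree structure, so doing it while enumerating DescendantsAndSelf() should be safe.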
Possibly it's this: Empty namespace using Linq Xml
This would indicate your document is in a different default namespace than the elements you add.
I think the second answer down on this post:
XElement Add function adds xmlns="" to the XElement
was very useful. Basically if you just do
XNamespace rootNamespace = doc.Root.Name.NamespaceName;
XElement referenceElement = new XElement(rootNamespace + "Reference");
That should solve it. So I guess you have to tell it not to worry about a special namespace when you are creating the element. Odd.
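For what it's worth, here is a small runnable sketch of that approach, assuming a document whose root declares a default namespace. The urn:example:project URI and the RootNamespaceDemo/AddReference names are hypothetical.

```csharp
using System;
using System.Xml.Linq;

public static class RootNamespaceDemo
{
    // Adds a <Reference> child that lives in the same namespace as the root,
    // so no xmlns="" is needed to "undo" the default namespace.
    public static string AddReference(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        XNamespace rootNamespace = doc.Root.Name.Namespace;
        doc.Root.Add(new XElement(rootNamespace + "Reference"));
        return doc.Root.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        // hypothetical input document
        Console.WriteLine(AddReference("<Project xmlns='urn:example:project' />"));
    }
}
```

The reason this works is that an element created as new XElement("Reference") is in no namespace at all, and xmlns="" is how the serializer expresses that inside a parent that has a default namespace.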
I am working on a program in C# that edits open-document files on xml level. For example it adds rows to tables.
So I load the content.xml into an XmlDocument "doc" and traverse the xml structure.
Say I have the <table:table-row> node in an XmlNode "row" and now I want to add a <table:table-cell> node to it. So I call
XmlDocument doc = new XmlDocument();
doc.Load(filename);
...
XmlNode row = ...;
...
XmlNode cell = doc.CreateElement("table:table-cell");
row.AppendChild(cell);
...
doc.Save(filename);
The problem is that, in the file, the new node only contains
<table-cell>...</table-cell>
C# just decides to ignore what I told it to and does something else without even telling me (at first I overlooked the problem and was wondering why it didn't work although the generated xml looked okay).
From what I have gathered so far, the problem has to do with the fact that "table:" is a namespace. When I also supply a NamespaceURI to CreateElement, I get
<table:table-cell table:xmlns="THE_URI" >... - but the original document did not have this xmlns, so I don't want it either...
I tried to use an XmlTextWriter and setting writer.Settings.Namespaces = false, because I thought, this should suppress the output of the xmlns, but it only caused an exception - the document has some namespaces, which are forbidden if Namespaces is set to false... (wtf!? suppressing the output of xmlns seems a billion times more logical than throwing an exception if an xmlns is present...)
In some similar discussions I read that you should set the cell.Name manually, but this property is read-only...
Others suggest to change it on text-file level (that's tinkering and it would be slow)
Can anyone give me a hint?
Every namespace should have at least one xmlns definition with a URI. This is the ultimate differentiation between two tags.
You can however have the xmlns attribute declared only once in the file (in the beginning).
See
Creating a specific XML document using namespaces in C#
The table: parts are not namespaces. They are "namespace prefixes". They are an alias for the actual namespace. They must be declared somewhere. If they are not declared at all in your source XML, then it is not valid XML, and you shouldn't expect to be able to process it.
Are you sure that what you have loaded is the entire XML document? They haven't left off parts to make it simpler? Those parts may be the ones that contain the definition of table:.
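If the prefix is declared somewhere above the row (which it has to be for the file to load at all), one way to create the cell is to reuse the row's own prefix and namespace URI. This is only a sketch: the urn:example:table URI stands in for whatever the real document declares, and the TableCellDemo/AddCell names are invented.

```csharp
using System;
using System.Xml;

public static class TableCellDemo
{
    // Appends a table-cell to the first row, reusing the row's prefix and
    // namespace URI so the new element resolves to the same namespace.
    public static string AddCell(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // For this sketch the row is simply the first child of the root.
        XmlElement row = (XmlElement)doc.DocumentElement.FirstChild;

        XmlElement cell = doc.CreateElement(row.Prefix, "table-cell", row.NamespaceURI);
        row.AppendChild(cell);

        return doc.OuterXml;
    }

    public static void Main()
    {
        // hypothetical stand-in for content.xml
        string xml = "<table:table xmlns:table='urn:example:table'>" +
                     "<table:table-row/></table:table>";
        Console.WriteLine(AddCell(xml));
    }
}
```

Because the prefix is already bound to that URI on an ancestor, the writer should not need to repeat the xmlns:table declaration on the new cell.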
I have some XML data (similar to the sample below) and I want to read the values in code.
Why am I forced to specify the default namespace to access each element? I would have expected the default namespace to be used for all elements.
Is there a more logical way to achieve my goal?
Sample XML:
<?xml version="1.0" encoding="UTF-8"?>
<ReceiptsBatch xmlns="http://www.secretsonline.gov.uk/secrets">
<MessageHeader>
<MessageID>00000173</MessageID>
<Timestamp>2009-10-28T16:50:01</Timestamp>
<MessageCheck>BX4f+RmNCVCsT5g</MessageCheck>
</MessageHeader>
<Receipts>
<Receipt>
<Status>OK</Status>
</Receipt>
</Receipts>
</ReceiptsBatch>
Code to read xml elements I'm after:
XDocument xDoc = XDocument.Load( FileInPath );
XNamespace ns = "http://www.secretsonline.gov.uk/secrets";
XElement MessageCheck = xDoc.Element(ns+ "MessageHeader").Element(ns+"MessageCheck");
XElement MessageBody = xDoc.Element("Receipts");
As suggested by this answer, you can do this by removing all namespaces from the in-memory copy of the document. I suppose this should only be done if you know you won't have name collisions in the resulting document.
/// <summary>
/// Makes parsing easier by removing the need to specify namespaces for every element.
/// </summary>
private static void RemoveNamespaces(XDocument document)
{
    var elements = document.Descendants();
    elements.Attributes().Where(a => a.IsNamespaceDeclaration).Remove();
    foreach (var element in elements)
    {
        element.Name = element.Name.LocalName;
        var strippedAttributes =
            from originalAttribute in element.Attributes().ToArray()
            select (object)new XAttribute(originalAttribute.Name.LocalName, originalAttribute.Value);
        //Note that this also strips the attributes' line number information
        element.ReplaceAttributes(strippedAttributes.ToArray());
    }
}
You can use the XmlTextReader.Namespaces property to disable namespaces while reading the XML file.
string filePath;
XmlTextReader xReader = new XmlTextReader(filePath);
xReader.Namespaces = false;
XDocument xDoc = XDocument.Load(xReader);
This is how LINQ to XML works. An unqualified name only matches elements that are in no namespace, so you can't find an element that is in the default namespace without specifying it, and the same is true of its descendants. The fastest way to get rid of the namespace is to remove the namespace declaration from your initial XML.
The theory is that the meaning of the document is not affected by the user's choice of namespace prefixes. So long as the data is in the namespace http://www.secretsonline.gov.uk/secrets, it doesn't matter whether the author chooses to use the prefix "s", "secrets", "_x.cafe.babe", or the "null" prefix (that is, making it the default namespace). Your application shouldn't care: it's only the URI that matters. That's why your application has to specify the URI.
Note that the element Receipts is also in the namespace http://www.secretsonline.gov.uk/secrets, so the XNamespace is also required to access it. It is a child of the root element ReceiptsBatch, so go through xDoc.Root:
XElement MessageBody = xDoc.Root.Element(ns + "Receipts");
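Putting that together, a minimal sketch of reading MessageCheck from the sample document might look like this. It takes the namespace from the root element instead of hard-coding the URI; the ReceiptReader and ReadMessageCheck names are made up for the example.

```csharp
using System;
using System.Xml.Linq;

public static class ReceiptReader
{
    // Reads MessageHeader/MessageCheck, qualifying each name with the
    // namespace of the root element.
    public static string ReadMessageCheck(string xml)
    {
        XDocument xDoc = XDocument.Parse(xml);
        XNamespace ns = xDoc.Root.Name.Namespace;

        XElement header = xDoc.Root.Element(ns + "MessageHeader");
        return (string)header.Element(ns + "MessageCheck");
    }

    public static void Main()
    {
        string xml =
            "<ReceiptsBatch xmlns='http://www.secretsonline.gov.uk/secrets'>" +
            "<MessageHeader><MessageCheck>BX4f+RmNCVCsT5g</MessageCheck></MessageHeader>" +
            "<Receipts><Receipt><Status>OK</Status></Receipt></Receipts>" +
            "</ReceiptsBatch>";

        Console.WriteLine(ReadMessageCheck(xml)); // prints BX4f+RmNCVCsT5g
    }
}
```

Keeping the XNamespace in one variable means the URI appears in only one place, which takes most of the pain out of repeating ns + "..." everywhere.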
As an alternative to using namespaces note that you can use "namespace agnostic" xpath using local-name() and namespace-uri(), e.g.
/*[local-name()='SomeElement' and namespace-uri()='somexmlns']
If you omit the namespace-uri predicate:
/*[local-name()='SomeElement']
the expression matches ns1:SomeElement, ns2:SomeElement, and so on. IMO you should always prefer XNamespace where possible; the use cases for namespace-agnostic XPath are quite limited, e.g. parsing specific elements in documents with unknown schemas (e.g. within a service bus), or best-effort parsing of documents where the namespace can change (e.g. future-proofing, where the xmlns changes to match a new version of the document schema).
I have an application that has to load XML document and output nodes depending on XPath.
Suppose I start with a document like this:
<aaa>
...[many nodes here]...
<bbb>text</bbb>
...[many nodes here]...
<bbb>text</bbb>
...[many nodes here]...
</aaa>
With XPath //bbb
So far everything is nice.
And selection doc.SelectNodes("//bbb"); returns the list of required nodes.
Then someone uploads a document with one node like <myfancynamespace:foo/> and extra namespace in the root tag, and everything breaks.
Why? //bbb does not give a damn about myfancynamespace, theoretically it should even be good with //myfancynamespace:foo, as there is no ambiguity, but the expression returns 0 results and that's it.
Is there a workaround for this behavior?
I do have a namespace manager for the document, and I am passing it to the Xpath query. But the namespaces and the prefixes are unknown to me, so I can't add them before the query.
Do I have to pre-parse the document to fill the namespace manager before I do any selections? Why on earth such behavior, it just doesn't make sense.
EDIT:
I'm using:
XmlDocument and XmlNamespaceManager
EDIT2:
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
//I wish I could:
//nsmgr.AddNamespace("magic", "http://magicnamespaceuri/");
//...
doc.LoadXml(usersuppliedxml);
XmlNodeList nodes = doc.SelectNodes(usersuppliedxpath, nsmgr);//usersuppliedxpath -> "//bbb"
//nodes.Count should be > 0, but with namespaced document they are 0
EDIT3:
Found an article which describes the actual scenario of the issue with one workaround, but not very pretty workaround: http://codeclimber.net.nz/archive/2008/01/09/How-to-query-a-XPath-doc-that-has-a-default.aspx
Almost seems that stripping the xmlns is the way to go...
You're missing the whole point of XML namespaces.
But if you really need to perform XPath on documents that will use an unknown namespace, and you really don't care about it, you will need to strip it out and reload the document. XPath will not work in a namespace-agnostic way, unless you want to use the local-name() function at every point in your selectors.
private XmlDocument StripNamespace(XmlDocument doc)
{
    if (doc.DocumentElement.NamespaceURI.Length > 0)
    {
        doc.DocumentElement.SetAttribute("xmlns", "");
        // must serialize and reload for this to take effect
        XmlDocument newDoc = new XmlDocument();
        newDoc.LoadXml(doc.OuterXml);
        return newDoc;
    }
    else
    {
        return doc;
    }
}
<myfancynamespace:foo/> is not necessarily the same as <foo/>.
Namespaces do matter. But I can understand your frustration, as they tend to break code because various implementations (C#, Java, ...) output them differently.
I suggest you change your XPath to allow for accepting all namespaces. For example instead of
//bbb
Define it as
//*[local-name()='bbb']
That should take care of it.
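A quick sketch of that query against XmlDocument, assuming a document with a default namespace plus one prefixed element. The urn:example:* URIs and the LocalNameDemo/CountByLocalName names are placeholders.

```csharp
using System;
using System.Xml;

public static class LocalNameDemo
{
    // Counts elements by local name, ignoring whatever namespace they are in.
    // The name is pasted into the XPath string, so this sketch assumes it
    // contains no quote characters.
    public static int CountByLocalName(string xml, string localName)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes("//*[local-name()='" + localName + "']").Count;
    }

    public static void Main()
    {
        string xml = "<aaa xmlns='urn:example:default' xmlns:f='urn:example:fancy'>" +
                     "<bbb>text</bbb><f:foo/><bbb>text</bbb></aaa>";

        Console.WriteLine(CountByLocalName(xml, "bbb")); // 2
        Console.WriteLine(CountByLocalName(xml, "foo")); // 1
    }
}
```

No XmlNamespaceManager is needed here, because the expression contains no prefixes to resolve. The trade-off is that elements with the same local name in different namespaces are all matched.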
You should describe in a bit more detail what you want to do. The way you ask your question, it makes no sense at all. The namespace is just a part of the name. Nothing more, nothing less. So your question is the same as asking for an XPath query to get all tags ending with "x". That's not the idea behind XML, but if you have strange reasons to do so, feel free to iterate over all nodes and implement it yourself. The same applies to the functionality you are requesting.
You could use the LINQ XML classes like XDocument. They greatly simplify working with namespaces.
So I'm trying to parse the following XML document with C#, using System.Xml:
<root xmlns:n="http://www.w3.org/TR/html4/">
<n:node>
<n:node>
data
</n:node>
</n:node>
<n:node>
<n:node>
data
</n:node>
</n:node>
</root>
Every treatise of XPath with namespaces tells me to do the following:
XmlNamespaceManager mgr = new XmlNamespaceManager(xmlDoc.NameTable);
mgr.AddNamespace("n", "http://www.w3.org/1999/XSL/Transform");
And after I add the code above, the query
xmlDoc.SelectNodes("/root/n:node", mgr);
Runs fine, but returns nothing. The following:
xmlDoc.SelectNodes("/root/node", mgr);
returns two nodes if I modify the XML file and remove the namespaces, so it seems everything else is set up correctly. Any idea why it doesn't work with namespaces?
Thanks a lot!
As stated, it's the URI of the namespace that's important, not the prefix.
Given your xml you could use the following:
mgr.AddNamespace( "someOtherPrefix", "http://www.w3.org/TR/html4/" );
var nodes = xmlDoc.SelectNodes( "/root/someOtherPrefix:node", mgr );
This will give you the data you want. Once you grasp this concept it becomes easier, especially when you get to default namespaces (no prefix in source xml), since you instantly know you can assign a prefix to each URI and strongly reference any part of the document you like.
The URI you specified in your AddNamespace method doesn't match the one in the xmlns declaration.
If you declare prefix "n" to represent the namespace "http://www.w3.org/1999/XSL/Transform", then the nodes won't match when you do your query. This is because, in your document, the prefix "n" refers to the namespace "http://www.w3.org/TR/html4/".
Try doing mgr.AddNamespace("n", "http://www.w3.org/TR/html4/"); instead.
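To make the prefix-versus-URI point concrete, here is a small sketch that runs the same query with different registrations. The PrefixDemo and CountNodes names are invented; the XML is the document from the question.

```csharp
using System;
using System.Xml;

public static class PrefixDemo
{
    // Registers the given prefix for the given URI and counts /root/<prefix>:node.
    public static int CountNodes(string xml, string prefix, string uri)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var mgr = new XmlNamespaceManager(doc.NameTable);
        mgr.AddNamespace(prefix, uri);

        return doc.SelectNodes("/root/" + prefix + ":node", mgr).Count;
    }

    public static void Main()
    {
        string xml = "<root xmlns:n='http://www.w3.org/TR/html4/'>" +
                     "<n:node><n:node>data</n:node></n:node>" +
                     "<n:node><n:node>data</n:node></n:node></root>";

        // Same prefix as the document, but the wrong URI: nothing matches.
        Console.WriteLine(CountNodes(xml, "n", "http://www.w3.org/1999/XSL/Transform")); // 0

        // A different prefix with the right URI: both top-level nodes match.
        Console.WriteLine(CountNodes(xml, "someOtherPrefix", "http://www.w3.org/TR/html4/")); // 2
    }
}
```

The prefix in the XPath expression is resolved through the XmlNamespaceManager only; the prefixes used inside the document never enter into it.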
This may seem like an odd question, but I have my own reasons for this! I am trying to parse a Delphi 2009 project file (.dproj), which is an XML representation of the project. I can load the document into an XmlDocument, but when I try to get to the units that are used in the project, SelectNodes gives me an empty list.
An example of the project is below :
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
...
...
<ItemGroup>
<DelphiCompile Include="Package.dpk">
<MainSource>MainSource</MainSource>
</DelphiCompile>
<DCCReference Include="vcl.dcp"/>
<DCCReference Include="Unit1.pas"/>
<DCCReference Include="Unit2.pas"/>
<DCCReference Include="Unit3.pas"/>
<DCCReference Include="Unit4.pas"/>
<DCCReference Include="Unit5.pas"/>
...
</ItemGroup>
</Project>
An example of the code is below:
ProjectDocument.Load(FileName);
XmlNodeList nodeList;
XmlElement RootNode = ProjectDocument.DocumentElement;
string xmlns = RootNode.Attributes["xmlns"].Value;
// This gives an empty list
nodeList = RootNode.SelectNodes("/Project/ItemGroup/DCCReference");
foreach (XmlNode title in nodeList)
{
    Console.WriteLine(title.InnerXml);
}
// This also gives an empty list
nodeList = RootNode.SelectNodes("/ItemGroup/DCCReference");
foreach (XmlNode title in nodeList)
{
    Console.WriteLine(title.InnerXml);
}
The question is really, what am I doing wrong, as I must be missing something. The only odd thing is that the document is not a .xml, it is a .dproj.
So, thanks in advance if you can solve this.
Mark
Quoting the documentation:
Remarks
If the XPath expression requires namespace resolution, you must use the SelectNodes overload which takes an XmlNamespaceManager as its argument. The XmlNamespaceManager is used to resolve namespaces.
Note If the XPath expression does not include a prefix, it is assumed that the namespace URI is the empty namespace. If your XML includes a default namespace, you must still use the XmlNamespaceManager and add a prefix and namespace URI to it; otherwise, you will not get any nodes selected.
You need the two-argument version of SelectNodes. Also, note that the nodes you're selecting have no contents, so the InnerXml property will be empty. Once you get a non-empty list, your code will still print a list of empty lines.
You need to use an XmlNamespaceManager:
var nsmgr = new XmlNamespaceManager(ProjectDocument.NameTable);
nsmgr.AddNamespace("x", ProjectDocument.DocumentElement.NamespaceURI);
var nodes = ProjectDocument.SelectNodes("descendant::x:DCCReference", nsmgr);
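Since the DCCReference elements are empty, the unit names have to come from the Include attribute, not from InnerXml. A rough sketch combining both points, where the DprojReader and ReadReferences names are made up and the XML is a cut-down version of the sample project:

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class DprojReader
{
    // Returns the Include attribute of every DCCReference in the project.
    public static List<string> ReadReferences(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Bind an arbitrary prefix ("x") to the document's default namespace.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("x", doc.DocumentElement.NamespaceURI);

        var result = new List<string>();
        foreach (XmlElement reference in
                 doc.SelectNodes("/x:Project/x:ItemGroup/x:DCCReference", nsmgr))
        {
            result.Add(reference.GetAttribute("Include"));
        }
        return result;
    }

    public static void Main()
    {
        string xml =
            "<Project xmlns='http://schemas.microsoft.com/developer/msbuild/2003'>" +
            "<ItemGroup>" +
            "<DCCReference Include='vcl.dcp'/>" +
            "<DCCReference Include='Unit1.pas'/>" +
            "</ItemGroup></Project>";

        foreach (string include in ReadReferences(xml))
        {
            Console.WriteLine(include);
        }
    }
}
```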
Also, why aren't you parsing the DPR file with your own simple parser? Find the uses line, then grab every line after that, stripping the comments and the final comma, until you hit a line with a semicolon.
Benefit #2, besides simpler parsing, is that this works with EVERY Delphi version, whereas DPROJ and equivalent files are famously different from one Delphi version to the next.