Select XML file part with XPath and XDocument - C#/Win8

I am building a Windows 8 app, and I need to extract a whole XML node and its children as a string from a large XML document. The method that does that so far looks like this:
public string GetNodeContent(string path)
{
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.IgnoreWhitespace = true;
    settings.ConformanceLevel = ConformanceLevel.Auto;
    settings.IgnoreComments = true;
    using (XmlReader reader = XmlReader.Create("something.xml", settings))
    {
        reader.MoveToContent();
        reader.Read();
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(reader.ReadOuterXml());
        IXmlNode node = doc.SelectSingleNode(path);
        return node.InnerText;
    }
}
Whatever XPath I pass, node ends up null. I'm using the reader to get the first child of the root node, and then creating an XmlDocument from that XML. Since it's Windows 8, I apparently can't use the XPathSelectElements method, and this is the only way I can think of. Is there a way to do it using this, or any other logic?
Thank you in advance for your answers.
[UPDATE]
Let's say XML has this general form:
<nodeone attributes...>
<nodetwo attributes...>
<nodethree attributes... />
<nodethree attributes... />
<nodethree attributes... />
</nodetwo>
</nodeone >
I expect to get nodetwo and all of its children as an XML string when I pass "/nodeone/nodetwo" or "//nodetwo".

I've come up with a solution; the whole approach was wrong to start with. The problematic part was that this code
reader.MoveToContent();
reader.Read();
skips the root tag, so the namespace declared on it is lost. This is the new, working code:
public static async Task<string> ReadFileTest(string xpath)
{
    StorageFolder folder = await Package.Current.InstalledLocation.GetFolderAsync("NameOfFolderWithXML");
    StorageFile xmlFile = await folder.GetFileAsync("filename.xml");
    // Load the whole file, root element included, so namespace declarations are kept
    XmlDocument xmldoc = await XmlDocument.LoadFromFileAsync(xmlFile);
    var nodes = xmldoc.SelectNodes(xpath);
    XmlElement element = (XmlElement)nodes[0];
    // GetXml() returns the element and all of its children as an XML string
    return element.GetXml();
}
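The namespace point can be illustrated outside WinRT with System.Xml.Linq. The following is only a sketch for platforms where System.Xml.XPath is available (it is not the Windows.Data.Xml.Dom API used above); the namespace URI, the prefix "x", and the class and method names are made up for illustration. It shows that an unprefixed XPath does not match elements that live in a default namespace, while the same path with a registered prefix does.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

public static class NamespaceXPathDemo
{
    // Hypothetical sample: the default namespace URI is invented for this example.
    public const string SampleXml =
        "<nodeone xmlns=\"urn:example:sample\">" +
        "<nodetwo id=\"a\"><nodethree/><nodethree/></nodetwo>" +
        "</nodeone>";

    // Returns the first matching element (with its children) as an XML string,
    // or null when the XPath matches nothing.
    public static string SelectAsXml(string xml, string xpath, string prefix, string namespaceUri)
    {
        XDocument doc = XDocument.Parse(xml);
        var manager = new XmlNamespaceManager(new NameTable());
        manager.AddNamespace(prefix, namespaceUri);
        XElement match = doc.XPathSelectElement(xpath, manager);
        return match == null ? null : match.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        // Unprefixed names refer to "no namespace", so nothing matches here.
        Console.WriteLine(SelectAsXml(SampleXml, "/nodeone/nodetwo", "x", "urn:example:sample") ?? "(no match)");
        // With the prefix bound to the document's namespace, nodetwo is found.
        Console.WriteLine(SelectAsXml(SampleXml, "/x:nodeone/x:nodetwo", "x", "urn:example:sample"));
    }
}
```

If a document declares no namespace at all, the unprefixed path works as expected, which may be why the problem only shows up with some files.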

Related

Parsing XML Using C# in uwp Xaml

How do you get the value of the description attribute from a tag such as <something description = "something else"> </something> using C# in UWP?
Here is the test code to show how to get the XML node attribute value:
// Requires: using System.Linq; using Windows.Data.Xml.Dom;
private void GetContent()
{
    string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><body><content title =\"XML File!\"></content></body>";
    var doc = new XmlDocument();
    doc.LoadXml(xml);
    var tags = doc.GetElementsByTagName("content");
    if (tags.Count > 0)
    {
        var firstContent = tags.First();
        // Read the value of the "title" attribute of the first <content> element
        string result = firstContent.Attributes.GetNamedItem("title").InnerText;
    }
}
Tips
In UWP, loading an XmlDocument via a path is not recommended. It is best to get the XML file first, read all the text, and load the XmlDocument via text.
The XmlDocument used here lives in the Windows.Data.Xml.Dom namespace, NOT System.Xml.
Best regards.

Using XDocument to read the root element from XML using C# is not showing the root element

I am new to C# programming and am trying to update an XML file using C#. When I try to get the root element using XDocument, it shows the complete content of the file.
Below is an explanation of my code:
I have the function below, which reads the file path from the command-line arguments.
private XDocument doc;
public void Update(string filepath)
{
string filename = Path.GetFileName(filepath);
doc = XDocument.Load(filepath);
XElement rootelement = doc.Root;
}
The filepath variable receives the path "E:\BuilderTest\COMMON.wxs".
Then we load the file using XDocument.
But when we try to get the root element from the file, it does not show just the root element. Instead, it shows the complete data in the file.
When I use XmlDocument instead of XDocument, I see only the root element.
Below is the code using XmlDocument:
private XmlDocument doc;
public void Update(string filepath)
{
string filename = Path.GetFileName(filepath);
doc = new XmlDocument();
doc.Load(filepath);
XmlElement rootelement = doc.DocumentElement;
}
Please help me by providing your valuable inputs on this.
XDocument and XmlDocument belong to different class hierarchies; use whichever fits your requirements.
XDocument works like this:
XDocument doc;
doc = XDocument.Load(filepath);
XElement root = doc.Root;
Root, Descendants, and Elements are operations provided by XDocument. For every element, it gives you an XElement.
In your case you should use doc.Root to find the element, then use .Value to get its value.
XElement comes with System.Xml.Linq. It is derived from XNode.
It gives you the serialized content of each node, one at a time.
XmlDocument, on the other hand, works like this:
XmlDocument doc;
doc = new XmlDocument();
doc.Load(filepath);
XmlElement rootelement = doc.DocumentElement;
XmlElement comes with System.Xml. It derives from XmlNode (via XmlLinkedNode), and XmlNode implements IEnumerable.
It exposes its child nodes as an enumerable that you can easily iterate.
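The difference the asker sees is probably down to how XElement is displayed: its ToString() returns the element together with its whole subtree, so doc.Root looks like "the complete file" even though it is just the root element. A small sketch with an invented WiX-like fragment (the XML, class, and method names are assumptions for illustration):

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class RootElementDemo
{
    // Returns only the local name of the root element.
    public static string RootName(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root.Name.LocalName;
    }

    public static void Main()
    {
        // Hypothetical stand-in for the contents of COMMON.wxs.
        string xml = "<Wix><Product Id=\"1\"><Package /></Product></Wix>";
        XElement root = XDocument.Parse(xml).Root;

        Console.WriteLine(root.Name.LocalName);      // the root element's name only
        Console.WriteLine(root.Elements().Count());  // number of direct child elements
        Console.WriteLine(root);                     // ToString(): root plus its entire subtree
    }
}
```

So doc.Root is the root element in both APIs; XDocument simply renders it with all of its descendants.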

XXE: Improper Restriction of XML External Entity Reference With XDocument

So I am running into an issue when I run a security scan on my application. It turns out that I am failing to protect against XXE.
Here is a short snippet showing the offending code:
static void Main()
{
    string inp = Console.ReadLine();
    string xmlStr = ""; //This has a value that is much too long to put into a single post
    if (!string.IsNullOrEmpty(inp))
    {
        xmlStr = inp;
    }
    XmlDocument xmlDocObj = new XmlDocument {XmlResolver = null};
    xmlDocObj.LoadXml(xmlStr);
    XmlNodeList measureXmlNodeListObj = xmlDocObj.SelectNodes("REQ/MS/M");
    foreach (XmlNode measureXmlNodeObj in measureXmlNodeListObj)
    {
        XmlNode detailXmlNodeListObj = xmlDocObj.SelectSingleNode("REQ/DTD");
        string measureKey = measureXmlNodeObj.Attributes["KY"].Value;
        if (detailXmlNodeListObj.Attributes["MKY"].Value ==
            measureKey) //Checking if selected MeasureKey is same
        {
            XmlNode filerNode = measureXmlNodeObj.SelectSingleNode("FS");
            if (filerNode != null)
            {
                XDocument fixedFilterXmlObj = XDocument.Load(new StringReader(filerNode.OuterXml));
                var measureFixedFilters = (from m in fixedFilterXmlObj.Element("FS").Elements("F")
                                           select m).ToList();
                foreach (var fixedFilter in measureFixedFilters)
                {
                    var fixedFilterValues = (from m in fixedFilter.Elements("VS").Elements("V")
                                             select m.Attribute("DESC").Value).ToList();
                    foreach (var value in fixedFilterValues)
                    {
                        Console.WriteLine(value.Trim());
                    }
                }
            }
        }
    }
    Console.ReadLine();
}
According to Veracode, the line that is unsafe is XDocument fixedFilterXmlObj = XDocument.Load(new StringReader(filerNode.OuterXml));
But according to OWASP, it should be safe:
Both the XElement and XDocument objects in the System.Xml.Linq library
are safe from XXE injection by default. XElement parses only the
elements within the XML file, so DTDs are ignored altogether.
XDocument has DTDs disabled by default, and is only unsafe if
constructed with a different unsafe XML parser.
So it seems like I am making the mistake of using an unsafe XML parser, opening XDocument to XXE.
I found a unit test that replicates the issue and also has a safe usage of XDocument, but I can't work out what exactly makes my code unsafe, because I do not use:
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse; // unsafe!
You can run my code to replicate the issue, but you should replace the line with the empty xmlStr with this value: here (too large for a single post)
I'm not sure how or why this works, but it does:
XDocument fixedFilterXmlObj;
using (XmlNodeReader nodeReader = new XmlNodeReader(filerNode))
{
nodeReader.MoveToContent();
fixedFilterXmlObj = XDocument.Load(nodeReader);
}
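Another option, when the XML arrives as a string, is to make the DTD policy explicit instead of relying on defaults. The sketch below passes an XmlReader with DtdProcessing.Prohibit and a null XmlResolver to XDocument.Load; both are standard System.Xml settings. Whether a given scanner such as Veracode accepts this form is an assumption that has not been verified here, and the class and method names are invented for the example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class SafeXDocumentLoader
{
    // Parses xml into an XDocument while rejecting any document that contains a DTD.
    public static XDocument LoadWithoutDtd(string xml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit, // a DOCTYPE causes an XmlException
            XmlResolver = null                      // never resolve external resources
        };
        using (var stringReader = new StringReader(xml))
        using (XmlReader reader = XmlReader.Create(stringReader, settings))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        XDocument ok = LoadWithoutDtd("<FS><F><VS><V DESC=\"demo\" /></VS></F></FS>");
        Console.WriteLine(ok.Root.Name.LocalName);

        try
        {
            // Hypothetical hostile input that declares an entity in a DTD.
            LoadWithoutDtd("<!DOCTYPE FS [<!ENTITY e \"x\">]><FS>&e;</FS>");
        }
        catch (XmlException)
        {
            Console.WriteLine("DTD rejected");
        }
    }
}
```

In the question's code this would replace the XDocument.Load(new StringReader(filerNode.OuterXml)) call with LoadWithoutDtd(filerNode.OuterXml).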

Wrap XML root node with parent node

I have a .net Web Api 2 application that delivers data in XML.
My problem:
One of my classes looks like this:
public class Horse
{
public string Name { get;set; }
public string Category { get;set; }
}
When I serialize this, the result is:
<Horse>
<Name>Bobo</Name>
<Category>LargeAnimal</Category>
</Horse>
What I want is to wrap all outgoing XML content in a root element like this:
<Animal>
<Horse>
.....
</Horse>
</Animal>
I was hoping to do this in a custom XmlFormatter, but I can't figure out how to add a root element on the write stream.
What is the best way to resolve this issue?
I have tried tweaking this answer to work in my custom XmlSerializer, but it doesn't seem to work: How to add a root node to an xml?
(I had very little time to write this question, so if anything is missing, please leave a comment.)
So, I tweaked the answer to this question: How to add a root node to an xml? to work with my XmlFormatter.
The following code works, although I feel this is a hackish approach.
public override Task WriteToStreamAsync(Type type, object value, Stream writeStream, HttpContent content, TransportContext transportContext)
{
    return Task.Factory.StartNew(() =>
    {
        XmlSerializer xs = new XmlSerializer(type);
        XmlDocument temp = new XmlDocument(); //create a temporary xml document
        var navigator = temp.CreateNavigator(); //use its navigator
        using (var w = navigator.AppendChild()) //to get an XMLWriter
            xs.Serialize(w, value); //serialize your data to it
        XmlDocument xdoc = new XmlDocument(); //init the main xml document
        //add xml declaration to the top of the new xml document
        xdoc.AppendChild(xdoc.CreateXmlDeclaration("1.0", "utf-8", null));
        //create the root element
        var animal = xdoc.CreateElement("Animal");
        animal.InnerXml = temp.InnerXml; //copy the serialized content
        xdoc.AppendChild(animal);
        //"encoding" is not defined in this snippet; it is assumed to be a member of the formatter (e.g. UTF-8)
        using (var xmlWriter = new XmlTextWriter(writeStream, encoding))
        {
            xdoc.WriteTo(xmlWriter);
        }
    });
}
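The wrapping step itself can also be done with System.Xml.Linq, which avoids copying InnerXml between two XmlDocument instances. This is only a sketch of the idea, not a drop-in formatter: the RootWrapper class and WrapInRoot method are hypothetical names, and writing the result to the response stream is left out.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Serialization;

public class Horse
{
    public string Name { get; set; }
    public string Category { get; set; }
}

public static class RootWrapper
{
    // Serializes value with XmlSerializer and returns it wrapped in a new root element.
    public static XElement WrapInRoot(object value, string rootName)
    {
        var serializer = new XmlSerializer(value.GetType());
        var inner = new XDocument();
        // CreateWriter() returns an XmlWriter that builds nodes directly into the XDocument.
        using (XmlWriter writer = inner.CreateWriter())
        {
            serializer.Serialize(writer, value);
        }
        // Adding an element that already has a parent copies it into the new root.
        return new XElement(rootName, inner.Root);
    }

    public static void Main()
    {
        var horse = new Horse { Name = "Bobo", Category = "LargeAnimal" };
        Console.WriteLine(WrapInRoot(horse, "Animal"));
    }
}
```

Inside a formatter, the returned XElement could then be written out with its Save method; which overload and encoding to use depends on the formatter and is not shown here.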

Reading large xml file makes the server stop working - out of memory

I have a piece of code which works well for normal files. But for really big files, it makes the server stop working.
Here it is:
XmlReader reader = null;
try
{
    reader = XmlReader.Create(file_name + ".xml");
    XDocument xml = XDocument.Load(reader);
    XmlNamespaceManager namespaceManager = GetNamespaceManager(reader);
    XElement root = xml.Root;
    //XAttribute supplier = root.XPathSelectElement("//sh:Receive/sh:Id", namespaceManager).Attribute("Authority");
    //string version = root.XPathSelectElement("//sh:DocumentId/sh:Version", namespaceManager).Value;
    var nodes = root.XPathSelectElements("//eanucc:msg/eanucc:transact", namespaceManager);
    return nodes;
}
catch
{ }
I think this is the part which causes the memory problem which happens on the server. How can I fix this?
It sounds like there's simply too much data to read in one go. You'll have to iterate over the elements one at a time, using XmlReader as a cursor, and converting one element to XElement at a time.
public static IEnumerable<XElement> ReadTransactions()
{
    using (var reader = XmlReader.Create(file_name + ".xml"))
    {
        while (reader.ReadToFollowing("transact", eanuccNamespaceUri))
        {
            using (var subtree = reader.ReadSubtree())
            {
                yield return XElement.Load(subtree);
            }
        }
    }
}
Note: this assumes there are never "transact" elements at any other level. If there are, you'll need to be more careful with your XmlReader than just calling ReadToFollowing. Also note that you'll need to find the actual namespace URI of the eanucc alias.
Don't forget that if you try to read all of this information in one go (e.g. by calling ToList()) then you'll still run out of memory. You need to stream the information. (It's not clear what you're trying to do with the elements, but you need to think about it carefully.)
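To make the streaming point concrete, here is a self-contained sketch of the same pattern together with a consumer that handles one element at a time. The namespace URI, the sample XML, and the names StreamingDemo and StreamElements are invented for the example; a real file would be opened by path and use the actual eanucc namespace URI.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class StreamingDemo
{
    // Lazily yields each matching element; only one element is materialized at a time.
    public static IEnumerable<XElement> StreamElements(TextReader input, string localName, string namespaceUri)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.ReadToFollowing(localName, namespaceUri))
            {
                using (XmlReader subtree = reader.ReadSubtree())
                {
                    yield return XElement.Load(subtree);
                }
            }
        }
    }

    public static void Main()
    {
        const string ns = "urn:example:eanucc"; // hypothetical namespace URI
        string xml = "<msg xmlns=\"" + ns + "\">" +
                     "<transact id=\"1\"/><transact id=\"2\"/><transact id=\"3\"/>" +
                     "</msg>";

        int count = 0;
        // foreach pulls one element, processes it, and lets it be collected
        // before the next one is read; ToList() would hold them all at once.
        foreach (XElement transact in StreamElements(new StringReader(xml), "transact", ns))
        {
            count++;
        }
        Console.WriteLine(count);
    }
}
```

The caller has to finish its work with each element inside the loop (write it to a database, aggregate a value, and so on) for the memory benefit to hold.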
Try putting the reader in a using block so it gets disposed of after use.
try
{
    using (var reader = XmlReader.Create(file_name + ".xml"))
    {
        XDocument xml = XDocument.Load(reader);
        XmlNamespaceManager namespaceManager = GetNamespaceManager(reader);
        XElement root = xml.Root;
        var nodes = root.XPathSelectElements("//eanucc:msg/eanucc:transact", namespaceManager);
        return nodes;
    }
}
catch
{ }
