I am new to XML programming in C# and have been trying to grasp the concepts. I have a 2books.xml file which looks like this:
<!--sample XML fragment-->
<bookstore>
<book genre='novel' ISBN='10-861003-324'>
<title>The Handmaid's Tale</title>
<price>19.95</price>
</book>
<book genre='novel' ISBN='1-861001-57-5'>
<title>Pride And Prejudice</title>
<price>24.95</price>
</book>
<book genre='novel' ISBN='1-861991-57-9'>
<title>The Honor</title>
<price>20.12</price>
</book>
</bookstore>
Now, using XmlReader, when I try the following section of code:
    using (XmlReader xReader = XmlReader.Create(@"C:\Users\Chiranjib\Desktop\2books.xml"))
    {
        xReader.MoveToContent();
        Console.WriteLine("-----------> Now " + xReader.Name);
        // ReadInnerXml() leaves the reader positioned after the end tag of the current element
        Console.WriteLine("------Inner XML -----> " + xReader.ReadInnerXml());
        // ReadOuterXml() also moves past the element; for a leaf node it behaves like Read()
        Console.WriteLine("------OuterXML XML -----> " + xReader.ReadOuterXml());
        while (xReader.Read())
        {
            Console.WriteLine("In Loop");
            if ((xReader.NodeType == XmlNodeType.Element) && (xReader.Name == "book"))
            {
                xReader.ReadToFollowing("price");
                // <price> is an element, not an attribute, so read its content
                Console.WriteLine("---------- In Loop -------- Price " + xReader.ReadElementContentAsString());
            }
        }
    }
    Console.ReadKey();
}
Obviously, xReader.ReadInnerXml() leaves the reader at the end of the file after the call, and as a result xReader.ReadOuterXml() prints nothing.
Now I want the call to xReader.ReadOuterXml() to succeed. How can I get back to the root node?
I tried xReader.MoveToElement(), but it does not seem to do that.
You can't really do that: XmlReader is a forward-only reader, so once it has moved past a node it cannot go back. What you probably want is a much higher-level API like LINQ to XML.
For example, you could loop through your books like this:
var doc = XDocument.Parse(xml);
foreach (var book in doc.Descendants("book"))
{
    Console.WriteLine("Title: {0}", (string) book.Element("title"));
    Console.WriteLine("ISBN: {0}", (string) book.Attribute("ISBN"));
    Console.WriteLine("Price: {0}", (decimal) book.Element("price"));
    Console.WriteLine("---");
}
See a working demo here: https://dotnetfiddle.net/m99eCl
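If you do have to stay with XmlReader (for a very large file, say), the usual approach is to do everything in a single forward pass instead of trying to go back. Here is a minimal sketch of that idea; the class and method names (BookPriceReader, ReadPrices) are made up for illustration, and it assumes every price sits in its own <price> element, as in the sample file:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class BookPriceReader
{
    // Collects the content of every <price> element in one forward pass.
    public static List<decimal> ReadPrices(string xml)
    {
        var prices = new List<decimal>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // ReadToFollowing advances until the next <price> start tag, or returns false at end of input.
            while (reader.ReadToFollowing("price"))
            {
                // Reads the element's text as a decimal and moves the reader past </price>.
                // Caveat: two <price> elements placed back to back with nothing between them
                // would need a different loop, because ReadToFollowing advances before it checks.
                prices.Add(reader.ReadElementContentAsDecimal());
            }
        }
        return prices;
    }
}
```

You would call it as BookPriceReader.ReadPrices(File.ReadAllText(path)), or adapt it to create the reader from the file path directly.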
I need a bit of help getting a list of XML nodes and printing them.
My code is as below:
XmlDocument doc = new XmlDocument();
doc.Load("To44532.xml");
XmlNode xn = doc.DocumentElement;
foreach (XmlNode xn2 in xn)
{
    Console.WriteLine(xn2);
}
Console.ReadLine();
I am new to C#, so please accept my apologies in advance for asking this basic question. I wanted to get the full list of nodes and then print them in the output.
I ended up with this piece of code because I wanted to debug some other code. The idea was to display specific nodes in WinForms. I tried an if statement, e.g.:
foreach (XmlNode node in doc.DocumentElement)
{
    if (node.Equals("DbtItm"))
    { ..... }
Could you please advise what's the best way to achieve this?
You can select XML Nodes by Name.
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
For example, to get only each book's author and year from the XML above:
XmlDocument xml = new XmlDocument();
xml.Load("XMLFile1.xml");
XmlNodeList xnList = xml.SelectNodes("/bookstore/book");
foreach (XmlNode xn in xnList)
{
    string author = xn["author"].InnerText;
    string year = xn["year"].InnerText;
    Console.WriteLine(author + "-" + year);
}
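As for the if statement in the question: node.Equals("DbtItm") compares the node object itself with a string, so it is never true. Comparing the node's Name is probably what was intended. A minimal sketch, assuming the DbtItm elements are direct children of the root element (the question does not show the file, so that layout is a guess, and the names NodeFilter and GetInnerTexts are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Xml;

public static class NodeFilter
{
    // Returns the InnerText of every direct child of the root whose element name matches.
    public static List<string> GetInnerTexts(string xml, string elementName)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var result = new List<string>();
        foreach (XmlNode node in doc.DocumentElement.ChildNodes)
        {
            // Compare the node's Name, not the node object itself.
            if (node.NodeType == XmlNodeType.Element && node.Name == elementName)
            {
                result.Add(node.InnerText);
            }
        }
        return result;
    }
}
```

If the elements can be nested deeper, doc.SelectNodes("//DbtItm") would find them at any level.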
This is my XML:
<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography" publicationdate="1981-03-22" ISBN="1-861003-11-0">
<author>
<title>The Autobiography of Benjamin Franklin</title>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel" publicationdate="1967-11-17" ISBN="0-201-63361-2">
<author>
<title>The Confidence Man</title>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>
Here is my code:
XPathNavigator nav;
XPathNodeIterator nodesList = nav.Select("//bookstore//book");
foreach (XPathNavigator node in nodesList)
{
    var price = node.Select("price");
    string currentPrice = price.Current.Value;
    var title = node.Select("author//title");
    string text = title.Current.Value;
}
I am getting the same output for both:
The Autobiography of Benjamin FranklinBenjaminFranklin8.99
I will have a condition like if (price > 10), then get the title. How do I fix this?
The method XPathNavigator.Select() that you are calling here:
var price = node.Select("price");
returns an XPathNodeIterator, so, as shown in the docs, you need to actually iterate through it, either with the old (C# 1.0!) style:
var price = node.Select("price");
while (price.MoveNext())
{
    string currentPriceValue = price.Current.Value;
    Console.WriteLine(currentPriceValue); // Prints 8.99
}
Or the newer foreach style, which does the same thing:
var price = node.Select("price");
foreach (XPathNavigator currentPrice in price)
{
    string currentPriceValue = currentPrice.Value;
    Console.WriteLine(currentPriceValue); // 8.99
}
In both examples above, the enumerator's current value is used after the first call to MoveNext(). In your code, you are using IEnumerator.Current before the first call to MoveNext(). And as explained in the docs:
Initially, the enumerator is positioned before the first element in the collection. You must call the MoveNext method to advance the enumerator to the first element of the collection before reading the value of Current; otherwise, Current is undefined.
The odd behavior you are seeing is a result of using Current when the value is undefined. (I would sort of expect an exception to be thrown in such a situation, but all these classes are very old -- dating from C# 1.1, I believe -- and coding standards were less stringent then.)
If you are sure there will be only one <price> node and don't want to have to iterate through multiple returned nodes, you could use LINQ syntax to pick out that single node:
var currentPriceValue = node.Select("price").Cast<XPathNavigator>().Select(p => p.Value).SingleOrDefault();
Console.WriteLine(currentPriceValue); // 8.99
Or switch to SelectSingleNode():
var currentPrice = node.SelectSingleNode("price");
var currentPriceValue = (currentPrice == null ? null : currentPrice.Value);
Console.WriteLine(currentPriceValue); // 8.99
Finally, consider switching to LINQ to XML for loading and querying arbitrary XML. It's just much simpler than the old XmlDocument API.
You can put the condition directly in the XPath expression.
XPathNodeIterator titleNodes = nav.Select("/bookstore/book[price>10]/author/title");
foreach (XPathNavigator titleNode in titleNodes)
{
    var title = titleNode.Value;
    Console.WriteLine(title);
}
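If the threshold is only known at run time, or you would rather keep the comparison in C#, the same filter can be written with SelectSingleNode. This is a rough sketch rather than the one right way; the names TitleFilter and TitlesAbove are made up for illustration, and it assumes the prices use a dot as the decimal separator, as in the sample XML:

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml.XPath;

public static class TitleFilter
{
    // Returns the titles of all books whose <price> is greater than minPrice.
    public static List<string> TitlesAbove(string xml, decimal minPrice)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        var titles = new List<string>();

        foreach (XPathNavigator book in nav.Select("/bookstore/book"))
        {
            // SelectSingleNode returns null when nothing matches, so check before using it.
            XPathNavigator price = book.SelectSingleNode("price");
            XPathNavigator title = book.SelectSingleNode("author/title");
            if (price == null || title == null)
            {
                continue;
            }

            // Parse with the invariant culture so "11.99" is read the same on every machine.
            if (decimal.Parse(price.Value, CultureInfo.InvariantCulture) > minPrice)
            {
                titles.Add(title.Value);
            }
        }
        return titles;
    }
}
```

With the XML from the question and a threshold of 10, only the second book (price 11.99) qualifies.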
I have a very long XML file and I need to identify the distinct tag names in it. I wonder if I can do this in my C# application with the XmlDocument library.
In this example XML, I want to find all the tag names: bookstore, book genre, title, first-name
<bookstore>
<book genre="novel">
<title>The Autobiography of Benjamin Franklin</title>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<first-name>Herman</first-name>
</book>
</bookstore>
Parse it as an XDocument and you could do this:
var names = doc.Descendants().Select(e => e.Name.LocalName).Distinct();
This will give you the results (in some order):
bookstore
book
title
first-name
Otherwise, if you must use an XmlDocument, you could do this:
var names = doc.DocumentElement
    .SelectNodes("//*").Cast<XmlNode>()
    .Select(e => e.LocalName)
    .Distinct();
You can use a HashSet to collect the distinct names, which is also very fast. Note that this version collects attribute names as well as element names.
var doc = XDocument.Load("test.xml");
var set = new HashSet<string>();
foreach (var node in doc.Descendants())
{
    set.Add(node.Name.LocalName);
    foreach (var attr in node.Attributes())
        set.Add(attr.Name.LocalName);
}
foreach (var name in set)
    Console.WriteLine(name);
I want to load all of the elements into memory and find the list of root-to-node paths for them. For example, in this XML:
<SigmodRecord>
<issue>
<volume>11</volume>
<number>1</number>
<articles>
<article>
<title>Annotated Bibliography on Data Design.</title>
<initPage>45</initPage>
<endPage>77</endPage>
<authors>
<author position="00">Anthony I. Wasserman</author>
<author position="01">Karen Botnich</author>
</authors>
</article>
<article>
<title>Architecture of Future Data Base Systems.</title>
<initPage>30</initPage>
<endPage>44</endPage>
<authors>
<author position="00">Lawrence A. Rowe</author>
<author position="01">Michael Stonebraker</author>
</authors>
</article>
<article>
<title>Database Directions III Workshop Review.</title>
<initPage>8</initPage>
<endPage>8</endPage>
<authors>
<author position="00">Tom Cook</author>
</authors>
</article>
<article>
<title>Errors in 'Process Synchronization in Database Systems'.</title>
<initPage>9</initPage>
<endPage>29</endPage>
<authors>
<author position="00">Philip A. Bernstein</author>
<author position="01">Marco A. Casanova</author>
<author position="02">Nathan Goodman</author>
</authors>
</article>
</articles>
</issue>
</SigmodRecord>
The answer must be something like this:
1 /SigmodRecord
2 /SigmodRecord/issue
3 /SigmodRecord/issue/volume
4 /SigmodRecord/issue/number
5 /SigmodRecord/issue/articles
6 /SigmodRecord/issue/articles/article
7 /SigmodRecord/issue/articles/article/title
8 /SigmodRecord/issue/articles/article/authors
9 /SigmodRecord/issue/articles/article/initPage
10 /SigmodRecord/issue/articles/article/endPage
11 /SigmodRecord/issue/articles/article/authors/author
You can use LINQ to XML (XLinq) to query the XML document and fetch the root node and its descendants.
XDocument xDoc = XDocument.Load("myXml.xml");
XElement element = xDoc.Root;
var descendants = element.DescendantsAndSelf(); // Returns the root element and all of its descendants
var namedDescendants = element.DescendantsAndSelf("nodeName"); // Returns only the elements with the specified name
Hope it helps!
One possible way is to recursively extract the path for each XML element (*):
public static List<string> GetXpaths(XDocument doc)
{
    var xpathList = new List<string>();
    var xpath = "";
    foreach (var child in doc.Elements())
    {
        GetXPaths(child, ref xpathList, xpath);
    }
    return xpathList;
}
public static void GetXPaths(XElement node, ref List<string> xpathList, string xpath)
{
    // Extend the parent's path with this element's name and record it once
    xpath += "/" + node.Name.LocalName;
    if (!xpathList.Contains(xpath))
        xpathList.Add(xpath);
    foreach (XElement child in node.Elements())
    {
        GetXPaths(child, ref xpathList, xpath);
    }
}
Usage example in a console application:
var doc = XDocument.Load("path_to_your_file.xml");
var result = GetXpaths(doc);
foreach (var path in result)
    Console.WriteLine(path);
.NET Fiddle demo
*) Adapted from my old answer to another question. Note that this only works for simple XML without namespaces.
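For comparison, the same list of distinct paths can be built without recursion by walking each element's ancestors. This is only a sketch of an alternative, with made-up names (PathLister, DistinctPaths), and it has the same limitation: it uses local names only, so namespaces are ignored.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PathLister
{
    // Builds "/root/child/..." for every element and keeps each distinct path once.
    public static List<string> DistinctPaths(XDocument doc)
    {
        return doc.Descendants()
            // AncestorsAndSelf() starts at the element and walks up, so reverse it to get root-first order.
            .Select(e => "/" + string.Join("/",
                e.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName)))
            .Distinct()
            .ToList();
    }
}
```

Because Descendants() visits elements in document order, the paths come out in the order in which they first appear in the file.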
I have an XmlNode which represents the following xml for example:
XmlNode xml.innerText =
<book>
<name><![CDATA[Harry Potter]]></name>
<author><![CDATA[J.K. Rolling]]></author>
</book>
I want to change this node so that it'll contain the following:
XmlNode xml.innerText =
<book>
<name>Harry Potter</name>
<author>J.K. Rolling</author>
</book>
Any ideas? Thanks!
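Well, if it's exactly how you put it, then it's easy:
// Strip the CDATA markers from the markup; only safe if the CDATA content contains no '<' or '&'
xml.InnerXml = xml.InnerXml.Replace("<![CDATA[", "").Replace("]]>", "");
xmlDoc.Save("books.xml"); // xmlDoc is your XML document; the file name here is a placeholder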
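I suggest you read your entire XML and rewrite it. You can read the values without the CDATA wrapper like this:
foreach (var child in doc.Root.Elements())
{
    string name = child.Name.LocalName; // XName does not convert implicitly to string
    string value = child.Value;         // Value returns the text without the CDATA markers
}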
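The rewrite itself can be done by swapping each CDATA node for a plain text node, which lets the serializer take care of escaping. A minimal sketch using LINQ to XML; the names CDataCleaner and ReplaceCDataWithText are made up for illustration, and it works on a string here only to keep the example self-contained:

```csharp
using System.Linq;
using System.Xml.Linq;

public static class CDataCleaner
{
    // Replaces every CDATA section under the root with an ordinary text node.
    public static string ReplaceCDataWithText(string xml)
    {
        XElement root = XElement.Parse(xml);

        // ToList() takes a snapshot first, because the tree is modified inside the loop.
        foreach (XCData cdata in root.DescendantNodes().OfType<XCData>().ToList())
        {
            // An XText node is escaped when saved, so '<' or '&' in the content stay valid XML.
            cdata.ReplaceWith(new XText(cdata.Value));
        }

        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

Unlike the string Replace approach, this does not produce broken markup when the CDATA content contains characters such as '<' or '&'.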