Collect attributes from an XML file with XmlReader

I have an XML file as below.
<BOOK bnumber="1" bname="Book">
<CHAPTER cnumber="1">
<Sentence vnumber="1">This is the sentence 1.</Sentence>
<Sentence vnumber="2">This is the sentence 2.</Sentence>
<Sentence vnumber="3">This is the sentence 3.</Sentence>
</CHAPTER>
<CHAPTER cnumber="2">
<Sentence vnumber="1">Hello World 1.</Sentence>
<Sentence vnumber="2">Hello World 2.</Sentence>
<Sentence vnumber="3">Hello World 3.</Sentence>
<Sentence vnumber="4">Hello World 4.</Sentence>
</CHAPTER>
<CHAPTER cnumber="3">
<Sentence vnumber="1">Good morning 1.</Sentence>
<Sentence vnumber="2">Good morning 2.</Sentence>
<Sentence vnumber="3">Good morning 3.</Sentence>
</CHAPTER>
</BOOK>
What I want is to collect the cnumber attribute of each CHAPTER element.
The goal is to get
Chapter = {"Chapter 1", "Chapter 2", "Chapter 3"};
Currently I use the traditional method:
XmlDocument xdoc = new XmlDocument();
xdoc.Load(@"C:\books.xml"); //load the xml file into our document
XmlNodeList nodes = xdoc.SelectNodes(@"//BOOK/CHAPTER[@cnumber='" + chap
string sentences = "";
foreach(XmlNode node in nodes) {
sentences += node.InnerText + "; ";
}
But I want to use XmlReader, because the XML file is big and I don't want to load all of it into memory.
Thanks for help.

Basically, you can do it like this:
var chapters = new List<string>();
using (XmlReader reader = XmlReader.Create(new StringReader(xmlString)))
{
reader.ReadToFollowing("CHAPTER");
reader.MoveToFirstAttribute();
string chapterNumber = reader.Value;
chapters.Add("Chapter " + chapterNumber);
}
where the xmlString is your xml.
This will find the first chapter and get the attribute from it and add it to a list of chapters.
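To collect every chapter rather than only the first, the same calls can be put in a loop. The following is a minimal sketch, assuming the attribute is named cnumber as in the question. The names ChapterCollector and Collect are made up for illustration. For a large file you would pass the file path to XmlReader.Create instead of a StringReader.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class ChapterCollector
{
    // Hypothetical helper: streams the input and returns "Chapter N"
    // for every CHAPTER element that carries a cnumber attribute.
    public static List<string> Collect(TextReader input)
    {
        var chapters = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            // ReadToFollowing returns false once no further CHAPTER element exists.
            while (reader.ReadToFollowing("CHAPTER"))
            {
                // GetAttribute looks the attribute up by name,
                // so the order of attributes does not matter.
                string number = reader.GetAttribute("cnumber");
                if (number != null)
                    chapters.Add("Chapter " + number);
            }
        }
        return chapters;
    }
}

static class Program
{
    static void Main()
    {
        const string xml =
            "<BOOK bnumber=\"1\" bname=\"Book\">" +
            "<CHAPTER cnumber=\"1\"><Sentence vnumber=\"1\">This is the sentence 1.</Sentence></CHAPTER>" +
            "<CHAPTER cnumber=\"2\"><Sentence vnumber=\"1\">Hello World 1.</Sentence></CHAPTER>" +
            "</BOOK>";

        foreach (string chapter in ChapterCollector.Collect(new StringReader(xml)))
            Console.WriteLine(chapter);
    }
}
```

Because the reader only moves forward, memory use stays roughly constant no matter how many chapters the file contains.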

Related

How to retrieve tags inside certain tag for XML using GetElementsByTagName/SelectNode/SelectSingleNode?

Let say I have a XML with this format:
<TEST>
<DRINK>
<NAME>Ice tea</NAME>
<NAME>Milo</NAME>
<NAME>Coffee</NAME>
</DRINK>
<FOOD>
<NAME>Fried Rice</NAME>
<NAME>Hamburger</NAME>
<NAME>Fried Noodles</NAME>
</FOOD>
</TEST>
How to retrieve only food names and put them in the ASP.NET web form textbox?
This is my current code:
XmlDocument doc = new XmlDocument();
doc.Load(filepath);
root = doc.DocumentElement;
TextBox1.Text = root.GetElementsByTagName("NAME")[0].InnerText;
TextBox2.Text = root.GetElementsByTagName("NAME")[1].InnerText;
TextBox3.Text = root.GetElementsByTagName("NAME")[2].InnerText;
This code retrieves the drink names instead of the food names. How can I make it read the NAME tags inside the FOOD tag?

Use the XmlNode.SelectNodes method and give it an XPath expression.
var foodElements = root.SelectNodes("FOOD/NAME");
Console.WriteLine(foodElements[0].InnerText);
Console.WriteLine(foodElements[1].InnerText);
Console.WriteLine(foodElements[2].InnerText);

Basic XML parsing, like this:
{
string sXML = @"<TEST>
<DRINK>
<NAME>Ice tea</NAME >
<NAME>Milo</NAME >
<NAME>Coffee</NAME>
</DRINK>
<FOOD>
<NAME>Fried Rice</NAME>
<NAME>Hamburger</NAME>
<NAME>Fried Noodles</NAME>
</FOOD>
</TEST>";
XmlDocument myXml = new XmlDocument();
myXml.LoadXml(sXML);
XmlNodeList myNodes = myXml.SelectNodes("TEST/FOOD/NAME");
foreach (XmlNode OneNode in myNodes)
{
Debug.Print(OneNode.InnerText);
}
}
Output:
Fried Rice
Hamburger
Fried Noodles

C# Dividing XML into parts

I am trying to divide an XML file into parts
I have an XML file like this
<?xml version="1.0" encoding="utf-8"?>
<RegistrationOpenData xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://example.gov">
<Description>Registration data is collected by ABC XYZ</Description>
<InformationURL>http://www.example.com/html/hpd/property-reg-unit.shtml</InformationURL>
<SourceAgency>ABC Department of Housing</SourceAgency>
<SourceSystem>PREMISYS</SourceSystem>
<StartDate>2016-02-29T00:03:06.642772-05:00</StartDate>
<EndDate i:nil="true" />
<Registrations>
<Registration xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<RegistrationID>1</RegistrationID>
<BuildingID>1A</BuildingID>
<element1>E11</element1>
<element2>E21</element2>
<element3>E31</element3>
<element4>E41</element4>
</Registration>
<Registration xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<RegistrationID>2</RegistrationID>
<BuildingID>2A</BuildingID>
<element1>E21</element1>
<element2>E22</element2>
<element3>E32</element3>
<element4>E42</element4>
</Registration>
</Registrations>
</RegistrationOpenData>
And I am trying to fetch the number of nodes through this code
XmlDocument doc = null;
doc = new XmlDocument();
doc.Load(@"D:\Registrations20160229.xml");
XmlNodeReader nodeReader = new XmlNodeReader(doc);
XmlElement root = doc.DocumentElement;
XmlNodeList elemList = root.GetElementsByTagName("Registration");
int totalnode = elemList.Count;
int nodehalf = totalnode / 2;
MessageBox.Show(nodehalf.ToString());
But after this I am unable to proceed. I used this code to count the Registration nodes and halve the total, but I don't know how to go on and split the file. There are 158718 entries (Registration nodes) in the file, sometimes even more, and I am trying to break it into parts, maybe 3 to 4.

Try this; it should not load the whole XML file into memory:
int counter = 0;
using (XmlReader reader = XmlReader.Create(@"D:\Registrations20160229.xml"))
{
while (reader.Read())
{
// Count only start tags named Registration (not the Registrations wrapper)
if (reader.NodeType == XmlNodeType.Element && reader.Name == "Registration")
counter++;
}
}
Console.WriteLine(counter);
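Counting is only the first half of the question. The same forward-only loop can also copy each Registration element into a part file as it goes. This is a rough sketch under a few assumptions: the helper name RegistrationSplitter.Split, the output naming scheme (prefix plus part number plus ".xml") and the Registrations wrapper element in each part are all choices made for illustration, and the header elements of the original file (Description, SourceAgency and so on) are not copied.

```csharp
using System;
using System.IO;
using System.Xml;

static class RegistrationSplitter
{
    // Hypothetical helper: copies Registration elements into files named
    // <outputPrefix>1.xml, <outputPrefix>2.xml, ... with at most perPart
    // elements in each. Returns the number of part files written.
    public static int Split(string inputPath, string outputPrefix, int perPart)
    {
        int parts = 0, inCurrent = 0;
        XmlWriter writer = null;
        try
        {
            using (XmlReader reader = XmlReader.Create(inputPath))
            {
                reader.MoveToContent();
                while (!reader.EOF)
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "Registration")
                    {
                        if (writer == null)
                        {
                            parts++;
                            inCurrent = 0;
                            writer = XmlWriter.Create(outputPrefix + parts + ".xml");
                            // Wrapper root in the same namespace as the copied elements.
                            writer.WriteStartElement("Registrations", reader.NamespaceURI);
                        }
                        // WriteNode copies the whole element and moves the reader
                        // past it, so Read() must not be called in this branch.
                        writer.WriteNode(reader, false);
                        if (++inCurrent == perPart)
                        {
                            writer.WriteEndElement();
                            writer.Dispose();
                            writer = null;
                        }
                    }
                    else
                    {
                        reader.Read();
                    }
                }
            }
            if (writer != null) writer.WriteEndElement();
        }
        finally
        {
            if (writer != null) writer.Dispose();
        }
        return parts;
    }
}

static class Program
{
    static void Main(string[] args)
    {
        // The three arguments are placeholders: input file, output prefix, elements per part.
        if (args.Length != 3)
        {
            Console.WriteLine("usage: <input.xml> <outputPrefix> <perPart>");
            return;
        }
        int parts = RegistrationSplitter.Split(args[0], args[1], int.Parse(args[2]));
        Console.WriteLine("Wrote " + parts + " part file(s).");
    }
}
```

With 158718 entries and a perPart value of 50000, this would produce four files, the last one smaller than the others. Only one Registration element is in flight at any time, so the input size should not matter much.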

How to go back to the root element in XML using C#?

I am new to XML Programming using C# and have been trying to grasp the concepts. I have a 2books.xml file which looks like
<!--sample XML fragment-->
<bookstore>
<book genre='novel' ISBN='10-861003-324'>
<title>The Handmaid's Tale</title>
<price>19.95</price>
</book>
<book genre='novel' ISBN='1-861001-57-5'>
<title>Pride And Prejudice</title>
<price>24.95</price>
</book>
<book genre='novel' ISBN='1-861991-57-9'>
<title>The Honor</title>
<price>20.12</price>
</book>
</bookstore>
Now using XmlReader when I try this following section of code
using (XmlReader xReader = XmlReader.Create(@"C:\Users\Chiranjib\Desktop\2books.xml"))
{
xReader.MoveToContent();
Console.WriteLine("-----------> Now "+xReader.Name);
Console.WriteLine("------Inner XML -----> "+xReader.ReadInnerXml()); //Positions the reader to the next root element type after the call
Console.WriteLine("------OuterXML XML -----> " + xReader.ReadOuterXml()); //Positions the reader to the next root element type after the call -- for a leaf node it reacts the same way as Read()
while (xReader.Read())
{
Console.WriteLine("In Loop");
if ((xReader.NodeType == XmlNodeType.Element) && (xReader.Name == "book"))
{
xReader.ReadToFollowing("price");
Console.WriteLine("---------- In Loop -------- Price "+xReader.GetAttribute("price"));
}
}
}
Console.ReadKey();
}
Obviously, xReader.ReadInnerXml() leaves the reader at the end of the file after the call, and as a result xReader.ReadOuterXml() prints nothing.
Now I want xReader.ReadOuterXml() to be called successfully. How can I get back to my previous root node?
I tried xReader.MoveToElement(), but I guess it does not do that.

You can't really do that, as it's not what XmlReader was designed for. What you probably want is a much higher level API like LINQ to XML.
For example, you could loop through your books like this:
var doc = XDocument.Parse(xml);
foreach (var book in doc.Descendants("book"))
{
Console.WriteLine("Title: {0}", (string) book.Element("title"));
Console.WriteLine("ISBN: {0}", (string) book.Attribute("ISBN"));
Console.WriteLine("Price: {0}", (decimal) book.Element("price"));
Console.WriteLine("---");
}
See a working demo here: https://dotnetfiddle.net/m99eCl
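If the file is too large to parse in one go, the two APIs can be combined: XmlReader walks the file forward, and XNode.ReadFrom turns each book element into a small XElement that can be queried freely. This is a sketch of that commonly used pattern, not something taken from the answer above; BookStreamer and Stream are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class BookStreamer
{
    // Hypothetical helper: yields one XElement per element named elementName,
    // so only a single element is materialized at a time.
    public static IEnumerable<XElement> Stream(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the element and leaves the reader on the
                    // node after it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}

static class Program
{
    static void Main()
    {
        const string xml =
            "<bookstore>" +
            "<book genre='novel' ISBN='10-861003-324'><title>The Handmaid's Tale</title><price>19.95</price></book>" +
            "<book genre='novel' ISBN='1-861001-57-5'><title>Pride And Prejudice</title><price>24.95</price></book>" +
            "</bookstore>";

        foreach (XElement book in BookStreamer.Stream(new StringReader(xml), "book"))
        {
            Console.WriteLine("Title: {0}", (string)book.Element("title"));
            Console.WriteLine("ISBN: {0}", (string)book.Attribute("ISBN"));
            Console.WriteLine("Price: {0}", (decimal)book.Element("price"));
        }
    }
}
```

Within each yielded element you can move around as much as you like (inner XML, outer XML, parent and child lookups), which is the kind of navigation the forward-only reader cannot offer on its own.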

Obtain all Child Nodes of Specific Type Returns Nothing

I am attempting to get all the Pit XML elements from a Drainage_String XML node.
My Problem: When I go to retrieve all the Pit elements from the node, the XMLNodeList is always empty. I know that the node does contain 2 Pit elements so it should contain 2 node elements.
What is going wrong?
XmlDocument xdoc = new XmlDocument();
xdoc.Load(xmlFilePath);
XmlNodeList xNodes = xdoc.DocumentElement.GetElementsByTagName("string_drainage");
foreach (XmlNode dStr in xNodes) {
XmlNodeList pits = dStr.SelectNodes("pit");
MessageBox.Show("Num: "+pits.Count.ToString(), "Number");
// always outputs "Num: 0"
}
Example data I am using:
<string_drainage>
<pit>
<name>MH. </name>
<ip>0</ip>
<ratio>0</ratio>
<x>212908.89268569</x>
<y>612015.26122586</y>
<z>80.62414621</z>
</pit>
</string_drainage>
Detailed data:
<?xml version="1.0"?>
<xml12d xmlns="http://www.12d.com/schema/xml12d-10.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" language="English" version="1.0" date="2013-08-27" time="16:33:14" xsi:schemaLocation="http://www.12d.com/schema/xml12d-10.0 http://www.12d.com/schema/xml12d-10.0/xml12d.xsd">
<meta_data>
<units>
<metric>
<linear>metre</linear>
<area>square metre</area>
<volume>cubic metre</volume>
<temperature>celsius</temperature>
<pressure>millibars</pressure>
<angular>decimal degrees</angular>
<direction>decimal degrees</direction>
</metric>
</units>
<application>
<name>12d Model</name>
<manufacturer>12d Solutions Pty Ltd</manufacturer>
<manufacturer_url>www.12d.com</manufacturer_url>
<application>12d Model 10.0C1j</application>
<application_build>10.1.10.22</application_build>
<application_path>C:\Program Files\12d\12dmodel\10.00\nt.x64\12d.exe</application_path>
<application_date_gmt>24-Jul-2013 02:18:30</application_date_gmt>
<application_date>24-Jul-2013 12:18:30</application_date>
<project_name>mjkhjk</project_name>
<project_guid>{30A05217-706A-41c1-AF53-0D1A0975A5D0}</project_guid>
<project_folder>C:\12djobs\mjkhjk</project_folder>
<client>12d Training - NSW</client>
<dongle>572d471062</dongle>
<environment/>
<env4d>C:\12d\10.00\user\env.4d</env4d>
<user>Sam Zielke-Ryner</user>
<export_file_name>Tttt.xml</export_file_name>
<export_date_gmt>27-Aug-2013 06:33:14</export_date_gmt>
<export_date>27-Aug-2013 16:33:14</export_date>
</application>
</meta_data>
<comments>
<manufacturer>12d Solutions Pty Ltd</manufacturer>
<manufacturer_url>www.12d.com</manufacturer_url>
<application>12d Model 10.0C1j</application>
<application_build>10.1.10.22</application_build>
<application_path>C:\Program Files\12d\12dmodel\10.00\nt.x64\12d.exe</application_path>
<application_date_gmt>24-Jul-2013 02:18:30</application_date_gmt>
<application_date>24-Jul-2013 12:18:30</application_date>
<export_file_name>Tttt.xml</export_file_name>
<export_date_gmt>27-Aug-2013 06:33:14</export_date_gmt>
<export_date>27-Aug-2013 16:33:14</export_date>
</comments>
<string_drainage>
<name/>
<time_created>29-Jul-2013 02:02:03</time_created>
<time_updated>29-Jul-2013 02:02:11</time_updated>
<outfall>null</outfall>
<flow_direction>1</flow_direction>
<use_pit_con_points>false</use_pit_con_points>
<data_3d>
<p>212908.89268569 612015.26122586 0</p>
<p>212715.09268598 612007.24091243 84.20896044</p>
</data_3d>
<pit>
<name>MH. </name>
<type>CONC COVER</type>
<road_name/>
<road_chainage>null</road_chainage>
<diameter>1.1</diameter>
<con_point_mode>Points</con_point_mode>
<floating>true</floating>
<hgl>null</hgl>
<chainage>0</chainage>
<ip>0</ip>
<ratio>0</ratio>
<x>212908.89268569</x>
<y>612015.26122586</y>
<z>80.62414621</z>
</pit>
<pit>
<name>MH. </name>
<type>CONC COVER</type>
<road_name/>
<road_chainage>null</road_chainage>
<diameter>1.1</diameter>
<con_point_mode>Points</con_point_mode>
<floating>true</floating>
<hgl>null</hgl>
<chainage>193.96588699</chainage>
<ip>1</ip>
<ratio>0</ratio>
<x>212715.09268598</x>
<y>612007.24091243</y>
<z>84.20896044</z>
</pit>
<pipe>
<name>A</name>
<type>PVC</type>
<diameter>0.15</diameter>
<nominal_diameter>0.15</nominal_diameter>
<us_level>77.38411559</us_level>
<ds_level>79.32377446</ds_level>
<us_hgl>0</us_hgl>
<ds_hgl>0</ds_hgl>
<flow_velocity>0</flow_velocity>
<flow_volume>0</flow_volume>
<attributes>
<real>
<name>nominal diameter</name>
<value>0.15</value>
</real>
<real>
<name>calculated critical cover chainage</name>
<value>4.31482574</value>
</real>
</attributes>
</pipe>
</string_drainage>
</xml12d>

The call to SelectNodes needs the document's default namespace, supplied through an XmlNamespaceManager.
XmlNamespaceManager nsmgr = new XmlNamespaceManager(xdoc.NameTable);
nsmgr.AddNamespace("x", xdoc.DocumentElement.NamespaceURI);
XmlNodeList pits = dStr.SelectNodes("x:pit", nsmgr);
A tip for XML files: always give namespaces an explicit prefix in the files you work with. Otherwise it's harder to write XPath expressions against them.
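A small self-contained check of the namespace fix is sketched below, assuming a cut-down version of the question's document that keeps only the root's default namespace and two pit elements. PitCounter and CountPits are hypothetical names, and the prefix "x" is arbitrary; it only has to match between AddNamespace and the XPath expressions.

```csharp
using System;
using System.Xml;

static class PitCounter
{
    // Hypothetical helper: counts pit elements under every string_drainage
    // element, binding the document's default namespace to the prefix "x".
    // Assumes the root element declares a default namespace.
    public static int CountPits(string xml)
    {
        var xdoc = new XmlDocument();
        xdoc.LoadXml(xml);

        var nsmgr = new XmlNamespaceManager(xdoc.NameTable);
        nsmgr.AddNamespace("x", xdoc.DocumentElement.NamespaceURI);

        int total = 0;
        foreach (XmlNode dStr in xdoc.DocumentElement.SelectNodes("//x:string_drainage", nsmgr))
        {
            // Without the prefix and the manager this inner query matches nothing.
            total += dStr.SelectNodes("x:pit", nsmgr).Count;
        }
        return total;
    }
}

static class Program
{
    static void Main()
    {
        const string xml =
            "<xml12d xmlns=\"http://www.12d.com/schema/xml12d-10.0\">" +
            "<string_drainage>" +
            "<pit><name>MH. </name><ip>0</ip></pit>" +
            "<pit><name>MH. </name><ip>1</ip></pit>" +
            "</string_drainage>" +
            "</xml12d>";

        Console.WriteLine("Num: " + PitCounter.CountPits(xml));
    }
}
```

GetElementsByTagName in the question matches on the qualified name and so finds string_drainage without any namespace setup, while XPath treats an unprefixed name as belonging to no namespace. That difference is why the outer loop in the question ran but the inner SelectNodes call came back empty.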

I hope this will help you
test.xml is a copy/paste of your xml
static void Main(string[] args)
{
XmlDocument xdoc = new XmlDocument();
xdoc.Load(@"c:\test.xml");
// The document declares a default namespace, so XPath needs a prefix bound to it
XmlNamespaceManager ns = new XmlNamespaceManager(xdoc.NameTable);
ns.AddNamespace("x", xdoc.DocumentElement.NamespaceURI);
XmlNodeList xNodes = xdoc.DocumentElement.SelectNodes("//x:pit", ns);
Console.WriteLine("Num: " + xNodes.Count.ToString());
foreach (XmlNode dStr in xNodes)
{
Console.WriteLine("Name: " + dStr.SelectSingleNode("x:name", ns).InnerText);
Console.WriteLine("ip: " + dStr.SelectSingleNode("x:ip", ns).InnerText);
Console.WriteLine("ratio: " + dStr.SelectSingleNode("x:ratio", ns).InnerText);
Console.WriteLine("Z: " + dStr.SelectSingleNode("x:z", ns).InnerText);
Console.WriteLine("Y: " + dStr.SelectSingleNode("x:y", ns).InnerText);
Console.WriteLine("X: " + dStr.SelectSingleNode("x:x", ns).InnerText);
}
Console.Read();
}

It may be better to use XmlNode with the SelectNodes and SelectSingleNode methods of XmlDocument.

get this xml value with c#

I need to get the number next to the word text, in this case the number is 1
<SD>
<POPULARITY URL="google.com/" TEXT="1"/>
<REACH RANK="1"/>
<RANK DELTA="+0"/>
</SD>
How can I get the number in C#?
Thanks

In addition to the other examples, you could try using LINQ to XML.
See below.
var str = @"<ALEXA VER='0.9' URL='google.com/' HOME='0' AID='='>
<SD TITLE='A' FLAGS='DMOZ' HOST='google.com'>
<TITLE TEXT='Google '/>
<ADDR STREET='' CITY='' STATE='' ZIP='' COUNTRY='' />
<CREATED DATE='15-Sep-1997' DAY='15' MONTH='09' YEAR='1997'/>
<PHONE NUMBER='unlisted'/>
<OWNER NAME='unlisted'/>
<EMAIL ADDR='dns-admin@google.com'/>
<LANG LEX='en'/>
<LINKSIN NUM='704402'/>
<SPEED TEXT='1581' PCT='48'/>
<REVIEWS AVG='4.5' NUM='524'/>
<CHILD SRATING='0'/>
<ASSOCS>
<ASSOC ID='googlecom'/></ASSOCS>
</SD>
<KEYWORDS>
<KEYWORD VAL='Mountain View'/>
</KEYWORDS><DMOZ>
<SITE BASE='google.com/' TITLE='Google' DESC='Enables users to search the Web, Usenet, and images. Features include PageRank, caching and translation of results, and an option to find similar pages. The companys focus is developing search technology.'>
<CATS>
<CAT ID='Top/Computers/Internet/Searching/Search_Engines/Google' TITLE='Search Engines/Google' CID='374822'/>
<CAT ID='Top/Regional/North_America/United_States/California/Localities/M/Mountain_View/Business_and_Economy/Industrial/Computers_and_Internet' TITLE='Industrial/Computers and Internet' CID='625367'/>
<CAT ID='Top/World/Arabic/إقليمـي/الشرق_الأوسط/السعودية/تجارة_و_أقتصاد/كمبيوتر_و_إنترنت/محركات_بحث' TITLE='كمبيوتر و إنترنت/محركات بحث' CID='204954'/>
<CAT ID='Top/World/Français/Informatique/Internet/Recherche/Moteurs_de_recherche/Google' TITLE='Moteurs de recherche/Google' CID='247347'/>
</CATS>
</SITE>
</DMOZ>
<SD>
<POPULARITY URL='google.com/' TEXT='1'/>
<REACH RANK='1'/>
<RANK DELTA='+0'/>
</SD>
</ALEXA>";
var item = XElement.Parse(str);
var subSet = item.Elements("SD");
var actualItem = subSet.Where(x => x.Element("POPULARITY") != null).First();
var value = actualItem.Element("POPULARITY").Attribute("TEXT").Value;
Hope this helps

Something like this:
XmlDocument doc = new XmlDocument();
doc.LoadXml(@"<SD> <POPULARITY URL=""google.com/"" TEXT=""1""/> <REACH RANK=""1""/> <RANK DELTA=""+0""/> </SD> ");
XmlNode root = doc.FirstChild;
Debug.WriteLine(root["POPULARITY"].Attributes["TEXT"].InnerXml);

You can try:
XmlDocument doc = new XmlDocument();
doc.LoadXml(stringWithYourXml);
XmlNode node = doc.SelectSingleNode("/SD/POPULARITY");
var val = node.Attributes["TEXT"].Value;
Please treat this as a sample only (add null checks and error handling).
