get attribute name in addition to attribute value in xml - c#

I am receiving dynamic xml where I won't know the attribute names, if you'll look at the xml and code... I tried to make a simple example, I can get the attribute values i.e. "myName", "myNextAttribute", and "blah", but I can't get the attribute names i.e. "name", "nextAttribute", and "etc1". Any ideas, I figure it has to be something easy I'm missing...but I'm sure missing it.
static void Main(string[] args)
{
string xml = "<test name=\"myName\" nextAttribute=\"myNextAttribute\" etc1=\"blah\"/>";
TextReader sr = new StringReader(xml);
using (XmlReader xr = XmlReader.Create(sr))
{
while (xr.Read())
{
switch (xr.NodeType)
{
case XmlNodeType.Element:
if (xr.HasAttributes)
{
for (int i = 0; i < xr.AttributeCount; i++)
{
System.Windows.Forms.MessageBox.Show(xr.GetAttribute(i));
}
}
break;
default:
break;
}
}
}
}

You can see in MSDN:
if (reader.HasAttributes) {
Console.WriteLine("Attributes of <" + reader.Name + ">");
while (reader.MoveToNextAttribute()) {
Console.WriteLine(" {0}={1}", reader.Name, reader.Value);
}
// Move the reader back to the element node.
reader.MoveToElement();
}

Your switch is unnecessary since you only have a single case, try rolling that into your if statement instead.
if (xr.NodeType && xr.HasAttributes)
{
...
}
Note that the && operator evaluates in order, so if xr.NoteType is false, the rest of the arguments are ignored and the if block is skipped.

Related

Url Xml Parsing In c#

i want to get a data from a xml site but i want spesific data i want to get USD/TRY, GBP/TRY and EUR/TRY Forex Buying values i dont know how to split those values from the data i have a test console program and the is like this
using System;
using System.Xml;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
string XmlUrl = "https://www.tcmb.gov.tr/kurlar/today.xml";
XmlTextReader reader = new XmlTextReader(XmlUrl);
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element: // The node is an element.
Console.Write("<" + reader.Name);
while (reader.MoveToNextAttribute()) // Read the attributes.
Console.Write(" " + reader.Name + "='" + reader.Value + "'");
Console.Write(">");
Console.WriteLine(">");
break;
case XmlNodeType.Text: //Display the text in each element.
Console.WriteLine(reader.Value);
break;
case XmlNodeType.EndElement: //Display the end of the element.
Console.Write("</" + reader.Name);
Console.WriteLine(">");
break;
}
}
}
}
}
how can i split the values from i want from the xml
My desired output is this
public class ParaBirimi
{
public decimal ForexBuying { get; set; }
public string Name { get; set; }//Values like USD GBP EUR
}
class to a list
It is better to use LINQ to XML API. It is available in the .Net Framework since 2007.
Here is your starting point. You can extend it to read any attribute or element.
XML fragment
<Tarih_Date Tarih="28.05.2021" Date="05/28/2021" Bulten_No="2021/100">
<Currency CrossOrder="0" Kod="USD" CurrencyCode="USD">
<Unit>1</Unit>
<Isim>ABD DOLARI</Isim>
<CurrencyName>US DOLLAR</CurrencyName>
<ForexBuying>8.5496</ForexBuying>
<ForexSelling>8.5651</ForexSelling>
<BanknoteBuying>8.5437</BanknoteBuying>
<BanknoteSelling>8.5779</BanknoteSelling>
<CrossRateUSD/>
<CrossRateOther/>
</Currency>
<Currency CrossOrder="1" Kod="AUD" CurrencyCode="AUD">
<Unit>1</Unit>
<Isim>AVUSTRALYA DOLARI</Isim>
<CurrencyName>AUSTRALIAN DOLLAR</CurrencyName>
<ForexBuying>6.5843</ForexBuying>
<ForexSelling>6.6272</ForexSelling>
<BanknoteBuying>6.5540</BanknoteBuying>
<BanknoteSelling>6.6670</BanknoteSelling>
<CrossRateUSD>1.2954</CrossRateUSD>
<CrossRateOther/>
</Currency>
...
</Tarih_Date>
c#
void Main()
{
const string URL = #"https://www.tcmb.gov.tr/kurlar/today.xml";
XDocument xdoc = XDocument.Load(URL);
foreach (XElement elem in xdoc.Descendants("Currency"))
{
Console.WriteLine("CrossOrder: '{1}', Kod: '{1}', CurrencyCode: '{2}', Isim: '{3}', CurrencyName: '{4}'{0}"
, Environment.NewLine
, elem.Attribute("CrossOrder").Value
, elem.Attribute("Kod").Value
, elem.Attribute("CurrencyCode").Value
, elem.Element("Isim").Value
, elem.Element("CurrencyName").Value
);
}
}
Output
CrossOrder: '0', Kod: '0', CurrencyCode: 'USD', Isim: 'USD', CurrencyName: 'ABD DOLARI'
CrossOrder: '1', Kod: '1', CurrencyCode: 'AUD', Isim: 'AUD', CurrencyName: 'AVUSTRALYA DOLARI'

ReadOuterXml is throwing OutOfMemoryException reading part of large (1 GB) XML file

I am working on a large XML file and while running the application, XmlTextReader.ReadOuterXml() method is throwing memory exception.
Lines of codes are like,
XmlTextReader xr = null;
try
{
xr = new XmlTextReader(fileName);
while (xr.Read() && success)
{
if (xr.NodeType != XmlNodeType.Element)
continue;
switch (xr.Name)
{
case "A":
var xml = xr.ReadOuterXml();
var n = GetDetails(xml);
break;
}
}
}
catch (Exception ex)
{
//Do stuff
}
Using:
private int GetDetails (string xml)
{
var rootNode = XDocument.Parse(xml);
var xnodes = rootNode.XPathSelectElements("//A/B").ToList();
//Then working on list of nodes
}
Now while loading the XML files, the application throwing exception on the xr.ReadOuterXml() line. What can be done to avoid this? The size of XML is almost 1 GB.
The most likely reason you are getting a OutOfMemoryException in ReadOuterXml() is that you are trying to read in a substantial portion of the 1 GB XML document into a string, and are hitting the Maximum string length in .Net.
So, don't do that. Instead load directly from the XmlReader using XDocument.Load() with XmlReader.ReadSubtree():
using (var xr = XmlReader.Create(fileName))
{
while (xr.Read() && success)
{
if (xr.NodeType != XmlNodeType.Element)
continue;
switch (xr.Name)
{
case "A":
{
// ReadSubtree() positions the reader at the EndElement of the element read, so the
// next call to Read() moves to the next node.
using (var subReader = xr.ReadSubtree())
{
var doc = XDocument.Load(subReader);
GetDetails(doc);
}
}
break;
}
}
}
And then in GetDetails() do:
private int GetDetails(XDocument rootDocument)
{
var xnodes = rootDocument.XPathSelectElements("//A/B").ToList();
//Then working on list of nodes
return xnodes.Count;
}
Not only will this use less memory, it will also be more performant. ReadOuterXml() uses a temporary XmlWriter to copy the XML in the input stream to an output StringWriter (which you then parse a second time). This version of the algorithm completely skips this extra work. It also avoids creating strings large enough to go on the large object heap which can cause additional performance issues.
If this is still using too much memory you will need to implement SAX-like parsing for your XML where you only load one element <B> at a time. First, introduce the following extension method:
public static partial class XmlReaderExtensions
{
public static IEnumerable<XElement> WalkXmlElements(this XmlReader xmlReader, Predicate<Stack<XName>> filter)
{
Stack<XName> names = new Stack<XName>();
while (xmlReader.Read())
{
if (xmlReader.NodeType == XmlNodeType.Element)
{
names.Push(XName.Get(xmlReader.LocalName, xmlReader.NamespaceURI));
if (filter(names))
{
using (var subReader = xmlReader.ReadSubtree())
{
yield return XElement.Load(subReader);
}
}
}
if ((xmlReader.NodeType == XmlNodeType.Element && xmlReader.IsEmptyElement)
|| xmlReader.NodeType == XmlNodeType.EndElement)
{
names.Pop();
}
}
}
}
Then, use it as follows:
using (var xr = XmlReader.Create(fileName))
{
Predicate<Stack<XName>> filter =
(stack) => stack.Peek().LocalName == "B" && stack.Count > 1 && stack.ElementAt(1).LocalName == "A";
foreach (var element in xr.WalkXmlElements(filter))
{
//Then working on the specific node.
}
}
using (var reader = XmlReader.Create(fileName))
{
XmlDocument oXml = new XmlDocument();
while (reader.Read())
{
oXml.Load(reader);
}
}
For me above code resolved the issue when we return it to XmlDocument through XmlDocument Load method

Build XPath for node from XmlReader

I am writing an application which parses dynamic xml from various sources and traverses the XML and returns all the unique elements.
Given the sometimes very large size of the Xml files I am using a XmlReader to parse the Xml structure due to memory constraints.
public IDictionary<string, int> Discover(string filePath)
{
Dictionary<string, string> nodeTable = new Dictionary<string, string>();
using (XmlReader reader = XmlReader.Create(filePath))
{
while (!reader.EOF)
{
if (reader.NodeType == XmlNodeType.Element)
{
if (!nodeTable.ContainsKey(reader.LocalName))
{
nodeTable.Add(reader.LocalName, reader.Depth);
}
}
reader.Read();
}
}
Debug.WriteLine("The node table has {0} items.", nodeTable.Count);
return nodeTable;
}
This works a treat and is nice and performant, however the final piece of the puzzle eludes me, I am trying to generate the XPath for each element.
Now, this at first seemed straight forward using something like this.
var elements = new Stack<string>();
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
elements.Push(reader.LocalName);
break;
case XmlNodeType.EndElement:
elements.Pop();
break;
case XmlNodeType.Text:
path = string.Join("/", elements.Reverse());
break;
}
}
But this only really gives me one part of the solution. Given that I wish to return the XPath for every node in the tree which contains data and also detect if a given node tree contains nested collections of data.
i.e.
<customers>
<customer id=2>
<name>ted smith</name>
<addresses>
<address1>
<line1></line1>
</address1>
<address2>
<line1></line1>
<line2></line2>
</address2>
</addresses>
</customer>
<customer id=322>
<name>smith mcsmith</name>
<addresses>
<address1>
<line1></line1>
<line2></line2>
</address1>
<address2>
<line1></line1>
<line2></line2>
</address2>
</addresses>
</customer>
</customers>
Keeping in mind the data is completely dynamic and the schema is unknown.
So the output should include
/customer/name
/customer/address1/line1
/customer/address1/line2
/customer/address2/line1
/customer/address2/line2
I like using recursive method rather than push/pop. See code below
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string input =
"<customers>" +
"<customer id=\"2\">" +
"<name>ted smith</name>" +
"<addresses>" +
"<address1>" +
"<line1></line1>" +
"</address1>" +
"<address2>" +
"<line1></line1>" +
"<line2></line2>" +
"</address2>" +
"</addresses>" +
"</customer>" +
"<customer id=\"322\">" +
"<name>smith mcsmith</name>" +
"<addresses>" +
"<address1>" +
"<line1></line1>" +
"<line2></line2>" +
"</address1>" +
"<address2>" +
"<line1></line1>" +
"<line2></line2>" +
"</address2>" +
"</addresses>" +
"</customer>" +
"</customers>";
StringReader sReader = new StringReader(input);
XmlReader reader = XmlReader.Create(sReader);
Node root = new Node();
ReadNode(reader, root);
}
static bool ReadNode(XmlReader reader, Node node)
{
Boolean done = false;
Boolean endElement = false;
while(done = reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
if (node.name.Length == 0)
{
node.name = reader.Name;
GetAttrubutes(reader, node);
}
else
{
Node newNode = new Node();
newNode.name = reader.Name;
if (node.children == null)
{
node.children = new List<Node>();
}
node.children.Add(newNode);
GetAttrubutes(reader, newNode);
done = ReadNode(reader, newNode);
}
break;
case XmlNodeType.EndElement:
endElement = true;
break;
case XmlNodeType.Text:
node.text = reader.Value;
break;
case XmlNodeType.Attribute:
if (node.attributes == null)
{
node.attributes = new Dictionary<string, string>();
}
node.attributes.Add(reader.Name, reader.Value);
break;
}
if (endElement)
break;
}
return done;
}
static void GetAttrubutes(XmlReader reader, Node node)
{
for (int i = 0; i < reader.AttributeCount; i++)
{
if (i == 0) node.attributes = new Dictionary<string, string>();
reader.MoveToNextAttribute();
node.attributes.Add(reader.Name, reader.Value);
}
}
}
public class Node
{
public string name = string.Empty;
public string text = string.Empty;
public Dictionary<string, string> attributes = null;
public List<Node> children = null;
}
}
​

how to read xml files in C# ?

I have a code which reads an xml file. There are some parts I dont understand.
From my understanding , the code will create an xml file with 2 elements,
"Product" and "OtherDetails" . How come we only have to use writer.WriteEndElement();
once when we used writer.WriteStartElement twice ? , shouldn't we close each
writer.WriteStartElement statement with a writer.WriteEndElement() statement ?
using System.Xml;
public class Program
{
public static void Main()
{
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
XmlWriter writer = XmlWriter.Create("Products.xml", settings);
writer.WriteStartDocument();
writer.WriteComment("This file is generated by the program.");
writer.WriteStartElement("Product"); // first s
writer.WriteAttributeString("ID", "001");
writer.WriteAttributeString("Name", "Soap");
writer.WriteElementString("Price", "10.00")
// Second Element
writer.WriteStartElement("OtherDetails");
writer.WriteElementString("BrandName", "X Soap");
writer.WriteElementString("Manufacturer", "X Company");
writer.WriteEndElement();
writer.WriteEndDocument();
writer.Flush();
writer.Close();
}
}
using System;
using System.Xml;
public class Program
{
public static void Main()
{
XmlReader reader = XmlReader.Create("Products.xml");
while (reader.Read())
{
if (reader.NodeType == XmlNodeType.Element
&& reader.Name == "Product")
{
Console.WriteLine("ID = " + reader.GetAttribute(0));
Console.WriteLine("Name = " + reader.GetAttribute(1));
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.Name == "Price")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
Console.WriteLine("Price = {0:C}", Double.Parse(reader.Value));
}
}
reader.Read();
} //end if
if (reader.Name == "OtherDetails")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.Name == "BrandName")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
Console.WriteLine("Brand Name = " + reader.Value);
}
}
reader.Read();
} //end if
if (reader.Name == "Manufacturer")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
Console.WriteLine("Manufacturer = " + reader.Value);
}
}
} //end if
}
} //end if
} //end while
} //end if
} //end while
}
}
I don't get this part:
if (reader.Name == "OtherDetails")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.Name == "BrandName")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
Console.WriteLine("Brand Name = " + reader.Value);
}
}
notice how the condition while (reader.NodeType != XmlNodeType.EndElement) has been used twice ?
why is that we don't have to specify
if (reader.NodeType == XmlNodeType.Element for OtherDetails) as we did for Product,
like this
if (reader.NodeType == XmlNodeType.Element
&& reader.Name == "OtherDetails")
{}
To answer your first question:
As the MSDN documentation for XmlWriter.WriteEndDocument() says:
Closes any open elements or attributes and puts the writer back in the Start state.
So it will automatically close any open elements for you. In fact, you can remove the call to WriteEndElement() altogether and it will still work ok.
And as people are saying in the comments above, you should perhaps consider using Linq-to-XML.
It can make things much easier. For example, to create the XML structure from your program using Linq-to-XML you can do this:
var doc = new XDocument(
new XElement("Product",
new XAttribute("ID", "001"), new XAttribute("Name", "Soap"),
new XElement("Price", 10.01),
new XElement("OtherDetails",
new XElement("BrandName", "X Soap"),
new XElement("Manufacturer", "X Company"))));
File.WriteAllText("Products.xml", doc.ToString());
If you were reading data from the XML, you can use var doc = XDocument.Load("Filename.xml") to load the XML from a file, and then getting the data out is as simple as:
double price = double.Parse(doc.Descendants("Price").Single().Value);
string brandName = doc.Descendants("BrandName").Single().Value;
Or alternatively (casting):
double price = (double) doc.Descendants("Price").Single();
string brandName = (string) doc.Descendants("BrandName").Single();
(In case you're wondering how on earth we can cast an object of type XElement like that: It's because a load of explict conversion operators are defined for XElement.)
If you need anything strait forward (no reading or research), here is what I did:
I recently wrote a custom XML parsing method for my MenuStrip for WinForms (it had hundreds of items and XML was my best bet).
// load the document
// I loaded mine from my C# resource file called TempResources
XDocument doc = XDocument.Load(new MemoryStream(Encoding.UTF8.GetBytes(TempResources.Menu)));
// get the root element
// (var is an auto token, it becomes what ever you assign it)
var elements = doc.Root.Elements();
// iterate through the child elements
foreach (XElement node in elements)
{
// if you know the name of the attribute, you can call it
// mine was 'name'
// (if you don't know, you can call node.Attributes() - this has the name and value)
Console.WriteLine("Loading list: {0}", node.Attribute("name").Value);
// in my case, every child had additional children, and them the same
// *.Cast<XElement>() would give me the array in a datatype I can work with
// menu_recurse(...) is just a resursive helper method of mine
menu_recurse(node.Elements().Cast<XElement>().ToArray()));
}
(My answer can also be found here: Reading an XML File With Linq - though it unfortunately is not Linq)
Suppose if you want to read an xml file we need to use a dataset,because xml file internally converts into datatables using a dataset.Use the following line of code to access the file and to bind the dataset with the xml data.
DataSet ds=new DataSet();
ds.ReadXml(HttpContext.Current.Server.MapPath("~/Labels.xml");
DataSet comprises of many datatables,count of those datatables depends on number of Parent Child Tags in an xml file

XML take the position of an element and at next usage go directly there

So i have a huge XML file ( wikipedia dump xml ) .
My school project requirement says that i should be able to do a really fast search on this xml file ( so no, not import it into an sql database )
so of course i want to create an indexer, that will display into a separate file ( probably xml ) something like this : [content to search]:[byte offset to the start of the xml node that contains the content]
My question is how can i take the position of the element, and how can I jump to that position in the xml in case it is required for a search ?
The project is in C#. Thank you in advance.
Later Edit : I am trying to work with XmlReader, but I am open for any other suggestions.
For the moment this is how I read my XML for a non-indexed search
XmlReader reader = XmlReader.Create(FileName);
while (reader.Read())
{
switch (reader.Name)
{
case "page":
Boolean found = false;
String title = "";
String element = "<details>";
readMore(reader, "title");
title = reader.Value;
if (title.Contains(word))
{
found = true;
}
readMore(reader, "text");
String content = reader.Value;
if (content.Contains(word) & !found)
{
found = true;
}
if (found)
{
element += "<summary>" + title + " (click)</summary>";
element += content;
element += "</details>";
result.Add(element);
}
break;
}
}
reader.Close();
if (result.Count == 0)
{
result.Add("No results were found");
}
return result;
…
static void readMore(XmlReader reader, String name)
{
while (reader.Name != name)
{
reader.Read();
}
reader.Read();
}
The correct solution would be to use an intermediary binary format; but if you can't do that, and assuming that you use DOM, I don't see any solution but to store the position of the node in the DOM tree as a list of indexes.
Example in JavaScript (should be fairly the same in C#):
function getPosition(node) {
var pos = [], i = 0;
while (node != document.documentElement) {
if (node.previousSibling) {
++i;
node = node.previousSibling;
} else {
pos.unshift(i);
i = 0;
node = node.parentNode;
}
}
return pos;
}
function getNode(pos) {
var node = document.documentElement;
for (var i = 0; i < pos.length; ++i) {
node = node.childNodes[pos[i]];
}
return node;
}

Categories

Resources