Read multiple XML files into a XML class structure

Read multiple XML files into a XML class structure - c#

I would like to Read and Deserialize more than one XML file into my XML class structure given a list of strings consisting of file names.
Obviously when reading ONE xml file, you can go like this:
XmlRoot file = null;
XmlSerializer ser = new XmlSerializer(typeof(XmlRoot));
using (XmlReader read = XmlReader.Create(FileName))
{
file = (XmlRoot)ser.Deserialize(read);
{
Which will deserialize the XML file into the class structure?
It is not possible to have a list with file names and use a foreach loop to iterate over them, reading and deserializing one by one as it would theoretically result into multiple root elements being read, deserialized and replicated in the class structure.
So in general I would like to deserialize each file and append the required master elements to a root object.
Does anyone know how to accomplish this? It would be of great help.
Thanks in advance!
PS: Excuse me for my English, as I am not a native speaker. If you need further information, just tell me!

I managed to solve the problem for myself.
First i created a XDocument for the first file i read, afterwards i iterate through the other documents creating a new XDocument for every xml file and try to get the elements after the root (Language in my case) and add it to the root of the XDocument created outside the loop.
XDocument lDoc = new XDocument();
int counter = 0;
foreach (var fileName in multipleFileNames)
{
try
{
counter++;
if (lCounter <= 1)
{
doc = XDocument.Load(fileName);
}
else
{
XDocument doc2 = XDocument.Load(fileName);
IEnumerable<XElement> elements = doc2.Element("Language")
.Elements();
doc.Root.Add(elements);
}
}
return Deserialize(lDoc);
Afterwards i call the Deserialize method, deserializing my created XDocument like this:
public static XmlLanguage Deserialize(XDocument doc)
{
XmlSerializer ser = new XmlSerializer(typeof(XmlLanguage));
return (XmlLanguage)ser.Deserialize(doc.CreateReader());
}

Related

Reading a Text File that has XML data C# [duplicate]

How do I read and parse an XML file in C#?

XmlDocument to read an XML from string or from file.
using System.Xml;
XmlDocument doc = new XmlDocument();
doc.Load("c:\\temp.xml");
or
doc.LoadXml("<xml>something</xml>");
then find a node below it ie like this
XmlNode node = doc.DocumentElement.SelectSingleNode("/book/title");
or
foreach(XmlNode node in doc.DocumentElement.ChildNodes){
string text = node.InnerText; //or loop through its children as well
}
then read the text inside that node like this
string text = node.InnerText;
or read an attribute
string attr = node.Attributes["theattributename"]?.InnerText
Always check for null on Attributes["something"] since it will be null if the attribute does not exist.

LINQ to XML Example:
// Loading from a file, you can also load from a stream
var xml = XDocument.Load(#"C:\contacts.xml");
// Query the data and write out a subset of contacts
var query = from c in xml.Root.Descendants("contact")
where (int)c.Attribute("id") < 4
select c.Element("firstName").Value + " " +
c.Element("lastName").Value;
foreach (string name in query)
{
Console.WriteLine("Contact's Full Name: {0}", name);
}
Reference: LINQ to XML at MSDN

Here's an application I wrote for reading xml sitemaps:
using System;
using System.Collections.Generic;
using System.Windows.Forms;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Data;
using System.Xml;
namespace SiteMapReader
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Please Enter the Location of the file");
// get the location we want to get the sitemaps from
string dirLoc = Console.ReadLine();
// get all the sitemaps
string[] sitemaps = Directory.GetFiles(dirLoc);
StreamWriter sw = new StreamWriter(Application.StartupPath + #"\locs.txt", true);
// loop through each file
foreach (string sitemap in sitemaps)
{
try
{
// new xdoc instance
XmlDocument xDoc = new XmlDocument();
//load up the xml from the location
xDoc.Load(sitemap);
// cycle through each child noed
foreach (XmlNode node in xDoc.DocumentElement.ChildNodes)
{
// first node is the url ... have to go to nexted loc node
foreach (XmlNode locNode in node)
{
// thereare a couple child nodes here so only take data from node named loc
if (locNode.Name == "loc")
{
// get the content of the loc node
string loc = locNode.InnerText;
// write it to the console so you can see its working
Console.WriteLine(loc + Environment.NewLine);
// write it to the file
sw.Write(loc + Environment.NewLine);
}
}
}
}
catch { }
}
Console.WriteLine("All Done :-)");
Console.ReadLine();
}
static void readSitemap()
{
}
}
}
Code on Paste Bin
http://pastebin.com/yK7cSNeY

There are lots of way, some:
XmlSerializer. use a class with the target schema
you want to read - use XmlSerializer
to get the data in an Xml loaded into
an instance of the class.
Linq 2 xml
XmlTextReader.
XmlDocument
XPathDocument (read-only access)

You could use a DataSet to read XML strings.
var xmlString = File.ReadAllText(FILE_PATH);
var stringReader = new StringReader(xmlString);
var dsSet = new DataSet();
dsSet.ReadXml(stringReader);
Posting this for the sake of information.

You can either:
Use XmlSerializer class
Use XmlDocument class
Examples are on the msdn pages provided

Linq to XML.
Also, VB.NET has much better xml parsing support via the compiler than C#. If you have the option and the desire, check it out.

Check out XmlTextReader class for instance.

There are different ways, depending on where you want to get.
XmlDocument is lighter than XDocument, but if you wish to verify minimalistically that a string contains XML, then regular expression is possibly the fastest and lightest choice you can make. For example, I have implemented Smoke Tests with SpecFlow for my API and I wish to test if one of the results in any valid XML - then I would use a regular expression. But if I need to extract values from this XML, then I would parse it with XDocument to do it faster and with less code. Or I would use XmlDocument if I have to work with a big XML (and sometimes I work with XML's that are around 1M lines, even more); then I could even read it line by line. Why? Try opening more than 800MB in private bytes in Visual Studio; even on production you should not have objects bigger than 2GB. You can with a twerk, but you should not. If you would have to parse a document, which contains A LOT of lines, then this documents would probably be CSV.
I have written this comment, because I see a lof of examples with XDocument. XDocument is not good for big documents, or when you only want to verify if there the content is XML valid. If you wish to check if the XML itself makes sense, then you need Schema.
I also downvoted the suggested answer, because I believe it needs the above information inside itself. Imagine I need to verify if 200M of XML, 10 times an hour, is valid XML. XDocument will waste a lof of resources.
prasanna venkatesh also states you could try filling the string to a dataset, it will indicate valid XML as well.

public void ReadXmlFile()
{
string path = HttpContext.Current.Server.MapPath("~/App_Data"); // Finds the location of App_Data on server.
XmlTextReader reader = new XmlTextReader(System.IO.Path.Combine(path, "XMLFile7.xml")); //Combines the location of App_Data and the file name
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
break;
case XmlNodeType.Text:
columnNames.Add(reader.Value);
break;
case XmlNodeType.EndElement:
break;
}
}
}
You can avoid the first statement and just specify the path name in constructor of XmlTextReader.

If you want to retrive a particular value from an XML file
XmlDocument _LocalInfo_Xml = new XmlDocument();
_LocalInfo_Xml.Load(fileName);
XmlElement _XmlElement;
_XmlElement = _LocalInfo_Xml.GetElementsByTagName("UserId")[0] as XmlElement;
string Value = _XmlElement.InnerText;

Here is another approach using Cinchoo ETL - an open source library to parse xml file with few lines of code.
using (var r = ChoXmlReader<Item>.LoadText(xml)
.WithXPath("//item")
)
{
foreach (var rec in r)
rec.Print();
}
public class Item
{
public string Name { get; set; }
public string ProtectionLevel { get; set; }
public string Description { get; set; }
}
Sample fiddle: https://dotnetfiddle.net/otYq5j
Disclaimer: I'm author of this library.

Using an XmlTextAttribute on an array of mixed types removes whitespace strings

I am writing a set of objects that must serialize to and from Xml, following a strict specification that I cannot change. One element in this specification can contain a mix of strings and elements in-line.
A simple example of this Xml output would be this:
<root>Leading text <tag>tag1</tag> <tag>tag2</tag></root>
Note the whitespace characters between the closing of the first tag, and the start of the second tag. Here are the objects that represents this structure:
[XmlRoot("root")]
public class Root
{
[XmlText(typeof(string))]
[XmlElement("tag", typeof(Tag))]
public List<object> Elements { get; set; }
//this is simply for the sake of example.
//gives us four objects in the elements array
public static Root Create()
{
Root root = new Root();
root.Elements.Add("Leading text ");
root.Elements.Add(new Tag() { Text = "tag1" });
root.Elements.Add(" ");
root.Elements.Add(new Tag() { Text = "tag2" });
return root;
}
public Root()
{
Elements = new List<object>();
}
}
public class Tag
{
[XmlText]
public string Text {get;set;}
}
Calling Root.Create(), and saving to a file using this method looks perfect:
public XDocument SerializeToXml(Root obj)
{
XmlSerializer serializer = new XmlSerializer(typeof(Root));
XDocument doc = new XDocument();
using (var writer = doc.CreateWriter())
{
serializer.Serialize(writer, obj);
}
return doc;
}
Serialization looks exactly like the xml structure at the beginning of this post.
Now when I want to serialize an xml file back into a Root object, I call this:
public static Root FromFile(string file)
{
XmlSerializer serializer = new XmlSerializer(typeof(Root));
XmlReaderSettings settings = new XmlReaderSettings();
XmlReader reader = XmlTextReader.Create(file, settings);
//whitespace gone here
Root root = serializer.Deserialize(reader) as Root;
return root;
}
The problem is here. The whitespace string is eliminated. When I call Root.Create(), there are four objects in the Elements array. One of them is a space. This serializes just fine, but when deserializing, there are only 3 objects in Elements. The whitespace string gets eliminated.
Any ideas on what I'm doing wrong? I've tried using xml:space="preserve", as well as a host of XmlReader, XmlTextReader, etc. variations. Note that when I use a StringBuilder to read the XmlTextReader, the xml contains the spaces as I'd expect. Only when calling Deserialize(stream) do I lose the spaces.
Here's a link to an entire working example. It's LinqPad friendly, just copy/paste: http://pastebin.com/8MkUQviB The example opens two files, one a perfect serialized xml file, the second being a deserialized then reserialized version of the first file. Note you'll have to reference System.Xml.Serialization.
Thanks for reading this novel. I hope someone has some ideas. Thank you!

It looks like a bug. Workaround seems to be replace all whitespaces and crlf in XML text nodes by
entities. Semantic equal entities (
) does not work.
<root>Leading text <tag>tag1</tag> <tag>tag2</tag></root>
is working for me.

How do I retrieve all data from an XML file?

I used this code for retrieving specific value from the XML file.Now i want to retrieve all the data which are present in the XML file .Can anybody help me to find out the solution?
StorageFile xmlFile = await Windows.ApplicationModel.Package.Current.InstalledLocation.GetFileAsync("Content1.xml");
XmlDocument xmlDoc;
xmlDoc = await XmlDocument.LoadFromFileAsync(xmlFile);
System.Xml.Linq.XDocument duc = System.Xml.Linq.XDocument.Parse(xmlDoc.GetXml());
var query=
from Date in duc.Root.Elements("Serial")
where Date.Attribute("No").Value=="1"
from Current in Date.Elements("Current")
select new {
NarratedBy=Current.Attribute("NarratedBy").Value,
value=Current.Attribute("Date").Value
};
foreach(var Date in query) {
System.Diagnostics.Debug.WriteLine("{0}\t{1}", Date.NarratedBy, Date.value);
}

You already have whole XML document loaded into duc variable.
That line is responsible for that:
System.Xml.Linq.XDocument duc = System.Xml.Linq.XDocument.Parse(xmlDoc.GetXml());

then you can just retrieve your XDocument details for example into a string variable with an XDocument extension ToString()

You have all data already:
xmlDoc = await XmlDocument.LoadFromFileAsync(xmlFile); // data loadded
System.Xml.Linq.XDocument duc = System.Xml.Linq.XDocument.Parse(xmlDoc.GetXml()); // data parsed
===================
Here is a sample code how you may do it. It is fully functional (using local string xml instead of your file) so you may run it. I added only three attributes but you may add as many as you want.
class Program {
static void Main(string[] args) {
// this is a sample string. Use your file instead
string s = "<catalog>" +
"<book id=\"bk101\" author=\"Gambardella, Matthew\" title=\"XML Developer's Guide\" genre=\"Computer\"/>" +
"<book id=\"bk102\" author=\"Ralls, Kim\" title=\"Midnight Rain\" genre=\"Fantasy\"/>" +
"</catalog>";
XmlDocument xdoc = new XmlDocument();
xdoc.LoadXml(s); // here we load data
// here we get attributes. I have three, you will add three more. Also you may want to use string array instead of variables
foreach (XmlNode task in xdoc.DocumentElement.ChildNodes)
{
string author = task.Attributes["author"].InnerText;
string title = task.Attributes["title"].InnerText;
string genre = task.Attributes["genre"].InnerText;
}
}
}

C# Xml Parsing from StringBuilder

I have a StringBuilder with the contents of an XML file. Inside the XML file is a root tag called <root> and contains multiple <node> tags.
I'd like to parse through the XML to read values of tags within in s, but not sure how to do it.
Will I have to use some C# XML data type for this?
Thanks in advance

StringBuilder sb = new StringBuilder (xml);
TextReader textReader = new StringReader (sb.ToString ());
XDocument xmlDocument = XDocument.Load (textReader);
var nodeValueList = from node in xmlDocument.Descendants ("node")
select node.Value;

You should use classes available in either System.Xml or System.Xml.Linq to parse XML.
XDocument is part of the LINQ extensions for XML and is particularly easy to use if you need to parse through an arbitrary structure. I would suggest using it rather than XmlDocument (unless you have legacy code or are not on .NET 3.5).
Creating an XDocument from a StringBuilder is straightforward:
var doc = XDocument.Parse( stringBuilder.ToString() );
From here, you can use FirstNode, Descendents(), and the many other properties and methods available to walk and examine the XML structure. And since XDocument is designed to work well with LINQ, you can also write queries like:
var someData = from node in doc.Descendants ("yourNodeType")
select node.Value; // etc..

If you are just looking the specifically named nodes then you don't need to load the document into memory, you can process it yourself with an XmlReader.
using(var sr = new StringReader(stringBuilder.ToString)) {
using(var xr = XmlReader.Create(sr)) {
while(xr.Read()) {
if(xr.IsStartElement() && xr.LocalName == "node")
xr.ReadElementString(); //Do something here
}
}
}

use XDocument.Parse(...)

There are several objects at your disposal for working with XML. Look at the System.Xml namespace for objects such as XmlDocument as well as the XmlReader and XmlWriter families of objects. If using C# 3.0+, look at the System.Xml.Linq namespace and the XDocument class.

If you're looking to read all the values in the XML file , you could look into deserializing the XML into a C# data Object.
Deserializing XML into class obj in C#

Yes, I suggest you use an XmlDocument object to parse the content of your string.
Here is an example who print all inner text contained in your tags:
var doc=new XmlDocument();
doc.LoadXml(stringBuilder.TosTring());
XmlNodeList elemList = doc.GetElementsByTagName("node");
for (int i=0; i < elemList.Count; i++)
{
XmlNode node=elemList[i];
Console.WriteLine(node.InnerText);
}
using Node object members, you can also easily extract all you attributes .

How do I read and parse an XML file in C#?

How do I read and parse an XML file in C#?

XmlDocument to read an XML from string or from file.
using System.Xml;
XmlDocument doc = new XmlDocument();
doc.Load("c:\\temp.xml");
or
doc.LoadXml("<xml>something</xml>");
then find a node below it ie like this
XmlNode node = doc.DocumentElement.SelectSingleNode("/book/title");
or
foreach(XmlNode node in doc.DocumentElement.ChildNodes){
string text = node.InnerText; //or loop through its children as well
}
then read the text inside that node like this
string text = node.InnerText;
or read an attribute
string attr = node.Attributes["theattributename"]?.InnerText
Always check for null on Attributes["something"] since it will be null if the attribute does not exist.

LINQ to XML Example:
// Loading from a file, you can also load from a stream
var xml = XDocument.Load(#"C:\contacts.xml");
// Query the data and write out a subset of contacts
var query = from c in xml.Root.Descendants("contact")
where (int)c.Attribute("id") < 4
select c.Element("firstName").Value + " " +
c.Element("lastName").Value;
foreach (string name in query)
{
Console.WriteLine("Contact's Full Name: {0}", name);
}
Reference: LINQ to XML at MSDN

Here's an application I wrote for reading xml sitemaps:
using System;
using System.Collections.Generic;
using System.Windows.Forms;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Data;
using System.Xml;
namespace SiteMapReader
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Please Enter the Location of the file");
// get the location we want to get the sitemaps from
string dirLoc = Console.ReadLine();
// get all the sitemaps
string[] sitemaps = Directory.GetFiles(dirLoc);
StreamWriter sw = new StreamWriter(Application.StartupPath + #"\locs.txt", true);
// loop through each file
foreach (string sitemap in sitemaps)
{
try
{
// new xdoc instance
XmlDocument xDoc = new XmlDocument();
//load up the xml from the location
xDoc.Load(sitemap);
// cycle through each child noed
foreach (XmlNode node in xDoc.DocumentElement.ChildNodes)
{
// first node is the url ... have to go to nexted loc node
foreach (XmlNode locNode in node)
{
// thereare a couple child nodes here so only take data from node named loc
if (locNode.Name == "loc")
{
// get the content of the loc node
string loc = locNode.InnerText;
// write it to the console so you can see its working
Console.WriteLine(loc + Environment.NewLine);
// write it to the file
sw.Write(loc + Environment.NewLine);
}
}
}
}
catch { }
}
Console.WriteLine("All Done :-)");
Console.ReadLine();
}
static void readSitemap()
{
}
}
}
Code on Paste Bin
http://pastebin.com/yK7cSNeY

There are lots of way, some:
XmlSerializer. use a class with the target schema
you want to read - use XmlSerializer
to get the data in an Xml loaded into
an instance of the class.
Linq 2 xml
XmlTextReader.
XmlDocument
XPathDocument (read-only access)

You could use a DataSet to read XML strings.
var xmlString = File.ReadAllText(FILE_PATH);
var stringReader = new StringReader(xmlString);
var dsSet = new DataSet();
dsSet.ReadXml(stringReader);
Posting this for the sake of information.

You can either:
Use XmlSerializer class
Use XmlDocument class
Examples are on the msdn pages provided

Linq to XML.
Also, VB.NET has much better xml parsing support via the compiler than C#. If you have the option and the desire, check it out.

Check out XmlTextReader class for instance.

public void ReadXmlFile()
{
string path = HttpContext.Current.Server.MapPath("~/App_Data"); // Finds the location of App_Data on server.
XmlTextReader reader = new XmlTextReader(System.IO.Path.Combine(path, "XMLFile7.xml")); //Combines the location of App_Data and the file name
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
break;
case XmlNodeType.Text:
columnNames.Add(reader.Value);
break;
case XmlNodeType.EndElement:
break;
}
}
}
You can avoid the first statement and just specify the path name in constructor of XmlTextReader.

If you want to retrive a particular value from an XML file
XmlDocument _LocalInfo_Xml = new XmlDocument();
_LocalInfo_Xml.Load(fileName);
XmlElement _XmlElement;
_XmlElement = _LocalInfo_Xml.GetElementsByTagName("UserId")[0] as XmlElement;
string Value = _XmlElement.InnerText;

Here is another approach using Cinchoo ETL - an open source library to parse xml file with few lines of code.
using (var r = ChoXmlReader<Item>.LoadText(xml)
.WithXPath("//item")
)
{
foreach (var rec in r)
rec.Print();
}
public class Item
{
public string Name { get; set; }
public string ProtectionLevel { get; set; }
public string Description { get; set; }
}
Sample fiddle: https://dotnetfiddle.net/otYq5j
Disclaimer: I'm author of this library.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.