How to get XElement from XDocument by Line Number - c#

Is it possible in C# to get an XElement from an XDocument by giving
the line number?
I have some test XML like:
<Student>
<Name>Josphine</Name>
</Student>
<Student>
<Name>Hendrick</Name>
</Student>
I want to pass an integer such as 5 as a parameter.
5 would give me the element <Name>Hendrick</Name>.
Is this possible in any way? Or do I need to parse the whole
XDocument with a reader and check the line number on every iteration?

You can read your file into a string array:
string[] lines = File.ReadAllLines("path/to/file");
And then get your line with lines[4] (the array is zero-based, so index 4 is line 5). Note that this gives you the raw text of the line, not an XElement.
Or, better, look at XPath, since your XML document can change.
Take a look at these examples and tutorials: XPath Examples, Selecting Nodes.

There is another workaround: if your XML is well-formed and you want to get the job done using LINQ to XML only, the code below might help:
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
XDocument doc = XDocument.Parse(@"<Students>
<Student>
<Name>Josphine</Name>
</Student>
<Student>
<Name>Hendrick</Name>
</Student>
</Students>", LoadOptions.SetLineInfo);
IEnumerable<XElement> descendants = doc.Descendants();
foreach (XElement ele in descendants)
{
IXmlLineInfo info = ele; // XElement implements IXmlLineInfo explicitly, so go through the interface
string ln_num = info.HasLineInfo() ? info.LineNumber.ToString() : "";
string ln_pos = info.HasLineInfo() ? info.LinePosition.ToString() : "";
Console.WriteLine("{0} ({1}): at line no. {2}, position {3}", ele.Name, ele.Value, ln_num, ln_pos);
}
Console.ReadKey();
}
}
}
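To answer the original question more directly, the same line information can be used to look an element up by line number. The following is a minimal sketch, not a definitive implementation: the class name LineLookup and the method ElementAtLine are hypothetical, and it assumes the document was loaded with LoadOptions.SetLineInfo (without that option no line information is recorded and the method returns null).

```csharp
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class LineLookup
{
    // Hypothetical helper: returns the first element whose start tag
    // begins on the given 1-based line, or null if there is none.
    public static XElement ElementAtLine(XDocument doc, int lineNumber)
    {
        return doc.Descendants().FirstOrDefault(e =>
        {
            IXmlLineInfo info = e; // explicit interface implementation on XObject
            return info.HasLineInfo() && info.LineNumber == lineNumber;
        });
    }
}
```

With the <Students> sample above (which adds a root element on line 1), <Name>Hendrick</Name> sits on line 6 rather than line 5, so the call would be LineLookup.ElementAtLine(doc, 6). The lookup still walks the descendants, but the document only has to be parsed once.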

Related

Select single node

From the following xml:
<response>
<content>
<Result xmlns="http://www.test.com/nav/webservices/types">
<Name>Test</Name>
</Result>
</content>
<status>ok</status>
</response>
I am trying to get the value of the Name element the following way but that does not work:
private static void Main()
{
var response = new XmlDocument();
response.Load("Response.xml");
var namespaceManager = new XmlNamespaceManager(response.NameTable);
namespaceManager.AddNamespace("ns", "http://www.test.com/nav/webservices/types");
Console.WriteLine(response.SelectSingleNode("/response/content/Result/Name", namespaceManager).InnerXml);
}
How can I select the Name element?
Your code would have worked just fine if the XML had declared the namespace with an "ns:" prefix.
But in this case the namespace is declared without a prefix, which makes "http://www.test.com/nav/webservices/types" the default namespace for everything inside the Result element.
To reflect this, you need to modify the XPath to tell the XmlDocument that Result and the nodes under it are in that namespace. So your query will look like this:
Console.WriteLine(response.SelectSingleNode(@"/response/content/ns:Result/ns:Name", namespaceManager).InnerXml);
To get the text value of a node directly, XPath has a text() node test. Used in this query (with the same namespace prefixes) it would look like:
/response/content/ns:Result/ns:Name/text()
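Putting the namespace explanation together, here is a small self-contained sketch. The class name NamespaceDemo and the method ReadName are hypothetical; the "ns" prefix is an arbitrary choice that only has to match between AddNamespace and the XPath expression, not anything in the XML itself.

```csharp
using System.Xml;

static class NamespaceDemo
{
    // Hypothetical helper: returns the text of <Name> inside the
    // namespaced <Result> element, or null if it is not found.
    public static string ReadName(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Map a prefix of our choosing to the default namespace used by <Result>.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("ns", "http://www.test.com/nav/webservices/types");

        // response and content are in no namespace; Result and Name need the prefix.
        XmlNode text = doc.SelectSingleNode(
            "/response/content/ns:Result/ns:Name/text()", ns);
        return text == null ? null : text.Value;
    }
}
```

Checking for null before reading the value avoids a NullReferenceException when the path does not match, which is how an unprefixed query usually fails.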
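Try this:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml("<response><content><Result xmlns=\"http://www.test.com/nav/webservices/types\"><Name>Test</Name></Result></content><status>ok</status></response>");
XmlNamespaceManager nsManager = new XmlNamespaceManager(xmlDoc.NameTable);
nsManager.AddNamespace("ns", "http://www.test.com/nav/webservices/types");
string elementValue = String.Empty;
// Result is in a default namespace, so it has to be addressed through a prefix.
XmlNode xNode = xmlDoc.SelectSingleNode("//ns:Result", nsManager);
if (xNode != null)
{
    foreach (XmlNode node in xNode.ChildNodes)
    {
        elementValue = node.InnerText; // "Test" for the <Name> child
    }
}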

xml parsing error (xpath, HTMLagilitypack)

I am trying to parse an XML file. All nodes have opening and closing tags except one node, which on some lines appears only as a self-closing tag: <persons/>
Most of the time it appears like this: <persons> ... </persons>
I cannot get values from the XML when this node is self-closing like this.
Here is my code:
foreach (HtmlNode man in bm.SelectNodes(".//persons"))
{
//store values
}
How can I overcome this issue? Even if some nodes at the start look like this:
<persons> </persons>
if there is a tag like this in the middle of the file:
<persons/>
I cannot get the values of the remaining <persons> </persons> nodes on the lines that follow.
Why are you using HtmlNode? XmlNode would be just fine.
Otherwise, show more code.
Did you step through that line? Did you encounter any error?
Try this:
internal string ParseXML()
{
string ppl = "";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlString); // xmlString is assumed to hold the XML text
foreach (XmlElement node in doc.SelectNodes("//persons")) // also matches self-closing <persons/>
{
string text = node.InnerText; //or loop through its children as well
ppl += text;
}
return ppl;
}
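As a rough illustration of the XmlDocument suggestion, the sketch below reads every <persons> element and skips the self-closing ones. The class name PersonsReader, the method ReadPersons and the <root> wrapper in the usage note are assumptions made for the example; the real document structure is not shown in the question.

```csharp
using System.Collections.Generic;
using System.Xml;

static class PersonsReader
{
    // Hypothetical helper: collects the text of every non-empty <persons> element.
    public static List<string> ReadPersons(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var values = new List<string>();
        foreach (XmlElement node in doc.SelectNodes("//persons"))
        {
            // IsEmpty is true for the short form <persons/>.
            if (node.IsEmpty)
                continue;
            values.Add(node.InnerText.Trim());
        }
        return values;
    }
}
```

For an input such as <root><persons>A</persons><persons/><persons>B</persons></root>, an XML parser treats <persons/> as a complete, empty element, so the elements after it are still visited.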

Parsing (and keeping) XML structure into SQL Server

I'm looking to parse a relatively complex XML file through C# and store a selection of the data into a SQL Server '08 database. This is what I'm looking to extract from the XML file:
<educationSystem>
<school>
<name>Primary School</name>
<students>
<student id="123456789">
<name>Steve Jobs</name>
<otherElements>More Data</otherElements>
</student>
<student id="987654">
<name>Jony Ive</name>
<otherElements>More Data</otherElements>
</student>
</students>
</school>
<school>
<name>High School</name>
<students>
<student id="123456">
<name>Bill Gates</name>
<otherElements>More Data</otherElements>
</student>
<student id="987654">
<name>Steve Ballmer</name>
<otherElements>More Data</otherElements>
</student>
</students>
</school>
</educationSystem>
[Before you ask, no this isn't a school assignment - I'm using school/students as an example and because the original is a lot more sensitive.]
I'm able to (using XDocument/XElement) parse the XML file and get a list of all school names, student names and student IDs, but when this gets added to the database, I end up with the Bill Gates student entry being under a second school. It's all just line-by-line, so the school/student relationship is lost.
I'm looking for a way to achieve, say, this:
foreach school
    put its name into an XElement
    foreach student
        grab the name and id and put them into XElements
grab the next school and repeat
I believe Linq would be the best way to achieve this, but I'm having trouble in how to get started with the process. Would anyone be able to point me in the right direction?
Edit: Here's the code I'm currently using to save data to the database. It processes a list at a time (hence things aren't related as they should be). I'll also be tidying up the SQL as well.
private void saveToDatabase(List<XElement> currentSet, String dataName)
{
    SqlConnection connection = null;
    try
    {
        string connectionString = ConfigurationManager.ConnectionStrings["connString"].ConnectionString + "; Asynchronous Processing=true";
        connection = new SqlConnection(connectionString);
        connection.Open();
        foreach (XElement node in currentSet)
        {
            // dataName is a column name and cannot be a parameter; the value is passed as one.
            SqlCommand sqlCmd = new SqlCommand("INSERT INTO dbo.DatabaseName (" + dataName + ") VALUES (@value)", connection);
            sqlCmd.Parameters.AddWithValue("@value", node.Value);
            sqlCmd.ExecuteNonQuery();
        }
    }
    finally
    {
        // Close the connection even if an insert fails.
        if (connection != null) connection.Close();
    }
}
This LINQ query will generate a collection of objects with two properties:
Name of the school
List of students (again a collection)
var result = XElement.Load("data.xml")
    .Descendants("school")
    .Select(x => new {
        // FirstNode is expected to be the <name> element
        name = XElement.Parse(x.FirstNode.ToString()).Value,
        students = x.Descendants("student")
            .Select(stud => new {
                id = stud.Attribute("id").Value, // the attribute's string value, not the XAttribute
                name = XElement.Parse(stud.FirstNode.ToString()).Value })
            .ToList() });
Note: the query assumes <name> is the first node under the <school> and <student> tags.
Then you can use the foreach that you intended and it will work like a charm
foreach (var school in result)
{
var schoolName = school.name;
foreach (var student in school.students)
{
//Access student.id and student.name here
}
}
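If the position of <name> cannot be relied on, a variation is to look the element up by name instead of using FirstNode. This is only a sketch under stated assumptions: the School, Student and SchoolReader types are hypothetical, and the XML is assumed to have no namespace, as in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class Student
{
    public string Id;
    public string Name;
}

class School
{
    public string Name;
    public List<Student> Students;
}

static class SchoolReader
{
    // Hypothetical helper: keeps each student attached to its own school.
    public static List<School> Read(XElement root)
    {
        return root.Descendants("school")
            .Select(s => new School
            {
                // Element("name") finds the direct child by name, wherever it appears.
                Name = (string)s.Element("name"),
                Students = s.Descendants("student")
                    .Select(st => new Student
                    {
                        Id = (string)st.Attribute("id"),
                        Name = (string)st.Element("name")
                    })
                    .ToList()
            })
            .ToList();
    }
}
```

The explicit (string) casts return null rather than throwing when an element or attribute is missing, which can be easier to handle before writing rows to the database.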
For this kind of work with XML data, you could use XML serialization/deserialization.
This will allow you to deserialize your XML data into a collection of class objects, run your LINQ queries on those objects and then save to SQL.
Hope this helps.
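As a rough illustration of the serialization approach, the sketch below maps the sample document onto plain classes with XmlSerializer. All type and member names (EducationSystem, SchoolDto, StudentDto, EducationSerializer) are hypothetical, and the mapping assumes the element names from the question with no XML namespace; elements that are not mapped, such as otherElements, are simply skipped by the serializer.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("educationSystem")]
public class EducationSystem
{
    [XmlElement("school")]
    public List<SchoolDto> Schools = new List<SchoolDto>();
}

public class SchoolDto
{
    [XmlElement("name")]
    public string Name;

    // <students> wraps a sequence of <student> elements.
    [XmlArray("students"), XmlArrayItem("student")]
    public List<StudentDto> Students = new List<StudentDto>();
}

public class StudentDto
{
    [XmlAttribute("id")]
    public string Id;

    [XmlElement("name")]
    public string Name;
}

public static class EducationSerializer
{
    // Hypothetical helper: deserializes the XML text into the classes above.
    public static EducationSystem Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(EducationSystem));
        using (var reader = new StringReader(xml))
        {
            return (EducationSystem)serializer.Deserialize(reader);
        }
    }
}
```

After loading, each SchoolDto carries its own Students list, so a nested foreach over Schools and Students keeps the parent/child relationship when inserting rows.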
Update: The original code example did not mention a namespace. Namespaces need to be accounted for when searching for elements by XName, or else one needs to search using the XName.LocalName property. I have updated the example to show how to handle selecting elements in such a case.
namespace Stackover
{
using System;
using System.Xml.Linq;
class Program
{
private const string Xml = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<namespaceDocument xmlns=""http://www.namedspace/schemas"" xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"" xsi:schemaLocation=""http://www.namedspace/schemas.xsd"">
<educationSystem>
<school>
<name>Primary School</name>
<students>
<student id=""123456789"">
<name>Steve Jobs</name>
<otherElements>
<dataA>data</dataA>
</otherElements>
</student>
<student id=""987654"">
<name>Jony Ive</name>
<otherElements>
<dataB>data</dataB>
</otherElements>
</student>
</students>
</school>
<school>
<name>High School</name>
<students>
<student id=""123456"">
<name>Bill Gates</name>
<otherElements>
<dataC>data</dataC>
</otherElements>
</student>
<student id=""987654"">
<name>Steve Ballmer</name>
<otherElements>
<dataD>data</dataD>
</otherElements>
</student>
</students>
</school>
</educationSystem>
</namespaceDocument>";
static void Main(string[] args)
{
var root = XElement.Parse(Xml);
XNamespace ns = "http://www.namedspace/schemas";
foreach(var school in root.Descendants(ns + "school")) // or root.Descendants().Where(e => e.Name.LocalName.Equals("school"));
{
Console.WriteLine(school.Element(ns + "name").Value);
foreach (var students in school.Elements(ns+ "students"))
{
foreach (var student in students.Elements())
{
Console.WriteLine(student.Attribute("id"));
Console.WriteLine(student.Name); // Name = namespace + XName
Console.WriteLine(student.Name.LocalName); // no namespace
}
}
}
}
}
}

Split XML document apart creating multiple output files from repeating elements

I need to take an XML file and create multiple output xml files from the repeating nodes of the input file. The source file "AnimalBatch.xml" looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<Animals>
<Animal id="1001">
<Quantity>One</Quantity>
<Adjective>Red</Adjective>
<Name>Rooster</Name>
</Animal>
<Animal id="1002">
<Quantity>Two</Quantity>
<Adjective>Stubborn</Adjective>
<Name>Donkeys</Name>
</Animal>
<Animal id="1003">
<Quantity>Three</Quantity>
<Adjective>Blind</Adjective>
<Name>Mice</Name>
</Animal>
</Animals>
The program needs to split the repeating "Animal" and produce 3 files named: Animal_1001.xml, Animal_1002.xml, and Animal_1003.xml
Each output file should contain just its respective element (which will be the root). The id attribute from AnimalBatch.xml will supply the sequence number for the Animal_xxxx.xml filenames. The id attribute does not need to be in the output files.
Animal_1001.xml:
<?xml version="1.0" encoding="utf-8"?>
<Animal>
<Quantity>One</Quantity>
<Adjective>Red</Adjective>
<Name>Rooster</Name>
</Animal>
Animal_1002.xml:
<?xml version="1.0" encoding="utf-8"?>
<Animal>
<Quantity>Two</Quantity>
<Adjective>Stubborn</Adjective>
<Name>Donkeys</Name>
</Animal>
Animal_1003.xml:
<?xml version="1.0" encoding="utf-8"?>
<Animal>
<Quantity>Three</Quantity>
<Adjective>Blind</Adjective>
<Name>Mice</Name>
</Animal>
I want to do this with XmlDocument, since it needs to be able to run on .Net 2.0.
My program looks like this:
static void Main(string[] args)
{
string strFileName;
string strSeq;
XmlDocument doc = new XmlDocument();
doc.Load("D:\\Rick\\Computer\\XML\\AnimalBatch.xml");
XmlNodeList nl = doc.DocumentElement.SelectNodes("Animal");
foreach (XmlNode n in nl)
{
strSeq = n.Attributes["id"].Value;
XmlDocument outdoc = new XmlDocument();
XmlNode rootnode = outdoc.CreateNode("element", "Animal", "");
outdoc.AppendChild(rootnode); // Put the wrapper element into outdoc
outdoc.ImportNode(n, true); // place the node n into outdoc
outdoc.AppendChild(n); // This statement errors:
// "The node to be inserted is from a different document context."
strFileName = "Animal_" + strSeq + ".xml";
outdoc.Save(Console.Out);
Console.WriteLine();
}
Console.WriteLine("END OF PROGRAM: Press <ENTER>");
Console.ReadLine();
}
I think I have 2 problems.
A) After doing the ImportNode on node n into outdoc, I call outdoc.AppendChild(n) which complains: "The node to be inserted is from a different document context." I do not know if this is a scope issue referencing node n within the ForEach loop - or if I am somehow not using ImportNode() or AppendChild properly. 2nd argument on ImportNode() is set to true, because I want the child elements of Animal (3 fields arbitrarily named Quantity, Adjective, and Name) to end up in the destination file.
B) The second problem is getting the Animal element into outdoc. I'm getting '<Animal />' but I need '<Animal> </Animal>' so I can place node n inside it. I think my problem is how I am doing: outdoc.AppendChild(rootnode);
To show the xml, I'm doing: outdoc.Save(Console.Out); I do have the code to save() to an output file - which does work, as long as I can get outdoc assembled properly.
There is a similar question at: Split XML in Multiple XML files, but I don't understand the solution code yet. I think I'm pretty close on this approach, and will appreciate any help you can provide.
I'm going to be doing this same task using XmlReader, since I'm going to need to be able to handle large input files, and I understand that XmlDocument reads the whole thing in and can cause memory issues.
Here is a simple method that seems to do what you are looking for:
public void test_xml_split()
{
XmlDocument doc = new XmlDocument();
doc.Load("C:\\animals.xml");
XmlDocument newXmlDoc = null;
foreach (XmlNode animalNode in doc.SelectNodes("//Animals/Animal"))
{
newXmlDoc = new XmlDocument();
var targetNode = newXmlDoc.ImportNode(animalNode, true);
newXmlDoc.AppendChild(targetNode);
newXmlDoc.Save(Console.Out);
Console.WriteLine();
}
}
This approach seems to work without using the "var targetNode" statement. It creates an XmlNode object called targetNode in the foreach loop and assigns it the copy returned by ImportNode. I think the main problems in my original code were: A) I was getting the node list nl incorrectly, and B) I was appending node n itself, which still belonged to doc. ImportNode returns a new node owned by outdoc, and that returned node is what has to be appended.
The problem with the previously proposed solution was its use of the "var" keyword. My program has to assume .NET 2.0, and var came in with C# 3.0. I like Roger's solution in that it is concise. For my part, I wanted to do each thing as a separate statement.
static void SplitXMLDocument()
{
string strFileName;
string strSeq;
XmlDocument doc = new XmlDocument(); // The input file
doc.Load("D:\\Rick\\Computer\\XML\\AnimalBatch.xml");
XmlNodeList nl = doc.DocumentElement.SelectNodes("//Animals/Animal");
foreach (XmlNode n in nl)
{
strSeq = n.Attributes["id"].Value; // Animal nodes have an id attribute
XmlDocument outdoc = new XmlDocument(); // Create the outdoc xml document
XmlNode targetNode = outdoc.ImportNode(n, true); // Deep-copy the Animal node so the copy belongs to outdoc
targetNode.Attributes.RemoveAll(); // Remove the id attribute in <Animal id="1001">
outdoc.AppendChild(targetNode); // Append the imported copy as the root of outdoc
strFileName = "Animal_" + strSeq + ".xml";
outdoc.Save(Console.Out); Console.WriteLine();
outdoc.Save("D:\\Rick\\Computer\\XML\\" + strFileName);
Console.WriteLine();
}
}
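The question also mentions moving to XmlReader for large input files. A minimal sketch of that idea follows, written without var so that it stays within C# 2.0. The names AnimalSplitter and Split are hypothetical, and instead of saving files it returns a map from file name to XML text so the result is easy to inspect; a real version would presumably call outdoc.Save with each file name.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class AnimalSplitter
{
    // Hypothetical helper: streams through the input and builds one small
    // document per <Animal>, keyed by the output file name.
    public static Dictionary<string, string> Split(TextReader input)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent(); // position on the <Animals> root
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Animal")
                {
                    string id = reader.GetAttribute("id");
                    XmlDocument outdoc = new XmlDocument();
                    // ReadNode builds the node inside outdoc and leaves the reader
                    // on the node after </Animal>, so Read() is not called here.
                    XmlElement animal = (XmlElement)outdoc.ReadNode(reader);
                    animal.RemoveAllAttributes(); // drop the id attribute
                    outdoc.AppendChild(animal);
                    result["Animal_" + id + ".xml"] = outdoc.OuterXml;
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return result;
    }
}
```

Only one Animal element is held in memory at a time, which is the point of switching away from loading the whole batch into a single XmlDocument.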

How can I get all the nodes of a xml file?

Let's say I have this XML file:
<Names>
<Name>
<FirstName>John</FirstName>
<LastName>Smith</LastName>
</Name>
<Name>
<FirstName>James</FirstName>
<LastName>White</LastName>
</Name>
</Names>
And now I want to print all the names of the node:
Names
Name
FirstName
LastName
I managed to get them all into an XmlNodeList, but I don't know how SelectNodes works.
XmlNodeList xnList = xml.SelectNodes(/*What goes here*/);
I want to select all nodes, and then do a foreach over xnList (using the .Value property, I assume).
Is this the correct approach? How can I use SelectNodes to select all the nodes?
Ensuring you have LINQ and LINQ to XML in scope:
using System.Linq;
using System.Xml.Linq;
If you load them into an XDocument:
var doc = XDocument.Parse(xml); // if from string
var doc = XDocument.Load(xmlFile); // if from file
You can do something like:
doc.Descendants().Select(n => n.Name).Distinct()
This will give you a collection of all distinct XNames of elements in the document. If you don't care about XML namespaces, you can change that to:
doc.Descendants().Select(n => n.Name.LocalName).Distinct()
which will give you a collection of all distinct element names as strings.
There are several ways of doing it.
With XDocument and LINQ-XML
foreach(var name in doc.Root.DescendantNodes().OfType<XElement>().Select(x => x.Name).Distinct())
{
Console.WriteLine(name);
}
If you are using C# 3.0 or above, you can do this
var data = XElement.Load("c:/test.xml"); // change this to reflect location of your xml file
var allElementNames =
(from e in data.Descendants()
select e.Name).Distinct();
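Add
using System;
using System.Linq;
using System.Xml.Linq;
Then you can do
var element = XElement.Parse(xml); // xml is your XML string
// Console.Write on the query itself would print its type name, so join the results first.
Console.Write(string.Join(Environment.NewLine, element.Descendants("Name").Select(el => string.Format("{0} {1}", el.Element("FirstName").Value, el.Element("LastName").Value))));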
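For the XmlDocument route the question asks about, the XPath expression "//*" selects every element in the document. The sketch below is one way to use it; the class name ElementNameLister and the method ElementNames are hypothetical. Note that it reads the Name property, since Value is null for element nodes.

```csharp
using System.Collections.Generic;
using System.Xml;

static class ElementNameLister
{
    // Hypothetical helper: returns each distinct element name once,
    // in the order in which it first appears in the document.
    public static List<string> ElementNames(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var names = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("//*"))
        {
            if (!names.Contains(node.Name))
                names.Add(node.Name);
        }
        return names;
    }
}
```

For the sample document in the question this yields Names, Name, FirstName and LastName.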
