Extract part of a big XML - c#

I have to extract a part of an XML. My XML file can contain thousands of nodes and I would like to get only a part of it and have this part as an xml string.
My XML structure:
<ResponseMessage xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<ErrorResponse>
<Code>SUCCESS</Code>
<Message>Success</Message>
</ErrorResponse>
<OutputXml>
<Response>
<Product>
<child1>xxx</child1>
<child2>xxx</child2>
...
</Product>
<Product>
<child1>xxx</child1>
<child2>xxx</child2>
...
</Product>
...
</Response>
</OutputXml>
</ResponseMessage>
I'm getting the XML from a web service like this:
...
System.Net.WebResponse wResponse = req.GetResponse();
reqstream = wResponse.GetResponseStream();
System.IO.StreamReader reader = new System.IO.StreamReader(reqstream);
System.Xml.Linq.XDocument xmlResponse = System.Xml.Linq.XDocument.Parse(reader.ReadToEnd());
Then I tried to put the XML in a generic collection to process it using LINQ:
int startIndex = 0;
int nbItem = 25;
System.Text.StringBuilder outputXml = new System.Text.StringBuilder();
System.Collections.Generic.IEnumerable<System.Xml.Linq.XElement> partialList =
xmlResponse.Elements("Response").Skip(startIndex).Take(nbItem);
foreach (System.Xml.Linq.XElement x in partialList)
{
outputXml.Append(x.ToString());
}
My problem is that my list is always empty.

You can use LINQ to XML with the following code:
IEnumerable<XElement> elements = xmlResponse.Root.Element("OutputXml").Element("Response").Elements("Product");
foreach(XElement element in elements)
{
// Do Work Here
}
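This narrows the list down to just the Product elements by walking the full path from the root (OutputXml, then Response), so they are selected correctly without using an index. Relying on positions in XML is fragile, because the position of a node can change whenever the XML changes.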
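To get back to the original goal (a slice of the products as an XML string), the same path can be combined with Skip and Take. This is a minimal sketch, not the asker's actual code: the class name ProductPager and the method GetPage are hypothetical, and it assumes the element names shown in the question.

```csharp
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class ProductPager
{
    // Returns the requested slice of <Product> elements as one XML string.
    // Assumes the ResponseMessage/OutputXml/Response/Product layout from the question.
    public static string GetPage(XDocument xmlResponse, int startIndex, int nbItem)
    {
        var outputXml = new StringBuilder();
        var page = xmlResponse.Root
            .Element("OutputXml")
            .Element("Response")
            .Elements("Product")
            .Skip(startIndex)
            .Take(nbItem);

        foreach (XElement product in page)
        {
            // DisableFormatting keeps each element on a single line
            outputXml.Append(product.ToString(SaveOptions.DisableFormatting));
        }
        return outputXml.ToString();
    }
}
```

Calling ProductPager.GetPage(xmlResponse, 0, 25) would return the first 25 products, provided the path elements exist; a missing OutputXml or Response element would cause a NullReferenceException in this sketch.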

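You can use XPathEvaluate to read a subtree.
If your list is empty, a namespace mismatch is a common cause, because XDocument/XElement do not resolve namespaces automatically. In this document, however, xmlns:i="http://www.w3.org/2001/XMLSchema-instance" only declares the prefix i and does not put the unprefixed elements into a namespace. The more likely cause here is that xmlResponse.Elements("Response") looks only at the direct children of the document, and the only one is ResponseMessage.
See this topic on how to use namespaces with LINQ-to-XML.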
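For documents that do declare a default namespace, element names have to be qualified with an XNamespace before LINQ to XML will match them. The following is a minimal sketch under that assumption; the class NamespaceLookup and the method CountProducts are hypothetical names used only for illustration.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NamespaceLookup
{
    // Counts <Product> elements whether or not the root declares a default namespace.
    public static int CountProducts(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // XNamespace.None is returned when no default namespace is declared
        XNamespace ns = doc.Root.GetDefaultNamespace();

        // ns + "Product" builds the namespace-qualified XName
        return doc.Root.Descendants(ns + "Product").Count();
    }
}
```

Reading the namespace from the root avoids hard-coding the URI, but it only helps when the elements of interest live in the root's default namespace.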

Related

XML parsing a subtree in C#

I have many xml-files which I need to parse. The xml-files are loaded from elsewhere. I can examine these files to get the paths I need to extract my desired data. The paths aren't the same.
So I added the paths in an ini-file for each xml-file. This works fine for 5 of 6 files.
WebClient client = new WebClient();
data = client.DownloadData("ftp://some.site/my.xml");
MemoryStream stream = new MemoryStream(data);
XmlDocument xml_doc = new XmlDocument();
xml_doc.Load(stream);
var prod_ids = xml_doc.DocumentElement.SelectNodes("/Catalog/Products/Product/Product_Id/text()");
foreach (XmlNode node in prod_ids) {
[...]
}
In the last file I need to get two pieces of information from one subtree at once, because I have to combine them into one string, so reading all the nodes separately doesn't work. See this example XML:
<Catalog>
<Created><![CDATA[2020-11-16T00:22:11+01:00]]></Created>
<Products>
<Product>
<Product_Id><![CDATA[ABC]]></Product_Id>
<Color_Code><![CDATA[123]]></Color_Code>
<Size><![CDATA[]]></Size>
<Length>210</Length>
<Width>0</Width>
</Product>
<Product>
<Product_Id><![CDATA[ABC]]></Product_Id>
<Color_Code><![CDATA[456]]></Color_Code>
<Size><![CDATA[]]></Size>
<Length>44</Length>
<Width>55</Width>
</Product>
<Product>
<Product_Id><![CDATA[XYZ]]></Product_Id>
<Color_Code><![CDATA[123]]></Color_Code>
<Size><![CDATA[]]></Size>
<Length>150</Length>
<Width>11</Width>
</Product>
</Products>
</Catalog>
I'm looking for some code that parses each subtree (/Catalog/Products/Product) so that I can read the InnerText of Product_Id and Color_Code and combine them into one string.
Any ideas?
You're really close, but you're going too low in the DOM tree. Instead of looping through each Product/Product_Id, start your loop at each Product, then inside the loop get that product's Product_Id and Color_Code.
foreach (XmlElement ndProduct in xml_doc.SelectNodes("//Product")) {
    // Relative XPath: searches only inside the current Product
    XmlElement ndProductID = (XmlElement)ndProduct.SelectSingleNode("Product_Id");
    string strProductID = ndProductID.InnerText;
    XmlElement ndColorCode = (XmlElement)ndProduct.SelectSingleNode("Color_Code");
    string strColorCode = ndColorCode.InnerText;
    string strReturn = strProductID + " - " + strColorCode;
}
Use the more modern LINQ to XML.
var doc = XDocument.Load(stream);
var values = doc.Root
.Element("Products")
.Elements("Product")
.Select(p => p.Element("Product_Id").Value + p.Element("Color_Code").Value);
foreach (var value in values)
Console.WriteLine(value);
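A variation on the LINQ to XML approach, sketched here under the assumption that some Product elements might lack one of the two children: casting an XElement to string yields null for a missing element instead of throwing. The class ProductKeys, the method Combine, and the "-" separator are illustrative choices, not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ProductKeys
{
    // Builds one "Product_Id-Color_Code" string per <Product>.
    public static List<string> Combine(XDocument doc)
    {
        return doc.Root
            .Element("Products")
            .Elements("Product")
            .Select(p =>
            {
                // The explicit string conversion returns null when the element is absent
                string id = (string)p.Element("Product_Id");
                string color = (string)p.Element("Color_Code");
                return id + "-" + color;
            })
            .ToList();
    }
}
```

CDATA sections need no special handling here, since the element's text value already contains their content.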
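I can offer the following solution.
Select both kinds of node in a single query using the XPath union operator |.
The result comes back in document order, so we step through the collection two at a time and combine each pair. Note that this relies on every Product containing both Product_Id and Color_Code; if one is missing, the pairs drift out of alignment.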
var prod_ids = xml_doc.DocumentElement.SelectNodes(
"/Catalog/Products/Product/Product_Id | /Catalog/Products/Product/Color_Code");
for (int i = 0; i < prod_ids.Count; i += 2)
Console.WriteLine(prod_ids[i].InnerText + prod_ids[i + 1].InnerText);

c# Reading data from XML

I have problems understanding how to read data from XML.
XML looks like this:
<PosXML version="7.2.0">
<ReadCardResponse>
<ReturnCode>1</ReturnCode>
<Card>
<Pan>222300******5062</Pan>
<Expires>****</Expires>
<CardName>MASTERCARD</CardName>
<CardSource>2</CardSource>
</Card>
</ReadCardResponse>
</PosXML>
I have loaded XML from stream:
XDocument doc;
using (Stream responseStream = httpResponse.GetResponseStream())
{
doc= XDocument.Load(responseStream);
}
I tried this, but it's not working:
XElement returnCode = doc.XPathSelectElement("ReturnCode");
var returnCode = doc.XPathSelectElement(@"PosXML/ReadCardResponse/ReturnCode");
You need to use the full path to the element, starting from the root.
Try:
XElement returnCode = doc.Root.Element("ReadCardResponse").Element("ReturnCode");
You can also access elements by XPath, by walking the nodes, or with a LINQ query. Try exploring the available methods with your IDE's IntelliSense.
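As a sketch of the XPath route mentioned above: XPathSelectElement is an extension method from System.Xml.XPath, so that namespace has to be imported. The class ReadCardParser and the method GetReturnCode are hypothetical names; the path assumes the PosXML layout from the question.

```csharp
using System.Xml.Linq;
using System.Xml.XPath;

public static class ReadCardParser
{
    // Returns the numeric ReturnCode, or null when the element is not present.
    public static int? GetReturnCode(XDocument doc)
    {
        // Absolute path from the document root down to the target element
        XElement returnCode = doc.XPathSelectElement("/PosXML/ReadCardResponse/ReturnCode");

        // The explicit int? conversion yields null for a null element
        return (int?)returnCode;
    }
}
```

Returning a nullable value lets the caller distinguish a missing element from a real code without catching an exception.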

How to remove xmlns tag from xml element using c#

I want to remove xmlns="http://www.AB.com/BC/QualityDeviationCase/1_0" from every XML element except the root tag, using C#. I have written the code below to produce the XML, but I am unable to remove the xmlns attribute from the elements.
<?xml version="1.0" encoding="utf-16"?>
<QualityDeviationCaseType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<CaseID xmlns="http://www.AB.com/BC/QualityDeviationCase/1_0">2</CaseID>
<Description xmlns="http://www.AB.com/BC/QualityDeviationCase/1_0">Air filter is not present</Description>
<StartDate xmlns="http://www.AB.com/BC/QualityDeviationCase/1_0">0001-01-01T00:00:00</StartDate>
<LastUpdated xmlns="http://www.AB.com/BC/QualityDeviationCase/1_0">0001-01-01T00:00:00</LastUpdated>
</QualityDeviationCaseType>
After removing the xmlns="http://www.AB.com/BC/QualityDeviationCase/1_0" attribute, my result should look like this:
<?xml version="1.0" encoding="utf-16"?>
<QualityDeviationCaseType xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<CaseID>2</CaseID>
<Description>Air filter is not present</Description>
<StartDate>0001-01-01T00:00:00</StartDate>
<LastUpdated>0001-01-01T00:00:00</LastUpdated>
</QualityDeviationCaseType>
My C# code is below. I am getting the XML result from my XSD file.
connection.Open();
adapter = new SqlDataAdapter(sql, connection);
adapter.Fill(dt);
myTestTable = dt.Clone();
DataRow[] orderRows = dt.Select();
XmlDocument xmlDoc = new XmlDocument();
QualityDeviationCaseType oQualityDeviationCaseType = new QualityDeviationCaseType();
foreach (DataRow row in orderRows)
{
oQualityDeviationCaseType = new QualityDeviationCaseType();
oQualityDeviationCaseType.CaseID = row[0].ToString();
oQualityDeviationCaseType.Description = row[3].ToString();
}
using (StringWriter stringwriter = new System.IO.StringWriter())
{
XmlSerializer ser = new XmlSerializer(typeof(QualityDeviationCaseType));
ser.Serialize(stringwriter, oQualityDeviationCaseType);
sampleChannel.Publish(stringwriter.ToString());
//This line of code sending my xml file to IBM WMQ.
}
With the code above, my result comes out with the xmlns attribute on each element. I want to remove it from the element tags using C#.
The xmlns attribute has important semantics in XML: it puts the element that carries it, along with any unprefixed descendants, in the default namespace http://www.AB.com/BC/QualityDeviationCase/1_0.
I am not familiar with C# specifically, but removing it is a change at the semantic level rather than the syntactic one.
Concretely, it should be done by removing the namespace of all elements in the XML document being exported, or by not putting them in any namespace in the first place. If the C# library is complete, it should provide a way to do so.
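One way to do this at the semantic level in C# is to post-process the serialized string with LINQ to XML, renaming each child element to its local name and dropping its default-namespace declaration. This is a hedged sketch rather than a known fix for the asker's setup: the class NamespaceStripper and the method StripChildNamespaces are hypothetical names.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NamespaceStripper
{
    // Moves every element below the root into no namespace and
    // removes its xmlns="..." declaration. Prefixed declarations
    // such as xmlns:xsi on the root are left untouched.
    public static string StripChildNamespaces(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        foreach (XElement el in doc.Root.Descendants().ToList())
        {
            // A default-namespace declaration is an attribute named "xmlns"
            el.Attributes()
              .Where(a => a.IsNamespaceDeclaration && a.Name.LocalName == "xmlns")
              .Remove();

            // Assigning a local name only puts the element in no namespace
            el.Name = el.Name.LocalName;
        }

        // XDocument.ToString() does not emit the XML declaration
        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

The result could then be passed to the publishing call in place of the raw serializer output. Whether the receiving system still accepts the elements once they are out of the namespace is a separate question.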
Did you try Regex?
using System.Text.RegularExpressions;
...
string pattern = @"xmlns=""[a-zA-Z0-9:\/._]{1,}""";
using (StringWriter stringwriter = new System.IO.StringWriter())
{
    XmlSerializer ser = new XmlSerializer(typeof(QualityDeviationCaseType));
    ser.Serialize(stringwriter, oQualityDeviationCaseType);
    string s = stringwriter.ToString();
    Match m = Regex.Match(s, pattern);
    if (m.Success)
        s = s.Replace(m.Value, ""); // removes every identical occurrence
    sampleChannel.Publish(s);
    // This line of code sends my XML to IBM WMQ.
}
I'm not that familiar with XmlSerializer, but I think that should work. Match only finds text beginning with 'xmlns="' and ending with '"', so prefixed declarations such as xmlns:xsi are left alone. Because every child element carries the same declaration, the single Replace call removes all of them.

Parsing an XML file with tags inside some values in C#

First, thank you for your time. I need to write C# code that gets an XML response from a REST web service. Some tags contain a special field whose value holds an entire tag. Here is an example:
<Root date="22.22.2222" version="specific system versonning">
<record field1="value1" field2="value1" specialfield="just_a_normal_value"/>
<record field1="value1" field2="value2" specialfield=" <multiplesubfields subfield1= "subvalue1" subfield2="subvalue2"/> " field3="value3"/>
<record field1="value1" field2="value2" specialfield=" <multiplesubfields subfield1= "subvalue1" subfield2="subvalue2" subfield3="subvalue3"/> " field3="value3"/>
<record field1="value1" field2="value1" specialfield="just_a_normal_value"/>
</Root>
The number of inner fields inside the tag value is not fixed.
For some records the specialfield contains just a normal value with no tags inside; in that case no modification is needed.
What I want to do is retrieve those subfields and their values, add them to the parent tag as if they were normal fields, and remove the specialfield attribute from it. The result should then look like this:
<Root date="22.22.2222" version="specific system versonning">
<record field1="value1" field2="value1" specialfield="just_a_normal_value"/>
<record field1="value1" field2="value2" subfield1= "subvalue1" subfield2="subvalue2" field3="value3"/>
<record field1="value1" field2="value2" subfield1= "subvalue1" subfield2="subvalue2" subfield3="subvalue3" field3="value3"/>
<record field1="value1" field2="value1" specialfield="just_a_normal_value"/>
</Root>
Following is code to get the XML file:
static void Main(string[] args)
{
string url = "www.blablabla..";
HttpWebRequest req = WebRequest.Create(url) as HttpWebRequest;
req.Credentials = new NetworkCredential("USER", "PASSWORD");
XmlDocument xmlDoc = new XmlDocument();
using (HttpWebResponse resp = req.GetResponse() as HttpWebResponse)
{
xmlDoc.Load(resp.GetResponseStream());
}
}
The above code works and fetches the file perfectly. I checked it with
StringWriter sw = new StringWriter();
XmlTextWriter tx = new XmlTextWriter(sw);
xmlDoc.WriteTo(tx);
System.Console.WriteLine(sw.ToString());
but I have no clue how to do the rest in a clean way. I read many similar posts on Stack Overflow and other sites but am still not able to get started. I have some ugly string-manipulation ideas of my own that might work, but I don't even dare to try them =).
Thank you all for reading this post.
XDocument doc = XDocument.Parse(xml);
var result = doc.Root.Descendants("record").Where(x => x.Attribute("specialfield") != null && x.Attribute("specialfield").Value.Contains("multiplesubfields"));
foreach(var item in result)
{
item.Attribute("specialfield").Remove();
}
You can use XDocument: take all the record nodes and search for those whose specialfield attribute contains the string multiplesubfields.
LINQ to XML is a handy tool for working with XML.
You can parse the embedded XML (<multiplesubfields>) as an XElement and add the attributes of the parsed element to the parent <record>.
const string XMLSTART = " <multiplesubfields";
XDocument doc = XDocument.Load(yourResponseStream);
var specialsRecords =
doc.Root
.Descendants("record")
.Where(rec => rec.Attribute("specialfield") != null)
.Where(rec => rec.Attribute("specialfield").Value.StartsWith(XMLSTART));
foreach (var special in specialsRecords)
{
var multiplesubfields = XElement.Parse(special.Attribute("specialfield").Value);
foreach (var subField in multiplesubfields.Attributes())
{
special.Add(subField);
}
// Remove original attribute
special.Attribute("specialfield").Remove();
}
Console.WriteLine(doc.ToString());
Check the URL below. You can paste in sample XML, study the result, and get an object form of it.
http://xmltocsharp.azurewebsites.net/
I had similar issues when I was integrating with the Zenith Bank Nigeria Limited SOAP API. The values weren't consistent, and I had to study each case.
Cheers

How can I get all the nodes of a xml file?

Let's say I have this XML file:
<Names>
<Name>
<FirstName>John</FirstName>
<LastName>Smith</LastName>
</Name>
<Name>
<FirstName>James</FirstName>
<LastName>White</LastName>
</Name>
</Names>
And now I want to print all the names of the nodes:
Names
Name
FirstName
LastName
I managed to get them all into an XmlNodeList, but I don't know how SelectNodes works.
XmlNodeList xnList = xml.SelectNodes(/*What goes here*/);
I want to select all nodes, and then do a foreach over xnList (using the .Value property, I assume).
Is this the correct approach? How can I use SelectNodes to select all the nodes?
Ensuring you have LINQ and LINQ to XML in scope:
using System.Linq;
using System.Xml.Linq;
If you load them into an XDocument:
var doc = XDocument.Parse(xml); // if from string
var doc = XDocument.Load(xmlFile); // if from file
You can do something like:
doc.Descendants().Select(n => n.Name).Distinct()
This will give you a collection of all distinct XNames of elements in the document. If you don't care about XML namespaces, you can change that to:
doc.Descendants().Select(n => n.Name.LocalName).Distinct()
which will give you a collection of all distinct element names as strings.
There are several ways of doing it.
With XDocument and LINQ-XML
foreach(var name in doc.Root.DescendantNodes().OfType<XElement>().Select(x => x.Name).Distinct())
{
Console.WriteLine(name);
}
If you are using C# 3.0 or above, you can do this:
var data = XElement.Load("c:/test.xml"); // change this to reflect location of your xml file
var allElementNames =
    (from e in data.Descendants()
     select e.Name).Distinct();
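Add
using System;
using System.Linq;
using System.Xml.Linq;
Then you can do
var element = XElement.Parse({Your xml string});
// Join the results first; passing the query itself to Console.Write would print its type name
Console.Write(string.Join(Environment.NewLine, element.Descendants("Name").Select(el => string.Format("{0} {1}", el.Element("FirstName").Value, el.Element("LastName").Value))));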
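Since the question asked specifically about SelectNodes, here is a minimal sketch that stays with XmlDocument. The XPath expression //* selects every element in the document. The class NodeNames and the method DistinctElementNames are hypothetical names. Note that it reads node.Name rather than node.Value, because Value is null for element nodes.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class NodeNames
{
    // Returns each distinct element name once, in document order.
    public static List<string> DistinctElementNames(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var names = new List<string>();
        // "//*" matches every element at any depth, including the root
        foreach (XmlNode node in doc.SelectNodes("//*"))
        {
            if (!names.Contains(node.Name))
            {
                names.Add(node.Name);
            }
        }
        return names;
    }
}
```

For the sample document in the question this would yield Names, Name, FirstName and LastName, which can then be printed in a foreach loop.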
