How to parse the below xml string in c# - c#

I have the below xml string in one string variable.
string xmlString = "<a:ORegions>
<a:ID>1</a:ID>
<a:regionCode>US</a:regionCode>
</a:ORegions>
<a:ORegions>
<a:ID>2</a:ID>
<a:regionCode>CANADA</a:regionCode>
</a:ORegions>
<a:ORegions>
<a:ID>3</a:ID>
<a:regionCode>ASIA</a:regionCode>
</a:ORegions>
Now i want to access regionCode values, that is US, CANADA, ASIA
How i can do that using c#. I am new to xml parsing.

You can deserialize that string (assuming you fix the various syntax errors) via the System.Xml namespace classes, particularly XmlDocument, such as with its Load method. To access the namespaces (a in a:Oregions and such is a namespace), you'll want an XmlNamespaceManager. You'd then register the namespaces (they must be defined somewhere) with the manager and use that when querying the XmlDocument.

Use LinqToXml
var doc = XDocument.Parse(xmlString);
You can then access elements, values and attributes within:
XNamespace xmlNamespace = "a";
//e.g. Retrieve's a list of regioncodes...
var ids = doc.Elements(xmlNamespace + "ORegions")
.Select(r => r.Element("regionCode").Value);

XmlDocument document = new XmlDocument();
document.Load(filePath);
foreach (XmlNode node in document.GetElementsByTagName("a:regionCode"))
Console.WriteLine(node.InnerText);

Related

How to read a node in a serialized XML file?

I'm trying to read a node in a serialized XML file. Here is the the first part of the XML file (I'm using a screen cap because pasting ended up with weird formatting):
And this is the code I'm using to read the XML and the error it's throwing:
I'm trying to read the <ScenarioDescription> node.
As per request, here's the entire XML file. Unfortunately it's just a complete mess. Here is a link to the XML file.
You should You would need to specify the namespace. In this particular case, the default namespace is used to declare the http://schemas.datacontract.org/2004/07/ModelLib namespace.
var xml = new XmlDocument();
xml.LoadXml(str);
XmlNamespaceManager ns = new XmlNamespaceManager(xml.NameTable);
ns.AddNamespace("x", xml.DocumentElement.NamespaceURI);
var root = xml.DocumentElement;
var test = root.SelectSingleNode("//x:ScenarioDescription",ns);
var scenarioText = test.InnerText;
You can use the following code to access the ScenarioDescription node and its InnerText
var document = new XmlDocument();
document.Load(s);
var root = document.DocumentElement;
var node = root["ScenarioDescription"];
var text = node?.InnerText;
SelectSingleNode accepts the xpath expression, you simply can use XmlElement indexer instead. Otherwise you'll need to create a XmlNamespaceManager instance and add your root namespace to it
You can supply the XmlNamespaceManager object as a parameter to the
SelectNodes or SelectSingleNode method of the XmlDocument class to
execute XPath query expressions that reference namespace-qualified
element and attribute name

C# Read XML with multiple variable namespaces

I have to read some tags and attributes from an XML that has a defined structure but since those files can be generated from different sources, they can have different namespaces and prefixes.
This is the first XML sample
<Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:Order-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cbc:UBLVersionID>2.1</cbc:UBLVersionID>
<cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns001:ver2.0:extended:urn:www.peppol.eu:bis:peppol3a:ver2.0:extended:urn:www.ubl-italia.org:spec:ordine:ver2.1</cbc:CustomizationID>
<cbc:ID>ORD-001</cbc:ID>
<cbc:IssueDate>2016-10-01</cbc:IssueDate>
<cbc:OrderTypeCode listID="UNCL1001">221</cbc:OrderTypeCode>
<cac:ValidityPeriod>
<cbc:EndDate>2024-10-19</cbc:EndDate>
</cac:ValidityPeriod>
<cac:BuyerCustomerParty>
<cac:Party>
<cbc:EndpointID schemeID="IT:IPA">ITAK12MH</cbc:EndpointID>
<cac:PartyIdentification>
<cbc:ID schemeID="IT:VAT">01567570254</cbc:ID>
</cac:PartyIdentification>
<cac:PartyName>
<cbc:Name>A Custom Name</cbc:Name>
</cac:PartyName>
</cac:Party>
</cac:BuyerCustomerParty>
</Order>
This is the second XML sample with different namespaces and prefixes, but same structure (tags, attributes).
<ns10:Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:ns2="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2" xmlns:ns3="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:ns4="http://www.w3.org/2000/09/xmldsig#" xmlns:ns5="http://uri.etsi.org/01903/v1.3.2#" xmlns:ns6="urn:oasis:names:specification:ubl:schema:xsd:SignatureBasicComponents-2" xmlns:ns7="urn:oasis:names:specification:ubl:schema:xsd:SignatureAggregateComponents-2" xmlns:ns8="http://uri.etsi.org/01903/v1.4.1#" xmlns:ns9="urn:oasis:names:specification:ubl:schema:xsd:CommonSignatureComponents-2" xmlns:ns10="urn:oasis:names:specification:ubl:schema:xsd:Order-2">
<UBLVersionID>2.1</UBLVersionID>
<CustomizationID>urn:www.cenbii.eu:transaction:biitrns001:ver2.0:extended:urn:www.peppol.eu:bis:peppol3a:ver2.0:extended:urn:www.ubl-italia.org:spec:ordine:ver2.1</CustomizationID>
<ID>ORD-001</ID>
<IssueDate>2016-10-01</IssueDate>
<OrderTypeCode listID="UNCL1001">221</OrderTypeCode>
<ns3:ValidityPeriod>
<EndDate>2024-10-19</EndDate>
</ns3:ValidityPeriod>
<ns3:BuyerCustomerParty>
<ns3:Party>
<EndpointID schemeID="IT:IPA">ITAK12MH</EndpointID>
<ns3:PartyIdentification>
<ID schemeID="IT:VAT">01567570254</ID>
</ns3:PartyIdentification>
<ns3:PartyName>
<Name>A Custom Name</Name>
</ns3:PartyName>
</ns3:Party>
</ns3:BuyerCustomerParty>
</ns10:Order>
Those files must be considered the same and so both valid.
A third example can be a file similar to the second where the namespaces are the same but their prefixes are different. Obviously the important thing is that the prefix used to match the namespace belongs to that particular tag.
I have no way of knowing in advance what will be the prefixes associated with namespaces.
<aaa:Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:aaa="urn:oasis:names:specification:ubl:schema:xsd:Order-2" xmlns:bbb="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2">
<UBLVersionID>2.1</UBLVersionID>
<CustomizationID>urn:www.cenbii.eu:transaction:biitrns001:ver2.0:extended:urn:www.peppol.eu:bis:peppol3a:ver2.0:extended:urn:www.ubl-italia.org:spec:ordine:ver2.1</CustomizationID>
<ID>ORD-001</ID>
<IssueDate>2016-10-01</IssueDate>
<OrderTypeCode listID="UNCL1001">221</OrderTypeCode>
<bbb:ValidityPeriod>
<EndDate>2024-10-19</EndDate>
</bbb:ValidityPeriod>
<bbb:BuyerCustomerParty>
<bbb:Party>
<EndpointID schemeID="IT:IPA">ITAK12MH</EndpointID>
<bbb:PartyIdentification>
<ID schemeID="IT:VAT">01567570254</ID>
</bbb:PartyIdentification>
<bbb:PartyName>
<Name>A Custom Name</Name>
</bbb:PartyName>
</bbb:Party>
</bbb:BuyerCustomerParty>
</aaa:Order>
This last file must be considered valid as the others.
As you can see, the association between the tags and their namespaces are always the same. The only things that are changed are the prefixes.
My actual code uses XDocument and XElement classes to read the XML but it can be the way because I need to know the exact prefix for each tag and since they can vary, it works only with the first XML file sample.
XDocument doc;
XmlNamespaceManager manager;
using (XmlReader reader = XmlReader.Create(stream))
{
doc = XDocument.Load(reader);
// Retrieving namespaces of XML file
XPathNavigator navigator = doc.CreateNavigator();
navigator.MoveToFollowing(XPathNodeType.Element);
IDictionary<string, string> namespaces = navigator.GetNamespacesInScope(XmlNamespaceScope.All);
// Add namespaces to an XmlNamespaceManager to read nodes
manager = new XmlNamespaceManager(reader.NameTable);
foreach (KeyValuePair<string, string> ns in namespaces)
{
manager.AddNamespace(ns.Key, ns.Value);
}
}
XElement currentNode;
currentNode = doc.Root.XPathSelectElement("cbc:ID", manager);
if (currentNode != null)
item.DespatchAdviceId = currentNode.Value;
currentNode = doc.Root.XPathSelectElement("cbc:IssueDate", manager);
if (currentNode != null)
{
DateTime dataEmissione;
if (DateTime.TryParseExact(currentNode.Value, validDateFormats, CultureInfo.InvariantCulture, DateTimeStyles.None, out dataEmissione))
item.OrderIssueDate = dataEmissione;
}
currentNode = doc.Root.XPathSelectElement("cac:BuyerCustomerParty/cac:Party/cac:PartyIdentification/cbc:ID", manager);
if (currentNode != null)
{
item.BuyerPartyId = currentNode.Value;
if (currentNode.Attribute("schemeID") != null)
item.BuyerPartySchemeId = currentNode.Attribute("schemeID").Value;
}
// ... and so on...
How can I read the XMLs without having to specify the namespace prefixes?
Should I use another .NET library or maybe a 3rd party one?
Using LocalName, you can linq it without adding the namespace.
//this is for <cbc:ID>ORD-001</cbc:ID>
var element = doc.Root.Elements().Where(x => x.Name.LocalName == "ID").FirstOrDefault();
If you want to go in the nested elements
var element = doc.Root.Elements().Where(x => x.Name.LocalName == "ValidityPeriod").
Elements().Where(x=> x.Name.LocalName == "EndDate").FirstOrDefault();
I need to know the exact prefix for each tag.
No, you don't. The prefixes are entirely irrelevant to qualified name of an element or attribute. If you want to go the XPath route, then don't read the namespaces and prefixes from the document to create your namespace manager, specify them yourself so you know what they are. Then use those in your query. For example, this will work with any of your XML documents:
var manager = new XmlNamespaceManager(new NameTable());
manager.AddNamespace("cbc",
"urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2");
var id = doc.Root.XPathSelectElement("cbc:ID", manager);
What I would encourage, though, is that you ditch XPath. LINQ to XML is so much nicer. And another quick hint, there is an overload of XDocument.Load that accepts a stream. There's no need to create the XmlReader. So:
XNamespace order = "urn:oasis:names:specification:ubl:schema:xsd:Order-2";
XNamespace cbc = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2";
XNamespace cac = "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2";
var doc = XDocument.Load(stream);
var id = (string) doc.Elements(order + "Order")
.Elements(cbc + "ID")
.Single();
var issueDate = (DateTime) doc.Elements(order + "Order")
.Elements(cbc + "IssueDate")
.Single();
var buyerPartySchemeId = (string) doc.Descendants(cac + "BuyerCustomerParty")
.Descendants(cbc + "ID")
.Attributes("schemeID")
.Single();

How can I address specific XML-Elements with C# when using namespaces?

I have to import XML-documents into a SQL-DB via C#. The XML has several namespaces.
I tried to address the specific elements I want wo import via XPath, but it broke due to the namespaces. Instead of
XmlDocument doc = new XmlDocument();
doc = XDocument.Load(Filename);
GENERATOR_INFO = doc.SelectSingleNode("ORDER/ORDER_HEADER/ORDER_INFO/GENERATOR_INFO").InnerText;
I write something like this:
XmlDocument doc = new XmlDocument();
doc = XDocument.Load(Filename);
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("openTrans", "http://www.opentrans.org/XMLSchema/2.1");
GENERATOR_INFO = doc.SelectSingleNode("//openTrans:GENERATOR_INFO", nsmgr).InnerText;
thus shortening the "path" to the element I need the Value of.
My question is: How can I address an element directly? In the XMl there are two elements called "ORDER_ID":
ORDER/ORDER_HEADER/ORDER_INFO/ORDER_ID
and
ORDER/ORDER_HEADER/CUSTOMER_ORDER_REFERENCE/ORDER_INFO
When using the syntax above I get only one element, with a loop/Linq I get both values but don't know, which one is which.
Is there an option to address the elements by there unique XPath despite the namespace?
regards
Jens
Found it :)
I have to reference every "step" of the XPath: ORDER_ID = doc.SelectSingleNode("//openTrans:ORDER_INFO/openTrans:ORDER‌​_ID", nsmgr).InnerText; resp. VIND_ORDER_ID = COR.SelectSingleNode("//openTrans:CUSTOMER_ORDER_REFERENCE/o‌​penTrans:ORDER_ID", nsmgr).InnerText

XmlDocument Remove XMLNS from Root Element

New here, looking to get a little help with my XmlDocument. Is it possible to have string data in my root element AND remove the xmlns= attribute from being shown? I'm looking for something like this:
<Rulebase author=yadda datetime=bingbang version=1.x </Rulebase>
When I try to use my string data by doing:
xmlDom.AppendChild(xmlDom.CreateElement("", "Rulebase", data));
XmlElement xmlRoot = xmlDom.DocumentElement;
It ends up looking like this:
<Rulebase xmlns="version=0 author=username date=7/13/2011 </Rulebase>
and it also appends xmlns="" to all my other nodes.
The CreateElement overload you're using takes a prefix as it's first argument, local name as second, and namespace as third. If you don't want a namespace, don't use this overload. Just use the one that takes a local name as the one and only argument. Then add your data separately as child elements and attributes.
var xmlDom = new XmlDocument();
XmlElement root = xmlDom.CreateElement("Rulebase");
xmlDom.AppendChild(root);
XmlElement data = xmlDom.CreateElement("Data");
root.AppendChild(data);
XmlAttribute attribute = xmlDom.CreateAttribute("author");
attribute.Value = "username";
data.Attributes.Append(attribute);
attribute = xmlDom.CreateAttribute("date");
attribute.Value = XmlConvert.ToString(DateTime.Now, XmlDateTimeSerializationMode.RoundtripKind);
data.Attributes.Append(attribute);
Console.WriteLine(xmlDom.OuterXml);
Creates (formatting added)
<Rulebase>
<Data author="username" date="2011-07-13T22:44:27.5488853-04:00" />
</Rulebase>
Using XmlDocument to generate XML is pretty tedious though. There are many better ways in .NET, like XmlSerializer and DataContractSerializer. You can also use Linq-to-Xml and XElement. Or you can use an XmlWriter.Create(). Lots of options.

C# Xml Parsing from StringBuilder

I have a StringBuilder with the contents of an XML file. Inside the XML file is a root tag called <root> and contains multiple <node> tags.
I'd like to parse through the XML to read values of tags within in s, but not sure how to do it.
Will I have to use some C# XML data type for this?
Thanks in advance
StringBuilder sb = new StringBuilder (xml);
TextReader textReader = new StringReader (sb.ToString ());
XDocument xmlDocument = XDocument.Load (textReader);
var nodeValueList = from node in xmlDocument.Descendants ("node")
select node.Value;
You should use classes available in either System.Xml or System.Xml.Linq to parse XML.
XDocument is part of the LINQ extensions for XML and is particularly easy to use if you need to parse through an arbitrary structure. I would suggest using it rather than XmlDocument (unless you have legacy code or are not on .NET 3.5).
Creating an XDocument from a StringBuilder is straightforward:
var doc = XDocument.Parse( stringBuilder.ToString() );
From here, you can use FirstNode, Descendents(), and the many other properties and methods available to walk and examine the XML structure. And since XDocument is designed to work well with LINQ, you can also write queries like:
var someData = from node in doc.Descendants ("yourNodeType")
select node.Value; // etc..
If you are just looking the specifically named nodes then you don't need to load the document into memory, you can process it yourself with an XmlReader.
using(var sr = new StringReader(stringBuilder.ToString)) {
using(var xr = XmlReader.Create(sr)) {
while(xr.Read()) {
if(xr.IsStartElement() && xr.LocalName == "node")
xr.ReadElementString(); //Do something here
}
}
}
use XDocument.Parse(...)
There are several objects at your disposal for working with XML. Look at the System.Xml namespace for objects such as XmlDocument as well as the XmlReader and XmlWriter families of objects. If using C# 3.0+, look at the System.Xml.Linq namespace and the XDocument class.
If you're looking to read all the values in the XML file , you could look into deserializing the XML into a C# data Object.
Deserializing XML into class obj in C#
Yes, I suggest you use an XmlDocument object to parse the content of your string.
Here is an example who print all inner text contained in your tags:
var doc=new XmlDocument();
doc.LoadXml(stringBuilder.TosTring());
XmlNodeList elemList = doc.GetElementsByTagName("node");
for (int i=0; i < elemList.Count; i++)
{
XmlNode node=elemList[i];
Console.WriteLine(node.InnerText);
}
using Node object members, you can also easily extract all you attributes .

Categories

Resources