Parsing XML document using .Descendants(value)


I am trying to parse an XML document that I have created. However, xml.Descendants(value) doesn't work if value contains certain characters (including a space, which is my problem).
My XML is structured like this:
<stockists>
<stockistCountry country="Great Britain">
<stockist>
<name></name>
<address></address>
</stockist>
</stockistCountry>
<stockistCountry country="Germany">
<stockist>
<name></name>
<address></address>
</stockist>
</stockistCountry>
...
</stockists>
And my C# code for parsing looks like this:
string path = String.Format("~/Content/{0}/Content/Stockists.xml", Helper.Helper.ResolveBrand());
XElement xml = XElement.Load(Server.MapPath(path));
var stockistCountries = from s in xml.Descendants("stockistCountry")
select s;
StockistCountryListViewModel stockistCountryListViewModel = new StockistCountryListViewModel
{
BrandStockists = new List<StockistListViewModel>()
};
foreach (var stockistCountry in stockistCountries)
{
StockistListViewModel stockistListViewModel = new StockistListViewModel()
{
Country = stockistCountry.FirstAttribute.Value,
Stockists = new List<StockistDetailViewModel>()
};
var stockist = from s in xml.Descendants(stockistCountry.FirstAttribute.Value) // point of failure for 'Great Britain'
select s;
foreach (var stockistDetail in stockist)
{
StockistDetailViewModel stockistDetailViewModel = new StockistDetailViewModel
{
StoreName = stockistDetail.FirstNode.ToString(),
Address = stockistDetail.LastNode.ToString()
};
stockistListViewModel.Stockists.Add(stockistDetailViewModel);
}
stockistCountryListViewModel.BrandStockists.Add(stockistListViewModel);
}
return View(stockistCountryListViewModel);
I am wondering whether I am approaching the XML parsing correctly, and whether I should avoid spaces in my attributes. How can I fix it so that Great Britain will parse?

"However xml.Descendants(value) doesn't work if value has certain characters"
XElement.Descendants() expects an XName, that is, the name of an element (tag), not an attribute value.
And XML tag names are indeed not allowed to contain spaces.
Your sample XML, however, only has the space inside an attribute value, and the space there is fine.
Update:
I think you need
//var stockist = from s in xml.Descendants(stockistCountry.FirstAttribute.Value)
// select s;
var stockists = stockistCountry.Descendants("stockist");
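To make the distinction concrete, here is a minimal sketch under some assumptions: the store names and addresses are invented, and StockistExample and ReadStockists are hypothetical names that are not part of the question's code. It reads the country from the attribute and queries the stockist elements relative to each country element.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class StockistExample
{
    // Hypothetical helper: returns one "Country: name, address" line per stockist.
    public static List<string> ReadStockists(XElement xml)
    {
        var lines = new List<string>();
        foreach (XElement country in xml.Elements("stockistCountry"))
        {
            // The attribute value is plain data, so a space in it is fine.
            string countryName = (string)country.Attribute("country");

            // Element names are what Elements/Descendants expect;
            // the query is relative to the current country element.
            foreach (XElement stockist in country.Elements("stockist"))
            {
                string name = (string)stockist.Element("name");
                string address = (string)stockist.Element("address");
                lines.Add(countryName + ": " + name + ", " + address);
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Sample data shaped like the question's XML; the values are made up.
        const string xmlText = @"<stockists>
  <stockistCountry country=""Great Britain"">
    <stockist><name>Shop A</name><address>1 High Street</address></stockist>
  </stockistCountry>
  <stockistCountry country=""Germany"">
    <stockist><name>Laden B</name><address>Hauptstrasse 2</address></stockist>
  </stockistCountry>
</stockists>";

        foreach (string line in ReadStockists(XElement.Parse(xmlText)))
        {
            Console.WriteLine(line);
        }
    }
}
```

Casting an XElement to string yields its text content, whereas FirstNode.ToString() in the question's code would return the markup of the whole name element.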

Related

Parse XML with quoted attributes

I'm using a 3rd party XML parser (not my decision) and found it does something bad. Here is the inner part of an XML tag:
"Date=""2014-01-01"" Amounts=""100717.72 100717.72 100717.72 100717.72"""
To parse the attributes, the code does a .split on spaces, ignoring the quotes. This works fine as long as there are no strings with spaces, but here we are. It returns a proper Date=2014-01-01 and a semi-proper Amounts=100717.72, but then four more entries of just the numbers.
I have the C# code for the parser, and thought about replacing the spaces inside quotes with some other character, splitting, and then changing them back. But then I thought I should ask here first.
Is there a way to parse this text into two entries properly?
UPDATE: original XML follows (typed in from another computer, forgive me!)
<DetailAmounts Date="2014-01-01" Amounts="100717.72 100717.72 100717.72 100717.72" />

You should just use XmlSerializer to deserialize the data:
public class DetailAmounts
{
[XmlAttribute]
public DateTime Date { get; set; }
[XmlAttribute]
public string Amounts { get; set; }
}
// ...
var xml = "<DetailAmounts Date=\"2014-01-01\" Amounts=\"100717.72 100717.72 100717.72 100717.72\" />";
var serializer = new XmlSerializer(typeof(DetailAmounts));
using (var reader = new StringReader(xml))
{
var detailAmounts = (DetailAmounts)serializer.Deserialize(reader);
}
Or, you can use XElement to parse each individual value:
var xml = "<DetailAmounts Date=\"2014-01-01\" Amounts=\"100717.72 100717.72 100717.72 100717.72\" />";
var element = XElement.Parse(xml);
var detailAmounts = new
{
Date = (DateTime)element.Attribute("Date"),
Amounts = element.Attribute("Amounts").Value.Split(' ')
.Select(x => decimal.Parse(x, CultureInfo.InvariantCulture))
.ToArray()
};

XML invalid chars replacement C#

I have a string that is displayed as XML, but it contains some invalid characters, for example:
s = <root> something here <XMLElement>hello</XMLElement> somethig here too </root>
where XMLElement is one of a list of names, such as XMLElement = {"bold", "italic",...}.
What I need is to replace the < in < and </, when followed by any of the XMLElements, and the matching >, with &lt; or &gt; as appropriate.
The <root> element must be kept.
So far I have tried a regex:
strAux = Regex.Replace(strAux, "bold=\"[^\"]*\"",
match => match.Value.Replace("<", "&lt;").Replace(">", "&gt;"));
or
List<string> startsWith = new List<string> { "<", "</"};
foreach(var stw in startsWith)
{
int nextLt = 0;
while ((nextLt = strAux.IndexOf(stw, nextLt)) != -1)
{
bool isMatch = strAux.Substring(nextLt + 1).StartsWith(BoldElement); // needs to ckeck all the XMLElements
//is element, leave it
if (isMatch)
{
//its not, replace
strAux = string.Format(@"{0}&lt;{1}", strAux.Substring(0, nextLt), strAux.Substring(nextLt +1, strAux.Length - (nextLt + 1)));
}
nextLt++;
}
}
Also tried
XmlDocument doc = new XmlDocument();
XmlElement element = doc.CreateElement("root");
element.InnerText = strAux;
Console.WriteLine(element.OuterXml);
strAux = element.OuterXml.Replace("<root>", "").Replace("</root>", "");
return strAux;
But it will repeat the `<root>` too.
But nothing worked as I expected. Are there any other ideas? Thanks.

What you have is well-formed XML, so you can use the XML APIs to help you:
Using LINQ to XML (which is generally the better API):
var element = XElement.Parse(s);
element.Value = string.Concat(element.Nodes());
var result = element.ToString();
Or using the older XmlDocument API:
var doc = new XmlDocument();
doc.LoadXml(s);
var root = doc.DocumentElement;
root.InnerText = root.InnerXml;
var result = root.OuterXml;
The result for both is:
<root> something here &lt;XMLElement&gt;hello&lt;/XMLElement&gt; somethig here too </root>

You should be using the XmlWriter class.
Sample from the documentation:
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.ConformanceLevel = ConformanceLevel.Fragment;
settings.CloseOutput = false;
// Create the XmlWriter object and write some content.
MemoryStream strm = new MemoryStream();
XmlWriter writer = XmlWriter.Create(strm, settings);
writer.WriteElementString("someNode", "someValue");
writer.Flush();
writer.Close();
https://msdn.microsoft.com/en-us/library/system.xml.xmlwriter(v=vs.110).aspx

It sounds like your input is well-formed XML, but you want to escape some of the tags. The issue here is that there's no way for the code to know which tags are valid and which aren't.
One way to do this is to create a list of valid tags.
List<string> validTags = new List<string>() { "root", "..." };
Then use regex to pick out all instances of <tag> or </tag> and replace them if they're not in the list.
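A minimal sketch of that regex approach, assuming tags without attributes; TagEscapeExample and EscapeUnknownTags are hypothetical names chosen for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagEscapeExample
{
    // Hypothetical helper: escapes every simple <tag> or </tag> whose name
    // is not in validTags. Tags with attributes are not matched by this pattern.
    public static string EscapeUnknownTags(string input, ISet<string> validTags)
    {
        return Regex.Replace(input, @"</?(\w+)>", match =>
            validTags.Contains(match.Groups[1].Value)
                ? match.Value // known tag: leave untouched
                : match.Value.Replace("<", "&lt;").Replace(">", "&gt;"));
    }

    public static void Main()
    {
        var validTags = new HashSet<string> { "root" };
        string s = "<root> something here <XMLElement>hello</XMLElement> somethig here too </root>";
        Console.WriteLine(EscapeUnknownTags(s, validTags));
    }
}
```

A HashSet is used instead of a List only so that the lookup per tag is cheap; a List with Contains would behave the same.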
Another way which is faster and easier, but requires more information up front, is to create a list of tags which aren't valid.
List<string> invalidTags = new List<string>() { "XMLElement", "..." };
Simple string manipulation will do, now.
string s = GetYourXMLString();
invalidTags.ForEach(t => s = s.Replace($"</{t}>", $"&lt;/{t}&gt;")
.Replace($"<{t}>", $"&lt;{t}&gt;"));
The second way should really only be used if you know which foreign tags are making (or will ever make) an appearance. If not, the first approach should be used. One clever possibility is to dynamically create the list of valid tags using reflection or a data contract, so that changes to the XML spec will be automatically reflected in your code.
For example, if each element is a property of an object, you might get the list like this:
var validTags = typeof(MyObjectType).GetProperties()
.Select(p => p.Name) // PropertyInfo exposes the property name as Name
.ToList();
Of course, the property names likely won't be the actual tag names, AND often you'll want to only include certain properties. So you make an attribute class to designate the desired properties (let's call it XMLTagName) and then you can do this:
var validTags = typeof(MyObjectType).GetProperties()
.Select(p => p.GetCustomAttribute<XMLTagName>()?.TagName)
.Where(tagName => tagName != null) //gets rid of properties that aren't tagged
.ToList();
Even with all that, you're still committing the crime of string manipulation on raw XML. The best real solution here is to figure out how to fix the incoming XML so that it actually contains the data you want. But if that's not a possibility, the above should do the job.

Getting innertext of XML node

When I try M Adeel Khalid's code I get nothing, and when I try the others I get errors. I am missing something, but I can't see it. My code looks like this, but I get an error on Descendants saying "xmlDocument does not contain a definition for descendants". As you can probably see, I'm pretty new to this, so bear with me.
protected void btnRetVare_Click(object sender, EventArgs e)
{
fldRetVare.Visible = true;
try
{
functions func = new functions();
bool exists = func.checForMatch(txtRetVare.Text);
string myNumber = txtRetVare.Text;
if (Page.IsValid)
{
if (!exists)
{
txtRetVare.Text= "Varenummer findes ikke";
}
else
{
XmlDocument xmldoc = new XmlDocument();
//xmldoc.Load(Server.MapPath(map));
xmldoc.LoadXml(Server.MapPath(map));
//var Varenummer2055component = xmldoc.SelectNodes("s/Reservedele/Component[Varenummer/text()='"+txtRetVare+"']/Remarks");
//if (Varenummer2055component.Count == 1)
//{
// var remarks = Varenummer2055component[0].InnerText;
// txtRetBemærkninger.Text = remarks.ToString();
//}
string remarks = (from xml2 in xmldoc.Descendants("Component")
where xml2.Element("Varenummer").Value == txtRetVare.Text
select xml2.Element("Remarks")).FirstOrDefault().Value;
txtRetBemærkninger.Text = remarks;
}
}

You can get it this way.
XDocument xdoc = XDocument.Load(XmlPath);
string remarks = (from xml2 in xdoc.Descendants("Component")
where xml2.Element("Varenummer").Value == "2055"
select xml2.Element("Remarks")).FirstOrDefault().Value;
I've tested this code.
Hope it helps.

Use XPath to select the correct node:
XmlDocument xml = new XmlDocument();
xml.LoadXml(@"
<Reservedele>
<Component>
<Type>Elektronik</Type>
<Art>Wheel</Art>
<Remarks>erter</Remarks>
<Varenummer>2055</Varenummer>
<OprettetAf>jg</OprettetAf>
<Date>26. januar 2017</Date>
</Component>
<Component>
<Type>Forbrugsvarer</Type>
<Art>Bulb</Art>
<Remarks>dfdh</Remarks>
<Varenummer>2055074</Varenummer>
<OprettetAf>jg</OprettetAf>
<Date>27. januar 2017</Date>
</Component>
</Reservedele>");
var Varenummer2055component = xml.SelectNodes("/Reservedele/Component[Varenummer/text()='2055']/Remarks");
if (Varenummer2055component.Count == 1)
{
var remarks = Varenummer2055component[0].InnerText;
}

I think the First extension method of LINQ to XML is simple enough and meets the requirements of your question.
var document = XDocument.Load(pathTopXmlFile);
var remark =
document.Descendants("Component")
.First(component => component.Element("Varenummer").Value.Equals("2055"))
.Element("Remarks")
.Value;
The First method will throw an exception if the XML doesn't contain an element with Varenummer = 2055.
If there is a possibility that the given number doesn't exist in the XML file, you can use the FirstOrDefault extension method and add a null check:
var document = XDocument.Load(pathTopXmlFile);
var component =
document.Descendants("Component")
.FirstOrDefault(comp => comp.Element("Varenummer").Value.Equals("2055"));
var remark = component != null ? component.Element("Remarks").Value : null;
To save a new value you can use the same approach: set the new value, then save the document back to the same file.
var document = XDocument.Load(pathTopXmlFile);
var component =
document.Descendants("Component")
.FirstOrDefault(comp => comp.Element("Varenummer").Value.Equals("2055"));
component.Element("Remarks").Value = newValueFromTextBox;
document.Save(pathTopXmlFile);
One more approach, which is overkill in your particular case but can be useful if you use other values from the XML, is serialization.
You can create a class that represents the data in your XML file and then use serialization to load and save the data. See examples of XML serialization.
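As a rough sketch of what that could look like, assuming the Reservedele/Component layout shown in the XPath answer: the classes, the SerializationExample name, and its FromXml and ToXml helpers are all hypothetical, and every field is kept as a string for simplicity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical classes mirroring the <Reservedele>/<Component> sample XML.
[XmlRoot("Reservedele")]
public class Reservedele
{
    [XmlElement("Component")]
    public List<Component> Components { get; set; } = new List<Component>();
}

public class Component
{
    public string Type { get; set; }
    public string Art { get; set; }
    public string Remarks { get; set; }
    public string Varenummer { get; set; }
    public string OprettetAf { get; set; }
    public string Date { get; set; }
}

public static class SerializationExample
{
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Reservedele));

    // Turns an XML string into objects.
    public static Reservedele FromXml(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (Reservedele)Serializer.Deserialize(reader);
        }
    }

    // Turns the objects back into an XML string.
    public static string ToXml(Reservedele data)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, data);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        const string xml = @"<Reservedele>
  <Component>
    <Type>Elektronik</Type>
    <Art>Wheel</Art>
    <Remarks>erter</Remarks>
    <Varenummer>2055</Varenummer>
    <OprettetAf>jg</OprettetAf>
    <Date>26. januar 2017</Date>
  </Component>
</Reservedele>";

        Reservedele data = FromXml(xml);
        Component match = data.Components.FirstOrDefault(c => c.Varenummer == "2055");
        if (match != null)
        {
            match.Remarks = "updated remark"; // placeholder for the text box value
        }
        Console.WriteLine(ToXml(data));
    }
}
```

To work against a file instead of a string, the same serializer can be given a FileStream or StreamReader/StreamWriter in place of the StringReader and StringWriter.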

how to get value from xml by Linq

I was reading a huge XML file (5 GB) using the following code. I succeeded in getting the first element, Testid, but failed to get another element, TestMin, which is under a different namespace.
This is the XML I have:
which I am getting as null.
What is wrong here?
EDIT
GMileys' answer gives an error like: The ':' character, hexadecimal value 0x3A, cannot be included in a name

The element es:qRxLevMin is a child element of xn:attributes, but it looks like you are trying to select it as a child of xn:vsDataContainer; it is actually a grandchild of that element. You could try changing the following:
var dataqrxlevmin = from atts in pin.ElementsAfterSelf(xn + "VsDataContainer")
select new
{
qrxlevmin = (string)atts.Element(es + "qRxLevMin"),
};
To this:
var dataqrxlevmin = from atts in pin.Elements(string.Format("{0}VsDataContainer/{1}attributes", xn, es))
select new
{
qrxlevmin = (string)atts.Element(es + "qRxLevMin"),
};
Note: I changed your string concatenation to use string.Format for readability. Either is technically fine to use, but string.Format is a better approach.
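One caveat: Elements takes a single XName rather than a path, which would explain the "':' character" error reported in the question's edit. A minimal sketch of walking down one level at a time follows; the namespace URIs, the document shape, and the NamespaceExample and ReadQRxLevMin names are assumptions made for illustration, since the original XML is not shown.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceExample
{
    // Hypothetical helper: container is an xn:VsDataContainer element.
    public static string ReadQRxLevMin(XElement container, XNamespace xn, XNamespace es)
    {
        // Child first (xn:attributes), then grandchild (es:qRxLevMin).
        // The cast returns null if nothing matches.
        return (string)container
            .Elements(xn + "attributes")
            .Elements(es + "qRxLevMin")
            .FirstOrDefault();
    }

    public static void Main()
    {
        // Made-up namespace URIs; use the ones declared in the real file.
        const string xml = @"<xn:VsDataContainer xmlns:xn='urn:example:xn' xmlns:es='urn:example:es'>
  <xn:attributes>
    <es:qRxLevMin>-115</es:qRxLevMin>
  </xn:attributes>
</xn:VsDataContainer>";

        XNamespace xn = "urn:example:xn";
        XNamespace es = "urn:example:es";

        Console.WriteLine(ReadQRxLevMin(XElement.Parse(xml), xn, es));
    }
}
```

The XNamespace values must be the namespace URIs, not the prefixes; xn + "attributes" then builds the full XName that LINQ to XML compares against.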

What about this approach?
XDocument doc = XDocument.Load(path);
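// The second argument of XName.Get must be the namespace URI that the prefix maps to,
// not the prefix itself; replace "un" and "es" with the URIs declared in your document.
XName utranCellName = XName.Get("UtranCell", "un");
XName qRxLevMinName = XName.Get("qRxLevMin", "es");
var cells = doc.Descendants(utranCellName);
foreach (var cell in cells)
{
// The explicit cast returns null when no matching element is found.
string qRxLevMin = (string)cell.Descendants(qRxLevMinName).FirstOrDefault();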
// Do something with the value
}

Try this code, which is very similar to yours but simpler.
using (XmlReader xr = XmlReader.Create(path))
{
xr.MoveToContent();
XNamespace un = xr.LookupNamespace("un");
XNamespace xn = xr.LookupNamespace("xn");
XNamespace es = xr.LookupNamespace("es");
while (!xr.EOF)
{
if(xr.LocalName != "UtranCell")
{
xr.ReadToFollowing("UtranCell", un.NamespaceName);
}
if(!xr.EOF)
{
XElement utranCell = (XElement)XElement.ReadFrom(xr);
}
}
}

Actually, the namespace was the culprit. What I did was first load the small section I get from the .ReadFrom method into an XDocument, then remove all the namespaces, and then take the value. Simple :)

Get certain xml node and save the value

Considering the following XML:
<Stations>
<Station>
<Code>HT</Code>
<Type>123</Type>
<Names>
<Short>H'bosch</Short>
<Middle>Den Bosch</Middle>
<Long>'s-Hertogenbosch</Long>
</Names>
<Country>NL</Country>
</Station>
</Stations>
There are multiple nodes. I need the value of each node.
I've got the XML from a webpage (http://webservices.ns.nl/ns-api-stations-v2)
Login (--) Pass (--)
Currently I take the XML as a string and parse it into an XDocument.
var xml = XDocument.Parse(xmlString);
foreach (var e in xml.Elements("Long"))
{
var stationName = e.ToString();
}

You can retrieve the "Station" nodes using XPath, then get each child node using more XPath. This example doesn't use LINQ, which it looks like you may be trying to use in your question, but here it is:
XmlDocument xml = new XmlDocument();
xml.Load(xmlStream);
XmlNodeList stations = xml.SelectNodes("//Station");
foreach (XmlNode station in stations)
{
var code = station.SelectSingleNode("Code").InnerXml;
var type = station.SelectSingleNode("Type").InnerXml;
var longName = station.SelectSingleNode("Names/Long").InnerXml;
var blah = "you should get the point by now";
}
NOTE: If your xmlStream variable is a String, rather than a Stream, use xml.LoadXml(xmlStream); for line 2, instead of xml.Load(xmlStream). If this is the case, I would also encourage you to name your variable to be more accurately descriptive of the object you're working with (aka. xmlString).

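This will give you the value of "Long" for every Station element. Note that Descendants is needed here: Elements only returns direct children, so Elements("Long") called on the document finds nothing.
var xml = XDocument.Parse(xmlString);
var longStationNames = xml.Descendants("Long").Select(e => e.Value);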
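For reference, a small self-contained sketch using the sample XML from the question. Elements only looks at direct children, whereas Descendants searches the whole subtree, which is what reaching Long inside Station/Names requires. StationExample and LongNames are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class StationExample
{
    // Hypothetical helper: collects the text of every <Long> element in the document.
    public static List<string> LongNames(string xmlString)
    {
        XDocument xml = XDocument.Parse(xmlString);
        return xml.Descendants("Long").Select(e => e.Value).ToList();
    }

    public static void Main()
    {
        const string xmlString = @"<Stations>
  <Station>
    <Code>HT</Code>
    <Type>123</Type>
    <Names>
      <Short>H'bosch</Short>
      <Middle>Den Bosch</Middle>
      <Long>'s-Hertogenbosch</Long>
    </Names>
    <Country>NL</Country>
  </Station>
</Stations>";

        foreach (string name in LongNames(xmlString))
        {
            Console.WriteLine(name);
        }
    }
}
```

Using e.Value rather than e.ToString() matters here: ToString() returns the element including its tags, while Value returns only the text.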
