Extract Inner Nodes from XML string to a JSON string - c#

string inputxml = @"<transaction>
<node1>value1</node1>
<node2>value2</node2>
<node3>value3</node3>
</transaction>";
I want to convert this XML string to a JSON string in the format below, omitting the outermost node:
{"node1":"value1","node2":"value2","node3":"value3"}

You can use:
1 - XDocument to build an anonymous object that matches the JSON, like:
string inputxml = @"<transaction>
<node1>value1</node1>
<node2>value2</node2>
<node3>value3</node3>
</transaction>";
var node = XDocument.Parse(inputxml)
.Descendants("transaction")
.Select(x => new
{
Node1 = x.Element("node1").Value,
Node2 = x.Element("node2").Value,
Node3 = x.Element("node3").Value
}).FirstOrDefault();
2 - Newtonsoft.Json to serialize the object, like:
string json = JsonConvert.SerializeObject(node);
Demo
Console.WriteLine(json);
Result
{"Node1":"value1","Node2":"value2","Node3":"value3"}
I hope you find this helpful.

As far as I understand your problem, you have no model for either the source XML or the JSON, and the names can change in the future, so we shouldn't hard-code them. Instead we will build the JSON dynamically. Note that you'll need the Newtonsoft.Json NuGet package.
string inputxml = @"<transaction>
<node1>value1</node1>
<node2>value2</node2>
<node3>value3</node3>
</transaction>";
XDocument xdoc = XDocument.Parse(inputxml); //parse XML document
var jprops = xdoc.Root.Elements() // take elements in root of the doc
.Select(x => (x.Name, x.Value)) // map it to tuples (XName, string)
.Select(x => new JProperty(x.Name.LocalName, x.Value)); // map it to an enumerable of JSON properties
JObject resultingObj = new JObject(jprops); // construct your JSON object and populate its contents
Console.WriteLine(resultingObj.ToString()); // write it out
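If you would rather not take the Newtonsoft.Json dependency, the same dynamic idea can probably be sketched with the built-in System.Text.Json.Nodes types. This is only a sketch and it assumes .NET 6 or later (where JsonObject is available); it is not part of the answer above. Note that a repeated element name would overwrite the earlier value here.

```csharp
using System;
using System.Text.Json.Nodes;
using System.Xml.Linq;

string inputxml = @"<transaction>
<node1>value1</node1>
<node2>value2</node2>
<node3>value3</node3>
</transaction>";

// Copy each child of the root element into a JSON object, keeping document order.
// Assumption: .NET 6+ (System.Text.Json.Nodes); element names are unique.
var result = new JsonObject();
foreach (XElement child in XDocument.Parse(inputxml).Root.Elements())
{
    result[child.Name.LocalName] = child.Value;
}

string json = result.ToJsonString();
Console.WriteLine(json); // {"node1":"value1","node2":"value2","node3":"value3"}
```

Because the element names are read from the document, nothing needs to change if nodes are added or renamed later.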

Related

how to clean duplicate (parent) nodes?

My input payload would be something like the following:
<ns0:SourceFacilityCode FieldTypeToTranslate="Store">
<ns0:SourceFacilityCode>CRA</ns0:SourceFacilityCode>
</ns0:SourceFacilityCode>
<ns0:Alex FieldTypeToTranslate="Facility">
<ns0:Alex>CRA</ns0:Alex>
</ns0:Alex>
<ns0:Shoes>Red</ns0:Shoes>
As you can see, the outer SourceFacilityCode and Alex wrapper elements are unnecessary. To deserialize this to a concrete C# object, we would need to transform the input into something like this:
<ns0:SourceFacilityCode>CRA</ns0:SourceFacilityCode>
<ns0:Alex>CRA</ns0:Alex>
<ns0:Shoes>Red</ns0:Shoes>
How do I transform this payload to look like that?
What I've tried:
1. simple `string.Replace(a,b)`, but this is too messy and not generic
2. trying to load this into a concrete XML object, but this was too difficult because the nested nodes have the same name as their parents
3. attempting to transform to json and then to concrete object
Here is a solution using LINQ to XML:
First, wrap your example XML into a Root element to make it valid XML which can be parsed by XDocument.Parse:
var xml = @"<Root xmlns:ns0=""http://example.org/ns0"">
<ns0:SourceFacilityCode FieldTypeToTranslate=""Store"">
<ns0:SourceFacilityCode>CRA</ns0:SourceFacilityCode>
</ns0:SourceFacilityCode>
<ns0:Alex FieldTypeToTranslate=""Facility"">
<ns0:Alex>CRA</ns0:Alex>
</ns0:Alex>
<ns0:Shoes>Red</ns0:Shoes>
</Root>";
var doc = XDocument.Parse(xml);
Then we determine all elements with a single child element that has the same name as the element and that has no child elements:
var elementsWithSingleChildHavingSameName = doc.Root.Descendants()
.Where(e => e.Elements().Count() == 1
&& e.Elements().First().Name == e.Name
&& !e.Elements().First().HasElements)
.ToArray();
Last, loop through the found elements removing the child element while transferring the value:
foreach (var element in elementsWithSingleChildHavingSameName)
{
var child = element.Elements().First();
child.Remove();
element.Value = child.Value;
}
To transform back to a string and remove the Root wrapper:
var cleanedUpXml = doc.ToString();
var output = Regex.Replace(cleanedUpXml, @"</?Root.*?>", "");

Parse XML with quoted attributes

I'm using a 3rd party XML parser (not my decision) and found it does something bad. Here is the inner part of an XML tag:
"Date=""2014-01-01"" Amounts=""100717.72 100717.72 100717.72 100717.72"""
To parse the attributes, the code does a .split on spaces, ignoring the quotes. This works fine as long as there are no strings with spaces, but here we are. It returns a proper Date=2014-01-01 and a semi-proper Amounts=100717.72, but then three more entries containing just the numbers.
I have the C# code for the parser, and thought about replacing the spaces inside quotes with some other character, splitting, and then changing them back. But then I thought I should ask here first.
Is there a way to parse this text into two entries properly?
UPDATE: original XML follows (typed in from another computer, forgive me!)
<DetailAmounts Date="2014-01-01" Amounts="100717.72 100717.72 100717.72 100717.72" />
You should just use XmlSerializer to deserialize the data:
public class DetailAmounts
{
[XmlAttribute]
public DateTime Date { get; set; }
[XmlAttribute]
public string Amounts { get; set; }
}
// ...
var xml = "<DetailAmounts Date=\"2014-01-01\" Amounts=\"100717.72 100717.72 100717.72 100717.72\" />";
var serializer = new XmlSerializer(typeof(DetailAmounts));
using (var reader = new StringReader(xml))
{
var detailAmounts = (DetailAmounts)serializer.Deserialize(reader);
}
Or, you can use XElement to parse each individual value:
var xml = "<DetailAmounts Date=\"2014-01-01\" Amounts=\"100717.72 100717.72 100717.72 100717.72\" />";
var element = XElement.Parse(xml);
var detailAmounts = new
{
Date = (DateTime)element.Attribute("Date"),
Amounts = element.Attribute("Amounts").Value.Split(' ')
.Select(x => decimal.Parse(x, CultureInfo.InvariantCulture))
.ToArray()
};
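If the attribute names are not known in advance, a rough sketch along the same lines is to let XElement do the parsing and then enumerate whatever attributes it found. This directly yields the two name/value entries the question asks for; the variable names here are only illustrative.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var xml = "<DetailAmounts Date=\"2014-01-01\" Amounts=\"100717.72 100717.72 100717.72 100717.72\" />";
var element = XElement.Parse(xml);

// Each XAttribute keeps its full quoted value, spaces included.
var pairs = element.Attributes()
    .Select(a => $"{a.Name.LocalName}={a.Value}")
    .ToList();

foreach (var pair in pairs)
{
    Console.WriteLine(pair);
}
```

This avoids splitting on spaces altogether, so quoted values with embedded spaces stay intact.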

XML invalid chars replacement C#

I have a string that is displayed as XML, but it contains some invalid characters, for example:
s = <root> something here <XMLElement>hello</XMLElement> somethig here too </root>
where XMLElement stands for any item of a list such as XMLElement = {"bold", "italic",...}.
What I need is to replace the < and > of any tag (opening or closing) whose name is in that list with &lt; and &gt;.
The <root> element must be kept.
So far I have tried a regex:
strAux = Regex.Replace(strAux, "bold=\"[^\"]*\"",
match => match.Value.Replace("<", "&lt;").Replace(">", "&gt;"));
or
List<string> startsWith = new List<string> { "<", "</"};
foreach(var stw in startsWith)
{
int nextLt = 0;
while ((nextLt = strAux.IndexOf(stw, nextLt)) != -1)
{
bool isMatch = strAux.Substring(nextLt + 1).StartsWith(BoldElement); // needs to check all the XMLElements
//is element, leave it
if (isMatch)
{
//its not, replace
strAux = string.Format(@"{0}&lt;{1}", strAux.Substring(0, nextLt), strAux.Substring(nextLt +1, strAux.Length - (nextLt + 1)));
}
nextLt++;
}
}
Also tried
XmlDocument doc = new XmlDocument();
XmlElement element = doc.CreateElement("root");
element.InnerText = strAux;
Console.WriteLine(element.OuterXml);
strAux = element.OuterXml.Replace("<root>", "").Replace("</root>", "");
return strAux;
But it will repeat the `<root>` too.
Nothing worked as I expected. Are there any other ideas? Thanks.
What you have is well-formed XML, so you can use the XML APIs to help you:
Using LINQ to XML (which is generally the better API):
var element = XElement.Parse(s);
element.Value = string.Concat(element.Nodes());
var result = element.ToString();
Or using the older XmlDocument API:
var doc = new XmlDocument();
doc.LoadXml(s);
var root = doc.DocumentElement;
root.InnerText = root.InnerXml;
var result = root.OuterXml;
The result for both is:
<root> something here &lt;XMLElement&gt;hello&lt;/XMLElement&gt; somethig here too </root>
You should be using the XmlWriter class.
Sample from the documentation:
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.ConformanceLevel = ConformanceLevel.Fragment;
settings.CloseOutput = false;
// Create the XmlWriter object and write some content.
MemoryStream strm = new MemoryStream();
XmlWriter writer = XmlWriter.Create(strm, settings);
writer.WriteElementString("someNode", "someValue");
writer.Flush();
writer.Close();
https://msdn.microsoft.com/en-us/library/system.xml.xmlwriter(v=vs.110).aspx
It sounds like your input is well-formed XML, but you want to escape some of the tags. The issue here is that there's no way for the code to know which tags are valid and which aren't.
One way to do this is to create a list of valid tags.
List<string> validTags = new List<string>() { "root", "..." };
Then use regex to pick out all instances of <tag> or </tag> and replace them if they're not in the list.
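A minimal sketch of that first approach might look like the following. It assumes the tags carry no attributes and that the valid list contains only "root"; the pattern and the variable names are illustrative, not a definitive implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Assumption: only <root> is a real tag; everything else should be escaped.
var validTags = new HashSet<string> { "root" };
string s = "<root> something here <XMLElement>hello</XMLElement> somethig here too </root>";

// Match attribute-less opening or closing tags such as <tag> or </tag>.
string escaped = Regex.Replace(s, @"</?([A-Za-z_][\w.\-]*)>", m =>
    validTags.Contains(m.Groups[1].Value)
        ? m.Value                                               // valid tag: keep as-is
        : m.Value.Replace("<", "&lt;").Replace(">", "&gt;"));   // unknown tag: escape it

Console.WriteLine(escaped);
// <root> something here &lt;XMLElement&gt;hello&lt;/XMLElement&gt; somethig here too </root>
```

Tags with attributes or self-closing tags would need a wider pattern, which is one more reason to prefer the XML APIs when the input is well-formed.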
Another way which is faster and easier, but requires more information up front, is to create a list of tags which aren't valid.
List<string> invalidTags = new List<string>() { "XMLElement", "..." };
Simple string manipulation will do, now.
string s = GetYourXMLString();
invalidTags.ForEach(t => s = s.Replace($"</{t}>", $"&lt;/{t}&gt;")
.Replace($"<{t}>", $"&lt;{t}&gt;"));
The second way should really only be used if you know which foreign tags are making (or will ever make) an appearance. If not, the first approach should be used. One clever possibility is to dynamically create the list of valid tags using reflection or a data contract so that changes to the XML spec will be automatically reflected in your code.
For example, if each element is a property of an object, you might get the list like this:
var validTags = typeof(MyObjectType).GetProperties()
.Select(p => p.Name)
.ToList();
Of course, the property names likely won't be the actual tag names, AND often you'll want to only include certain properties. So you make an attribute class to designate the desired properties (let's call it XMLTagName) and then you can do this:
var validTags = typeof(MyObjectType).GetProperties()
.Select(p => p.GetCustomAttribute<XMLTagName>()?.TagName)
.Where(tagName => tagName != null) //gets rid of properties that aren't tagged
.ToList();
Even with all that, you're still committing the crime of string manipulation on raw XML. The best real solution here is to figure out how to fix the incoming XML so that it actually contains the data you want. But if that's not a possibility, the above should do the job.

Save the same element from XML file as an array C#

It could be a silly question, but I want to ask: how can I save elements of the same name, but with different content, from an .XML file as an array?
Example XML:
<ithem>
<description>description1</description>
<description>description2</description>
<description>description3</description>
</ithem>
then string [] descriptions will be
descriptions[0] = "description1";
descriptions[1] = "description2";
descriptions[2] = "description3";
Please help!
Using LINQ to XML it would be:
XElement root = XElement.Load(xmlFile);
string[] descriptions = root.Descendants("description").Select(e => e.Value).ToArray();
or, when <ithem> is the root element as in the example:
string[] descriptions = root.Elements("description").Select(e => e.Value).ToArray();
Use XmlDocument to parse the XML :
var map = new XmlDocument();
map.Load("path_to_xml_file"); // you can also load it directly from a string
var descriptions = new List<string>();
var nodes = map.SelectNodes("//ithem/description"); // matches whether or not <ithem> is the root element
foreach (XmlNode node in nodes)
{
var description = node.InnerText; // node.Value is null for element nodes
descriptions.Add(description);
}
And you get it as an array from:
descriptions.ToArray();

Deserialize <table> nodes to double[][] from XML

I have an XML file (from somewhere) containing matrix values, which I wish to get into my code as double[][] objects. The XML contains table nodes, which look like standard serialized double[][] objects:
<table type="System.Double[][]"><table type="System.Double[]"><el type="System.Double">0.005</el><el type="System.Double">0.001</el><el type="System.Double">0.007</el><el type="System.Double">-0.012</el></table><table type="System.Double[]"><el type="System.Double">0.033</el><el type="System.Double">-0.146</el><el type="System.Double">-0.008</el><el type="System.Double">0.006</el></table><table type="System.Double[]"><el type="System.Double">-0.002</el><el type="System.Double">-0.004</el><el type="System.Double">-0.004</el><el type="System.Double">-0.003</el></table><table type="System.Double[]"><el type="System.Double">0</el><el type="System.Double">0</el><el type="System.Double">0</el><el type="System.Double">0</el></table></table>
Since not the whole XML is in this form, I only extract those nodes as XmlNode (since XElements don't have InnerXml). Let's call this myMatrixXmlNode.
Then, I try to put that into a MemoryStream, and then deserialize from that:
var deserializer = new XmlSerializer(typeof(double[][]));
var myMatrix = (double[][])deserializer.Deserialize(new MemoryStream(Encoding.UTF8.GetBytes(myMatrixXmlNode.InnerXml)));
This throws a "<table xmlns=''> was not expected." error, for which I have not found a solution yet, and I'm getting really annoyed by this.
Probably best to use an XDocument to parse it, like the following:
var d = XDocument.Parse(testXml);
var r = d.Element("table");
var listOfDoubleArrays = new List<double[]>();
foreach (var outerArrayItem in r.Elements())
{
double[] arr = new double[outerArrayItem.Elements().Count()]; // size by this row's elements
int i = 0;
foreach (var innerArrayItem in outerArrayItem.Elements())
{
arr[i] = System.Convert.ToDouble(innerArrayItem.Value, System.Globalization.CultureInfo.InvariantCulture);
i++;
}
listOfDoubleArrays.Add(arr);
}
double[][] result = listOfDoubleArrays.ToArray();
You cannot use the standard XmlSerializer to deserialize this XML into double[][].
The format XmlSerializer uses for double[][] looks like this:
<ArrayOfArrayOfDouble xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
<ArrayOfDouble>
<double>1</double>
</ArrayOfDouble>
<ArrayOfDouble>
<double>2</double>
</ArrayOfDouble>
</ArrayOfArrayOfDouble>
You can try parsing this XML manually using LINQ to XML, or transforming it to the corresponding format.
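The transformation route could be sketched roughly as follows: rename the table/el elements to the names XmlSerializer expects, then deserialize. The shortened two-row input is made up for illustration, and the sketch assumes every row is a table element containing only el children.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Xml.Linq;
using System.Xml.Serialization;

// Shortened sample in the same shape as the question's XML (illustrative data).
string tableXml =
    "<table type=\"System.Double[][]\">" +
    "<table type=\"System.Double[]\"><el type=\"System.Double\">0.005</el><el type=\"System.Double\">0.001</el></table>" +
    "<table type=\"System.Double[]\"><el type=\"System.Double\">0.033</el><el type=\"System.Double\">-0.146</el></table>" +
    "</table>";

// Rebuild the tree with the element names XmlSerializer uses for double[][].
XElement source = XElement.Parse(tableXml);
var converted = new XElement("ArrayOfArrayOfDouble",
    source.Elements("table").Select(row =>
        new XElement("ArrayOfDouble",
            row.Elements("el").Select(el => new XElement("double", el.Value)))));

var serializer = new XmlSerializer(typeof(double[][]));
using var reader = new StringReader(converted.ToString());
var matrix = (double[][])serializer.Deserialize(reader);

Console.WriteLine(matrix[1][1].ToString(CultureInfo.InvariantCulture));
```

Whether this is worth it over plain LINQ to XML parsing is a judgment call; it mostly helps when the rest of the code already relies on XmlSerializer.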
