Coding Platform: ASP.NET C#
I have an XML like this.
<Items>
<Map id="35">
<Terrains>
<Item id="1" row="0" column="0"/>
<Item id="1" row="0" column="1"/>
<Item id="1" row="0" column="2"/>
<Item id="1" row="0" column="3"/>
<Item id="1" row="0" column="4"/>
</Terrains>
</Map>
</Items>
I would like to minify this to
<Its>
<Map id="35">
<Te>
<It id="1" r="0" c="0"/>
<It id="1" r="0" c="1"/>
<It id="1" r="0" c="2"/>
<It id="1" r="0" c="3"/>
<It id="1" r="0" c="4"/>
</Te>
</Map>
</Its>
Then I am converting this to JSON using James Newton-King's JSON Converter.
The idea is to shrink the XML data as much as possible, because it contains tens of thousands of lines.
My questions are:
What is the optimal method to minify the XML as shown above?
Currently it is done in three steps: XML -> minified XML -> JSON. Can I do it in two steps (minify the XML while converting to JSON)?
Is James Newton-King's JSON converter overkill for this simple conversion?
Please provide code snippets as well, if possible.
I suspect GZIP (via GZipStream, or simply via IIS, noting that you need to enable dynamic compression for the JSON MIME type) would be both simpler and smaller. If you are using serialization, though, simply adding some [XmlElement(...)] / [XmlAttribute(...)] attributes with the shorter names should do it. And if size is your main concern, I would also suggest something like protobuf-net, which gives an extremely dense binary output.
If you aren't using serialization, then this looks like an ideal fit for some XSLT:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy><xsl:apply-templates select="@* | node()"/></xsl:copy>
</xsl:template>
<xsl:template match="/Items">
<Its><xsl:apply-templates/></Its>
</xsl:template>
<xsl:template match="/Items/Map/Terrains">
<Te><xsl:apply-templates/></Te>
</xsl:template>
<xsl:template match="/Items/Map/Terrains/Item">
<It id="{@id}" r="{@row}" c="{@column}"><xsl:apply-templates select="*"/></It>
</xsl:template>
</xsl:stylesheet>
(with C#:)
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load("Condense.xslt"); // cache and re-use this object; don't Load each time
xslt.Transform("Data.xml", "Smaller.xml");
Console.WriteLine("{0} vs {1}",
new FileInfo("Data.xml").Length,
new FileInfo("Smaller.xml").Length);
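The GZIP suggestion at the start of this answer could be sketched roughly as follows. This is only an illustration: the class name GzipSketch and its two methods are hypothetical helpers, not part of any library, and it assumes the XML or JSON payload is already available as a string. Highly repetitive markup such as the Item rows above tends to compress well, often by more than renaming tags would save.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: round-trips a text payload through GZIP.
public static class GzipSketch
{
    // Compresses a string (encoded as UTF-8) into a GZIP byte array.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            // Dispose the GZipStream before reading the buffer so that
            // the final compressed block and footer are written.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Restores the original string from a GZIP byte array.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

When the data is served over HTTP, enabling dynamic compression in IIS (as mentioned above) gives the same effect without any application code.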
Related
I'm trying to use XslCompiledTransform C# class to transform one xml file into another. However, the xmlns attribute is not being transferred.
My code:
XmlReader reader = XmlReader.Create("machine1.xml");
XmlWriter writer = XmlWriter.Create("machine2.xml");
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load("transform.xsl");
transform.Transform(reader, writer);
XSLT:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns="http://schemas.datacontract.org/2004/07/CMachines" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" />
<!-- Copy everything not subject to the exceptions below -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<!-- Ignore the disabled element -->
<xsl:template match="Disabled" />
</xsl:stylesheet>
Input:
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfMachine xmlns="http://schemas.datacontract.org/2004/07/CMachines" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Machine>
<Name>DellM7600</Name>
<ID>1</ID>
<Type>Laptop</Type>
<Disabled>false</Disabled>
<SerialNum>47280420</SerialNum>
</Machine>
<Machine>
<Name>DellD600</Name>
<ID>2</ID>
<Type>Laptop</Type>
<Disabled>false</Disabled>
<SerialNum>53338123</SerialNum>
</Machine>
</ArrayOfMachine>
This is the actual Output:
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfMachine xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/CMachines" >
<Machine>
<Name>DellM7600</Name>
<ID>1</ID>
<Type>Laptop</Type>
<Disabled>false</Disabled>
<SerialNum>47280420</SerialNum>
</Machine>
<Machine>
<Name>DellD600</Name>
<ID>2</ID>
<Type>Laptop</Type>
<Disabled>false</Disabled>
<SerialNum>53338123</SerialNum>
</Machine>
</ArrayOfMachine>
This is the desired output:
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfMachine xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/CMachines" >
<Machine>
<Name>DellM7600</Name>
<ID>1</ID>
<Type>Laptop</Type>
<SerialNum>47280420</SerialNum>
</Machine>
<Machine>
<Name>DellD600</Name>
<ID>2</ID>
<Type>Laptop</Type>
<SerialNum>53338123</SerialNum>
</Machine>
</ArrayOfMachine>
You were previously trying to use xpath-default-namespace in your XSLT, which is not supported in XSLT 1.0.
Instead, you need to use a namespace prefix, bound to the namespace specified in your XML, to match the Disabled element, which is in that namespace.
Try this XSLT:
<xsl:stylesheet version="1.0" xmlns:cm="http://schemas.datacontract.org/2004/07/CMachines"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" />
<!-- Copy everything not subject to the exceptions below -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<!-- Ignore the disabled element -->
<xsl:template match="cm:Disabled" />
</xsl:stylesheet>
Note the namespace prefix used is arbitrary, as long as the namespace URI matches.
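The same namespace rule applies if you do this in C# instead of XSLT: the element name has to be qualified with the namespace URI. A minimal sketch with LINQ to XML follows. This is a different technique from the XslCompiledTransform code in the question, and the class name DisabledRemover is made up for illustration.

```csharp
using System.Xml.Linq;

// Hypothetical helper: drops every Disabled element in the CMachines namespace.
public static class DisabledRemover
{
    public static XDocument RemoveDisabled(string xml)
    {
        // Same URI as the default namespace declared on ArrayOfMachine.
        XNamespace cm = "http://schemas.datacontract.org/2004/07/CMachines";
        XDocument doc = XDocument.Parse(xml);

        // Descendants("Disabled") without the namespace would match nothing,
        // for the same reason match="Disabled" matches nothing in the XSLT.
        doc.Descendants(cm + "Disabled").Remove();
        return doc;
    }
}
```

Usage would look like DisabledRemover.RemoveDisabled(File.ReadAllText("machine1.xml")).Save("machine2.xml"), with the file names taken from the question.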
Remove namespaces, attributes, Xsi from Soap response using code or XSLT
I want to transform a SOAP response into plain XML (without namespaces or attributes) using C# code (Serializer, XMLDoc, XDoc) or XSLT.
Here is the SOAP response:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:ns1="urn:Magento"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<ns1:catalogProductInfoResponse>
<info xsi:type="ns1:catalogProductReturnEntity">
<product_id xsi:type="xsd:string">3459</product_id>
<sku xsi:type="xsd:string">HK-BP001</sku>
<categories SOAP-ENC:arrayType="xsd:string[0]" xsi:type="ns1:ArrayOfString"/>
<websites SOAP-ENC:arrayType="xsd:string[7]" xsi:type="ns1:ArrayOfString">
<item xsi:type="xsd:string">1</item>
</websites>
<created_at xsi:type="xsd:string">2016-04-19 01:45:35</created_at>
<has_options xsi:type="xsd:string">1</has_options>
<special_from_date xsi:type="xsd:string">2016-04-19 00:00:00</special_from_date>
<tier_price SOAP-ENC:arrayType="ns1:catalogProductTierPriceEntity[0]" xsi:type="ns1:catalogProductTierPriceEntityArray"/>
<custom_design xsi:type="xsd:string">ultimo/default</custom_design>
<enable_googlecheckout xsi:type="xsd:string">1</enable_googlecheckout>
</info>
</ns1:catalogProductInfoResponse>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
I want the transformed XML to look like this:
<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<Body>
<catalogProductInfoResponse>
<info>
<product_id>3459</product_id>
<sku>HK-BP001</sku>
<categories/>
<websites>
<item>1</item>
</websites>
<created_at>2016-04-19 01:45:35</created_at>
<has_options>1</has_options>
<special_from_date>2016-04-19 00:00:00</special_from_date>
<tier_price/>
<custom_design>ultimo/default</custom_design>
<enable_googlecheckout>1</enable_googlecheckout>
</info>
</catalogProductInfoResponse>
</Body>
</Envelope>
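You can use XSLT. The template below rebuilds every element under its local name only; attributes are dropped because they are never copied:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>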
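Since the question also asks for a C# option, here is a rough LINQ to XML equivalent of the template above. It is a sketch only: NamespaceStripper is a hypothetical name, and comments and processing instructions are simply discarded.

```csharp
using System.Xml.Linq;

// Hypothetical helper: copies an element tree using local names only.
public static class NamespaceStripper
{
    // Attributes (including xmlns declarations and xsi:type) are not copied;
    // text content is kept; child elements are processed recursively.
    public static XElement Strip(XElement source)
    {
        var result = new XElement(source.Name.LocalName);
        foreach (XNode node in source.Nodes())
        {
            var child = node as XElement;
            var text = node as XText;
            if (child != null)
            {
                result.Add(Strip(child));
            }
            else if (text != null)
            {
                result.Add(new XText(text));
            }
        }
        return result;
    }
}
```

Calling NamespaceStripper.Strip(XElement.Parse(soapResponse)) returns a tree whose root is Envelope, with Body, catalogProductInfoResponse and info nested inside it.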
I have been searching for a way to convert XML into CSV, but I cannot find one that matches my case because my XML structure is different.
The XML structure looks like this:
<VWSRecipeFile>
<EX_Extrusion User="ABC" Version="1.0" Description="" LastChange="41914.7876341204">
<Values>
<C22O01_A_TempFZ1_Set Item="A_TempFZ1_Set" Type="4" Hex="42700000" Value="60"/>
<C13O02_A_TempHZ2_Set Item="A_TempHZ2_Set" Type="4" Hex="43430000" Value="195"/>
<C13O03_A_TempHZ3_Set Item="A_TempHZ3_Set" Type="4" Hex="43430000" Value="195"/>
</Values>
</EX_Extrusion>
</VWSRecipeFile>
Expected CSV Format
A_TempFZ1_Set,A_TempHZ2_Set,A_TempHZ3_Set
60,195,195
I can produce the expected CSV format with the stylesheet below, but I don't know whether it is the best way to do it. Any suggestion is appreciated.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="no"/>
<xsl:template match="/VWSRecipeFile">
<xsl:for-each select="EX_Extrusion/Values/*">
<xsl:value-of select="concat(@Item,',')" />
</xsl:for-each>
<xsl:text>&#xA;</xsl:text>
<xsl:for-each select="EX_Extrusion/Values/*">
<xsl:value-of select="concat(@Value,',')" />
</xsl:for-each>
<xsl:text>&#xA;</xsl:text>
</xsl:template>
</xsl:stylesheet>
Thanks
One way to do this is with XSLT, the language designed for transforming XML. You can certainly parse the XML in C# instead, but I prefer XSLT because it's cleaner.
You define an external XSLT file, then call it within C# to do the transform.
Edit: added new columns based on new requirements.
File C:\XmlToCSV.xslt (&#xA; is the newline character)
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="no"/>
<xsl:template match="/VWSRecipeFile">
<xsl:variable name="User" select="EX_Extrusion/@User"/>
<xsl:variable name="Version" select="EX_Extrusion/@Version"/>
<xsl:variable name="Description" select="EX_Extrusion/@Description"/>
<xsl:variable name="LastChange" select="EX_Extrusion/@LastChange"/>
<xsl:text>Item,Type,Hex,Value,User,Version,Description,LastChange&#xA;</xsl:text>
<xsl:for-each select="EX_Extrusion/Values/*">
<xsl:value-of select="concat(@Item,',',@Type,',',@Hex,',',@Value,',',$User,',',$Version,',',$Description,',',$LastChange,'&#xA;')"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Apply the transform with XslCompiledTransform:
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load("C:\\XmlToCSV.xslt");
xslt.Transform("InputFile.xml", "OutputFile.csv");
Adjust it based on your needs.
The basic idea is to iterate through the child nodes of Values, select the attributes you want from each node, and write them to a file separated by commas. Simply give the file a .csv extension. If you want something ready-made, check this out.
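A minimal sketch of that iteration, using LINQ to XML, might look like the following. The class name RecipeCsv is made up, the element and attribute names come from the question, and the code assumes the values contain no commas, quotes, or line breaks (no CSV escaping is done).

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: builds the two-line CSV from the question.
public static class RecipeCsv
{
    public static string ToCsv(string xml)
    {
        // Every child of VWSRecipeFile/EX_Extrusion/Values, whatever its tag name.
        var values = XDocument.Parse(xml)
            .Root
            .Elements("EX_Extrusion")
            .Elements("Values")
            .Elements()
            .ToList();

        // First line: the Item attributes; second line: the Value attributes.
        string header = string.Join(",", values.Select(v => (string)v.Attribute("Item")));
        string row = string.Join(",", values.Select(v => (string)v.Attribute("Value")));
        return header + "\n" + row + "\n";
    }
}
```

Writing the result with File.WriteAllText("OutputFile.csv", RecipeCsv.ToCsv(xml)) produces the file. Unlike the stylesheet in the question, string.Join leaves no trailing comma on either line.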
XSLT is one way to do it. Alternatively, you can use Cinchoo ETL, an open source library that can parse XML and produce CSV the way you want it.
string xml = @"<VWSRecipeFile>
<EX_Extrusion User=""ABC"" Version=""1.0"" Description="""" LastChange=""41914.7876341204"">
<Values>
<C22O01_A_TempFZ1_Set Item=""A_TempFZ1_Set"" Type=""4"" Hex=""42700000"" Value=""60""/>
<C13O02_A_TempHZ2_Set Item=""A_TempHZ2_Set"" Type=""4"" Hex=""43430000"" Value=""195""/>
<C13O03_A_TempHZ3_Set Item=""A_TempHZ3_Set"" Type=""4"" Hex=""43430000"" Value=""196""/>
</Values>
</EX_Extrusion>
</VWSRecipeFile>";
StringBuilder sb = new StringBuilder();
using (var p = ChoXmlReader.LoadText(xml).WithXPath("/Values/*"))
{
using (var w = new ChoCSVWriter(sb)
.WithFirstLineHeader()
)
w.Write(p.ToDictionary(r => r.Item, r => r.Value).ToDynamic());
}
Console.WriteLine(sb.ToString());
Output:
A_TempFZ1_Set,A_TempHZ2_Set,A_TempHZ3_Set
60,195,196
Disclaimer: I'm the author of this library.
I am currently trying to flatten a deep-structured XML document in C# so that every element value is converted to an attribute.
The XML structure is as follows:
<members>
<member xmlns="mynamespace" id="1" status="1">
<sensitiveData>
<notes/>
<url>someurl</url>
<altUrl/>
<date1>somedate</date1>
<date2>someotherdate</date2>
<description>some description</description>
<tags/>
<category>some category</category>
</sensitiveData>
<contacts>
<contact contactId="1">
<contactPerson>some contact person</contactPerson>
<phone/>
<mobile>mobile number</mobile>
<email>some@email.com</email>
</contact>
</contacts>
</member>
</members>
What I want it to look like is this:
<members>
<member xmlns="mynamespace" id="1" status="1" notes="" url="someurl" altUrl="" date1="somedate" date2="someotherdate" description="some description" tags="" category="some category" contactId="1" contactPerson="some contact person" phone="" mobile="mobile number" email="some@email.com" />
</members>
I could just parse away on the element names and their attributes, but since this XML comes from a webservice that I can't control, I have to create some sort of dynamic parser to flatten this as the structure can change at some point.
It is worth noting that the XML arrives from the web service as an XElement.
Has anyone tried to do this before and would be helpful to share how? :-) It would be greatly appreciated!
Thanks a lot in advance.
All the best,
Bo
Try this:
var doc = XDocument.Parse(@"<members>...</members>");
var result = new XDocument(
new XElement(doc.Root.Name,
from x in doc.Root.Elements()
select new XElement(x.Name,
from y in x.Descendants()
where !y.HasElements
select new XAttribute(y.Name.LocalName, y.Value))));
Result:
<members>
<member notes="" url="someurl" altUrl="" date1="somedate" date2="someotherdate" description="some description" tags="" category="some category" contactPerson="some contact person" phone="" mobile="mobile number" email="some@email.com" xmlns="mynamespace" />
</members>
You could use this XSLT 1.0 stylesheet. You might want to modify how it handles multiple <contact> elements.
Input XML
<members>
<member xmlns="mynamespace" id="1" status="1">
<sensitiveData>
<notes/>
<url>someurl</url>
<altUrl/>
<date1>somedate</date1>
<date2>someotherdate</date2>
<description>some description</description>
<tags/>
<category>some category</category>
</sensitiveData>
<contacts>
<contact contactId="1">
<contactPerson>some contact person</contactPerson>
<phone/>
<mobile>mobile number</mobile>
<email>some@email.com</email>
</contact>
<contact contactId="2">
<contactPerson>second contact person</contactPerson>
<phone/>
<mobile>second mobile number</mobile>
<email>second some@email.com</email>
</contact>
</contacts>
</member>
</members>
XSLT 1.0
<xsl:stylesheet version="1.0" xmlns:my="mynamespace" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:apply-templates select="node()|@*"/>
</xsl:template>
<xsl:template match="members|my:member">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="node()[text()][ancestor::my:member]|@*[ancestor::my:member]">
<xsl:variable name="vContact">
<xsl:if test="ancestor-or-self::my:contact">
<xsl:value-of select="count(ancestor-or-self::my:contact/preceding-sibling::my:contact) + 1"/>
</xsl:if>
</xsl:variable>
<xsl:attribute name="{name()}{$vContact}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
XML Output
<members>
<member xmlns="mynamespace" id="1" status="1" url="someurl" date1="somedate"
date2="someotherdate"
description="some description"
category="some category"
contactId1="1"
contactPerson1="some contact person"
mobile1="mobile number"
email1="some@email.com"
contactId2="2"
contactPerson2="second contact person"
mobile2="second mobile number"
email2="second some@email.com"/>
</members>
I think dtb's answer is the best way to do it. However, there is one important issue: add a second contact and dtb's code will crash, because a member can have more than one contact but an element cannot have duplicate attributes. To work around that, I updated the code to select only distinct attributes by implementing IEqualityComparer<XAttribute>.
The updated LINQ expression looks like this:
var result = new XDocument(new XElement(doc.Root.Name,
from x in doc.Root.Elements()
select new XElement(x.Name, (from y in x.Descendants()
where !y.HasElements
select new XAttribute(y.Name.LocalName, y.Value)).Distinct(new XAttributeEqualityComparer())
)));
As you can see, a Distinct call was added, using the overload that takes a custom equality comparer (XAttributeEqualityComparer):
class XAttributeEqualityComparer : IEqualityComparer<XAttribute>
{
public bool Equals(XAttribute x, XAttribute y)
{
return x.Name == y.Name;
}
public int GetHashCode(XAttribute obj)
{
return obj.Name.GetHashCode();
}
}
You could write an XSLT transform to convert the elements to attributes.
Are you doing this to create another XML document, or just to make your processing simpler? If the former, put each value into a map whenever you come across a leaf node. You can then iterate over the key-value pairs in the map to reconstruct an XML tag with only attributes.
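The map idea could be sketched in C# roughly as below. The names are hypothetical, and one design choice is an assumption rather than something the question specifies: when the same name occurs twice (for example, two contact elements), the first value wins. Existing attributes such as id, status and contactId are carried over as well.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper: flattens one element into a single attribute-only element.
public static class Flattener
{
    public static XElement Flatten(XElement member)
    {
        var map = new Dictionary<string, string>();

        // Attributes already on the element itself (xmlns declarations are skipped).
        foreach (XAttribute a in member.Attributes())
        {
            if (!a.IsNamespaceDeclaration) map[a.Name.LocalName] = a.Value;
        }

        foreach (XElement e in member.Descendants())
        {
            // Attributes of nested elements, e.g. contactId.
            foreach (XAttribute a in e.Attributes())
            {
                if (!a.IsNamespaceDeclaration && !map.ContainsKey(a.Name.LocalName))
                    map[a.Name.LocalName] = a.Value;
            }
            // Leaf elements become name/value pairs; first occurrence wins.
            if (!e.HasElements && !map.ContainsKey(e.Name.LocalName))
                map[e.Name.LocalName] = e.Value;
        }

        // Rebuild a single element (same name and namespace) from the map.
        var flat = new XElement(member.Name);
        foreach (KeyValuePair<string, string> pair in map)
        {
            flat.SetAttributeValue(pair.Key, pair.Value);
        }
        return flat;
    }
}
```

Because nothing about the structure is hard-coded, the same method keeps working if the web service adds or renames child elements.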
Here's my problem. I have two XML files with identical structure, but the second one contains only a few of the nodes found in the first.
File1
<root>
<alpha>111</alpha>
<beta>22</beta>
<gamma></gamma>
<delta></delta>
</root>
File2
<root>
<beta>XX</beta>
<delta>XX</delta>
</root>
This is what the result should look like:
<root>
<alpha>111</alpha>
<beta>22</beta>
<gamma></gamma>
<delta>XX</delta>
</root>
Basically, if the content of any node in File1 is blank, the value should be read from File2 (if it exists there).
I tried Microsoft's XmlDiff API, but it didn't work for me (the patch process didn't apply the changes to the source document). I'm also a bit worried about the DOM approach it uses, because of the size of the XML I'll be dealing with.
Can you please suggest a good way of doing this?
I'm using C# 2.
Here is a slightly simpler and more efficient solution than the one proposed by Alastair (see my comment on his solution).
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vFile2"
select="document('File2.xml')"/>
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[not(text())]">
<xsl:copy>
<xsl:copy-of
select="$vFile2/*/*[name() = name(current())]/text()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document:
<root>
<alpha>111</alpha>
<beta>22</beta>
<gamma></gamma>
<delta></delta>
</root>
produces the wanted result:
<root>
<alpha>111</alpha>
<beta>22</beta>
<gamma></gamma>
<delta>XX</delta>
</root>
In XSLT you can use the document() function to retrieve nodes from File2 if you encounter an empty node in File1. Something like:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="root/*[.='']">
<xsl:variable name="file2node">
<xsl:copy-of select="document('File2.xml')/root/*[name()=name(current())]"/>
</xsl:variable>
<xsl:choose>
<xsl:when test="$file2node != ''">
<xsl:copy-of select="$file2node"/>
</xsl:when>
<xsl:otherwise>
<xsl:copy/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This merge seems very specific.
If that is the case, just write some code to load both xml files and apply the changes as you described.
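As a rough illustration of that approach, the sketch below uses LINQ to XML. Note the assumptions: this needs .NET 3.5 or later (so it does not apply to a strict C# 2 environment), both documents are loaded fully into memory, and only the direct children of the root are compared, as in the sample files. The class name XmlMerge is made up.

```csharp
using System.Xml.Linq;

// Hypothetical helper: fills blank nodes of file1 with values from file2.
public static class XmlMerge
{
    public static XDocument FillBlanks(XDocument file1, XDocument file2)
    {
        foreach (XElement target in file1.Root.Elements())
        {
            // Only empty leaf nodes are candidates for replacement.
            if (target.HasElements || target.Value.Length > 0) continue;

            // Look for a node with the same name in the second file.
            XElement source = file2.Root.Element(target.Name);
            if (source != null) target.Value = source.Value;
        }
        return file1;
    }
}
```

With the sample files, XmlMerge.FillBlanks(XDocument.Load("File1.xml"), XDocument.Load("File2.xml")) leaves alpha, beta and gamma unchanged and sets delta to XX.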