Compare XML for Differences - c#

I am writing some C# code with the intention of periodically downloading an XML file, comparing the resulting XML with a previously downloaded version of XML (to detect changes), and then "actioning" the changes (via appropriate CRUD statements) by updating database records which reflect the entities in the XML.
I need some help with the "comparing the resulting XML with a previously downloaded version to detect changes" part of the requirements.
So... considering the following two XML documents which have minor differences...
Original
<ROOT>
<Stock>
<Vehicle id="2574074">
<DealerName>Super Cars London</DealerName>
<FriendlyName>Ford Ranger 3.2 double cab 4x4 XLT auto</FriendlyName>
<ModelName>Ranger</ModelName>
<MakeName>Ford</MakeName>
<Registration>DG55TPG</Registration>
<Price>40990</Price>
<Colour>WHITE</Colour>
<Year>2014</Year>
<Mileage>52000</Mileage>
<Images>
<Image Id="4771304" ThumbUrl="http://www.somewhere.com/GetImage.aspx?ImageId=4771304&amp;Type=6&amp;Width=60&amp;Height=60&amp;FeedId=42" FullUrl="http://www.somewhere.com/GetImage.aspx?ImageId=4771304&amp;Type=6&amp;Width=640&amp;FeedId=42" LastModified="2016-02-02T08:24:51.48" Priority="1"/>
</Images>
</Vehicle>
<Vehicle id="2648665">
<DealerName>Super Cars London</DealerName>
<FriendlyName>BMW 320i</FriendlyName>
<ModelName>3 Series</ModelName>
<MakeName>BMW</MakeName>
<Registration>CN03YZG</Registration>
<Price>24990</Price>
<Colour>WHITE</Colour>
<Year>2013</Year>
<Mileage>96000</Mileage>
<Images/>
</Vehicle>
</Stock>
</ROOT>
New
<ROOT>
<Stock>
<Vehicle id="2575124">
<DealerName>Supercars London</DealerName>
<FriendlyName>Ford Ranger 3.2 double cab 4x4 XLT auto</FriendlyName>
<ModelName>Ranger</ModelName>
<MakeName>Ford</MakeName>
<Registration>DK08FKP</Registration>
<Price>43990</Price>
<Colour>WHITE</Colour>
<Year>2014</Year>
<Mileage>30000</Mileage>
<Images>
<Image Id="5119812" ThumbUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5119812&amp;Type=6&amp;Width=60&amp;Height=60&amp;FeedId=42" FullUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5119812&amp;Type=6&amp;Width=640&amp;FeedId=42" LastModified="2016-04-11T13:08:42.81" Priority="1"/>
</Images>
</Vehicle>
<Vehicle id="2648665">
<DealerName>Super Cars London</DealerName>
<FriendlyName>BMW 320i</FriendlyName>
<ModelName>3 Series</ModelName>
<MakeName>BMW</MakeName>
<Registration>CN03YZG</Registration>
<Price>24990</Price>
<Colour>BRILLIANT WHITE</Colour>
<Year>2013</Year>
<Mileage>96000</Mileage>
<Images>
<Image Id="5201856" ThumbUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201856&amp;Type=6&amp;Width=60&amp;Height=60&amp;FeedId=42" FullUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201856&amp;Type=6&amp;Width=640&amp;FeedId=42" LastModified="2016-04-25T12:12:05.827" Priority="1"/>
<Image Id="5201857" ThumbUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201857&amp;Type=6&amp;Width=60&amp;Height=60&amp;FeedId=42" FullUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201857&amp;Type=6&amp;Width=640&amp;FeedId=42" LastModified="2016-04-25T12:12:09.117" Priority="2"/>
<Image Id="5201858" ThumbUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201858&amp;Type=6&amp;Width=60&amp;Height=60&amp;FeedId=42" FullUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201858&amp;Type=6&amp;Width=640&amp;FeedId=42" LastModified="2016-04-25T12:12:13.59" Priority="3"/>
<Image Id="5201859" ThumbUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201859&amp;Type=6&amp;Width=60&amp;Height=60&amp;FeedId=42" FullUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201859&amp;Type=6&amp;Width=640&amp;FeedId=42" LastModified="2016-04-25T12:12:18.453" Priority="4"/>
<Image Id="5201860" ThumbUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201860&amp;Type=6&amp;Width=60&amp;Height=60&amp;FeedId=42" FullUrl="http://www.somewhere.com/GetImage.aspx?ImageId=5201860&amp;Type=6&amp;Width=640&amp;FeedId=42" LastModified="2016-04-25T12:12:22.853" Priority="5"/>
</Images>
</Vehicle>
</Stock>
</ROOT>
Summary of Differences
1. Vehicle id="2575124" is not present in the original. This represents a "create".
2. Vehicle id="2574074" is not present in the new. This represents a "delete".
3. Vehicle id="2648665" (which is present in both original and new) has a different <Colour> (WHITE -> BRILLIANT WHITE). This represents an "update".
4. Vehicle id="2648665" also has new <Image> nodes in the new. This represents a "create" (as images will be 1:M with the vehicle in the database).
I have looked at XMLDiff to generate a DiffGram with add/change/remove instructions, but I can't see a way to make it generate a DiffGram that represents the changes I've summarised. For example, it sees changes 1 and 2 as a "change" (<xd:change match="@id">2648665</xd:change>) rather than as the removal of one vehicle and the addition of another.
Is there a way to do this with XMLDiff?
Or, is there a "better" way to achieve the result I'm looking for?

I've found that LINQ can do a pretty good job of parsing the XML for differences, e.g...
XDocument xNewVehicle = new XDocument(new XElement("UsedStock",
from newVehicle in newXml.Descendants("Vehicle")
join oldVehicle in oldXml.Descendants("Vehicle")
on newVehicle.Attributes("id").First().Value equals oldVehicle.Attributes("id").First().Value into oldVehicles
where !oldVehicles.Any() // where the vehicle exists in new but not in old
|| newVehicle.ToString() != oldVehicles.First().ToString() // where the new vehicle is not the same as the old vehicle
select newVehicle));
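The query above picks up vehicles that are new or changed, but not vehicles that have disappeared. A minimal sketch of classifying all three cases follows. It assumes the id attribute is present and unique for every Vehicle in each document; VehicleDiff, IndexById and Compare are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class VehicleDiff
{
    // Hypothetical helper: indexes every <Vehicle> by its id attribute.
    // Assumption: each id occurs exactly once per document (ToDictionary throws otherwise).
    private static Dictionary<string, XElement> IndexById(XDocument doc)
    {
        return doc.Descendants("Vehicle")
                  .ToDictionary(v => (string)v.Attribute("id"));
    }

    public static void Compare(XDocument oldXml, XDocument newXml,
        out List<XElement> created, out List<XElement> deleted, out List<XElement> updated)
    {
        Dictionary<string, XElement> oldById = IndexById(oldXml);
        Dictionary<string, XElement> newById = IndexById(newXml);

        // Present in new but not in old: candidates for INSERT.
        created = newById.Where(p => !oldById.ContainsKey(p.Key))
                         .Select(p => p.Value).ToList();

        // Present in old but not in new: candidates for DELETE.
        deleted = oldById.Where(p => !newById.ContainsKey(p.Key))
                         .Select(p => p.Value).ToList();

        // Present in both, but the subtree differs: candidates for UPDATE.
        updated = newById.Where(p => oldById.ContainsKey(p.Key)
                                  && !XNode.DeepEquals(p.Value, oldById[p.Key]))
                         .Select(p => p.Value).ToList();
    }
}
```

For an updated vehicle, the same keyed comparison could be repeated on its <Image> children (keyed on the Id attribute) to separate image creates from image deletes.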

Related

Creating XML Banking document from template

I have this template from a bank that is used to make payments on bank account transfers.
See xml below. I have included the sample data that has to be entered when sending the file to the bank.
<?xml version="1.0" encoding="UTF-8"?>
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<CstmrCdtTrfInitn>
<GrpHdr>
<MsgId>Cart Urgent28052018_57894</MsgId>
<CreDtTm>2018-06-29T11:52:23</CreDtTm>
<NbOfTxs>1</NbOfTxs>
<CtrlSum>667896.00</CtrlSum>
<InitgPty>
<Nm>CART LIMITED</Nm>
<Id>
<OrgId>
<Othr>
<Id>S001234/PJones</Id>
<SchmeNm>
<Cd>CUST</Cd>
</SchmeNm>
</Othr>
</OrgId>
</Id>
</InitgPty>
</GrpHdr>
<PmtInf>
<PmtInfId>Payment for addon development SAP B1</PmtInfId>
<PmtMtd>TRF</PmtMtd>
<BtchBookg>false</BtchBookg>
<NbOfTxs>1</NbOfTxs>
<CtrlSum>667896.00</CtrlSum>
<PmtTpInf>
<InstrPrty>HIGH</InstrPrty>
</PmtTpInf>
<ReqdExctnDt>2018-06-29</ReqdExctnDt>
<Dbtr>
<Nm>CART LIMITED</Nm>
</Dbtr>
<DbtrAcct>
<Id>
<Othr>
<Id>0112345110846</Id>
</Othr>
</Id>
<Ccy>KES</Ccy>
</DbtrAcct>
<DbtrAgt>
<FinInstnId>
<BIC>SBICKENX</BIC>
</FinInstnId>
</DbtrAgt>
<CdtTrfTxInf>
<PmtId>
<EndToEndId>156335578965</EndToEndId>
</PmtId>
<Amt>
<InstdAmt Ccy="KES">667896.00</InstdAmt>
</Amt>
<ChrgBr>DEBT</ChrgBr>
<CdtrAgt>
<FinInstnId>
<BIC>DTKEKENA</BIC>
<ClrSysMmbId>
<MmbId>63000</MmbId>
</ClrSysMmbId>
</FinInstnId>
</CdtrAgt>
<Cdtr>
<Nm>EOH SEAL LTD</Nm>
<PstlAdr>
<StrtNm>P.O. Box 10496</StrtNm>
<TwnNm>Nairobi</TwnNm>
<Ctry>KE</Ctry>
<AdrLine>P.O. Box 10496</AdrLine>
<AdrLine>00100 NAIROBI</AdrLine>
</PstlAdr>
</Cdtr>
<CdtrAcct>
<Id>
<Othr>
<Id>0112406001</Id>
</Othr>
</Id>
</CdtrAcct>
<RmtInf>
<Ustrd>Cart Urgent28052018_57894</Ustrd>
</RmtInf>
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>
The file is quite long because it has to follow the given format. The black text (the element values) represents the details to be passed into the XML. To check that I understood what data goes where, I filled it in manually and sent it to the bank for testing. That all works now.
I have a SAP addon program that captures details from a form and generates a list. Each payment must follow this structure.
Looking at the below:
<Nm>CART LIMITED</Nm>
<Id>
<OrgId>
<Othr>
<Id>S001234/PJones</Id>
<SchmeNm>
<Cd>CUST</Cd>
</SchmeNm>
</Othr>
</OrgId>
</Id>
Is creating a class with all the properties in the template the best way to create the XML I need?
How do I nest the
<Id>
<OrgId>
<Othr>
as in the case above?
Also the <CtrlSum>667896.00</CtrlSum> is found in the group header and payment info tags. How do I deal with this?
From what I can see, the problem is that your SAP add-on lets you specify multiple payments, but the XML template from the bank doesn't show the structure for multiple payments, so you need to get that information from the bank before you can work with it.
Once you have it, you can use a more complete XML template with Visual Studio or another tool to generate the correct classes for working with the XML.
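On the nesting and <CtrlSum> questions, here is a minimal LINQ to XML sketch in which the nesting of the constructor calls mirrors the nesting of the tags. It assumes that <CtrlSum> is the total of the instructed amounts, computed once and written to both <GrpHdr> and <PmtInf>; that matches the sample, but it should be confirmed with the bank. PainBuilder, BuildInitgPty and ControlSum are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public static class PainBuilder
{
    // Namespace copied from the xmlns attribute of the bank's template.
    public static readonly XNamespace Ns = "urn:iso:std:iso:20022:tech:xsd:pain.001.001.03";

    // Nested constructor calls produce the nested <Id><OrgId><Othr> structure.
    public static XElement BuildInitgPty(string name, string customerId)
    {
        return new XElement(Ns + "InitgPty",
            new XElement(Ns + "Nm", name),
            new XElement(Ns + "Id",
                new XElement(Ns + "OrgId",
                    new XElement(Ns + "Othr",
                        new XElement(Ns + "Id", customerId),
                        new XElement(Ns + "SchmeNm",
                            new XElement(Ns + "Cd", "CUST"))))));
    }

    // Assumption: CtrlSum is the sum of the payment amounts, formatted with two decimals.
    // Call it once per place the element is needed; an XElement can only have one parent.
    public static XElement ControlSum(IEnumerable<decimal> amounts)
    {
        return new XElement(Ns + "CtrlSum",
            amounts.Sum().ToString("0.00", CultureInfo.InvariantCulture));
    }
}
```

Generating classes from the bank's schema and serializing them is an equally valid route; the sketch only shows that the nesting itself needs no special handling.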

how to delete root tag in xml C#

Hi, the following XML is my input:
<?xml version="1.0" encoding="utf-8"?>
<units xmlns="http://www.elsevier.com/xml/ani/ani" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ce="http://www.elsevier.com/xml/ani/common" xsi:schemaLocation="http://www.elsevier.com/xml/ani/ani http://www.elsevier.com/xml/ani/ani512-input-CAR.xsd">
<unit type="journal" xmlns="">
<unit-info>
<timestamp>2017-01-08T02:03:14Z</timestamp>
<order-id>12535892</order-id>
<parcel-id>none</parcel-id>
<unit-id>4756202</unit-id>
</unit-info>
<unit-content>
<bibrecord>
<item-info>
<status state="new" stage="S300" />
</item-info>
<head>
<citation-info>
<citation-type code="jo" />
<citation-language xmllang="ENG" />
<abstract-language xmllang="ENG" />
<author-keywords>
<author-keyword>Stroke</author-keyword>
<author-keyword> cerebral ischaemia</author-keyword>
<author-keyword> Neuro-protection</author-keyword>
<author-keyword> Neuro-protective agents</author-keyword>
</author-keywords>
</citation-info>
<citation-title xmllang="ENG" original="y">
<titletext>PATHOGENESIS AND NEURO-PROTECTIVE AGENTS OF STROKE</titletext>
</citation-title>
<abstracts>
<abstract>
<cepara>ABSTRACT: Stroke remains worlds second leading cause of mortality; and globally most frequent cause of long-lasting disabilities. The ischaemic pathophysiologic cascade leading to neuronal damage consists of peri-infarct depolarization, excitotoxicity, inflammation, oxidative stress, and apoptosis. Despite plethora of experimental evidences and advancement into the development of treatments, clinical treatment of acute stroke still remains challenging. Neuro-protective agents, as novel therapeutic strategy confer neuro-protection by targeting the pathophysiologic mechanism of stroke. The aim of this review is discussion of summary of the literature on stroke pathophysiology, current preclinical research findings of neuroprotective agents in stroke and possible factors that were responsible for the failure of these agents to translate in human stroke therapies.</cepara>
</abstract>
</abstracts>
<correspondence>
<person>
<ceinitials>M.</ceinitials>
<cesurname>Mubarak</cesurname>
</person>
<affiliation>
<organization> Bayero University Kano</organization>
<organization> Nigeria. Email: mubarakmahmad#yahoo.com</organization>
</affiliation>
</correspondence>
<root>
<author Seq="0">
<Inital>A.El</Inital>
<Surname>Khattabi</Surname>
<Givenname>Abdelkrim</Givenname>
</author>
</root>
</head>
</bibrecord>
</unit-content>
From this XML I need to delete only the root tag, but I need to keep all of its child elements:
<author Seq="0">
<Inital>A.El</Inital>
<Surname>Khattabi</Surname>
<Givenname>Abdelkrim</Givenname>
</author>
How can I delete the root tag alone? I tried the following code:
XDocument CarXML = new XDocument();
CarXML.Add(Root);
CarXML.Descendants("root").Remove();
CarXML.Save(@"CAR.XML");
But this code deletes the tag along with all of its child elements. How can I delete only the root element and keep its children?
CarXML.ReplaceNodes(CarXML.Descendants("author").FirstOrDefault());
This replaces the whole content of the XML with the first descendant named author.
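If the goal is instead to keep the rest of the document and drop only the <root> wrapper, one option is to replace the wrapper with its own child nodes. This is a minimal sketch; XmlUnwrap and Unwrap are hypothetical names, and it assumes the wrapper elements are not nested inside one another.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlUnwrap
{
    // Replaces every element called elementName with its own child nodes,
    // so the wrapper tag disappears and the children stay in the same position.
    // Assumption: wrappers are not nested inside each other.
    public static void Unwrap(XContainer container, XName elementName)
    {
        // ToList() takes a snapshot, because the tree is modified inside the loop.
        foreach (XElement wrapper in container.Descendants(elementName).ToList())
        {
            wrapper.ReplaceWith(wrapper.Nodes());
        }
    }
}
```

With the code from the question, this would be called as XmlUnwrap.Unwrap(CarXML, "root") before CarXML.Save(...).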

Most efficient way to read two nodes from multiple XML files?

I'm looking for a fast and efficient way to retrieve 2 strings from multiple XML files.
The files themselves aren't that big, only 50 KB to 100 KB, and the number of files can range from none to 100. I've included a sample of a very small XML file I'll be using. All XML files will have the same format, and there are only 2 things I need to know from each file, namely:
<Company>test bv.</Company>
<Ship>sss testing</Ship>
I want to store these strings in a struct/class, since I cannot use Tuple in .NET 3.5 (or am I mistaken?).
So the question is: what is the most efficient way to do this? Using XDocument? Using XmlReader? Or something else?
<?xml version="1.0" encoding="utf-8"?>
<Collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Customer>
<ID>-1</ID>
<Updated>true</Updated>
<Company>test bv.</Company>
<Ship>sss testing</Ship>
<Adress>fggfgggff gvg</Adress>
<City>fgcgg</City>
<ZipCode>5454fc</ZipCode>
<Country>fcgggff</Country>
<BTW_Nmbr>55</BTW_Nmbr>
<IsTempCustomer>true</IsTempCustomer>
<PhoneNumbers>
<ContactData>
<ID>21</ID>
<Updated>true</Updated>
<Description>455656567</Description>
<Name>ghbbvh</Name>
<IsSendable>false</IsSendable>
</ContactData>
<ContactData>
<ID>22</ID>
<Updated>true</Updated>
<Description>2315098234146124302134</Description>
<Name>asdfawegaebf</Name>
<IsSendable>false</IsSendable>
</ContactData>
</PhoneNumbers>
<EmailAdresses />
</Customer>
<Order>
<ID>-1</ID>
<Updated>true</Updated>
<OrderNr>10330200</OrderNr>
<OrderDate>1-1-2005</OrderDate>
<StartDate>10-9-2013</StartDate>
<ExpirationDate>20-10-2013</ExpirationDate>
<Executor>Andre de Zwart</Executor>
<Executors />
<Reference />
<OrderDetail />
<IsDigital>true</IsDigital>
</Order>
<Materials>
<MaterialData>
<ID>108</ID>
<Updated>true</Updated>
<Description>ffdffggfg</Description>
<Amount>34</Amount>
</MaterialData>
<MaterialData>
<ID>109</ID>
<Updated>true</Updated>
<Description>ffccff</Description>
<Amount>45</Amount>
</MaterialData>
</Materials>
<HourExpenses>
<HourExpensesData>
<ID>43850</ID>
<Updated>true</Updated>
<Date>2013-10-06T00:00:00</Date>
<Notes>lala</Notes>
<Day>Sunday</Day>
<Hours>0.01</Hours>
<BusinessHours>0.01</BusinessHours>
<TravledHoursTo>0</TravledHoursTo>
<TravledHoursFrom>0</TravledHoursFrom>
<Start>2013-10-06T12:27:00</Start>
<Stop>2013-10-06T12:27:00</Stop>
</HourExpensesData>
<HourExpensesData>
<ID>43849</ID>
<Updated>true</Updated>
<Date>2013-09-17T00:00:00</Date>
<Notes>oke dus ik ben nu lekker aan het werk en ik typ wat spul er bij</Notes>
<Day>Tuesday</Day>
<Hours>0</Hours>
<BusinessHours>0.01</BusinessHours>
<TravledHoursTo>0</TravledHoursTo>
<TravledHoursFrom>0</TravledHoursFrom>
<Start>2013-09-17T12:31:31</Start>
<Stop>2013-09-17T12:31:32</Stop>
</HourExpensesData>
<HourExpensesData>
<ID>43855</ID>
<Updated>true</Updated>
<Date>2013-10-03T00:00:00</Date>
<Notes>test</Notes>
<Day>Thursday</Day>
<Hours>0</Hours>
<BusinessHours>0</BusinessHours>
<TravledHoursTo>12</TravledHoursTo>
<TravledHoursFrom>12</TravledHoursFrom>
<Start>0001-01-01T00:00:00</Start>
<Stop>0001-01-01T00:00:00</Stop>
</HourExpensesData>
</HourExpenses>
<TravelExpenses>
<TravelExpensesData>
<ID>672</ID>
<Updated>true</Updated>
<Date>2013-09-27T00:00:00</Date>
<Notes />
<KmTo>45</KmTo>
<KmFrom>45</KmFrom>
<Declaration>0</Declaration>
</TravelExpensesData>
</TravelExpenses>
<Signatures>
<ID>-1</ID>
<Updated>true</Updated>
<OrderID>10330200</OrderID>
<Completed>false</Completed>
<Notes>yay het werkt ;D</Notes>
</Signatures>
<RemovedDataList />
</Collection>
I would just use LINQ to XML in the simplest possible way:
var query = from file in files
let doc = XDocument.Load(file)
from customer in doc.Descendants("Customer")
select new {
Company = (string) customer.Element("Company"),
Ship = (string) customer.Element("Ship")
};
I'd expect that to be pretty quick already - but you should work out what your exact performance requirements are, then test them. (You almost certainly don't need "the most efficient" way - you need a sufficiently efficient way, whilst keeping the code readable.)
Note that if you want the values to be propagated out of the current method, you should create your own class to represent the company/ship pair.
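As a sketch of that last point, a small class can stand in for Tuple (which only arrived in .NET 4.0). CustomerInfo, CustomerReader, FromDocument and FromFiles are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical type: holds the company/ship pair so it can leave the method.
public sealed class CustomerInfo
{
    public string Company { get; private set; }
    public string Ship { get; private set; }

    public CustomerInfo(string company, string ship)
    {
        Company = company;
        Ship = ship;
    }
}

public static class CustomerReader
{
    // Same query as above, projected into the named class instead of an anonymous type.
    // A missing <Company> or <Ship> element yields null rather than an exception.
    public static IEnumerable<CustomerInfo> FromDocument(XDocument doc)
    {
        return from customer in doc.Descendants("Customer")
               select new CustomerInfo(
                   (string)customer.Element("Company"),
                   (string)customer.Element("Ship"));
    }

    // Loads each file in turn and collects the pairs from all of them.
    public static List<CustomerInfo> FromFiles(IEnumerable<string> files)
    {
        return files.SelectMany(f => FromDocument(XDocument.Load(f))).ToList();
    }
}
```

If measurement ever shows that loading whole documents is too slow, an XmlReader that stops after the <Ship> element would be the next thing to try, at the cost of more code.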

Preparing XSD of a complex xml

Thank you all for suggesting things and helping whenever in need.
Yesterday I was trying to develop a web app in ASP.NET 4.0 where I needed to parse data from XML and save it in a database. Before that, I also have to validate it.
I tried using the .NET tool xsd.exe to generate the schema file, but I don't know how it can tell which nodes or attributes are compulsory.
In my XML, the following items are mandatory:
Root node <Market>
<Login> and its sub element
<ProductType> and its <ProductTypeID/>
The attribute DML is mandatory but should have only 3 values NONE, PUT or MODIFY
<ProductType> may or may not have <ProductItem>
If <ProductItem> is present then it should have <ProductItemID>
<ProductItem> may or may not have <Brand>
If <Brand> is present then it should have <BrandID>
Below is my xml
<?xml version="1.0" encoding="utf-8" ?>
<Market>
<Login>
<LoginId />
<Password />
</Login>
<ProductType DML="NONE">
<ProductTypeID/>
<Name/>
<Detail/>
<ProductItem DML="PUT">
<ProductItemID/>
<Name/>
<Detail/>
<Brand DML="PUT">
<BrandID/>
<Name/>
<Detail/>
</Brand>
<Brand DML="MODIFY">
<BrandID/>
<Name/>
<Detail/>
</Brand>
</ProductItem>
<ProductItem DML="MODIFY">
<ProductItemID/>
<Name/>
<Detail/>
</ProductItem>
</ProductType>
</Market>
How and where should I specify all the mandatory and optional parameters, so that my xsd is generated as per the requirement.
Thanks,
M.
xsd.exe can only infer which elements and attributes are present in your XML; it cannot work out which of them are mandatory. Its output is still a good starting point.
Use a graphical XML schema editor to edit the generated XSD and mark your mandatory fields. That is much easier than learning the XSD language.
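For reference, this is roughly what the edited schema could look like, wrapped in a small C# validator. It is a hand-written sketch, not xsd.exe output: required elements are declared without minOccurs, optional ones get minOccurs='0', and the DML attribute is restricted by an enumeration. It assumes a single <ProductType> and string content everywhere; MarketValidator and Validate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public static class MarketValidator
{
    // Hand-written schema; elements without minOccurs are mandatory.
    private const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:simpleType name='DmlType'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value='NONE'/>
      <xs:enumeration value='PUT'/>
      <xs:enumeration value='MODIFY'/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name='Market'>
    <xs:complexType><xs:sequence>
      <xs:element name='Login'>
        <xs:complexType><xs:sequence>
          <xs:element name='LoginId' type='xs:string'/>
          <xs:element name='Password' type='xs:string'/>
        </xs:sequence></xs:complexType>
      </xs:element>
      <xs:element name='ProductType'>
        <xs:complexType>
          <xs:sequence>
            <xs:element name='ProductTypeID' type='xs:string'/>
            <xs:element name='Name' type='xs:string' minOccurs='0'/>
            <xs:element name='Detail' type='xs:string' minOccurs='0'/>
            <xs:element name='ProductItem' minOccurs='0' maxOccurs='unbounded'>
              <xs:complexType>
                <xs:sequence>
                  <xs:element name='ProductItemID' type='xs:string'/>
                  <xs:element name='Name' type='xs:string' minOccurs='0'/>
                  <xs:element name='Detail' type='xs:string' minOccurs='0'/>
                  <xs:element name='Brand' minOccurs='0' maxOccurs='unbounded'>
                    <xs:complexType>
                      <xs:sequence>
                        <xs:element name='BrandID' type='xs:string'/>
                        <xs:element name='Name' type='xs:string' minOccurs='0'/>
                        <xs:element name='Detail' type='xs:string' minOccurs='0'/>
                      </xs:sequence>
                      <xs:attribute name='DML' type='DmlType' use='required'/>
                    </xs:complexType>
                  </xs:element>
                </xs:sequence>
                <xs:attribute name='DML' type='DmlType' use='required'/>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
          <xs:attribute name='DML' type='DmlType' use='required'/>
        </xs:complexType>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>";

    // Returns the validation messages; an empty list means the document is valid.
    public static List<string> Validate(XDocument doc)
    {
        var schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        var errors = new List<string>();
        doc.Validate(schemas, (sender, e) => errors.Add(e.Message));
        return errors;
    }
}
```

Whether empty values such as <LoginId /> should count as "present" is a separate decision; xs:string accepts them, so a stricter type would be needed to reject them.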
I don't think XSD supports nested XML. I would try loading the XML into an XmlDocument and checking the mandatory fields manually.

Pivotviewer's .cxml parsing

I'm trying to do very simple operations on a .cxml file. As you know it's basically an .xml file. This is a sample file I created to test the application:
<?xml version="1.0" encoding="utf-8"?>
<Collection xmlns:p="http://schemas.microsoft.com/livelabs/pivot/collection/2009" SchemaVersion="1.0" Name="Actresses" xmlns="http://schemas.microsoft.com/collection/metadata/2009">
<FacetCategories>
<FacetCategory Name="Nationality" Type="LongString" p:IsFilterVisible="true" p:IsWordWheelVisible="true" p:IsMetaDataVisible="true" />
</FacetCategories>
<!-- Other entries-->
<Items ImgBase="Actresses_files\go144bwo.0ao.xml" HrefBase="http://www.imdb.com/name/">
<Item Id="2" Img="#2" Name="Anna Karina" Href="nm0439344/">
<Description> She is a nice girl</Description>
<Facets>
<Facet Name="Nationality">
<LongString Value="Danish" />
</Facet>
</Facets>
</Item>
</Items>
<!-- Other entries-->
</Collection>
I can't get even simple code like this to work:
XDocument document = XDocument.Parse(e.Result);
foreach (XElement x in document.Descendants("Item"))
{
...
}
The same test on a generic XML file works, and the .cxml file is loaded correctly into document.
When I watch the expression
document.Descendants("Item")
in the debugger, the result is:
Empty "Enumeration yielded no results" string
Any hint as to what the error might be? I've also had a quick look at getting the descendants of Facet, Facets, etc., but those enumerations are empty too. This doesn't happen with the generic XML file I used for testing; the problem only occurs with .cxml.
Basically, your XML defines a default namespace with the xmlns="http://schemas.microsoft.com/collection/metadata/2009" attribute.
That means you need to fully qualify the element name in your Descendants query, e.g.:
XDocument document = XDocument.Parse(e.Result);
foreach (XElement x in document.Descendants("{http://schemas.microsoft.com/collection/metadata/2009}Item"))
{
...
}
If you remove the default namespace from the XML your code actually works as-is, but that is not the aim of the exercise.
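The same query can be written with an XNamespace value instead of the {namespace}name string form, which keeps the URI in one place. This is a minimal sketch; CxmlItems and ItemNames are hypothetical names. Unprefixed attributes such as Name are in no namespace, so they are looked up without the namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CxmlItems
{
    // Default namespace declared on the <Collection> element of the .cxml file.
    private static readonly XNamespace Ns = "http://schemas.microsoft.com/collection/metadata/2009";

    // Returns the Name attribute of every <Item> in the collection.
    public static List<string> ItemNames(XDocument document)
    {
        return document.Descendants(Ns + "Item")
                       .Select(item => (string)item.Attribute("Name"))
                       .ToList();
    }
}
```

If the namespace URI should not be hard-coded, document.Root.GetDefaultNamespace() returns the same XNamespace at run time.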
See Metadata.CXML project under http://github.com/Zoomicon/Metadata.CXML sourcecode for LINQ-based parsing of CXML files.
Also see ClipFlair.Metadata project at http://github.com/Zoomicon/ClipFlair.Metadata for parsing one's CXML custom facets too
BTW, at http://ClipFlair.codeplex.com you can check out the ClipFlair.Gallery project to see how to author ASP.NET web forms that edit metadata fragments (parts of CXML files) and merge them into a single file (which you then convert periodically to DeepZoom CXML with the PAuthor tool from http://pauthor.codeplex.com).
If anyone is interested in doing nesting (hierarchy) of CXML collections see
http://github.com/Zoomicon/Trafilm.Metadata
and
http://github.com/Zoomicon/Trafilm.Gallery
