Unsure of how to use LINQ to access this particular element - c#

* I'm completely new to this, and this is a personal project I am doing. *
So I have an XML document structured like this
<Licensing key="20325">
<Organization Org="500">
<Constraints>
<MaximumOrgsInSecurity>2</MaximumOrgsInSecurity>
<MaximumUsersInSecurity>999</MaximumUsersInSecurity>
<MaximumLoggedInUsers>999</MaximumLoggedInUsers>
<MaximumLenders>1</MaximumLenders>
<OptOutofPasswordPolicy>FALSE</OptOutofPasswordPolicy>
</Constraints>
<Modules>
<Module registered="true" name="DV" id="1" />
<Module registered="true" name="DP" id="2" />
<Module registered="true" name="DCC" id="3" />
<Module registered="false" name="DRE" id="4" />
</Modules>
</Organization>
</Licensing>
and I am trying to read it using LINQ in my C# code, and although I am attempting to follow this tutorial on LINQ (http://www.dotnetcurry.com/linq/564/linq-to-xml-tutorials-examples), I just can't seem to access the elements I would like. For example, how would I use LINQ to get the key number of 20325, Org number of 500, id/name/registered of each module, and stuff like that? The XML document has to be in this format. Any help or walkthroughs would be appreciated, thank you!
EDIT:
For example, I've tried doing
IEnumerable<XElement> Licensing = xelement.Elements();
foreach (var Organization in Licensing)
{
System.Diagnostics.Debug.Write(Organization.Element("Constraints").Value);
}
to see what this would give me, and it gives 29999991FALSE, when I was hoping it would give something along the lines of
MaximumOrgsInSecurity
MaximumUsersInSecurity
MaximumLoggedInUsers
MaximumLenders
OptOutofPasswordPolicy
or at least
2
999
999
1
False
I've also tried doing
IEnumerable<XElement> Licensing = xelement.Elements();
foreach (var Organization in Licensing)
{
System.Diagnostics.Debug.Write(Organization.Element("Modules").Value);
}
to see what this would give, and it gives absolutely nothing.
If there is a better way than LINQ to do this, then I am all ears. The only reason I am saying LINQ is because based on what I've found so far, LINQ would be my best bet to achieve what I am attempting to do.

Those key values are called attributes, and here are a few different ways to access them:
Debug.WriteLine(xelement.Attribute("key").Value);
Debug.WriteLine(xelement.Element("Organization").Attribute("Org").Value);
Debug.WriteLine(((XElement)xelement.FirstNode).Attribute("Org").Value);
For the constraints, you're selecting one level too high; you need to select the child elements with .Elements():
foreach (var constraint in xelement.Descendants("Constraints").Elements())
{
Debug.WriteLine(constraint.Name + ": " + constraint.Value);
}
foreach (var constraint in xelement.Element("Organization").Element("Constraints").Elements())
{
Debug.WriteLine(constraint.Name + ": " + constraint.Value);
}
You can also add using System.Diagnostics; to the top of the file so you don't need to write the namespace before every Debug call.
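The question also asked about the modules, and about why Element("Modules").Value printed nothing. .Value only concatenates the text inside an element, and each Module carries its data in attributes, so there is no text to return. Here is a minimal sketch of reading those attributes with the explicit conversions that XAttribute provides. The ModuleInfo and LicensingReader names are made up for illustration, and the casts will throw if an attribute is missing.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical container for one <Module> element.
public sealed class ModuleInfo
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool Registered { get; set; }
}

// Hypothetical helper class; expects the <Licensing> root element.
public static class LicensingReader
{
    public static int ReadKey(XElement licensing)
    {
        // Explicit conversion from XAttribute to int.
        return (int)licensing.Attribute("key");
    }

    public static int ReadOrg(XElement licensing)
    {
        return (int)licensing.Element("Organization").Attribute("Org");
    }

    public static List<ModuleInfo> ReadModules(XElement licensing)
    {
        return licensing.Descendants("Module")
            .Select(m => new ModuleInfo
            {
                Id = (int)m.Attribute("id"),
                Name = (string)m.Attribute("name"),
                Registered = (bool)m.Attribute("registered")
            })
            .ToList();
    }
}
```

With the document from the question loaded into xelement, LicensingReader.ReadModules(xelement) should give four entries, one per Module.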

So based on what @mattmanser said, and after looking further into LINQ and XElement/XDocument, I figured out how to do what I'm looking to do.
For example, say I want to know all of the "registered" booleans within the Modules element and store them in an array of booleans. I'd do this:
string Name = FileUpload1.FileName;
bool[] ModuleBools = new bool[4];
for (int moduleID = 1; moduleID < 5; moduleID++)
{
var quotes = XDocument.Load("C:/Users/.../Created XMLs/" + Name)
.Descendants("Module")
.Where(x => (string)x.Attribute("id") == moduleID.ToString())
.Select(x => (string)x.Attribute("registered"))
.ToList();
ModuleBools[moduleID-1] = bool.Parse(quotes.First());
}
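The loop above loads the file once per module. If the ids are contiguous and start at 1, as they are in the sample, the same array can probably be built in one pass. This is only a sketch under that assumption, and ModuleFlags and ReadRegistered are illustrative names, not anything from the original code.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class ModuleFlags
{
    // Hypothetical helper: reads every <Module> once, sorted by its id attribute,
    // and returns the "registered" flags in that order.
    public static bool[] ReadRegistered(XContainer doc)
    {
        return doc.Descendants("Module")
                  .OrderBy(m => (int)m.Attribute("id"))
                  .Select(m => (bool)m.Attribute("registered"))
                  .ToArray();
    }
}
```

Usage would be something like bool[] moduleBools = ModuleFlags.ReadRegistered(XDocument.Load(path)); where path is whatever file path you already build.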

Excuse the VB. With the given example, here is how you would access each item. The variable names, though the same as the elements, are just names.
Dim someXE As XElement
' someXE = XElement.Load("path here") 'to load from file / uri
' for testing we can do this
someXE = <Licensing key="20325">
<Organization Org="500">
<Constraints>
<MaximumOrgsInSecurity>2</MaximumOrgsInSecurity>
<MaximumUsersInSecurity>999</MaximumUsersInSecurity>
<MaximumLoggedInUsers>999</MaximumLoggedInUsers>
<MaximumLenders>1</MaximumLenders>
<OptOutofPasswordPolicy>FALSE</OptOutofPasswordPolicy>
</Constraints>
<Modules>
<Module registered="true" name="DV" id="1"/>
<Module registered="true" name="DP" id="2"/>
<Module registered="true" name="DCC" id="3"/>
<Module registered="false" name="DRE" id="4"/>
</Modules>
</Organization>
</Licensing>
Dim key As String = someXE.@key
Dim MaximumOrgsInSecurity As String = someXE.<Organization>.<Constraints>.<MaximumOrgsInSecurity>.Value
Dim MaximumUsersInSecurity As String = someXE.<Organization>.<Constraints>.<MaximumUsersInSecurity>.Value
Dim MaximumLoggedInUsers As String = someXE.<Organization>.<Constraints>.<MaximumLoggedInUsers>.Value
Dim MaximumLenders As String = someXE.<Organization>.<Constraints>.<MaximumLenders>.Value
Dim OptOutofPasswordPolicy As String = someXE.<Organization>.<Constraints>.<OptOutofPasswordPolicy>.Value

Related

How to load SharePoint lists from anywhere in site collection via TemplateID with CSOM?

Background:
I am trying to create a C# command line utility to extract list item information from lists that may exist anywhere within a particular site collection. All lists that I am trying to extract from were created from a particular template with ID 10003 (Custom template).
The powers that be are still deciding how frequently this thing is supposed to be run, but I'm expecting an answer on the order of every few minutes or so, so I need execution time to be no more than a minute, and don't think that's achievable with my current approach.
I have a site collection with 7 immediate children and ~200 total descendant sub-sites, and these lists may appear in any one of them. Most instances will only have a couple of items, but some of them will have a few thousand. I'm expecting ~10-20k results total.
This utility will be run on a remote server, and internally we'd prefer to use the CSOM over REST. I'm familiar with SP web part development, but this is the first time I've needed to use the CSOM.
The Farm is a SP 2010 On Prem with 3 WFEs, updated with latest CU.
Immediate Issues:
I'm throwing a "Cannot complete this action." on a call to context.ExecuteQuery(), and I'm not sure why.
I am aware that I'm probably over-calling context.ExecuteQuery(), and would like to know a better way to load all of these lists.
My current code only gets the immediate child webs, when I need all descendants of the root.
Code:
My current attempt looks like this:
using (var ctxt = new ClientContext(url))
{
var webs = ctxt.Web.Webs;
ctxt.Load(webs);
ctxt.ExecuteQuery();
var allItems = new List<ListItem>();
foreach (var web in webs)
{
ctxt.Load(web.Lists);
ctxt.ExecuteQuery();
foreach (var list in web.Lists)
{
if (list.BaseTemplate == 10003)
{
ctxt.Load(list);
ctxt.ExecuteQuery();
var items = list.GetItems(query);
ctxt.Load(items);
ctxt.ExecuteQuery(); // <- throws "Cannot complete this action." on the first iteration of the loop
allItems.AddRange(items);
}
}
}
results = allItems.Select(ConvertToNeededResultType).ToList();
}
The Query looks like:
<View Scope='RecursiveAll'>
<Webs Scope='SiteCollection' /> <!--**If I omit this line, I get a CollectionNotInitialized exception on allitems.AddRange(items)**--/>
<Lists ServerTemplate='10003' />
<Query>
<OrderBy>
<FieldRef Name='CreatedOn' Ascending='False' />
</OrderBy>
<Where>
<And>
<Eq>
<FieldRef Name='FSObjType' />
<Value Type='Integer'>0</Value>
</Eq>
<Geq>
<FieldRef Name='CreatedOn'/>
<Value Type='DateTime' IncludeTimeValue='FALSE'>{0}</Value>
</Geq>
</And>
</Where>
<Query>
<ViewFields>
<FieldRef Name='Title' Nullable='TRUE' />
<FieldRef Name='URL' Nullable='TRUE' />
<FieldRef Name='CreatedOn' Nullable='TRUE' />
<FieldRef Name='Category' Nullable='TRUE' />
<FieldRef Name='Attachments' Nullable='TRUE' />
<FieldRef Name='ID' />
<ProjectProperty Name='Title' />
<ListProperty Name='Title' />
</ViewFields>
</View>
where the {0} contains a date that is the number of days I want to go back with the list in format: "yyyy-MM-ddTHH:mm:ssZ"
What I am looking for:
I am looking for either advice on how to resolve the specific issues enumerated above with my current code, or a suggestion with examples of how to more efficiently achieve the same result.
I never did figure out the "Cannot complete this action" bit, but I've solved the requirements another way. I needed two methods, one of them recursive, for crawling the webs and retrieving the items:
public static IEnumerable<WebLocation> GetWebLocations(string rootWebUrl)
{
List<WebLocation> results;
using (var cntxt = new ClientContext(rootWebUrl))
{
var web = cntxt.Web;
cntxt.Load(web, w => w.Webs, w => w.Id, w => w.ServerRelativeUrl, w => w.Lists);
cntxt.ExecuteQuery();
results = GetWebLocations(cntxt, web, Guid.Empty);
}
return results;
}
private static List<WebLocation> GetWebLocations(ClientContext cntxt, Web currentWeb, Guid parentId)
{
var results = new List<WebLocation>();
var currentId = currentWeb.Id;
var url = currentWeb.ServerRelativeUrl;
var location = new WebLocation { ParentSiteID = parentId, SiteID = currentId, WebUrl = url, HotLinksItems = new List<HotLinksItem>() };
foreach (var list in currentWeb.Lists)
{
cntxt.Load(list, l => l.BaseTemplate, l => l.RootFolder.ServerRelativeUrl);
cntxt.ExecuteQuery();
if (list.BaseTemplate == 10003)
{
var itemCollection =
list.GetItems(new CamlQuery
{
ViewXml = "<View Scope='RecursiveAll'><ViewFields><FieldRef Name='Title' Nullable='TRUE' /><FieldRef Name='ID' /><ProjectProperty Name='Title' /><ListProperty Name='Title' /></ViewFields></View>"
});
cntxt.Load(itemCollection);
cntxt.ExecuteQuery();
foreach (var item in itemCollection)
{
var hotlink = new HotLinksItem
{
Title = item["Title"] != null ? item["Title"].ToString() : null,
ID = item["ID"] != null ? item["ID"].ToString() : null,
};
location.HotLinksItems.Add(hotlink);
}
}
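}
results.Add(location);
// Assumed reconstruction: the post omits the recursion and the return statement.
// Each child web is loaded with the same properties as the root before recursing.
foreach (var subWeb in currentWeb.Webs)
{
cntxt.Load(subWeb, w => w.Webs, w => w.Id, w => w.ServerRelativeUrl, w => w.Lists);
cntxt.ExecuteQuery();
results.AddRange(GetWebLocations(cntxt, subWeb, currentId));
}
return results;
}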
It still feels like I'm doing too many calls, but trimming them down with the sub-selections has helped performance a lot. All in all, against my farm this thing runs in a little over 30 seconds. (Before trimming down the returned items, it ran in about 3 minutes.)

XML full file reading C#

So I have some code that reads part of an XML document and presents me with the first block of results, which is great, but my file contains multiple blocks with the same structure and my program seems to quit after the first.
Here's the code:
string path = "data//handling.meta";
var doc = XDocument.Load(path);
var items = doc.Descendants("HandlingData").Elements("Item");//.ToArray();
var query = from i in items
select new
{
HandlingName = (string)i.Element("handlingName"),
Mass = (decimal?)i.Element("fMass").Attribute("value"),
InitialDragCoeff = (decimal?)i.Element("fInitialDragCoeff").Attribute("value"),
PercentSubmerged = (decimal?)i.Element("fPercentSubmerged").Attribute("value"),
DriveBiasFront = (decimal?)i.Element("fDriveBiasFront").Attribute("value"),
InitialDriveGears = i.Element("nInitialDriveGears").Attribute("value")
};
string test = ("{0} - {1}" + query.First().HandlingName + query.First().Mass + query.First().InitialDragCoeff);
richTextBox1.Text = test;
Here's the XML Document :
<?xml version="1.0" encoding="UTF-8"?>
<CHandlingDataMgr>
<HandlingData>
<Item type="CHandlingData">
<handlingName>Car1</handlingName>
<fMass value="140000.000000" />
<fInitialDragCoeff value="30.000000" />
<fPercentSubmerged value="85.000000" />
<vecCentreOfMassOffset x="0.000000" y="0.000000" z="0.000000" />
<vecInertiaMultiplier x="1.000000" y="1.000000" z="1.000000" />
<fDriveBiasFront value="1.000000" />
<nInitialDriveGears value="1" />
</Item>
<Item type="CHandlingData">
<handlingName>Car2</handlingName>
<fMass value="180000.000000" />
<fInitialDragCoeff value="7.800000" />
<fPercentSubmerged value="85.000000" />
<vecCentreOfMassOffset x="0.000000" y="0.000000" z="0.000000" />
<vecInertiaMultiplier x="1.000000" y="1.300000" z="1.500000" />
<fDriveBiasFront value="0.200000" />
<nInitialDriveGears value="6" />
</Item>
</HandlingData>
</CHandlingDataMgr>
As shown, there are multiple handling names. The C# code above does work, but only for the first block, and I'm wondering how to make it read the same values for the other handling names.
I have tried :
if (query.First().HandlingName == "Car2")
{
MessageBox.Show("Car 2 found");
}
but since the message box never appeared, I assume this code doesn't read the whole file?
I'm hoping for output like this:
Name: Car 1
Mass: 140000.000000
InitialDragCoeff: 30.000000
Name: Car 2
Mass: 180000.000000
InitialDragCoeff: 7.800000
My problem in a 'nut shell' : Program does not see Car 2
Any help would be really appreciated, as I've tried many solutions & read many pages regarding XML today
You have:
string test = ("{0} - {1}" + query.First().HandlingName + query.First().Mass
+ query.First().InitialDragCoeff);
that's only ever going to get you the first element, because that's what you asked for.
I think you probably want to loop:
foreach (var item in query) {
var s = "{0} - {1}" + item.HandlingName + item.Mass
+ item.InitialDragCoeff;
// …
}
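To get closer to the output asked for in the question, the loop can build one block of text per Item. A rough sketch follows. HandlingReport and Build are made-up names, and the values are read as strings so they keep the formatting from the file.

```csharp
using System.Text;
using System.Xml.Linq;

public static class HandlingReport
{
    // Hypothetical helper: one "Name / Mass / InitialDragCoeff" block per <Item>.
    public static string Build(XDocument doc)
    {
        var sb = new StringBuilder();
        foreach (var item in doc.Descendants("HandlingData").Elements("Item"))
        {
            // ?. guards against an <Item> that lacks one of the child elements.
            sb.AppendLine("Name: " + (string)item.Element("handlingName"));
            sb.AppendLine("Mass: " + (string)item.Element("fMass")?.Attribute("value"));
            sb.AppendLine("InitialDragCoeff: " + (string)item.Element("fInitialDragCoeff")?.Attribute("value"));
        }
        return sb.ToString();
    }
}
```

Then richTextBox1.Text = HandlingReport.Build(doc); would show every item rather than only the first.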

Using LINQ to XML for simple XML files: overkill? or no worse?

I'm working on building a simple top-down tile-based 2D game and I'm trying to parse the output of Tiled Map Editor (.tmx file). For those unfamiliar, TMX files are XML files which describe a game map using layers of re-used tiles from an image. I've never had to work with anything but parsing simple text before and I'm wondering if, for the case of a rather simple XML file, using LINQ is the most appropriate way to do this.
Here is an abridged .tmx file:
<?xml version="1.0" encoding="UTF-8"?>
<map width="100" height="100" tilewidth="16" tileheight="16">
<tileset>
<!-- This stuff in here is mostly metadata that the map editor uses -->
</tileset>
<layer name="Background" width="100" height="100">
<data>
<tile gid="1" />
<tile gid="2" />
<tile gid="3" />
<tile gid="1" />
<tile gid="1" />
<tile gid="1" />
<!-- etc... -->
</data>
</layer>
<layer name="Foreground" width="100" height="100">
<data>
<!-- gid="0" means don't load a tile there. It should be an empty cell for that layer (and the layers beneath this are visible) -->
<tile gid="0" />
<tile gid="4" />
<!-- etc. -->
</data>
</layer>
<!-- More layers.... -->
</map>
As you can see, it's rather simple (note that there is a 'tile' element for each tile (100x100) in each layer). Now it seems to me that the purpose of LINQ is to get very specific data from what could perhaps be a very large and almost database-like xml file where you don't really need the whole file. Mostly what I'll be doing here is going through and inserting the gid for each of the 'tile' elements into an array which represents the map in my application.
Here is my code for processing a layer:
public void AddLayer(XElement layerElement) {
TileMapLayer layer = new TileMapLayer(Convert.ToInt32(layerElement.Attribute("width")), Convert.ToInt32(layerElement.Attribute("height")));
layer.Name = (string)layerElement.Attribute("name");
layer.Opacity = Convert.ToDouble(layerElement.Attribute("opacity"));
layer.Visible = Convert.ToInt32(layerElement.Attribute("visible")) == 1;
if (layerElement.HasElements)
{
XElement data = layerElement.Element("data");
foreach (XElement tile in data.Elements())
{
layer.NextTile(Convert.ToInt32(tile.Attribute("gid")));
}
}
this.layers.Add(layer);
}
To make my question more succinct: when I'm going through and I care about every piece of data (i.e. I'm iterating through and getting data for all the child elements of each node sequentially), does using LINQ to XML afford me any benefits? For example, do the LINQ to XML libraries perform better? Does my unfamiliarity with LINQ stop me from seeing an efficient way of doing what I want? Or should I really be using different XML utilities?
For simply retrieving and updating data, LINQ to XML seems a square peg/round hole solution to XML parsing. XElement can of course be iterated over recursively, but I find XPath queries much more concise and easy to read.
Where I find LINQ to XML very useful is in manipulation of document namespaces. The other frameworks provided by .NET do not approach this in a graceful manner.
According to http://msdn.microsoft.com/en-us/library/ecf3e2k0.aspx:
Changing the prefix of a node does not change its namespace. The namespace can only be set when the node is created. When you persist the tree, new namespace attributes may be persisted out to satisfy the prefix you set. If the new namespace cannot be created, then the prefix is changed so the node preserves its local name and namespace.
Granted, this is an esoteric requirement, but I find LINQ to XML useful in preparing the namespace manager for XPath querying as well.
public XmlNamespaceManager NamespaceManager { get; set; }
public XPathNavigator Navigator { get; set; }
public SuperDuperXMLQueryingClass(System.IO.Stream stream)
{
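var namespaces = RetrieveNamespaceMapFromXDocumentRoot(XDocument.Load(stream).Root);
// Rewind before reading the stream a second time; this assumes the stream is seekable.
stream.Position = 0;
Navigator = new XPathDocument(stream).CreateNavigator();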
NamespaceManager = new XmlNamespaceManager(Navigator.NameTable);
foreach (var t in namespaces)
{
NamespaceManager.AddNamespace(t.Key, t.Value.NamespaceName);
}
}
// LINQ to XML mostly used here.
private static Dictionary<string, XNamespace> RetrieveNamespaceMapFromXDocumentRoot(XElement root)
{
if (root == null)
{ throw new ArgumentNullException("root"); }
return root.Attributes().Where(a => a.IsNamespaceDeclaration)
.GroupBy(a => (
a.Name.Namespace == XNamespace.None ? String.Empty : a.Name.LocalName),
a => XNamespace.Get(a.Value)
)
.Where(g => g.Key != string.Empty)
.ToDictionary(g => g.Key, g => g.First());
}
public string DeliverFirstValueFromXPathQuery(string qry)
{
try
{
var iter = QueryXPathNavigatorUsingShortNamespaces(qry).GetEnumerator();
iter.MoveNext();
return iter.Current == null ? string.Empty : iter.Current.ToString();
}
catch (InvalidOperationException ex)
{
return "";
}
}
This gives you the option to use XPath to query your XML document without having to use the full URI.
//For Example
DeliverFirstValueFromXPathQuery("/ns0:MyRoot/ns1:MySub/nsn:ValueHolder");
The summary here is that each framework has some set of overlapping functionality; however, some are more graceful to use for specialized jobs than others.
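For the TMX file in the question, the two styles compare roughly as in the sketch below. TmxGids and its method names are invented for the example. Both versions use the explicit (int) conversion on XAttribute. Passing the XAttribute object itself to Convert.ToInt32, as the question's code does, is likely to fail because XAttribute is not IConvertible.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class TmxGids
{
    // LINQ to XML: walk map/layer/data/tile step by step.
    public static List<int> WithLinq(XDocument map, string layerName)
    {
        return map.Root.Elements("layer")
                  .Where(l => (string)l.Attribute("name") == layerName)
                  .Elements("data")
                  .Elements("tile")
                  .Select(t => (int)t.Attribute("gid"))
                  .ToList();
    }

    // XPath: the same selection written as a single path expression.
    public static List<int> WithXPath(XDocument map, string layerName)
    {
        // layerName is concatenated into the expression, so it must not contain a quote.
        return map.XPathSelectElements("/map/layer[@name='" + layerName + "']/data/tile")
                  .Select(t => (int)t.Attribute("gid"))
                  .ToList();
    }
}
```

Either way every tile element is visited once, so for a file this simple the choice is mostly about which form you find easier to read.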

Locating a value in XML

I have an xml file loaded into an XDocument that I need to extract a value from, and I'm not sure of the best way to do it. Most of the things I'm coming up with seem to be overkill or don't make good use of xml rules. I have the following snippet of xml:
<entry>
<observation classCode="OBS" moodCode="EVN">
<templateId root="2.16.840.1.113883.10.20.6.2.12" />
<code code="121070" codeSystem="1.2.840.10008.2.16.4" codeSystemName="DCM" displayName="Findings">
</code>
<value xsi:type="ED">
<reference value="#121071">
</reference>
</value>
</observation>
</entry>
There can be any number of <entry> nodes, and they will all follow a similar pattern. The value under the root attribute on the templateId element contains a known UID that identifies this entry as the one I want. I need to get the reference value.
My thought is to find the correct templateId node, back out to the observation node, find <value xsi:type="ED"> and then get the reference value. This seems overly complex, and I am wondering if there is another way to do this?
EDIT
The xml I receive can sometimes have xml nested under the same node name. In other words, <observation> may be located under another node named <observation>.
You have problems because your document uses namespaces and your query is missing them.
First of all, you have to find the xsi namespace declaration somewhere in your XML (probably in the topmost element).
It will look like this:
xmlns:xsi="http://test.namespace"
Then take the namespace URI and create an XNamespace instance from its value:
var xsi = XNamespace.Get("http://test.namespace");
And use that xsi variable within your query:
var query = from o in xdoc.Root.Element("entries").Elements("entry").Elements("observation")
let tId = o.Element("templateId")
where tId != null && (string)tId.Attribute("root") == "2.16.840.1.113883.10.20.6.2.12"
let v = o.Element("value")
where v != null && (string)v.Attribute(xsi + "type") != null
let r = v.Element("reference")
where r != null
select (string)r.Attribute("value");
var result = query.FirstOrDefault();
I have tested it against the following XML structure:
<root xmlns:xsi="http://test.namespace">
<entries>
<entry>
<observation classCode="OBS" moodCode="EVN">
<templateId root="2.16.840.1.113883.10.20.6.2.12" />
<code code="121070" codeSystem="1.2.840.10008.2.16.4" codeSystemName="DCM" displayName="Findings">
</code>
<value xsi:type="ED">
<reference value="#121071">
</reference>
</value>
</observation>
</entry>
</entries>
</root>
The query returns #121071 for it.
For your input XML you will probably have to change the first line of the query:
from o in xdoc.Root.Element("entries").Elements("entry").Elements("observation")
to match <observation> elements from your XML structure.
Would something along the lines of the following help?
XDocument xdoc = GetYourDocumentHere();
var obsvlookfor =
xdoc.Root.Descendants("observation")
.SingleOrDefault(el =>
el.Element("templateId")
.Attribute("root").Value == "root value to look for");
if (obsvlookfor != null)
{
var reference = obsvlookfor
.Element("value")
.Element("reference").Attribute("value").Value;
}
My thought is as follows:
Pull out all the observation elements in the document
Find the only one (or null) where the observation's templateId element has a root attribute you're looking for
If you find that observation element, pull out the value attribute against the reference element which is under the value element.
You might have to include the Namespace in your LINQ. To retrieve that you would do something like this:
XNamespace ns = xdoc.Root.GetDefaultNamespace();
Then in your linq:
var obsvlookfor = xdoc.Root.Descendants(ns + "observation")
I know I had some issues retrieving data once without this. I'm not saying it's the issue, just something to keep in mind, particularly if your XML file is deeply nested.
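Putting the pieces together, here is a hedged sketch of a lookup that uses the default namespace, tolerates missing elements, and copes with nested observation elements. It checks only the templateId that is a direct child of each observation, so a parent is not matched because of a nested one. ObservationLookup and FindReference are illustrative names.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class ObservationLookup
{
    // Hypothetical helper: returns the reference value of the first matching
    // <observation>, or null when nothing matches.
    public static string FindReference(XDocument xdoc, string templateRoot)
    {
        // XNamespace.None when the document declares no default namespace.
        XNamespace ns = xdoc.Root.GetDefaultNamespace();

        return xdoc.Descendants(ns + "observation")
                   .Where(o => (string)o.Elements(ns + "templateId")
                                        .Attributes("root")
                                        .FirstOrDefault() == templateRoot)
                   .Select(o => (string)o.Elements(ns + "value")
                                         .Elements(ns + "reference")
                                         .Attributes("value")
                                         .FirstOrDefault())
                   .FirstOrDefault(v => v != null);
    }
}
```

Casting a null XAttribute to string just gives null, which is why no explicit null checks are needed here.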

How can I merge XML files?

I have two xml files that both have the same schema and I would like to merge into a single xml file. Is there an easy way to do this?
For example,
<Root>
<LeafA>
<Item1 />
<Item2 />
</LeafA>
<LeafB>
<Item1 />
<Item2 />
</LeafB>
</Root>
+
<Root>
<LeafA>
<Item3 />
<Item4 />
</LeafA>
<LeafB>
<Item3 />
<Item4 />
</LeafB>
</Root>
= new file containing
<Root>
<LeafA>
<Item1 />
<Item2 />
<Item3 />
<Item4 />
</LeafA>
<LeafB>
<Item1 />
<Item2 />
<Item3 />
<Item4 />
</LeafB>
</Root>
"Automatic XML merge" sounds like a relatively simple requirement, but when you go into all the details, it gets complex pretty fast. Merging with C# or XSLT is much easier for a more specific task, as in the answer about the EF model. Using tools to assist with a manual merge can also be an option (see this SO question).
For reference (and to give an idea of the complexity), here's an open-source example from the Java world: XML merging made easy
Back to the original question. There are a few big gray areas in the task specification: when two elements should be considered equivalent (same name, matching selected or all attributes, or also the same position in the parent element), how to handle the situation where the original or merged XML has multiple equivalent elements, and so on.
The code below is assuming that
we only care about elements at the moment
elements are equivalent if element names, attribute names, and attribute values match
an element doesn't have multiple attributes with the same name
all equivalent elements from merged document will be combined with the first equivalent element in the source XML document.
// determine which elements we consider the same
//
private static bool AreEquivalent(XElement a, XElement b)
{
if(a.Name != b.Name) return false;
if(!a.HasAttributes && !b.HasAttributes) return true;
if(!a.HasAttributes || !b.HasAttributes) return false;
if(a.Attributes().Count() != b.Attributes().Count()) return false;
return a.Attributes().All(attA => b.Attributes(attA.Name)
.Count(attB => attB.Value == attA.Value) != 0);
}
// Merge "merged" document B into "source" A
//
private static void MergeElements(XElement parentA, XElement parentB)
{
// merge per-element content from parentB into parentA
//
// Only direct children are compared here; the recursive call handles deeper levels.
foreach (XElement childB in parentB.Elements())
{
// merge childB with first equivalent childA
// equivalent childB1, childB2,.. will be combined
//
bool isMatchFound = false;
foreach (XElement childA in parentA.Elements())
{
if (AreEquivalent(childA, childB))
{
MergeElements(childA, childB);
isMatchFound = true;
break;
}
}
// if there is no equivalent childA, add childB into parentA
//
if (!isMatchFound) parentA.Add(childB);
}
}
It will produce the desired result with the original XML snippets, but if the input XMLs are more complex and have duplicate elements, the result will be more... interesting:
public static void Test()
{
var a = XDocument.Parse(@"
<Root>
<LeafA>
<Item1 />
<Item2 />
<SubLeaf><X/></SubLeaf>
</LeafA>
<LeafB>
<Item1 />
<Item2 />
</LeafB>
</Root>");
var b = XDocument.Parse(@"
<Root>
<LeafB>
<Item5 />
<Item1 />
<Item6 />
</LeafB>
<LeafA Name=""X"">
<Item3 />
</LeafA>
<LeafA>
<Item3 />
</LeafA>
<LeafA>
<SubLeaf><Y/></SubLeaf>
</LeafA>
</Root>");
MergeElements(a.Root, b.Root);
Console.WriteLine("Merged document:\n{0}", a.Root);
}
Here's the merged document, showing how equivalent elements from document B were combined together:
<Root>
<LeafA>
<Item1 />
<Item2 />
<SubLeaf>
<X />
<Y />
</SubLeaf>
<Item3 />
</LeafA>
<LeafB>
<Item1 />
<Item2 />
<Item5 />
<Item6 />
</LeafB>
<LeafA Name="X">
<Item3 />
</LeafA>
</Root>
If the format is always exactly like this, there is nothing wrong with this method:
Remove the last two lines from the first file and append the second file with its first two lines removed.
Have a look at the Linux commands head and tail, which can drop the first and last two lines.
It's a simple XSLT transformation, something like this (which you apply to document a.xml):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="docB" select="document('b.xml')"/>
<xsl:template match="Root">
<Root><xsl:apply-templates/></Root>
</xsl:template>
<!-- Re-emit each leaf element, then copy its children from both documents. -->
<xsl:template match="Root/LeafA">
<LeafA>
<xsl:copy-of select="*"/>
<xsl:copy-of select="$docB/Root/LeafA/*"/>
</LeafA>
</xsl:template>
<xsl:template match="Root/LeafB">
<LeafB>
<xsl:copy-of select="*"/>
<xsl:copy-of select="$docB/Root/LeafB/*"/>
</LeafB>
</xsl:template>
</xsl:stylesheet>
vimdiff file_a file_b as just one example
BeyondCompare is a favorite when I'm on Windows: http://www.scootersoftware.com/
I ended up using C# and created myself a script. I knew I could do it when I asked the question, but I wanted to know if there was a faster way to do this since I've never really worked with XML.
The script went along the lines of this:
var a = new XmlDocument();
a.Load(PathToFile1);
var b = new XmlDocument();
b.Load(PathToFile2);
MergeNodes(
a.SelectSingleNode(nodePath),
b.SelectSingleNode(nodePath).ChildNodes,
a);
a.Save(PathToFile1);
And MergeNodes() looked something like this:
private void MergeNodes(XmlNode parentNodeA, XmlNodeList childNodesB, XmlDocument parentA)
{
foreach (XmlNode oNode in childNodesB)
{
// Exclude container node
if (oNode.Name == "#comment") continue;
bool isFound = false;
string name = oNode.Attributes["Name"].Value;
foreach (XmlNode child in parentNodeA.ChildNodes)
{
if (child.Name == "#comment") continue;
// If node already exists and is unchanged, exit loop
if (child.OuterXml == oNode.OuterXml && child.InnerXml == oNode.InnerXml)
{
isFound = true;
Console.WriteLine("Found::NoChanges::" + oNode.Name + "::" + name);
break;
}
// If node already exists but has been changed, replace it
if (child.Attributes["Name"].Value == name)
{
isFound = true;
Console.WriteLine("Found::Replaced::" + oNode.Name + "::" + name);
parentNodeA.ReplaceChild(parentA.ImportNode(oNode, true), child);
// Stop here: the child list was just modified and the match has been handled.
break;
}
}
// If node does not exist, add it
if (!isFound)
{
Console.WriteLine("NotFound::Adding::" + oNode.Name + "::" + name);
parentNodeA.AppendChild(parentA.ImportNode(oNode, true));
}
}
}
It's not perfect - I have to manually specify the nodes I want merged, but it was quick and easy for me to put together, and since I have almost no knowledge of XML, I'm happy :)
It actually works out better that it only merges the specified nodes, since I'm using it to merge Entity Framework's edmx files, and I only really want to merge the SSDL, CSDL, and MSL nodes.
One way to do it is to load each XML file into a DataSet and merge the DataSets.
Dim dsFirst As New DataSet()
Dim dsMerge As New DataSet()
' Create new FileStream with which to read the schema.
Dim fsReadXmlFirst As New System.IO.FileStream(myXMLfileFirst, System.IO.FileMode.Open)
Dim fsReadXmlMerge As New System.IO.FileStream(myXMLfileMerge, System.IO.FileMode.Open)
Try
dsFirst.ReadXml(fsReadXmlFirst)
dsMerge.ReadXml(fsReadXmlMerge)
Dim str As String = "Merge Table(0) Row Count = " & dsMerge.Tables(0).Rows.Count
str = str & Chr(13) & "Merge Table(1) Row Count = " & dsMerge.Tables(1).Rows.Count
str = str & Chr(13) & "Merge Table(2) Row Count = " & dsMerge.Tables(2).Rows.Count
MsgBox(str)
dsMerge.Merge(dsFirst, True)
DataGridParent.DataSource = dsMerge
DataGridParent.DataMember = "rulefile"
DataGridChild.DataSource = dsMerge
DataGridChild.DataMember = "rule"
str = ""
str = "Merge Table(0) Row Count = " & dsMerge.Tables(0).Rows.Count
str = str & Chr(13) & "Merge Table(1) Row Count = " & dsMerge.Tables(1).Rows.Count
str = str & Chr(13) & "Merge Table(2) Row Count = " & dsMerge.Tables(2).Rows.Count
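MsgBox(str)
Finally
' Release the file handles whether or not the merge succeeded.
fsReadXmlFirst.Close()
fsReadXmlMerge.Close()
End Try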
Reposting the answer from https://www.perlmonks.org/?node_id=127848
Paste the following into a Perl script:
use strict;
require 5.000;
use Data::Dumper;
use XML::Simple;
use Hash::Merge;
my $xmlFile1 = shift || die "XmlFile1\n";
my $xmlFile2 = shift || die "XmlFile2\n";
my %config1 = %{XMLin ($xmlFile1)};
my %config2 = %{XMLin ($xmlFile2)};
my $merger = Hash::Merge->new ('RIGHT_PRECEDENT');
my %newhash = %{ $merger->merge (\%config1, \%config2) };
# XMLout (\%newhash, outputfile => "newfile", xmldecl => 1, rootname => 'config');
print XMLout (\%newhash);
