Merge 2 duplicate xml nodes to one - c#

I'm building a database with meets, clubs, results and a lot more for swimmers in the Netherlands. Due to some changes in the data I am receiving, I'm running into a problem with duplicate values in the XML files I am reading.
Here is part of an XML file that causes problems:
<LENEX version="3.0">
<MEETS>
<MEET name="Speedowedstrijd 2012 - 2013 deel 1">
<CLUB name="AZVD" type="CLUB" nation="NED" region="08" code="08-004">
<OFFICIALS>
<OFFICIAL nation="NED" gender="M" officialid="2329" lastname="xx">
<CONTACT email="xx" phone="xx" country="NL" />
</OFFICIAL>
</OFFICIALS>
</CLUB>
<CLUB name="A.Z.V.D." type="CLUB" nation="NED" region="8" code="08-004">
<ATHLETES>
<ATHLETE nation="NED" gender="M" athleteid="2358" license="xx" lastname="xx">
<RESULTS>
<RESULT eventid="1167" resultid="2359" swimtime="00:03:09.69">
<SPLITS>
<SPLIT distance="50" swimtime="00:00:40.71"/>
<SPLIT distance="100" swimtime="00:01:30.71"/>
</SPLITS>
</RESULT>
</RESULTS>
</ATHLETE>
</ATHLETES>
</CLUB>
</MEET>
</MEETS>
</LENEX>
Now the reading of the XML file is not a problem; using XDocument I get all the nodes, children etc.
However, when I write the values to my database I get a key constraint error on the table Club_Meet. This table holds the link between the clubs table and the meet table, and each combination must be unique. As both clubs in the example above point to the same club in my database (unique code = 08-004), I am trying to write the same values to the database twice, causing the error.
So what I want to do is, when I go through the XML file and find a club: check whether this club was already found in this XML before and, if so, hang the child nodes under that first club node.
The result of this action should be (internally) :
<LENEX version="3.0">
<MEETS>
<MEET name="Speedowedstrijd 2012 - 2013 deel 1">
<CLUB name="AZVD" type="CLUB" nation="NED" region="08" code="08-004">
<OFFICIALS>
<OFFICIAL nation="NED" gender="M" officialid="2329" lastname="xx">
<CONTACT email="xx" phone="xx" country="NL" />
</OFFICIAL>
</OFFICIALS>
<ATHLETES>
<ATHLETE nation="NED" gender="M" athleteid="2358" license="xx" lastname="xx">
<RESULTS>
<RESULT eventid="1167" resultid="2359" swimtime="00:03:09.69">
<SPLITS>
<SPLIT distance="50" swimtime="00:00:40.71"/>
<SPLIT distance="100" swimtime="00:01:30.71"/>
</SPLITS>
</RESULT>
</RESULTS>
</ATHLETE>
</ATHLETES>
</CLUB>
</MEET>
</MEETS>
</LENEX>
Note that the second club-node <CLUB name="A.Z.V.D." type="CLUB" nation="NED" region="8" code="08-004"> is removed completely; I don't need anything from that one.
How do I move the child nodes from one club to another and delete the empty club?
Can anyone point me in the right direction?
(Hope this all makes some sense....)

OK so if you want to work strictly with the manipulation of your XML document, you can use the following extension method which I created.
public static class XmlExtensions
{
public static IEnumerable<XElement> CombineLikeElements(this IEnumerable<XElement> source, Func<XElement, object> groupSelector)
{
// used to record the newly combined elements
List<XElement> priElements = new List<XElement>();
// group the current xml nodes by the supplied groupSelector, and only
// select the groups that have more than one element.
var groups = source.GroupBy(groupSelector).Where(grp => grp.Count() > 1);
foreach(var grp in groups)
{
// get the first (primary) child element and use it as
// element that all the other sibling elements get combined with.
var priElement = grp.First();
// get all the sibling elements which will be combined
// with the primary element. Skipping the primary element.
var sibElements = grp.Skip(1);
// add all the sibling element's child nodes to the primary
// element.
priElement.Add(sibElements.Select(node => node.Elements()));
// remove all of the sibling elements
sibElements.Remove();
// add the primary element to the return list
priElements.Add(priElement);
}
// return the primary elements in case we want to do some further
// combining of their descendants
return priElements;
}
}
You would use the extension method as follows:
XDocument xmlDoc = XDocument.Parse(xml);
xmlDoc
// Combine all of the duplicate CLUB nodes under each MEET node
.Descendants("MEET").Descendants("CLUB").CombineLikeElements(node => node.Attribute("code").Value);
And it would return the results that you requested.
The extension method returns a list of the XElements that everything was combined into, in case you want to combine their child nodes as well. For example, if after combining your identical CLUB elements one or more of the CLUBs ends up with two or more ATHLETES or OFFICIALS nodes, you can combine those easily as well by doing the following:
xmlDoc
// Combine all of the duplicate CLUB nodes under each MEET node
.Descendants("MEET").Descendants("CLUB").CombineLikeElements(node => node.Attribute("code").Value)
// Combine all of the duplicate ATHLETES or OFFICIALS nodes under the newly combined CLUB nodes
.Elements().CombineLikeElements(node => node.Name);
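One caveat with the chained calls above: Descendants("MEET").Descendants("CLUB") yields one flat sequence, so clubs are grouped across all meets in the file, and the trailing .Elements() pools the children of every combined club before grouping them by name. If a file can contain several meets or several duplicated clubs, a merge scoped to one parent at a time may be safer. The following is only a sketch under that assumption; MergeDuplicateChildren and MergeClubs are hypothetical names, not part of the answer above, and attributes of the duplicate elements are simply discarded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlMergeSketch
{
    // Hypothetical helper: among the direct children of `parent` named `name`,
    // folds every element sharing a key into the first element with that key.
    public static void MergeDuplicateChildren(
        XElement parent, XName name, Func<XElement, string> keySelector)
    {
        // Materialize the groups first so the tree can be changed safely.
        List<List<XElement>> groups = parent.Elements(name)
            .GroupBy(keySelector)
            .Select(g => g.ToList())
            .ToList();

        foreach (List<XElement> group in groups)
        {
            XElement primary = group[0];
            foreach (XElement duplicate in group.Skip(1))
            {
                // Add() copies elements that already have a parent,
                // so the duplicate can simply be removed afterwards.
                primary.Add(duplicate.Elements());
                duplicate.Remove();
            }
        }
    }

    // Merges duplicate CLUBs per MEET, then repeated containers per CLUB.
    public static void MergeClubs(XDocument doc)
    {
        foreach (XElement meet in doc.Descendants("MEET").ToList())
        {
            MergeDuplicateChildren(meet, "CLUB", c => (string)c.Attribute("code"));

            foreach (XElement club in meet.Elements("CLUB").ToList())
            {
                List<XName> names = club.Elements().Select(e => e.Name).Distinct().ToList();
                foreach (XName containerName in names)
                    MergeDuplicateChildren(club, containerName, e => e.Name.LocalName);
            }
        }
    }
}
```

Calling XmlMergeSketch.MergeClubs(xmlDoc) on the sample from the question should leave a single CLUB with code 08-004 that holds both the OFFICIALS and the ATHLETES nodes.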

Related

How can I get a list of XML nodes by value using C#

I'm trying to get a XmlNodeList from an XmlDocument for nodes that have a certain value, with a view to removing those nodes.
XML:
<List xmlns="http://mynamespace.com/v1">
<Category>2144</Category>
<Title>My Object</Title>
<StartPrice>30.00</StartPrice>
<ReservePrice>-999</ReservePrice>
<BuyNowPrice>-999</BuyNowPrice>
</List>
Preferably I don't want to iterate through every node and check its value. I looked at trying to use LINQ from some examples but I just don't understand it enough to even attempt it.
I feel I'm getting close-ish with XPath (https://www.w3schools.com/xml/xpath_syntax.asp) but I'm beginning to think what I want to do isn't supported.
string xml = UtilityClass.SerializeObject<Listing> ( myListing);
XmlDocument xmlDocument = new XmlDocument ();
xmlDocument.LoadXml ( xml );
XmlElement root = xmlDocument.DocumentElement;
XmlNodeList nodes = root.SelectNodes ( "//*['-999']" );
Am open to other suggestions to get the same result, i.e. remove the nodes with -999 from the Xml document.
Thanks in advance
LINQ to XML has been the preferred API for working with XML in the .NET Framework since 2007.
See how easily you can achieve what you need in one single statement.
The LINQ methods are chained one after another and are mostly self-explanatory:
Find the List element, taking the default namespace into account.
Take its child elements, whatever their names.
Keep those whose value is -999.
Convert them to a List<>.
Remove those elements from the XML document.
c#
void Main()
{
XDocument xdoc = XDocument.Parse(@"<List xmlns='http://mynamespace.com/v1'>
<Category>2144</Category>
<Title>My Object</Title>
<StartPrice>30.00</StartPrice>
<ReservePrice>-999</ReservePrice>
<BuyNowPrice>-999</BuyNowPrice>
</List>");
XNamespace ns = xdoc.Root.GetDefaultNamespace();
xdoc.Descendants(ns + "List")
.Elements()
.Where(x => x.Value.Equals("-999"))
.ToList()
.ForEach(x => x.Remove());
Console.WriteLine(xdoc);
}
Output
<List xmlns="http://mynamespace.com/v1">
<Category>2144</Category>
<Title>My Object</Title>
<StartPrice>30.00</StartPrice>
</List>
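If you would rather stay with XmlDocument and XPath as in the question, that is possible too. The predicate ['-999'] is just a non-empty string literal, which XPath treats as true, so //*['-999'] selects every element. Comparing the text content instead, as in //*[text()='-999'], narrows the match, and because * matches elements in any namespace no namespace manager is needed here. A minimal sketch, assuming the value contains no single quote; RemoveElementsWithValue is a hypothetical helper name:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class XPathRemovalSketch
{
    // Hypothetical helper: removes every element that has a text child equal to `value`.
    public static string RemoveElementsWithValue(string xml, string value)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Assumption: `value` contains no single quote, since it is spliced into the XPath.
        // Copy the matches to a list before changing the document.
        List<XmlNode> matches = doc
            .SelectNodes("//*[text()='" + value + "']")
            .Cast<XmlNode>()
            .ToList();

        foreach (XmlNode node in matches)
        {
            node.ParentNode.RemoveChild(node);
        }

        return doc.OuterXml;
    }
}
```

The matches are copied into a list first so that removing nodes does not interfere with the node list while it is being enumerated.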

Parsing Advanced XML File & Turning Information Into Class In C#

What needs to be done
Currently I have 1 Advanced XML file that needs to be parsed. I need to iterate through the file and read each "Entity" tag individually. However, the issue I come across is reading and iterating through the Stats and Slots. Also, the number of Stat and Slot tags will vary depending on the Entity. (Yes, I have researched this topic, but I still can't find a way without creating errors, so I need some more guidance. The other posts haven't had the exact fix I'd hoped for...)
XML File
<ROOT>
<Entity Type="Clothing" Name="Light Robe" ID="0">
<Armor>2</Armor>
<Weight>1</Weight>
<Usability>120</Usability>
<Rarity>0.1</Rarity>
<Stats>
<Stat Type="Health">10</Stat>
</Stats>
<Slots>
<Slot>Torso</Slot>
</Slots>
</Entity>
<Entity Type="Clothing" Name="Medium Robe" ID="1">
<Armor>4</Armor>
<Weight>2</Weight>
<Stats>
<Stat Type="Health">15</Stat>
</Stats>
<Usability>120</Usability>
<Rarity>0.1</Rarity>
<Slots>
<Slot>Torso</Slot>
</Slots>
</Entity>
<Entity Type="Clothing" Name="Heavy Robe" ID="2">
<Armor>6</Armor>
<Weight>4</Weight>
<Stats>
<Stat Type="Health">25</Stat>
</Stats>
<Usability>120</Usability>
<Rarity>0.1</Rarity>
<Slots>
<Slot>Torso</Slot>
</Slots>
</Entity>
</ROOT>
If anybody has any criticism of this post please say so as I'll edit accordingly.
Use XDocument for XML reading.
// Requires: using System.Collections.Generic; using System.Linq;
//           using System.Xml.Linq; using System.Xml.XPath;
// Load the document
XDocument _doc = XDocument.Load("C:\\t\\My File2.txt");
// Get all Entity elements and put them into a list.
List<XElement> entities = _doc.XPathSelectElements("ROOT/Entity").ToList();
// Next you can loop through the list to read each Entity's elements
foreach (var entity in entities)
{
// to get the Armor element:
string armor = entity.Element("Armor").Value;
// to get the Rarity element:
string rarity = entity.Element("Rarity").Value;
// to get the first Stat element:
string stat = entity.Element("Stats").Element("Stat").Value;
// to get the first Slot element:
string slot = entity.Element("Slots").Element("Slot").Value;
}
// To get one specific element by attribute, use this (here I check the ID attribute):
XElement match = _doc.XPathSelectElements("ROOT/Entity").FirstOrDefault(c => c.Attribute("ID").Value == "0");
// Next you can extract information from this element just like in the foreach loop.
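Since the number of Stat and Slot tags varies per Entity, reading only the first one of each will miss data. A possible way to collect all of them and turn each Entity into an object is sketched below. The Entity class shape and the EntityParser name are assumptions made for illustration, and the sketch assumes every Entity has ID, Armor and Rarity and that Stat types are unique within one Entity.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical class shape; adjust the properties to your own model.
public class Entity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Type { get; set; }
    public int Armor { get; set; }
    public double Rarity { get; set; }
    public Dictionary<string, int> Stats { get; set; }
    public List<string> Slots { get; set; }
}

public static class EntityParser
{
    public static List<Entity> Parse(XDocument doc)
    {
        return doc.Root.Elements("Entity").Select(e => new Entity
        {
            // The explicit casts convert attribute/element text to the target type.
            Id = (int)e.Attribute("ID"),
            Name = (string)e.Attribute("Name"),
            Type = (string)e.Attribute("Type"),
            Armor = (int)e.Element("Armor"),
            Rarity = (double)e.Element("Rarity"),
            // Every <Stat> under <Stats>, keyed by its Type attribute.
            Stats = e.Elements("Stats").Elements("Stat")
                     .ToDictionary(s => (string)s.Attribute("Type"), s => (int)s),
            // Every <Slot> under <Slots>, however many there are.
            Slots = e.Elements("Slots").Elements("Slot")
                     .Select(s => (string)s)
                     .ToList()
        }).ToList();
    }
}
```

With this, EntityParser.Parse(XDocument.Load(path)) gives one object per Entity, and an Entity without any Stats or Slots simply ends up with empty collections.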

LINQ to XML Query to find all child values where parent element name like X C#

So let's say I have an XElement that looks something like this
<Root>
<ProductOne>
<Size>Large</Size>
<Height>2</Height>
</ProductOne>
<ProductTwo>
<Size>Small</Size>
<Type>Bar</Type>
</ProductOne>
<ProductThree>
<Size>Small</Size>
<Type>Crate</Type>
<Color>Blue</Color>
</ProductOne>
<SomeOtherStuff>
<OtherThing>CrazyData</OtherThing>
</SomeOtherStuff>
</Root>
I want to query this data and get an IEnumerable<string> of the child values (i.e. Size, Type, Color, and a lot of other possible attributes) of anything that is in an element with the word "Product" in its name.
So My resulting list would look like
Large
2
Small
Bar
Small
Crate
Blue
Could someone tell me how to construct such a query using LINQ?
First, you have a lot of typos with your xml. Here is the correct version:
var xml = @"
<Root>
<ProductOne>
<Size>Large</Size>
<Height>2</Height>
</ProductOne>
<ProductTwo>
<Size>Small</Size>
<Type>Bar</Type>
</ProductTwo>
<ProductThree>
<Size>Small</Size>
<Type>Crate</Type>
<Color>Blue</Color>
</ProductThree>
<SomeOtherStuff>
<OtherThing>CrazyData</OtherThing>
</SomeOtherStuff>
</Root>";
Now, here is some linq magic you can do to get the values you want.
var list = XElement.Parse(xml) //parses the xml as an XElement
.Elements() //gets all elements under the "root" node
.Where(x => x.Name.LocalName.StartsWith("Product")) // only selects elements that
// start with "product"
.SelectMany(x => x.Elements()) // inside of each of the "product" nodes, select
// all the inner nodes and flatten the results
// into a single list
.Select(x => x.Value) //select the node's inner text
.ToList(); //to list (optional)
This will give you back your wanted list as a List<string>.
Large
2
Small
Bar
Small
Crate
Blue

Can't loop children of returned XML Nodes in C#

I have a large, messy XML file and I want to retrieve ALL elements of the same name ("Item" for the sake of this post) from it, then be able to retrieve data from each element's children.
So far I have returned a list of every element called "Item" using this code, which just displays the namespace url and "Item" in p tags:
XDocument doc = XDocument.Load(@"C:\inetpub\wwwroot\mysite\myxml.xml");
XNamespace ns = "http://www.mynamespace.com";
var nodes = doc.Descendants().Elements(ns + "Item").Select(d => d.Name).ToList();
foreach(var x in nodes){
<p>@x</p>
}
However, after amending the code as follows, I can't retrieve any data from its children and I get the error 'System.Xml.Linq.XName' does not contain a definition for 'Descendants':
foreach(var x in nodes){
<p>@x.Descendants().Element("Name")</p>
}
}
Here is a very basic version of my XML file:
<Item>
<Name>Item 1</Name>
<Type>Type 1</Type>
</Item>
I want to be able to search each 'Item' element for a 'Name' element and return the value. Can anyone see where I'm going wrong?
This is the problem:
.Select(d => d.Name)
You're explicitly selecting the names of the elements. If you want the actual elements (which I think you do), just get rid of that call:
var nodes = doc.Descendants().Elements(ns + "Item").ToList();
You could also get rid of the ToList() unless you need the query to be materialized eagerly.
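Once the query returns the elements themselves, the Name child can be read directly from each Item. The sketch below assumes the Name elements live in the same namespace as Item (which is the case when the namespace is declared as a default xmlns on an ancestor); if they do not, the lookup would return nothing. GetItemNames is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ItemNameSketch
{
    // Returns the text of the <Name> child of every <Item> in the document.
    public static List<string> GetItemNames(XDocument doc, XNamespace ns)
    {
        return doc.Descendants(ns + "Item")
                  // Casting an element to string gives its value, or null if the element is missing.
                  .Select(item => (string)item.Element(ns + "Name"))
                  .Where(name => name != null)
                  .ToList();
    }
}
```

In the Razor view from the question, the resulting strings can then be written out in the foreach loop in place of the element names.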

How do I programmatically determine if an XML node is a leaf node?

I have a big XML tree like the following:
<CategoryArray>
<Category Name="Antiques" ID="20081">
<Category Name="Antiquities" ID="37903">
<Category Name="The Americas" ID="37908" />
<Category Name="Byzantine" ID="162922" />
<Category Name="Celtic" ID="162923" />
<Category Name="Egyptian" ID="37905" />
...
I'd like to iterate through all nodes to populate a control and, when doing so, check: is this node a leaf of a parent node? What is the easiest way to do this?
A leaf node is one that has no children so you can simply perform a check if it has children. There are various ways of doing this depending on how you're loading the XML document. For example, you can use the HasChildNodes property.
if (myXmlNode.HasChildNodes)
{
    // is not a leaf (note: text content also counts as a child node)
}
else
{
    // is a leaf
}
The number of child nodes will give you the answer: 0 child nodes (or only a text child node, depending on the classes/queries you use) means it is a leaf.
E.g. this XElement sample from MSDN: Find a List of Child Elements
XDocument cpo = XDocument.Load("PurchaseOrders.xml");
XElement po = cpo.Root.Element("PurchaseOrder").Element("Address");
// list1 contains all child elements of the Address element, using a LINQ to XML query
IEnumerable<XElement> list1 = po.Elements();
// list2 contains the same child elements, using an XPath expression
IEnumerable<XElement> list2 = po.XPathSelectElements("./*");
I would first flatten the hierarchy, e.g. using the Flatten extension method from this post:
How do I select recursive nested entities using LINQ to Entity
And then something like this...
using (XmlReader reader = XmlReader.Create(new StringReader(this.XML)))
{
XElement xml = XElement.Load(reader);
var all = xml.Elements("Category").Flatten(x => x.Elements("Category"));
var leafs = from cat in all
where cat.Elements("Category").Any() == false
select cat;
// or go through all...
var categories =
from cat in all
select new
{
Name = (string)cat.Attribute("Name"),
ID = (string)cat.Attribute("ID"),
IsLeaf = cat.Elements("Category").Any() == false,
SubCount = cat.Elements("Category").Count(),
// Subs = cat.Elements("Category").Select(x => x.Attribute("Name").ToString()).ToArray(),
};
// or put into dictionary etc.
var hash = categories.ToDictionary(x => x.Name);
}
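Note that Flatten in the snippet above is not part of LINQ to XML; it comes from the linked post. If pulling in that helper is not desirable, the built-in Descendants method already visits every nested Category, and HasElements tells whether an element has child elements. A minimal sketch of that alternative, assuming that only nested Category elements (and not text) decide whether a node is a leaf; LeafNames is a hypothetical name:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LeafSketch
{
    // Returns the Name attribute of every Category that has no child elements.
    public static List<string> LeafNames(XElement root)
    {
        return root.Descendants("Category")
                   .Where(cat => !cat.HasElements)   // no child elements means leaf
                   .Select(cat => (string)cat.Attribute("Name"))
                   .ToList();
    }
}
```

The same !cat.HasElements check can be used inside a loop that populates the control, to decide per node whether it is a leaf.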
