Linq to XML, extracting attributes and elements - c#

I am new to XML and Linq to XML and I just can't find a good guide that explains how to work with it. I have a simple XML string structured as follows
<mainitem>
<items>
<itemdescription>ABC</itemdescription>
<item>
<itemtext>XXX</itemtext>
</item>
<item>
<itemtext>YYY</itemtext>
</item>
<item>
<itemtext>ZZZ</itemtext>
</item>
</items>
<overalldescription>ABCDEFG</overalldescription>
<itemnodes>
<node caption="XXX" image="XXX"></node>
<node caption="YYY" image="YYY"></node>
<node caption="ZZZ" image="ZZZ"></node>
</itemnodes>
</mainitem>
I am using C# code like
var Items = (from xElem in XMLCODEABOVE.Descendants("item")
select new ItemObject
{
ItemObjectStringProperty = xElem.Element("itemtext").Value,
}
);
to extract a list of the itemtext objects for use with my code. Where I need help is in extracting a list of the caption and image attributes of my node elements. I also need the overalldescription and the itemdescription. I have tried every variation of the above code, substituting Descendants for Elements, Element for Attribute, etc. I know this is probably a basic question, but there doesn't seem to be a straightforward guide out there that explains this to a beginner.

To get the captions
// IEnumerable<string>
var captions = from node in doc.Descendants("node")
select node.Attribute("caption").Value;
Or both the captions and image attributes in one shot:
// IEnumerable of the anonymous type
var captions = from node in doc.Descendants("node")
select new {
caption = node.Attribute("caption").Value,
image = node.Attribute("image").Value
};
For the descriptions:
// null ref risk if element doesn't exist
var itemDesc = doc.Descendants("itemdescription").FirstOrDefault().Value;
var overallDesc = doc.Descendants("overalldescription").FirstOrDefault().Value;
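Putting the pieces above together, a minimal self-contained sketch might look like the following. It assumes the XML from the question is held in a string; the names xml, doc and nodes are placeholders, not from the original post. It uses the explicit (string) conversion rather than .Value, so a missing element or attribute yields null instead of a NullReferenceException.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed copy of the XML from the question; `xml` and `doc` are illustrative names.
var xml = @"<mainitem>
  <items>
    <itemdescription>ABC</itemdescription>
    <item><itemtext>XXX</itemtext></item>
  </items>
  <overalldescription>ABCDEFG</overalldescription>
  <itemnodes>
    <node caption=""XXX"" image=""XXX""></node>
    <node caption=""YYY"" image=""YYY""></node>
  </itemnodes>
</mainitem>";
var doc = XDocument.Parse(xml);

// The (string) conversion returns null for a missing attribute instead of throwing.
var nodes = doc.Descendants("node")
    .Select(n => new
    {
        Caption = (string)n.Attribute("caption"),
        Image = (string)n.Attribute("image")
    })
    .ToList();

// FirstOrDefault() may return null; converting a null XElement to string yields null.
var itemDesc = (string)doc.Descendants("itemdescription").FirstOrDefault();
var overallDesc = (string)doc.Descendants("overalldescription").FirstOrDefault();

foreach (var n in nodes)
    Console.WriteLine($"{n.Caption} / {n.Image}");
Console.WriteLine(itemDesc);
Console.WriteLine(overallDesc);
```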

Related

Intersect 2 Xml Files with XDocument in C#

I have 2 XML files and I want to get all the XNodes that are in both files, based only on their shared attribute "id".
This is what one XML file looks like:
<parameters>
<item id="57">
<länge>8</länge>
<wert>0</wert>
</item>
<item id="4">
<länge>8</länge>
<wert>0</wert>
</item>
<item id="60">
<länge>8</länge>
<wert>0</wert>
</item>
</parameters>
Given a second XML File which looks like this:
<parameters>
<item id="57">
<länge>16</länge>
<wert>10</wert>
</item>
<item id="144">
<länge>16</länge>
<wert>10</wert>
</item>
</parameters>
Now I only want the XNode with the ID=57, since it is available in both files. So the output should look like this:
<item id="57">
<länge>8</länge>
<wert>0</wert>
</item>
I already did intersect both files like this:
aDoc = XDocument.Load(file);
bDoc = XDocument.Load(tmpFile);
intersectionOfFiles = aDoc.Descendants("item")
.Cast<XNode>()
.Intersect(bDoc.Descendants("item")
.Cast<XNode>(), new XNodeEqualityComparer());
This only seems to work when all the descendant nodes are the same. If some value is different, it won't work. But I need this to match on the attribute alone; the values and the descendants don't matter.
I also tried to get to the Attributes and intersect them, but this didn't work either:
intersectionOfFiles = tmpDoc
.Descendants(XName.Get("item"))
.Attributes()
.ToList()
.Intersect(fileDoc.Descendants(XName.Get("item")).Attributes()).ToList();
Am I missing something or is this a completely wrong approach?
Thanks in advance.
You should create your own IEqualityComparer that compares XML attributes you want:
public class EqualityComparerItem : IEqualityComparer<XElement>
{
public bool Equals(XElement x, XElement y)
{
return x.Attribute("id").Value == y.Attribute("id").Value;
}
public int GetHashCode(XElement obj)
{
return obj.Attribute("id").Value.GetHashCode();
}
}
which you would then pass to XML parsing code:
var intersectionOfFiles = aDoc.Root
.Elements("item")
.Intersect(
bDoc.Root
.Elements("item"), new EqualityComparerItem());
I also changed some parts of your XML parsing code (XElement instead of XNode since "item" is XML element and "id" is XML attribute).
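As an alternative to a custom comparer (a different technique, not the one from the answer above), the same "match on id only" idea can be sketched with a HashSet of ids. The documents below are trimmed copies of the question's XML, and idsInB and common are illustrative names. Unlike Intersect, this version does not remove duplicates from the first file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

var aDoc = XDocument.Parse(@"<parameters>
  <item id=""57""><wert>0</wert></item>
  <item id=""4""><wert>0</wert></item>
  <item id=""60""><wert>0</wert></item>
</parameters>");
var bDoc = XDocument.Parse(@"<parameters>
  <item id=""57""><wert>10</wert></item>
  <item id=""144""><wert>10</wert></item>
</parameters>");

// Collect every id that occurs in the second file (items without an id are skipped).
var idsInB = new HashSet<string>(
    bDoc.Root.Elements("item")
        .Where(e => e.Attribute("id") != null)
        .Select(e => e.Attribute("id").Value));

// Keep the items of the first file whose id also occurs in the second file.
var common = aDoc.Root.Elements("item")
    .Where(e => e.Attribute("id") != null && idsInB.Contains(e.Attribute("id").Value))
    .ToList();

foreach (var item in common)
    Console.WriteLine(item);
```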

How to get enclosure url with XElement C# Console

I read multiple feeds from many sources in a C# console application, and I have this code to load the XML from each source:
XmlDocument doc = new XmlDocument();
doc.Load(sourceURLX);
XElement xdoc = XElement.Load(sourceURLX);
How can I get the enclosure url and store it in a variable?
If I understand your question correctly (I'm making a big assumption here) - you want to select an attribute from the root (or 'enclosing') tag, named 'url'?
You can make use of XPath queries here. Consider the following XML:
<?xml version="1.0" encoding="utf-8"?>
<root url='google.com'>
<inner />
</root>
You could use the following code to retrieve 'google.com':
String query = "/root[1]/@url";
XmlDocument doc = new XmlDocument();
doc.Load(sourceURLX);
String value = doc.SelectSingleNode(query).InnerText;
Further information about XPath syntax can be found here.
Edit: As you stated in your comment, you are working with the following XML:
<item>
<description>
</description>
<enclosure url="blablabla.com/img.jpg" />
</item>
Therefore, you can retrieve the url using the following XPath query:
/item[1]/enclosure[1]/@url
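A minimal sketch of running such an attribute query, assuming the item XML above is loaded from a string instead of from sourceURLX. In XPath, an attribute is addressed with @, and SelectSingleNode returns null when nothing matches, hence the null-conditional access.

```csharp
using System;
using System.Xml;

var doc = new XmlDocument();
doc.LoadXml(@"<item>
  <description></description>
  <enclosure url=""blablabla.com/img.jpg"" />
</item>");

// The query selects the url attribute node; Value holds the attribute's text.
var url = doc.SelectSingleNode("/item[1]/enclosure[1]/@url")?.Value;
Console.WriteLine(url);
```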
With XML like the following
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>title</title>
<link>https://www.link.com</link>
<description>description</description>
<item>
<title>RSS</title>
<link>https://www.link.com/xml/xml_rss.asp</link>
<description>description</description>
<enclosure url="https://www.link.com/media/test.wmv"
length="10000"
type="video/wmv"/>
</item>
</channel>
</rss>
you can get the url by reading the attribute
var document = XDocument.Load(sourceURLX);
var url = document.Root
.Element("channel")
.Element("item")
.Element("enclosure")
.Attribute("url")
.Value;
To get multiple urls
var urls = document.Descendants("item")
.Select(item => item.Element("enclosure").Attribute("url").Value)
.ToList();
Using foreach loop
foreach (var item in document.Descendants("item"))
{
var title = item.Element("title").Value;
var link = item.Element("link").Value;
var description = item.Element("description").Value;
var url = item.Element("enclosure").Attribute("url").Value;
// save values to database
}
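Real feeds often contain items without an enclosure, in which case the chained .Element(...).Attribute(...).Value calls above throw a NullReferenceException. A hedged variant that skips such items could look like this; the sample feed and the example.com URLs are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var document = XDocument.Parse(@"<rss version=""2.0"">
  <channel>
    <item>
      <title>With media</title>
      <enclosure url=""https://example.com/media/a.wmv"" length=""10000"" type=""video/wmv"" />
    </item>
    <item>
      <title>Without media</title>
    </item>
  </channel>
</rss>");

// ?. stops at a missing <enclosure>; the (string) conversion maps a null attribute to null.
var urls = document.Descendants("item")
    .Select(item => (string)item.Element("enclosure")?.Attribute("url"))
    .Where(url => url != null)
    .ToList();

foreach (var url in urls)
    Console.WriteLine(url);
```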

How to correctly perform Linq-to-XML query?

I have an XDocument called currentIndex that looks like this:
<INDEX>
<SUBINDEX>
<!-- Many tag and infos -->
</SUBINDEX>
<ITEM>
<IDITEM>1</IDITEM>
</ITEM>
<ITEM>
<IDITEM>2</IDITEM>
</ITEM>
...
<ITEM>
<IDITEM>n</IDITEM>
</ITEM>
</INDEX>
I would like to create a new XDocument similar to the one above:
<INDEX>
<SUBINDEX>
<!-- Many tag and infos -->
</SUBINDEX>
<ITEM>
<IDITEM>2</IDITEM>
</ITEM>
</INDEX>
I want to do this in C#, I have tried starting in this way:
public void ParseItems(XDocument indexGenerale)
{
IEnumerable<XElement> items = from a in indexGenerale.Descendants(XName.Get("ITEM"))
// where a.Element("IDITEM").Equals("2")
select a;
foreach(var item in items) {
// do something
}
}
Now the problem: if the where clause is commented out, items contains n elements (one for each ITEM tag), but if I uncomment it, items is empty. Why does this happen, and how should I perform the search?
Use an explicit cast:
from a in indexGenerale.Descendants("ITEM")
where (string)a.Element("IDITEM") == "2"
select a;
a.Element("IDITEM") returns an XElement, which will never be equal to the string "2". Maybe you meant a.Element("IDITEM").Value.Equals("2"); that will also work, but the explicit cast is safer: it doesn't throw an exception if the element wasn't found.
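The difference can be checked with a small self-contained sketch. The sample document is a shortened version of the one in the question, with an extra ITEM that has no IDITEM to show the null-safety of the cast; the names index, matchesByEquals, matching and result are illustrative.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var index = XDocument.Parse(@"<INDEX>
  <SUBINDEX><!-- Many tags and infos --></SUBINDEX>
  <ITEM><IDITEM>1</IDITEM></ITEM>
  <ITEM><IDITEM>2</IDITEM></ITEM>
  <ITEM />
</INDEX>");

// Comparing the XElement object itself with a string is always false.
var matchesByEquals = index.Descendants("ITEM")
    .Count(a => a.Element("IDITEM")?.Equals("2") == true);

// The (string) conversion compares the element's text and tolerates a missing IDITEM.
var matching = index.Descendants("ITEM")
    .Where(a => (string)a.Element("IDITEM") == "2")
    .ToList();

// Elements that already have a parent are cloned when added to a new tree.
var result = new XDocument(
    new XElement("INDEX", index.Root.Elements("SUBINDEX"), matching));

Console.WriteLine(matchesByEquals);
Console.WriteLine(result);
```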

How to parse xml link tag href attribute using c#

This is the sample xml of a feed item
<item>
<pubDate>2013-12-11 10:28:55</pubDate>
<title>
SAG Awards Nominations: 12 Years a Slave, Breaking Bad lead the race
</title>
<link>
http://www.rottentomatoes.com/m/1929182/news/1929182/
</link>
<description>
<![CDATA[ ]]>
</description>
<atom:link rel="thumbnail" type="image/*" href="http://content6.flixster.com/movie/11/17/36/11173600_tmb.jpg"/>
</item>
c# code for parsing xml elements
List<XElement> elementsList = xmlItems.Descendants("item").ToList();
foreach (XElement rssItem in elementsList)
{
RSSItem rss = new RSSItem();
rss.Description1 = rssItem.Element("description").Value;
rss.Link1 = rssItem.Element("link").Value;
rss.Title1 = rssItem.Element("title").Value;
rss.ImageUrl= ;
}
I successfully parsed the XML elements except for the URL in the atom:link tag.
How can we parse the href attribute of the atom:link tag?
Link has a namespace, you need to indicate it when parsing the XML. I don't remember exactly what namespace atom is, but it should be indicated somewhere in the XML file (usually on the root node). For instance, if it is:
<feed xmlns:atom="http://www.w3.org/2005/Atom">
Then you need to parse it like this:
rss.Link1 = (string)rssItem.Element(XName.Get("link", "http://www.w3.org/2005/Atom")).Attribute("href");
You need to specify the namespace when you look for the element:
XNamespace atom = "http://www.w3.org/2005/Atom";
...
rss.Link1 = rssItem.Element(atom + "link").Attribute("href").Value;
LINQ to XML makes namespace handling much simpler than any other XML API I've seen, but you still need to be aware of it. (I'm surprised the other elements aren't in a namespace, to be honest.)
I'd also transform your foreach loop into a LINQ query:
var items = xmlItems.Descendants("item")
.Select(x => new RSSItem {
Description1 = x.Element("description").Value,
Link1 = x.Element(atom + "link").Attribute("href").Value,
Title1 = x.Element("title").Value,
...
})
.ToList();
Also consider using a cast to string instead of the Value property, if some of the elements may be missing - that will set the relevant property to null, instead of throwing a NullReferenceException.
EDIT: If the link element is missing, you can fix that with:
Link1 = (string) x.Elements(atom + "link").Attributes("href").FirstOrDefault()
That will find the first href attribute within an atom link element, or use null - and then the cast to string will just return null if there's no attribute. (That's part of the user-defined conversion from XAttribute to string.)
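A runnable sketch of the namespace-aware lookup, under the same assumption as both answers that the atom prefix is bound to http://www.w3.org/2005/Atom. The sample feed, the example.com URL and the anonymous type standing in for RSSItem are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Assumed namespace URI; check the xmlns:atom declaration in the real feed.
XNamespace atom = "http://www.w3.org/2005/Atom";

var xmlItems = XDocument.Parse(@"<rss xmlns:atom=""http://www.w3.org/2005/Atom"">
  <channel>
    <item>
      <title>With thumbnail</title>
      <atom:link rel=""thumbnail"" type=""image/*"" href=""http://example.com/thumb.jpg"" />
    </item>
    <item>
      <title>Without thumbnail</title>
    </item>
  </channel>
</rss>");

var items = xmlItems.Descendants("item")
    .Select(x => new
    {
        Title = (string)x.Element("title"),
        // Null when the item has no atom:link element or no href attribute.
        ImageUrl = (string)x.Elements(atom + "link").Attributes("href").FirstOrDefault()
    })
    .ToList();

foreach (var item in items)
    Console.WriteLine($"{item.Title}: {item.ImageUrl ?? "(none)"}");
```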

Convert XML to a list of something

I have an XML file that my PHP site returns as an API response.
This is the XML that comes back from my PHP application.
<xml>
<overzicht>
<item>
<sessieID>6</sessieID>
<onderwerp>Vrijwilligers, een uitstervend rasnn</onderwerp>
<omschrijving>Ode aan de vrijwilligers jjj</omschrijving>
<sprekerID>1</sprekerID>
<lokaalID>20</lokaalID>
<themaID>1</themaID>
<typeID>2</typeID>
<periodeID>2</periodeID>
<datum>2012-02-20</datum>
<maximaleInschrijvingen>1</maximaleInschrijvingen>
<spreker>
<sprekerID>1</sprekerID>
<sprekerNaam>Rik Torfs</sprekerNaam>
<loginID>13</loginID>
</spreker>
<lokaal>
<lokaalID>20</lokaalID>
<campusNaam>Malle</campusNaam>
<lokaalOpCampus>W10</lokaalOpCampus>
<typeID>2</typeID>
</lokaal>
</item>
<item>
<sessieID>15</sessieID>
<onderwerp>VPKB</onderwerp>
<omschrijving/>
<sprekerID>6</sprekerID>
<lokaalID>2</lokaalID>
<themaID>1</themaID>
<typeID>1</typeID>
<periodeID>2</periodeID>
<datum>2012-02-20</datum>
<maximaleInschrijvingen>50</maximaleInschrijvingen>
<spreker>
<sprekerID>6</sprekerID>
<sprekerNaam>Dick Wursten</sprekerNaam>
<loginID>18</loginID>
</spreker>
<lokaal>
<lokaalID>2</lokaalID>
<campusNaam>KHK Vorselaar</campusNaam>
<lokaalOpCampus>A102</lokaalOpCampus>
<typeID>1</typeID>
</lokaal>
</item>
...
</overzicht>
</xml>
This is my C# code. I want to get a list of Sessie.
XDocument xmlDoc = XDocument.Parse(e.Result);
List<Sessie> sessies =
(
from item in xmlDoc.Descendants("overzicht")
select new Sessie(
item.Element("onderwerp").Value,
Convert.ToInt32(item.Element("sessieID").Value),
item.Element("omschrijving").Value,
(Spreker)(
new Spreker(
Convert.ToInt32(item.Element("spreker").Element("sprekerID").Value),
item.Element("spreker").Element("sprekernaam").Value)
),
Convert.ToDateTime(item.Element("datum").Value),
Convert.ToInt32(item.Element("maximaleInschrijvingen").Value),
(Lokaal)(
new Lokaal(
Convert.ToInt32(item.Element("lokaal").Element("lokaalID").Value),
item.Element("lokaal").Element("campusNaam").Value,
item.Element("lokaal").Element("lokaalOpCampus").Value)
)
)
).ToList<Sessie>();
My code isn't working; it throws this exception:
"NullReferenceException"
There's one fairly obvious problem to start with. Look at the very start of your query:
from item in xmlDoc.Descendants("overzicht")
select new Sessie(item.Element("onderwerp").Value,
...
That will only work if there's an <onderwerp> directly under <overzicht>. There isn't - it's under the <item> element. Perhaps (given the range variable name) you meant:
from item in xmlDoc.Descendants("item")
select new Sessie(item.Element("onderwerp").Value,
...
The query
from item in xmlDoc.Descendants("overzicht")
will return a list of <overzicht> elements. item.Element("onderwerp") does not exist on those, because you are missing the <item> element in between.
Simple fix:
from item in xmlDoc.Descendants("item")
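A shortened, self-contained sketch of the corrected query follows. It uses anonymous types in place of the Sessie, Spreker and Lokaal classes, whose constructors are not shown in the question, and a trimmed copy of the XML. Note also that element names are case-sensitive: the question's code asks for "sprekernaam" while the XML contains <sprekerNaam>, so that lookup would still return null after the Descendants fix.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var xmlDoc = XDocument.Parse(@"<xml>
  <overzicht>
    <item>
      <sessieID>6</sessieID>
      <onderwerp>Vrijwilligers</onderwerp>
      <datum>2012-02-20</datum>
      <spreker>
        <sprekerID>1</sprekerID>
        <sprekerNaam>Rik Torfs</sprekerNaam>
      </spreker>
    </item>
    <item>
      <sessieID>15</sessieID>
      <onderwerp>VPKB</onderwerp>
      <datum>2012-02-20</datum>
      <spreker>
        <sprekerID>6</sprekerID>
        <sprekerNaam>Dick Wursten</sprekerNaam>
      </spreker>
    </item>
  </overzicht>
</xml>");

// Iterate over the <item> elements, not over <overzicht>.
var sessies = (from item in xmlDoc.Descendants("item")
               select new
               {
                   SessieId = (int)item.Element("sessieID"),
                   Onderwerp = (string)item.Element("onderwerp"),
                   Datum = (DateTime)item.Element("datum"),
                   // Element names are case-sensitive: sprekerNaam, not sprekernaam.
                   SprekerNaam = (string)item.Element("spreker").Element("sprekerNaam")
               }).ToList();

foreach (var s in sessies)
    Console.WriteLine($"{s.SessieId}: {s.Onderwerp} ({s.SprekerNaam})");
```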
