Linq query of XML attributes - c#

So I'm trying to write a simple query that grabs all values of a certain attribute from an XML file, but nothing seems to work. I've been able to do this with several other XML files, but for some reason the one I'm working with here just won't cooperate. Any suggestions or advice would be hugely appreciated.
Here's what the XML looks like.
<Doc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="Name" xsi:schemaLocation="[there's a link here]" Name="Name">
<Wrapper>
<Box_Collection>
<Box name="Test A" test="Test B"/>
<Box name="Test C" test="Test D"/>
<Box name="Test E" test="Test F"/>
</Box_Collection>
</Wrapper>
</Doc>
Here's my C# code:
XDocument customers = XDocument.Load(@"C:\Users\folder\file.xml");
IEnumerable<string> names =
from c in customers.Descendants("Box").Attributes("name")
select c.Value;
string nameList = "Names:";
foreach (string c in names)
{
nameList += " " + c;
}
textBox.AppendText(nameList);

The reason is that your XML has default namespace declared at the root element :
xmlns="Name"
XML elements inherit the default namespace of their ancestors unless otherwise specified (for example, by using an explicit prefix that points to a different namespace URI). You can use XNamespace + the element's local name to refer to an element in a namespace:
XNamespace ns = "Name";
IEnumerable<string> names =
from c in customers.Descendants(ns+"Box").Attributes("name")
select c.Value;

Your document has a default namespace of "Name". You need to reference the namespace when selecting a node like so:
IEnumerable<string> names =
from c in customers.Descendants(XName.Get("Box", "Name")).Attributes("name")
select c.Value;
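Putting the two answers together, here is a minimal sketch that can be run without the file on disk. It assumes a trimmed inline copy of the question's XML (the schemaLocation attribute is left out) and reads the default namespace from the root element instead of hard-coding it; the variable names are only for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed inline copy of the question's XML (assumption: schemaLocation omitted).
string xml = @"<Doc xmlns=""Name"" Name=""Name"">
  <Wrapper>
    <Box_Collection>
      <Box name=""Test A"" test=""Test B""/>
      <Box name=""Test C"" test=""Test D""/>
      <Box name=""Test E"" test=""Test F""/>
    </Box_Collection>
  </Wrapper>
</Doc>";

XDocument customers = XDocument.Parse(xml);

// Without the namespace nothing matches, because the elements are really named {Name}Box.
int unqualifiedCount = customers.Descendants("Box").Count();

// Read the default namespace from the root rather than typing it in by hand.
XNamespace ns = customers.Root.GetDefaultNamespace();

// Unprefixed attributes are in no namespace, so "name" stays unqualified.
var names = customers.Descendants(ns + "Box")
                     .Attributes("name")
                     .Select(a => a.Value)
                     .ToList();

Console.WriteLine(unqualifiedCount);                     // 0
Console.WriteLine("Names: " + string.Join(" ", names));  // Names: Test A Test C Test E
```

Only the element name needs the namespace here; the attribute lookup is unchanged from the question.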

Related

Extracting XML Child Elements Where the Parents are in a Defaulted Namespace

I have the XML below and I've been trying to extract the FirstName, LastName and OtherName for a while now, but I keep running into all sorts of problems.
<OmdCds xmlns="cds"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:cdsd="cds_dt"
xsi:schemaLocation="cds ontariomd_cds.xsd">
<PatientRecord>
<Demographics>
<Names>
<cdsd:LegalName namePurpose="L">
<cdsd:FirstName>
<cdsd:Part>SARAH</cdsd:Part>
<cdsd:PartType>GIV</cdsd:PartType>
<cdsd:PartQualifier>BR</cdsd:PartQualifier>
</cdsd:FirstName>
<cdsd:LastName>
<cdsd:Part>GOMEZ</cdsd:Part>
<cdsd:PartType>FAMC</cdsd:PartType>
<cdsd:PartQualifier>BR</cdsd:PartQualifier>
</cdsd:LastName>
<cdsd:OtherName>
<cdsd:Part>GABRIELA</cdsd:Part>
<cdsd:PartType>GIV</cdsd:PartType>
<cdsd:PartQualifier>BR</cdsd:PartQualifier>
I'm currently trying to extract the data with the C# code below, but I still can't get at it. I'm getting a NullReferenceException.
XmlDocument doc = new XmlDocument();
doc.Load(folder + "\\" + o.ToString());
XmlNamespaceManager namespaceManager = new XmlNamespaceManager(doc.NameTable);
namespaceManager.AddNamespace("cdsd", "http://www.w3.org/2001/XMLSchema-instance");
XmlNode firstName = doc.DocumentElement.SelectSingleNode("/PatientRecord/Demographics/Names/cdsd:LegalName/cdsd:FirstName/cdsd:Part", namespaceManager);
string fName = firstName.InnerText;
MessageBox.Show(fName);
In the Locals/Watch window I can see all the InnerXml and InnerText under doc.DocumentElement. The InnerXml looks something like this...
<PatientRecord xmlns=\"cds\"><Demographics><Names><cdsd:LegalName namePurpose=\"L\" xmlns:cdsd=\"cds_dt\"><cdsd:FirstName><cdsd:Part>SARAH</cdsd:Part><cdsd:PartType>GIV</cdsd:PartType><cdsd:PartQualifier>BR</cdsd:PartQualifier></cdsd:FirstName>
You have 3 namespace definitions in the document:
cds - as a default namespace
http://www.w3.org/2001/XMLSchema-instance - with the xsi prefix
cds_dt - with the cdsd prefix
I'm surprised you don't get an error message, because cds and cds_dt are not absolute URIs, and namespace names are meant to be URIs.
To understand an element name, you need to replace the prefix with the actual namespace.
<PatientRecord> reads as {cds}:PatientRecord
<cdsd:LegalName> reads as {cds_dt}:LegalName
Now in XPath 1.0 the same happens with registered namespaces. But XPath does not have a default namespace, so unprefixed names in an expression are not expanded with one and only match elements that are in no namespace.
You need to register namespace prefixes on the namespace manager. The prefix does not need to be the same as in the document.
namespaceManager.AddNamespace("cdsd", "cds_dt");
namespaceManager.AddNamespace("cds", "cds");
Now you can use the registered namespaces in XPath:
doc.DocumentElement.SelectSingleNode(
"cds:PatientRecord/cds:Demographics/cds:Names/cdsd:LegalName/cdsd:FirstName/cdsd:Part",
namespaceManager
);
If the first character of an XPath expression is a slash the expression is relative to the document, otherwise to the current context node. You call SelectSingleNode() on the doc.DocumentElement - the OmdCds element node. PatientRecord is a child node, so you can start with it or use . for the current context node.
PatientRecord, Demographics and Names are in the cds namespace because of the default namespace declaration on the OmdCds element (xmlns="cds"). The others use the cdsd prefix, which is bound to the cds_dt namespace, not to the xsi one. You'll have to register both namespaces and use them in the XPath (with no leading slash, because the query runs from the OmdCds element):
namespaceManager.AddNamespace("cdsd", "cds_dt");
namespaceManager.AddNamespace("cds", "cds");
XmlNode firstName = doc.DocumentElement.SelectSingleNode(
"cds:PatientRecord/cds:Demographics/cds:Names/cdsd:LegalName/cdsd:FirstName/cdsd:Part",
namespaceManager);
BTW, you're getting a NullReferenceException because you're making the false assumption that your query will always return a node. You are now seeing what happens when it does not return a node. Always check for null whenever it's possible that a query returns no value.
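As a rough end-to-end sketch of the points above (register both namespaces, use a relative path from the document element, and check for null), the following uses a shortened inline copy of the question's XML. The "(not found)" fallback text and the variable names are assumptions for illustration only.

```csharp
using System;
using System.Xml;

// Shortened inline copy of the question's document (assumption: only FirstName/Part kept).
string xml = @"<OmdCds xmlns=""cds"" xmlns:cdsd=""cds_dt"">
  <PatientRecord>
    <Demographics>
      <Names>
        <cdsd:LegalName namePurpose=""L"">
          <cdsd:FirstName>
            <cdsd:Part>SARAH</cdsd:Part>
          </cdsd:FirstName>
        </cdsd:LegalName>
      </Names>
    </Demographics>
  </PatientRecord>
</OmdCds>";

var doc = new XmlDocument();
doc.LoadXml(xml);

// The prefixes registered here are local to the XPath query;
// only the namespace URIs have to match the document.
var nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("cds", "cds");
nsmgr.AddNamespace("cdsd", "cds_dt");

// Relative path: the context node is the OmdCds document element.
XmlNode firstName = doc.DocumentElement.SelectSingleNode(
    "cds:PatientRecord/cds:Demographics/cds:Names/cdsd:LegalName/cdsd:FirstName/cdsd:Part",
    nsmgr);

// An unprefixed name only matches elements in no namespace, so this finds nothing.
XmlNode missing = doc.DocumentElement.SelectSingleNode("PatientRecord", nsmgr);

// Guard against null instead of assuming the query always returns a node.
string fName = firstName?.InnerText ?? "(not found)";
Console.WriteLine(fName);            // SARAH
Console.WriteLine(missing == null);  // True
```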
Instead of the XmlDocument class you can use LINQ to XML, which is easy. You need a using directive for the System.Xml.Linq namespace, for example:
XDocument xdoc = XDocument.Load("path");
// Select the FirstName elements themselves; their children are read in the loop below.
IEnumerable<XElement> nodes = from p in xdoc.Descendants()
where p.Name.LocalName == "FirstName"
select p;
foreach (XElement nodeFirstName in nodes)
{
foreach (XElement parts in nodeFirstName.Elements())
{
string strExtracted = parts.Name.LocalName + " " + parts.Value;
}
}
The LocalName property is used because the elements have the prefix "cdsd", so their full names include a namespace.
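If you would rather match the namespace exactly than compare local names, a small variation is sketched below. It assumes a shortened inline copy of the question's XML and that the cdsd prefix is bound to cds_dt, as in the question's root element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Shortened inline copy of the question's document (assumption: only FirstName kept).
string xml = @"<OmdCds xmlns=""cds"" xmlns:cdsd=""cds_dt"">
  <PatientRecord><Demographics><Names>
    <cdsd:LegalName namePurpose=""L"">
      <cdsd:FirstName>
        <cdsd:Part>SARAH</cdsd:Part>
        <cdsd:PartType>GIV</cdsd:PartType>
        <cdsd:PartQualifier>BR</cdsd:PartQualifier>
      </cdsd:FirstName>
    </cdsd:LegalName>
  </Names></Demographics></PatientRecord>
</OmdCds>";

// The namespace URI the cdsd prefix is bound to in the document.
XNamespace cdsd = "cds_dt";

// Find every cdsd:FirstName element, then list its child elements.
var parts = XDocument.Parse(xml)
    .Descendants(cdsd + "FirstName")
    .Elements()
    .Select(e => e.Name.LocalName + " " + e.Value)
    .ToList();

foreach (string p in parts)
    Console.WriteLine(p);
// Part SARAH
// PartType GIV
// PartQualifier BR
```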

How to parse xml link tag href attribute using c#

This is the sample xml of a feed item
<item>
<pubDate>2013-12-11 10:28:55</pubDate>
<title>
SAG Awards Nominations: 12 Years a Slave, Breaking Bad lead the race
</title>
<link>
http://www.rottentomatoes.com/m/1929182/news/1929182/
</link>
<description>
<![CDATA[ ]]>
</description>
<atom:link rel="thumbnail" type="image/*" href="http://content6.flixster.com/movie/11/17/36/11173600_tmb.jpg"/>
</item>
c# code for parsing xml elements
List<XElement> elementsList = xmlItems.Descendants("item").ToList();
foreach (XElement rssItem in elementsList)
{
RSSItem rss = new RSSItem();
rss.Description1 = rssItem.Element("description").Value;
rss.Link1 = rssItem.Element("link").Value;
rss.Title1 = rssItem.Element("title").Value;
rss.ImageUrl= ;
}
I successfully parsed the XML elements except for the URL in the atom:link tag.
How can I read the href attribute from the atom:link tag?
Link has a namespace, you need to indicate it when parsing the XML. I don't remember exactly what namespace atom is, but it should be indicated somewhere in the XML file (usually on the root node). For instance, if it is:
<feed xmlns:atom="http://www.w3.org/2005/Atom">
Then you need to parse it like this:
rss.Link1 = (string)rssItem.Element(XName.Get("link", "http://www.w3.org/2005/Atom")).Attribute("href");
You need to specify the namespace when you look for the element:
XNamespace atom = "http://www.w3.org/2005/Atom";
...
rss.Link1 = rssItem.Element(atom + "link").Attribute("href").Value;
LINQ to XML makes namespace handling much simpler than any other XML API I've seen, but you still need to be aware of it. (I'm surprised the other elements aren't in a namespace, to be honest.)
I'd also transform your foreach loop into a LINQ query:
var items = xmlItems.Descendants("item")
.Select(x => new RSSItem {
Description1 = x.Element("description").Value,
Link1 = x.Element(atom + "link").Attribute("href").Value,
Title1 = x.Element("title").Value,
...
})
.ToList();
Also consider using a cast to string instead of the Value property, if some of the elements may be missing - that will set the relevant property to null, instead of throwing a NullReferenceException.
EDIT: If the link element is missing, you can fix that with:
Link1 = (string) x.Elements(atom + "link").Attributes("href").FirstOrDefault()
That will find the first href attribute within an atom link element, or use null - and then the cast to string will just return null if there's no attribute. (That's part of the user-defined conversion from XAttribute to string.)
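To see the null-safe cast in action, here is a small sketch with one item that has an atom:link and one that does not. The feed content, the example.com URL and the anonymous type are made up for illustration; only the atom namespace URI comes from the answer above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

XNamespace atom = "http://www.w3.org/2005/Atom";

// Made-up feed: the second item has no atom:link element.
string xml = @"<rss xmlns:atom=""http://www.w3.org/2005/Atom""><channel>
  <item>
    <title>With thumbnail</title>
    <atom:link rel=""thumbnail"" href=""http://example.com/a.jpg""/>
  </item>
  <item>
    <title>Without thumbnail</title>
  </item>
</channel></rss>";

var items = XDocument.Parse(xml).Descendants("item")
    .Select(x => new
    {
        // Casting a missing element or attribute to string yields null instead of throwing.
        Title = (string)x.Element("title"),
        ImageUrl = (string)x.Elements(atom + "link").Attributes("href").FirstOrDefault()
    })
    .ToList();

foreach (var item in items)
    Console.WriteLine($"{item.Title}: {item.ImageUrl ?? "(none)"}");
// With thumbnail: http://example.com/a.jpg
// Without thumbnail: (none)
```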

LINQ and XmlNodes elements

I am trying to return the attribute values from this XML, which is a collection of XmlNodes returned by a SharePoint web method.
XML Data
<Lists xmlns="http://schemas.microsoft.com/sharepoint/soap/">
<List DocTemplateUrl="" DefaultViewUrl="/Lists/Announcements/AllItems.aspx" MobileDefaultViewUrl="" ID="{E6172717-EB95-4845-B8CB-8161832565C6}" Title="Announcements" Description="Use the Announcements list to post messages on the home page of your site." ImageUrl="/_layouts/images/itann.gif" Name="{E6172717-EB95-4845-B8CB-8161832565C6}" BaseType="0" FeatureId="00bfea71-d1ce-42de-9c63-a44004ce0104" />
<List DocTemplateUrl="" DefaultViewUrl="/Lists/Calendar/calendar.aspx" MobileDefaultViewUrl="" ID="{C0735477-BE48-4DDF-9D93-3E1F8E993CEC}" Title="Calendar" Description="Use the Calendar list to keep informed of upcoming meetings, deadlines, and other important events." ImageUrl="/_layouts/images/itevent.gif" Name="{C0735477-BE48-4DDF-9D93-3E1F8E993CEC}" BaseType="0" FeatureId="00bfea71-ec85-4903-972d-ebe475780106" />
///... Several more like this
</Lists>
I have been following a few different guides, most recently this one on DiC, and I've managed to get the examples to work.
public List<Dictionary<string, XmlAttribute>> GetListData(XmlNode collection)
{
#region Test
string nodeInput = Convert.ToString(collection.OuterXml);
TextReader sr = new StringReader(nodeInput);
//from <List> node, decendant of <Lists>
var lists = (from list in XElement.Load(sr).Descendants("List")
//where the baseType element value equals 0
where int.Parse(list.Element("BaseType").Value) == 0
//Output the titles values to a list
select list.Element("Title").Value).ToList();
}
#endregion
I've been trying to adapt a few of the examples to my data to get a better idea of how it works, but this query does not return any results, contrary to what I expected. I've written a comment beside each line saying what I thought the command was doing. Could someone point out my mistake?
Solution
Very easy to find once I knew namespace was the issue.
http://msdn.microsoft.com/en-us/library/bb669152.aspx
C# unlike VB requires the namespace even when the nodes aren't prefixed by it.
So I needed an XNamespace
XNamespace nameSpace = "http://schemas.microsoft.com/sharepoint/soap/";
XElement node = XElement.Parse(nodeInput);
var lists = from list in node.Descendants(nameSpace + "List")
select list;
foreach (var list in lists)
{
var doc = list.Document;
}
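Your code should be as follows. BaseType and Title are attributes, not child elements, and unprefixed attributes are not in the default namespace, so only the element name takes ns:
XNamespace ns = "http://schemas.microsoft.com/sharepoint/soap/";
var lists = (from list in XElement.Parse(nodeInput).Descendants(ns + "List")
where (int)list.Attribute("BaseType") == 0
select (string)list.Attribute("Title")).ToList();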
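One detail that is easy to get wrong here: a default namespace applies to elements, but unprefixed attributes stay in no namespace. The sketch below checks this on a cut-down version of the SharePoint response; the "Documents" entry with BaseType 1 is made up so the filter has something to exclude.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

XNamespace ns = "http://schemas.microsoft.com/sharepoint/soap/";

// Cut-down response; the Documents entry is invented for illustration.
string xml = @"<Lists xmlns=""http://schemas.microsoft.com/sharepoint/soap/"">
  <List Title=""Announcements"" BaseType=""0"" />
  <List Title=""Documents"" BaseType=""1"" />
  <List Title=""Calendar"" BaseType=""0"" />
</Lists>";

XElement root = XElement.Parse(xml);
XElement first = root.Elements(ns + "List").First();

// The element name needs the namespace, the attribute name does not.
bool qualifiedFound = first.Attribute(ns + "Title") != null;  // false
bool unqualifiedFound = first.Attribute("Title") != null;     // true

var titles = root.Descendants(ns + "List")
    // (int?) returns null rather than throwing if BaseType is absent.
    .Where(l => (int?)l.Attribute("BaseType") == 0)
    .Select(l => (string)l.Attribute("Title"))
    .ToList();

Console.WriteLine(string.Join(", ", titles)); // Announcements, Calendar
```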

A simple question about LINQ to XML

<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
I am trying to practice LINQ to XML but I can't figure out how to get what I want. Simply put, how can I query the table elements that are in the h or f namespace?
This is what I tried. I also tried other variations, but none of them worked.
var query = from item in XDocument.Parse(xml).Elements(ns + "table")
select item;
This won't work because you're missing the root element from your query. This would work:
XNamespace ns = "http://www.w3schools.com/furniture";
var query = XDocument.Parse(xml).Element("root").Elements(ns + "table");
Now if the problem is that you want to find all "table" elements regardless of the namespace, you'd need something like this:
var query = XDocument.Parse(xml)
.Element("root")
.Elements()
.Where(element => element.Name.LocalName == "table");
(EDIT: As noted, you could use XDocument.Root to get to the root element if you want to. The important point is that trying to get to the table element directly from the document node itself won't work.)
Namespace prefixes are not guaranteed to be a particular letter or string. The best approach would be to search by the qualified namespace.
This would get all direct child nodes of XElement xml where the namespace is uri:namespace...
var selectedByNamespace = from element in xml.Elements()
where element.Name.NamespaceName == "uri:namespace"
select element;
Another option would be to select the elements based on the fully qualified name.
var ns = "{uri:namespace}";
var selectedElements = xml.Elements(ns + "table");
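For the XML in this question, the options above might be combined roughly as follows. This is a sketch that parses the question's document from an inline string; the variable names are only illustrative.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

string xml = @"<root xmlns:h=""http://www.w3.org/TR/html4/"" xmlns:f=""http://www.w3schools.com/furniture"">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table><f:name>African Coffee Table</f:name><f:width>80</f:width><f:length>120</f:length></f:table>
</root>";

// Match on the namespace URIs; the h and f prefixes themselves carry no meaning.
XNamespace h = "http://www.w3.org/TR/html4/";
XNamespace f = "http://www.w3schools.com/furniture";
XElement root = XDocument.Parse(xml).Root;

// Every table element, whatever its namespace.
var anyTables = root.Elements().Where(e => e.Name.LocalName == "table").ToList();

// Only the HTML table: collect its cell values.
var cells = root.Elements(h + "table").Descendants(h + "td").Select(td => td.Value).ToList();

// Only the furniture table: read its name.
string name = (string)root.Elements(f + "table").Single().Element(f + "name");

Console.WriteLine(anyTables.Count);           // 2
Console.WriteLine(string.Join(", ", cells));  // Apples, Bananas
Console.WriteLine(name);                      // African Coffee Table
```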

XML Element and Namespace

I have the following method to parse XMLElements:
DisplayMessages(XElement root)
{
var items = root.Descendants("Item");
foreach (var item in items)
{
var name = item.Element("Name");
....
}
}
In debug mode, I can see the root as XML like this:
<ItemInfoList>
<ItemInfo>
<Item>
<a:Name>item 1</a:Name>
...
<Item>
...
and var name is null (I expected to get "item 1"). I tried to use "a:Name" but it caused an exception ("character : cannot be used in name"). I am not sure whether I have to set a namespace in the root XElement or not. All the XML nodes under root should be in the same namespace.
I am new to XElement. In my code, item.Element("Name") will get the value of its child node "Name", is that right?
You need to use element names that include namespace. Try this:
static void DisplayMessages(XElement root)
{
var items = root.Descendants(root.GetDefaultNamespace() + "Item");
foreach (var item in items)
{
var name = item.Element(item.GetNamespaceOfPrefix("a") + "Name");
Console.WriteLine(name.Value);
}
}
Note that operator + is overloaded for XNamespace class in order to make code shorter: XNamespace.Addition Operator.
You do need to define the "a" namespace in the root element:
<Root xmlns:a="http:///someuri.com">
...
</Root>
Then you can select an element in a non-default namespace using this syntax in LINQ to XML:
XNamespace a = "http:///someuri.com"; // must match declaration in document
...
var name = item.Element(a + "Name");
EDIT:
To retrieve the default namespace:
XNamespace defaultNamespace = document.Root.GetDefaultNamespace();
// XNamespace.None is returned when default namespace is not explicitly declared
To find other namespace declarations:
var declarations = root.Attributes().Where(a => a.IsNamespaceDeclaration);
Note that namespaces can be declared on any element, so you would need to search all elements in a document recursively to find every namespace declaration. In practice they are usually declared on the root element, so if you control how the XML is generated this won't be an issue.
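A sketch of that recursive search, together with the prefix lookup from the first answer, could look like this. The document and the urn:example:... namespace URIs are invented for illustration, since the question does not show the real ones.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Invented document: the "a" prefix is declared on Item, not on the root.
string xml = @"<ItemInfoList xmlns=""urn:example:items"">
  <ItemInfo>
    <Item xmlns:a=""urn:example:a"">
      <a:Name>item 1</a:Name>
    </Item>
  </ItemInfo>
</ItemInfoList>";

XElement root = XElement.Parse(xml);

// Walk every element, not just the root, and keep the xmlns / xmlns:prefix attributes.
var declarations = root.DescendantsAndSelf()
    .Attributes()
    .Where(attr => attr.IsNamespaceDeclaration)
    .Select(attr => new
    {
        // A default declaration (xmlns="...") is an attribute named "xmlns" in no namespace.
        Prefix = attr.Name.Namespace == XNamespace.None ? "(default)" : attr.Name.LocalName,
        Uri = attr.Value
    })
    .ToList();

foreach (var d in declarations)
    Console.WriteLine($"{d.Prefix} = {d.Uri}");
// (default) = urn:example:items
// a = urn:example:a

// Resolve the prefix from the element where it is in scope, then read the value.
XElement item = root.Descendants(root.GetDefaultNamespace() + "Item").First();
XNamespace a = item.GetNamespaceOfPrefix("a");
string name = (string)item.Element(a + "Name");
Console.WriteLine(name); // item 1
```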
You need to create XNames that have a non-null Namespace. To do so, you have to create an XNamespace, and add the element name, see Creating an XName in a Namespace.
If you work with XML data that contains namespaces, you need to declare these namespaces in your code. (That's a general observation I've made, even though it makes it harder to "just have a look" at data you don't know.)
You need to declare an XNamespace for your XElements, as in these MSDN samples: Element(), XName
