C# linq-to-xml, Getting a list with nodes? - c#

This is the test XML that I am using:
<categories>
<category id="1" name="Test1">
<category id="2" name="Test2">
<misc id="1"></misc>
</category>
</category>
<category id="3" name="Test3">
<misc id="2"></misc>
</category>
</categories>
Now I want to bind that to an ASPX TreeView. I want only the elements named category, and I want their names to appear in the TreeView.
It's easy to get the IDs and names:
var d = from t in data.Descendants("category")
select new { ID = t.Attribute("id").Value, Name = t.Attribute("name").Value };
But how do I keep the structure in the TreeView?
It should look like this:
Test1
-> Test2
Test3

Maybe something like this if I understand you correctly? (I have not tested it)
class Category
{
public string ID { get; set; }
public string Name { get; set; }
public IEnumerable<Category> SubCategories { get; set; }
}
IEnumerable<Category> CategoryTreeStructure(XElement e)
{
var d = from t in e.Elements("category")
select new Category()
{
ID = t.Attribute("id").Value,
Name = t.Attribute("name").Value,
SubCategories = CategoryTreeStructure(t)
};
return d;
}
Call it with:
var structure = CategoryTreeStructure(doc.Root);
"i want only the elements that have the name category": I am not sure what you mean here. If you only want to select the elements that have a "name" attribute, this should work:
...
var d = from t in e.Elements("category")
where t.Attribute("name") != null
select new Category()
...
I understand that the snippet above (the "name" attribute filter) is not what you wanted, but I will leave it there. I have tested the code against:
XDocument doc = XDocument.Parse(@"<categories>
<category id=""1"" name=""Test1"">
<category id=""2"" name=""Test2"">
<misc id=""1""></misc>
</category>
</category>
<category id=""3"" name=""Test3"">
<misc id=""2""></misc>
</category>
</categories>");
var structure = CategoryTreeStructure(doc.Root);
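To check that the nesting really survives, here is a minimal sketch that renders the result as indented text. The Render and RenderSample helpers are hypothetical additions for illustration, and the attribute casts replace .Value only so that a missing attribute yields null instead of an exception.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public class Category
{
    public string ID { get; set; }
    public string Name { get; set; }
    public IEnumerable<Category> SubCategories { get; set; }
}

public static class CategoryTreeDemo
{
    // Same recursive projection as above; Elements() only looks at direct children,
    // which is what preserves the parent/child structure.
    public static IEnumerable<Category> CategoryTreeStructure(XElement e)
    {
        return from t in e.Elements("category")
               select new Category
               {
                   ID = (string)t.Attribute("id"),
                   Name = (string)t.Attribute("name"),
                   SubCategories = CategoryTreeStructure(t)
               };
    }

    // Hypothetical helper: one name per line, two spaces of indent per level.
    public static string Render(IEnumerable<Category> categories, int depth = 0)
    {
        var sb = new StringBuilder();
        foreach (var c in categories)
        {
            sb.Append(new string(' ', depth * 2)).Append(c.Name).Append('\n');
            sb.Append(Render(c.SubCategories, depth + 1));
        }
        return sb.ToString();
    }

    // Hypothetical helper: runs the sample XML from the question through both methods.
    public static string RenderSample()
    {
        var doc = XDocument.Parse(@"<categories>
  <category id='1' name='Test1'>
    <category id='2' name='Test2'><misc id='1'></misc></category>
  </category>
  <category id='3' name='Test3'><misc id='2'></misc></category>
</categories>");
        return Render(CategoryTreeStructure(doc.Root));
    }
}
```

The same recursion could build tree nodes instead of text; only the Render step would change.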

Actually, I have found this link which does exactly what you are asking for :) And it is without LINQ, so I thought it deserved another answer.
http://www.15seconds.com/issue/041117.htm

Related

Linq to XML casting XML file to custom object

I'm using Linq to XML to read in an XML file and as part of this I'd like to create an object. My object looks like this:
public class Address
{
public string AccountRef { get; set; }
public string AddressLine1 { get; set; }
public string AddressLine2 { get; set; }
// more stuff here
}
And my XML file looks like this:
<rows>
<row>
<FIELD NAME="AccountRef">1234</FIELD>
<FIELD NAME="AddressLine1">My Address Line 1</FIELD>
<FIELD NAME="AddressLine2">My Address Line 2</FIELD>
</row>
<row>
<FIELD NAME="AccountRef">5678</FIELD>
<FIELD NAME="AddressLine1">My Address Line 3</FIELD>
<FIELD NAME="AddressLine2">My Address Line 4</FIELD>
</row>
</rows>
In terms of code, I've tried various things, but at present I have the following which returns the correct number of rows in the format:
<row><FIELD NAME="AccountRef">1234</FIELD><FIELD>...rest of data</row>
<row><FIELD NAME="AccountRef">5678</FIELD><FIELD>...rest of data</row>
The code that does this is:
var results = (from d in document.Descendants("row")
select d).ToList();
So basically what I'm trying to do is something like:
var results = (from d in document.Descendants("row")
select new Address
{
AccountRef = d.Attribute("AccountRef").Value,
AddressLine1 = d.Attribute("AddressLine1").Value
}).ToList();
Obviously because my nodes are the same (FIELD NAME) that won't work, so does anyone have an idea how I can achieve this?
You need to retrieve the field names and values before creating the objects:
var results = document.Descendants("row")
.Select(row=>row.Elements("FIELD").ToDictionary(x=>x.Attribute("NAME").Value, x=>x.Value))
.Select(d=>new Address
{
AccountRef = d["AccountRef"],
AddressLine1 = d["AddressLine1"],
AddressLine2 = d["AddressLine2"],
});
check demo
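The indexer d["AddressLine2"] throws a KeyNotFoundException when a row lacks that FIELD. If rows can be incomplete, a variant along these lines may be safer. This is a sketch, not part of the original answer: AddressReader, Read and GetOrNull are hypothetical names, and it still assumes that NAME values are unique within a row (ToDictionary throws on duplicate keys).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Address
{
    public string AccountRef { get; set; }
    public string AddressLine1 { get; set; }
    public string AddressLine2 { get; set; }
}

public static class AddressReader
{
    // Hypothetical helper: returns null instead of throwing when a FIELD is absent.
    private static string GetOrNull(IDictionary<string, string> fields, string name)
    {
        string value;
        return fields.TryGetValue(name, out value) ? value : null;
    }

    public static List<Address> Read(XDocument document)
    {
        return document.Descendants("row")
            // Build a NAME -> text lookup per row, skipping FIELDs without a NAME.
            .Select(row => row.Elements("FIELD")
                .Where(f => f.Attribute("NAME") != null)
                .ToDictionary(f => (string)f.Attribute("NAME"), f => f.Value))
            .Select(d => new Address
            {
                AccountRef = GetOrNull(d, "AccountRef"),
                AddressLine1 = GetOrNull(d, "AddressLine1"),
                AddressLine2 = GetOrNull(d, "AddressLine2")
            })
            .ToList();
    }
}
```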

Combining XML Nodes

I am looking for some help in seeing how to combine nodes from two separate sections of an XML file. The idea is there is going to be a section with default information and another section that can add more information or remove some of the default information. Here is an example of what it would look like.
<data>
<products>
<product name="Product A" />
<product name="Product B">
<category name="Category 2">
<issue name="Special Issue" />
</category>
</product>
<product name="Product C">
<category name="Category 1" remove="true" />
<category name="Special Category">
<issue name="Secret Issue" />
</category>
</product>
<product name="Product D">
<category name="Category 1">
<issue name="Standard Issue" remove="true"/>
<issue name="Complex Issue" />
</category>
</product>
</products>
<categories>
<category name="Category 1">
<issue name="Standard Issue" />
<issue name="Advanced Issue" />
</category>
<category name="Category 2" />
</categories>
</data>
The idea is that I can define products separately from the categories/issues since there is a lot of overlap with this information. However, some products need to have slightly different categories or issues. Below is how it should look afterwards.
Product A
Category 1
Standard Issue
Advanced Issue
Category 2
Product B
Category 1
Standard Issue
Advanced Issue
Category 2
Special Issue
Product C
Category 2
Special Category
Secret Issue
Product D
Category 1
Advanced Issue
Complex Issue
Category 2
I could use a bunch of for loops to iterate over the information, however, I am trying to see if there are any more elegant ways of doing this.
PS - Right now just outputting the information as it should be is fine. I do not want to edit the XML itself since it is just a one-time load at the beginning of my program. I am going to be adding either some classes or structs to represent this data.
The complex data structure that you are trying to set up is, imho, not the best way to simplify your XML. Reading the data file takes quite some effort, and saving it would probably be quite tiresome.
Some points to think about:
How do you save the data?
What happens if half of your products suddenly need a new standard issue or a new category? Do you add it to half of your nodes, or do you add it to the defaults and put the remove directive on the product.category or product.category.issues nodes of the other half?
Is the remove directive really a requirement you need/want to add?
If you had to "read" the database as an outsider, would you understand the structure yourself? (I found it quite hard, and these were only 4 products.)
UPDATE (for the original implementation, please check below the update)
The more I think about the problem, the more I would say that you should re-examine your data structure from the top.
I now see the structure rather like this:
Product -> Has 0 or more Issues
Issue -> Has exactly 1 category
Categories
So, in my opinion, the easier way to represent your data, would be to remove the category tag from your products, and directly add the Issues under the product. The categories could then still be in a separate nodelist containing potential extra information, as such:
<?xml version="1.0" encoding="utf-8" ?>
<data>
<products>
<product name="Product A">
<issue name="Standard Issue" category="Category 1" />
<issue name="Advanced Issue" category="Category 1" />
</product>
<product name="Product B">
<issue name="Standard Issue" category="Category 1" />
<issue name="Advanced Issue" category="Category 1" />
<issue name="Special Issue" category="Category 2" />
</product>
<product name="Product C">
<issue name="Secret Issue" category="Special Category" />
</product>
<product name="Product D">
<issue name="Advanced Issue" category="Category 1" />
<issue name="Complex Issue" category="Category 1" />
</product>
</products>
<categories>
<category name="Category 1" />
<category name="Category 2" />
<category name="Special Category" />
</categories>
</data>
It would simplify reading your nodes, making it more readable (from a human perspective as well), more maintainable over a long term (face it, how many times will the data structure stay as originally designed?), and it would be possible to separate the categories from the issues for the products.
I explicitly removed the empty Category 2 option from each single product, as there would be no need for them with this structure (it's the issues which are important here, imho)
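To give an idea of how little code the simplified layout needs, here is a minimal LINQ to XML sketch. It assumes every issue element carries both a name and a category attribute; SimplifiedReader and Read are hypothetical names, not part of the original design.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SimplifiedReader
{
    // Returns: product name -> (category name -> issue names).
    public static Dictionary<string, Dictionary<string, List<string>>> Read(XDocument doc)
    {
        return doc.Root.Element("products").Elements("product")
            .ToDictionary(
                p => (string)p.Attribute("name"),
                p => p.Elements("issue")
                      // Group the flat issue list by its category attribute.
                      .GroupBy(i => (string)i.Attribute("category"))
                      .ToDictionary(
                          g => g.Key,
                          g => g.Select(i => (string)i.Attribute("name")).ToList()));
    }
}
```

No merge step and no remove handling is needed, because each product lists exactly the issues it has.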
Below is the implementation, to give an idea of how much effort it takes just to read the original XML.
ORIGINAL
I tried to come up with an elegant reader design, but it will only help you so far, and it depends heavily on how closely the simplified data structure posted here matches your actual requirements.
As a base, I made some classes for the data. I used abstract classes to provide the Name and Remove attributes so that it would be easier to generalize the reader, though some special-case handling was still needed:
[XmlRoot("data")]
public class Data
{
[XmlArray("products")]
[XmlArrayItem("product")]
public Product[] Products { get; set; }
[XmlArray("categories")]
[XmlArrayItem("category")]
public Category[] Categories { get; set; }
}
public abstract class AbstractNamedNode
{
[XmlAttribute("name")]
public string Name { get; set; }
public override bool Equals(object obj)
{
if (obj == null)
{
return false;
}
if (obj is AbstractNamedNode)
{
return string.Equals(((AbstractNamedNode)obj).Name, this.Name);
}
return base.Equals(obj);
}
public override int GetHashCode()
{
if (string.IsNullOrEmpty(Name))
{
return base.GetHashCode();
}
return Name.GetHashCode();
}
public override string ToString()
{
return string.Format("{0}", Name);
}
public virtual T CloneBasic<T>()
where T: AbstractNamedNode, new()
{
T result = new T();
result.Name = this.Name;
return result;
}
}
public abstract class AbstractNamedRemovableNode : AbstractNamedNode
{
[XmlAttribute("remove")]
public bool Remove { get; set; }
public override T CloneBasic<T>()
{
var result = base.CloneBasic<T>() as AbstractNamedRemovableNode;
result.Remove = this.Remove;
return result as T;
}
}
public class Product : AbstractNamedNode
{
[XmlElement("category")]
public Category[] Categories { get; set; }
}
public class Category : AbstractNamedRemovableNode
{
[XmlElement("issue")]
public Issue[] Issues { get; set; }
}
public class Issue : AbstractNamedRemovableNode
{
// intended blank
}
This offers the structure to read the raw format of the XML already (using the XmlSerializer). The Issue class is currently quite basic, though, I guess that would be different in the actual implementation.
To read the raw dataset, you could use the standard XmlSerializer way:
Data dataFromXml = null;
string path = Path.Combine(Environment.CurrentDirectory, "datafile.xml");
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
XmlSerializer xs = new XmlSerializer(typeof(Data));
dataFromXml = xs.Deserialize(fs) as Data;
}
At this point (provided no exception occurred while parsing the data), dataFromXml contains all defined products (with their respective categories and issues) and the default categories nodes.
To turn this into your eventual setup, you have to compare each product category with the defaults (cloning their info), and then compare the result set with the default set once more to see which categories not defined on the product should still be added.
In this implementation, the DataProvider returns a combined, cloned set of products built from the original dataFromXml object. Relying on the abstract classes (with generic constraints) makes it possible to compress all the loops a bit, at the cost of readability:
public static class DataProvider
{
private static T GetMatchingItem<T, K>(T[] SourceArray, K MatchingNode)
where T: AbstractNamedRemovableNode
where K: AbstractNamedRemovableNode
{
if (SourceArray == null || SourceArray.Length == 0 || MatchingNode == null)
{
return null;
}
var query = from i in SourceArray
where !i.Remove && i.Equals(MatchingNode)
select i;
return query.SingleOrDefault();
}
private static T[] CombineArray<T>(T[] sourceArray, T[] baseArray)
where T : AbstractNamedRemovableNode, new()
{
IList<T> results = new List<T>();
if (sourceArray != null)
{
foreach (var item in sourceArray)
{
if (item.Remove)
{
continue;
}
T copy = default(T);
copy = item.CloneBasic<T>();
if (copy is Category)
{
Category category = copy as Category;
Category original = item as Category;
Category matching = GetMatchingItem(baseArray, item) as Category;
if (matching != null)
{
category.Issues = CombineArray(original.Issues, matching.Issues);
}
else
{
category.Issues = CombineArray(original.Issues, null);
}
}
results.Add(copy);
}
}
if (baseArray != null)
{
foreach (var item in baseArray)
{
if (results.Contains(item))
{
continue;
}
if (sourceArray != null && sourceArray.Contains(item))
{
// the remove option would have worked here
continue;
}
T copy = item.CloneBasic<T>(); // clone, so the shared default nodes are not modified
if (copy is Category)
{
Category category = copy as Category;
Category original = item as Category;
category.Issues = CombineArray(original.Issues, null);
}
results.Add(copy);
}
}
return results.OrderBy((item) => item.Name).ToArray();
}
public static Product[] GetCombinedProductInfoFromData(Data data)
{
if (data == null)
{
throw new ArgumentNullException("data");
}
IList<Product> products = new List<Product>();
if (data.Products != null)
{
foreach (var originalProduct in data.Products)
{
Product product = originalProduct.CloneBasic<Product>();
if (originalProduct.Categories != null && originalProduct.Categories.Length > 0)
{
product.Categories = CombineArray(originalProduct.Categories, data.Categories);
}
else
{
product.Categories = CombineArray(data.Categories, null);
}
products.Add(product);
}
}
return products.ToArray();
}
}
With the following display program it eventually works as specified, but it shouldn't be this hard to build your data, and how fast the algorithm runs depends heavily on the number of products and of default nodes (categories and issues).
var productList = DataProvider.GetCombinedProductInfoFromData(dataFromXml);
foreach (var product in productList)
{
Console.WriteLine(product);
if (product.Categories == null)
{
continue;
}
foreach (var category in product.Categories)
{
Console.WriteLine("\t{0}", category);
if (category.Issues == null)
{
continue;
}
foreach (var issue in category.Issues)
{
Console.WriteLine("\t\t{0}", issue);
}
}
}
The result is a list of cloned products in which the default categories are combined with the product-specific categories, and the same goes for the issues within each category.
The only thing I want to point out is that you should think carefully about whether this is really the way you want to structure your data. The "remove" directive especially is a b*** ;)

Get XML nodes attributes and set as properties of List<myType>?

Example XML:
<Root>
<Product value="Candy">
<Item value="Gum" price="1.00"/>
<Item value="Mints" price="0.50"/>
</Product>
</Root>
Let's say I have a class with properties:
public class CandyItems
{
public string Value{get; set;}
public string Price{get; set;}
}
And within my main program class, I have a list:
var Candies = new List<CandyItems>();
I am struggling with a concise way to populate the Candies list, using LINQ.
I could do it in steps, like this:
//Get list of Items within <Product value="Candy">
IEnumerable<XElement> tempCandies = XDocument.Load("file.xml").Root.Elements("Product").Single(c => (string)c.Attribute("value") == "Candy").Descendants("Item");
//Loop through the elements
foreach(var item in tempCandies){
Candies.Add(new CandyItems{Value = (string)item.Attribute("value"), Price = (string)item.Attribute("price")});
}
But it seems like I could do this more concisely with pure LINQ somehow. Or is there another recommended method?
Try this:
XDocument xdoc = XDocument.Load(@"Path\Candies.xml");
List<CandyItems> Candies = xdoc.Descendants("Item")
.Select(x => new CandyItems
{
Value = (string)x.Attribute("value"),
Price = (string)x.Attribute("price")
}).ToList();
Although you have not mentioned it, you may want to fetch only the candies when your XML contains other products too, like:
<Root>
<Product value="Candy">
<Item value="Gum" price="1.00"/>
<Item value="Mints" price="0.50"/>
</Product>
<Product value="Chocolate">
<Item value="MilkChocolate" price="7.00"/>
<Item value="DarkChocolate" price="10.50"/>
</Product>
</Root>
Then you can apply a filter to fetch only Candy products like this:-
List<CandyItems> Candies = xdoc.Descendants("Item")
.Where(x => (string)x.Parent.Attribute("value") == "Candy")
.Select(x => new CandyItems
{
Value = (string)x.Attribute("value"),
Price = (string)x.Attribute("price")
}).ToList();
How about something like this (after loading the document):
var candies =
xdoc.Root.Elements("Product")
.Where(p => p.Attribute("value").Value == "Candy")
.SelectMany(p => p.Descendants("Item").Select(i => new CandyItems {
Value = i.Attribute("value").Value,
Price = i.Attribute("price").Value }));
Note: any and all error handling omitted.
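If the price is needed as a number rather than a string, XAttribute also supports an explicit cast to decimal. The sketch below is an assumption about what you might want, not part of the answers above: PricedItem and CandyReader are hypothetical names, and the (decimal) cast throws if the price attribute is missing or malformed (use (decimal?) to get null for a missing attribute).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical variant of CandyItems with a numeric price.
public class PricedItem
{
    public string Value { get; set; }
    public decimal Price { get; set; }
}

public static class CandyReader
{
    public static List<PricedItem> ReadCandies(XDocument xdoc)
    {
        return xdoc.Root.Elements("Product")
            // Casting the attribute to string yields null when it is missing,
            // so this comparison cannot throw.
            .Where(p => (string)p.Attribute("value") == "Candy")
            .Elements("Item")
            .Select(i => new PricedItem
            {
                Value = (string)i.Attribute("value"),
                Price = (decimal)i.Attribute("price")
            })
            .ToList();
    }
}
```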

Linq to XML query analyz

This query does work but I am not sure it is proper way to write this kind of query. I feel it is using too many Descendants and Parent.
Is there a better way to write this query?
There can be more than one catalog in XML.
static IEnumerable<Parts> GetAllParts(XDocument doc, string catalog, string groupId, string subGroupId)
{
var parts = (from p in doc.Descendants("ROOT").Descendants("CATALOG").Descendants("GROUP").Descendants("SUBGROUP").Descendants("BOM").Descendants("PARTS")
where (string)p.Parent.Parent.Parent.Parent.Element("IDENT").Value == catalog
&& p.Parent.Parent.Parent.Element("IDENT").Value == groupId
&& p.Parent.Parent.Element("IDENT").Value == subGroupId
select new Parts
{
ObjectId = int.Parse(p.Attribute("OBJECTID").Value),
Ident = p.Element("IDENT").Value,
List = p.Element("LIST").Value,
Descr = p.Element("DESC").Value
});
return parts;
}
}
public class Parts
{
public int ObjectId { get; set;}
public string Descr { get; set; }
public string Ident { get; set; }
public string List { get; set; }
}
Update: XML added.
<ROOT>
<CATALOG>
<OBJECT_ID>001</OBJECT_ID>
<OBJECT_IDENT>X001</OBJECT_IDENT>
<GROUP>
<OBJECT_ID>1001</OBJECT_ID>
<OBJECT_IDENT>01</OBJECT_IDENT>
<NAME>HOUSING</NAME>
<SUBGROUP>
<OBJECT_ID>5001</OBJECT_ID>
<OBJECT_IDENT>01.05</OBJECT_IDENT>
<NAME>DESIGN GROUP 1</NAME>
<BOM>
<OBJECT_ID>6001</OBJECT_ID>
<OBJECT_IDENT>010471</OBJECT_IDENT>
<PARTS>
<OBJECT_ID>2316673</OBJECT_ID>
<OBJECT_IDENT>A002010660</OBJECT_IDENT>
<DESC>SHORT BLOCK</DESC>
<NOTES>
<ROW>
<NOTES>Note 1</NOTES>
<BOM>010471</BOM>
<POS>1</POS>
</ROW>
<ROW>
<NOTES>Note 2</NOTES>
<BOM>010471</BOM>
<POS>2</POS>
</ROW>
</NOTES>
</PARTS>
<PARTS>
</PARTS>
<PARTS>
</PARTS>
</BOM>
</SUBGROUP>
<SUBGROUP>
</SUBGROUP>
<SUBGROUP>
</SUBGROUP>
</GROUP>
<GROUP>
</GROUP>
</CATALOG>
</ROOT>
Some suggestions regarding your question (I hope they help you improve future posts):
Your code and the posted XML don't work together. For instance, the XML has an <OBJECT_IDENT> element, but your LINQ uses IDENT. Please craft the example carefully and make sure it works, to avoid confusion.
Please put some effort into explaining the problem and clarifying it. "I am retrieving Parts data as function name says" is not clear enough: simply getting the <PARTS> elements needs no filtering, yet your LINQ has a where clause.
This question seems better suited to https://codereview.stackexchange.com/
And here are some suggestions regarding your code:
Use .Elements() to get the direct children of the current node, as opposed to Descendants(), which gets all descendant nodes.
You can use Elements().Where() to filter at each level, so you can avoid traversing Parent.
You can cast an XElement to string (or to a nullable type such as int?) to avoid an exception when the element is not found.
Example code snippet:
var parts = (from p in doc.Root
.Elements("CATALOG").Where(o => catalog == (string)o.Element("OBJECT_IDENT"))
.Elements("GROUP").Where(o => groupId == (string)o.Element("OBJECT_IDENT"))
.Elements("SUBGROUP").Where(o => subGroupId == (string)o.Element("OBJECT_IDENT"))
.Elements("BOM")
.Elements("PARTS")
select new Parts
{
ObjectId = (int)p.Element("OBJECT_ID"),
Ident = (string)p.Element("OBJECT_IDENT"),
List = (string)p.Element("LIST"),
Descr = (string)p.Element("DESC")
});
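The difference between casting and reading .Value can be seen in a small sketch. CastDemo and Describe are hypothetical names used only for this illustration; the PARTS fragment is cut down from the XML in the question and deliberately has no LIST element.

```csharp
using System;
using System.Xml.Linq;

public static class CastDemo
{
    public static string Describe()
    {
        var part = XElement.Parse(
            "<PARTS><OBJECT_ID>2316673</OBJECT_ID><DESC>SHORT BLOCK</DESC></PARTS>");

        // Element("LIST") returns null here; the casts turn that into a null result.
        string list = (string)part.Element("LIST");
        int? listAsNumber = (int?)part.Element("LIST");

        // A non-nullable cast is fine when the element exists.
        int id = (int)part.Element("OBJECT_ID");

        // Reading .Value on the missing element dereferences null and throws.
        bool valueThrew = false;
        try
        {
            string unused = part.Element("LIST").Value;
        }
        catch (NullReferenceException)
        {
            valueThrew = true;
        }

        return string.Format("{0}|{1}|{2}|{3}",
            list == null, listAsNumber.HasValue, id, valueThrew);
    }
}
```

Note that a non-nullable cast such as (int) still throws when the element is missing, so nullable casts are the safer choice for optional elements.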

linq to xml query: Object reference not set error while trying to retrieve child element

I am trying to retrieve all the child elements, but I get a System.Collections.ListDictionaryInternal. Object reference not set to an instance of an object error.
My C# code retrieves all the questions for the test_id and category_id passed in:
public static List<Questions> GetQuestion_Catgy(int test_id, int ctgy_id)
{
try
{
XDocument data = XDocument.Load(docurl);
return (from exm in data.Descendants("test_details")
where exm.Attribute("id").Value.Equals(test_id.ToString())
from ctgy in exm.Descendants("category")
where ctgy.Attribute("id").Value.Equals(ctgy_id.ToString())
orderby (int)ctgy.Attribute("id")
select new Questions
{
quesID = Convert.ToInt32(ctgy.Attribute("id").Value),
quesSTRING = ctgy.Attribute("ques").Value,
quesRATE = Convert.ToInt32(ctgy.Attribute("rating").Value),
quesOPT1 = (string)ctgy.Element("opt1").Value,
quesOPT2 = (string)ctgy.Element("opt2").Value,
quesOPT3 = (string)ctgy.Element("opt3").Value,
quesOPT4 = (string)ctgy.Element("opt4").Value,
quesANS = Convert.ToInt32(ctgy.Element("ans").Value),
quesIMG = (string)ctgy.Element("img").Value
}).ToList();
}
catch (Exception ex)
{
throw new ArgumentException(ex.Data + "\n" + ex.Message);
}
}
my xml
<test_details id="1" name="test exam" time="30" marks="100" difficulty="1">
<category id="1" name="HTML">
<question id="1" ques="what is HTML ?" rating="5">
<opt1>Markup Language</opt1>
<opt2>Scripting Language</opt2>
<opt3>Server-Side Lanugae</opt3>
<opt4>Client-Side Language</opt4>
<ans>1</ans>
<img>null</img>
</question>
<question id="2" ques="what is LMTH ?" rating="5">
<opt1>Markup Language</opt1>
<opt2>Scripting Language</opt2>
<opt3>Server-Side Lanugae</opt3>
<opt4>Client-Side Language</opt4>
<ans>2</ans>
<img>null</img>
</question>
</category>
<category id="2" name="C#" />
</test_details>
It looks like you need to go down an extra level to the 'question' elements if you want to access the ques attribute. ctgy will not have ques.
ctgy.Attribute("ques").Value
ctgy.Attribute("rating").Value
There is no such attribute.
Also do a null check before doing things like
(string)ctgy.Element("opt2").Value,
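Putting both points together, a possible shape for the query is sketched below. This is an assumption about the intent, not the asker's actual code: the Questions class is reconstructed from the property names in the question, QuestionReader and the XDocument parameter are hypothetical, and casts are used instead of .Value so that missing nodes yield null (or 0) rather than an exception.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Reconstructed from the property names used in the question (types are assumed).
public class Questions
{
    public int quesID { get; set; }
    public string quesSTRING { get; set; }
    public int quesRATE { get; set; }
    public string quesOPT1 { get; set; }
    public string quesOPT2 { get; set; }
    public string quesOPT3 { get; set; }
    public string quesOPT4 { get; set; }
    public int quesANS { get; set; }
    public string quesIMG { get; set; }
}

public static class QuestionReader
{
    public static List<Questions> GetQuestions(XDocument data, int testId, int categoryId)
    {
        return (from exm in data.Descendants("test_details")
                where (int?)exm.Attribute("id") == testId
                from ctgy in exm.Elements("category")
                where (int?)ctgy.Attribute("id") == categoryId
                // One level further down: ques and rating live on <question>.
                from q in ctgy.Elements("question")
                orderby (int?)q.Attribute("id")
                select new Questions
                {
                    quesID = (int?)q.Attribute("id") ?? 0,
                    quesSTRING = (string)q.Attribute("ques"),
                    quesRATE = (int?)q.Attribute("rating") ?? 0,
                    quesOPT1 = (string)q.Element("opt1"),
                    quesOPT2 = (string)q.Element("opt2"),
                    quesOPT3 = (string)q.Element("opt3"),
                    quesOPT4 = (string)q.Element("opt4"),
                    quesANS = (int?)q.Element("ans") ?? 0,
                    quesIMG = (string)q.Element("img")
                }).ToList();
    }
}
```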
Your error is on this line:
from ctgy in exm.Descendants("category")
The exm elements are at the same level of your categories. You need to replace exm with data.
Example:
from ctgy in data.Descendants("category")
