Extracting Data from XML to List<>


I have this XML file:
<?xml version="1.0" encoding="utf-8" ?>
<Record>
<File name="1.mot">
<Line address="040004" data="0720" />
<Line address="040037" data="31" />
<Line address="04004C" data="55AA55AA" />
</File>
<File name="2.mot">
<Line address="00008242" data="06" />
<Line address="00008025" data="AFC8" />
<Line address="00009302" data="476F6C64" />
</File>
</Record>
What I want to do is extract the information from the XML and convert it to a list, but I don't really know where or how to start. I've googled samples and questions, and the code below is what I've managed to put together so far. I'm not even sure this code is appropriate for what I want. The list will be used for a lookup in the program. For example, for file 1.mot the program would read 1.mot, read the XML file, parse both files, extract the info from the XML file, and then search to verify that the info in the XML exists in 1.mot.
XElement xmlReqs = XElement.Load("XMLFile1.xml");
List<Requirement> reqs = new List<Requirement>();
foreach (var xmlReq in xmlReqs.Elements("File"))
{
string name = xmlReqs.Attribute("name").Value);
List<InfoLine> info = new List<InfoLine>();
foreach (var xmlInfo in xmlReq.Elements("Line"))
{
string address = xmlProduct.Attribute("address").Value;
string data = xmlProduct.Attribute("data").Value;
}
reqs.Add(new Requirement(address, data));
}
A friend of mine suggested something about using int array or string array and then using this reqs.Find(val => val[0]==target) but I'm not sure how to do so. I'm not well-versed with linq, but what I've gathered, it seems to be quite notable and powerful (?).
Anyway, will the code above work? And how do I call the objects from the list to use for the lookup function of the program?
UPDATE:
Program would be reading 1.mot or 2.mot (depending on the user preference, that's why file name in xml needs to be specified) simultaneously (or not) with the xml file.
1.mot file contains:
S0030000FC
S21404000055AA55AA072000010008000938383138D7
S21404001046305730343130302020202027992401B0
...
The address starts at the 3rd byte of each line, so I would be comparing the data from the XML against these lines.
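A minimal sketch of the check described in the question, assuming the .mot lines follow the usual Motorola S-record layout for S2 records (record type, a one-byte count, a 3-byte address, the data bytes, then a one-byte checksum). The names MotLineCheck and ContainsAt are hypothetical, the checksum is not validated, and malformed hex digits would throw a FormatException.

```csharp
using System;

public static class MotLineCheck
{
    // Returns true when the S2 record holds expectedHex at the given absolute address.
    public static bool ContainsAt(string s2Line, int address, string expectedHex)
    {
        // Shortest valid S2 record: "S2" + count (2) + address (6) + checksum (2) = 12 chars.
        if (s2Line == null || s2Line.Length < 12 ||
            !s2Line.StartsWith("S2", StringComparison.Ordinal))
        {
            return false;
        }

        int byteCount = Convert.ToInt32(s2Line.Substring(2, 2), 16);
        int startAddress = Convert.ToInt32(s2Line.Substring(4, 6), 16);

        // The count covers 3 address bytes, the data bytes and 1 checksum byte.
        int dataBytes = byteCount - 4;
        if (dataBytes <= 0 || s2Line.Length < 10 + dataBytes * 2)
        {
            return false;
        }

        string data = s2Line.Substring(10, dataBytes * 2);
        int offset = address - startAddress;
        if (offset < 0 || offset * 2 + expectedHex.Length > data.Length)
        {
            return false;
        }

        // Two hex characters per byte, compared case-insensitively.
        return string.Compare(data, offset * 2, expectedHex, 0, expectedHex.Length,
                              StringComparison.OrdinalIgnoreCase) == 0;
    }
}
```

With the second line of 1.mot shown above, ContainsAt(line, 0x040004, "0720") finds the bytes 07 20 four bytes after the record's start address 040000. An entry whose data spans two records would need extra handling that this sketch leaves out.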

You can deserialize the XML file:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
namespace ConsoleApplication2
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
XmlSerializer xs = new XmlSerializer(typeof(Record));
XmlTextReader reader = new XmlTextReader(FILENAME);
Record record = (Record)xs.Deserialize(reader);
}
}
[XmlRoot("Record")]
public class Record
{
[XmlElement("File")]
public List<File> files {get;set;}
}
[XmlRoot("File")]
public class File
{
[XmlAttribute("name")]
public string name { get; set; }
[XmlElement("Line")]
public List<Line> lines {get;set;}
}
[XmlRoot("Line")]
public class Line
{
[XmlAttribute("address")]
public string address {get;set;}
[XmlAttribute("data")]
public string data {get;set;}
}
}

You could use XmlSerializer to handle the reading of the XML. Create some classes that look like these:
public class Record
{
[XmlElement("File")]
public List<File> Files { get; set; }
}
public class File
{
[XmlAttribute("name")]
public string Name { get; set; }
[XmlElement("Line")]
public List<Line> Lines { get; set; }
}
public class Line
{
[XmlAttribute("address")]
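// string rather than int: values such as "04004C" are hexadecimal and would not deserialize as int
public string Address { get; set; }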
[XmlAttribute("data")]
public string Data { get; set; }
}
And deserialise like so:
var serializer = new XmlSerializer(typeof (Record));
using (var reader = XmlReader.Create("XMLFile1.xml"))
{
var record = (Record) serializer.Deserialize(reader);
var first = record.Files.Single(f => f.Name == "1.mot");
var second = record.Files.Single(f => f.Name == "2.mot");
}
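If you would rather not define serializable classes, the same data can be read with LINQ to XML, which is closer to the code in the question. The sketch below is one possible shape, not the only one. It assumes every File element has a unique name attribute (ToDictionary throws on a duplicate or missing key), and InfoLine and RecordReader are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class InfoLine
{
    public string Address { get; set; }
    public string Data { get; set; }
}

public static class RecordReader
{
    // Maps each file name (e.g. "1.mot") to the address/data pairs listed for it.
    public static Dictionary<string, List<InfoLine>> Read(XElement record)
    {
        return record
            .Elements("File")
            .ToDictionary(
                file => (string)file.Attribute("name"),
                file => file.Elements("Line")
                            .Select(line => new InfoLine
                            {
                                Address = (string)line.Attribute("address"),
                                Data = (string)line.Attribute("data")
                            })
                            .ToList());
    }
}
```

Usage would be along the lines of var lookup = RecordReader.Read(XElement.Load("XMLFile1.xml")); after which lookup["1.mot"] gives the lines to verify against that file.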

Related

Tricky XML Manipulation: Create an element out of its own and other sibling's data

I have this replicate scenario my XML document below:
<?xml version="1.0" encoding="utf-8"?>
<Home>
<Kitchen>
<Pantry>
<Ingredients>
<Name>Tomato</Name>
<ID>1</ID>
<Name>Tomato</Name>
<Income>Sales</Income> // replace the <Income> element with its value <Sales>
<Cost>Materials</Cost>
<Price>100</Price> // the value for new <Sales> element shall be this <Price> value
</Ingredients>
//.. and thousands more sets of ingredients
</Pantry>
</Kitchen>
</Home>
And I want to restructure it in the following manner:
<?xml version="1.0" encoding="utf-8"?>
<Home>
<Kitchen>
<Pantry>
<Ingredients>
<Name>Tomato</Name>
<ID>1</ID>
<Name>Tomato</Name>
<Sales>100</Sales> // the <Income> was replaced by its value and the value was taken from the <Price> element that was deleted also
<Cost>Materials</Cost>
</Ingredients>
//.. and thousands more sets of ingredients
</Pantry>
</Kitchen>
</Home>
I'm still trying to figure out how I'm going to do this. I will appreciate any help here.

Using LINQ to XML:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication37
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
XDocument doc = XDocument.Load(FILENAME);
List<XElement> ingredients = doc.Descendants("Ingredients").ToList();
foreach (XElement ingredient in ingredients)
{
XElement xIncome = ingredient.Element("Income");
XElement xPrice = ingredient.Element("Price");
xIncome.ReplaceWith(new XElement("Sales", (string)xPrice));
xPrice.Remove();
}
}
}
}
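The answer above hard-codes the new element name as "Sales" and does not write the result anywhere. If the name should instead come from the value of <Income>, as the question describes, a variant might look like the sketch below. It assumes every <Income> value is a valid XML element name (otherwise the XElement constructor throws) and skips sets that lack either element. IngredientRewriter is a hypothetical name.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class IngredientRewriter
{
    // Replaces <Income>X</Income> with <X>price</X> and removes <Price>.
    public static XDocument Rewrite(XDocument doc)
    {
        // ToList() takes a snapshot so the tree can be modified while looping.
        foreach (XElement ingredient in doc.Descendants("Ingredients").ToList())
        {
            XElement income = ingredient.Element("Income");
            XElement price = ingredient.Element("Price");
            if (income == null || price == null)
            {
                continue; // nothing to restructure in this set
            }

            // The element name is taken from the text of <Income>, e.g. "Sales".
            income.ReplaceWith(new XElement(income.Value.Trim(), price.Value));
            price.Remove();
        }
        return doc;
    }
}
```

Calling doc.Save(path) on the returned document writes the restructured XML back to disk.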

Firstly create a Class for the new Model
public class NewIngredients
{
public int Id { get; set; }
public string Name { get; set; }
public decimal Sales { get; set; }
public string Cost{ get; set; }
}
Assuming the XML document is in a file called Kitchen.xml:
XElement Kitchen = XElement.Load(@"Kitchen.xml");
Then use LINQ to XML to build the new model from the old one, something like this (note: you will probably need to check for nulls, etc.):
var newIngredients = Kitchen.Descendants("Ingredients").Select(x => new NewIngredients
{
Id = int.Parse(x.Element("ID").Value),
Name = x.Element("Name").Value,
Sales = decimal.Parse(x.Element("Price").Value),
Cost = x.Element("Cost").Value
});
Convert back to XML if needed:
var serializer = new XmlSerializer(newIngredients.First().GetType());
serializer.Serialize(Console.Out, newIngredients.First()); //Printed to console but could move to file if needed

How to return an XML element value from a Descendant when passing in another element value from that Descendant

I am new to working with LINQ to XML, but I can see how it could be helpful to my current problem. I want a method that you pass in an XML Document and an Element value, the method would then return a different Element Value from the same Descendant. For example if I provided a "StationName" I would like to know what "ScannerType" belongs to that "StationName"
Here is my XML
<?xml version="1.0" encoding="utf-8" ?>
<stations>
<station>
<stationName>CH3CTRM1</stationName>
<scannerType>GE LightSpeed VCT</scannerType>
<scannerID>COL02</scannerID>
<siteName>CUMC</siteName>
<inspDose>180</inspDose>
<expDose>100</expDose>
<kernel>STANDARD</kernel>
</station>
<station>
<stationName>CTAWP75515</stationName>
<scannerType>SIEMENS Force</scannerType>
<scannerID>UIA07</scannerID>
<siteName>Iowa</siteName>
<inspDose>careDose</inspDose>
<expDose>careDose</expDose>
<kernel>Qr40 5</kernel>
</station>
<station>
<stationName>JHEB_CT06N_JHOC2</stationName>
<scannerType>SIEMENS Force</scannerType>
<scannerID>JHU04</scannerID>
<siteName>JHU</siteName>
<inspDose>careDose</inspDose>
<expDose>careDose</expDose>
<kernel>Qr40 5</kernel>
</station>
</stations>
Here are the methods that are currently in question
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml.Linq;
namespace ManualPhantomProcessor.XMLParser
{
class SearchXML
{
public string filename = "SiteData.xml";
public string currentDirectory = Directory.GetCurrentDirectory();
public XDocument LoadXML()
{
string siteDataFilePath = Path.Combine(currentDirectory, filename);
XDocument siteData = XDocument.Load(siteDataFilePath);
return siteData;
}
public IEnumerable<string> GetScannerModel(XDocument xmlDocument, string stationName)
{
var query = xmlDocument.Descendants("station")
.Where(s => s.Element("stationName").Value == stationName)
.Select(s => s.Element("scannerType").Value)
.Distinct();
return query;
}
}
}
Here is my Programs.cs file
using ManualPhantomProcessor.XMLParser;
using System;
using System.Collections.Generic;
using System.Xml.Linq;
namespace ManualPhantomProcessor
{
class Program
{
static void Main(string[] args)
{
SearchXML searchXML = new SearchXML();
XDocument siteData = searchXML.LoadXML();
IEnumerable<string> data = searchXML.GetScannerModel(siteData, "CH3CTRM1");
Console.WriteLine(data);
}
}
}
It should be a simple console application, but it seems that no matter what I try I keep getting a null value where I expect the scannerType value from the XML document that corresponds to the station name "CH3CTRM1".
The application doesn't crash, but in my console I get the following:
System.Linq.Enumerable+DistinctIterator`1[System.String]
Could someone explain what I am doing incorrectly?

Your query is fine; the problem is the call Console.WriteLine(data);. WriteLine does not print the items of a sequence: given an IEnumerable<string>, it prints the object's type name. To display the scanner types, use a loop like the following:
foreach (string scannerType in data)
{
Console.WriteLine(scannerType);
}
See the documentation of Console.WriteLine for the available overloads.
I hope that helps you fix the issue.

What you're seeing is the string form of the IEnumerable<string> returned by GetScannerModel().
There are two possibilities:
Only one scanner model is expected to be found (because a station name is expected
to be unique).
Any number of scanner models can be found.
In the first case, change GetScannerModel() to return a string and have it do something like return query.FirstOrDefault(); (or .First() if you want an exception when no match is found). Your client program then remains unchanged.
In the second case, @Sajid's answer applies: you need to enumerate the IEnumerable in some way, for example with foreach.
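For the first case, a sketch of what the changed method could look like. It is written as a static helper here so that it stands alone (StationLookup is a hypothetical name), and it uses the string casts on XElement so that a station without one of the child elements yields null instead of a NullReferenceException.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class StationLookup
{
    // Returns the scannerType of the first station with the given name,
    // or null when no station matches.
    public static string GetScannerModel(XDocument xmlDocument, string stationName)
    {
        return xmlDocument.Descendants("station")
            .Where(s => (string)s.Element("stationName") == stationName)
            .Select(s => (string)s.Element("scannerType"))
            .FirstOrDefault();
    }
}
```

Console.WriteLine then receives a plain string, so the existing call in Main prints the scanner type rather than the iterator's type name.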

Use a dictionary :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
XDocument doc = XDocument.Load(FILENAME);
List<Station> stations = doc.Descendants("station").Select(x => new Station
{
stationName = (string)x.Element("stationName"),
scannerType = (string)x.Element("scannerType"),
scannerID = (string)x.Element("scannerID"),
siteName = (string)x.Element("siteName"),
inspDose = (string)x.Element("inspDose"),
expDose = (string)x.Element("expDose"),
kernel = (string)x.Element("kernel")
}).ToList();
Dictionary<string, Station> dict = stations
.GroupBy(x => x.stationName, y => y)
.ToDictionary(x => x.Key, y => y.FirstOrDefault());
}
}
public class Station
{
public string stationName {get;set;}
public string scannerType {get;set;}
public string scannerID {get;set;}
public string siteName {get;set;}
public string inspDose {get;set;}
public string expDose {get;set;}
public string kernel { get; set; }
}
}

Deserialize dependent on field value

I need to deserialize XML that uses a field "type" to indicate what content to expect.
Type 0 says that I can expect simple text whilst type 1 indicates that the content is of a more complex structure.
I know that I could write some custom deserialization mechanism but would like to know whether there was any builtin way to solve this.
Since XmlSerializer expects a string, it simply throws away the content when it is XML. This stops me from deserializing the content in a second step.
<Msg>
<MsgType>0</MsgType>
<Data>Some text</Data>
</Msg>
<Msg>
<MsgType>1</MsgType>
<Data>
<Document>
<Type>PDF</Type>
.....
</Document>
</Data>
</Msg>

That isn't supported out of the box; however, you could perhaps use:
public XmlNode Data {get;set;}
and run the "what to do with Data?" as a second step, once you can look at MsgType.
Complete example:
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;
static class P
{
static void Main()
{
const string xml = @"<Foo>
<Msg>
<MsgType>0</MsgType>
<Data>Some text</Data>
</Msg>
<Msg>
<MsgType>1</MsgType>
<Data>
<Document>
<Type>PDF</Type>
.....
</Document>
</Data>
</Msg>
</Foo>";
var fooSerializer = new XmlSerializer(typeof(Foo));
var docSerializer = new XmlSerializer(typeof(Document));
var obj = (Foo)fooSerializer.Deserialize(new StringReader(xml));
foreach (var msg in obj.Messages)
{
switch (msg.MessageType)
{
case 0:
var text = msg.Data.InnerText;
Console.WriteLine($"text: {text}");
break;
case 1:
var doc = (Document)docSerializer.Deserialize(new XmlNodeReader(msg.Data));
Console.WriteLine($"document of type: {doc.Type}");
break;
}
Console.WriteLine();
}
}
}
public class Foo
{
[XmlElement("Msg")]
public List<Message> Messages { get; } = new List<Message>();
}
public class Message
{
[XmlElement("MsgType")]
public int MessageType { get; set; }
public XmlNode Data { get; set; }
}
public class Document
{
public string Type { get; set; }
}

Importing CSV data into C# classes

I know how to read and display a line of a .csv file. Now I would like to parse that file, store its contents in arrays, and use those arrays as values for some classes I created.
I'd like to learn how though.
Here is an example:
basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3
As you can see, the fields are separated by commas. I've created Basketball and Baseball classes that extend the Sport class (Sport.cs), which has the fields:
private string sport;
private string date;
private string team1;
private string team2;
private string score;
I understand that this is simplistic and that there are better ways of storing this info (e.g. creating classes for each team, making the date a DateTime, and so on), but I'd like to know how to get this information into the classes.
I'm assuming this has something to do with getters and setters... I've also read of dictionaries and collections, but I'd like to start simple by storing them all in arrays... (If that makes sense... Feel free to correct me).
Here is what I have so far. All it does is read the csv and parrot out its contents on the Console:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace Assign01
{
class Program
{
static void Main(string[] args)
{
string line;
FileStream aFile = new FileStream("../../sportsResults.csv", FileMode.Open);
StreamReader sr = new StreamReader(aFile);
// read data in line by line
while ((line = sr.ReadLine()) != null)
{
// the loop condition already reads the next line; a second ReadLine() here would skip every other line
Console.WriteLine(line);
}
sr.Close();
}
}
}
Help would be much appreciated.

For a resilient, fast, and low-effort solution, you can use CsvHelper, which handles a lot of the code and edge cases for you and has pretty good documentation.
First, install the CsvHelper package from NuGet.
a) CSV with Headers
If your csv has headers like this:
sport,date,team 1,team 2,score 1,score 2
basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3
You can add attributes to your class to map the field names to your class names like this:
public class SportStats
{
[Name("sport")]
public string Sport { get; set; }
[Name("date")]
public DateTime Date { get; set; }
[Name("team 1")]
public string TeamOne { get; set; }
[Name("team 2")]
public string TeamTwo { get; set; }
[Name("score 1")]
public int ScoreOne { get; set; }
[Name("score 2")]
public int ScoreTwo { get; set; }
}
And then invoke like this:
List<SportStats> records;
using (var reader = new StreamReader(@".\stats.csv"))
using (var csv = new CsvReader(reader))
{
records = csv.GetRecords<SportStats>().ToList();
}
b) CSV without Headers
If your csv doesn't have headers like this:
basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3
You can add attributes to your class and map to the CSV ordinally by position like this:
public class SportStats
{
[Index(0)]
public string Sport { get; set; }
[Index(1)]
public DateTime Date { get; set; }
[Index(2)]
public string TeamOne { get; set; }
[Index(3)]
public string TeamTwo { get; set; }
[Index(4)]
public int ScoreOne { get; set; }
[Index(5)]
public int ScoreTwo { get; set; }
}
And then invoke like this:
List<SportStats> records;
using (var reader = new StreamReader(@".\stats.csv"))
using (var csv = new CsvReader(reader))
{
csv.Configuration.HasHeaderRecord = false;
records = csv.GetRecords<SportStats>().ToList();
}
Further Reading
Reading CSV file and storing values into an array (295🡅)
Parsing CSV files in C#, with header (245🡅)
Import CSV file to strongly typed data structure in .Net (104🡅)
Reading a CSV file in .NET? (45🡅)
Is there a “proper” way to read CSV files (17🡅)
... many more

Creating an array to hold the information is not a very good idea, because you don't know how many lines the input file will contain, so what would the initial size of the array be? I would advise using a generic list (e.g. List<>) instead.
You can also add a constructor to your Sport class that accepts an array (the result of the split operation described in another answer).
Additionally, you can perform some conversions while assigning the fields:
public class Sport
{
private string sport;
private DateTime date;
private string team1;
private string team2;
private string score;
public Sport(string[] csvArray)
{
this.sport = csvArray[0];
this.team1 = csvArray[2];
this.team2 = csvArray[3];
this.date = Convert.ToDateTime(csvArray[1]);
this.score = String.Format("{0}-{1}", csvArray[4], csvArray[5]);
}
}
For simplicity I used Convert.ToDateTime, but keep in mind that this is not a very safe approach unless you are sure that the date field always contains valid dates and the score fields always contain numeric values. You can use safer methods such as TryParse, or add some exception handling.
In all honesty, I must add that while the above solution is simple (as requested), on a conceptual level I would advise against it. Putting the mapping logic between attributes and the CSV file in the class makes the Sport class too dependent on the file itself and thus less reusable. Any later change in the file structure must then be reflected in your class, which is easily overlooked. It would therefore be wiser to put your "mapping & conversion" logic in the main program and keep your class as clean as possible.
(I handled your "Score" issue by combining the two score values into one string separated by a hyphen.)

Splitting the string into arrays to get the data can be error-prone and slow. Try using an OLE DB data provider to read the CSV as if it were a table in an SQL database; this way you can use a WHERE clause to filter the results.
App.Config:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
<connectionStrings>
<add name="csv" providerName="System.Data.OleDb" connectionString="Provider=Microsoft.Jet.OLEDB.4.0;Data Source='C:\CsvFolder\';Extended Properties='text;HDR=Yes;FMT=Delimited';" />
</connectionStrings>
</configuration>
program.cs:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.OleDb;
using System.Configuration;
using System.Data;
using System.Data.Common;
namespace CsvImport
{
class Stat
{
public string Sport { get; set; }
public DateTime Date { get; set; }
public string TeamOne { get; set; }
public string TeamTwo { get; set; }
public int Score { get; set; }
}
class Program
{
static void Main(string[] args)
{
ConnectionStringSettings csv = ConfigurationManager.ConnectionStrings["csv"];
List<Stat> stats = new List<Stat>();
using (OleDbConnection cn = new OleDbConnection(csv.ConnectionString))
{
cn.Open();
using (OleDbCommand cmd = cn.CreateCommand())
{
cmd.CommandText = "SELECT * FROM [Stats.csv]";
cmd.CommandType = CommandType.Text;
using (OleDbDataReader reader = cmd.ExecuteReader(CommandBehavior.CloseConnection))
{
int fieldSport = reader.GetOrdinal("sport");
int fieldDate = reader.GetOrdinal("date");
int fieldTeamOne = reader.GetOrdinal("teamone");
int fieldTeamTwo = reader.GetOrdinal("teamtwo");
int fieldScore = reader.GetOrdinal("score");
foreach (DbDataRecord record in reader)
{
stats.Add(new Stat
{
Sport = record.GetString(fieldSport),
Date = record.GetDateTime(fieldDate),
TeamOne = record.GetString(fieldTeamOne),
TeamTwo = record.GetString(fieldTeamTwo),
Score = record.GetInt32(fieldScore)
});
}
}
}
}
foreach (Stat stat in stats)
{
Console.WriteLine("Sport: {0}", stat.Sport);
}
}
}
}
Here's how the csv should look
stats.csv:
sport,date,teamone,teamtwo,score
basketball,28/01/2011,Rockets,Blazers,98
baseball,22/08/2011,Yankees,Redsox,4

While there are a lot of libraries that make CSV reading easy, all you need to do now that you have the line is split it:
string[] csvFields = line.Split(',');
Now assign each field to the appropriate member:
sport = csvFields[0];
date = csvFields[1];
//and so on
This will, however, overwrite the values each time you read a new line, so you need to pack the values into a class and save the instances of that class to a list.
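The last step, packing each line into an object and collecting the objects in a list, might look like the sketch below. It assumes the six-column layout from the question (sport, date as yyyy/MM/dd, two teams, two scores), and SportResult and SportCsv are hypothetical names. Lines that do not fit are skipped rather than reported, and a plain Split does not handle quoted fields that contain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class SportResult
{
    public string Sport { get; set; }
    public DateTime Date { get; set; }
    public string Team1 { get; set; }
    public string Team2 { get; set; }
    public int Score1 { get; set; }
    public int Score2 { get; set; }
}

public static class SportCsv
{
    // Converts one CSV line into a SportResult; returns false for malformed lines.
    public static bool TryParseLine(string line, out SportResult result)
    {
        result = null;
        if (string.IsNullOrWhiteSpace(line)) return false;

        string[] fields = line.Split(',');
        if (fields.Length != 6) return false;

        DateTime date;
        int score1, score2;
        if (!DateTime.TryParseExact(fields[1], "yyyy/MM/dd", CultureInfo.InvariantCulture,
                                    DateTimeStyles.None, out date)) return false;
        if (!int.TryParse(fields[4], NumberStyles.Integer, CultureInfo.InvariantCulture, out score1)) return false;
        if (!int.TryParse(fields[5], NumberStyles.Integer, CultureInfo.InvariantCulture, out score2)) return false;

        result = new SportResult
        {
            Sport = fields[0],
            Date = date,
            Team1 = fields[2],
            Team2 = fields[3],
            Score1 = score1,
            Score2 = score2
        };
        return true;
    }

    // Builds the list; a List<T> grows as needed, so the line count need not be known up front.
    public static List<SportResult> ParseAll(IEnumerable<string> lines)
    {
        var results = new List<SportResult>();
        foreach (string line in lines)
        {
            SportResult parsed;
            if (TryParseLine(line, out parsed)) results.Add(parsed);
        }
        return results;
    }
}
```

In the question's program this could be called as SportCsv.ParseAll(File.ReadLines("../../sportsResults.csv")), with using System.IO; added.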

LINQ also has a solution for this, and you can produce the output as either a List or an array. The example below assumes a class (TestModel) that defines the data fields and their types.
var modelData = File.ReadAllLines(dataFile)
.Skip(1)
.Select(x => x.Split(','))
.Select(dataRow => new TestModel
{
Column1 = dataRow[0],
Column2 = dataRow[1],
Column3 = dataRow[2],
Column4 = dataRow[3]
}).ToList(); // Or you can use .ToArray()

// use "Microsoft.VisualBasic.dll"
using System;
using Microsoft.VisualBasic.FileIO;
class Program {
static void Main(string[] args){
using(var csvReader = new TextFieldParser(@"sportsResults.csv")){
csvReader.SetDelimiters(new string[] {","});
string [] fields;
while(!csvReader.EndOfData){
fields = csvReader.ReadFields();
Console.WriteLine(String.Join(",",fields)); // replace this with code that builds an instance from the fields
}
}
}
}

Below is a simple solution for beginners who like to learn by trial and error.
Don't forget to add a reference to System.Core.dll and import the namespace in your .cs file: using System.Linq;
Using an iterator may make for cleaner code:
private static IEnumerable<string> GetDataPerLines()
{
// the using block closes the file even if the caller stops iterating early
using (StreamReader sr = new StreamReader("sportsResults.csv"))
{
string line;
while ((line = sr.ReadLine()) != null)
{
yield return line;
}
}
}
static void Main(string[] args)
{
var query = from data in GetDataPerLines()
let splitChr = data.Split(',')
select new Sport
{
sport = splitChr[0],
date = splitChr[1]
// ... and so on for the remaining fields
};
foreach (var item in query)
{
Console.WriteLine("Sport = {0}, date = {1}", item.sport, item.date);
}
}
The sample above builds its own iterator with yield (see the MSDN documentation for details) and creates a collection of objects from the lines of the file.
Let me know if there are mistakes in the code; I did not have Visual Studio at hand when I wrote this answer.
For reference, a one-dimensional array such as Sport[] implements IEnumerable<Sport>, so it can be queried with LINQ in the same way.

Handling the children of a parent-child relationship using Linq to XML

I am new to LINQ to XML and am having trouble with "children". I have an XML file of info about documents; each document has some number of INDEX elements, as in this snippet:
<DOCUMENTCOLLECTION>
<DOCUMENT>
<FILE filename="Z:\Consulting\ConverterRun4\B0000001\Submission\D003688171.0001.tif" outputpath="Z:\Consulting\ConverterRun4\B0000001\Submission"/>
<ANNOTATION filename=""/>
<INDEX name="CAN(idmDocCustom4)" value=""/>
<INDEX name="Comment(idmComment)" value="GENERAL CORRESPONDENCE 11-6-96 TO 10-29-"/>
<INDEX name="DiagnosticID(idmDocCustom5)" value="983958-0006.MDB-2155504"/>
<INDEX name="Document Class(idmDocType)" value="Submission"/>
<INDEX name="Original File Name(idmDocOriginalFile)" value="40410.TIF"/>
<INDEX name="Title(idmName)" value="1997-12"/>
<FOLDER name="/Accreditation/NCACIHE/1997-12"/>
</DOCUMENT>
<DOCUMENT>
I only need a few values from the INDEX elements - those with name attributes of:
Comment(idmComment)
Document Class(idmDocType)
Title(idmName)
This is what I have so far in my testing:
namespace ConsoleApplication1
{
class DocMetaData
{
public string Comment { get; set; }
public string DocClass { get; set; }
public string Title { get; set; }
public string Folder { get; set; }
public string File { get; set; }
}
class Program
{
static void Main(string[] args)
{
XDocument xmlDoc = XDocument.Load(@"convert.B0000001.Submission.xml");
List<DocMetaData> docList =
(from d in xmlDoc.Descendants("DOCUMENT")
select new DocMetaData
{
Folder = d.Element("FOLDER").Attribute("name").Value,
File = d.Element("FILE").Attribute("filename").Value,
// need Comment, DocClass, Title from d.Element("INDEX").Attribute("name")
}
).ToList<DocMetaData>();
foreach (var c in docList)
{
Console.WriteLine("File name = {0}", c.File);
Console.WriteLine("\t" + "Folder = {0}", c.Folder);
}
Console.ReadLine();
}
}
}
I don't think I want a List<Index> inside my DocMetaData class. I want to get rid of the one-to-many aspect of the INDEX elements within DOCUMENT and assign properties as shown in the DocMetaData class. I can't get my head around how to handle these children!
--------EDIT-UPDATE----27 May 2011 ----------------------
Made the following change which caused compile error; have researched the error and tried some rearrangement of using directives but so far unable to get past this:
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml.Linq;
using System.Xml.XPath;
using System.Linq;
namespace ConsoleApplication1
{
class DocMetaData
{
public string Comment { get; set; }
public string DocClass { get; set; }
public string Title { get; set; }
public string Folder { get; set; }
public string File { get; set; }
}
class Program
{
static void Main(string[] args)
{
XDocument xmlDoc = XDocument.Load(@"convert.B0000001.Submission.xml");
List<DocMetaData> docList =
(from d in xmlDoc.Descendants("DOCUMENT")
select new DocMetaData
{
Folder = d.Element("FOLDER").Attribute("name").Value,
File = d.Element("FILE").Attribute("filename").Value,
Comment = d.Element("INDEX")
.Where(i => i.Attribute("name") == "Comment(idmComment)")
.First()
.Attribute("value").Value
}
).ToList<DocMetaData>();
foreach (var c in docList)
{
Console.WriteLine("File name = {0}", c.File);
Console.WriteLine("\t" + "Folder = {0}", c.Folder);
Console.WriteLine("\t\t" + "Comment = {0}", c.Comment);
}
Console.ReadLine();
}
Here is the error (NOTE: I have System.Xml.Linq as a Reference and a using directive for it also):
Error 1 'System.Xml.Linq.XElement' does not contain a definition for 'Where' and no extension method 'Where' accepting a first argument of type 'System.Xml.Linq.XElement' could be found (are you missing a using directive or an assembly reference?) C:\ProjectsVS2010\ConsoleApplication_LINQ\ConsoleApplication1\Program.cs 31 37 ConsoleApplication1

You probably want to get the INDEX elements and then use Where and First to get the one you want.
select new DocMetaData
{
Folder = d.Element("FOLDER").Attribute("name").Value,
File = d.Element("FILE").Attribute("filename").Value,
Comment = d.Elements("INDEX")
.Where(i => i.Attribute("name").Value == "Comment(idmComment)")
.First()
.Attribute("value").Value
//similarly for other index elements
}
Note that this will throw an exception if there is not an INDEX element with the right attribute. If you want to ignore properties for which there is not a corresponding index, I would pull the select code into its own method, use FirstOrDefault, and do the appropriate null checks before assigning.
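The helper method suggested above could be sketched as follows. IndexHelper and IndexValue are hypothetical names. The string casts on XAttribute return null for a missing attribute, so no explicit null checks are needed before the comparison.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class IndexHelper
{
    // Returns the value attribute of the first INDEX child whose name attribute
    // equals indexName, or null when there is no such INDEX element.
    public static string IndexValue(XElement document, string indexName)
    {
        return document.Elements("INDEX")
            .Where(i => (string)i.Attribute("name") == indexName)
            .Select(i => (string)i.Attribute("value"))
            .FirstOrDefault();
    }
}
```

Inside the query, each property then becomes a single call, for example Comment = IndexHelper.IndexValue(d, "Comment(idmComment)"), and a document without that index simply gets a null Comment.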

The secret lies in SelectMany. Here is a blog post that will help you wrap your head around the problem.
http://craigwatson1962.wordpress.com/2010/11/04/linq-to-xml-using-let-yield-return-and-selectmany/
