I am new to LINQ to XML and am having trouble with "children". I have an XML file of info about documents; each document has some number of INDEX elements, as in this snippet:
<DOCUMENTCOLLECTION>
<DOCUMENT>
<FILE filename="Z:\Consulting\ConverterRun4\B0000001\Submission\D003688171.0001.tif" outputpath="Z:\Consulting\ConverterRun4\B0000001\Submission"/>
<ANNOTATION filename=""/>
<INDEX name="CAN(idmDocCustom4)" value=""/>
<INDEX name="Comment(idmComment)" value="GENERAL CORRESPONDENCE 11-6-96 TO 10-29-"/>
<INDEX name="DiagnosticID(idmDocCustom5)" value="983958-0006.MDB-2155504"/>
<INDEX name="Document Class(idmDocType)" value="Submission"/>
<INDEX name="Original File Name(idmDocOriginalFile)" value="40410.TIF"/>
<INDEX name="Title(idmName)" value="1997-12"/>
<FOLDER name="/Accreditation/NCACIHE/1997-12"/>
</DOCUMENT>
<DOCUMENT>
I only need a few values from the INDEX elements - those with name attributes of:
Comment(idmComment)
Document Class(idmDocType)
Title(idmName)
This is what I have so far in my testing:
namespace ConsoleApplication1
{
class DocMetaData
{
public string Comment { get; set; }
public string DocClass { get; set; }
public string Title { get; set; }
public string Folder { get; set; }
public string File { get; set; }
}
class Program
{
static void Main(string[] args)
{
XDocument xmlDoc = XDocument.Load(@"convert.B0000001.Submission.xml");
List<DocMetaData> docList =
(from d in xmlDoc.Descendants("DOCUMENT")
select new DocMetaData
{
Folder = d.Element("FOLDER").Attribute("name").Value,
File = d.Element("FILE").Attribute("filename").Value,
// need Comment, DocClass, Title from d.Element("INDEX").Attribute("name")
}
).ToList<DocMetaData>();
foreach (var c in docList)
{
Console.WriteLine("File name = {0}", c.File);
Console.WriteLine("\t" + "Folder = {0}", c.Folder);
}
Console.ReadLine();
}
}
}
I don't think I want a List<Index> inside my DocMetaData class. I want to get rid of the one-to-many aspect of the INDEX elements within DOCUMENT and assign properties as shown in the DocMetaData class. I can't get my head around how to handle these children!
--------EDIT-UPDATE----27 May 2011 ----------------------
Made the following change which caused compile error; have researched the error and tried some rearrangement of using directives but so far unable to get past this:
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml.Linq;
using System.Xml.XPath;
using System.Linq;
namespace ConsoleApplication1
{
class DocMetaData
{
public string Comment { get; set; }
public string DocClass { get; set; }
public string Title { get; set; }
public string Folder { get; set; }
public string File { get; set; }
}
class Program
{
static void Main(string[] args)
{
XDocument xmlDoc = XDocument.Load(@"convert.B0000001.Submission.xml");
List<DocMetaData> docList =
(from d in xmlDoc.Descendants("DOCUMENT")
select new DocMetaData
{
Folder = d.Element("FOLDER").Attribute("name").Value,
File = d.Element("FILE").Attribute("filename").Value,
Comment = d.Element("INDEX")
.Where(i => i.Attribute("name") == "Comment(idmComment)")
.First()
.Attribute("value").Value
}
).ToList<DocMetaData>();
foreach (var c in docList)
{
Console.WriteLine("File name = {0}", c.File);
Console.WriteLine("\t" + "Folder = {0}", c.Folder);
Console.WriteLine("\t\t" + "Comment = {0}", c.Comment);
}
Console.ReadLine();
}
Here is the error (NOTE: I have System.Xml.Linq as a Reference and a using directive for it also):
Error 1 'System.Xml.Linq.XElement' does not contain a definition for 'Where' and no extension method 'Where' accepting a first argument of type 'System.Xml.Linq.XElement' could be found (are you missing a using directive or an assembly reference?) C:\ProjectsVS2010\ConsoleApplication_LINQ\ConsoleApplication1\Program.cs 31 37 ConsoleApplication1
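d.Element("INDEX") returns a single XElement, which is why Where cannot be found on it. You probably want d.Elements("INDEX") to get all of the INDEX elements, and then Where and First to get the one you want. Also compare the attribute's Value, not the XAttribute itself, to the string.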
select new DocMetaData
{
Folder = d.Element("FOLDER").Attribute("name").Value,
File = d.Element("FILE").Attribute("filename").Value,
Comment = d.Elements("INDEX")
.Where(i => i.Attribute("name").Value == "Comment(idmComment)")
.First()
.Attribute("value").Value
//similarly for other index elements
}
Note that this will throw an exception if there is not an INDEX element with the right attribute. If you want to ignore properties for which there is not a corresponding index, I would pull the select code into its own method, use FirstOrDefault, and do the appropriate null checks before assigning.
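A minimal sketch of that suggestion, assuming a hypothetical helper named IndexValue in a hypothetical class IndexLookup (neither is part of LINQ to XML). It returns null instead of throwing when a document has no INDEX element with the requested name:

```csharp
using System.Linq;
using System.Xml.Linq;

static class IndexLookup
{
    // Hypothetical helper: returns the "value" attribute of the first INDEX child
    // whose "name" attribute equals indexName, or null when there is no such child.
    public static string IndexValue(XElement document, string indexName)
    {
        XElement index = document.Elements("INDEX")
            .FirstOrDefault(i => (string)i.Attribute("name") == indexName);

        // The explicit string cast yields null when the attribute is missing.
        return index == null ? null : (string)index.Attribute("value");
    }
}
```

Inside the select it could then be used as Comment = IndexLookup.IndexValue(d, "Comment(idmComment)"), and likewise for DocClass and Title.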
The secret lies in SelectMany. Here is a blog post that will help you wrap your head around the problem.
http://craigwatson1962.wordpress.com/2010/11/04/linq-to-xml-using-let-yield-return-and-selectmany/
Related
I have this replicate scenario my XML document below:
<?xml version="1.0" encoding="utf-8"?>
<Home>
<Kitchen>
<Pantry>
<Ingredients>
<Name>Tomato</Name>
<ID>1</ID>
<Name>Tomato</Name>
<Income>Sales</Income> // replace the <Income> element with its value <Sales>
<Cost>Materials</Cost>
<Price>100</Price> // the value for new <Sales> element shall be this <Price> value
</Ingredients>
//.. and thousands more sets of ingredients
</Pantry>
</Kitchen>
</Home>
//.. and thousands more sets of ingredients
And I want to restructure it in the following manner:
<?xml version="1.0" encoding="utf-8"?>
<Home>
<Kitchen>
<Pantry>
<Ingredients>
<Name>Tomato</Name>
<ID>1</ID>
<Name>Tomato</Name>
<Sales>100</Sales> // the <Income> was replaced by its value and the value was taken from the <Price> element that was deleted also
<Cost>Materials</Cost>
</Ingredients>
//.. and thousands more sets of ingredients
</Pantry>
</Kitchen>
</Home>
I'm still trying to figure out how I'm going to do this. I will appreciate any help here.
Using LINQ to XML:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication37
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
XDocument doc = XDocument.Load(FILENAME);
List<XElement> ingredients = doc.Descendants("Ingredients").ToList();
foreach (XElement ingredient in ingredients)
{
XElement xIncome = ingredient.Element("Income");
XElement xPrice = ingredient.Element("Price");
xIncome.ReplaceWith(new XElement("Sales", (string)xPrice));
xPrice.Remove();
}
}
}
}
Firstly create a Class for the new Model
public class NewIngredients
{
public int Id { get; set; }
public string Name { get; set; }
public decimal Sales { get; set; }
public string Cost{ get; set; }
}
Presuming the Xml Document is in a file called Kitchen.xml
XElement Kitchen = XElement.Load(@"Kitchen.xml");
Then use LINQ to XML to create your new model from the old one, something like this (note that you will probably need to check for nulls, since Element returns null when a child is missing):
var newIngredients = Kitchen.Descendants("Ingredients").Select(x => new NewIngredients
{
Id = int.Parse(x.Element("ID").Value),
Name = x.Element("Name").Value,
Sales = decimal.Parse(x.Element("Price").Value),
Cost = x.Element("Cost").Value
});
Convert back to xml if needed
var serializer = new XmlSerializer(newIngredients.First().GetType());
serializer.Serialize(Console.Out, newIngredients.First()); //Printed to console but could move to file if needed
I am new to working with LINQ to XML, but I can see how it could be helpful to my current problem. I want a method that takes an XML document and an element value, and returns a different element value from the same parent element. For example, if I provide a "stationName", I would like to know which "scannerType" belongs to that "stationName".
Here is my XML
<?xml version="1.0" encoding="utf-8" ?>
<stations>
<station>
<stationName>CH3CTRM1</stationName>
<scannerType>GE LightSpeed VCT</scannerType>
<scannerID>COL02</scannerID>
<siteName>CUMC</siteName>
<inspDose>180</inspDose>
<expDose>100</expDose>
<kernel>STANDARD</kernel>
</station>
<station>
<stationName>CTAWP75515</stationName>
<scannerType>SIEMENS Force</scannerType>
<scannerID>UIA07</scannerID>
<siteName>Iowa</siteName>
<inspDose>careDose</inspDose>
<expDose>careDose</expDose>
<kernel>Qr40 5</kernel>
</station>
<station>
<stationName>JHEB_CT06N_JHOC2</stationName>
<scannerType>SIEMENS Force</scannerType>
<scannerID>JHU04</scannerID>
<siteName>JHU</siteName>
<inspDose>careDose</inspDose>
<expDose>careDose</expDose>
<kernel>Qr40 5</kernel>
</station>
</stations>
Here are the methods that are currently in question
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml.Linq;
namespace ManualPhantomProcessor.XMLParser
{
class SearchXML
{
public string filename = "SiteData.xml";
public string currentDirectory = Directory.GetCurrentDirectory();
public XDocument LoadXML()
{
string siteDataFilePath = Path.Combine(currentDirectory, filename);
XDocument siteData = XDocument.Load(siteDataFilePath);
return siteData;
}
public IEnumerable<string> GetScannerModel(XDocument xmlDocument, string stationName)
{
var query = xmlDocument.Descendants("station")
.Where(s => s.Element("stationName").Value == stationName)
.Select(s => s.Element("scannerType").Value)
.Distinct();
return query;
}
}
}
Here is my Programs.cs file
using ManualPhantomProcessor.XMLParser;
using System;
using System.Collections.Generic;
using System.Xml.Linq;
namespace ManualPhantomProcessor
{
class Program
{
static void Main(string[] args)
{
SearchXML searchXML = new SearchXML();
XDocument siteData = searchXML.LoadXML();
IEnumerable<string> data = searchXML.GetScannerModel(siteData, "CH3CTRM1");
Console.WriteLine(data);
}
}
}
It should be a simple console application, but it seems that no matter what I try I keep getting a null value when I expect the scannerType value from the XML document that corresponds to the station name "CH3CTRM1".
The application doesn't crash, but in my console I get the following:
System.Linq.Enumerable+DistinctIterator`1[System.String]
Could someone explain what I am doing incorrectly?
Your code is good; the problem is Console.WriteLine(data);. WriteLine has no overload that takes a sequence of strings, so the call falls back to the object overload and prints the result of ToString(), which is the type name. To display the scanner types, use a loop like the following:
foreach (string scannerType in data)
{
Console.WriteLine(scannerType);
}
See the documentation of Console.WriteLine.
I hope that helps you fix the issue.
What you're seeing is the string form of the IEnumerable<string> returned by GetScannerModel().
There are two possibilities:
Only one scanner model is expected to be found (because a station name is expected to be unique).
Any number of scanner models can be found.
In the first case, change GetScannerModel() to return a string and have it do something like return query.FirstOrDefault(); (or .First() if you want an exception when no match is found). In the client, the type of data changes from IEnumerable<string> to string, and Console.WriteLine(data) then prints the value itself.
In the second case, @Sajid's answer applies: you need to enumerate the IEnumerable in some way, for example through foreach.
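For the first case, a minimal sketch of the single-result version. The class name StationLookup is a placeholder for illustration; the string casts return null for a missing element, so a station without a stationName is skipped instead of throwing:

```csharp
using System.Linq;
using System.Xml.Linq;

static class StationLookup
{
    // Returns the scannerType of the first station whose stationName matches,
    // or null when no station matches.
    public static string GetScannerModel(XDocument xmlDocument, string stationName)
    {
        return xmlDocument.Descendants("station")
            .Where(s => (string)s.Element("stationName") == stationName)
            .Select(s => (string)s.Element("scannerType"))
            .FirstOrDefault();
    }
}
```

The caller can then write string model = StationLookup.GetScannerModel(siteData, "CH3CTRM1"); and pass model straight to Console.WriteLine.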
Use a dictionary :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
XDocument doc = XDocument.Load(FILENAME);
List<Station> stations = doc.Descendants("station").Select(x => new Station
{
stationName = (string)x.Element("stationName"),
scannerType = (string)x.Element("scannerType"),
scannerID = (string)x.Element("scannerID"),
siteName = (string)x.Element("siteName"),
inspDose = (string)x.Element("inspDose"),
expDose = (string)x.Element("expDose"),
kernel = (string)x.Element("kernel")
}).ToList();
Dictionary<string, Station> dict = stations
.GroupBy(x => x.stationName, y => y)
.ToDictionary(x => x.Key, y => y.FirstOrDefault());
}
}
public class Station
{
public string stationName {get;set;}
public string scannerType {get;set;}
public string scannerID {get;set;}
public string siteName {get;set;}
public string inspDose {get;set;}
public string expDose {get;set;}
public string kernel { get; set; }
}
}
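The answer above builds the dictionary but stops before the lookup. A reduced sketch of the same idea, mapping each stationName straight to its scannerType. The class name StationIndex is hypothetical, and the code assumes every station element has a stationName (ToDictionary rejects a null key):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class StationIndex
{
    // Maps each stationName to its scannerType; when a name occurs more than once,
    // the first station with that name wins.
    public static Dictionary<string, string> Build(XDocument doc)
    {
        return doc.Descendants("station")
            .GroupBy(s => (string)s.Element("stationName"))
            .ToDictionary(g => g.Key, g => (string)g.First().Element("scannerType"));
    }
}
```

A lookup is then a TryGetValue call on the returned dictionary, which reports whether the station exists without throwing for unknown names.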
I need to deserialize XML that uses a field "type" to indicate what content to expect.
Type 0 says that I can expect simple text whilst type 1 indicates that the content is of a more complex structure.
I know that I could write some custom deserialization mechanism but would like to know whether there was any builtin way to solve this.
Since the XMLSerializer expects a string it simply throws away the content in case it is XML. This stops me from running the content deserialization as a second step.
<Msg>
<MsgType>0</MsgType>
<Data>Some text</Data>
</Msg>
<Msg>
<MsgType>1</MsgType>
<Data>
<Document>
<Type>PDF</Type>
.....
</Document>
</Data>
</Msg>
That isn't supported out of the box; however, you could perhaps use:
public XmlNode Data {get;set;}
and run the "what to do with Data?" as a second step, once you can look at MsgType.
Complete example:
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;
static class P
{
static void Main()
{
const string xml = #"<Foo>
<Msg>
<MsgType>0</MsgType>
<Data>Some text</Data>
</Msg>
<Msg>
<MsgType>1</MsgType>
<Data>
<Document>
<Type>PDF</Type>
.....
</Document>
</Data>
</Msg>
</Foo>";
var fooSerializer = new XmlSerializer(typeof(Foo));
var docSerializer = new XmlSerializer(typeof(Document));
var obj = (Foo)fooSerializer.Deserialize(new StringReader(xml));
foreach (var msg in obj.Messages)
{
switch (msg.MessageType)
{
case 0:
var text = msg.Data.InnerText;
Console.WriteLine($"text: {text}");
break;
case 1:
var doc = (Document)docSerializer.Deserialize(new XmlNodeReader(msg.Data));
Console.WriteLine($"document of type: {doc.Type}");
break;
}
Console.WriteLine();
}
}
}
public class Foo
{
[XmlElement("Msg")]
public List<Message> Messages { get; } = new List<Message>();
}
public class Message
{
[XmlElement("MsgType")]
public int MessageType { get; set; }
public XmlNode Data { get; set; }
}
public class Document
{
public string Type { get; set; }
}
I am working with the USPS Tracking API. They have a specification for a request, as listed below:
<TrackFieldRequest PASSWORD="" USERID="prodsolclient" APPID="">
<Revision>1</Revision>
<ClientIp>111.0.0.1</ClientIp>
<TrackID ID="5551212699300000962610" />
</TrackFieldRequest>
And they state in their user manual that "Up to 10 tracking IDs may be contained in each request input to the Web Tool server."
I interpret this as meaning that the TrackFieldRequest can have up to 10 of the TrackID child elements. However, these multiple TrackID elements are not defined as being in an array. They are just up to 10 consecutive TrackID child elements of the TrackFieldRequest element.
So, I am not sure how to build up the CLR object to pass to the XMLSerializer if I want to include 10 of the TrackID child elements.
I tried creating a TrackFieldRequest class that has a property declared as "List<TrackId> TrackIds", but the USPS website gives me an error response saying "The element 'TrackFieldRequest' has invalid child element 'TrackIds'. List of possible elements expected: 'TrackID'"
How do I model the CLR class so that the XMLSerializer can use it to generate up to 10 TrackID child elements, without using a List or Array property in my TrackFieldRequest class?
Here is my current TrackFieldRequest class
public class TrackFieldRequest
{
// Based upon USPS Web Tools API User Guide(Track & Confirm API) version 3.3 dated 2/28/16
// at https://www.usps.com/business/web-tools-apis/track-and-confirm-api.pdf
[XmlAttribute("USERID")]
public string UserId { get; set; }
[XmlElement("Revision")]
public int Revision { get; set; }
[XmlElement("ClientIp")]
public string ClientIp { get; set; }
[XmlElement("SourceIdZIP")]
public string SourceIdZip { get; set; }
public List<TrackId> TrackIds { get; set; }
}
Here is my current TrackID class
public class TrackId
{
// Based upon USPS Web Tools API User Guide(Track & Confirm API) version 3.3 dated 2/28/16
// at https://www.usps.com/business/web-tools-apis/track-and-confirm-api.pdf
public TrackId(string a_Id, string a_destinationZipCode, string a_mailingDate)
{
ID = a_Id;
DestinationZipCode = a_destinationZipCode;
MailingDate = a_mailingDate.ToString();
}
// Parameterless constructor is needed for the XMLSerializer
public TrackId()
{
}
[XmlAttribute]
public string ID { get; set; }
[XmlElement("DestinationZipCode")]
public string DestinationZipCode { get; set; }
[XmlElement("MailingDate")]
public string MailingDate { get; set; }
}
Here is my method to convert the CLR class into XML using an XmlWriter
private string ConvertTrackingRequestToXml(TrackFieldRequest a_trackingRequest)
{
try
{
var xmlWriterSettings = new XmlWriterSettings
{
Encoding = new UTF8Encoding(false),
Indent = true,
IndentChars = "\t"
};
XmlSerializer xmlSerializer = new XmlSerializer(a_trackingRequest.GetType());
using (StringWriter stringWriter = new StringWriter())
using (XmlWriter xmlWriter = XmlWriter.Create(stringWriter, xmlWriterSettings))
{
xmlSerializer.Serialize(xmlWriter, a_trackingRequest);
return stringWriter.ToString();
}
}
catch (Exception ex)
{
Logger.LogError("Could not convert tracking request into Xml.", ex);
return null;
}
}
I would prefer to use the XmlSerializer rather than manually building up the request XML string with a StringBuilder, if possible.
Any ideas?
Thanks in advance for any help you can provide.
Try this
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string[] trackingNumbers = {"5551212699300000962610", "5551212699300000962611", "5551212699300000962612"};
XElement trackFieldRequest = new XElement("TrackFieldRequest", new object[] {
new XAttribute("PASSWORD", "password"),
new XAttribute("USERID", "prodsolclient"),
new XAttribute("APPID", ""),
new XElement("Revision",1),
new XElement("ClientIp", "111.0.0.1")
});
foreach (string trackingNumber in trackingNumbers)
{
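// The request format carries the tracking number in the ID attribute, not as element text.
trackFieldRequest.Add(new XElement("TrackID", new XAttribute("ID", trackingNumber)));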
}
string xml = trackFieldRequest.ToString();
}
}
}
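If XmlSerializer is preferred, as the question asks, a list property can still be used: putting [XmlElement("TrackID")] on the list makes the serializer write one TrackID element per item with no wrapping TrackIds element. The sketch below uses trimmed-down versions of the question's classes, and the TrackRequestXml helper is a hypothetical name for illustration:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class TrackId
{
    [XmlAttribute("ID")]
    public string ID { get; set; }
}

public class TrackFieldRequest
{
    [XmlAttribute("USERID")]
    public string UserId { get; set; }

    [XmlElement("Revision")]
    public int Revision { get; set; }

    [XmlElement("ClientIp")]
    public string ClientIp { get; set; }

    // XmlElement on a list emits each item as a direct <TrackID> child,
    // instead of nesting the items inside a <TrackIds> wrapper.
    [XmlElement("TrackID")]
    public List<TrackId> TrackIds { get; set; }
}

public static class TrackRequestXml
{
    // Hypothetical helper: serializes the request to an XML string.
    public static string Serialize(TrackFieldRequest request)
    {
        var serializer = new XmlSerializer(typeof(TrackFieldRequest));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, request);
            return writer.ToString();
        }
    }
}
```

Enforcing the limit of 10 tracking IDs is left to the caller; the serializer writes however many items the list holds.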
I have this XML file:
<?xml version="1.0" encoding="utf-8" ?>
<Record>
<File name="1.mot">
<Line address="040004" data="0720" />
<Line address="040037" data="31" />
<Line address="04004C" data="55AA55AA" />
</File>
<File name="2.mot">
<Line address="00008242" data="06" />
<Line address="00008025" data="AFC8" />
<Line address="00009302" data="476F6C64" />
</File>
</Record>
What I want to do is to extract the information from the XML and convert that to list. Although I kind of don't know where and how to start. I've googled samples, and questions and the code below is what I've managed to construct so far. I'm not even sure if this code is appropriate for what I wanted to happen. This list will be used for some kind of lookup in the program. Like in file 1.mot, program would read 1.mot, read the xml file, parse both files, extract the info from the xml file and then do a search function to verify if the info in the xml exists in 1.mot.
XElement xmlReqs = XElement.Load("XMLFile1.xml");
List<Requirement> reqs = new List<Requirement>();
foreach (var xmlReq in xmlReqs.Elements("File"))
{
string name = xmlReqs.Attribute("name").Value);
List<InfoLine> info = new List<InfoLine>();
foreach (var xmlInfo in xmlReq.Elements("Line"))
{
string address = xmlProduct.Attribute("address").Value;
string data = xmlProduct.Attribute("data").Value;
}
reqs.Add(new Requirement(address, data));
}
A friend of mine suggested something about using int array or string array and then using this reqs.Find(val => val[0]==target) but I'm not sure how to do so. I'm not well-versed with linq, but what I've gathered, it seems to be quite notable and powerful (?).
Anyway, will the code above work? And how do I call the objects from the list to use for the lookup function of the program?
UPDATE:
Program would be reading 1.mot or 2.mot (depending on the user preference, that's why file name in xml needs to be specified) simultaneously (or not) with the xml file.
1.mot file contains:
S0030000FC
S21404000055AA55AA072000010008000938383138D7
S21404001046305730343130302020202027992401B0
...
Address starts at the 3rd byte. So yeah, would be comparing the data to these bunch of lines.
You can de-serialize the xml file
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
namespace ConsoleApplication2
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
XmlSerializer xs = new XmlSerializer(typeof(Record));
XmlTextReader reader = new XmlTextReader(FILENAME);
Record record = (Record)xs.Deserialize(reader);
}
}
[XmlRoot("Record")]
public class Record
{
[XmlElement("File")]
public List<File> files {get;set;}
}
[XmlRoot("File")]
public class File
{
[XmlAttribute("name")]
public string name { get; set; }
[XmlElement("Line")]
public List<Line> lines {get;set;}
}
[XmlRoot("Line")]
public class Line
{
[XmlAttribute("address")]
public string address {get;set;}
[XmlAttribute("data")]
public string data {get;set;}
}
}
You could use XmlSerializer to handle the reading of the XML. Create some classes that look like these:
public class Record
{
[XmlElement("File")]
public List<File> Files { get; set; }
}
public class File
{
[XmlAttribute("name")]
public string Name { get; set; }
[XmlElement("Line")]
public List<Line> Lines { get; set; }
}
public class Line
{
[XmlAttribute("address")]
// Kept as a string: values such as "04004C" are hexadecimal and cannot be deserialized as int.
public string Address { get; set; }
[XmlAttribute("data")]
public string Data { get; set; }
}
And deserialise like so:
var serializer = new XmlSerializer(typeof (Record));
using (var reader = XmlReader.Create("XMLFile1.xml"))
{
var record = (Record) serializer.Deserialize(reader);
var first = record.Files.Single(f => f.Name == "1.mot");
var second = record.Files.Single(f => f.Name == "2.mot");
}
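For the lookup part of the question, one possible sketch reads the Line elements of a given File into a dictionary keyed by address. It assumes the address attributes are hexadecimal (as values like "04004C" suggest) and that addresses are unique within a file; the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class LineAddresses
{
    // Returns address -> data for every Line under the File named fileName.
    // Addresses are parsed as base-16 integers; a duplicate address in one file
    // makes ToDictionary throw.
    public static Dictionary<int, string> ForFile(XElement record, string fileName)
    {
        return record.Elements("File")
            .Where(f => (string)f.Attribute("name") == fileName)
            .Elements("Line")
            .ToDictionary(
                l => Convert.ToInt32((string)l.Attribute("address"), 16),
                l => (string)l.Attribute("data"));
    }
}
```

With the record loaded through XElement.Load("XMLFile1.xml"), the expected data for an address found in the .mot file can be fetched with TryGetValue and compared against the bytes read from that file.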