I have a few XML files that I will be using in my C# code.
So far I have been using XPath to access the XML nodes and attributes.
The question is: what advantage would I get, in terms of maintainability and code readability, if I converted the XML to a class file (with XSD.EXE) and used that instead?
In both cases I know that if I add or remove nodes, the code needs to change.
In my case the DLL goes into the GAC.
I am just trying to get your views.
Cheers,
Karthik
The beauty of converting your XML to XSD and then to a C# class is the ease with which you can read yet another file. Your code would be something like:
XmlSerializer ser = new XmlSerializer(typeof(MyClass));
FileStream fstm = new FileStream(@"C:\mysample.xml", FileMode.Open, FileAccess.Read);
MyClass result = ser.Deserialize(fstm) as MyClass;
if(result != null)
{
// do whatever you want with your new class instance!
}
With these few lines, you now have an object that represents exactly what your XML contained, and you can access its contents as properties on the object instance. In my opinion that is much easier than running lots of complicated XPath queries against your XML.
Also, because you now have an XSD, you can easily validate incoming XML files to make sure they actually correspond to the contract it defines. That means less error-checking in your code (you don't have to check after each XPath query whether any node actually matched the expression, etc.).
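To illustrate the validation point, here is a minimal sketch of checking a document against an XSD with a validating XmlReader. The class name XsdValidation and the method Validate are made up for this example, and it takes the XML and XSD as strings only to stay self-contained; in practice you would probably read both from files.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, not part of the original answer.
public static class XsdValidation
{
    // Returns the validation messages; an empty list means no problems were reported.
    public static List<string> Validate(string xml, string xsd)
    {
        var errors = new List<string>();

        var schemas = new XmlSchemaSet();
        using (var xsdReader = XmlReader.Create(new StringReader(xsd)))
        {
            // null means: use the targetNamespace declared in the schema itself.
            schemas.Add(null, xsdReader);
        }

        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Validation happens as a side effect of reading the document.
            while (reader.Read()) { }
        }

        return errors;
    }
}
```

You could also pass the validating reader straight to XmlSerializer.Deserialize, so that validation and deserialization happen in a single pass.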
Related
When I deserialize an XML document with XmlTextReader, a textual element for which there is no corresponding class is simply ignored.
Note: this is not about elements missing from the XML, which one requires to be present, but rather being present in the XML text, while not having an equivalent property in code.
I would have expected to get an exception because if the respective element is missing from the runtime data and I serialize it later, the resulting XML document will be different from the original one. So it's not safe to ignore it (in my real-world case I have just forgotten to define one of the 99+ classes the given document contains, and I didn't notice at first).
So is this normal and if yes, why? Can I somehow request an exception when elements cannot be deserialized?
In the following example-XML I have purposely misspelled "MyComandElement" to illustrate the core problem:
<MyRootElement>
<MyComandElement/>
</MyRootElement>
MyRootElement.cs:
public class CommandElement {};
public class MyRootElement
{
public CommandElement MyCommandElement {get; set;}
}
Deserialization:
XmlSerializer xmlSerializer = new XmlSerializer(typeof(MyRootElement));
XmlTextReader xmlReader = new XmlTextReader(@"pgtest.xml");
MyRootElement mbs2 = (MyRootElement)xmlSerializer.Deserialize(xmlReader);
xmlReader.Close();
As I have found out by accident during further research, this problem is actually ridiculously easy to solve because...
...XmlSerializer supports events! All one has to do is define an event handler for unknown elements:
void Serializer_UnknownElement(object sender, XmlElementEventArgs e)
{
throw new Exception("Unknown element "+e.Element.Name+" found in "
+e.ObjectBeingDeserialized.ToString()+" in line "
+e.LineNumber+" at position "+e.LinePosition);
}
and register the event with XmlSerializer:
xmlSerializer.UnknownElement += Serializer_UnknownElement;
The topic is treated at MSDN, where one also learns that
By default, after calling the Deserialize method, the XmlSerializer ignores XML attributes of unknown types.
Not surprisingly, there are also events for unknown attributes, unknown nodes and unreferenced objects (UnknownAttribute, UnknownNode and UnreferencedObject).
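Putting the pieces above together, a strict deserializer could look roughly like the sketch below. StrictXml and its Deserialize method are names invented for this example; the two classes simply repeat the ones from the question so the snippet compiles on its own.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class CommandElement { }

public class MyRootElement
{
    public CommandElement MyCommandElement { get; set; }
}

// Hypothetical helper: fails loudly instead of silently skipping unknown XML.
public static class StrictXml
{
    public static T Deserialize<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));

        // Raised for elements that have no matching member in the class.
        serializer.UnknownElement += (sender, e) =>
        {
            throw new InvalidOperationException(string.Format(
                "Unknown element '{0}' at line {1}, position {2}",
                e.Element.Name, e.LineNumber, e.LinePosition));
        };

        // Raised for attributes that have no matching member in the class.
        serializer.UnknownAttribute += (sender, e) =>
        {
            throw new InvalidOperationException(string.Format(
                "Unknown attribute '{0}' at line {1}, position {2}",
                e.Attr.Name, e.LineNumber, e.LinePosition));
        };

        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

With the misspelled document from the question, StrictXml.Deserialize<MyRootElement>(...) should throw rather than return an object with a null MyCommandElement. As far as I can tell, Deserialize may wrap the handler's exception in another InvalidOperationException, so check InnerException when reporting the error.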
So is this normal and if yes, why?
Because maybe you're consuming someone else's XML document and whilst they define 300 different elements within their XML, you only care about two. Should you be forced to create classes for all of their elements and deserialize all of them just to be able to access the two you care about?
Or perhaps you're working with a system that is going to be in flux over time. You're writing code that consumes today's XML, and if new elements or attributes are introduced later, they shouldn't stop your tested and deployed code from consuming the parts of the XML that it does understand. (The caveat, of course, is that the XML author should not later introduce elements that are critical to processing the document correctly.)
These are two sides of the same coin of why it can be desirable for the system not to blow up if it encounters unexpected parts within the XML document it's being asked to deserialize.
I have an XML file with a structure like
<items>
<item>
<someDetail>
A value here
</someDetail>
</item>
<item>
<someDetail>
Another value here
</someDetail>
</item>
</items>
With multiple items in it.
I want to deserialize the XML on session start ideally, to turn the XML data to objects based on a class (c# asp.net 4)
I have tried several ways with either no success, or a solution which seems clunky and inelegant.
What would people suggest?
I have tried using the xsd.exe tool, and have tried the XmlReader class, as well as using the XElement class to loop through the XML and then create new someObject(props).
These may be the best and/or only ways, but since it is so easy for database sources using the Entity Framework, I wondered if there was a similar way to do the same from an XML source.
The best way to deserialize XML is to create a class that corresponds to the XML structure, into which the XML data will be deserialized.
The latest serialization technology uses Data Contracts and the DataContractSerializer.
You decorate the class mentioned above with DataContract and DataMember attributes and use the serializer to deserialize.
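For the items/item document from the question, a data-contract version might look like the sketch below. The class names Item, ItemList and ItemLoader are my own choices, and I am assuming the XML has no namespace, which is why every contract sets Namespace to an empty string.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Maps each <item> element; Namespace = "" because the sample XML declares none.
[DataContract(Name = "item", Namespace = "")]
public class Item
{
    [DataMember(Name = "someDetail")]
    public string SomeDetail { get; set; }
}

// Maps the <items> root as a collection of <item> elements.
[CollectionDataContract(Name = "items", ItemName = "item", Namespace = "")]
public class ItemList : List<Item> { }

// Hypothetical helper that wraps the serializer call.
public static class ItemLoader
{
    public static ItemList Load(Stream source)
    {
        var serializer = new DataContractSerializer(typeof(ItemList));
        return (ItemList)serializer.ReadObject(source);
    }
}
```

Bear in mind that DataContractSerializer is pickier than XmlSerializer about namespaces and element order and does not map XML attributes, so for XML you don't control, XmlSerializer is often the easier fit. With the sample document, the SomeDetail values will include the surrounding line breaks, so you may want to Trim() them.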
I'd use directly the .NET XML serialization - classes declarations:
public class Item {
[XmlElement("someDetail")]
public string SomeDetail;
} // class Item
[XmlRoot("items")]
public class MyData {
[XmlElement("item")]
public List<Item> Items;
public static MyData Deserialize(Stream source)
{
XmlSerializer serializer = new XmlSerializer(typeof(MyData));
return serializer.Deserialize(source) as MyData;
} // Deserialize
} // class MyData
and then to read the XML:
using (FileStream fs = new FileStream(@"c:\temp\items.xml", FileMode.Open, FileAccess.Read)) {
MyData myData = MyData.Deserialize(fs);
}
I've concluded that there is no simple unified mechanism (probably due to the inherent complexities of non-trivial cases; this question always crops up in the context of simple scenarios like your example XML).
XML serialization is pretty easy to use. For your example, you would just have to create a class to contain the items and another class for the actual item. You might have to apply some attributes to get everything to work correctly, but there will not be much coding. Then it's as easy as:
var serializer = new XmlSerializer(typeof(ItemsContainer));
var items = serializer.Deserialize(...) as ItemsContainer;
Datasets are sometimes considered "yesterday tech" but I use them when they solve the problem well, and you can leverage the designer. The generated code is not pretty but the bottom line is you can persist to a database via the auto generated adapters and to XML using a method right on the data set. You can read it in this way as well.
XSD.exe isn't that bad once you get used to it. I printed the help to a text file and included it in my solutions for a while. When you use the /c option to create classes, you get clean code that can be used with the XmlSerializer.
Visual Studio 2010 (maybe other versions too) has an XML menu which appears when you have an XML file open, and from that you can also generate an XSD from sample XML. So in a couple of steps you could take your example XML and generate the XSD, then run it through XSD.exe and use the generated classes with a couple of lines of XmlSerializer code. It feels like a lot of machinations, but you get used to it.
I am writing a code generation tool that will take in an XSD file generated by Visual Studio's Data Set Generator and create a custom class for each column in each table. I already understand how to implement an IVsSingleFileGenerator to do the code generation and how to turn that single-file generator into a multi-file generator. However, the step I am having the most trouble with is the one that should be the simplest. I have never really worked with XML or XML Schema before, and I have no clue what the correct way is to iterate through an XSD file and read out the column names and types so I can build my code.
Any recommendations for a tutorial on how to read an XSD file? Also, any recommendations on how to pull out each xs:element that represents a column and read its msprop:Generator_UserColumnName, type, and msprop:Generator_ColumnPropNameInTable properties?
You'll want to create an XmlSchemaSet, read in your schema and then compile it to create an infoset. Once you've done that, you can start iterating through the document:
// Requires: using System.Linq; using System.Xml.Schema;
XmlSchemaElement root = _schema.Items[0] as XmlSchemaElement;
XmlSchemaSequence children = ((XmlSchemaComplexType)root.ElementSchemaType).ContentTypeParticle as XmlSchemaSequence;
// The loop variable must be an XmlSchemaElement, otherwise ElementSchemaType is not accessible.
foreach(XmlSchemaElement child in children.Items.OfType<XmlSchemaElement>()) {
    XmlSchemaComplexType type = child.ElementSchemaType as XmlSchemaComplexType;
    if(type == null) {
        // It's a simple type, no sub-elements.
    } else {
        if(type.Attributes.Count > 0) {
            // Handle declared attributes -- use type.AttributeUses for inherited ones
        }
        XmlSchemaSequence grandchildren = type.ContentTypeParticle as XmlSchemaSequence;
        if(grandchildren != null) {
            foreach(XmlSchemaObject xso in grandchildren.Items) {
                if(xso.GetType().Equals(typeof(XmlSchemaElement))) {
                    // Do something with an element.
                } else if(xso.GetType().Equals(typeof(XmlSchemaSequence))) {
                    // Iterate across the sequence.
                } else if(xso.GetType().Equals(typeof(XmlSchemaAny))) {
                    // Good luck with this one!
                } else if(xso.GetType().Equals(typeof(XmlSchemaChoice))) {
                    foreach(XmlSchemaObject o in ((XmlSchemaChoice)xso).Items) {
                        // Rinse, repeat...
                    }
                }
            }
        }
    }
}
Obviously you'll want to put all the child handling stuff in a separate method and call it recursively, but this should show you the general flow.
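A recursive version of that flow might look like the sketch below. SchemaWalker, DescribeElements and the output format are all invented for this example. The msprop:* attributes the question asks about are not part of XML Schema itself, so I am assuming they surface through UnhandledAttributes, which is where the schema object model keeps attributes from foreign namespaces.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper: lists every element in a schema, indented by depth.
public static class SchemaWalker
{
    public static List<string> DescribeElements(string xsd)
    {
        var set = new XmlSchemaSet();
        using (var reader = XmlReader.Create(new StringReader(xsd)))
        {
            set.Add(null, reader);
        }
        set.Compile(); // fills in the post-compilation properties used below

        var lines = new List<string>();
        foreach (XmlSchema schema in set.Schemas())
            foreach (XmlSchemaElement element in schema.Elements.Values)
                VisitElement(element, 0, lines, new HashSet<XmlSchemaComplexType>());
        return lines;
    }

    private static void VisitElement(XmlSchemaElement element, int depth,
        List<string> lines, HashSet<XmlSchemaComplexType> active)
    {
        string typeName = element.SchemaTypeName.IsEmpty
            ? "(anonymous)" : element.SchemaTypeName.Name;
        string line = new string(' ', depth * 2) + element.QualifiedName.Name + " : " + typeName;

        // Foreign-namespace attributes such as msprop:Generator_UserColumnName.
        if (element.UnhandledAttributes != null)
            foreach (XmlAttribute attr in element.UnhandledAttributes)
                line += " [" + attr.Name + "=" + attr.Value + "]";
        lines.Add(line);

        var complex = element.ElementSchemaType as XmlSchemaComplexType;
        if (complex == null || !active.Add(complex)) return; // simple type, or already on this path
        VisitParticle(complex.ContentTypeParticle, depth + 1, lines, active);
        active.Remove(complex);
    }

    private static void VisitParticle(XmlSchemaParticle particle, int depth,
        List<string> lines, HashSet<XmlSchemaComplexType> active)
    {
        var element = particle as XmlSchemaElement;
        if (element != null) { VisitElement(element, depth, lines, active); return; }

        // XmlSchemaSequence, XmlSchemaChoice and XmlSchemaAll all derive from XmlSchemaGroupBase.
        var group = particle as XmlSchemaGroupBase;
        if (group == null) return; // xs:any or empty content: nothing to descend into
        foreach (XmlSchemaObject item in group.Items)
            VisitParticle(item as XmlSchemaParticle, depth, lines, active);
    }
}
```

The HashSet is only there to stop the walk if a complex type contains itself. Typed dataset schemas are unlikely to do that, but it is cheap insurance.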
As btlog says, XSDs should be parsed as XML files. C# does provide functionality for this.
XPath Tutorial: http://www.w3schools.com/xpath/default.asp
XQuery Tutorial: http://www.w3schools.com/xquery/default.asp
Random C# XmlDocument tutorial: http://www.codeproject.com/KB/cpp/myXPath.aspx
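In C#, XPath queries are run through XmlDocument, in particular through calls like SelectSingleNode and SelectNodes. (As far as I know, the .NET Framework has no built-in XQuery processor, so the XQuery tutorial is background reading only.)
I recommend XmlDocument over XmlTextReader if your goal is to pull out specific chunks of data. If you prefer to stream through the file node by node, XmlTextReader is more appropriate.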
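As a rough sketch of the XmlDocument route applied to an XSD: the one thing that usually trips people up is that XPath needs an XmlNamespaceManager for the xs: prefix. XsdXPath and ColumnNames are hypothetical names, and the msprop namespace URI below is the one I have seen in typed dataset schemas, so check it against your own file.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper: reads column information out of an XSD with XPath.
public static class XsdXPath
{
    private const string XsNamespace = "http://www.w3.org/2001/XMLSchema";
    // Assumption: the URI bound to the msprop prefix in dataset-generated schemas.
    private const string MspropNamespace = "urn:schemas-microsoft-com:xml-msprop";

    // Returns "elementName -> userColumnName" for every named xs:element.
    public static List<string> ColumnNames(string xsd)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xsd);

        // XPath only understands prefixes registered here.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("xs", XsNamespace);

        var result = new List<string>();
        foreach (XmlElement element in doc.SelectNodes("//xs:element[@name]", ns))
        {
            // GetAttribute returns an empty string when the attribute is absent.
            string userName = element.GetAttribute("Generator_UserColumnName", MspropNamespace);
            result.Add(element.GetAttribute("name") + " -> " + userName);
        }
        return result;
    }
}
```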
Update: For those interested in using LINQ to query XML, .NET 3.5 introduced XDocument (LINQ to XML) as an alternative to XmlDocument. See discussion at XDocument or XmlDocument.
You could just load it into an XmlDocument. An XSD is valid XML, so if you are familiar with this type it is pretty simple. Alternatively, use XmlTextReader.
EDIT:
Having a quick search there is a System.Xml.Schema.XmlSchema object that represents a schema, which is most likely more applicable. http://msdn.microsoft.com/en-us/library/system.xml.schema.xmlschema.aspx has a good example of using this class.
Just so you know too, Visual Studio includes a tool called XSD that will already take an XSD file and generate classes for either C# or VB.NET: http://msdn.microsoft.com/en-us/library/x6c1kb0s.aspx
Here is an example of how to get a sorted list of the tableadapters from a generated XSD file. The XML is different depending on if the dataset is part of a web application or a web site. You will need to read through the XSD file to determine exactly what you want to read. Hopefully this will get you started.
Dim XMLDoc As New System.Xml.XmlDocument
XMLDoc.Load("MyDataset.xsd")
Dim oSortedTableAdapters As New Collections.Generic.SortedDictionary(Of String, Xml.XmlElement)
Const WebApplication As Boolean = False
Dim TableAdapters = XMLDoc.GetElementsByTagName("TableAdapter")
For Each TableAdapter As Xml.XmlElement In TableAdapters
If WebApplication Then
'pre-compiled way'
oSortedTableAdapters.Add(TableAdapter.Attributes("GeneratorDataComponentClassName").Value, TableAdapter)
Else
'dynamic compiled way'
oSortedTableAdapters.Add(TableAdapter.Attributes("Name").Value, TableAdapter)
End If
Next
I have to send information to a third party in an XML format they have specified, a very common task I'm sure.
I have set of XSD files and, using XSD.exe, I have created a set of types. To generate the XML I map the values from the types within my domain to the 3rd party types:
public ExternalBar Map(InternalFoo foo) {
var bar = new ExternalBar();
bar.GivenName = foo.FirstName;
bar.FamilyName = foo.LastName;
return bar;
}
I will then use the XmlSerializer to generate the files, probably checking them against the XSD before releasing them.
This method is very manual though and I wonder if there is a better way using the Framework or external tools to map the data and create the files.
LINQ to XML works quite well for this... e.g.
XElement results = new XElement("ExternalFoos",
from f in internalFoos
select new XElement("ExternalFoo", new XAttribute[] {
new XAttribute("GivenName", f.FirstName),
new XAttribute("FamilyName", f.LastName) } ));
Firstly, I'm assuming that the object properties in your existing domain map to the 3rd party types without much manipulation, except for the repetitive property assignments.
So I'd recommend just using standard XML serialization of your domain tree (generate an outbound schema for your classes using XSD), then post-processing the result via a set of XSLT stylesheets. Then after post-processing, validate the resulting XML documents against the 3rd party schemas.
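To give an idea of the post-processing step, here is a small sketch using XslCompiledTransform. XsltMapper and Transform are names I made up, and the stylesheet in the usage note is only an illustration built from the property names in the question.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper: applies an XSLT stylesheet to an XML string.
public static class XsltMapper
{
    public static string Transform(string xml, string xslt)
    {
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(xsltReader);
        }

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xml)))
        // OutputSettings reflects the stylesheet's xsl:output element.
        using (var writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, writer);
        }
        return output.ToString();
    }
}
```

A stylesheet for the mapping in the question would match the serialized InternalFoo element and emit an ExternalBar element, copying FirstName into GivenName and LastName into FamilyName with xsl:value-of. The result can then be validated against the third party's XSD.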
It'll probably be more complicated than that, because it really depends on the complexity of the mapping between the object domains, but this is a method that I've used successfully in the past.
As far as GUI tools are concerned, I've heard (but not used it myself) that Stylus Studio is pretty good for schema-to-schema mappings.
I'm new to .NET and C#, so I want to make sure I'm using the right tool for the job.
The XML I'm receiving is a description of a directory tree on another machine, so it goes many levels deep. What I need to do now is take the XML and create a structure of objects (custom classes) and populate them with info from the XML input, like File, Folder, Tags, Property...
The tree structure of this XML input makes it, in my mind, a prime candidate for using recursion to walk the tree.
Is there a different way of doing this in .NET 3.5?
I've looked at XmlReaders, but they seem to walk the tree in a linear fashion, which is not really what I'm looking for...
The XML I'm receiving is part of a 3rd party API, so it is outside my control and may change in the future.
I've looked into deserialization, but its shortcomings (black-box implementation, need to declare members as public, slow, only works for simple objects...) take it off the list as well.
Thanks for your input on this.
I would use the XLINQ classes in System.Xml.Linq (this is both the namespace and the assembly you will need to reference). Load the XML into an XDocument:
XDocument doc = XDocument.Parse(someString);
Next you can either use recursion or a pseudo-recursion loop to iterate over the child nodes. You can choose your child nodes like this:
//if Directory is tag name of Directory XML
//Note: Root is just the root XElement of the document
var directoryElements = doc.Root.Elements("Directory");
//you get the idea
var fileElements = doc.Root.Elements("File");
The variables directoryElements and fileElements will be IEnumerable types, which means you can use something like a foreach to loop through all of the elements. One way to build up your objects would be something like this:
List<MyFileType> files = new List<MyFileType>();
foreach(XElement fileElement in fileElements)
{
files.Add(new MyFileType()
{
Prop1 = (string)fileElement.Element("Prop1"), //assumes properties are string-valued child elements
Prop2 = (string)fileElement.Element("Prop2"),
});
}
In the example, MyFileType is a type you created to represent files. This is a bit of a brute-force attack, but it will get the job done.
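Since the question is about a tree that can be many levels deep, here is a sketch of the recursive variant. Everything specific in it is an assumption: the tag names Directory and File, the name attribute, and the FolderNode and TreeBuilder types are placeholders for whatever the third-party XML and your own classes actually look like.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical target type for one directory in the tree.
public class FolderNode
{
    public string Name;
    public List<string> Files = new List<string>();
    public List<FolderNode> Folders = new List<FolderNode>();
}

public static class TreeBuilder
{
    // Builds a FolderNode for the given <Directory> element and, recursively,
    // for every <Directory> element nested inside it.
    public static FolderNode Build(XElement directory)
    {
        var node = new FolderNode();
        // The cast returns null instead of throwing when the attribute is missing.
        node.Name = (string)directory.Attribute("name");

        node.Files.AddRange(
            directory.Elements("File").Select(f => (string)f.Attribute("name")));

        // Elements() returns direct children only, so each level handles itself.
        node.Folders.AddRange(
            directory.Elements("Directory").Select(d => Build(d)));

        return node;
    }
}
```

You would call it with something like TreeBuilder.Build(doc.Root), assuming the root element is itself a Directory. Unknown child elements are simply ignored, which helps if the third party adds tags later.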
If you want to use XPath you will need to add using System.Xml.XPath; to your file.
A Note on System.Xml vs System.Xml.Linq
There are a number of XML classes that have been in .Net since the 1.0 days. These live (mostly) in System.Xml. In .Net 3.5, a wonderful, new set of XML classes were released under System.Xml.Linq. I cannot over-emphasize how much nicer they are to work with than the old classes in System.Xml. I would highly recommend them to any .Net programmer and especially someone just getting into .Net/C#.
XmlReader isn't a particularly friendly API. If you can use .NET 3.5, then loading into LINQ to XML is likely to be your best bet. You could easily use recursion with that.
Otherwise, XmlDocument would still do the trick... just a bit less pleasantly.
This is a problem which is very suitable for recursion.
To elaborate a bit more on what another poster said, you'll want to start by loading the XML into a System.Xml.XmlDocument, (using LoadXml or Load).
You can access the root of the tree using the XmlDocument.DocumentElement property, and access the children of each node through its ChildNodes property. ChildNodes returns a collection; when its Count is 0, you have reached your base case.
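A bare-bones sketch of that recursion follows. DomWalker and ElementPaths are invented names, and collecting slash-separated paths is just a stand-in for whatever you actually want to do at each node, such as creating your File and Folder objects.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper: walks an XmlDocument recursively via ChildNodes.
public static class DomWalker
{
    // Returns one path per element, in document order.
    public static List<string> ElementPaths(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var paths = new List<string>();
        Walk(doc.DocumentElement, "", paths);
        return paths;
    }

    private static void Walk(XmlNode node, string parentPath, List<string> paths)
    {
        // ChildNodes also contains text, comment and whitespace nodes; skip those.
        if (node.NodeType != XmlNodeType.Element) return;

        string path = parentPath + "/" + node.Name;
        paths.Add(path);

        // Base case: when ChildNodes is empty the loop body never runs.
        foreach (XmlNode child in node.ChildNodes)
            Walk(child, path, paths);
    }
}
```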
Using LINQ is also a good option, but I'm unable to elaborate on that solution because I'm not really a LINQ expert.
As Jon mentioned, XmlReader isn't very friendly. If you end up having perf issues, you might want to look into it, but if you just want to get the job done, go with XmlDocument/ChildNodes using recursion.
Load your XML into an XmlDocument. You can then walk the XmlDocument's DOM using recursion.
You might want to also look into the factory method pattern to create your classes, would be very useful here.