How to parse an XSD file - C#

I am writing a code generation tool that will take in an XSD file generated by Visual Studio's Data Set Generator and create a custom class for each column in each table. I already understand how to implement an IVsSingleFileGenerator to do the code generation and how to turn that single-file generator into a multi-file generator. However, the step I am having the most trouble with is the one that should be the simplest. I have never really worked with XML or XML Schema before, and I have no clue what the correct way is to iterate through an XSD file and read out the column names and types so I can build my code.
Any recommendations for a tutorial on how to read an XSD file? Also, any recommendations on how to pull out each xs:element that represents a column and read its msprop:Generator_UserColumnName, type, and msprop:Generator_ColumnPropNameInTable properties?

You'll want to create an XmlSchemaSet, read in your schema and then compile it to create an infoset. Once you've done that, you can start iterating through the document:
// _schema is the compiled XmlSchema; OfType<T>() needs "using System.Linq;".
XmlSchemaElement root = _schema.Items[0] as XmlSchemaElement;
XmlSchemaSequence children = ((XmlSchemaComplexType)root.ElementSchemaType).ContentTypeParticle as XmlSchemaSequence;
foreach(XmlSchemaElement child in children.Items.OfType<XmlSchemaElement>()) {
    XmlSchemaComplexType type = child.ElementSchemaType as XmlSchemaComplexType;
    if(type == null) {
        // It's a simple type, no sub-elements.
    } else {
        if(type.Attributes.Count > 0) {
            // Handle declared attributes -- use type.AttributeUses for inherited ones
        }
        XmlSchemaSequence grandchildren = type.ContentTypeParticle as XmlSchemaSequence;
        if(grandchildren != null) {
            foreach(XmlSchemaObject xso in grandchildren.Items) {
                if(xso.GetType().Equals(typeof(XmlSchemaElement))) {
                    // Do something with an element.
                } else if(xso.GetType().Equals(typeof(XmlSchemaSequence))) {
                    // Iterate across the sequence.
                } else if(xso.GetType().Equals(typeof(XmlSchemaAny))) {
                    // Good luck with this one!
                } else if(xso.GetType().Equals(typeof(XmlSchemaChoice))) {
                    foreach(XmlSchemaObject o in ((XmlSchemaChoice)xso).Items) {
                        // Rinse, repeat...
                    }
                }
            }
        }
    }
}
Obviously you'll want to put all the child handling stuff in a separate method and call it recursively, but this should show you the general flow.
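The first step described above (create an XmlSchemaSet, add the schema, compile it) is not shown in the snippet, so here is a minimal sketch of it. It also reads the non-schema attributes (such as the msprop: ones) through UnhandledAttributes. The class name, method name and the tiny inline schema are made up for illustration; a real typed-DataSet XSD nests its column elements deeper, so you would combine this with the traversal shown above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class SchemaSetSketch
{
    // Compiles the schema text and returns one "element: attribute=value" line
    // for every foreign (non-XSD) attribute found on the global elements.
    public static List<string> GlobalElementExtras(string xsdText)
    {
        var schemaSet = new XmlSchemaSet();
        schemaSet.ValidationEventHandler += (sender, e) => Console.Error.WriteLine(e.Message);
        schemaSet.Add(null, XmlReader.Create(new StringReader(xsdText)));
        schemaSet.Compile();

        var lines = new List<string>();
        foreach (XmlSchema schema in schemaSet.Schemas())
        {
            foreach (XmlSchemaElement element in schema.Elements.Values)
            {
                // UnhandledAttributes is null when there are no foreign attributes.
                XmlAttribute[] extras = element.UnhandledAttributes ?? new XmlAttribute[0];
                foreach (XmlAttribute attr in extras)
                    lines.Add(element.Name + ": " + attr.Name + "=" + attr.Value);
            }
        }
        return lines;
    }

    static void Main()
    {
        // Hypothetical one-element schema, only to make the sketch runnable.
        string xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
            "xmlns:msprop='urn:schemas-microsoft-com:xml-msprop'>" +
            "<xs:element name='Customers' type='xs:string' " +
            "msprop:Generator_UserTableName='Customers'/>" +
            "</xs:schema>";
        foreach (string line in GlobalElementExtras(xsd))
            Console.WriteLine(line);
    }
}
```

The same UnhandledAttributes check can be applied to each nested XmlSchemaElement reached by the recursive traversal.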

As btlog says, XSDs should be parsed as XML files. C# does provide functionality for this.
XPath Tutorial: http://www.w3schools.com/xpath/default.asp
XQuery Tutorial: http://www.w3schools.com/xquery/default.asp
Random C# XmlDocument tutorial: http://www.codeproject.com/KB/cpp/myXPath.aspx
In C#, XPath queries are run through XmlDocument, in particular through calls like SelectSingleNode and SelectNodes. (The framework itself has no built-in XQuery engine, so XPath is the one that applies here.)
I recommend XmlDocument over XmlTextReader if your goal is to pull out specific chunks of data. If you prefer to read the file sequentially, node by node, XmlTextReader is more appropriate.
Update: For those interested in using LINQ to query XML, .NET 3.5 introduced XDocument as an alternative to XmlDocument. See discussion at XDocument or XmlDocument.
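As a rough illustration of the XmlDocument plus XPath route applied to the original question, the sketch below selects every xs:element that carries an msprop:Generator_UserColumnName attribute and reads its type. The class and method names and the inline sample are hypothetical. The two namespace URIs are the standard XML Schema namespace and the one Visual Studio declares for msprop; check the xmlns declarations at the top of your own XSD to confirm them.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class XsdXPathSketch
{
    const string XsNs = "http://www.w3.org/2001/XMLSchema";
    const string MspropNs = "urn:schemas-microsoft-com:xml-msprop";

    // Returns "userColumnName : type" for each column-like xs:element.
    public static List<string> DescribeColumns(string xsdText)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xsdText);

        // XPath needs the prefixes registered explicitly.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("xs", XsNs);
        ns.AddNamespace("msprop", MspropNs);

        var result = new List<string>();
        XmlNodeList nodes = doc.SelectNodes(
            "//xs:element[@msprop:Generator_UserColumnName]", ns);
        foreach (XmlElement el in nodes)
        {
            string column = el.GetAttribute("Generator_UserColumnName", MspropNs);
            string type = el.GetAttribute("type"); // empty string if absent
            result.Add(column + " : " + type);
        }
        return result;
    }

    static void Main()
    {
        // Hypothetical fragment shaped like a DataSet column declaration.
        string xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
            "xmlns:msprop='urn:schemas-microsoft-com:xml-msprop'>" +
            "<xs:element name='CustomerID' type='xs:string' " +
            "msprop:Generator_UserColumnName='CustomerID' " +
            "msprop:Generator_ColumnPropNameInTable='CustomerIDColumn'/>" +
            "</xs:schema>";
        foreach (string line in DescribeColumns(xsd))
            Console.WriteLine(line); // prints "CustomerID : xs:string"
    }
}
```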

You could just load it into an XmlDocument. XSD is valid XML, so if you are familiar with this type it is pretty simple. Alternatively, use XmlTextReader.
EDIT:
After a quick search: there is a System.Xml.Schema.XmlSchema class that represents a schema, which is most likely more applicable. http://msdn.microsoft.com/en-us/library/system.xml.schema.xmlschema.aspx has a good example of using this class.

Just so you know too, Visual Studio includes a tool called XSD that will already take an XSD file and generate classes for either C# or VB.NET: http://msdn.microsoft.com/en-us/library/x6c1kb0s.aspx

Here is an example of how to get a sorted list of the tableadapters from a generated XSD file. The XML is different depending on if the dataset is part of a web application or a web site. You will need to read through the XSD file to determine exactly what you want to read. Hopefully this will get you started.
Dim XMLDoc As New System.Xml.XmlDocument
XMLDoc.Load("MyDataset.xsd")
Dim oSortedTableAdapters As New Collections.Generic.SortedDictionary(Of String, Xml.XmlElement)
Const WebApplication As Boolean = False
Dim TableAdapters = XMLDoc.GetElementsByTagName("TableAdapter")
For Each TableAdapter As Xml.XmlElement In TableAdapters
If WebApplication Then
'pre-compiled way'
oSortedTableAdapters.Add(TableAdapter.Attributes("GeneratorDataComponentClassName").Value, TableAdapter)
Else
'dynamic compiled way'
oSortedTableAdapters.Add(TableAdapter.Attributes("Name").Value, TableAdapter)
End If
Next

Related

C# What is the best approach for reaching any given XML node by referencing its parent?

(Answer to - if this question is a duplicate of how to deserialize XML to objects in C#)
My question is far more abstract than a concrete one, I didn't ask for a specific implementation, I wanted to hear approaches to find the best practice, should be relevant for any C# developer that seeks for the best C# code design when it comes to dealing with XML configuration files.
In JavaScript you can point to a JSON parent node and save its reference.
That reference holds only one level of descendants, and those descendants carry the same kind of reference to their own descendants.
Example:
Config.Json (the file name)
"parentNode" : {
"descendant_1" :{
"content_1" : "abcde",
"content_2" : "abcdef"
},
"descendant_2" :{
"content_1" : "abcdefg",
"content_2" : "abcdefgh"
}
}
Now in JavaScript I can create a reference to that config.json plus to any node I want in it:
var data = Config.get('parentNode.descendant_2');
Get implementation:
Config.prototype.get = function get (key, defaultVal) {
try {
return JSON.parse(JSON.stringify(config.get(key)));
}catch (err) {
//return default value if one exists
if (typeof defaultVal !== 'undefined') return defaultVal; //replaces if(defaultVal) with if (arguments.length > 1) to avoid mistaking default value = false as null
//nothing to do, crash
throw err;
}
};
Now I can pass that config reference as an argument to any function I want, and inside that function I can keep digging into the nested nodes to reach whichever node I need:
data.content_1 will give me abcdefg
data.content_2 will give me abcdefgh
My question is, what is the most efficient way to achieve such ability in C# using XML?
(Using XML and Strongly typed language isn't my call, I don't call those shots, so please avoid such irrelevant comments, thanks in advance)
XML file:
<role name="parentNode">
  <descendant name="descendant_1">
    <content_1 name="abcde"></content_1>
    <content_2 name="abcdef"></content_2>
  </descendant>
  <descendant name="descendant_2">
    <content_1 name="abcdefg"></content_1>
    <content_2 name="abcdefgh"></content_2>
  </descendant>
</role>
In C# I can call the parentNode in the following:
Linq to XML using XDocument
XPath
My goal is to find an approach that brings me as close as possible to JavaScript's fluent style of writing code.
That means creating a reference to the parent node and being able to reach any of its descendants through a "fluent API".
One can use LINQ to XML plus the XPathSelectElements extension method (add using System.Xml.XPath;):
var items = XDocument.Load("data.xml")
    .XPathSelectElements("/role[@name='parentNode']/*[@name='descendant_2']/*");
You can also do it with just LINQ to XML:
var items = XDocument.Load("data.xml")
    .Descendants("descendant")
    .Where(element => (string)element.Attribute("name") == "descendant_2")
    .Descendants();
If your XML config has a static format, you should look into XML serialization. This will allow you to convert the XML into a C# object.
You could also use dynamic, but I wouldn't recommend it unless your XML format really is dynamic.
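To get closer to the JavaScript style of holding a reference to a parent node and drilling down from it, one option is a small extension method on XElement. This is only a sketch: ChildNamed is a made-up helper, not part of LINQ to XML, and the inline XML mirrors the sample from the question with the values stored in the name attributes.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class ConfigSketch
{
    // Hypothetical helper: first child whose name attribute matches, or null.
    public static XElement ChildNamed(this XElement parent, string name)
    {
        return parent.Elements()
                     .FirstOrDefault(e => (string)e.Attribute("name") == name);
    }

    static void Main()
    {
        XElement role = XElement.Parse(
            "<role name='parentNode'>" +
            "<descendant name='descendant_1'>" +
            "<content_1 name='abcde'/><content_2 name='abcdef'/>" +
            "</descendant>" +
            "<descendant name='descendant_2'>" +
            "<content_1 name='abcdefg'/><content_2 name='abcdefgh'/>" +
            "</descendant>" +
            "</role>");

        // Keep a reference to one node and pass it around like the JS "data" object.
        XElement data = role.ChildNamed("descendant_2");
        Console.WriteLine((string)data.Element("content_1").Attribute("name")); // abcdefg
        Console.WriteLine((string)data.Element("content_2").Attribute("name")); // abcdefgh
    }
}
```

Because data is an ordinary XElement, any method that receives it can keep navigating with Element, Elements or further ChildNamed calls.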

Asp.net XML to objects

I have an XML file with a structure like
<items>
<item>
<someDetail>
A value here
</someDetail>
</item>
<item>
<someDetail>
Another value here
</someDetail>
</item>
</items>
With multiple items in it.
I want to deserialize the XML on session start ideally, to turn the XML data to objects based on a class (c# asp.net 4)
I have tried several ways with either no success, or a solution which seems clunky and inelegant.
What would people suggest?
I have tried using the xsd.exe tool, and have tried the XmlReader class, as well as using the XElement class to loop through the XML and then create new someObject(props).
These may be the best and/or only ways, but with it being so easy for database sources using the Entity Framework, I wondered if there was a similar way to do the same from an XML source.
The best way to deserialize XML is to create a class that corresponds to the XML structure, into which the XML data will deserialize.
The latest serialization technology uses Data Contracts and the DataContractSerializer.
You decorate the class I mentioned above with DataContract and DataMember attributes and use the serializer to deserialize.
I'd use the .NET XML serialization directly. The class declarations:
public class Item {
    [XmlElement("someDetail")]
    public string SomeDetail;
} // class Item
[XmlRoot("items")]
public class MyData {
    [XmlElement("item")]
    public List<Item> Items;
    public static MyData Deserialize(Stream source)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(MyData));
        return serializer.Deserialize(source) as MyData;
    } // Deserialize
} // class MyData
and then to read the XML:
using (FileStream fs = new FileStream(@"c:\temp\items.xml", FileMode.Open, FileAccess.Read)) {
    MyData myData = MyData.Deserialize(fs);
}
I've concluded that there is no simple unified mechanism (probably due to the inherent complexities involved in non-trivial cases; this question always crops up in the context of simple scenarios like your example XML).
XML serialization is pretty easy to use. For your example, you would just have to create one class to contain the items and another class for the actual item. You might have to apply some attributes to get everything to work correctly, but it will not be much code. Then it's as easy as:
var serializer = new XmlSerializer(typeof(ItemsContainer));
var items = serializer.Deserialize(...) as ItemsContainer;
Datasets are sometimes considered "yesterday tech" but I use them when they solve the problem well, and you can leverage the designer. The generated code is not pretty but the bottom line is you can persist to a database via the auto generated adapters and to XML using a method right on the data set. You can read it in this way as well.
XSD.exe isn't that bad once you get used to it. I printed the help to a text file and included it in my solutions for a while. When you use the /c option to create classes, you get clean code that can be used with the XmlSerializer.
Visual Studio 2010 (maybe other versions too) has an XML menu which appears when you have an Xml file open and from that you can also generate an XSD from sample Xml. So in a couple of steps you could take your example xml and generate the XSD, then run it through XSD.exe and use the generated classes with a couple of lines XmlSerializer code... it feels like a lot of machinations but you get used to it.

Question on usage of XML with XPath vs XML as Class

I have a few XML files that I will be using in my C# code.
So far I have been using XPath to access the XML nodes and attributes.
The question is: what advantage would I get, in terms of maintainability and code readability, if I converted the XML to a class file (XSD.EXE) and used that instead?
In both the cases I know if I add or remove some nodes, code needs to be changed
In my case the DLL goes into GAC.
I am just trying to get your views
Cheers,
Karthik
The beauty of converting your XML to XSD and then to a C# class is the ease with which you can grab yet another file. Your code would be something like:
XmlSerializer ser = new XmlSerializer(typeof(MyClass));
using (FileStream fstm = new FileStream(@"C:\mysample.xml", FileMode.Open, FileAccess.Read))
{
    MyClass result = ser.Deserialize(fstm) as MyClass;
    if(result != null)
    {
        // do whatever you want with your new class instance!
    }
}
With these few lines, you now have an object that represents exactly what your XML contained, and you can access its contents as properties on the object instance, which in my opinion is much easier than writing lots of complicated XPath queries against your XML.
Also, thanks to the fact you now have a XSD, you can also easily validate incoming XML files to make sure they actually do correspond to the contract defined - which causes less constant error-checking in your code (you don't have to check after each XPath to see whether there's any node(s) that actually match that expression etc.).
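The validation step mentioned above can be sketched roughly as follows, using XmlSchemaSet together with the Validate extension method for XDocument. The class name, method name and the inline schema and documents are illustrative assumptions, not taken from the question; in practice you would load your own XSD and incoming XML files.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

static class XsdValidationSketch
{
    // Returns the number of validation errors found in xmlText.
    public static int CountErrors(string xsdText, string xmlText)
    {
        var schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(xsdText)));

        int errors = 0;
        XDocument.Parse(xmlText).Validate(schemas, (sender, e) =>
        {
            if (e.Severity == XmlSeverityType.Error) errors++;
        });
        return errors;
    }

    static void Main()
    {
        // Hypothetical schema: <items> holding one or more <item><someDetail/></item>.
        string xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
            "<xs:element name='items'><xs:complexType><xs:sequence>" +
            "<xs:element name='item' maxOccurs='unbounded'><xs:complexType><xs:sequence>" +
            "<xs:element name='someDetail' type='xs:string'/>" +
            "</xs:sequence></xs:complexType></xs:element>" +
            "</xs:sequence></xs:complexType></xs:element>" +
            "</xs:schema>";

        string good = "<items><item><someDetail>A value</someDetail></item></items>";
        string bad = "<items><item><other>oops</other></item></items>";

        Console.WriteLine(CountErrors(xsd, good)); // 0
        Console.WriteLine(CountErrors(xsd, bad) > 0); // True
    }
}
```

Validating up front like this lets the code that follows assume the document has the expected shape.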

How can I transform an object graph to an external XML format

I have to send information to a third party in an XML format they have specified, a very common task I'm sure.
I have a set of XSD files and, using XSD.exe, I have created a set of types. To generate the XML I map the values from the types within my domain to the 3rd party types:
public ExternalBar Map(InternalFoo foo) {
    var bar = new ExternalBar();
    bar.GivenName = foo.FirstName;
    bar.FamilyName = foo.LastName;
    return bar;
}
I will then use the XmlSerializer to generate the files, probably checking them against the XSD before releasing them.
This method is very manual though and I wonder if there is a better way using the Framework or external tools to map the data and create the files.
LINQ to XML works quite well for this... e.g.
XElement results = new XElement("ExternalFoos",
from f in internalFoos
select new XElement("ExternalFoo", new XAttribute[] {
new XAttribute("GivenName", f.FirstName),
new XAttribute("FamilyName", f.LastName) } ));
Firstly, I'm assuming that the object properties in your existing domain map to the 3rd party types without much manipulation, except for the repetitive property assignments.
So I'd recommend just using standard XML serialization of your domain tree (generate an outbound schema for your classes using XSD), then post-processing the result via a set of XSLT stylesheets. Then after post-processing, validate the resulting XML documents against the 3rd party schemas.
It'll probably be more complicated than that, because it really depends on the complexity of the mapping between the object domains, but this is a method that I've used successfully in the past.
As far as GUI tools are concerned, I've heard (but not used myself) that Stylus Studio is pretty good for schema-to-schema mappings.

Walking an XML tree in C#

I'm new to .NET and C#, so I want to make sure I'm using the right tool for the job.
The XML I'm receiving is a description of a directory tree on another machine, so it can go many levels deep. What I need to do now is take the XML and create a structure of objects (custom classes) and populate them with info from the XML input, like File, Folder, Tags, Property...
The tree structure of this XML input makes it, in my mind, a prime candidate for using recursion to walk the tree.
Is there a different way of doing this in .NET 3.5?
I've looked at XmlReaders, but they seem to walk the tree in a linear fashion, which is not really what I'm looking for...
The XML I'm receiving is part of a 3rd party API, so it is outside my control and may change in the future.
I've looked into deserialization, but its shortcomings (black box implementation, need to declare members as public, slow, only works for simple objects...) take it off the list as well.
Thanks for your input on this.
I would use the XLINQ classes in System.Xml.Linq (this is both the namespace and the assembly you will need to reference). Load the XML into an XDocument:
XDocument doc = XDocument.Parse(someString);
Next you can either use recursion or a pseudo-recursion loop to iterate over the child nodes. You can choose your child nodes like this:
//if Directory is tag name of Directory XML
//Note: Root is just the root XElement of the document
var directoryElements = doc.Root.Elements("Directory");
//you get the idea
var fileElements = doc.Root.Elements("File");
The variables directoryElements and fileElements will be IEnumerable types, which means you can use something like a foreach to loop through all of the elements. One way to build up your elements would be something like this:
List<MyFileType> files = new List<MyFileType>();
foreach(XElement fileElement in fileElements)
{
    files.Add(new MyFileType()
    {
        Prop1 = fileElement.Element("Prop1"), //assumes properties are elements
        Prop2 = fileElement.Element("Prop2"),
    });
}
In the example, MyFileType is a type you created to represent files. This is a bit of a brute-force attack, but it will get the job done.
If you want to use XPath you will need to add using System.Xml.XPath;.
A Note on System.Xml vs System.Xml.Linq
There are a number of XML classes that have been in .Net since the 1.0 days. These live (mostly) in System.Xml. In .Net 3.5, a wonderful, new set of XML classes were released under System.Xml.Linq. I cannot over-emphasize how much nicer they are to work with than the old classes in System.Xml. I would highly recommend them to any .Net programmer and especially someone just getting into .Net/C#.
XmlReader isn't a particularly friendly API. If you can use .NET 3.5, then loading into LINQ to XML is likely to be your best bet. You could easily use recursion with that.
Otherwise, XmlDocument would still do the trick... just a bit less pleasantly.
This is a problem which is very suitable for recursion.
To elaborate a bit more on what another poster said, you'll want to start by loading the XML into a System.Xml.XmlDocument, (using LoadXml or Load).
You can access the root of the tree using the XmlDocument.DocumentElement property, and access the children of each node by using the ChildNodes property. ChildNodes returns a collection, and when that collection is of size 0, you know you have reached your base case.
Using LINQ is also a good option, but I'm unable to elaborate on that solution because I'm not really a LINQ expert.
As Jon mentioned, XmlReader isn't very friendly. If you end up having perf issues, you might want to look into it, but if you just want to get the job done, go with XmlDocument/ChildNodes using recursion.
Load your XML into an XmlDocument. You can then walk the XmlDocument's DOM using recursion.
You might want to also look into the factory method pattern to create your classes, would be very useful here.
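Several answers above suggest recursion without showing it, so here is a minimal sketch of a recursive walk with LINQ to XML. The TreeNode class, the Directory/File element names and the name attribute are assumptions made for the example; the real third-party format will differ.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical object model for one entry in the directory tree.
class TreeNode
{
    public string Kind;   // element name, e.g. "Directory" or "File"
    public string Name;   // value of the name attribute, or null if absent
    public List<TreeNode> Children = new List<TreeNode>();
}

static class TreeWalkSketch
{
    // Builds a TreeNode for the element and, recursively, for all child elements.
    public static TreeNode Build(XElement element)
    {
        var node = new TreeNode
        {
            Kind = element.Name.LocalName,
            Name = (string)element.Attribute("name")
        };
        foreach (XElement child in element.Elements())
            node.Children.Add(Build(child)); // recursion ends at leaf elements
        return node;
    }

    static void Print(TreeNode node, int depth)
    {
        Console.WriteLine(new string(' ', depth * 2) + node.Kind + " " + node.Name);
        foreach (TreeNode child in node.Children)
            Print(child, depth + 1);
    }

    static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<Directory name='root'>" +
            "<File name='a.txt'/>" +
            "<Directory name='sub'><File name='b.txt'/></Directory>" +
            "</Directory>");
        Print(Build(doc.Root), 0);
    }
}
```

A factory method, as suggested above, could replace the single TreeNode class by switching on element.Name.LocalName inside Build and returning a File, Folder or other custom type.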
