DeSerializing an XML file to a C# class - c#

Does anyone know what advantages (memory/speed) there are in using a class generated by the XSD tool to explore a deserialized XML file, as opposed to XPath?

I'd say the advantage is that you get a strongly typed class, which is more convenient to use. Also, XmlSerializer.Deserialize (not the class constructor) will throw an InvalidOperationException if the file is not well-formed XML or a value cannot be converted to the member's type, so you get minimal data validation for free.

If you don't want to write boilerplate code, and you need to check ANY values of your XML on the way through, you can't go wrong with the XSD.exe generated classes.

The two are very different: XmlSerializer will always deserialize entire objects, whereas with XPath you can pick and choose. I'd use XmlSerializer personally, though - harder to get wrong.
XPath, however, is a complex beast that depends on the back-end. For example, XmlDocument (mutable) will behave differently to XPathDocument (read-only, optimized for query).
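To make the comparison concrete, here is a minimal sketch of both styles reading the same document. The Book class and the sample XML are invented for illustration; a class generated by xsd.exe would carry more attributes but is used the same way.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;
using System.Xml.XPath;

// Hypothetical type standing in for a class generated by xsd.exe.
public class Book
{
    public string Title { get; set; }
    public decimal Price { get; set; }
}

public static class TypedVersusXPath
{
    public const string Xml =
        "<Book><Title>CLR via C#</Title><Price>39.90</Price></Book>";

    // Strongly typed: the whole document is materialised as a Book.
    public static Book ReadTyped(string xml)
    {
        var serializer = new XmlSerializer(typeof(Book));
        using (var reader = new StringReader(xml))
        {
            // Throws InvalidOperationException if, for example,
            // Price does not contain a number.
            return (Book)serializer.Deserialize(reader);
        }
    }

    // XPath: pick out a single value without building an object.
    public static string ReadTitleWithXPath(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            XPathNavigator navigator = new XPathDocument(reader).CreateNavigator();
            return navigator.SelectSingleNode("/Book/Title").Value;
        }
    }

    public static void Main()
    {
        Book book = ReadTyped(Xml);
        Console.WriteLine(book.Title + " costs " + book.Price);
        Console.WriteLine(ReadTitleWithXPath(Xml));
    }
}
```

The typed version pays for parsing the whole object up front; the XPath version only touches what the query selects, at the cost of stringly-typed access.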

Related

Store xml in C# class field

I have a requirement to write an entity class in C# that can hold XML data.
I want to avoid the overhead of checking the well-formedness of the saved XML.
The corresponding database column has type XML. Is there an XML data type or a class that can be used as a class field to hold XML?
Thanks in advance
Update: The service using this Entity class is WCF service and in future we are making it REST compatible. Will XmlDocument or XElement work with it?
There are a number of ways, two of which are string and XmlDocument.
string would be 'easier' for fragments and not-well-formed XML, but XmlDocument can be configured with options to allow fragments; you'll have more trouble with ill-formed data though.
If you describe your object in an XSD file, you can get a code generator to produce all your C# classes automatically and easily regenerate them when you make changes.
This makes XML / C# a breeze. You can target other languages too, using equivalent generators.
See the tools described here: XSDObjectGen.exe vs XSD.exe
I believe the XSD.exe tool will read in example XML and do most of the work of producing an XSD which you can refine.
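If you don't need to support a fixed XML/XSD file format, why not let the .NET APIs serialize/deserialize your class to XML directly? XmlSerializer works with any public class that has a parameterless constructor; the [Serializable] attribute is only required by the binary and SOAP formatters.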

XML (de)serialization and schema upgrades

I have a complex graph of XML-serializable classes that I'm able to (de)serialize to hard-disk just fine. But how do I handle massive changes to the graph schema structure? Is there some mechanism to handle XML schema upgrades? Some classes that would allow me to migrate old data to the new format?
Sure I could just use XmlReader/XmlWriter, go through every node and attribute and write several thousand lines of code to convert data to the new format, but maybe there is a better way?
I have found Object graph serialization in .NET and code version upgrades, but I don't think the linked articles apply when there are major changes in the model.
Instead of writing several thousand lines of code to convert files using XmlReader / XmlWriter, you could use XSLT. We are still talking hundreds of lines of code, and perhaps slower execution speeds, but if you are good at XSLT you could get it done much faster.
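As a rough sketch of the XSLT route, the following uses XslCompiledTransform to rewrite an old document into a new shape. The element names (Person, FullName, Contact, Name) and the stylesheet are hypothetical; a real migration stylesheet would be much larger.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

public static class XsltMigration
{
    // Hypothetical stylesheet: renames <Person>/<FullName> to <Contact>/<Name>.
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <xsl:template match='/Person'>
    <Contact><Name><xsl:value-of select='FullName'/></Name></Contact>
  </xsl:template>
</xsl:stylesheet>";

    public static string Upgrade(string oldXml)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader sheet = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(sheet);
        }

        var output = new StringBuilder();
        using (XmlReader input = XmlReader.Create(new StringReader(oldXml)))
        // OutputSettings carries the options from <xsl:output>.
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, writer);
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Upgrade("<Person><FullName>Ada Lovelace</FullName></Person>"));
    }
}
```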
The other approach would be to build a C# program that links both the old class and the new class (of course you'd need to rename the old class to avoid naming collision). The program would load OldMyClass from disk, construct NewMyClass from the values of its attributes, and serialize NewMyClass to disk. Essentially, this approach moves the task of conversion into the C# territory, which may be a lot more familiar to you.
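The old-class/new-class approach could look roughly like this. OldMyClass and NewMyClass follow the names used above, but their members and the name-splitting rule are made up for the example; both are mapped to the same root element name via XmlRoot.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical old shape of the document.
[XmlRoot("MyClass")]
public class OldMyClass
{
    public string FullName { get; set; }
}

// Hypothetical new shape of the same document.
[XmlRoot("MyClass")]
public class NewMyClass
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class ClassMigration
{
    // All conversion logic lives in ordinary C# code.
    public static NewMyClass ToNew(OldMyClass old)
    {
        // Split on the first space only; everything after it is the last name.
        string[] parts = (old.FullName ?? "").Split(new[] { ' ' }, 2);
        return new NewMyClass
        {
            FirstName = parts[0],
            LastName = parts.Length > 1 ? parts[1] : ""
        };
    }

    public static string Upgrade(string oldXml)
    {
        OldMyClass old;
        using (var reader = new StringReader(oldXml))
        {
            old = (OldMyClass)new XmlSerializer(typeof(OldMyClass)).Deserialize(reader);
        }
        using (var writer = new StringWriter())
        {
            new XmlSerializer(typeof(NewMyClass)).Serialize(writer, ToNew(old));
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Upgrade("<MyClass><FullName>Ada Lovelace</FullName></MyClass>"));
    }
}
```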
In this case, I keep my changes in my object and recreate my XML through the XmlSerializer: http://support.microsoft.com/kb/815813
This way I load and save the new XML schema based on my object.

Serialization for document storage

I am writing a desktop application that can open / edit / save documents.
Those documents are described by several objects of different types that store references to each other. Of course there is a Document class that serves as the root of this data structure.
The question is how to save this document model into a file.
What I need:
- Support for recursive structures.
- It must be able to open files even if they were produced from slightly different classes. My users don't want to recreate every document after every release just because I added a field somewhere.
- It must deal with classes that are not known at compile time (for plug-in support).
What I tried so far:
- XmlSerializer -> Fails the first and last criteria.
- BinaryFormatter -> Fails the second criterion.
- DataContractSerializer: Similar to XmlSerializer but with support for cyclic (recursive) references. Also it was designed with (forward/backward) compatibility in mind: Data Contract Versioning. [edit]
- NetDataContractSerializer: While the DataContractSerializer still requires all types to be known in advance (i.e. it can't work very well with inheritance), NetDataContractSerializer stores type information in the output. Other than that the two seem to be equivalent. [edit]
- protobuf-net: I haven't had time to experiment with it yet, but it seems similar in function to DataContractSerializer, while using a binary format. [edit]
Handling of unknown types [edit]
There seem to be two philosophies about what to do when the static and dynamic types differ (if you have a field of type object but, let's say, a Person object in it). Basically the dynamic type must somehow get stored in the file.
1) Use different XML tags for different dynamic types. But since the XML tag to be used for a particular class might not be equal to the class name, it's only possible to go this route if the deserializer knows all possible types in advance (so that it can scan them for attributes).
2) Store the CLR type (class name, assembly name & version) during serialization. Use this info during deserialization to instantiate the right class. The types need not be known prior to deserialization.
The second one is simpler to use, but the resulting file will be CLR-dependent (and less tolerant of code modifications). That's probably why XmlSerializer and DataContractSerializer choose the first way. NetDataContractSerializer is not recommended because it uses the second approach (so does BinaryFormatter, by the way).
Any ideas?
The one you haven't tried is DataContractSerializer. There is a constructor that takes a bool preserveObjectReferences parameter, which should handle the first criterion.
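A small sketch of that option, assuming a made-up Node type with a two-element cycle. It sets PreserveObjectReferences through DataContractSerializerSettings (available from .NET 4.5) rather than the long constructor overload; the effect should be the same.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical document node that can point back at an earlier node.
[DataContract]
public class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Node Next { get; set; }
}

public static class CycleDemo
{
    public static Node RoundTrip(Node root)
    {
        var settings = new DataContractSerializerSettings
        {
            // Emit ids/refs instead of recursing forever on cycles.
            PreserveObjectReferences = true
        };
        var serializer = new DataContractSerializer(typeof(Node), settings);
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, root);
            stream.Position = 0;
            return (Node)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var a = new Node { Name = "a" };
        var b = new Node { Name = "b", Next = a };
        a.Next = b; // a -> b -> a

        Node copy = RoundTrip(a);
        // The cycle should come back as a cycle, not as duplicated objects.
        Console.WriteLine(ReferenceEquals(copy, copy.Next.Next));
    }
}
```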
The WCF data contract serializer is probably closest to your needs, although not perfect.
There is only limited support for backwards compatibility (i.e. whether old versions of the program can read documents generated with a newer version). New fields are supported (via IExtensibleDataObject), but new classes or new enum values are not.
I would think the XmlSerializer is your best bet. You won't be able to support everything on your requirements list without a bit of work in your Document classes - but the XmlSerializer architecture gives you extensibility points which should allow you to tap into its mechanism deep enough to do just about anything.
Using the IXmlSerializable interface - by implementing that on your classes you want to store - you should be able to do just about anything, really.
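The interface exposes three methods. GetSchema should simply return null; the two that do the work are ReadXml and WriteXml:
public XmlSchema GetSchema()
{
    return null; // reserved; the documentation says to return null
}
public void WriteXml(XmlWriter writer)
{
    // write this object's state (attributes and child elements) to the writer
}
public void ReadXml(XmlReader reader)
{
    // read this object's state back from the reader and consume its whole element
}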
Using these two methods, you should be able to capture the necessary state information from just about any object you might want to store, and turn it into XML that can be persisted to disk - and deserialized back into an object when the time comes!
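For illustration, here is one way a complete implementation might look for a tiny type. The Point class and its x/y attribute names are invented; the ReadXml body follows the usual pattern of reading the attributes and then consuming the element, whether it is empty or not.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical type that stores its state as XML attributes.
public class Point : IXmlSerializable
{
    public int X { get; set; }
    public int Y { get; set; }

    public XmlSchema GetSchema() { return null; }

    public void WriteXml(XmlWriter writer)
    {
        // The serializer has already written the start tag for this object.
        writer.WriteAttributeString("x", X.ToString(CultureInfo.InvariantCulture));
        writer.WriteAttributeString("y", Y.ToString(CultureInfo.InvariantCulture));
    }

    public void ReadXml(XmlReader reader)
    {
        reader.MoveToContent();
        X = int.Parse(reader.GetAttribute("x"), CultureInfo.InvariantCulture);
        Y = int.Parse(reader.GetAttribute("y"), CultureInfo.InvariantCulture);

        // ReadXml must leave the reader positioned after this element.
        bool isEmpty = reader.IsEmptyElement;
        reader.ReadStartElement();
        if (!isEmpty) reader.ReadEndElement();
    }
}

public static class PointDemo
{
    public static Point RoundTrip(Point point)
    {
        var serializer = new XmlSerializer(typeof(Point));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, point);
            using (var reader = new StringReader(writer.ToString()))
            {
                return (Point)serializer.Deserialize(reader);
            }
        }
    }

    public static void Main()
    {
        Point copy = RoundTrip(new Point { X = 3, Y = 4 });
        Console.WriteLine(copy.X + "," + copy.Y);
    }
}
```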
XmlSerializer can work for your first criterion; however, you must provide the recursion yourself for objects like the TreeView control.
BinaryFormatter can work for all 3 criteria. If a class changes, you may have to create a conversion tool to convert old format documents to a new format. Or recognize an older format, deserialize to the old, and then save to the new - keeping your old class format around for a little while.
This will help cover version tolerance which is what I think you're after: MSDN - Version Tolerant Serialization

XML Serialization, in this case, IXmlSerializable or Attributes

I've done some XML serialization before but I used Attributes. I'm not sure this is doable for my next assignment; here's a brief list of XML manipulation requirements.
- General-purpose XML manipulation, tied to a treeview, no schema.
- Load/Save XML.
- Load/Save attributes as well as values (I believe the term is element text?), and be mindful of the node's name.
- Comments can be safely ignored, as can document info markup (i.e. the UTF-8 and schema tags).
Any suggestions on how best to handle this?
I probably wouldn't bother with an object model and IXmlSerializable - it sounds like you might just as well talk in terms of an XmlElement / XmlDocument - i.e. pass the data around as a block of xml. Since you have no schema it would be pointless to shred it out; you might as well do it via an xml DOM.
When you say treeview - is this winforms, asp.net, wpf? I believe the asp.net treeview can take an xml source, but for winforms you'd have to iterate the nodes yourself.
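Iterating the nodes yourself is a short recursive walk over the DOM. The sketch below builds an indented text outline instead of real TreeView nodes so that it has no UI dependency; the XmlOutline helper and its output format are assumptions for the example, and in WinForms each lines.Add call would become a TreeNode being added to its parent.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XmlOutline
{
    public static List<string> Describe(string xml)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);
        var lines = new List<string>();
        Walk(document.DocumentElement, 0, lines);
        return lines;
    }

    static void Walk(XmlNode node, int depth, List<string> lines)
    {
        string indent = new string(' ', depth * 2);
        switch (node.NodeType)
        {
            case XmlNodeType.Element:
                lines.Add(indent + node.Name);
                foreach (XmlAttribute attribute in node.Attributes)
                    lines.Add(indent + "  @" + attribute.Name + " = " + attribute.Value);
                foreach (XmlNode child in node.ChildNodes)
                    Walk(child, depth + 1, lines);
                break;
            case XmlNodeType.Text:
            case XmlNodeType.CDATA:
                lines.Add(indent + "\"" + node.Value + "\"");
                break;
            // Comments and processing instructions are skipped on purpose.
        }
    }

    public static void Main()
    {
        foreach (string line in Describe("<a id=\"1\"><b>hi</b><!-- ignored --></a>"))
            Console.WriteLine(line);
    }
}
```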
Don't know what exactly you mean with "before but i used Attributes" but I would recommend XmlSerializer too:
With "simple" classes it works usually out of the box.
Collections might need some more work, but it depends on your requirements and object structure.
There are other built-in XML serializers such as XAML or the WCF DataContractSerializer. All have pros and cons. But if you want to fine-tune your XML format, XmlSerializer is the most flexible one.
You can approach your format step by step: if the default looks good, you're done. If not, in most cases you just have to add some attributes.
If you want complete control, you can still implement IXmlSerializable to fine-tune your format.
Everything applies on a per class basis: Use the default where appropriate, add some attributes where required and implement IXmlSerializable as required.
I would suggest using the simple XML serialization supported by the .NET Framework.
Go through these MSDN documentation pages:
How to Serialize an object
How to Deserialize an object

Xml Serialization and Schemas in .net (C#)

The following questions are about XML serialization/deserialization and schema validation for a .net library of types which are to be used for data exchange.
First question, if I have a custom xml namespace say "http://mydomain/mynamespace" do I have to add a
[XmlRoot(Namespace = "http://mydomain/mynamespace")]
to every class in my library. Or is there a way to define this namespace as default for the whole assembly?
Second question, is there a reason behind the always added namespaces
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
even if there is no actual reference to any of the namespaces? I just feel they add noise to the resulting xml. Is there a way to remove them and only have the custom namespace in the resulting xml?
Third question, are there tools to support the generation of schema definitions (e.g. for all public [Serializable] classes of an assembly) and the validation of xml against specific schemas available?
If there are, would you recommend XML Schema from W3C or RELAX NG?
Just to add - the "xsi" etc is there to support things like xsi:nil on values later on - a well-known pattern for nullable values. It has to write the stream "forwards only", and it doesn't know (when it writes the first bit) whether it will need nil or not, so it assumes that writing it unnecessarily once is better than having to use the full namespace potentially lots of times.
1) XmlRoot can only be set at the class/struct/interface level (or on return values). So you can't use it on the assembly level. What you're looking for is the XmlnsDefinitionAttribute, but I believe that only is used by the XamlWriter.
2) If you're worried about clutter you should avoid xml. Well formed xml is full of clutter. I believe there are ways to interact with the xml produced by the serializer, but not directly with the XmlSerializer. You have much more control over the XML produced with the XmlWriter class. Check here for how you can use the XmlWriter to handle namespaces.
3) XSD.exe can be used to generate schemas for POCOs, I believe (I've always written them by hand; I may be using this soon to write up LOTS, though!).
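On the second question, one commonly used way to drop the xsi/xsd declarations is to pass your own XmlSerializerNamespaces to Serialize; when that collection is non-empty, the serializer does not add its defaults. A minimal sketch, with a made-up Order class and the namespace URI from the question:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical exchange type in the custom namespace from the question.
[XmlRoot("Order", Namespace = "http://mydomain/mynamespace")]
public class Order
{
    public int Id { get; set; }
}

public static class NamespaceDemo
{
    public static string Serialize(Order order)
    {
        // Declare only the custom namespace, as the default (empty) prefix.
        var namespaces = new XmlSerializerNamespaces();
        namespaces.Add("", "http://mydomain/mynamespace");

        var serializer = new XmlSerializer(typeof(Order));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, order, namespaces);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Serialize(new Order { Id = 1 }));
    }
}
```

The trade-off described in the answer above still applies: without the xsi prefix declared once at the root, any later xsi:nil has to declare the namespace where it is used.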
Tools:
- xsd.exe, with a command line like
xsd /c /n:myNamespace.Schema.v2_0 myschema_v2_0.xsd
I put the schema in a separate project.
- Liquid XML, which is useful if there are several schemas, or you want full support of the schema features (DateTimes with offsets, positive/negative decimals), and cross-platform generation.
