XML Serialization vs Reflection in C#

XML Serialization from MSDN:
Serializes and deserializes objects into and from XML documents. The XmlSerializer enables you to control how objects are encoded into XML.
Reflection from MSDN:
Reflection provides objects (of type Type) that encapsulate assemblies, modules and types. You can use reflection to dynamically create an instance of a type, bind the type to an existing object, or get the type from an existing object and invoke its methods or access its fields and properties. If you are using attributes in your code, Reflection enables you to access them.
As far as my understanding goes, I could create objects at run time using XML Serialization? In other words, let's say I have a database; I could define my "classes" or "objects" in a couple of tables. I could then get the XML for the object's data and create the object at run time.
I could also already have those objects compiled as libraries, readily available, and then use Reflection to access their functions.
From your understanding, which one of these two concepts would grant the most flexibility while sacrificing the least performance? Bonus points if you can provide a detailed explanation with considerations and perhaps a sample of code.

Serialization and Reflection are not mutually exclusive. You could definitely serialize and deserialize an object and then subsequently modify it using Reflection.
Serialization
Serialization is simply the concept of taking a 'snapshot' of an object's state so that you can potentially restore that snapshot at a later time.
If you wish to store objects in a persistent store, serialization is a good option as long as you don't need to query for particular values.
Note that there are at least two different types of serialization:
XML Serialization, that represents an object as XML. Since it is XML, this representation is (in theory at least) human-readable and interoperable.
Binary serialization, that simply stores and reads an object as an array of bytes. This representation is proprietary and not human-readable.
Reflection
Reflection is the ability to use object metadata to manipulate an object. You could, for example, decide that you want to assign the string "Foo" to all writable string properties of a given object, irrespective of the type of object.
This is mostly interesting when the type of object is not known at design time.
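The "Foo" example above could be sketched roughly like this. Person and StringFiller are hypothetical names made up for illustration; only the System.Reflection calls are real API.

```csharp
using System;
using System.Reflection;

// Hypothetical sample type; any object would work.
public class Person
{
    public string Name { get; set; } = "";
    public string City { get; set; } = "";
    public int Age { get; set; }
    public string Id { get; } = "fixed"; // read-only, so it must be skipped
}

public static class StringFiller
{
    // Assigns `value` to every public, writable, non-indexed string property
    // and returns how many properties were changed.
    public static int Fill(object target, string value)
    {
        int count = 0;
        PropertyInfo[] properties =
            target.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
        foreach (PropertyInfo property in properties)
        {
            if (property.PropertyType != typeof(string)) continue;
            if (!property.CanWrite || property.GetSetMethod() == null) continue; // no public setter
            if (property.GetIndexParameters().Length > 0) continue;             // skip indexers
            property.SetValue(target, value);
            count++;
        }
        return count;
    }
}

public static class Program
{
    public static void Main()
    {
        var person = new Person { Age = 30 };
        int changed = StringFiller.Fill(person, "Foo");
        Console.WriteLine($"{changed} set: Name={person.Name}, City={person.City}, Id={person.Id}");
    }
}
```

Fill never mentions Person, so the same call works for a type that is only discovered at run time.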

To deserialize an object you need to know the type you want to deserialize it into. So you can't just create your objects from XML without having their types defined in an assembly.
What you can do is to store xml in your database that represents serialized objects. You could then load the xml from the database and deserialize it into object instances as needed. The source of the xml need not be a serialized object, so you can create it manually if you need.
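A minimal sketch of loading such XML, assuming a hypothetical Customer class and a hand-written XML string standing in for a value read from the database:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type; it has to exist in a compiled assembly
// before XmlSerializer can deserialize into it.
public class Customer
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public static class XmlLoader
{
    // Deserializes an XML string into an instance of the known type T.
    public static T FromXml<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Written by hand; it never came from a serialized object.
        string xml = "<Customer><Name>Ada</Name><Age>36</Age></Customer>";
        Customer customer = XmlLoader.FromXml<Customer>(xml);
        Console.WriteLine($"{customer.Name}, {customer.Age}");
    }
}
```

The type argument is the part that cannot be avoided: the XML only supplies the data, not the class.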
Using reflection would be a different situation. With reflection you could take an object and get a list of the methods and properties available on it. You could then write code that works dynamically with your objects regardless of which type they are. The problem with this approach is that the code is clumsy to write and read, and it is very easy to introduce errors that only show up at run time. On top of that, the code will run more slowly because of the overhead that reflection introduces.
Instead of using reflection I would have my objects implement some well known interfaces that I could cast them to. That would allow my code to be type-safe and I could avoid the hassle of reflection. The code would also run much faster and be more readable.
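To make the comparison concrete, here is a small sketch with both styles side by side. INamed, Product and Renamer are hypothetical names used only for illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical "well known" interface shared by the objects.
public interface INamed
{
    string Name { get; set; }
}

public class Product : INamed
{
    public string Name { get; set; } = "";
}

public static class Renamer
{
    // Reflection: accepts any object, and a missing or misspelled
    // property name is only detected when this line runs.
    public static void RenameViaReflection(object target, string name)
    {
        PropertyInfo property = target.GetType().GetProperty("Name");
        if (property == null || !property.CanWrite)
            throw new InvalidOperationException("No writable Name property.");
        property.SetValue(target, name);
    }

    // Interface: the compiler rejects arguments that do not implement INamed.
    public static void RenameViaInterface(INamed target, string name)
    {
        target.Name = name;
    }
}

public static class Program
{
    public static void Main()
    {
        var product = new Product();
        Renamer.RenameViaReflection(product, "first");
        Renamer.RenameViaInterface(product, "second");
        Console.WriteLine(product.Name);
    }
}
```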

You cannot create new types on the fly using either XML serialization or reflection. These techniques only apply to existing types. If you need to create new types at runtime you will have to use another approach. However, generating types on the fly is of limited usefulness since you can only use reflection to access these types. Using the dynamic runtime in the next major release of .NET will give you more options for creating and using dynamic types.
XML serialization is for serializing objects to and from a well known format (XML). Reflection is much more general and enables you to inspect type information at runtime and manipulate objects without knowing their type at compile time. You can also do serialization using reflection, but it is much more cumbersome compared to XML serialization.

Related

JSON C# Classes

I am building a C# integration tool, but I am having some trouble figuring out if I should create different classes for the data that I am receiving from different requests from the source application using REST. The responses are similar in a way that the constructs are the same, but for different information. I.e they would have an "Attributes" tag, but the attributes may vary per class. In the same breath, about 60% or more of the attributes are the same.
It looks like they reused the same constructs, but depending on the data, there may be more things in the result.
My question is, what is the best practice when creating the classes for the JSON Deserialisation? Do you create multiple classes with the same name and same content (in different namespaces), or do you combine the classes into a "Generic" data type and just include the "extra" attributes, even though they won't all be used by one object?
The assumption is that the "null" values will not be considered in the deserialisation. Thus "extra" fields defined will just be ignored if not found.
The problem comes in the Classes where I would like to be able to define DataType1 and DataType2, but when combining the classes this becomes a problem...
Would like to hear your thoughts :)
Rgs,
Francois

Personally I prefer to deserialize into generic classes (lists and dictionaries, or whatever your deserialization library offers) and then manually copy the data to whatever further data structures I use internally. Most of the time the "deserialization classes" really are used just for deserialization, and after that the data is immediately copied to further data structures that don't match the deserialization structures. So there's very little value to them.
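A rough sketch of that approach, assuming System.Text.Json as the library (the question doesn't name one). The Item class and the "name" field are made up for illustration; only the "Attributes" construct comes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical internal model; it does not mirror the JSON layout.
public class Item
{
    public string Name { get; set; } = "";
    public Dictionary<string, string> Attributes { get; } = new Dictionary<string, string>();
}

public static class ItemReader
{
    // Parses into the library's generic document model, then copies
    // by hand whatever the internal model needs.
    public static Item Parse(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            JsonElement root = document.RootElement;
            var item = new Item { Name = root.GetProperty("name").GetString() ?? "" };

            // Attributes vary per response, so they are copied generically.
            if (root.TryGetProperty("Attributes", out JsonElement attributes))
            {
                foreach (JsonProperty attribute in attributes.EnumerateObject())
                {
                    item.Attributes[attribute.Name] = attribute.Value.ToString();
                }
            }
            return item;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string json = "{\"name\":\"Widget\",\"Attributes\":{\"color\":\"red\",\"weight\":12}}";
        Item item = ItemReader.Parse(json);
        foreach (KeyValuePair<string, string> pair in item.Attributes)
            Console.WriteLine($"{item.Name}: {pair.Key} = {pair.Value}");
    }
}
```

Responses that share about 60% of their attributes can all go through the same Parse method, since nothing in it depends on which attributes are present.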

Non-intrusive XML Serialization techniques?

I have long held the belief that your domain model should not be responsible for serializing itself to XML. I have used the IXmlSerializable interface in the past to control how my objects are serialized, but ideally I'd prefer the nuts and bolts of the serialization to live outside the object.
However, I've never been able to actually implement this in a clean manner, and I was wondering if there were any patterns I was overlooking to make this happen. Basically I want my object model to do its thing and be oblivious to XML serialization (or any other serialization for that matter), and then be handed off to some service that spiders the object and serializes it.
I've tried doing this with extension methods, but this falls short when you want to serialize a collection of type object. I've looked at doing it with object wrappers and DTOs that then serialize, but then you've got the overhead of maintaining another set of objects and having to create these objects when you want to serialize, which again can get messy when you have collections of type object.
The only other thing is using reflection but I'd worry about the processing overheads.
Is there a sane way to do what I'm asking or should I just bite the bullet and make my objects xml aware?

Using the System.Xml.Serialization Attributes is putting the nuts and bolts outside of your code. You are defining metadata and with the exception of optional parameters, no extra code is required. Implementing IXmlSerializable and doing the serialization by hand is error prone and should be avoided. Why? You are defining your data 3 times.
XML Schema
Class
Serialization code
Using attributes, you can scrub step 3.
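As an illustration of the attribute route, here is a sketch with a hypothetical Book class. The attributes are the real System.Xml.Serialization ones; the element and attribute names are arbitrary.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class: the attributes are pure metadata, and the class
// contains no serialization code of its own.
[XmlRoot("book")]
public class Book
{
    [XmlAttribute("isbn")]
    public string Isbn { get; set; } = "";

    [XmlElement("title")]
    public string Title { get; set; } = "";

    [XmlIgnore]
    public int TimesOpened { get; set; } // runtime-only state, never written
}

public static class BookXml
{
    public static string Serialize(Book book)
    {
        var serializer = new XmlSerializer(typeof(Book));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, book);
            return writer.ToString();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var book = new Book { Isbn = "123", Title = "Dune", TimesOpened = 5 };
        Console.WriteLine(BookXml.Serialize(book));
    }
}
```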
XML and C# have an impedance mismatch. Like it or not, at some point you will need to define the serialization to create the right document model.
Arguably, the classes you are serializing should not be performing any work. They are just a data store. Try abstracting your logic away from serialized objects - it may give you a warmer feeling.
Update
If you really, really hate attributes, try using the adapter pattern to serialize your model. The XML code will be in a separate class or assembly and you can work with your model across storage mediums. You will suffer the consequence of having to update the serialization separately when you update your model.
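One way the adapter idea might look, sketched with LINQ to XML. Invoice and InvoiceXmlAdapter are hypothetical names, and LINQ to XML is just one possible choice for the XML side.

```csharp
using System;
using System.Xml.Linq;

// Domain class: no serialization attributes and no XML interfaces.
public class Invoice
{
    public string Number { get; set; } = "";
    public decimal Total { get; set; }
}

// Adapter that owns all XML knowledge; it could live in a separate assembly.
public static class InvoiceXmlAdapter
{
    public static XElement ToXml(Invoice invoice)
    {
        return new XElement("invoice",
            new XAttribute("number", invoice.Number),
            new XElement("total", invoice.Total));
    }

    public static Invoice FromXml(XElement element)
    {
        return new Invoice
        {
            Number = (string)element.Attribute("number") ?? "",
            Total = (decimal)element.Element("total")
        };
    }
}

public static class Program
{
    public static void Main()
    {
        var invoice = new Invoice { Number = "INV-1", Total = 12.50m };
        XElement xml = InvoiceXmlAdapter.ToXml(invoice);
        Console.WriteLine(xml);
        Console.WriteLine(InvoiceXmlAdapter.FromXml(xml).Total);
    }
}
```

The cost mentioned above is visible here: adding a property to Invoice means touching both ToXml and FromXml as well.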

How to write an object of unknown type to a binary file?

Well, this is basically a generic binary writer... let's say you have an object, and you don't know what it is, but you have it. How do you write its binary data to a binary file so that you can retrieve it later?
My original idea that I don't know how to do was:
Figure out all the members of the object somehow (reflection maybe)
Unless the members are of types writable by the BinaryWriter, repeat step 1 on the member
Make a header that states the types of the members and how they are assembled into the object (somehow)
Write the header thing
Write all the core level members
I don't know how to use Reflection much so I'm not sure how to do most of the above.
It should be quite doable however.
How should I do this, if it's possible? Or how should I implement the above?

Simplest approach is to use BinaryFormatter. However you should be very careful with any changes to your classes if you want to load instances saved by previous versions of your application.

The hard aspect is not writing out objects, but reading them back. The .NET framework provides various techniques for serialization and deserialization of class types which are supposed to automate the process, but all of the built-in techniques I'm familiar with have various limitations.
A major problem is that .NET makes no distinction between a storage location which holds a reference to an object for the purpose of identifying an object which is used by other code, for the purpose of only identifying immutable aspects of the object's state other than identity, or for the purpose of encapsulating the object's mutable state. Without knowing what a field is supposed to represent, it's not possible to know how it should be serialized or deserialized. For example, suppose that a particular type has a field of type int[], which holds a reference to a single-element array which holds the value 23. It may be that the purpose of that field is to hold the value 23, or it may be that the purpose of that field is to identify an array whose first element should be incremented every time something happens. In the former scenario, serialization should write out the fact that it's a single element array containing the value 23. In the latter scenario, if serialization is going to be possible at all, it will require knowing what is significant about the array to which the field holds a reference.
While various people have written various methods to automatically serialize various classes, I tend to be skeptical of such things. If one doesn't know what the fields of a class are used for, one should be cautious making any assumptions about what state is encapsulated thereby.
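The int[] scenario can be demonstrated with a short sketch. Counters is a hypothetical class, and XmlSerializer is used only as an example of an automatic serializer: two fields that share one array come back as two independent arrays.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type with two fields that refer to the same array.
public class Counters
{
    public int[] Primary;
    public int[] Alias; // meant to identify the same array as Primary
}

public static class RoundTrip
{
    // Serializes to XML in memory and deserializes the result again.
    public static Counters Clone(Counters source)
    {
        var serializer = new XmlSerializer(typeof(Counters));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, source);
            using (var reader = new StringReader(writer.ToString()))
            {
                return (Counters)serializer.Deserialize(reader);
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var shared = new[] { 23 };
        var original = new Counters { Primary = shared, Alias = shared };
        Counters copy = RoundTrip.Clone(original);

        // Incrementing through one field no longer affects the other in the copy.
        copy.Primary[0]++;
        Console.WriteLine(ReferenceEquals(original.Primary, original.Alias));
        Console.WriteLine(ReferenceEquals(copy.Primary, copy.Alias));
        Console.WriteLine($"{copy.Primary[0]} vs {copy.Alias[0]}");
    }
}
```

The value 23 survives, but the "same array" relationship does not, which is exactly the information the serializer had no way of knowing was significant.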

It might be possible with BinaryFormatter. But think of an object structure where you have many of your unknown objects which all reference a common object. If you serialize all of your unknown objects you end up with as many copies of the common object as there are unknown objects.
And there might be many fields of the unknown object that are not relevant because they are set by the constructor or by other classes; they could be in an inconsistent state when deserialized.
So it might not be so hard to serialize them, but how do you want to deserialize them?
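For reference, steps 1, 2 and 5 from the question could be sketched roughly as below. This is a simplified illustration, not a general solution: it handles only public fields of type int, double, bool and string plus nested objects, it leaves out the header (so the reader must be told the type), and it ignores nulls, collections and the shared-reference problems raised in the answers. Point, Shape and ReflectiveBinary are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;

public class Point { public int X; public int Y; }
public class Shape { public string Name = ""; public double Scale; public Point Origin = new Point(); }

public static class ReflectiveBinary
{
    private const BindingFlags Flags = BindingFlags.Public | BindingFlags.Instance;

    // GetFields has no guaranteed order, so sort by name to keep
    // the writer and the reader in step.
    private static IEnumerable<FieldInfo> FieldsOf(Type type)
    {
        return type.GetFields(Flags).OrderBy(f => f.Name, StringComparer.Ordinal);
    }

    public static void Write(BinaryWriter writer, object value)
    {
        foreach (FieldInfo field in FieldsOf(value.GetType()))
        {
            object member = field.GetValue(value);
            if (member is int i) writer.Write(i);
            else if (member is double d) writer.Write(d);
            else if (member is bool b) writer.Write(b);
            else if (member is string s) writer.Write(s);
            else if (member != null) Write(writer, member); // step 2: recurse into the member
            else throw new NotSupportedException("null members are not handled in this sketch");
        }
    }

    public static object Read(BinaryReader reader, Type type)
    {
        object result = Activator.CreateInstance(type); // needs a parameterless constructor
        foreach (FieldInfo field in FieldsOf(type))
        {
            Type t = field.FieldType;
            if (t == typeof(int)) field.SetValue(result, reader.ReadInt32());
            else if (t == typeof(double)) field.SetValue(result, reader.ReadDouble());
            else if (t == typeof(bool)) field.SetValue(result, reader.ReadBoolean());
            else if (t == typeof(string)) field.SetValue(result, reader.ReadString());
            else field.SetValue(result, Read(reader, t)); // assumes runtime type == declared type
        }
        return result;
    }

    public static byte[] ToBytes(object value)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream)) Write(writer, value);
            return stream.ToArray(); // still valid after the stream is closed
        }
    }

    public static object FromBytes(byte[] data, Type type)
    {
        using (var reader = new BinaryReader(new MemoryStream(data))) return Read(reader, type);
    }
}

public static class Program
{
    public static void Main()
    {
        var shape = new Shape { Name = "box", Scale = 1.5, Origin = new Point { X = 3, Y = 4 } };
        var copy = (Shape)ReflectiveBinary.FromBytes(ReflectiveBinary.ToBytes(shape), typeof(Shape));
        Console.WriteLine($"{copy.Name} {copy.Scale} ({copy.Origin.X}, {copy.Origin.Y})");
    }
}
```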

Convert from binary to XML serialization without needing the type

I have a project with a few types that I binary-serialize (BinaryFormatter) to files. I'd now like to create a second project which allows admins to temporarily decode those files into a more readable XML format (e.g., using XmlSerializer).
I could deserialize them into an object of the original type, then reserialize them, but is it at all possible to
skip the deserialization (at least in my own code), and
better yet, not have to reference the type at all in my decoder tool?

If you are referring to .NET's native binary serialization (BinaryFormatter), the problem is that it saves your object (along with all the necessary metadata for deserialization) using an undocumented format (AFAIK).
If you really want to try doing it without deserialization, you can check this article, which appears to have analyzed its format (but the author himself states that it might be incomplete). But my opinion is that it's far too much trouble.

There's no way to deserialize to your type without specifying that type, at least with standard XML serialization. However, as long as the object was serialized using XML, you could use one of the many XML reader classes to traverse the object without needing to deserialize it. Alternatively, if you happen to be serializing to JSON, there are some libraries out there that will deserialize to anonymous types that you can use.
If the type of conversion you are looking for is purely cosmetic (e.g. to make it more human readable), you could write some RegEx to replace the element tags.
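Traversing XML without the original type might look like the sketch below, using LINQ to XML as one of the available reader APIs. XmlWalker and the sample document are made up for illustration. This only helps once the data is already XML; it does not decode BinaryFormatter output.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class XmlWalker
{
    // Yields "path = value" for every leaf element; no target class is needed.
    public static IEnumerable<string> Leaves(XElement element, string prefix = "")
    {
        string path = prefix + "/" + element.Name.LocalName;
        if (!element.HasElements)
        {
            yield return path + " = " + element.Value;
            yield break;
        }
        foreach (XElement child in element.Elements())
        {
            foreach (string leaf in Leaves(child, path))
            {
                yield return leaf;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string xml = "<Order><Id>7</Id><Customer><Name>Ada</Name></Customer></Order>";
        foreach (string line in XmlWalker.Leaves(XElement.Parse(xml)))
        {
            Console.WriteLine(line);
        }
    }
}
```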

Serialization for document storage

I write a desktop application that can open / edit / save documents.
Those documents are described by several objects of different types that store references to each other. Of course there is a Document class that serves as the root of this data structure.
The question is how to save this document model into a file.
What I need:
Support for recursive structures.
It must be able to open files even if they were produced from slightly different classes. My users don't want to recreate every document after every release just because I added a field somewhere.
It must deal with classes that are not known at compile time (for plug-in support).
What I tried so far:
XmlSerializer -> Fails the first and last criteria.
BinaryFormatter -> Fails the second criterion.
DataContractSerializer: Similar to XmlSerializer but with support for cyclic (recursive) references. Also it was designed with (forward/backward) compatibility in mind: Data Contract Versioning. [edit]
NetDataContractSerializer: While the DataContractSerializer still requires to know all types in advance (i.e. it can't work very well with inheritance), NetDataContractSerializer stores type information in the output. Other than that the two seem to be equivalent. [edit]
protobuf-net: Didn't have time to experiment with it yet, but it seems similar in function to DataContractSerializer, but using a binary format. [edit]
Handling of unknown types [edit]
There seem to be two philosophies about what to do when the static and dynamic type differ (if you have a field of type object but, let's say, a Person object in it). Basically the dynamic type must somehow get stored in the file.
Use different XML tags for different dynamic types. But since the XML tag to be used for a particular class might not be equal to the class name, it's only possible to go this route if the deserializer knows all possible types in advance (so that it can scan them for attributes).
Store the CLR type (class name, assembly name & version) during serialization. Use this info during deserialization to instantiate the right class. The types need not be known prior to deserialization.
The second one is simpler to use, but the resulting file will be CLR dependent (and less tolerant of code modifications). That's probably why XmlSerializer and DataContractSerializer choose the first way. NetDataContractSerializer is not recommended because it uses the second approach (so does BinaryFormatter, by the way).
Any ideas?

The one you haven't tried is DataContractSerializer. There is a constructor that takes a parameter bool preserveObjectReferences that should handle the first criterion.
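A minimal sketch of a cyclic graph surviving a round trip. It sets the same option through DataContractSerializerSettings.PreserveObjectReferences (available from .NET 4.5) rather than the long constructor overload. Node and GraphSerializer are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical recursive structure: children point back at their parent.
[DataContract]
public class Node
{
    [DataMember] public string Name { get; set; } = "";
    [DataMember] public Node Parent { get; set; }
    [DataMember] public List<Node> Children { get; set; } = new List<Node>();
}

public static class GraphSerializer
{
    private static DataContractSerializer Create()
    {
        // Without this setting the parent/child cycle cannot be serialized.
        var settings = new DataContractSerializerSettings { PreserveObjectReferences = true };
        return new DataContractSerializer(typeof(Node), settings);
    }

    public static byte[] Save(Node root)
    {
        using (var stream = new MemoryStream())
        {
            Create().WriteObject(stream, root);
            return stream.ToArray();
        }
    }

    public static Node Load(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        {
            return (Node)Create().ReadObject(stream);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var root = new Node { Name = "root" };
        root.Children.Add(new Node { Name = "child", Parent = root });

        Node loaded = GraphSerializer.Load(GraphSerializer.Save(root));
        Console.WriteLine(ReferenceEquals(loaded, loaded.Children[0].Parent));
    }
}
```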

The WCF data contract serializer is probably closest to your needs, although not perfect.
There is only limited support for backwards compatibility (i.e. whether old versions of the program can read documents generated with a newer version). New fields are supported (via IExtensibleDataObject), but new classes or new enum values are not.

I would think the XmlSerializer is your best bet. You won't be able to support everything on your requirements list without a bit of work in your Document classes - but the XmlSerializer architecture gives you extensibility points which should allow you to tap into its mechanism deep enough to do just about anything.
Using the IXmlSerializable interface - by implementing that on your classes you want to store - you should be able to do just about anything, really.
The interface basically exposes two methods, ReadXml and WriteXml (plus GetSchema, which should simply return null):
public void WriteXml(XmlWriter writer)
{
    // do what you need to do to write out your XML for this object
}

public void ReadXml(XmlReader reader)
{
    // do what you need to do to read your object from XML
}
Using these two methods, you should be able to capture the necessary state information from just about any object you might want to store, and turn it into XML that can be persisted to disk - and deserialized back into an object when the time comes!
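A filled-in version of those two methods might look like this. Note is a hypothetical class, and the attribute and element names are arbitrary.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical document class that controls its own XML layout.
public class Note : IXmlSerializable
{
    public string Author { get; set; } = "";
    public string Text { get; set; } = "";

    public XmlSchema GetSchema() => null; // the documented convention is to return null

    public void WriteXml(XmlWriter writer)
    {
        // The serializer has already written the wrapping <Note> start tag.
        writer.WriteAttributeString("author", Author);
        writer.WriteElementString("text", Text);
    }

    public void ReadXml(XmlReader reader)
    {
        // The reader is positioned on the wrapping start tag.
        Author = reader.GetAttribute("author") ?? "";
        bool isEmpty = reader.IsEmptyElement;
        reader.ReadStartElement();      // consume <Note ...>
        if (isEmpty) return;
        reader.MoveToContent();         // skip any whitespace before <text>
        Text = reader.ReadElementContentAsString("text", "");
        reader.ReadEndElement();        // consume </Note>
    }
}

public static class NoteXml
{
    public static string Save(Note note)
    {
        var serializer = new XmlSerializer(typeof(Note));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, note);
            return writer.ToString();
        }
    }

    public static Note Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(Note));
        using (var reader = new StringReader(xml))
        {
            return (Note)serializer.Deserialize(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string xml = NoteXml.Save(new Note { Author = "Ada", Text = "hello" });
        Console.WriteLine(xml);
        Console.WriteLine(NoteXml.Load(xml).Text);
    }
}
```

ReadXml has to consume exactly the element it was given, including the end tag; getting that wrong tends to corrupt whatever is read next.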

XmlSerializer can work for your first criterion; however, you must provide the recursion yourself for objects like the TreeView control.
BinaryFormatter can work for all 3 criteria. If a class changes, you may have to create a conversion tool to convert old format documents to a new format. Or recognize an older format, deserialize to the old, and then save to the new - keeping your old class format around for a little while.
This will help cover version tolerance which is what I think you're after: MSDN - Version Tolerant Serialization
