Serializing a complex object with BinaryFormatter - C#

I am trying to serialize a complex object which contains two lists of complex objects, using the code below:

public static byte[] SerializeObject(object obj)
{
    BinaryFormatter formatter = new BinaryFormatter();
    using (MemoryStream stream = new MemoryStream())
    {
        formatter.Serialize(stream, obj);
        return stream.ToArray();
    }
}
When I deserialize, though, I get NHibernate exceptions saying that my list objects failed to initialize, so I suspect they weren't serialized correctly in the first place. The error I receive is a failure to lazily initialize a collection of some object: "no session or session was closed".
But if the lists had been properly serialized, there would be no need to lazily initialize them; the data would already be there, right?

What may be happening here is that you are serializing NHibernate proxies for the collections. Depending on your mapping, for performance reasons, NHibernate will not load a collection until the point you explicitly access its elements.
It can also do this for associations of various kinds (this is called 'lazy loading'). The way it works is that NHibernate instantiates and uses a proxy object that implements the correct interface (or derives from your classes, in the case of other associations).
You may already know all that, but I'm explaining it for context in case you don't.
If you need to know more about lazy loading, check out this article: http://nhibernate.info/doc/howto/various/lazy-loading-eager-loading.html
In this case, NHibernate may be using a proxy for your lists, and because the BinaryFormatter accesses them in a non-conventional way (it reads fields directly via reflection rather than going through property getters, so the lazy load is never triggered), the uninitialized proxy is what you end up serializing.
If this is the case, there are many ways in which you could go ahead and fix it and they depend on how you are structuring your project.
A quick way to confirm whether this is the issue is to initialize the lazy properties before serializing your object (note that you need to do this for each one, or recursively, as the Initialize method only loads data for the proxy you give it):

NHibernateUtil.Initialize(yourObject);
NHibernateUtil.Initialize(yourObject.List1);
NHibernateUtil.Initialize(yourObject.OtherList);
// ...etc...
SerializeObject(yourObject);
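If the graph is deep, calling Initialize by hand for every property gets tedious. Below is a minimal reflection-based sketch of the recursive approach; ProxyWalker and the "YourApp.Domain" namespace filter are made-up names you would adapt to your own model, and it must run while the session is still open:

using System.Collections;
using System.Collections.Generic;
using NHibernate;

public static class ProxyWalker
{
    // Initializes the entity itself, then walks its public properties and
    // initializes any collections or child entities it finds.
    public static void InitializeGraph(object entity, HashSet<object> visited)
    {
        if (entity == null || !visited.Add(entity))
            return; // null, or already visited (guards against cycles)

        NHibernateUtil.Initialize(entity);

        foreach (var property in entity.GetType().GetProperties())
        {
            if (property.GetIndexParameters().Length > 0)
                continue; // skip indexers

        object value = property.GetValue(entity, null);
            if (value == null || value is string)
                continue;

            if (value is IEnumerable)
            {
                NHibernateUtil.Initialize(value);
                foreach (object element in (IEnumerable)value)
                    InitializeGraph(element, visited);
            }
            else if (value.GetType().Namespace == "YourApp.Domain") // placeholder filter for your entity types
            {
                InitializeGraph(value, visited);
            }
        }
    }
}

// Usage, before serializing:
// ProxyWalker.InitializeGraph(yourObject, new HashSet<object>());
// SerializeObject(yourObject);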

Related

Using a subset of an object in different services

Let's say that I have a system that relies heavily on a single object, for example a Person.
I have a storage service (service 1) that handles basic CRUD operations for Person and saves this person in Mongo.
But this Person object is very big and nested; it has a lot of properties, some of which are irrelevant to some of the other services.
For example, we have service 2, which gets a Person object from the storage service and renders it in the UI. It only cares about some of the properties and doesn't need the whole big, nested Person object.
And we have service 3, which gets a Person object from the storage service and needs a different subset of properties than service 2.
We are using .NET, so everything is strongly typed.
The straightforward solution is to define a subset of the Person class in each of the services and have some converter that converts the Person object to the object that service needs (removing irrelevant properties). But some services need the exact Person object minus 5-10 properties, and as I said, Person is a huge nested object.
What is the best practice for this scenario? We don't want to redefine a new "mini Person" for every service with its relevant properties, because that feels like huge code duplication, plus it creates heavy dependencies between every service and the storage service.
But we are using .NET, so we have to have some strongly typed object; otherwise we won't be able to make any manipulations on the object we received from the storage service, considering we don't want to use it as plain JSON and just traverse the keys.
We thought of two solutions:
The first is to use the same Person object in all services. Each service gets the Person object, does any manipulation it needs, and then serializes it with a custom serializer that removes some keys from the JSON; this way, whoever gets the response receives only the relevant properties.
The second is to add some kind of annotation to the properties that says "if the request came from service 2 then apply JSON ignore" and simply not serialize that property in the return value from the storage service. But this makes the storage service not isolated and simple, and this way service 2 again can't deserialize and manipulate the object, because we don't have a strongly typed "mini Person", so we have to work with the raw JSON.
Is there a better-known solution for this situation?
And again, this is under the assumption that the Person object is huge, would require a lot of work to redefine again and again, and would create heavy dependencies.
Thanks.
If we're talking best practices, maybe this Person object shouldn't have gotten so big to begin with. You can always break nested arrays and objects into their own separate files, entities, or Mongo collections.
But as it stands, maybe you could use dynamic or IDictionary<string, object> instead of creating a Person type and mapping every single strongly typed field in that class.
using System.Dynamic;
var person_from_database = person_repository.GetById(1);
dynamic person_result = new ExpandoObject();
person_result.name = person_from_database.fullname; //declare whatever properties your services need
person_result.first_address = person_from_database.addresses[0]; //even nested arrays and objects
person_result.name = person_result.name.ToUpper(); //modify values if needed
return Ok(person_result); //return whatever, no mapping

How to serialize an object that has a constantly updated collection

I have a cache service that holds multiple Price objects which are updated as new price deltas arrive, sometimes multiple times a second.
Each object holds its various prices in a collection keyed by an ID. If someone subscribes to a particular price, I need to serialize the latest Price object into JSON each time a new price arrives, in order to send it over RMQ. The problem I am having is that in some cases I receive the following error message while serializing, because a new price has arrived and updated the collection on the object during the serialization of the previous one.
"Collection was modified; enumeration operation may not execute."
I've tried various ways of serializing the object (it needs to be as fast as possible) but I still get the same issue.
What would be the best and most efficient way of solving this, so that I can serialize even if the object changes?
The simplified objects are:

// This is the collection on an object that holds the prices being updated
public ConcurrentDictionary<Id, Prices> Asset { get; set; }

// Class that holds the ever-updating prices
[Serializable]
public class Prices
{
    public List<PriceVolume> Prices1 { get; set; }
    public List<PriceVolume> Prices2 { get; set; }

    public Prices()
    {
        Prices1 = new List<PriceVolume>();
        Prices2 = new List<PriceVolume>();
    }
}

Thanks in advance!
Instead of serializing the actual object you pull from the ConcurrentDictionary, you should create a deep copy of it and serialize the copy. Unfortunately, you will still have to put the code that takes the copy inside a mutex. The ConcurrentDictionary only protects you from having an item changed or deleted while you are retrieving it; it does not protect the object from manipulation after you have retrieved your reference to it.
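As a rough sketch of that idea (SnapshotAndSerialize is a made-up name, and Json.NET stands in for whatever serializer you use), the copy is taken under a lock and the serialization happens outside it; note that the updating threads must take the same lock when they mutate the lists:

using System.Collections.Generic;
using Newtonsoft.Json;

public class PriceSerializer
{
    private readonly object _copyLock = new object();

    public string SnapshotAndSerialize(Prices prices)
    {
        Prices snapshot;
        lock (_copyLock) // writers must lock on _copyLock while updating the lists
        {
            // Copying the lists is enough as long as PriceVolume
            // elements are not themselves mutated in place.
            snapshot = new Prices
            {
                Prices1 = new List<PriceVolume>(prices.Prices1),
                Prices2 = new List<PriceVolume>(prices.Prices2)
            };
        }
        // The snapshot is private to us, so serializing it cannot race with updates.
        return JsonConvert.SerializeObject(snapshot);
    }
}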
You could probably benefit from locking the element while you are serializing it.
Locking prevents other threads from modifying the element while you are inside the lock.
This will make the operations that try to change the prices wait until you are done serializing before they modify them.
[Serializable]
public class Prices
{
    public string Serialize()
    {
        lock (this) // any code that updates the price lists must take the same lock; a private lock object is generally preferred over lock(this)
        {
            // serialization logic here, e.g. with Json.NET:
            return JsonConvert.SerializeObject(this);
        }
    }
}
Just a thought, but how about looking into serialization callbacks (some refer to these as serialization hooks) and implementing the ISerializable interface? You seem to need more fine-grained control over the serialization of your object. Have a look at this link:
Custom Serialization
Look at the following attributes:
OnDeserializingAttribute (Before deserialization)
OnDeserializedAttribute (After deserialization)
OnSerializingAttribute (Before serialization)
OnSerializedAttribute (After serialization)
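As a brief sketch of what these callbacks look like in practice (the _serializedAtUtc and _rebuiltCache fields are made-up examples), each method takes a StreamingContext and carries the matching attribute:

using System;
using System.Runtime.Serialization;

[Serializable]
public class Prices
{
    [NonSerialized]
    private object _rebuiltCache; // example of state recreated after deserialization

    private DateTime _serializedAtUtc; // example of state captured before serialization

    [OnSerializing]
    private void OnSerializing(StreamingContext context)
    {
        // Runs just before the formatter writes the fields.
        _serializedAtUtc = DateTime.UtcNow;
    }

    [OnDeserialized]
    private void OnDeserialized(StreamingContext context)
    {
        // Runs just after the fields have been read back; rebuild
        // anything that was marked [NonSerialized].
        _rebuiltCache = new object();
    }
}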
Also have a look at this link:
Version Tolerant Serialization
Consider whether the object warrants the use of the OptionalFieldAttribute or NonSerializedAttribute on certain fields, to control whether they are serialized or optionally serialized. Just be wary of the NonSerializedAttribute. Have a look at the best practices mentioned in the article (reproduced here for reference):
To ensure proper versioning behavior, follow these rules when modifying a type from version to version:
Never remove a serialized field
Never apply the NonSerializedAttribute attribute to a field if the attribute was not applied to the field in the previous version
Never change the name or the type of a serialized field
When adding a new serialized field, apply the OptionalFieldAttribute attribute
When removing a NonSerializedAttribute attribute from a field (that was not serializable in a previous version), apply the OptionalFieldAttribute attribute
For all optional fields, set meaningful defaults using the serialization callbacks unless 0 or null as defaults are acceptable
To ensure that a type will be compatible with future serialization engines, follow these guidelines:
Always set the VersionAdded property on the OptionalFieldAttribute attribute correctly
Avoid branched versioning
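To make the OptionalFieldAttribute rules above concrete, a version 2 of a type might look like this sketch (Quote, _bidPrice and _midPrice are illustrative names):

using System;
using System.Runtime.Serialization;

[Serializable]
public class Quote
{
    // Present since version 1.
    private decimal _bidPrice;

    // Added in version 2; streams written by version 1 won't contain it.
    [OptionalField(VersionAdded = 2)]
    private decimal _midPrice;

    [OnDeserialized]
    private void SetDefaults(StreamingContext context)
    {
        // Version 1 streams leave _midPrice at its default (0);
        // supply a meaningful fallback if 0 is not acceptable.
        if (_midPrice == 0m)
            _midPrice = _bidPrice;
    }
}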

Problems trying to pre-load objects to a text file for faster load times

I have been working on a Windows Forms control project to import into a 3rd-party client software using their supplied SDK. The custom control I am trying to load, written by yet another company, requires sign-on to a server before displaying information, which can take 20-30 seconds. To speed things up, I had the idea of pre-loading the information needed by the control into a text file. Since it is not a known type, it throws errors when trying to serialize the class.
I have a Dictionary I am using to reference back to the proper ICamera class. If I change "cam" from an ICamera type to a string, for example "cam.GetLiveURL()", it writes the text file without issue. This is the code I am using to populate the Dictionary:
foreach (ICamera cam in _adapter.Cameras())
{
    OCCamera.Add(cam.GetDisplayName(), cam);
}
I have tried XmlSerializer, and it seems to have difficulty dealing with a Dictionary.
I have tried BinaryFormatter and get the error:
Type 'OCAdapter.OCCamera' in Assembly 'OCAdapter.dll' is not marked as serializable.
I have tried DataContractSerializer and get the error:
Type 'OCAdapter.OCCamera' with data contract name 'OCCamera:http://schemas.datacontract.org/2004/07/OCAdapter' is not expected. Consider using a DataContractResolver or add any types not known statically to the list of known types - for example, by using the KnownTypeAttribute attribute or by adding them to the list of known types passed to DataContractSerializer.
I have tried playing around with the DataContractResolver and cannot seem to get it to work; I do not understand it at all.
The code I am using for the BinaryFormatter and DataContractSerializer is straight from MSDN or elsewhere, and tests fine without the custom type.
Maybe there is a better way to handle all this and I am missing it. I am not opposed to ditching the Dictionary approach for something else, and I can rewrite any amount of other code to make this work.
Mistake 1: trying to serialize your implementation rather than the data.
Mistake 2: using BinaryFormatter... just about ever (except maybe AppDomain marshalling)
My advice: create a simple model ("DTO" model) that just represents the data you need, but not in terms of your specific implementation (no OCAdapter.OCCamera etc). You can construct this DTO model in whatever way is convenient for whatever serialization library you like. I'm partial to protobuf-net, but many others exist. Then map to/from your DTO model and your implementation model.
Advantages:
it'll work
changes to the implementation don't impact the data; it only impacts the mapping code
you can use just about any serializer you want
you can version the data sensibly
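Since the answer mentions protobuf-net, a minimal sketch of such a DTO and its mapping might look like this (CameraDto and the property names are illustrative, not part of the real SDK; GetDisplayName and GetLiveURL come from the question):

using System.Collections.Generic;
using System.IO;
using ProtoBuf;

[ProtoContract]
public class CameraDto
{
    [ProtoMember(1)] public string DisplayName { get; set; }
    [ProtoMember(2)] public string LiveUrl { get; set; }
}

// Map from the SDK type to the DTO (only the data you actually need).
static CameraDto ToDto(ICamera cam)
{
    return new CameraDto
    {
        DisplayName = cam.GetDisplayName(),
        LiveUrl = cam.GetLiveURL()
    };
}

static void SaveCameras(IEnumerable<ICamera> cameras, string path)
{
    var dtos = new List<CameraDto>();
    foreach (var cam in cameras)
        dtos.Add(ToDto(cam));

    using (var file = File.Create(path))
        Serializer.Serialize(file, dtos); // protobuf-net
}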

Deserialize binary data without knowing the exact type written

I've encountered a problem where a small number of data objects stored using a BinaryFormatter are coming back with fields missing (null/default).
I'd like to know whether the missing items were saved as null, or whether the objects that were serialized were changed from the versions in source control and then reverted before a code commit (e.g. int numDogs vs uint dogCount).
The former would represent a serious bug in the data-validation code run before the serialization was done, while the latter is just junk data in a test DB and ignorable.
Since the BinaryFormatter is able to get everything else out when a member is changed, added, or removed, I assume it writes objects in a form similar to a key-value store. Is there any way to get a human-readable representation of it, without having to guess the exact details of the object that was serialized?
If you implement ISerializable on your objects, you can have a look at what was serialized by trying to deserialize it.
You will need to add a constructor that takes the same parameters as ISerializable.GetObjectData - this is where deserialization occurs.
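A minimal sketch of that idea: the deserialization constructor can enumerate every name/value pair the formatter found in the stream, which shows you exactly what was written (Dog and NumDogs are made-up example names):

using System;
using System.Runtime.Serialization;

[Serializable]
public class Dog : ISerializable
{
    public int NumDogs;

    public Dog() { }

    // Called by BinaryFormatter during deserialization.
    protected Dog(SerializationInfo info, StreamingContext context)
    {
        // Dump every entry that was actually stored in the stream.
        foreach (SerializationEntry entry in info)
        {
            Console.WriteLine("{0} ({1}) = {2}",
                entry.Name, entry.ObjectType, entry.Value ?? "null");
        }
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("NumDogs", NumDogs);
    }
}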

Serialization - Viewing the Object Graph from a Stream

I'm wondering if there's a way in which I can create a tree/view of a serialized object graph, and whether anyone has any pointers? EDIT: The aim being that, should we encounter a deserialization problem for some reason, we can actually view/produce a report on the serialized data to help us identify the cause of the problem before having to debug the code. Additionally, I want to extend this in the future to take two streams (version 1, version 2) and highlight the differences between them, to help ensure that we don't accidentally remove interesting information during code changes. /EDIT
Traditionally we've used SOAP or XML serialization, but these are becoming too restrictive for our needs, and binary serialization would generally do all that we need. The reason it hasn't been adopted is that it's much harder to view the serialized contents to help fix upgrade issues etc.
So I've started looking into creating a view of the serialized information. I can do this from an ISerializable constructor to a certain extent:
public A(SerializationInfo info, StreamingContext context)
{}
Given the serialization info, I can reflect over the m_data member and see the actual serialized contents. The problems with this approach are:
1. It will only display a branch from the tree; I want to display the entire tree from the root, and that's not really possible from this position.
2. It's not a convenient place to interrogate the information; I'd like to pass a stream to a class and do the work there.
I've seen the ObjectManager class, but this works on an existing object graph, whereas I need to be able to work from the stream of data. I've looked through the BinaryFormatter, which uses an ObjectReader and a __BinaryParser, hooking into the ObjectManager (which I think will then have the entire contents, just maybe in a flat list), but replicating this, or invoking it all via reflection (2 of those 3 classes are internal), seems like quite a lot of work, so I'm wondering if there's a better approach.
You could put a List<Child> in every parent class (even if they're the same type),
and when you create a child, immediately place it in that list, or better yet, declare it whilst adding it to the list.
For instance:
ListName.Add(new Child(constructorArgs));
Using this, you would serialize them as one file which contains the hierarchy of the objects and the objects themselves.
If the parent and child classes are the same, there is no reason why you cannot have a dynamic, multi-level hierarchy.
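As a tiny sketch of that structure (the names are illustrative), serializing the root then captures the whole hierarchy in one go:

using System.Collections.Generic;

[Serializable]
public class Parent
{
    public List<Parent> Children = new List<Parent>();
}

// Usage: declare the child whilst adding it to the list, then one
// Serialize call captures everything reachable from the root:
// var root = new Parent();
// root.Children.Add(new Parent());
// new BinaryFormatter().Serialize(stream, root);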
In order to achieve what you describe, you would have to deserialize the whole object graph from the stream without knowing the type from which it was serialized. But this is not possible, because the serializer doesn't store such information.
AFAIK it works in the following way. Suppose you have a couple of types:

class A { public bool p1; }
class B { public string p1; public string p2; public A p3; }

// instantiate them:
var b = new B { p1 = "ppp1", p2 = "ppp2", p3 = new A { p1 = true } };

When the serializer writes this object, it starts walking the object graph in some particular order (I assume alphabetical) and writes each object's type and then its contents. So your binary stream will look like this:
[B:[string:ppp1][string:ppp2][A:[bool:true]]]
You see, there are only values and their types here; the order is implicit - it is the order in which they were written.
So, if you change your object B to, say:
class B { public A p1; public string p2; public string p3; }
the serializer will fail, because it will try to assign an instance of string (which was serialized first) to a reference to A. You may try to reverse-engineer how binary serialization works; then you may be able to create a dynamic tree of serialized objects. But this will require considerable effort.
For this purpose I would create a class similar to this:

class Node
{
    public string NodeType;
    public List<Node> Children;
    public object NodeValue;
}

Then, while reading from the stream, you can create those nodes, recreate the whole serialized tree, and analyze it.
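A small sketch of how such a tree could then be inspected once populated (PrintTree is a made-up helper; filling the nodes from the stream is the reverse-engineering effort described above):

using System;

static void PrintTree(Node node, int depth)
{
    // Indent according to depth so the hierarchy is visible.
    Console.WriteLine("{0}{1} = {2}",
        new string(' ', depth * 2), node.NodeType, node.NodeValue);

    if (node.Children != null)
        foreach (var child in node.Children)
            PrintTree(child, depth + 1);
}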
