Binary (De)Serialization of a stream of objects to 1 file

I am faced with the following problem. I need to (de)serialize (binary) a stream of objects to a single file on disk. The serialization part is not an issue: just open a stream in append mode and call the Serialize method of .NET's BinaryFormatter. The problem with this approach is that I can't simply hand this stream to BinaryFormatter's Deserialize method, because what it contains is not a single instance of the object I've serialized.
Does a common solution to this problem exist? All objects serialized to a given stream are of the same type, so at least we don't need to figure out what is to be deserialized, but that alone doesn't seem to suggest a way out of this to me.
Clarification based on replies: The number of objects sent in is expected to be large, so it is infeasible to hold them all in a wrapper collection (flushing to disk would require loading them all into memory, adding the new ones, and then flushing everything to disk again).
Normally when you serialize a single object you get a file that contains:
[Object]
What I am creating is a file that contains:
[Object][Object][Object][Object]...[Object]
And I need to deserialize the individual Object instances.
Thanks in advance!
Answer: Since the answer is alluded to in this thread (with sufficient clarity), but never explicitly stated, I thought I'd state it here:
while (fileStream.Position < fileStream.Length)
messages.Add((Message)formatter.Deserialize(fileStream));
The BinaryFormatter will deserialize one object at a time as desired :) You might want to cache the fileStream.Length property, since the length appears to be re-computed every time you call the property, slowing things down. I've got no clue why that didn't work the first time I tried it before posting this question, but it does work flawlessly now.
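For readers on runtimes where BinaryFormatter is unavailable (it is marked obsolete in .NET 5 and later), the same append-then-loop pattern can be sketched with BinaryWriter and BinaryReader instead. This is only an illustration, not the code from the thread: the Message fields and the MessageLog helper are hypothetical, and the record layout (an int followed by a length-prefixed string) is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical message type; the question does not show the real one.
public sealed class Message
{
    public int Id;
    public string Text = "";
}

public static class MessageLog
{
    // Append one record to the end of the file without touching earlier records.
    public static void Append(string path, Message m)
    {
        using (var stream = new FileStream(path, FileMode.Append, FileAccess.Write))
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(m.Id);
            writer.Write(m.Text); // BinaryWriter prefixes the string with its byte length
        }
    }

    // Read records back one at a time until the end of the file is reached.
    public static IEnumerable<Message> ReadAll(string path)
    {
        using (var stream = File.OpenRead(path))
        using (var reader = new BinaryReader(stream))
        {
            long length = stream.Length; // read once rather than on every iteration
            while (stream.Position < length)
            {
                yield return new Message { Id = reader.ReadInt32(), Text = reader.ReadString() };
            }
        }
    }
}
```

Because ReadAll yields records lazily, a caller can process a large file with foreach without holding every object in memory at once.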

Try putting your objects into a serializable collection (List<T> is serializable), then (de)serializing that collection.
EDIT in response to clarification:
I just realized that this question has the same answer as this question. Rather than try to reinvent an answer, I'd just take a look at Marc Gravell's answer, or at this one.

A file is a serialization, so I would think sending your stream directly to a file would do what you seem to want. A more specific explanation of your situation would help, to provide a more useful answer. (I wish I could enter this as a 'comment', but somehow the comment button is not available to me.)

Related

c# can a saved List<OwnObject> be deserialized in another application?

Can a List that is saved in a .txt file be opened and deserialized in another application?
The thing is that I have a .txt file that I know contains a List of stored Products: List<Products>.
This list needs to be opened in another application so that the item values of each product in the list can be displayed.
I tried the code below, but obviously it does not work, because the stored objects are of my own type, Product.
Stream s = File.Open(directory, FileMode.Open);
BinaryFormatter bf = new BinaryFormatter();
List<object> fromFile = (List<object>)bf.Deserialize(s);
s.Close();
I've tried to search for answers myself but I can't find any similar questions or answers and I don't know how to solve this issue, can you guys help me out?

You'll need three things to be commonly understood by each application:
The serialization/deserialization algorithm
The format of data (XML, JSON, proprietary text, binary)
The data type OwnType
As comments indicate above, the first two are easily solvable and there are established libraries to help you with them.
#3 is more interesting, especially when:
a) You're implementing each application in a different language, and/or
b) The type OwnType evolves over time, making it inconsistent with files serialized in previous versions
Attempts to solve this entire set of problems have been made in several forms - there's Google Protobuf and the almost-extinct CORBA, for example.
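As a rough sketch of those three shared pieces, both applications could reference the same Product class and the same helper. Everything here is an assumption for illustration: the Product members are invented, ProductFile is a hypothetical helper, and XmlSerializer is swapped in for the BinaryFormatter used in the question so that the file format is explicit.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical shared type. In practice it would live in a class library
// that both applications reference, so they agree on its shape.
public class Product
{
    public string Name { get; set; } = "";
    public decimal Price { get; set; }
}

public static class ProductFile
{
    // Both sides use the same serializer (algorithm) and the same format (XML).
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(List<Product>));

    public static void Save(string path, List<Product> products)
    {
        using (var stream = File.Create(path))
        {
            Serializer.Serialize(stream, products);
        }
    }

    public static List<Product> Load(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            // The cast names the concrete element type, not object.
            return (List<Product>)Serializer.Deserialize(stream);
        }
    }
}
```

The writing application would call ProductFile.Save and the reading application ProductFile.Load, each compiled against the same Product definition.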

How to recognize versions of objects placed in isolated storage using .NET runtime serialization?

We are building an application that stores objects in isolated storage using .NET runtime serialization.
Problems occur when we update the application by adding new properties to the classes of the objects we serialize. So we want some kind of versioning for the objects in isolated storage, so that we can check whether they are obsolete before they are deserialized.
Any advice or ideas on the best way to do this?
What do you think about a custom formatter implementing the IFormatter interface? Could it help instead of versioning the objects?
I described this issue in more detail on the MS forum here.

You COULD have a serialization within the serialization: first a wrapper class that states the version and holds the real inner class.
This, however, feels like a bit of a code smell.

Here are a few options (not in any particular order).
Name the file based on the version
Place the file in a directory based on a version
Create a wrapper object that contains metadata about each serialized object such as the version number.
Add a property to each object that contains the persisting application's version number
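The wrapper option might look something like the following minimal sketch, which writes a version number ahead of an opaque payload so the reader can check it before attempting deserialization. The VersionedFile name, the CurrentVersion value, and the header layout (two 32-bit integers) are all assumptions made for illustration.

```csharp
using System.IO;

public static class VersionedFile
{
    // Hypothetical version of the persisting application's data format.
    public const int CurrentVersion = 2;

    // Write a small header (version, payload length) followed by the payload bytes.
    public static void Write(Stream output, int version, byte[] payload)
    {
        var writer = new BinaryWriter(output); // not disposed, so the stream stays open
        writer.Write(version);
        writer.Write(payload.Length);
        writer.Write(payload);
        writer.Flush();
    }

    // Read the header and payload; returns true only if the stored version
    // matches CurrentVersion, so the caller can decide whether to deserialize.
    public static bool TryRead(Stream input, out int version, out byte[] payload)
    {
        var reader = new BinaryReader(input);
        version = reader.ReadInt32();
        int length = reader.ReadInt32();
        payload = reader.ReadBytes(length);
        return version == CurrentVersion;
    }
}
```

When TryRead returns false, the caller still has the stored version and the raw bytes, and could route them to a migration step instead of discarding them.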

If it's binary serialization, you could read the bytes directly and determine the assembly version from them. The bytes from number 22 onwards contain information about the assembly and object type, so you could write something that reads this and then determines whether your objects are obsolete.

In a comment, Marc Gravell proposed the great idea of using a version-tolerant serializer.
It enables enough control of deserialization for us even to make obsolete objects reusable.
More on msdn
Thanks to all for suggestions.

Any way to "save state" in a C# game?

It's ok if the answer to this is "it's impossible." I won't be upset. But I'm wondering, in making a game using C#, if there's any way to mimic the functionality of the "save state" feature of console emulators. From what I understand, emulators have it somewhat easy, they just dump the entire contents of the virtualized memory, instruction pointers and all. So they can resume exactly the same way, in the exact same spot in the game code as before. I know I won't be able to resume from the same line of code, but is there any way I can maintain the entire state of the game without manually saving every single variable? I'd like a way that doesn't need to be extended or modified every single time I add something to my game.
I'm guessing that if there is any possible way to do this, it would use a p/invoke...

Well, in C# you can do the same, in principle. It's called serialization. Agreed, it's not the exact same thing as a memory dump but comes close enough.
To mark a class as serializable just add the Serializable attribute to it:
[Serializable]
class GameState
Additional information regarding classes that might change:
If new members are added to a serializable class, they can be tagged with the OptionalField attribute to allow previous versions of the object to be deserialized without error. This attribute affects only deserialization, and prevents the runtime from throwing an exception if a member is missing from the serialized stream. A member can also be marked with the NonSerialized attribute to indicate that it should not be serialized. This will allow the details of those members to be kept secret.
To modify the default deserialization (for example, to automatically initialize a member marked NonSerialized), the class must implement the IDeserializationCallback interface and define the IDeserializationCallback.OnDeserialization method.
Objects may be serialized in binary format for deserialization by other .NET applications. The framework also provides the SoapFormatter and XmlSerializer objects to support serialization in human-readable, cross-platform XML.
—Wikipedia: Serialization, .NET Framework
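The attributes and the callback interface mentioned in the quote could be combined roughly as follows. This is a sketch only: the GameState members are invented for illustration, and the callback is only invoked automatically when a runtime formatter deserializes the object.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
public class GameState : IDeserializationCallback
{
    // Hypothetical members present since the first version of the class.
    public int Level;
    public int Score;

    // Hypothetical member added in a later version; older saved files lack it,
    // and OptionalField keeps deserialization from failing on them.
    [OptionalField]
    public int Lives;

    // Not written to the stream; rebuilt after loading instead.
    [NonSerialized]
    private DateTime loadedAt;

    public DateTime LoadedAt
    {
        get { return loadedAt; }
    }

    // Called by the formatter once the whole object graph has been deserialized.
    public void OnDeserialization(object sender)
    {
        loadedAt = DateTime.UtcNow;
    }
}
```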

If you make every single one of your "state" classes Serializable then you can literally serialize the objects to a file. You can then load them all up again from this file when you need to resume.
See ISerializable

I agree with the other posters that making your game state classes Serializable is probably the way you want to go. Others have covered basic serialization; for a high end alternative you could look into NHibernate which will persist objects to a database. You can find some good info on NHibernate at these links:
http://www.codeproject.com/KB/database/Nhibernate_Made_Simple.aspx
http://nhibernate.info/doc/burrow/faq

How can I get the size of an object in the HttpRuntime.Cache?

I am currently storing many different types of objects in the ASP.NET HttpRuntime.Cache and I was wondering if there is a way to figure out how big each object is?

Look at these questions:
Getting the size of a field in bytes with C#
Find out the size of a .net object
In particular, look for Jon Skeet's answer in those questions. They will tell you why the number won't be accurate.
As for getting an estimate, there is no way to do that unless your objects meet certain criteria.
For instance, if you have many objects in the cache, sharing references to some common object instances, serializing out one of those objects in the cache will serialize out a copy of those common objects as well, inflating the results.

One thing you can do is serialize the object to a file on disk. That should give you an idea.
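A variation on that idea is to serialize into a MemoryStream and read its length, which avoids the temporary file. The sketch below makes several assumptions: SizeEstimator is a hypothetical helper, XmlSerializer is used only because it works with plain public types, and the result is the size of the XML text, which is at best a rough indicator and not the object's in-memory size.

```csharp
using System.IO;
using System.Xml.Serialization;

public static class SizeEstimator
{
    // Returns the number of bytes the value occupies when serialized to XML.
    // This is a relative indicator only, not the size of the object in memory.
    public static long SerializedSize<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var buffer = new MemoryStream())
        {
            serializer.Serialize(buffer, value);
            return buffer.Length;
        }
    }
}
```

Such numbers are mainly useful for comparing cached items against each other, keeping in mind the caveat above about shared references being counted more than once.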
This cache manager from ASP Alliance might help as well

Serialization byte array vs XML file

I make heavy use of byte arrays to transfer objects and primitive data over the network and back. I adapted Java's approach: a type implements ISerializable, an interface with two methods, ReadObjectData and WriteObjectData. Any class implementing this interface writes its data into the byte array. Something like this:
class SerializationType:ISerializable
{
void ReadObjectData (/*Type that manages the write/reads into the byte array*/){}
void WriteObjectData(/*Type that manages the write/reads into the byte array*/){}
}
After the write is complete for all objects, I send the array over the network.
This is actually a two-fold question. Is this the right way to send data over the network for the best efficiency (in terms of speed and size)?
Would you use this approach to write objects to a file, as opposed to the more typical XML serialization?
Edit #1
Joel Coehoorn mentioned BinaryFormatter. I have never used this class. Would you elaborate, provide good example, references, recommendations, current practices -- in addition to what I currently see on msdn?

This should be fine, but you're doing work that is already done for you. Look at the System.Runtime.Serialization.Formatters.Binary.BinaryFormatter class.
Rather than needing to implement your own Read/WriteObjectData() methods for each specific type, you can just use this class, which can already handle almost any object. It basically takes an exact copy of the memory representation of almost any .NET object and writes it to or reads it from a stream:
BinaryFormatter bf = new BinaryFormatter();
bf.Serialize(outputStream, objectToSerialize);
objectToDeserialize = bf.Deserialize(inputStream) as DeserializedType;
Make sure you read through the linked documents: there can be issues with unicode strings, and an exact memory representation isn't always appropriate (things like open Sockets, for example).

If you are after simple, lightweight and efficient binary serialization, consider protobuf-net; based on google's protocol buffers format, but implemented from scratch for typical .NET usage. In particular, it can be used either standalone (via protobuf-net's Serializer), or via BinaryFormatter by implementing ISerializable (and delegating to Serializer).
Apart from being efficient, this format is designed to be extensible and portable (i.e. compatible with java/php/C++ "protocol buffers" implementations), unlike BinaryFormatter that is both implementation-specific and version-intolerant. And it means you don't have to mess around writing any serialization code...

Creating your own ISerializable interface when there's already one in the framework sounds like a bit of a recipe for disaster. At least give it a different name.
You'll have a bit of a problem when it comes to reading - you won't have an instance to call the method on. You might want to make it a sort of "factory" instead:
public interface ISerializationFactory<T>
{
T ReadObjectData(Stream input);
void WriteObjectData(Stream output);
}
As for XML vs binary... it entirely depends on the situation: how much data will there be, do you need backwards and forwards compatibility, does the XML serialization in .NET give you enough control already etc.
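One other way to deal with the "no instance to call the method on" point is to keep writing as an instance method and make reading a static method that constructs the object. This is a sketch of that alternative, not the interface shown above; the Point type and its two fields are hypothetical.

```csharp
using System.IO;

// Hypothetical type used only to illustrate the read/write split.
public sealed class Point
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Writing needs an existing instance, so it is an instance method.
    public void WriteObjectData(Stream output)
    {
        var writer = new BinaryWriter(output); // not disposed, so the stream stays open
        writer.Write(X);
        writer.Write(Y);
        writer.Flush();
    }

    // Reading creates the instance, so it is static and acts as a factory.
    public static Point ReadObjectData(Stream input)
    {
        var reader = new BinaryReader(input);
        int x = reader.ReadInt32();
        int y = reader.ReadInt32();
        return new Point(x, y);
    }
}
```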

Yes this will be faster than sending XML as you will be sending less data over the wire. Even if you compressed the XML (which would drastically reduce its size) you would still have the overhead of compression and decompression. So I would say that between what you are currently doing and XML serialization you are currently using the most efficient solution.
However I am curious as to how much of a performance hit you would incur by using XML instead of a marshaled object. The reason that I would encourage you to look into XML serialization is because you will be storing the data in an application-neutral format that is also human readable. If you are able to serialize the data to XML in a way that does not incur performance penalties in your application I would recommend that you look into it.

Regarding writing to file, generally you want to serialize an object to XML if you want to be able to read the serialization or perhaps alter it. If you have no desire for the serialization to be human readable, you might as well reuse your binary serialization.
If you do want it to be human readable, then XML is something to consider, but it depends on the type of data you need to serialize. XML is inherently recursive and is therefore good for serializing likewise recursive data. It's less of a good fit on other types of data.
In other words, pick a persistent serialization that suits your needs. There's no one-way-fits-all solution here.
As for the network, you'll generally want to keep size to a minimum, so XML is rarely a good choice due to its verbosity.

Serialization (in Java) is deceptively simple. As long as you do simple stuff (like never change the class) it is easy - but there are a number of "fun" things with it too.
For a good discussion of Java serialization, look at Effective Java (specifically chapter 10).
For C#, not sure, but likely the core issues are the same.
There is an example here on C# serialization: http://www.codeproject.com/KB/cs/objserial.aspx.

The XStream library provides an exceptionally good way of dealing with serialisation, including support for XML, JSON and custom converters. Specifically, the use of custom converters allowed us to reduce XML verbosity and to serialise strictly what is needed.
XStream has no requirement to declare everything as Serializable, which is very important when one utilises a third-party lib and needs to serialise an instance of a class from that lib, which is not declared as Serializable.
The answer is already accepted, but for the sake of completeness of this discussion here is a link to a good comparison between different serialisation approaches/libraries:
http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
The kryo library looks very compelling for Java serialisation. Like XStream, it supports custom converters.
