C# best practice when serializing objects to file

I'm building a small app that needs to save an object to a file in order to save user data.
I have two questions about serializing to this file:
1: The object I'm creating has some public properties and an event. I added the [Serializable] attribute to my object, and then realized I can't serialize an object with an event in it. I then discovered that I can just add the attribute [field:NonSerialized] above my event and it will work. Is this the best way to do this, or should I try to build my serializable objects without any events inside?
2: The object I'm serializing saves some user settings about the app. These settings aren't sensitive enough to justify encrypting them in the file, but I still don't want them to be tampered with manually outside my application. When I serialize my object to a file using a plain BinaryFormatter object, via the Serialize() method, I see readable names of .NET object types in the file I'm saving to. Is there a way for someone to reverse engineer this and see what's being saved without using my program? Is there a way for someone to build a small application and find out how to deserialize the information in this file? If so, how would I go about hiding the information in this file?
Are there any other tips/suggestions/best practices I should stick to when serializing an object to a file in this kind of scenario?
Thanks in advance!

If your object implements the ISerializable interface, you can control all the data that is stored/serialized yourself, and you can control the deserialization.
This is important if your project evolves over time, because you might drop some properties, add others, or change the behaviour.
I always add a version to the serialization bag. That way I know which version of the object was stored, and I therefore know how to deserialize it.
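using System;
using System.Runtime.Serialization;

[Serializable]
class Example : ISerializable {
    // Bump this whenever the serialized layout changes.
    // (A const is implicitly static; "static const" does not compile.)
    private const int VERSION = 3;

    public Example() { }

    // Deserialization constructor, called by the formatter.
    public Example(SerializationInfo info, StreamingContext context) {
        // GetInt32 takes only the name; it throws if the entry is missing.
        var version = info.GetInt32("Example_Version");
        if (version == 0) {
            // Restore properties for version 0
        }
        if (version == 1) {
            // ....
        }
    }

    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue("Example_Version", VERSION);
        // Your data here
    }
}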
And if you do not encrypt, it will be very easy to "read" your data, where "very easy" means someone might have to invest a couple of hours. Whether that counts as easy depends on what the data is worth: if it is worth a couple of days of effort, a couple of hours is easy; if it is only worth a couple of minutes, it is hard.
A very easy way to encrypt your data is using the Windows DPAPI through the ProtectedData class.
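As a minimal sketch of the DPAPI suggestion, the helper below encrypts a byte array for the current Windows user before writing it to disk. The class and method names (ProtectedFile, Save, Load) are made up for illustration; ProtectedData.Protect and ProtectedData.Unprotect are the real calls. This assumes Windows, plus a reference to System.Security on .NET Framework or the System.Security.Cryptography.ProtectedData package on newer runtimes.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; the names are illustrative only.
public static class ProtectedFile
{
    // Encrypts the bytes for the current Windows user and writes them to disk.
    public static void Save(string path, byte[] plainBytes)
    {
        byte[] cipherBytes = ProtectedData.Protect(
            plainBytes, null, DataProtectionScope.CurrentUser);
        File.WriteAllBytes(path, cipherBytes);
    }

    // Reads the file and decrypts it. Unprotect throws CryptographicException
    // when the blob cannot be decrypted, for example after it was edited by hand.
    public static byte[] Load(string path)
    {
        byte[] cipherBytes = File.ReadAllBytes(path);
        return ProtectedData.Unprotect(
            cipherBytes, null, DataProtectionScope.CurrentUser);
    }
}
```

The bytes passed to Save would be whatever your serializer produced, so this can sit behind any of the serializers discussed here.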

1: with BinaryFormatter, yes - you need NonSerialized for events (unless you implement ISerializable, but that adds lots of work); however, I'm pretty much on record as saying that I simply wouldn't use BinaryFormatter here. It is not very forgiving of a range of changes to your type. I would use something less tied to the internals of your code: XmlSerializer, DataContractSerializer, JavaScriptSerializer. I can suggest binary alternatives too: NetDataContractSerializer, protobuf-net (my own), etc.
2: yes, with almost any implementation that doesn't involve proper encryption, if anyone cares they can reverse engineer it and obtain the strings. So it depends how hidden it needs to be. Simply running your existing serialization through GZipStream may be enough obfuscation for your needs, BUT this is just a mask against casual inspection. It will not deter anyone with a reason to look for the data.
If the data needs to be secure, you'll need proper encryption using either a key the user enters at app startup, or something like a certificate securely stored against their user profile.
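To make points 1 and 2 concrete, here is a rough sketch that uses DataContractSerializer (one of the alternatives named above) and writes through a GZipStream. UserSettings, SettingsStore and their members are hypothetical names, not anything from the question. Because data contracts are opt-in, the event is simply never looked at, and the gzip layer only hides the XML from casual inspection; it is not encryption.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;

[DataContract]
public class UserSettings
{
    [DataMember] public string Theme { get; set; }
    [DataMember] public int FontSize { get; set; }

    // Not a data member, so the serializer ignores the event and its subscribers.
    public event EventHandler Changed;

    public void RaiseChanged()
    {
        var handler = Changed;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public static class SettingsStore
{
    public static void Save(string path, UserSettings settings)
    {
        var serializer = new DataContractSerializer(typeof(UserSettings));
        using (var file = File.Create(path))
        using (var gzip = new GZipStream(file, CompressionMode.Compress))
        {
            // The XML is compressed on its way to disk.
            serializer.WriteObject(gzip, settings);
        }
    }

    public static UserSettings Load(string path)
    {
        var serializer = new DataContractSerializer(typeof(UserSettings));
        using (var file = File.OpenRead(path))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        {
            return (UserSettings)serializer.ReadObject(gzip);
        }
    }
}
```

Usage would be along the lines of SettingsStore.Save(path, settings) on exit and SettingsStore.Load(path) on startup; event subscribers have to be re-attached after loading.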

I would remove the events from the objects. It's a little cleaner that way.
Anything can be reverse engineered. Just encrypt it when saving the file. It's pretty easy to do. Of course, the encryption key is going to have to be stored in the app somewhere, so unless you're obfuscating your code a determined hacker will be able to get to it.

Related

Why use serialization

I've seen a couple of examples with the Serializable attribute like this:
[Serializable()]
public class sampleClass
{
    public int Property1 { get; set; }
    public string Property2 { get; set; }
    public sampleClass() { }
    public sampleClass(int pr1, int pr2)
    {
        Property1 = pr1;
        Property2 = pr2.ToString();
    }
}
I never had a good grasp of how this works, but from MSDN:
Serialization allows the developer to save the state of an object
and recreate it as needed, providing storage of objects as well as
data exchange. Through serialization, a developer can perform actions
like sending the object to a remote application by means of a Web
Service, passing an object from one domain to another, passing an
object through a firewall as an XML string, or maintaining security or
user-specific information across applications.
But the problem is that in my code example I see no use for it. It is just an object that is used to retrieve data from the database, nothing special. What are some other cases where serialization should or should not be used?
For example, should I always use serialization because it is more secure? Is it going to be slower this way?
Update: Thanks for all the nice answers

Serialization is useful any time you want to move a representation of your data into or out of your process boundary.
Saving an object to disk is a trivial example you'll see in many tutorials.
More commonly, serialization is used to transfer data to and from a web service, or to persist data to or from a database.

Several answers have covered the reasons of why you might want to use serialization in general. You seem to also want to know why a specific class has attribute [Serializable] and you are wondering why that may have been done.
With ASP.NET the default Session state storage is InProc, which allows you to store any object as a reference and leave it on the heap. This is the best performing way to store session state; however, it only works if you are using a single worker process or if all your session state could be rebuilt automatically if the worker process were to change (unlikely). For the other state storage modes (StateServer and SQL Server), all the session state objects must be serializable, as the ASP.NET engine will first serialize these objects using binary serialization before sending them to the storage medium.
In your case, you may be using InProc. One reason though to still mark all classes that are used in session state as Serializable and test them that way is that you may have a need to change this in the future (for example, to use a Web Farm). If you do not design your session state classes with this in mind it will be quite difficult to do the migration in the future.
Also, just because you can remove the Serializable attribute and the program "works" in one environment does not mean that it will work in another. For example, it may work fine under the Visual Studio test web server (which always uses InProc session state mode) and even in a development IIS instance, but a production IIS instance may be set up to use a different storage mode.
These environmental/configuration differences are not necessarily limited to ASP.NET applications. There are other application engines that may do this or even standalone applications that do (it is not difficult to build this kind of configurable environment).
Finally, you may be working with a library which may be consumed by different applications. Some may need to store state in a serializable manner and others may not.
Because of these factors it is often a very good idea, at least when building a library, to consider marking simple value classes or state management classes with [Serializable]. Keep in mind that this increases the work for testing these classes and that there are limits to what can be serialized (e.g. a class that contains a socket reference or open file reference may not be a good candidate for serialization, as open external resources cannot be serialized), so do not overuse this.
You asked if using [Serializable] will be slower. No, it will not be. This attribute has no direct effect on performance. However, if the application environment is changed to serialize the object, then yes, performance will be affected. It is the act of serializing and deserializing that is slower than just storing the object on the heap. [Note that some routines could be written to look for the Serializable attribute and then choose to serialize, but this is rare; usually it is like ASP.NET and left up to an administrator or user to decide if they want to change the storage medium.]

The MSDN quote you provide explains when serialization is useful: for transporting or storing data. Writing to a file is serialization, and serialization is required to send an object over a network.
If you are just populating the object in a single application, perhaps from a database, then indeed: serialization is not a concern at all. Marking a class for serialization has no impact on security or performance: if you don't need it, don't worry about it.
Note also that [Serializable] mainly relates to BinaryFormatter, but there are actually many more serializers than that. For example: you might want to expose your object via JSON or XML - both of those require serialization, but neither requires [Serializable].

Simple example: Imagine you have a custom shape to store application settings.
namespace My.Namespace
{
    [Serializable]
    public class Settings
    {
        public string Setting1 { get; set; }
        public string Setting2 { get; set; }
    }
}
You could then have an XML file such as:
<?xml version="1.0" encoding="utf-8" ?>
<Settings>
    <Setting1>Foo</Setting1>
    <Setting2>Bar</Setting2>
</Settings>
Using XmlSerializer you could simply serialize and deserialize your settings.
It is also necessary for your shape to be Serializable if you wish to stuff it into ASP.NET ViewState.
These are very basic examples, but they demonstrate its usefulness.
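A minimal sketch of that XmlSerializer round trip might look like the following. The SettingsXml helper and its method names are invented for the example, and the class is declared without a namespace to keep the snippet short. XmlSerializer only needs a public type with a parameterless constructor, so the [Serializable] attribute is left off here.

```csharp
using System.IO;
using System.Xml.Serialization;

public class Settings
{
    public string Setting1 { get; set; }
    public string Setting2 { get; set; }
}

// Hypothetical helper; the names are illustrative only.
public static class SettingsXml
{
    // Turns a Settings instance into an XML string.
    public static string ToXml(Settings settings)
    {
        var serializer = new XmlSerializer(typeof(Settings));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, settings);
            return writer.ToString();
        }
    }

    // Rebuilds a Settings instance from XML shaped like the file above.
    public static Settings FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Settings));
        using (var reader = new StringReader(xml))
        {
            return (Settings)serializer.Deserialize(reader);
        }
    }
}
```

To work with a file instead of a string, the same Serialize and Deserialize calls accept a FileStream or StreamReader.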

What are some other uses on when to use and when not to use serialization.
Let me give you one practical example. In one of my applications, I was given XML schemas (XSD files) for request and response XML files. I needed to parse the request XML file, process it, and save the information into several tables. Later I needed to prepare a response XML file accordingly and send it back to our client.
I used Xsd2Code to generate C# classes based on the schema. So parsing the request XML file is simply deserializing it into the generated request class object. Then I can access properties on the object the way they appear in the request XML file. Generating the response XML file is simply serializing the generated response class object, which I populate in my code. This way I can work with C# objects rather than XML files. I hope it makes sense.
For example, should I always use serialization because it is more secure
I don't think this is related to security in any way.

protobuf-net: how to store in the users session

I'm currently able to store an object I've created into HttpContext.Current.Session, and I've come across protobuf-net. Is there a way to store my object by serializing it with protobuf?
It looks like protobuf wants to store the information into a Stream, so should I (can I?) store a Stream object into the user's session? Or should I first convert it from a Stream into another object type? If so, will converting the serialized object defeat the original purpose of using protobuf (CPU usage, memory usage)? Has anyone done this before?
My goal is to use protobuf as a compression layer for storing information into the user's session. Is there a better way (smaller sizes, faster compression, easier to maintain, smaller implementation overhead) of doing this, or is protobuf the right tool for this task?
Update
I'm using this class object
[Serializable]
public class DynamicMenuCache
{
    public System.DateTime lastUpdated { get; set; }
    public MenuList menu { get; set; }
}
This class is a wrapper for my MenuList class, which is (basically) a List of Lists containing built-in types (strings, ints). I've created the wrapper to associate a timestamp with my object.
If I have a session cache miss (session key is null or session.lastUpdated is greater than a globally stored time), I do my normal db lookup (MSSQL), create the MenuList object, and store it into the session, like so
HttpContext.Current.Session.Add("DynamicMenu" + MenuType, new DynamicMenuCache()
{
    lastUpdated = System.DateTime.Now,
    menu = Menu
});
Currently our session is stored in memory, but we might move to a DB session store in the future.
Our session usage is pretty heavy, as we store lots of large objects in it (although I hope to clean up what we store in the session at some future point).
For example, we store each user's permission set into their session store to avoid the database hit. There are lots of permissions and permission storing structs that get stored into the session currently.
At this point I'm just viewing the options available, as I'd like to make more intelligent and rigorous use of the session cache in the future.
Please let me know if there is anything else you need.

Note that using protobuf-net here mainly only makes sense if you are looking at moving to a persisted state provider at some point.
Firstly, since you are using in-memory at the moment (so the types are not serialized, AFAIK), some notes on changing session to use any kind of serialization-based provider:
the types must be serializable by the provider (sounds obvious, but this has particular impact if you have circular graphs, etc)
because data is serialized, the semantic is different; you get a copy each time, meaning that any changes you make during a request are lost - this is fine as long as you make sure you explicitly re-store the data again, and can avoid some threading issues - double-edged
the inbuilt state mechanisms typically retrieve session as single operation - which can be a problem if (as you mention) you have some big objects in there; nothing to do with protobuf-net, but I once got called in to investigate a dying server, which turned out to be a multi-MB object in state killing the system, as every request (even those not using that piece of data) caused this huge object to be transported (both directions) over the network
In many ways, I'm actually simply not a fan of the standard session-state model - and this is before I even touch on how it relates to protobuf-net!
protobuf-net is, ultimately, a serialization layer. Another feature of the standard session-state implementation is that because it was originally written with BinaryFormatter in mind, it assumes that the objects can be deserialized without any extra context. protobuf-net, however, is (just like XmlSerializer, DataContractSerializer and JavaScriptSerializer) not tied to any particular type system - it takes the approach "you tell me what type you want me to populate, I'll worry about the data". This is actually a hugely good thing, as I've seen web-servers killed by BinaryFormatter when releasing new versions, because somebody had the audacity to touch even slightly one of the types that happened to relate to an object stored in persisted session. BinaryFormatter does not like that; especially if you (gasp) rename a type, or (shock) make something from a field+property to an automatically-implemented-property. Hint: these are the kinds of problems that google designed protobuf to avoid.
However! That does mean that it isn't hugely convenient to use with the standard session-state model. I have implemented systems to encode the type name into the stream before (for example, I wrote an enyim/memcached transcoder for protobuf-net), but... it isn't pretty. IMO, the better way to do this is to transfer the burden of knowing what the data is to the caller. I mean, really... the caller should know what type of data they are expecting in any given key, right?
One way to do this is to store a byte[]. Pretty much any state implementation can handle a BLOB. If it can't handle that, just use Convert.ToBase64String / Convert.FromBase64String to store a string - any implementation not handling string needs shooting! To use with a stream, you could do something like (pseudo-code here):
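// Needs: using System.IO; using System.Web; using ProtoBuf;
public static T GetFromState<T>(string key) {
    // Standard state provider "get by key"; ASP.NET session is used here as one
    // possible provider, assuming the value was stored as a byte[].
    byte[] blob = (byte[])HttpContext.Current.Session[key];
    using (var ms = new MemoryStream(blob)) {
        return Serializer.Deserialize<T>(ms);
    }
}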
(and similar for adding)
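For completeness, here is a hedged sketch of both directions of the byte[] approach described above. It assumes the protobuf-net NuGet package; the BlobState class, the in-memory Dictionary standing in for a real state provider, and the MenuCacheEntry DTO are all invented for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf; // protobuf-net NuGet package

// Hypothetical DTO; protobuf-net wants explicit, numbered members.
[ProtoContract]
public class MenuCacheEntry
{
    [ProtoMember(1)] public long LastUpdatedTicks { get; set; }
    [ProtoMember(2)] public List<string> Items { get; set; }
}

public static class BlobState
{
    // Stand-in for a real state provider (session, redis, SQL, ...).
    // Not thread-safe; for illustration only.
    private static readonly Dictionary<string, byte[]> Store =
        new Dictionary<string, byte[]>();

    public static void AddToState<T>(string key, T value)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, value);
            Store[key] = ms.ToArray(); // store the BLOB, not the stream
        }
    }

    public static T GetFromState<T>(string key)
    {
        using (var ms = new MemoryStream(Store[key]))
        {
            // The caller supplies T; no type name is kept in the data.
            return Serializer.Deserialize<T>(ms);
        }
    }
}
```

Swapping the dictionary for Session[key] or a redis client call changes only the two lines that touch Store.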
Note that protobuf-net is not the same as BinaryFormatter - they have different expectations of what is reasonable, for example by default protobuf-net expects to know in advance what the data looks like (i.e. public object Value {get;set;} would be a pain), and doesn't handle circular graphs (although there are provisions in place to support both of these scenarios). As a general rule of thumb: if you can serialize your data with something like XmlSerializer or DataContractSerializer it will serialize easily with protobuf-net; protobuf-net supports additional scenarios too, but doesn't make an open guarantee to serialize every arbitrary data model. Thinking in terms of DTOs will make life easier. In most cases this isn't a problem at all, since most people have reasonable data. Some people do not have reasonable data, and I just want to set expectation appropriately!
Personally, though, as I say - especially when large objects can get involved, I'm simply not a fan of the inbuilt session-state pattern. What I might suggest instead is using a separate per-key data store (meaning: one record per user per key, rather than just one record per user) - maybe just for the larger objects, maybe for everything. This could be SQL Server, or something like redis/memcached. This is obviously a bit of a pain if you are using 3rd-party controls (webforms etc) that expect to use session-state, but if you are using state manually in your code, is pretty simple to implement. FWIW, BookSleeve coupled to redis works well for things like this, and provides decent access to byte[] based storage. From a byte[] you can deserialize the object as shown above.
Anyway - I'm going to stop there, in case I'm going too far off-topic; feel free to ping back with any questions, but executive summary:
protobuf-net can stop a lot of the versioning issues you might see with BinaryFormatter
but it isn't necessarily a direct 1:1 swap, since protobuf-net doesn't encode "type" information (which the inbuilt session mechanism expects)
it can be made to work, most commonly with byte[]
but if you are storing large objects, you may have other issues (unrelated to protobuf-net) related to the way session-state wants to work
for larger objects in particular, I recommend using your own mechanism (i.e. not session-state); the key-value-store systems (redis, memcached, AppFabric cache) work well for this

Untrusted deserialization strategy

I have a pretty complex web of objects I'd like to serialize and deserialize in an untrusted environment (web browser, using Unity 3D). Plain BinaryFormatter serialization is working fine, but deserialization crashes with "access to private field" errors. It works perfectly when I am running locally.
I would rather not make my codebase suck by making all my private fields public. What is the best way to get deserialization to work in an untrusted environment without doing this? I am open to changing serialization methods, BinaryFormatter was just the easiest to get started with.
UPDATE I don't want to prevent serialization from accessing my private data, I want to allow serialization to access my private data without having to make it public, compromising the encapsulation of my code.
Thanks.

Serializers like XmlSerializer and JavaScriptSerializer work against the public members, so they should (crosses fingers) work acceptably in terms of trust. You could also try protobuf-net if you want binary - but I haven't aggressively tested this scenario (it works in things like Silverlight, though, which has a fairly picky trust model).
If you want to stick with BinaryFormatter but don't want it touching your fields directly you could implement ISerializable, but doing it all manually is... painful.

None of the answers really answered my question (see the update for clarification). I ended up writing my own simple serialization format using BinaryWriter. In the end I realized what I did was equivalent to manually implementing the ISerializable interface for my classes. I had to manually implement the graph serialization code. While not hard, it's a bit subtle, and it has already been done for me. For future people with this question, if there are no better answers, I recommend manually implementing ISerializable.
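A rough sketch of the hand-written BinaryWriter approach: the class reads and writes its own private fields, so no reflection over private members is required. The Player type and its members are invented for illustration, and this covers a single object only; a real object graph with shared or circular references needs the extra bookkeeping mentioned above.

```csharp
using System.IO;

public class Player
{
    private string name;
    private int health;

    public Player(string name, int health)
    {
        this.name = name;
        this.health = health;
    }

    public string Name { get { return name; } }
    public int Health { get { return health; } }

    // The class writes its own private fields in a fixed order.
    public void Write(BinaryWriter writer)
    {
        writer.Write((byte)1); // format version
        writer.Write(name);
        writer.Write(health);
    }

    // Reads the fields back in exactly the order Write emitted them.
    public static Player Read(BinaryReader reader)
    {
        byte version = reader.ReadByte();
        if (version != 1)
            throw new InvalidDataException("Unknown format version " + version);
        string name = reader.ReadString();
        int health = reader.ReadInt32();
        return new Player(name, health);
    }
}
```

Writing a version byte first is a cheap way to keep old files readable when fields are added later.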

Well, if you want to prevent private field access by serialization, you may want to move over to XML serialization or perhaps even JSON serialization.
You can prevent the private fields from serializing by placing NonSerialized attributes on them, but you may run into problems when developers expect their fields to contain valid values and do not take into account the fact that those field values get lost when transferred to the Unity plugin.

Any way to "save state" in a C# game?

It's ok if the answer to this is "it's impossible." I won't be upset. But I'm wondering, in making a game using C#, if there's any way to mimic the functionality of the "save state" feature of console emulators. From what I understand, emulators have it somewhat easy, they just dump the entire contents of the virtualized memory, instruction pointers and all. So they can resume exactly the same way, in the exact same spot in the game code as before. I know I won't be able to resume from the same line of code, but is there any way I can maintain the entire state of the game without manually saving every single variable? I'd like a way that doesn't need to be extended or modified every single time I add something to my game.
I'm guessing that if there is any possible way to do this, it would use a p/invoke...

Well, in C# you can do the same, in principle. It's called serialization. Agreed, it's not the exact same thing as a memory dump but comes close enough.
To mark a class as serializable just add the Serializable attribute to it:
[Serializable]
class GameState
Additional information regarding classes that might change:
If new members are added to a serializable class, they can be tagged with the OptionalField attribute to allow previous versions of the object to be deserialized without error. This attribute affects only deserialization, and prevents the runtime from throwing an exception if a member is missing from the serialized stream. A member can also be marked with the NonSerialized attribute to indicate that it should not be serialized. This will allow the details of those members to be kept secret.
To modify the default deserialization (for example, to automatically initialize a member marked NonSerialized), the class must implement the IDeserializationCallback interface and define the IDeserializationCallback.OnDeserialization method.
Objects may be serialized in binary format for deserialization by other .NET applications. The framework also provides the SoapFormatter and XmlSerializer objects to support serialization in human-readable, cross-platform XML.
—Wikipedia: Serialization, .NET Framework
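The attributes mentioned in that quote could be combined roughly as follows. This is only a sketch: the GameState members are invented, and the example does not call a formatter itself. With binary serialization the field initializers do not run during deserialization, which is why the derived lookup has to be rebuilt in the callback.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

[Serializable]
public class GameState : IDeserializationCallback
{
    private List<string> inventory = new List<string>();

    // Added in a later version; older saved streams will not contain it.
    [OptionalField]
    public int Difficulty;

    // Derived data: skipped by the formatter and rebuilt after loading.
    [NonSerialized]
    private HashSet<string> lookup = new HashSet<string>();

    public void AddItem(string item)
    {
        inventory.Add(item);
        lookup.Add(item);
    }

    public bool HasItem(string item)
    {
        return lookup.Contains(item);
    }

    // Called by the formatter once the whole object graph has been restored.
    void IDeserializationCallback.OnDeserialization(object sender)
    {
        lookup = new HashSet<string>(inventory);
    }
}
```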

If you make every single one of your "state" classes Serializable then you can literally serialize the objects to a file. You can then load them all up again from this file when you need to resume.
See ISerializable

I agree with the other posters that making your game state classes Serializable is probably the way you want to go. Others have covered basic serialization; for a high end alternative you could look into NHibernate which will persist objects to a database. You can find some good info on NHibernate at these links:
http://www.codeproject.com/KB/database/Nhibernate_Made_Simple.aspx
http://nhibernate.info/doc/burrow/faq

How to GetByte from Object without serialize?

I want to save the properties of controls that the user changes at runtime (.NET Windows Forms application).
I'm using BinaryFormatter to serialize the object. It works, but some properties are not serialized, so I want to save the object as raw binary instead.
Note: I'm using a third-party component without source code.
Could you please help me?

Serializing the object is saving it "as binary". If you're looking for a straight memory dump, you're out of luck - that's just not realistic in .NET.
If serialization doesn't work out of the box, you may need to serialize what you can and then bolt on extra information about the properties that aren't currently being serialized. I would personally be slightly worried at how brittle this solution could be though - there may be very good reasons for the properties not being serialized. (.NET binary serialization is pretty brittle to start with...)
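One way to "bolt on" the missing properties, sketched under the assumption that they are public and settable, is to copy them by name into a dictionary that you save alongside the serialized object. PropertySnapshot and its methods are hypothetical names; the reflection calls are standard. The captured values still have to be serializable themselves.

```csharp
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper; the names are illustrative only.
public static class PropertySnapshot
{
    // Copies the named public properties of target into a dictionary.
    public static Dictionary<string, object> Capture(object target, params string[] names)
    {
        var snapshot = new Dictionary<string, object>();
        foreach (string name in names)
        {
            PropertyInfo property = target.GetType().GetProperty(name);
            if (property != null && property.CanRead)
                snapshot[name] = property.GetValue(target, null);
        }
        return snapshot;
    }

    // Writes the captured values back; unknown or read-only properties are skipped.
    public static void Restore(object target, Dictionary<string, object> snapshot)
    {
        foreach (KeyValuePair<string, object> pair in snapshot)
        {
            PropertyInfo property = target.GetType().GetProperty(pair.Key);
            if (property != null && property.CanWrite)
                property.SetValue(target, pair.Value, null);
        }
    }
}
```

For a control this might be called as PropertySnapshot.Capture(control, "Text", "Width"), with the property names chosen by you. As noted above, this stays brittle if the component has reasons for not serializing those properties.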
Why not contact the author of the component and ask for their advice? They're likely to know more about any quirks you might run into than we are.
