Force (BinaryFormatter) serializer to use SerializableAttribute semantics when ISerializable? - c#

I am trying to learn to use C# serialization as a way to save objects into files that can be reloaded back into objects.
A plain class like this that I tested
[Serializable()]
public class PlainClass
{
public string Name;
private int Age;
protected decimal Price;
}
can be passed directly to BinaryFormatter.Serialize() and BinaryFormatter.Deserialize() without errors. (By the way, the private and protected properties also get serialized, although the docs say only public.)
But the moment it implements ISerializable or inherits from some class like Hashtable that implements ISerializable, the you-know-what deserialization constructor is required. The word or concept of "implement" becomes a misnomer because Hashtable does not actually implement that constructor.
Is there a way to fall back to the "auto" Serialization/Deserialization provided only by the attribute? Or is there an easier way to write info.GetValue() for a hundred properties in a class?

There is a lot of confusion in your post:
(by the way, the private and protected properties also get serialized although the docs say only public)
I suspect you are confusing two different serializers; BinaryFormatter is and always has been documented as field-centric. It doesn't distinguish between public/private, and it never looks at properties: only fields. XmlSerializer, by contrast, only looks at public properties and fields.
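As a rough illustration of what "field-centric" means, the sketch below uses plain reflection (not BinaryFormatter itself) to list the instance fields of a type. PlainSketch and FieldSketch are hypothetical names made up for this example, and the exact name of the compiler-generated backing field is an implementation detail of the C# compiler.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical type, modelled on the PlainClass from the question.
public class PlainSketch
{
    public string Name;
    private int Age;
    protected decimal Price;

    // An auto-property: a field-based serializer would only see its hidden backing field.
    public string Title { get; set; }

    public PlainSketch(int age, decimal price)
    {
        Age = age;
        Price = price;
    }
}

public static class FieldSketch
{
    // Returns the names of all instance fields, whatever their visibility.
    public static string[] InstanceFieldNames(Type type)
    {
        return type
            .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .Select(f => f.Name)
            .ToArray();
    }
}
```

Calling FieldSketch.InstanceFieldNames(typeof(PlainSketch)) should list Name, Age, Price and a backing field for Title; the Title property itself never appears as a member of that list.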
The word or concept of "implement" becomes a misnomer because Hashtable does not actually implement that constructor.
Yes it does; it is a protected constructor:
protected Hashtable(SerializationInfo info, StreamingContext context)
{...}
If you inherit Hashtable you can chain to this constructor:
protected YourType(SerializationInfo info, StreamingContext context)
: base(info, context)
{ /* your extra data */ }
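A fuller sketch of that chaining, assuming a hypothetical subclass LabelledTable with one extra field called Label (neither name comes from the question). The subclass only reads and writes its own data and leaves the rest to Hashtable. On recent .NET versions these serialization members are flagged as obsolete, so expect compiler warnings there.

```csharp
using System;
using System.Collections;
using System.Runtime.Serialization;

[Serializable]
public class LabelledTable : Hashtable
{
    // The only data this subclass is responsible for.
    public string Label;

    // Declaring the constructor below removes the implicit default one, so add it back.
    public LabelledTable() { }

    // Deserialization constructor: the base class restores its own state first.
    protected LabelledTable(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        Label = info.GetString("Label");
    }

    // Serialization side: let Hashtable write its data, then append ours.
    public override void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        base.GetObjectData(info, context);
        // "Label" is an entry name chosen for this sketch; it must match the constructor.
        info.AddValue("Label", Label);
    }
}
```

The entry name passed to AddValue and GetString has to be the same on both sides, and it should not collide with a name the base class already uses.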
Note, however, that you probably shouldn't be using Hashtable much unless you are on .NET 1.1.
Is there a way to fall back to the "auto" Serialization/Deserialization provided only by the attribute?
No; none.
Or is there an easier way to write info.GetValue() for a hundred properties in a class?
In the case of inherited data, you could chain the base-constructor or switch to encapsulation instead of inheritance - either avoids the need to worry about data other than your own.
Note, however, that I almost always guide against BinaryFormatter - it can be vexing, and is quirky with versioning. For every annoyance with BinaryFormatter, I use protobuf-net (I would, since I wrote it) - this generally makes serialization much more controlled (and more efficient), and includes an ISerializable hook if you really want to use BinaryFormatter (i.e. it can use BinaryFormatter as the wrapper, but with a protobuf-net payload).

Hashtable/Dictionary requires the implementation of the respective methods...
To work around that, you would have to implement a separate class to hold the Dictionary data and expose it as an IList instead, whose items need not be KeyValuePair<TKey, TValue> but can be a separately implemented class with Key/Value members (refer to http://blogs.msdn.com/b/adam/archive/2010/09/10/how-to-serialize-a-dictionary-or-hashtable-in-c.aspx)...
As you can see from the start of my explanation - this is nothing you would usually want to do...
There are better serialization solutions out there - see for a very good one http://code.google.com/p/protobuf-net/

Related

C# Serialization limitations

I want to implement a general Memento pattern in C#. It is working fine, but I use the [Serializable] attribute to do a deep copy of an object. My implementation uses generics, so anyone who uses it has to supply their own class as the type argument.
Now the user's class must have the [Serializable] attribute too. Are there any limitations for a class that uses [Serializable]?
In particular:
Are there any performance problems?
Is it possible to use an interface?
Is it possible to use inheritance?
Is it possible to use auto-properties?
I don't know how the attribute works, so I am a bit scared of using it in such a global way.
regards
Performance: for small models that you are cloning in memory, not usually
Interfaces: irrelevant; when using [Serializable] you are typically using BinaryFormatter - which looks at the objects themselves; it doesn't matter what interfaces they implement - the interfaces are not used
Inheritance: yes, for the same reason - but all types in the model must be [Serializable]
Auto-properties: yes, for the same reason; note: the default BinaryFormatter implementation is looking at fields - it won't even touch the properties
Personally, I try to advise against BinaryFormatter, but this is perhaps not an unreasonable use. However! Be careful: it is easy to suck extra objects into the model accidentally, most commonly through events. Note that it is a good idea to mark all events as non-serialized:
[field:NonSerialized]
public event EventHandler Something;
(or apply to the field directly if using explicit add/remove accessors)
Note also that any members like:
public object Tag {get;set;} // caller-defined
should also probably be [field:NonSerialized].
Personally, I'd prefer a different serializer, but: this will often work. I will say, though: try to avoid persisting the output of BinaryFormatter, as it is hard to guarantee compatibility between revisions of your code.
I dont know how the Attribute works
It does nothing at all except add an IL flag that says "by the way, consider this ok to be serialized"; actually, most serializers don't even look at this flag - but BinaryFormatter is one of the few that do look at this flag. The real code here is BinaryFormatter, which basically does:
have I seen this object before? if so, store the key only
what type is it? is it [Serializable]? store the type info
invent a new reference and store that as the identity
does it have a custom serializer? if so: use that
what fields does it have? access each in turn and store the name/value pair

Best place for serialisation code. Internal to class being serialised, or external class per format?

I often find myself in a quandary in where to put serialisation code for a class, and was wondering what others' thoughts on the subject were.
Bog standard serialisation is a no brainer. Just decorate the class in question.
My question is more for classes that get serialised over a variety of protocols or to different formats and require some thought/optimisation to the process rather than just blindly serialising decorated properties.
I often feel it's cleaner to keep all code to do with one format in its own class. It also allows you to add more formats just by adding a new class.
eg.
class MyClass
{
}
class JSONWriter
{
public void Save(MyClass o);
public MyClass Load();
}
class BinaryWriter
{
public void Save(MyClass o);
public MyClass Load();
}
class DataBaseSerialiser
{
public void Save(MyClass o);
public MyClass Load();
}
//etc
However, this often means that MyClass has to expose a lot more of its internals to the outside world in order for other classes to serialise effectively. This feels wrong, and goes against encapsulation. There are ways around it. eg in C++ you could make the serialiser a friend, or in C# you could expose certain members as an explicit interface, but it still doesn't feel great.
The other option of course, is to have MyClass know how to serialize itself to/from various formats:
class MyClass
{
public void LoadFromJSON(Stream stream);
public void LoadFromBinary(Stream stream);
public void SaveToJSON(Stream stream);
public void SaveToBinary(Stream stream);
//etc
}
This feels more encapsulated and correct, but it couples the formatting to the object. What if some external class knows how to serialise more efficiently because of some context that MyClass doesn't know about? (Maybe a whole bunch of MyClass objects are referencing the same internal object, so an external serialiser could optimise by only serialising that once). Additionally if you want a new format, you have to add support in all your objects, rather than just writing a new class.
Any thoughts? Personally I have used both methods depending on the exact needs of the project, but I just wondered if anyone had some strong reasons for or against a particular method?
The most flexible pattern is to keep the objects lightweight and use separate classes for specific types of serialization.
Imagine the situation if you were required to add another 3 types of data serialization. Your classes would become quickly bloated with code they do not care about. "Objects should not know how they are consumed"
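A minimal sketch of that separation, assuming a hypothetical Reading type and a small IReadingSerializer contract (both invented for this example). Each format lives in its own class, and the domain type carries no format-specific code. The binary layout is simply a length-prefixed string followed by a double.

```csharp
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical domain type; it knows nothing about any storage format.
public class Reading
{
    public string Sensor { get; set; }
    public double Value { get; set; }
}

// One small contract that every format-specific class implements.
public interface IReadingSerializer
{
    void Save(Reading reading, Stream stream);
    Reading Load(Stream stream);
}

public class XmlReadingSerializer : IReadingSerializer
{
    private readonly XmlSerializer serializer = new XmlSerializer(typeof(Reading));

    public void Save(Reading reading, Stream stream)
    {
        serializer.Serialize(stream, reading);
    }

    public Reading Load(Stream stream)
    {
        return (Reading)serializer.Deserialize(stream);
    }
}

public class BinaryReadingSerializer : IReadingSerializer
{
    public void Save(Reading reading, Stream stream)
    {
        // leaveOpen: true, so the caller keeps ownership of the stream.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
        {
            writer.Write(reading.Sensor ?? string.Empty);
            writer.Write(reading.Value);
        }
    }

    public Reading Load(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            // Read in the same order the values were written.
            return new Reading { Sensor = reader.ReadString(), Value = reader.ReadDouble() };
        }
    }
}
```

Adding a third format then means adding a third class; Reading itself does not change. The cost, as the question notes, is that everything a serializer needs must be reachable from outside the type.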
I guess it really depends on the context in which serialization will be used and also on the limitations of the systems using it. For example, due to Silverlight reflection limitations, some class properties need to be exposed in order for serializers to work. Another one: WCF serializers require you to declare the possible runtime types up front.
Apart from what you pointed out, putting serialization logic into the class violates SRP. Why would a class need to know how to "translate" itself to another format?
Different solutions are required in different situations, but I've mostly seen separate serializer classes doing the work. Sure, it requires exposing some parts of the class internals, but in some cases you'll have to do that anyway.

c# Public Nested Classes or Better Option?

I have a control circuit which has multiple settings and may have any number of sensors attached to it (each with its own set of settings). These sensors may only be used with the control circuit. I thought of using nested classes like so:
public class ControlCircuitLib
{
// Fields.
private Settings controllerSettings;
private List<Sensor> attachedSensors;
// Properties.
public Settings ControllerSettings
{ get { return this.controllerSettings; } }
public List<Sensor> AttachedSensors
{ get { return this.attachedSensors; } }
// Constructors, methods, etc.
...
// Nested classes.
public class Settings
{
// Fields.
private ControlCircuitLib controllerCircuit;
private SerialPort controllerSerialPort;
private int activeOutputs;
... (many, many more settings)
// Properties.
public int ActiveOutputs
{ get { return this.activeOutputs; } }
... (the other Get properties for the settings)
// Methods.
... (method to set the circuit properties through the serial port)
}
public class Sensor
{
// Enumerations.
public enum MeasurementTypes { Displacement, Velocity, Acceleration };
// Fields.
private ControlCircuitLib controllerCircuit;
private string sensorName;
private MeasurementTypes measurementType;
private double requiredInputVoltage;
... (many, many more settings)
// Properties.
public string SensorName {...}
... (Get properties)
// Methods.
... (methods to set the sensor settings while attached to the control circuit)
}
}
I have read that public nested classes are a "no-no" but that there are exceptions. Is this structure OK or is there a better option?
Thanks!
EDIT
Below is a crude hierarchy of the control circuit for which I am trying to write a library class; I used code formatting to prevent text-wrap.
Control Circuit (com. via serial port) -> Attached Sensors (up to 10) -> Sensor Settings (approx. 10 settings per sensor)
                                       -> Basic Controller Settings (approx. 20 settings)
                                       -> Output Settings (approx. 30 settings)
                                       -> Common Settings (approx. 30 settings)
                                       -> Environment Settings (approx. 10 settings)
All of the settings are set through the controller but I would like an organized library instead of just cramming all ~100 methods, properties, and settings under one Controller class. It would be HUGELY appreciated if someone could offer a short example outlining the structure they would use. Thanks!
The contents of a class should be the implementation details of that class. Are the nested classes implementation details of the outer class, or are you merely using the outer class as a convenient name scoping and discovery mechanism?
If the former, then you shouldn't be making the private implementation details publicly available. Make them private if they are implementation details of the class.
If the latter, then you should be using namespaces, not outer classes, as your scoping and discovery mechanism.
Either way, public nested classes are a bad code smell. I'd want to have a very good reason to expose a nested class.
I don't have too much problem with public nested classes (I'm not a fan of dogmatic rules, in general) but have you considered putting all of these types in their own namespace instead? That's the more common way of grouping classes together.
EDIT: Just to clarify, I would very rarely use public nested classes, and I probably wouldn't use them here, but I wouldn't completely balk at them either. There are plenty of examples of public nested types in the framework (e.g. List<T>.Enumerator) - no doubt in each case the designers considered the "smell" of using a nested class, and considered it to be less of a smell than promoting the type to be a top-level one, or creating a new namespace for the types involved.
From your comment to Eric's answer:
These sensors can ONLY be used with a specific circuit
This kind of relationship is commonly known as a dependency. The Sensor constructor should take a ControlCircuit as a parameter. Nested classes do not convey this relationship.
and you can't get/set any sensor settings without going through the controller circuit;
I think that means that all Sensor properties will delegate to (call) or somehow inform (fire an event on) the ControlCircuit when they're used. Or, you'd have some kind of internal interface to the sensor that only the control circuit uses, making Sensor an opaque class to the outside world. If that's the case, Sensor is just an implementation detail and could be nested private or internal (there's also no need to "save" a sensor instance if you can't do anything with it).
Also, I don't even want to expose a Sensor constructor (the controller will have a method for this)
The fact that the Sensor constructor now takes a control circuit is enough of a hint as to what depends on what that you could leave the constructor public. You can also make it internal.
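One way this could look, as a sketch only: the types sit side by side in a namespace, Sensor takes its circuit in an internal constructor, and the controller offers a factory method. The namespace ControlCircuits, the AttachSensor and Send methods, the LastCommand property and the command text are all assumptions made for illustration; Send merely stands in for the real serial-port traffic.

```csharp
using System;
using System.Collections.Generic;

namespace ControlCircuits // hypothetical namespace, used instead of an outer class
{
    public class ControlCircuit
    {
        private readonly List<Sensor> sensors = new List<Sensor>();

        public IReadOnlyList<Sensor> Sensors { get { return sensors; } }

        // Recorded so the sketch can be observed without real hardware.
        public string LastCommand { get; private set; }

        // The controller is the only place a Sensor gets created.
        public Sensor AttachSensor(string name)
        {
            var sensor = new Sensor(this, name);
            sensors.Add(sensor);
            return sensor;
        }

        // Stand-in for writing a command to the serial port.
        internal void Send(string command)
        {
            LastCommand = command;
        }
    }

    public class Sensor
    {
        private readonly ControlCircuit circuit;
        private double inputVoltage;

        // internal: code outside this assembly cannot construct a Sensor directly.
        internal Sensor(ControlCircuit circuit, string name)
        {
            if (circuit == null) throw new ArgumentNullException("circuit");
            this.circuit = circuit;
            Name = name;
        }

        public string Name { get; private set; }

        public double InputVoltage
        {
            get { return inputVoltage; }
            set
            {
                // Every change goes through the owning circuit.
                circuit.Send(Name + ":VIN=" + value);
                inputVoltage = value;
            }
        }
    }
}
```

Callers would write circuit.AttachSensor("s1") and then set properties on the returned sensor; the dependency on the circuit is visible in the constructor signature rather than implied by nesting.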
A general comment that I have is that this design is very coupled. Maybe if you had some interfaces between control circuit, sensor and settings, it would be easier to understand each component independently, and the design would be more testable. I always find it beneficial to make the roles that each component plays explicit. That is, if they're not just implementation details.
I would say the better option is moving those nested classes out of the class they're in and having them stand on their own. Unless I'm missing something, you appear to have them in the main class only as some sort of scoping mechanism, but really, that's what namespaces are for.
I generally disagree with Eric on this.
The thing I usually consider is: how often should the end user use the type name ControlCircuitLib.Sensor. If it's "almost never, but the type needs to be public so that doing something is possible", then go for inner types. For anything else, use a separate type.
For example,
public class Frobber {
public readonly FrobType Standard = ...;
public readonly FrobType Advanced = ...;
public void Frob(FrobType type) { ... }
public class FrobType { ... }
}
In this example, the FrobType only acts as an opaque 'thing'. Only Frobber needs to know what it actually is, although it needs to be possible to pass it around outside that class. However, this sort of example is quite rare; more often than not, you should prefer to avoid nested public classes.
One of the most important things when designing a library is to keep it simple. So use whichever way makes the library and the using code simpler.
I like nested classes in cases like this because it shows the relationship. If you do not want users of the outer class to be able to create items of the inner class separately from the outer class, you can always hide the constructors and use factory methods in the outer class to create elements of the inner class. I use this structure a lot.
This structure seems completely reasonable to me. I wasn't aware until today that Microsoft has advised against doing this, but I'm still not aware why they've advised as such.
I've used this structure in a situation where the nested class only exists to support the containing class (i.e. it's part of its implementation), but other classes need to be able to see it in order to interact with the containing class (i.e. it's part of the class's API).
That being said, Eric generally knows what he's talking about, so out of respect for his knowledge and for the time being, I've converted those classes to use namespaces instead.
Currently, I'm not liking the results. I have a class named BasicColumn, which exists only to represent a column in a class called Grid. Previously, that class was always addressed as Grid.BasicColumn, which was great. That's exactly how I want it to be referred to. Now, with the Grid and the BasicColumn both in the Grids namespace, it's just referred to as BasicColumn with a 'using Grids' at the top of the file. There's nothing to indicate its special relationship with Grid, unless I want to type out the entire namespace (which has a few prefixes before Grid I've left out for simplicity).
If anyone can point out an actual reason why using public nested classes is somehow counterproductive or suboptimal, other than the irrelevant fact that Microsoft doesn't intend for them to be used that way, then I'd love to hear it.
While I feel Eric's answer is correct, it is important to realize it doesn't really fully address what your situation is.
Your case sounds very similar to one I frequently find myself in where you have a class which is really implementation details of another class, however, some details or functionality of that sub-component naturally lend themselves towards being exposed directly for some minor aspects that are not governed by the parent.
In these cases, what you can do is use interfaces. The nested classes need not be public as they really are internal details of the class they are nested within, but a subset of functionality (an interface) needs to be made publicly available and can be implemented by the class.
This allows for construction of the internal structures to be controlled by the class they are nested within while still allowing direct access to the type from a namespace for external references. (The caller will use SomeNamespace.IChildApi as the name rather than SomeNamespace.NestingClass.NestedClass.SecondNestedClass.ThirdNestedClass, etc.)
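A small sketch of that arrangement, borrowing the Grid/column idea from the earlier answer. GridSketch, IColumn, AddColumn and the default width are names and values invented here, not anything from the posts above.

```csharp
using System.Collections.Generic;

namespace GridSketch // hypothetical namespace
{
    // The only part of the nested type that callers ever see.
    public interface IColumn
    {
        string Header { get; }
        int Width { get; set; }
    }

    public class Grid
    {
        private readonly List<Column> columns = new List<Column>();

        public int ColumnCount { get { return columns.Count; } }

        // Construction stays under the control of the outer class.
        public IColumn AddColumn(string header)
        {
            var column = new Column(header);
            columns.Add(column);
            return column;
        }

        // Private nested class: an implementation detail of Grid.
        private sealed class Column : IColumn
        {
            public Column(string header)
            {
                Header = header;
                Width = 100; // arbitrary default for the sketch
            }

            public string Header { get; private set; }
            public int Width { get; set; }
        }
    }
}
```

Outside code refers to GridSketch.IColumn and can never name or construct the nested Column type, while Grid remains free to use Column's internals.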

Difference in using Attributes/Interfaces in C#

This is not properly a question but something more like a thought I had recently.
I'm taking XmlAttribute and XmlSerializer as an example: you can apply attributes to a class to choose which properties should be serialized, but the same thing can be done quite easy by implementing a theoretical interface IXmlSerializable (something similar does exist, I don't remember the details) and by overriding a method "Serialize" for that class which just calls Serialize on the properties you want to serialize (this.myProp1.Serialize()), and the same for Deserialize.
So what I'm basically saying: isn't the attribute approach a bit redundant? (I like it actually, but I don't find it logically different from an interface)
Thanks for any answer, as I've said this is just a thought... hopefully someone will find it interesting
Update 1: Well, I explained myself badly. What I'm asking is "why should I choose an attribute instead of an interface (or the opposite)", not this specific case (I took serialization because it was the first thing that popped into my mind). By the way, thanks for your answers, they are very interesting.
From the comments and downvote, maybe I should highlight my main point here: something that can save me hours of work (per type) and horrible code complexity is very much not redundant, but very, very welcome.
"quite easy"? OK; I'm pretty experienced at serialization, but the implementation for that is not what I call easy. Quite the contrary, in fact.
If you don't want to use attributes, there is an overload for XmlSerializer that allows you to configure it at runtime.
But I shudder whenever I hear "implement IXmlSerializable". The attribute approach is very quick and easy:
[XmlRoot("foo"), XmlType("foo")]
[XmlInclude(typeof(SuperFoo))]
public class Foo {
public string X {get;set;}
[XmlAttribute("y")]
public int? Y {get;set;}
[XmlElement("item")]
public List<string> Items {get;set;}
}
public class SuperFoo : Foo {}
I challenge you to write a robust implementation of IXmlSerializable for that very simple example in under 2 hours... and remember that every line you write is a line you have to maintain.
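For comparison, here is roughly what the attribute approach costs end to end. This is a sketch with a hypothetical OrderSketch type rather than the Foo class above; it uses a plain int for the attribute because, as far as I know, XmlSerializer does not accept nullable value types on [XmlAttribute] members.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type; the attributes are the whole "serialization code".
[XmlRoot("order")]
public class OrderSketch
{
    public OrderSketch()
    {
        Items = new List<string>();
    }

    public string Customer { get; set; }

    [XmlAttribute("id")]
    public int Id { get; set; }

    [XmlElement("item")]
    public List<string> Items { get; set; }
}

public static class XmlSketch
{
    // Serializes to an XML string.
    public static string ToXml(OrderSketch value)
    {
        var serializer = new XmlSerializer(typeof(OrderSketch));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Reads an instance back from an XML string.
    public static OrderSketch FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(OrderSketch));
        using (var reader = new StringReader(xml))
        {
            return (OrderSketch)serializer.Deserialize(reader);
        }
    }
}
```

The type has no Read or Write methods of its own; renaming an element or turning a member into an attribute is a one-line change to the metadata.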
Well, from the best I can tell, they are logically different.
Implementing IXmlSerializable directly impacts the class itself, because you are adding an interface and one or more methods to the implementation of the class. In essence, you are making your own class directly responsible for its serialization.
However, adding the XmlAttribute attributes does not directly impact the functionality of the class; instead you are just decorating it with attributes so that XmlSerializer can carry out the actual serialization functionality. In this case, you are deferring the serialization to the XmlSerializer class, and providing just enough metadata about your class for XmlSerializer to do its work.
This is why I prefer the latter attribute approach. When I'm writing a class, I want it to be serializable, but the last thing I care about is the specifics of the implementation, so I always start with that approach and 99% of the time it works fine with very little work. However, if you do need more fine-grained control over the serialization, then implement the IXmlSerializable interface and write your own serialization code.
The programmatic method of implementing the interface may give a bit more control (and likely more speed), but is harder to create and maintain than the attribute method. I mostly use attributes.
You can select properties to (not) serialize with attributes. Implementation of interface is serialization by code.

Why can't the 'NonSerialized' attribute be used at the class level? How to prevent serialization of a class?

I have a data object that is deep-cloned using a binary serialization. This data object supports property changed events, for example, PriceChanged.
Let's say I attached a handler to PriceChanged. When the code attempts to serialize PriceChanged, it throws an exception that the handler isn't marked as serializable.
My alternatives:
I can't easily remove all handlers from the event before serialization
I don't want to mark the handler as serializable because I'd have to recursively mark all the handlers dependencies as well.
I don't want to mark PriceChanged as NonSerialized - there are tens of events like this that could potentially have handlers. EDIT: Another reason why I can't do this is because the data classes (and hence the events) are generated and I don't have direct control over the generation code. Ideally, the generation code would just mark all events as NonSerialized.
Ideally, I'd like .NET to just stop going down the object graph at that point and make that a 'leaf'. So why doesn't .NET allow an entire class to be marked as NonSerialized?
--
I finally worked around this problem by making the handler implement ISerializable and doing nothing in the serialization constructor / GetObjectData method. But the handler is still serialized, just with all its dependencies set to null - so I had to account for that as well.
Is there a better way to prevent serialization of an entire class? That is, one that doesn't require accounting for the null dependencies?
While I tend to disagree with the approach (I would simply mark the events as NonSerialized, regardless of how many there are) you could probably do this using serialization surrogates.
The idea is that you create an object that implements ISerializationSurrogate and basically does what you're already doing - nothing in the GetObjectData and SetObjectData methods. The difference is that you would be customizing the serialization of the delegate, not the class containing it.
Something like:
class DelegateSerializationSurrogate : ISerializationSurrogate {
public void GetObjectData(object obj, SerializationInfo info, StreamingContext context) {
// do nothing
}
public object SetObjectData(object obj, SerializationInfo info, StreamingContext context) {
// do nothing
return null;
}
}
Then you register this with the formatter using the procedures outlined in this MSDN column. Then whenever the formatter encounters a delegate, it uses the surrogate instead of serializing the delegate directly.
...there are tens of events...
Personally, then I'd just add the non-serialized markers, which for field-like events is most easily done via:
[field: NonSerialized]
public event SomeEventType SomeEventName;
(you don't need to add a manual backing delegate)
What are your serialization requirements exactly? BinaryFormatter is in many ways the least friendly of the serializers; the implications on events are a bit ugly, and it is very brittle if stored (IMO it is only really suitable for transport, not for storage).
However; there are plenty of good alternatives that would support most common "deep clone" scenarios:
XmlSerializer (but limited to public members)
DataContractSerializer / NetDataContractSerializer
protobuf-net (which includes Serializer.DeepClone for this purpose)
(note that in most of those serialization support would require extra attributes, so not much different to adding the [NonSerialized] attributes in the first place!)
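As one example of the alternatives listed above, a deep clone through DataContractSerializer might look like the sketch below. Quote, HasSubscribers and CloneSketch are hypothetical names for this example. Because the type opts in with [DataContract], only [DataMember] members are written, so event handlers are left out without any per-event marker.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical data object with a change event.
[DataContract]
public class Quote
{
    [DataMember]
    public string Symbol { get; set; }

    [DataMember]
    public decimal Price { get; set; }

    // Not a [DataMember], so the serializer never follows the handlers.
    public event EventHandler PriceChanged;

    public bool HasSubscribers
    {
        get { return PriceChanged != null; }
    }
}

public static class CloneSketch
{
    // Round-trips the object through an in-memory stream to produce a copy.
    public static T DeepClone<T>(T source)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0;
            return (T)serializer.ReadObject(stream);
        }
    }
}
```

The clone starts with no subscribers, which is usually what a deep copy of a data object wants; whether that suits generated classes depends on whether the generator can emit the data-contract attributes.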
