How to use protobuf-net for complex objects?

We're using WPF and would like to serialize a complex object: a view model.
With BinaryFormatter, I can just add the [Serializable] attribute, and it works automatically for the entire class, recursively.
Is there something similar in protobuf-net?
Also, where is the documentation?
I learned about ProtoInclude and ProtoMember, but these are complex objects that may change.
We want to use protobuf because it is compact, fast, and portable. But I don't rule out other options if they accomplish more or less the same goals and are easy to use.
Please answer or suggest options. Thank you.

BinaryFormatter manages this by including the field names in the output, which is both verbose and brittle (for example, it won't survive changing something from a field+property to an automatically implemented property).
If you want to do something similar in protobuf-net, you can use "ImplicitFields". Note, however, that this assigns an integer key to each member alphabetically, so it is only suitable if your model is totally fixed as a contract. Adding or renaming members changes the assigned keys and breaks the contract (meaning: you can't deserialize existing data correctly). For example:
[ProtoContract(ImplicitFields = ImplicitFields.AllPublic)]
public class Foo {...} // all public fields and properties are serialized,
                       // similar to XmlSerializer
[ProtoContract(ImplicitFields = ImplicitFields.AllFields)]
public class Bar {...} // all fields (not properties; public or private)
                       // are serialized, similar to BinaryFormatter
If your contract is not totally fixed, it is preferable to explicitly assign a key to each serialized member, which can be done in a great many ways. The simplest is:
[ProtoContract]
public class Foo {
    [ProtoMember(1)]
    public int A {get;set;}
    ...
}
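To tie this back to the "complex object" part of the question, here is a minimal sketch of a round trip through protobuf-net with a nested type. The Person and Address classes and the ProtoDemo helper are hypothetical names invented for illustration; the sketch assumes the protobuf-net NuGet package is referenced. Each type in the object graph carries its own [ProtoContract], since the opt-in does not spread recursively the way [Serializable] appears to.

```csharp
using System.IO;
using ProtoBuf; // assumption: the protobuf-net NuGet package is installed

// Hypothetical model types, for illustration only.
[ProtoContract]
public class Person
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
    [ProtoMember(3)] public Address Home { get; set; } // nested contract type
}

[ProtoContract]
public class Address
{
    [ProtoMember(1)] public string City { get; set; }
}

public static class ProtoDemo
{
    // Serializes the object to memory and reads it back as a new instance.
    public static Person RoundTrip(Person input)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, input);
            stream.Position = 0; // rewind before reading
            return Serializer.Deserialize<Person>(stream);
        }
    }
}
```

Because the keys (1, 2, 3) are explicit, members can later be renamed or reordered in the source without changing the data already written, as long as each key keeps its meaning.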

Related

C# generate XML for non-serializable properties

I have a lot of classes that I would like to export as XML for other applications to consume. The properties I want to export don't have setters, and the classes don't have parameterless constructors, since I don't want that behaviour in my code. Therefore, it seems I can't use XML serialization on these classes and properties, even though I do want to export them to XML. I don't need deserialization, though, as the serialization to XML is meant to be export-only.
I have tried XML serialization, but it appears that it really only supports classes that can be used in both directions (serialization AND deserialization), which rules out my classes. https://learn.microsoft.com/en-us/dotnet/standard/serialization/introducing-xml-serialization
Obviously, I could make serializable versions of each class, but doing this by hand would require me to check manually, after every update of the original classes, that I have also updated the serializable classes. Additionally, I would need to write, for every class, the code to transform it to its serializable version.
Is there a way to use the strength of XML serialization, which takes care of all the fuss about XML, without needing the classes to be deserializable? Or do you have any other suggestions for easy ways to export XML for these classes and properties?

There's a really simple bit of guidance that applies to virtually every serializer, regardless of format (XML, JSON, protobuf, etc.), implementation details, etc.:
If your type model happens to be a 1:1 fit for the serializer, great! Use it!
Otherwise, don't try; create a completely separate model that is purely for serialization. Its shape and behavior should be exactly what the serializer needs to get the result you want. Then map between the two models (your domain model and the serialization model) as needed.
There is a caveat around "many serializers allow a custom serializer API", but in my experience it usually isn't worth the pain, and switching to a separate serialization model is a better idea. This applies especially in the case of XmlSerializer, since IXmlSerializable is virtually impossible to implement absolutely correctly by hand.
So, to be explicit here: if the problem is that your type lacks the correct constructors and property setters to work with XmlSerializer, create a new model that has those things, and just shim between them.
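As a rough sketch of that "separate model plus shim" advice, assuming an immutable domain type: Customer, CustomerXml and XmlExport are hypothetical names made up for this example and do not come from the question. The domain class keeps its constructor-only design, while the serialization class has the parameterless constructor and public setters that XmlSerializer needs.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical domain type: read-only properties, no parameterless constructor.
public sealed class Customer
{
    public Customer(string name, DateTime registered)
    {
        Name = name;
        Registered = registered;
    }
    public string Name { get; }
    public DateTime Registered { get; }
}

// Hypothetical serialization model: shaped purely for XmlSerializer.
[XmlRoot("Customer")]
public class CustomerXml
{
    public string Name { get; set; }
    public DateTime Registered { get; set; }

    // The shim: copies data from the domain model into the export model.
    public static CustomerXml From(Customer c)
    {
        return new CustomerXml { Name = c.Name, Registered = c.Registered };
    }
}

public static class XmlExport
{
    // Export-only: maps to the serialization model, then serializes that.
    public static string ToXml(Customer customer)
    {
        var serializer = new XmlSerializer(typeof(CustomerXml));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, CustomerXml.From(customer));
            return writer.ToString();
        }
    }
}
```

The mapping code is the part that has to be kept in sync by hand; a compile error in From is at least an early signal when the domain type changes.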

The DataContractSerializer can serialize classes without a default constructor.
You will have to mark the class with DataContractAttribute in order to do so. Also, the properties to serialize need setters. The setters may be private, but the setter does have to exist:
using System;
using System.IO;
using System.Runtime.Serialization;
[DataContract] // Need this to serialize classes without default constructor.
public class Person
{
    public Person(string name, DateTime dob)
    {
        this.Name = name;
        this.DateOfBirth = dob;
    }
    [DataMember] // Need this to serialize this property
    public string Name { get; private set; } // Need setter for serializer to work
    [DataMember]
    public DateTime DateOfBirth { get; private set; }
}
Usage:
var person = new Person("Jesse de Wit", new DateTime(1988, 5, 27));
var serializer = new DataContractSerializer(typeof(Person));
using (var stream = new MemoryStream())
{
    serializer.WriteObject(stream, person);
}

Binary serialization and automatic properties

I have a class like this:
public class Foo
{
    public IBar Bar {get;set;}
    //tons of other properties
}
public interface IBar
{
    //whatever
}
The class is used for binary serialization (standard use of BinaryFormatter). An implementation of IBar is marked with [Serializable] so everything works.
Now I want to stop serializing Bar while preserving backwards compatibility (it was not referenced in the code anyway).
The NonSerialized attribute seems to be enough. However, it can be applied only to fields, not to automatic properties. So I tried this:
public class Foo
{
    [NonSerialized] // the attribute is valid on fields only
    private IBar _bar;
    public IBar Bar
    {
        get { return _bar; }
        set { _bar = value; }
    }
}
Surprisingly it works well - I can deserialize both old Foos and new ones.
My question is: how can it possibly work if it is the fields that are serialized, and the automatic property's backing field is likely to have some non-C# characters in its name?
In other words:
Old Foo's IBar field name (my guess): <Bar>k__BackingField
New Foo's IBar field name: _bar
Obviously they don't match, so how does BinaryFormatter overcome this?

I think there is something strange in your example. BinaryFormatter shouldn't be able to handle this (as far as I know, unless this changed in 4.5, which I doubt), which is why it is quite dangerous to use if backwards compatibility is necessary. Are you sure the value was serialized by the old version and deserialized by the new version? Can you verify that the deserialized data matches and isn't null?
For a complete example of a program that verifies that it does not work, see:
http://www.infragistics.com/community/blogs/josh_smith/archive/2008/02/05/automatic-properties-and-the-binaryformatter.aspx
You will not see any exceptions, but the old value from the field named xyz__backingfield will be lost, and replaced in the new class by a default value.
If you want to be backwards compatible, avoid using automatic properties, or you will be in a world of trouble very soon. In fact it doesn't really matter, since the BinaryFormatter in default (automatic) mode is only really useful if you want to serialize objects and deserialize them again in the same application, for example for copy & paste or a similar operation. In that case you have no versioning issues since it will be the same code doing both the serialization and deserialization.
To make serialization backwards compatible without losing your mind, make sure you have full control of the schema. Good examples of serializers where you have a decent chance of staying out of trouble are DataContractSerializer, Json.NET or Protocol buffers (for example protobuf-net).
As a last possibility you can implement ISerializable and use the dictionary storage of BinaryFormatter, but then you have all the drawbacks of hand-rolling your serialization anyway.
On a side note, if you want to apply attributes to a backing field, try [field: AttributeType], which is useful, for example, to mark the backing fields of events as non-serialized.
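A short sketch of the field-targeted attribute syntax mentioned in the side note. The Document class and its members are hypothetical, chosen only to show the syntax. Applying the attribute to an event's hidden delegate field has long been supported; applying it to an auto-property's backing field assumes a C# 7.3 or later compiler.

```csharp
using System;

// Hypothetical type, for illustration only.
[Serializable]
public class Document
{
    // The attribute lands on the compiler-generated delegate field behind the
    // event, so subscribers are not dragged into the serialized graph.
    [field: NonSerialized]
    public event EventHandler Changed;

    // Assumes C# 7.3+: the attribute lands on the auto-property's backing field.
    [field: NonSerialized]
    public object Cache { get; set; }

    // An ordinary auto-property; its backing field is still serialized.
    public string Title { get; set; }

    public void RaiseChanged()
    {
        Changed?.Invoke(this, EventArgs.Empty);
    }
}
```

This only controls which fields are included. It does not help with the renaming problem described above, because the serialized name of an auto-property's backing field is still compiler-generated.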

Best place for serialisation code. Internal to class being serialised, or external class per format?

I often find myself in a quandary in where to put serialisation code for a class, and was wondering what others' thoughts on the subject were.
Bog standard serialisation is a no brainer. Just decorate the class in question.
My question is more for classes that get serialised over a variety of protocols or to different formats and require some thought/optimisation to the process rather than just blindly serialising decorated properties.
I often feel it's cleaner to keep all code to do with one format in its own class. It also allows you to add more formats just by adding a new class.
eg.
// Signatures only; bodies omitted
class MyClass
{
}
class JSONWriter
{
    public void Save(MyClass o);
    public MyClass Load();
}
class BinaryWriter
{
    public void Save(MyClass o);
    public MyClass Load();
}
class DataBaseSerialiser
{
    public void Save(MyClass o);
    public MyClass Load();
}
//etc
However, this often means that MyClass has to expose a lot more of its internals to the outside world in order for other classes to serialise it effectively. This feels wrong, and goes against encapsulation. There are ways around it, e.g. in C++ you could make the serialiser a friend, or in C# you could expose certain members through an explicit interface implementation, but it still doesn't feel great.
The other option of course, is to have MyClass know how to serialize itself to/from various formats:
class MyClass
{
    public void LoadFromJSON(Stream stream);
    public void LoadFromBinary(Stream stream);
    public void SaveToJSON(Stream stream);
    public void SaveToBinary(Stream stream);
    //etc
}
This feels more encapsulated and correct, but it couples the formatting to the object. What if some external class knows how to serialise more efficiently because of some context that MyClass doesn't know about? (Maybe a whole bunch of MyClass objects are referencing the same internal object, so an external serialiser could optimise by only serialising that once). Additionally if you want a new format, you have to add support in all your objects, rather than just writing a new class.
Any thoughts? Personally I have used both methods depending on the exact needs of the project, but I just wondered if anyone had some strong reasons for or against a particular method?

The most flexible pattern is to keep the objects lightweight and use separate classes for specific types of serialization.
Imagine the situation if you were required to add another 3 types of data serialization. Your classes would become quickly bloated with code they do not care about. "Objects should not know how they are consumed"

I guess it really depends on the context in which serialization will be used, and also on the limitations of the systems using it. For example, due to Silverlight's reflection limitations, some class properties need to be exposed in order for serializers to work. Another example: WCF serializers require you to declare the possible runtime types in advance.
Apart from what you pointed out, putting serialization logic into the class violates SRP. Why would a class need to know how to "translate" itself to another format?
Different solutions are required in different situations, but I've mostly seen separate serializer classes doing the work. Sure, it requires exposing some parts of the class internals, but in some cases you'll have to do that anyway.

Set DataContract and DataMember Without All the Attributes

I find the [DataContract] and [DataMember] attributes a bit messy and would rather do this with code in a config method or something. Is this possible?

You don't have to use these attributes at all. DataContractSerializer will serialize all public properties that have both a getter and a setter, but when serializing entities with navigation properties you can easily end up with an exception due to a "cyclic reference".
To avoid that exception, you must either use [DataContract(IsReference = true)] on your entity class, with DataMember on every property you want to serialize, or use IgnoreDataMember on every property you don't want to serialize.
The last and most complex option is to avoid attributes completely and use custom classes implementing IDataContractSurrogate to control serialization from outside the type.
You can also write your own completely custom serialization process, or use XML serialization or binary serialization, with all their requirements.
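A minimal sketch of the attribute-free case described above, assuming .NET Framework 3.5 SP1 or later (or modern .NET), where DataContractSerializer accepts plain public types with a public parameterless constructor. Order and PocoDemo are hypothetical names used only for this illustration.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical plain class: no [DataContract], no [DataMember].
public class Order
{
    public int Id { get; set; }
    public string Customer { get; set; }

    // Opt-out marker for a member that should be left out.
    [IgnoreDataMember]
    public string Scratch { get; set; }
}

public static class PocoDemo
{
    // Serializes the object and returns the XML as text.
    public static string ToXml(Order order)
    {
        var serializer = new DataContractSerializer(typeof(Order));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, order);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

Once [DataContract] is added to the class, the behaviour flips to opt-in and only members marked [DataMember] are written, so the two styles should not be mixed on one type.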

No, the DataContractSerializer is an opt-in serializer - you have to tell it what you want included.
With other serializers you need to use things like NonSerializedAttribute or XmlIgnoreAttribute to tell the serializer to leave things alone.

I know this is a rather old post, but I came here wondering the same thing: whether there is a way to set all the member attributes automatically on some legacy code with public fields and no getters and setters.
What makes it look just a little bit less messy is shortening the name DataMember with an alias:
using DM = System.Runtime.Serialization.DataMemberAttribute;
[DataContract]
public class SomeClass
{
    [DM] public bool IsMO;
    [DM] public string LabCode;
    [DM] public string OrderNumber;
}

Fields vs. Properties and XMLSerializers (101)

So I've been studying the use of the various serializers in the .NET Framework, and while experimenting with preventing certain members of a class from being serialized, I was thrust back to some very basic programming questions that I "thought" I knew. Given this example:
public class Example
{
    public string examName;
    [XmlIgnore]
    public int exampleNumber;
    public Example()
    { }
    [XmlIgnore]
    public int ExampleNumberTwo { get; set; }
}
I can create an instance of this class and, using the XmlSerializer, output the content of this class in XML format. The [XmlIgnore] attribute actually does what I expected; it prevents the serialization of the marked members.
So, venturing further, I replaced the [XmlIgnore] declaration for "exampleNumber" with [NonSerialized], expecting similar results, but the output did not change. After searching through resources, I found it stated that the [NonSerialized] attribute should only be used on fields and the [XmlIgnore] attribute should be used on properties.
Yet another post stated that the [NonSerialized] attribute has no effect when using the XmlSerializer but will produce the expected results when using the SOAP formatter or BinaryFormatter. So I'm lost on the concept at this point.
But this brought me to the basic question: what defines a field vs. a property? I know it's a basic question, and I've even viewed other discussions here, but I still haven't found the degree of clarity I am looking for.
I can use the [XmlIgnore] attribute on the property (ExampleNumberTwo) or the variable (exampleNumber), so the statement that it can ONLY be used on properties doesn't seem correct.
But then again, I have always referred to members in my example such as (examName) and (exampleNumber) as member variables. So what exactly is the signature of a "field"?
Can anyone shed some light on this?

The MSDN documentation supports the idea that [NonSerialized] only gives the expected results with the binary and SOAP serializers:
When using the BinaryFormatter or SoapFormatter classes to serialize
an object, use the NonSerializedAttribute attribute to prevent a field
from being serialized. For example, you can use this attribute to
prevent the serialization of sensitive data.
The target objects for the NonSerializedAttribute attribute are public
and private fields of a serializable class. By default, classes are
not serializable unless they are marked with SerializableAttribute.
During the serialization process all the public and private fields of
a class are serialized by default. Fields marked with
NonSerializedAttribute are excluded during serialization. If you are
using the XmlSerializer class to serialize an object, use the
XmlIgnoreAttribute class to get the same functionality. Alternatively,
implement the ISerializable interface to explicitly control the
serialization process. Note that classes that implement ISerializable
must still be marked with SerializableAttribute.
In terms of "field" vs. "property", fields are straight data variables contained by a class. Properties are actually specially named methods on the class (get_PropName() and set_PropName()). In your code, the compiler allows you to use properties the same way you would use a field, and then inserts the appropriate get/set call for you.
Oftentimes, properties will be simple wrappers around a field:
private int myField;
public int MyProperty
{
    get { return myField; }
    set { myField = value; }
}
But they don't have to be:
public DateTime TodaysDate
{
    get { return DateTime.Today; } // computed on each access; no backing field
}
In general, you want all your fields to be private, since they're supposed to be implementation details. Any simple data that you'd like to expose should be done via a property, since you can easily surround the data access with (changeable) logic.

In C#, the short answer is that properties have get and/or set methods, while fields do not. VB.NET makes it a little more evident by requiring the "Property" qualifier to be used to differentiate one.
With C#, you can just append " { get; set; }" to the end of a field's definition and it's now a property.
Where this really comes into play is in reflection. Fields and Properties are segregated from one another into different enumerable collections.
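The reflection point can be sketched as follows, reusing a trimmed version of the Example class from the question. MemberKinds is a hypothetical helper written for this illustration. It lists fields and properties separately, and also shows the hidden backing field that the compiler generates for an automatic property.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class Example
{
    public string examName;                    // a field
    public int ExampleNumberTwo { get; set; }  // a property (get/set methods)
}

// Hypothetical helper, for illustration only.
public static class MemberKinds
{
    // Names of the public instance fields of a type.
    public static string[] PublicFields(Type t)
    {
        return t.GetFields(BindingFlags.Public | BindingFlags.Instance)
                .Select(f => f.Name).ToArray();
    }

    // Names of the public instance properties of a type.
    public static string[] PublicProperties(Type t)
    {
        return t.GetProperties(BindingFlags.Public | BindingFlags.Instance)
                .Select(p => p.Name).ToArray();
    }

    // Names of all instance fields, including compiler-generated private ones.
    public static string[] AllFields(Type t)
    {
        return t.GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance)
                .Select(f => f.Name).ToArray();
    }
}
```

With these definitions, PublicFields(typeof(Example)) contains only examName, PublicProperties(typeof(Example)) contains only ExampleNumberTwo, and AllFields additionally reports the generated field <ExampleNumberTwo>k__BackingField. That split is why field-based serializers such as BinaryFormatter and property-based ones such as XmlSerializer respond to different attributes.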

This answer to What are the differences between the XmlSerializer and BinaryFormatter will help you get started in the right direction.
