We have been using BinarySerialization with our C# app, but the size and complexity of the classes that need to be serialized result in slow (de)serialization and large files.
We suspect that we should just write our own custom serializers; but protobuf-net claims significant speed and size advantages over standard .Net binary serialization, and may be easier to add to our app than a large number of bespoke serializers.
Before spending significant time and effort trying to get it to work for us, I would love to know whether there are any deal-breakers. We are using properties defined with interfaces, generic lists of abstract sub-classes, custom bit-flag enums, and so on. What would stop protobuf-net from working for us?
protobuf-net does what it can to adhere to the core protobuf spec, and then some (for example, it includes inheritance), however:
v1 is not very good at interface-based properties (e.g. ICustomer); I'm working on getting this improved in v2
v1 likes there to be a parameterless constructor (this requirement is lifted in v2)
you need to tell it how to map the model to fields; in v1 this needs to be decorated on the type (or there is an option to infer some things from the names etc); in v2 this can be done externally
in v1, flags enums are a pain; in v2 there is an option to pass enums through as raw integers, making it much more suitable for flags
abstracts and inheritance are fine, but you must be able to determine all the concrete types ahead of time (to map them to integer keys); see the sketch after this list
generics should be fine
jagged arrays / nested lists without intermediate types aren't OK - you can shim this by introducing an intermediate type in the middle
not all core types have inbuilt support (the new date/time offset types, for example); in "v2" you can introduce your own shims for this if necessary
it is a tree serializer, not a graph serializer; I have some thoughts there, but nothing implemented yet
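To make the inheritance point concrete, here is a minimal sketch (the Party/Customer/Supplier types are made up for illustration) of how the base type declares its known sub-types and maps them to integer keys via [ProtoInclude]:

using System.IO;
using ProtoBuf;

// Hypothetical model: each known sub-type is declared on the base,
// mapped to a unique, stable field number.
[ProtoContract]
[ProtoInclude(10, typeof(Customer))]
[ProtoInclude(11, typeof(Supplier))]
public abstract class Party
{
    [ProtoMember(1)]
    public string Name { get; set; }
}

[ProtoContract]
public class Customer : Party
{
    [ProtoMember(1)]
    public int CreditLimit { get; set; }
}

[ProtoContract]
public class Supplier : Party
{
    [ProtoMember(1)]
    public string Region { get; set; }
}

public static class PartyDemo
{
    public static void Save(Party party, string path)
    {
        using (var file = File.Create(path))
        {
            // Serializing via the base type is fine; the [ProtoInclude]
            // keys tell protobuf-net which concrete type was passed.
            Serializer.Serialize(file, party);
        }
    }
}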
If there is some limited example of what you want to serialize, I'll happily take a look to see if it is likely to work (I'm the author).
It's not appropriate when you have to interact with existing software / an existing standard. For example, you can't use it to communicate with an SMTP server.
Please read this blog post about protobuf-net; to quote:
What’s the catch?
For the most part, that’s it. WCF will use protobuf-net for any suitable
objects (data-contracts etc). Note that this is a coarser brush than the
per-operation control, though (you could always split the interface into
different endpoints, of course).
Also, protobuf-net does have some subtle differences (especially regarding empty
objects), so run your unit tests etc.
Note that it only works on the full-fat WCF; it won’t help Silverlight etc, since
it lacks the extension features – but that isn’t new here.
Finally, the resolver in WCF is a pain, and AFAIK wants the full assembly details
including version number; so one more thing to maintain when you get new versions.
If anyone knows how to get around this, please let me know.
First of all excuse me if this is a noob question - but I'm new to protobuf-net.
I noticed some people use TypeModel.Create() when serializing with protobuf-net, while others just call Serializer directly (meaning using the default singleton RuntimeTypeModel.Default).
What is the difference? I would assume if I reuse the same RuntimeTypeModel.Default all the time, I'd get some performance benefits, but what do I give up in exchange?
If I already know the Type of my object when I invoke the serialization, which approach is better?
Thanks
OK; the methods on Serializer.* now work primarily as a shortcut to RuntimeTypeModel.Default.*. There are three reasons that they still exist:
convenience
lots of existing example code
v1 API compatibility
Most people will only ever need a single model. However, the system supports parallel models with different configurations if you need that. In most usage that is unlikely; it does, however, make the system vastly more testable, as I can reset the entire system simply by using a different model instance. So: reasons that you can (if you want, which most people won't) have multiple model instances:
testing, mainly me :)
migrating between different layouts / versions
As an aside, the TypeModel API is also exposed if you are using the "precompile" feature (targeted mainly at things like phone devices). This generates an assembly with a custom model type, usable via:
var serializer = new MyCustomSerializer();
where MyCustomSerializer : TypeModel - although in this case it won't be a RuntimeTypeModel.
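To see the difference in code, here is a minimal sketch (the Order type is made up for illustration): Serializer.* is the shortcut onto RuntimeTypeModel.Default, while TypeModel.Create() gives you a separate, independently configured model:

using System.IO;
using ProtoBuf;
using ProtoBuf.Meta;

// Hypothetical type, used for illustration only.
[ProtoContract]
public class Order
{
    [ProtoMember(1)]
    public int Id { get; set; }
}

public static class ModelDemo
{
    public static void Run()
    {
        var order = new Order { Id = 123 };

        // 1: the convenience shortcut; internally this uses RuntimeTypeModel.Default.
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, order);
        }

        // 2: a separate, explicitly created model (useful for tests, or for
        // running two different layouts side by side).
        RuntimeTypeModel model = TypeModel.Create();
        using (var ms = new MemoryStream())
        {
            model.Serialize(ms, order);
        }
    }
}

For the common case of a single layout, sticking with the Serializer.* shortcut is fine; a separate model only earns its keep when you genuinely need more than one configuration.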
I am looking to build a small application to talk with a ruby msgpack server in C#. My only holdup so far is that the API behind the server is expecting to pull out a ruby hash. Can I use a simple dictionary/key-value pair type in C#? If not, what would you suggest?
I will be using the library mentioned on the msgpack website (http://wiki.msgpack.org/display/MSGPACK/QuickStart+for+C+Sharp). However, it only seems to support primitive types? I have tried to go the IronRuby route, but there is a very crippling bug in Mono that prevents you from using it: https://bugzilla.xamarin.com/show_bug.cgi?id=2770
It is normal that different parts of a system are built using different technology stacks. Because these parts should be able to talk to each other (one way or another), it is important to specify contracts between subsystems.
It is really important to think about these contracts first, as these parts of your system (subsystems) can be (and will be, no doubt) subject to change (due to evolving business logic, bug fixes, etc.).
By having these contracts you allow subsystems to be changed independently without impacting all their "clients" (other subsystems). Otherwise you will end up with the "I need to fix this, but it may affect tons of places I don't even know about" syndrome.
Well, as long as you honour the contract you can do whatever you want within the given subsystem, which is just heaven! :)
This means that instead of "pulling out the ruby hash" you normally want to define a platform-agnostic contract, expressed in terms of the business logic of your application. This contract can then be consumed by any other subsystem written in any technology.
It also means that instead of just passing some data between subsystems you want to pass objects. These objects not only contain the data you want to pass, but also describe that data, give it some meaning. By this "description" I mean the object type, property names, etc. Objects are self-descriptive, you know.
You may declare the contract for your ruby subsystem saying "I accept these queries and I return these results". Both the query (method) and the result (object) should be formulated in terms of the business logic of that subsystem. For example, a GetProducts contract should probably return a list of Product objects, not some meaningless "ruby hashes". Then all the consumers will know what the contract is and what to expect.
You can make it a standard then, saying "between subsystems all the objects passed are serialized to JSON (or XML)", which is more than trivial in Ruby, C# or any other language, as well as truly platform-agnostic.
Therefore, back to your question: you normally just have no such problems in your life as translating Ruby types into .NET types using buggy libraries, or doing similarly crazy things :)
Simply defining contracts and standardizing the transport (JSON?) helps you in many ways, starting from getting rid of this problem and going all the way through to having a clean and easily maintainable system.
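As a minimal sketch of what such a contract might look like on the C# side (the Product type is made up for illustration, and Json.NET is just one option; any JSON library would do):

using Newtonsoft.Json;

// Hypothetical contract type; the C# and Ruby sides agree on these
// property names, not on any shared code.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class ContractDemo
{
    public static string ToJson(Product product)
    {
        // On the wire this is plain JSON, e.g. {"Id":1,"Name":"Widget","Price":9.99},
        // which the Ruby side can parse with its standard JSON support.
        return JsonConvert.SerializeObject(product);
    }
}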
I have a system built around protobuf.net, the system exposes an abstract class (foo) which I expect the end user to implement. The abstract class is serialisable by protobuf.net. Currently, when I try to serialise an implementation of foo, I get an error:
Unexpected type found during serialization; types must be included with ProtoIncludeAttribute; found bar passed as foo
This makes sense; I haven't told the system about bar, so when I pass a bar as a foo it gets confused. Is there a neat way to set things up such that it's simple for the programmer using my library to do things (preferably just marking fields as serialisable, like normal protobuf.net usage)?
Edit: Obviously, I cannot use ProtoInclude, as that requires modifying the source code of the base library.
In v1, the base will have to be decorated to know about the children. In v2 this restriction is removed; you can create a model at runtime and define everything you want. It can still read attributes too, this is all side-by-side (you can use different approaches on different types if you like).
You might, however, choose to hide the RuntimeTypeModel details away behind your own API if the caller doesn't want to know any gory details.
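For example, a rough sketch of what that v2 runtime configuration might look like (using the foo/bar names from the question; the field number and helper methods are illustrative only). The registration just needs to run once, before anything is serialized:

using System.IO;
using ProtoBuf;
using ProtoBuf.Meta;

[ProtoContract]
public abstract class Foo
{
    [ProtoMember(1)]
    public int Id { get; set; }
}

// The end user's type; note there is no [ProtoInclude] on Foo.
[ProtoContract]
public class Bar : Foo
{
    [ProtoMember(1)]
    public string Name { get; set; }
}

public static class FooModel
{
    public static void Register()
    {
        // Tell the default model about the sub-type at runtime; the field
        // number (10) just needs to be unique and stable for this base type.
        RuntimeTypeModel.Default[typeof(Foo)].AddSubType(10, typeof(Bar));
    }

    public static byte[] Save(Foo foo)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, foo);
            return ms.ToArray();
        }
    }
}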
v2 is available to build from the trunk, and pretty much stable - there are some TODO items, though - mainly edge cases that need completing for full compatibility. Most people will not see these cases.
I am looking for a serializer that will match my requirements.
The serializer can be in the .NET Framework, open-source, or a paid product (as long as it can be used directly from code).
Now, my requirements are these:
mandatory
Able to handle cyclic references.
Automatic: uses either attributes or inheritance in the target class, and then simply writes to the file.
Positive filtering, meaning that in the target class the fields are marked as what to serialize, and not what not to serialize (like [DataMember] in DataContractSerializer and not like [XmlIgnore] in XmlSerializer).
Must use a default constructor.
Supports polymorphism (no things like 'KnownTypes' in DataContractSerializer).
preferable
Generates files that are as lightweight as possible.
Serializes as fast as possible.
Works on non-public fields.
I checked most of the .NET serializers and tried to find more online, and came up short:
all of them either do not support cyclic references or polymorphism, or do not use any constructor.
So right now I'm pretty much out of ideas, and I would be glad for some help.
Thank you!
The closest in the BCL is BinaryFormatter, but it is not interoperable.
I would look at Google's Protocol Buffers. They are available for a wide range of languages: C++, Java, Python, and .NET (C#).
The problem with BinaryFormatter is that it uses negative filtering (marking the fields not to serialize) and that it does not use a constructor.
As for Google's Protocol Buffers (protobuf): I had a chance to work with it, and it is very complicated and can hardly be described as automatic.
I would like to know whether the usage of Attributes in .Net, specifically C#, is expensive, and why or why not?
I am asking about C# specifically, unless there is no difference between the different .Net languages (because the base class libraries are the same?).
All the newer .NET technologies make extensive use of attributes, such as LINQ to SQL, ASP.NET MVC, WCF, Enterprise Library, etc., and I was wondering what effect this has on performance. A lot of classes get automatically decorated with certain attributes, or these attributes are required for certain functionality/features.
Does the question of expense depend on implementation specific details? How are Attributes compiled to IL? Are they cached automatically, or is this up to the implementor?
"The usage of attributes" is too vague. Fetching the attributes is a reflection operation effectively - you wouldn't want to regularly do it in a loop - but they're not expensive to include in the metadata, and the typical usage pattern (IMO) is to build some other representation (e.g. an in-memory schema) after reading the attributes once.
There may well be some caching involved, but I'd probably cache the other representation anyway. For example, if I were decorating enum values with descriptions, I'd generally fetch the attributes once to build a string to enum dictionary (or vice versa).
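As a rough sketch of that pattern (the Status enum and its descriptions are made up for illustration): read the attributes once via reflection, build a dictionary, and pay only a lookup afterwards:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Reflection;

public enum Status
{
    [Description("Awaiting payment")]
    Pending,
    [Description("Shipped to customer")]
    Dispatched
}

public static class EnumDescriptions
{
    // Built once via reflection, then reused for every call.
    private static readonly Dictionary<Status, string> Cache = Build();

    private static Dictionary<Status, string> Build()
    {
        var map = new Dictionary<Status, string>();
        foreach (var field in typeof(Status).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            var attr = (DescriptionAttribute)Attribute.GetCustomAttribute(field, typeof(DescriptionAttribute));
            map[(Status)field.GetValue(null)] = attr != null ? attr.Description : field.Name;
        }
        return map;
    }

    public static string Describe(Status value)
    {
        return Cache[value];   // no reflection on this path
    }
}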
It depends on how you use them... Some attributes are just for information purpose (ObsoleteAttribute for instance), so they don't have any impact on runtime performance. Other attributes are used by the compiler (like DllImportAttribute) or by post-compilers like PostSharp, so the cost is at compile time, not run-time. However, if you use reflection to inspect attributes at runtime, it can be expensive.