How can I show all elements in a protocol buffer message?
Do I need to use reflection or convert the message into an XML message and then show it?
Ideally some generic code that will work for any message.
Lars
A protobuf message is internally ambiguous unless you have the .proto schema (or can infer a schema) available, as (for example) a "string" wire-type could represent:
a utf-8 string
a BLOB
a sub-message
a packed array
Similar ambiguity exists for all wire-types (except perhaps "groups").
My recommendation would be to run it through your existing deserialization process (against the type-model that you presumably have available in the project) to get an object model suitable for inspection. From the object-model you have all the usual options - reflection, serialization via XmlSerializer / JavaScriptSerializer, etc.
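For illustration, here is a minimal sketch of that approach, assuming protobuf-net and a hypothetical MyMessage contract type (substitute whatever type-model your project already has):

    using System.IO;
    using System.Xml.Serialization;
    using ProtoBuf; // protobuf-net

    [ProtoContract]
    public class MyMessage // hypothetical contract type; use your own
    {
        [ProtoMember(1)] public int Id { get; set; }
        [ProtoMember(2)] public string Name { get; set; }
    }

    static class MessageDumper
    {
        public static string DumpAsXml(byte[] rawProtobuf)
        {
            // 1) deserialize with the known contract type
            MyMessage message;
            using (var ms = new MemoryStream(rawProtobuf))
            {
                message = Serializer.Deserialize<MyMessage>(ms);
            }

            // 2) re-serialize the object model to XML, purely for display
            var xml = new XmlSerializer(typeof(MyMessage));
            using (var writer = new StringWriter())
            {
                xml.Serialize(writer, message);
                return writer.ToString();
            }
        }
    }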
If all you have is the raw data, there is a Wireshark plugin that might help, and protobuf-net has a ProtoReader class that might be useful for parsing such a stream; but the emphasis here is that the stream is tricky to decipher in isolation.
So I am currently exploring a few efficient ways to transfer data over MQTT. JSON is just too large for me, so I came across protobuf, and it seems to fit the use case.
But the issue I am having is that MQTT doesn't have a way to tell me where a message comes from. So, for instance, if I get a message I have no way to tell whether it came from source A or source B. In some cases this isn't a problem, but in my case the sources send different data, so I cannot know which model to use to deserialize.
I am using the C# implementation of protobuf. Is there some way to partially deserialize a message if I enforce a common field (a messageType field), and then correctly deserialize the entire message?
Any help is appreciated.
MQTT doesn't have a way to tell me where a message comes from
Of course it does. That is the purpose of the message topic. You will be publishing to topics like sourceA/messageTypeX or sourceB/messageTypeY.
Partial deserialization would imply some kind of inheritance (all your message types implement a common field), which is not how protobuf is designed.
Don't go looking for facilities similar to class inheritance, though – protocol buffers don't do that.
https://developers.google.com/protocol-buffers/docs/csharptutorial
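To make the topic idea concrete, here is a rough dispatch sketch; the SensorReading and StatusUpdate types and the topic prefixes are placeholders, and protobuf-net is used only to keep the example short (the same idea works with Google.Protobuf's generated parsers):

    using System;
    using System.IO;
    using ProtoBuf;

    [ProtoContract] public class SensorReading { [ProtoMember(1)] public double Value { get; set; } }
    [ProtoContract] public class StatusUpdate  { [ProtoMember(1)] public string Text  { get; set; } }

    static class TopicDispatcher
    {
        // Call this from your MQTT client's message-received handler.
        public static object Deserialize(string topic, byte[] payload)
        {
            using (var ms = new MemoryStream(payload))
            {
                if (topic.StartsWith("sourceA/", StringComparison.Ordinal))
                    return Serializer.Deserialize<SensorReading>(ms);
                if (topic.StartsWith("sourceB/", StringComparison.Ordinal))
                    return Serializer.Deserialize<StatusUpdate>(ms);
                throw new InvalidOperationException("Unknown topic: " + topic);
            }
        }
    }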
For those who come in later:
Your first choice should be to find a way to include the source and message type in the topic, just as @Zdenek says above.
However, if you do need some kind of partial deserialization (especially with proto3), you can achieve it by using a message type that has just the fields you want to read, with exactly the same numeric field identifiers; see the sketch after the link below.
See Protobuf lazy decoding of sub message
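A sketch of that header-only idea, assuming protobuf-net and invented type names; the only requirement is that every full message carries its type code in the same field number (1 here), so the stripped-down Header contract reads just that field while the remaining fields are skipped:

    using System.IO;
    using ProtoBuf;

    [ProtoContract]
    public class Header
    {
        [ProtoMember(1)] public int MessageType { get; set; }
    }

    [ProtoContract]
    public class TemperatureMessage // one of the "real" messages, sharing field 1
    {
        [ProtoMember(1)] public int MessageType { get; set; }
        [ProtoMember(2)] public double Celsius { get; set; }
    }

    static class PartialReader
    {
        // Read only the common field, then decide which full contract to use.
        public static int PeekMessageType(byte[] payload)
        {
            using (var ms = new MemoryStream(payload))
            {
                return Serializer.Deserialize<Header>(ms).MessageType;
            }
        }
    }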
I have a MemoryStream which I write into as I receive data off the network. Since the data can be broken up, there is the potential for the stream to contain a partial message or multiple messages. When deserializing, I set the pointer back to the beginning of the stream and try to deserialize a class of mine. I have the deserialize call wrapped in a try/catch block, but when execution gets to the deserialize line the application just quits (no exception, no more lines run in the function, etc.).
I have multiple questions:
What is the best way to receive a stream of XML data from the network that may or may not be complete and, if complete, may contain more than one message?
Does the deserializer need to know about the encoding to decode the XML within the MemoryStream?
Does deserialization place the stream pointer after the deserialized object?
Can you deserialize multiple objects within a single stream?
1) You can leverage the XmlReader class, which "provides forward-only, read-only access to a stream of XML data". That may help you process XML data that may not be complete. http://msdn.microsoft.com/en-us/library/vstudio/system.xml.xmlreader
2) If you are referring to mixing ASCII, UTF-8, etc., then yes; otherwise I am not sure what the question is.
3) That depends on the deserializer you are using.
4) Yes, with the XmlReader class you can extract attributes and XML fragments for later consumption (although the solution is not elegant and can get rather ugly).
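As a rough sketch of point 1: with ConformanceLevel.Fragment an XmlReader will accept several root-level elements in one stream, which is one way to pull multiple messages out of a single MemoryStream (the <message> element name is just an example; an incomplete message will surface as an XmlException):

    using System;
    using System.IO;
    using System.Xml;

    static class FragmentReaderDemo
    {
        public static void ReadMessages(Stream stream)
        {
            var settings = new XmlReaderSettings
            {
                ConformanceLevel = ConformanceLevel.Fragment // allow multiple root elements
            };
            using (var reader = XmlReader.Create(stream, settings))
            {
                while (!reader.EOF)
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "message")
                    {
                        // ReadOuterXml consumes the element and moves to the next node,
                        // so each iteration lands on the next message (if any).
                        string oneMessage = reader.ReadOuterXml();
                        Console.WriteLine(oneMessage);
                    }
                    else
                    {
                        reader.Read(); // skip whitespace and other nodes
                    }
                }
            }
        }
    }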
I have a project with a few types that I binary-serialize (BinaryFormatter) to files. I'd now like to create a second project which allows admins to temporarily decode those files into a more readable XML format (e.g., using XmlSerializer).
I could deserialize them into an object of the original type, then reserialize them, but is it at all possible to
skip the deserialization (at least in my own code), and
better yet, not have to reference the type at all in my decoder tool?
If you are referring to .NET's native binary serialization (BinaryFormatter), the problem is that it saves your object (along with all the necessary metadata for deserialization) using an undocumented format (AFAIK).
If you really want to try doing it without deserialization, you can check this article, which appears to have analyzed its format (but the author himself states that it might be incomplete). But my opinion is that it's far too much trouble.
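If you do settle for the deserialize-and-reserialize route the question mentions, a minimal sketch might look like this (MyRecord is a stand-in for one of your real types, and the tool does have to reference it):

    using System;
    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;
    using System.Xml.Serialization;

    [Serializable]
    public class MyRecord // placeholder for the real serialized type
    {
        public int Id;
        public string Name;
    }

    static class BinaryToXml
    {
        public static void Convert(string binaryPath, string xmlPath)
        {
            // load the BinaryFormatter file back into an object
            MyRecord record;
            using (var input = File.OpenRead(binaryPath))
            {
                record = (MyRecord)new BinaryFormatter().Deserialize(input);
            }

            // re-serialize the same object as human-readable XML
            using (var output = File.CreateText(xmlPath))
            {
                new XmlSerializer(typeof(MyRecord)).Serialize(output, record);
            }
        }
    }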
There's no way you can deserialize to your type without specifying that type, at least with standard XML serialization. However, as long as the object was serialized as XML, you could use one of the many XML reader classes to traverse the document without needing to deserialize it. Alternatively, if you happen to be serializing to JSON, there are libraries out there that will deserialize to anonymous types that you can use.
If the type of conversion you are looking for is purely cosmetic (e.g. to make it more human readable), you could write some RegEx to replace the element tags.
Why can't we serialize objects into a random-access file, when on the other hand we can serialize objects into a sequential-access file?
"C# does not provide a means to obtain an object's size at runtime. This means that, if we serialize the class, we cannot guarantee a fixed-length record size." (from the book I am reading)
So we cannot read the random-access file, because we don't know the size of each object in the file; how could we do any seeking?
Any object marked with the SerializableAttribute attribute can be serialized (in most scenarios). The result from serialization is always directed to a stream, which may very well be a file output stream.
Are you asking why an object graph cannot be deserialized partially? .NET serialization only [de]serializes complete object graphs. Otherwise you'll have to turn to other serialization formatters, or write your own.
For direct random access to a file, you must open the file with a stream that supports seeking.
EDIT:
Seeking in the resulting stream from a serialization has no practical purpose; only the serialization formatter knows what's in there anyway, and it should always be fed the very start of the stream.
For persisting the data into other structures, do it in a two-stage process: first direct the serialization bytes to an intermediate (e.g. memory-backed) stream whose size you can read afterwards, then write the data to the actual backing store using that knowledge of its size.
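A sketch of that two-stage idea (the length-prefix framing is my own addition for illustration): serialize to a MemoryStream first so the size is known, then write the size followed by the payload to the real store:

    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    static class LengthPrefixedWriter
    {
        public static void Append(Stream output, object graph)
        {
            using (var buffer = new MemoryStream())
            {
                new BinaryFormatter().Serialize(buffer, graph); // stage 1: size now known

                var writer = new BinaryWriter(output);
                writer.Write((int)buffer.Length);               // stage 2: length prefix...
                buffer.Position = 0;
                buffer.CopyTo(output);                          // ...then the serialized bytes
                writer.Flush();
            }
        }
    }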
You can't predict the size of a serialized object, because the serialized representation might differ a lot from the runtime representation.
It is still possible to achieve exact control over output size if you use only primitive types and write them using a BinaryWriter, but that is not serialization per se.
The default binary serialization in .NET serializes a whole object graph, which, by its nature of being a graph, doesn't have a constant size, which means each serialization object (record) won't have a constant size, preventing random access.
To be able to randomly access any record in a file, write your own implementation of the binary serialization of your class, or use a database. If you need a simple, no-install single-threaded database engine, have a look at SQL Server Compact.
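To illustrate the hand-rolled alternative: if each record is written with a fixed size using BinaryWriter (primitive types only, as noted above), you can Seek straight to record i. The record layout and the 32-byte name field are arbitrary choices for the example:

    using System.IO;
    using System.Text;

    static class FixedRecordFile
    {
        const int NameBytes = 32;
        const int RecordSize = sizeof(int) + sizeof(double) + NameBytes;

        public static void Write(Stream s, int index, int id, double value, string name)
        {
            s.Seek((long)index * RecordSize, SeekOrigin.Begin);
            var w = new BinaryWriter(s);
            w.Write(id);
            w.Write(value);
            var nameBuf = new byte[NameBytes]; // zero-padded; assumes the name fits in 32 bytes
            Encoding.UTF8.GetBytes(name, 0, name.Length, nameBuf, 0);
            w.Write(nameBuf);
            w.Flush();
        }

        public static void Read(Stream s, int index, out int id, out double value, out string name)
        {
            s.Seek((long)index * RecordSize, SeekOrigin.Begin);
            var r = new BinaryReader(s);
            id = r.ReadInt32();
            value = r.ReadDouble();
            name = Encoding.UTF8.GetString(r.ReadBytes(NameBytes)).TrimEnd('\0');
        }
    }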
I spent a good portion of time last week working on serialization. During that time I found many examples utilizing either the BinaryFormatter or XmlSerializer. Unfortunately, what I did not find were any examples comprehensively detailing the differences between the two.
The genesis of my curiosity lies in why the BinaryFormatter is able to deserialize directly to an interface whilst the XmlSerializer is not. Jon Skeet in an answer to "casting to multiple (unknown types) at runtime" provides an example of direct binary serialization to an interface. Stan R. provided me with the means of accomplishing my goal using the XmlSerializer in his answer to "XML Object Deserialization to Interface."
Beyond the obvious fact that the BinaryFormatter uses binary serialization whilst the XmlSerializer uses XML, I'd like to more fully understand the fundamental differences: when to use one or the other, and the pros and cons of each.
The reason a binary formatter is able to deserialize directly to an interface type is that when an object is originally serialized to a binary stream, metadata containing type and assembly information is stored alongside the object data. This means that when the binary formatter deserializes the object, it knows its type and builds the correct object, which you can then cast to an interface type that the object implements.
The XML serializer, on the other hand, just serializes to a schema and only serializes the public fields and values of the object, with no type information beyond that (e.g. which interfaces the type implements).
Here is a good post, .NET Serialization, comparing the BinaryFormatter, SoapFormatter, and XmlSerializer. I recommend you look at the following table which in addition to the previously mentioned serializers includes the DataContractSerializer, NetDataContractSerializer and protobuf-net.
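A small sketch of that point (the IShape/Circle names are invented for the example): because BinaryFormatter records the concrete type in the stream, the deserialized object can be cast straight to an interface it implements, without naming Circle at the deserialization site:

    using System;
    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    public interface IShape { double Area(); }

    [Serializable]
    public class Circle : IShape
    {
        public double Radius;
        public double Area() { return Math.PI * Radius * Radius; }
    }

    static class InterfaceRoundTrip
    {
        public static IShape RoundTrip(IShape shape)
        {
            var formatter = new BinaryFormatter();
            using (var ms = new MemoryStream())
            {
                formatter.Serialize(ms, shape);           // concrete type name goes into the stream
                ms.Position = 0;
                return (IShape)formatter.Deserialize(ms); // no concrete type needed here
            }
        }
    }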
Just to weigh in...
The obvious difference between the two is "binary vs xml", but it does go a lot deeper than that:
fields (BinaryFormatter=bf) vs public members (typically properties) (XmlSerializer=xs)
type-metadata based (bf) vs contract-based (xs)
version-brittle (bf) vs version-tolerant (xs)
"graph" (bf) vs "tree" (xs)
.NET specific (bf) vs portable (xs)
opaque (bf) vs human-readable (xs)
As a discussion of why BinaryFormatter can be brittle, see here.
It is impossible to say outright which is bigger; all the type metadata in BinaryFormatter can make it bigger, and XmlSerializer output can work very well with compression like gzip.
However, it is possible to take the strengths of each; for example, Google have open-sourced their own data serialization format, "protocol buffers". This is:
contract-based
portable (see list of implementations)
version-tolerant
tree-based
opaque (although there are tools to show data when combined with a .proto)
typically "contract first", but some implementations allow implicit contracts based on reflection
But importantly, it is very dense data (no type metadata, pure binary representation, short tags, tricks like variable-length base-128 "varint" encoding), and very efficient to process (no complex xml structure, no strings to match to members, etc).
I might be a little biased; I maintain one of the implementations (including several suitable for C#/.NET), but you'll note I haven't linked to any specific implementation; the format stands on its own merits ;-p
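To make the contract-based point concrete, here is a protobuf-net sketch (the Person type and field numbers are illustrative): the attributes define the contract, and the serialized output contains only tagged values, with no type metadata:

    using System.IO;
    using ProtoBuf;

    [ProtoContract]
    public class Person
    {
        [ProtoMember(1)] public int Id { get; set; }
        [ProtoMember(2)] public string Name { get; set; }
    }

    static class ProtoDemo
    {
        public static byte[] Serialize(Person p)
        {
            using (var ms = new MemoryStream())
            {
                Serializer.Serialize(ms, p);
                return ms.ToArray(); // typically only a handful of bytes for this contract
            }
        }

        public static Person Deserialize(byte[] data)
        {
            using (var ms = new MemoryStream(data))
            {
                return Serializer.Deserialize<Person>(ms);
            }
        }
    }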
The XML Serializer produces XML and also an XML Schema (implicitly). It will produce XML that conforms to this schema.
One implication is that it will not serialize anything which cannot be described in XML Schema. For instance, there is no way to distinguish between a list and an array in XML Schema, so the XML Schema produced by the serializer can be interpreted either way.
Runtime serialization (which the BinaryFormatter is part of) serializes the actual .NET types to the other side, so if you send a List<int>, the other side will get a List<int>.
That obviously works better if the other side is running .NET.
The XmlSerializer serialises the type by reading all the type's properties that have both a public getter and a public setter (and also any public fields). In this sense the XmlSerializer serializes/deserializes the "public view" of the instance.
The binary formatter, by contrast, serializes a type by serializing the instance's "internals", i.e. its fields. Any fields that are not marked as [NonSerialized] will be serialized to the binary stream. The type itself must be marked as [Serializable] as must any internal fields that are also to be serialized.
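As a quick illustration of that distinction (the Account type is made up for the example): XmlSerializer would only see the public Balance property, while BinaryFormatter would also capture the private auditTrail field:

    using System;
    using System.Collections.Generic;

    [Serializable]
    public class Account
    {
        private List<string> auditTrail = new List<string>(); // captured by BinaryFormatter only

        public decimal Balance { get; set; }                  // captured by both serializers

        public void Deposit(decimal amount)
        {
            Balance += amount;
            auditTrail.Add("Deposit " + amount + " at " + DateTime.UtcNow.ToString("o"));
        }
    }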
I guess one of the most important ones is that binary serialization can serialize both public and private members, whereas the other one works only with public ones.
The link below provides a very helpful comparison between these two in terms of size. This is an important issue, because you might send your serialized object to a remote machine.
http://www.nablasoft.com/alkampfer/index.php/2008/10/31/binary-versus-xml-serialization-size/