protobuf + mqtt message routing - c#

I am currently exploring efficient ways to transfer data over MQTT. JSON is too large for my needs, so I came across protobuf, which seems to fit the use case.
The issue I am having is that MQTT doesn't have a way to tell me where the message comes from. For instance, if I get a message, I have no way to tell whether it came from source A or source B. In some cases this isn't a problem, but in my case the sources send different data, so I cannot know which model to use to deserialize.
I am using the C# implementation of protobuf. Is there some way to partially deserialize a message if I require all messages to have a common field (a messageType field), and then use that field to correctly deserialize the entire message?
Any help is appreciated.

"MQTT doesn't have a way to tell me where the message comes from"
Of course it does. This is the purpose of the message topic. You would publish to topics like sourceA/messageTypeX or sourceB/messageTypeY, and the subscriber can pick the right model from the topic of each received message.
Partial deserialization would imply some kind of inheritance (all your message types sharing a common field), which is not how protobuf is designed. From the protobuf C# tutorial:
"Don't go looking for facilities similar to class inheritance, though – protocol buffers don't do that."
https://developers.google.com/protocol-buffers/docs/csharptutorial
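
The topic-based routing suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the TopicRouter class, the topic names and the handler bodies are hypothetical, and the lambdas stand in for calls to whatever generated protobuf parsers you have. The Route method would be called from your MQTT client's message-received callback, which is not shown here.

```csharp
using System;
using System.Collections.Generic;

public static class TopicRouter
{
    // Hypothetical handlers keyed by "source/messageType" topic.
    // In real code each lambda would invoke the generated protobuf parser
    // for the message type that is published on that topic.
    private static readonly Dictionary<string, Func<byte[], string>> Handlers =
        new Dictionary<string, Func<byte[], string>>
        {
            ["sourceA/messageTypeX"] = payload => $"X from A ({payload.Length} bytes)",
            ["sourceB/messageTypeY"] = payload => $"Y from B ({payload.Length} bytes)",
        };

    // Picks the handler from the topic; the payload itself is never inspected.
    public static string Route(string topic, byte[] payload)
    {
        return Handlers.TryGetValue(topic, out var handler)
            ? handler(payload)
            : $"no handler for topic '{topic}'";
    }

    public static void Main()
    {
        Console.WriteLine(Route("sourceA/messageTypeX", new byte[] { 1, 2, 3 }));
        Console.WriteLine(Route("sourceC/unknown", new byte[0]));
    }
}
```

Keeping the lookup in a dictionary means adding a new source or message type is one new entry rather than another branch in the callback.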

For those who come in later:
Your first path should be to include the source and message type in the topic, just as @Zdenek says above.
However, if you do need some kind of partial deserialization (especially with proto3), you can get it by defining a message that has only the fields you want to read, with exactly the same numeric identifiers as in the full messages. Fields the smaller message does not declare are simply skipped when parsing.
See Protobuf lazy decoding of sub message
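
To show why this works, here is a minimal, self-contained sketch that peeks at one varint field directly in the protobuf wire format and skips everything else. It is not the Google.Protobuf API: ProtoPeek and PeekVarintField are hypothetical names, field number 1 as the shared messageType is an assumption, and the deprecated group wire types are not handled. With generated classes you would instead parse the payload with the small header message described above.

```csharp
using System;

public static class ProtoPeek
{
    // Reads a base-128 varint starting at pos and advances pos past it.
    private static ulong ReadVarint(byte[] data, ref int pos)
    {
        ulong result = 0;
        int shift = 0;
        while (true)
        {
            byte b = data[pos++];
            result |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
            if (shift >= 64) throw new FormatException("varint too long");
        }
    }

    // Returns the first varint value stored under fieldNumber, or null if absent.
    // Every other field is skipped according to its wire type.
    public static ulong? PeekVarintField(byte[] data, int fieldNumber)
    {
        int pos = 0;
        while (pos < data.Length)
        {
            ulong tag = ReadVarint(data, ref pos);
            int number = (int)(tag >> 3);
            int wireType = (int)(tag & 7);
            switch (wireType)
            {
                case 0: // varint
                {
                    ulong value = ReadVarint(data, ref pos);
                    if (number == fieldNumber) return value;
                    break;
                }
                case 1: // fixed 64-bit
                    pos += 8;
                    break;
                case 2: // length-delimited (strings, bytes, sub-messages)
                {
                    int length = (int)ReadVarint(data, ref pos);
                    pos += length;
                    break;
                }
                case 5: // fixed 32-bit
                    pos += 4;
                    break;
                default:
                    throw new FormatException("unsupported wire type " + wireType);
            }
        }
        return null;
    }

    public static void Main()
    {
        // Field 2 (string "hi") followed by field 1 (varint 2).
        byte[] payload = { 0x12, 0x02, 0x68, 0x69, 0x08, 0x02 };
        Console.WriteLine(PeekVarintField(payload, 1));
    }
}
```

Once the shared field has been read, the full payload can be handed to the parser for the matching message type.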

Related

How to deserialize LogEvents from Serilog stored in CouchDB

I'm currently logging (Logging application) to a CouchDB database with Serilog, and with a handful of Types being decomposed into the database.
I've got a separate application (Reporting application) that is trying to pull LogEvents out of the database and deserialize them into the original LogEvents. The Reporting application is just as aware of the same types as the logging application and the specific Types in the database are fully decomposed into it.
Json.Net's deserializer has problems with deserializing the MessageTemplate. Even with a custom converter, it has so many problems that I'm probably doing it wrong (various exceptions deserializing, but no real pattern that I can tell).
Has anyone been able to do this successfully? I was under the impression that being able to pull Types out of the logs is one of the features of Serilog, and all the data is there, so I don't see why it's not possible.
These Types are all fully serializable as well, they're regularly serialized/deserialized by Json.net.

After more research, I've found a way to partially solve the problem. Generate new classes with http://json2csharp.com/ - rename the RootObject to something (e.g., SpecificLogEvent) and use:
var logEvent = JsonConvert.DeserializeObject<SpecificLogEvent>(doc.Value);
Then convert the objects to the real objects where needed.
I won't mark this as the answer for a while, because I'd love an easy round trip that avoids this extra step, which creates redundant classes.

Implementation of message in publish-subscribe pattern?

I'm currently implementing the publish-subscribe pattern for use in my future applications. Right now I'm having trouble figuring out the "best" way to design the message part of the pattern. I have a couple of ideas in mind but please tell me if there's a better way to do it.
Idea 1: Each message is an object that implements a simple tag interface IMessage.
Idea 2: Each message is represented as an array where the first index is the type of message and the second contains the payload.
Are any of these "better" than the other and if so, why? Please excuse me if this seems like a stupid question.

Your first idea makes more sense. Take a look at the NServiceBus implementation on GitHub, which supports messaging patterns using marker interfaces or unobtrusive message definitions.
In essence, a message in a publish/subscribe scenario is an event: its name should describe the event, and it should carry the relevant references to the data related to that event.
Andreas has a good article
HTH

Both approaches are useful. The first is useful when working with the message in your application. The second is useful if you are receiving raw message data over the network and have to determine how to deserialize it.
If you look at how WCF serializes, it puts the type as an attribute in the serialized data, so it knows what to deserialize it to. However, if you are going for JSON serialization, for example, then you are probably better off having a property to hold your type information. Also be aware that this type information does not have to specify an actual CLR type; it only needs to be an identifier that lets you know how to read the data.
Once you know how to read the data, you can create your object and take advantage of the type system, for example by using tag interfaces.
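
A minimal sketch of that idea, assuming JSON on the wire and System.Text.Json for parsing. The envelope shape ("type" plus "payload"), the identifiers "user-created" and "order-shipped", and all class names are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text.Json;

// Tag interface used inside the application (idea 1).
public interface IMessage { }

public sealed class UserCreated : IMessage { public string Name { get; set; } }
public sealed class OrderShipped : IMessage { public int OrderId { get; set; } }

public static class MessageReader
{
    // Reads the type identifier first (idea 2), then deserializes the payload
    // into the matching class. The identifier is not a CLR type name.
    public static IMessage Read(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        string type = doc.RootElement.GetProperty("type").GetString();
        string payload = doc.RootElement.GetProperty("payload").GetRawText();
        switch (type)
        {
            case "user-created":
                return JsonSerializer.Deserialize<UserCreated>(payload);
            case "order-shipped":
                return JsonSerializer.Deserialize<OrderShipped>(payload);
            default:
                throw new NotSupportedException("Unknown message type: " + type);
        }
    }

    public static void Main()
    {
        IMessage message = Read("{\"type\":\"order-shipped\",\"payload\":{\"OrderId\":7}}");
        Console.WriteLine(message.GetType().Name);
    }
}
```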

You don't specify whether your messages cross process boundaries or not.
In the latter case, where messages are passed between layers in the same application, the first approach where messages are just objects (optionally implementing the same interface) is probably the easiest.
In the former, where you have interprocess and interoperable messaging, I think you get the most out of XML. XML is very flexible, is easy to support in different technologies, allows you to sign messages in an interoperable way (XMLDSig) and allows you to create a variety of different input/output ports (tcp/http/database/filesystem). Also, messages can easily be validated against XSD specifications.

In the pypubsub library (a publish-subscribe for python), I found that there was great benefit to name the payload data, so the sender and receiver can just populate fields and not have to rely on order of items in message, plus it provides "code as documentation". For example, compare these, written in pseudocode. Using array:
    function listener(Object[] message):
        do stuff with message[0], message[1], ...

    message = { 123, 'abc', obj1 } // an array
    sendMessage('topicName', message)

Using keywords:

    function listener(int radius, string username = None):
        do stuff with radius, username, ...

    // username is marked as optional for receiver but we override the default
    sendMessage('topicName', radius=123, username='abc')
Doing this in C# may be more of a challenge than in Python, but that capability is really useful in pypubsub. Also, you can then use XML to define the schema for your messages, documenting the payload items, and you can mark some payload items as optional (when they have a default value) vs required (when they don't). The lib can also check that the listener adheres to the "payload contract", and that the sender is providing all data promised via the "contract".
You should probably take a look at (and even use) existing libraries to get some ideas (pypubsub is at pypubsub.sourceforge.net).

Both approaches are viable. The second one means that you are responsible for the de/serialization of the message. It gives you much more freedom, power and control over the message, especially for versioning, but all this comes at a cost, and I see this cost as sustainable only if some of the actors are not .NET actors. Otherwise go with the first approach and, as Sean pointed out, take a look at toolkits and frameworks that can greatly help you with all the plumbing.

Where can I find a list of all possible messages that an XmlException can contain?

I'm writing an XML code editor and I want to display syntax errors in the user interface. Because my code editor is strongly constrained to a particular problem domain and audience, I want to rewrite certain XmlException messages to be more meaningful for users. For instance, an exception message like this:
'"' is an unexpected token. The expected token is '='. Line 30, position 35
is very technical and not very informative to my audience. Instead, I'd like to rewrite it and other messages to something else. For completeness' sake, that means I need to build up a dictionary of existing messages mapped to the new messages I would like to display instead. To accomplish that, I'm going to need a list of all possible messages XmlException can contain.
Is there such a list somewhere? Or can I find out the possible messages through inspection of objects in C#?
Edit: specifically, I am using XmlDocument.LoadXml to parse a string into an XmlDocument, and that method throws an XmlException when there are syntax errors. So specifically, my question is where I can find a list of messages applied to XmlException by XmlDocument.LoadXml. The discussion about there potentially being a limitless variation of actual strings in the Message property of XmlException is moot.
Edit 2: More specifically, I'm not looking for advice as to whether I should be attempting this; I'm just looking for any clues to a way to obtain the various messages. Ben's answer is a step in the right direction. Does anyone know of another way?

Technically there is no such thing: any class that throws an XmlException can set the message to any string. It really depends on which classes you are using and how they handle exceptions. It is perfectly possible that you are using a class that includes context-specific information in the message, e.g. info about some XML node or attribute that is malformed. In that case the number of unique message strings could be infinite, depending on the XML being processed. It is equally possible that a particular class does not work in this way and has a finite number of messages that occur under specific circumstances.
Perhaps a better approach would be to use try/catch blocks in specific parts of your code, where you understand the processing that is taking place, and provide more generic error messages based on what is happening. E.g. in your example you could simply look at the line and character number and produce an error along the lines of "Error processing xml file LineX CharacterY", or even something as general as "error processing file".
Edit:
Further to your edit, I think you will have trouble doing what you require. Essentially you are trying to change one text string into another based on certain keywords that may be in the string. This is likely to be messy and inconsistent. If you really want to do it, I would advise using something like Redgate .NET Reflector to decompile the LoadXml method and dig through the code to see how it handles different kinds of syntax errors in the XML and what messages it generates for each kind of error. This is likely to be time-consuming and difficult. If you want to hide the technical errors but still provide useful info to the user, then I would still recommend ignoring the error message and simply pointing the user to the location of the problem in the file.

Just my opinion, but ... spelunking the error messages and altering them before displaying them to the user seems like a really misguided idea.
First, the messages are different for each language the framework is localized into. Even if you could collect them for English, and you're willing to pay the cost, they'll be different for other languages.
Second, even if you are dealing with a single language, there's no way to be sure that an external package hasn't injected a novel XmlException into the scope of LoadXml.
Last, the list of messages is not stable. It may change from release to release.
A better idea is to just emit an appropriate message from your own app, and optionally display -- maybe upon demand -- the original error message contained in the XmlException.
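
A minimal sketch of that suggestion, assuming the XmlDocument.LoadXml call from the question. The XmlChecker class and the wording of the message are hypothetical; the LineNumber and LinePosition properties come from XmlException itself, so the original message text never has to be parsed.

```csharp
using System;
using System.Xml;

public static class XmlChecker
{
    // Returns null when the text is well-formed XML, otherwise an
    // application-defined message built from the reported error location.
    public static string Check(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return null;
        }
        catch (XmlException ex)
        {
            // ex.Message could still be kept and shown on demand.
            return $"Syntax error at line {ex.LineNumber}, position {ex.LinePosition}.";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Check("<root attr\"1\" />") ?? "OK");
        Console.WriteLine(Check("<root attr=\"1\" />") ?? "OK");
    }
}
```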

WCF error service error message with shared classes

Source code:
http://code.google.com/p/sevenupdate/source/browse/#hg/Source/SevenUpdate.Base
SevenUpdate.Base.Sui cannot be used since it does not match imported DataContract. Need to exclude this type from referenced types.
I tried unchecking "reuse types in referenced assemblies" and was able to get my project to compile, but when I sent a collection from the client it was never received, or couldn't be deserialized, on the server end.
I really need this to work. Any help would be appreciated; the full source code is available on Google Code.

I didn't download the source and build it, but could it be that you are missing DataContract on this class? The Sui class has a property of type Sua as a DataMember, so Sua will need to be serializable as well. It currently looks like this in your code:
[ProtoContract, ]
[KnownType(typeof(ObservableCollection<LocaleString>))]
public class Sua

What would I need to do to reproduce this error? The first bit (about matching data-contract) sounds like WCF isn't very happy with you, which suggests you have two similar (but different) contracts "in play". If you are re-using the types from a shared library this shouldn't be a problem.
If you do end up excluding the types (and having a different model at the client and server) then it can get a bit trickier, since "mex" doesn't guarantee the positions will remain intact (and indeed, they regularly change) - but you can fix this in a partial class, by using a few [ProtoPartialMember(...)] against the type (not pretty but it works).
But I stress - the main problem here seems to be WCF; if that isn't happy such that the code doesn't codegen / compile etc, then my hands are fairly tied (since it won't get as far as talking to protobuf-net).

WCF: Individual methods or a generic ProcessMessage method accepting xml

My company is developing an application that receives data from another company via TCP sockets and xml messages. This is delivered to a single gateway application which then broadcasts it to multiple copies of the same internal application on various machines in our organisation.
WCF was chosen as the technology to handle the internal communications (internally bi-directional). The developers considered two methods.
Option 1: Individual methods exposed by the WCF service for each different message received by the gateway application. The gateway application would parse the incoming external message and call the appropriate WCF service method. The incoming XML would be translated into DataContract DTO’s and supplied as argument to the appropriate WCF method.
Option 2: The internal application exposed a WCF service with one method “ProcessMessage” which accepted an Xml string message as argument. The internal app would parse then deserialize the received xml and process it accordingly.
The lead developer thought option 2 was the better option as it was “easier” to serialize/deserialize the xml. I thought the argument didn’t make sense because DataContracts are serialized and deserialized by WCF, and by using WCF we had better typing of our data. In option 2 someone could call the WCF service and pass in any string. I believe option 1 presents a neater interface and makes the application more maintainable and usable.
Both options would still require parsing and validation of the original xml string at some point, so it may also be a question where is the recommended place to perform this validation.
I was wondering what the current thoughts are for passing this kind of information and what people’s opinions on both alternatives are.

Option 1 is suited if you can ensure that the client always sends serialized representations of data contracts to the server.
However if you need some flexibility in the serialization/deserialization logic and not get tightly coupled with DataContracts, then option 2 looks good. Particularly useful when you want to support alternate forms of xml (say Atom representations, raw xml in custom format etc)
Also in option 2 inside the ProcessMessage() method, you have the option of deciding whether or not to deserialize the incoming xml payload (based on request headers or something that is specific to your application).
In option 1, the WCF runtime will always deserialize the payload.

I recently asked a couple of questions around this area: XML vs Objects and XML vs Objects #2. You'll find the answers to those questions interesting.
For our particular problem we've decided on a hybrid approach, with the contract looking something like this:
// Just using fields for simplicity; no attributes shown.
class WCFDataContract
{
    // Header details
    public int id;
    public int version;
    public DateTime writeDateTime;

    public string xmlBlob;

    // Footer details
    public int anotherBitOfInformation;
    public string andSomeMoreInfo;
    public bool andABooleanJustInCase;
}
The reason we use an xmlBlob is because we own the header and footer schema but not the blob in the middle. Also, we don't really have to process that blob, rather we just pass it to another library (created by another department). The other library returns us more strongly typed data.
Good luck - I know from experience that your option 2 can be quite seductive and can sometimes be hard to argue against without being accused of being overly pure and not pragmatic enough ;)

I hope I understood this right. I think it might make sense to have your gateway app handle all the deserialization and have your internal app expose WCF services that take actual DataContract objects.
This way, your deserialization of the TCP-based XML is more centralized at the gateway, and your internal apps don't need to worry about it, they just need to expose whatever WCF services make sense, and can deal with actual objects.
If you force the internal apps to do the deserialization, you might end up with more maintenance if the format changes or whatever.
So I think I would say option 1 (unless I misunderstood).
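
As a rough sketch of that arrangement: the gateway parses the external XML once and then calls a typed method on the internal service. Everything here is hypothetical, since the question does not show the real message formats: the PriceUpdate DTO, the IInternalService contract and the XML element names are assumptions, and the WCF attributes ([ServiceContract], [OperationContract], [DataContract]) are left out so the example runs on its own.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical DTO; under WCF this would be marked as a DataContract.
public sealed class PriceUpdate
{
    public string Symbol { get; set; }
    public decimal Price { get; set; }
}

// Hypothetical typed contract (option 1): one method per message kind.
public interface IInternalService
{
    void OnPriceUpdate(PriceUpdate update);
}

// Stand-in for the internal application, recording the last call it received.
public sealed class RecordingService : IInternalService
{
    public PriceUpdate Last { get; private set; }
    public void OnPriceUpdate(PriceUpdate update) { Last = update; }
}

public sealed class Gateway
{
    private readonly IInternalService _service;

    public Gateway(IInternalService service) { _service = service; }

    // All XML parsing happens here, so internal apps only ever see typed objects.
    public void Handle(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        switch (root.Name.LocalName)
        {
            case "PriceUpdate":
                _service.OnPriceUpdate(new PriceUpdate
                {
                    Symbol = (string)root.Element("Symbol"),
                    Price = (decimal)root.Element("Price"),
                });
                break;
            default:
                throw new NotSupportedException("Unknown message: " + root.Name.LocalName);
        }
    }

    public static void Main()
    {
        var service = new RecordingService();
        new Gateway(service).Handle(
            "<PriceUpdate><Symbol>ABC</Symbol><Price>12.5</Price></PriceUpdate>");
        Console.WriteLine(service.Last.Symbol + " " + service.Last.Price);
    }
}
```

If the external format changes, only the gateway's Handle method needs to change; the typed contract seen by the internal applications can stay the same.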
