Implementation of a message in the publish-subscribe pattern? - C#

I'm currently implementing the publish-subscribe pattern for use in my future applications. Right now I'm having trouble figuring out the "best" way to design the message part of the pattern. I have a couple of ideas in mind but please tell me if there's a better way to do it.
Idea 1: Each message is an object that implements a simple tag interface IMessage.
Idea 2: Each message is represented as an array where the first index is the type of message and the second contains the payload.
Is either of these "better" than the other, and if so, why? Please excuse me if this seems like a stupid question.

Your first idea makes more sense; take a look at the NServiceBus GitHub implementation of messaging patterns using marker interfaces or unobtrusive message definitions.
In essence, a message in a publish/subscribe scenario is an event: its name should describe the event, and it should carry the relevant references to the data related to that event.
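To make the first idea concrete, here is a minimal sketch (IMessage comes from the question; OrderPlaced and IHandle<T> are made-up names, roughly in the NServiceBus style):

using System;

// Marker interface: no members, it only tags a type as a message.
public interface IMessage { }

// An event-style message: the name says what happened, the properties
// reference the data related to the event.
public class OrderPlaced : IMessage
{
    public Guid OrderId { get; set; }
    public DateTime PlacedAtUtc { get; set; }
}

// Subscribers can then be typed against the messages they care about.
public interface IHandle<TMessage> where TMessage : IMessage
{
    void Handle(TMessage message);
}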
Andreas has a good article
HTH

Both approaches are useful. The first is useful when working with the message in your application. The second is useful if you are receiving raw message data over the network and have to determine how to deserialize it.
If you look at how WCF serializes, it puts the type as an attribute in the serialized output, so it knows what to deserialize to. However, if you are going for JSON serialization, for example, then you are probably better off having a property to hold your type information. Also be aware that this type information does not have to specify an actual CLR type, just an identifier that lets you know how to read the data.
Once you know how to read the data, you can create your object and take advantage of the type system, e.g. using tag interfaces.
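As a rough sketch of that (assuming Json.NET and made-up OrderPlaced/UserRegistered message classes), you read the identifier first and then pick the concrete type:

using System;
using Newtonsoft.Json.Linq;

public static class MessageReader
{
    // Hypothetical wire format: { "type": "OrderPlaced", "payload": { ... } }
    // "type" is just an identifier agreed on by both sides, not a CLR type name.
    public static object Deserialize(string json)
    {
        var envelope = JObject.Parse(json);
        var type = (string)envelope["type"];
        var payload = envelope["payload"];

        switch (type)
        {
            case "OrderPlaced":    return payload.ToObject<OrderPlaced>();
            case "UserRegistered": return payload.ToObject<UserRegistered>();
            default: throw new InvalidOperationException($"Unknown message type '{type}'.");
        }
    }
}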

You don't specify whether your messages cross process boundaries or not.
In the latter case, where messages are passed between layers in the same application, the first approach where messages are just objects (optionally implementing the same interface) is probably the easiest.
In the former, where you have interprocess and interoperable messaging, I think you get the most out of XML. XML is very flexible, easy to support in different technologies, allows you to sign messages in an interoperable way (XMLDSig), and allows you to create a variety of different input/output ports (tcp/http/database/filesystem). Also, messages can easily be validated for their integrity with XSD specifications.

In the pypubsub library (a publish-subscribe library for Python), I found that there was great benefit in naming the payload data, so the sender and receiver can just populate fields and not have to rely on the order of items in a message; plus it provides "code as documentation". For example, compare these, written in pseudocode. Using an array:
function listener(Object[] message):
    do stuff with message[0], message[1], ...

message = { 123, 'abc', obj1 } // an array
sendMessage('topicName', message)
Using keywords:
function listener(int radius, string username = None):
    do stuff with radius, username, ...

// username is marked as optional for the receiver, but we override the default
sendMessage('topicName', radius=123, username='abc')
Doing this in C# may be more of a challenge than in Python, but that capability is really useful in pypubsub. Also, you can then use XML to define the schema for your messages, documenting the payload items, and you can mark some payload items as optional (when they have a default value) vs required (when they don't). The lib can also check that the listener adheres to the "payload contract", and that the sender is providing all data promised via the "contract".
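For example, a rough C# equivalent (bus.Publish and CircleDrawn are made-up names, just to show the named-payload idea):

// Hypothetical payload type for one topic; named properties replace positional indices.
public class CircleDrawn
{
    public int Radius { get; set; }
    public string Username { get; set; }   // optional: may stay null
}

// Sender populates fields by name (bus.Publish is a made-up API)...
bus.Publish("topicName", new CircleDrawn { Radius = 123, Username = "abc" });

// ...and the listener reads them by name, independent of ordering.
void OnCircleDrawn(CircleDrawn message)
{
    // do stuff with message.Radius and message.Username
}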
You should probably take a look at (and even use) existing libraries to get some ideas (pypubsub is at pypubsub.sourceforge.net).

Both approaches are viable. The second one means you are responsible for the de/serialization of the message yourself; it gives you much more freedom, power and control over the message, especially around versioning, but all of this comes at a cost, and I see that cost as sustainable only if some of the actors are not .NET actors. Otherwise go with the first approach and, as Sean pointed out, take a look at toolkits and frameworks that can greatly help you with all the plumbing.

Related

protobuf + mqtt message routing

So I am currently exploring a few efficient ways to transfer data over MQTT. JSON is just too large for me, so I came across protobuf, and it seems to fit the use case.
But the issue I am having is that MQTT doesn't have a way to tell me where a message comes from. So for instance, if I get a message I have no way to tell whether it came from source A or source B. In some cases this isn't a problem, but in my case these sources send different data, so I cannot know which model I have to use to deserialize.
I am using the C# implementation of protobuf. Is there some way to partially deserialize a message if I enforce a common field (a messageType field) on all of them, and then correctly deserialize the entire message afterwards?
Any help is appreciated.
MQTT doesn't have a way to tell me where a message comes from
Of course it does. This is the purpose of the message topic. You will be publishing to topics like sourceA/messageTypeX or sourceB/messageTypeY.
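For example, something along these lines (ReadingA/ReadingB are placeholders for your protoc-generated message classes; Parser.ParseFrom(byte[]) is the standard Google.Protobuf parse call):

using System;
using Google.Protobuf;

public static class MqttRouter
{
    // Route on the topic, then pick the matching generated parser.
    public static IMessage Route(string topic, byte[] payload)
    {
        var source = topic.Split('/')[0];   // e.g. "sourceA/messageTypeX" -> "sourceA"

        switch (source)
        {
            case "sourceA": return ReadingA.Parser.ParseFrom(payload);
            case "sourceB": return ReadingB.Parser.ParseFrom(payload);
            default: throw new InvalidOperationException($"Unexpected topic '{topic}'.");
        }
    }
}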
Partial deserialization would imply some kind of inheritance (all your message types implement a common field), which is not how protobuf is designed.
Don't go looking for facilities similar to class inheritance, though – protocol buffers don't do that.
https://developers.google.com/protocol-buffers/docs/csharptutorial
For those who come in later:
Your first path should be to include the source and message type in the topic, just as @Zdenek says above.
However, in the case that you need to do some kind of partial deserialization (especially with proto3), you could do that by using a message struct that has just the fields you want to use, with exactly the same numeric field identifiers.
See Protobuf lazy decoding of sub message
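If the payload itself also has to be self-describing, a sketch of the common-field idea (Envelope, ReadingA and ReadingB are made-up messages; the only requirement is that the shared field keeps the same field number):

using System;
using Google.Protobuf;

public static class PayloadParser
{
    // Assumed .proto (illustrative):
    //   message Envelope { int32 message_type = 1; }
    //   message ReadingA { int32 message_type = 1; /* plus ReadingA's real fields */ }
    // Envelope declares only the shared field; because the field number matches,
    // parsing the full payload as an Envelope yields message_type and treats the
    // rest as unknown fields.
    public static IMessage Parse(byte[] payload)
    {
        var messageType = Envelope.Parser.ParseFrom(payload).MessageType;

        switch (messageType)
        {
            case 1: return ReadingA.Parser.ParseFrom(payload);
            case 2: return ReadingB.Parser.ParseFrom(payload);
            default: throw new InvalidOperationException($"Unknown message type {messageType}.");
        }
    }
}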

Why is passing a dataset to a web service method not good?

Please explain in detail. I have been telling my mentor that everything I've researched on WCF, and many programmers all over the net, say it is NOT good to pass DataSets to the service. Why is that? I created a BUNCH of classes in the service and they work great with the application, but he says that I just wasted time doing all that work and that he has a better way of doing it.
He keeps telling me to create a SINGLE OperationContract. There will be many functions in the service, but the OperationContract will take the string name of the function and the dataset providing the details for that function.
Is his way bad practice? Not safe? I'm just trying to understand why many people say don't use datasets.
The first reason is interoperability. If you expect consumers of your service to be implemented in any other technologies other than .NET, they may have lots of trouble extracting or generating the data in the DataSet, as they will have no equivalent data structure on their end.
Performance can be affected quite a bit, as well. In particular, the serialization format for untyped datasets can be huge because it will contain not just the data, but also the XSD schema for the data set, which can be quite large depending on the complexity of the DataSet. This can make your messages a lot larger, which will use more network bandwidth, take longer to transfer (particularly over high latency links), and will take more resources at the endpoint to parse.
Say the web service you have does something specific, for example it sends a bunch of emails, and it has one method that sends an email. The method should accept an email address, a subject and a body.
Now, if we send a DataSet with the required information, the service has to know the shape of the data and parse it.
Alternatively, if the web service accepted an object with properties for the email address, subject and body, it could be used in more than one place and would be less prone to going wrong due to a malformed DataSet.
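To make the comparison concrete, a minimal sketch of the typed version (the names are illustrative):

using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class EmailRequest
{
    [DataMember] public string Address { get; set; }
    [DataMember] public string Subject { get; set; }
    [DataMember] public string Body { get; set; }
}

[ServiceContract]
public interface IEmailService
{
    // The contract states exactly what the operation needs;
    // a DataSet would leave that implicit in its runtime shape.
    [OperationContract]
    void SendEmail(EmailRequest request);
}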
One more thing: you can get incorrect data using DataSet.
For example a value in the DataSet might look like the following before serialization:
<date_time>12:19:38</date_time>
In the client it would come with an offset specified:
<date_time>12:19:38.0000000-04:00</date_time>
The client code would adjust this to its local time (much like Outlook when you schedule an appointment with someone in a different timezone).
More details can be found here.
Using WCF is not just an implementation decision - it is a design choice. When you choose to use WCF you have to leave many of your treasured OO principles behind and embrace a new set of patterns and principles that are associated with service orientation.
One such principle is that of explicit contracts: A service should have well defined public contracts (see this Wikipedia article). This is crucial for interoperability, but is also important so clients have an accurate picture of what functionality your service provides.
A DataSet is basically just a big bag of "stuff" - there is no limitation to what it could contain - or any well defined contract that explains how I can get data out. By using a DataSet you introduce inherent coupling between the client and the server - the client has to have "inside information" about how the DataSet was created in order to get the data out. By introducing this level of coupling between the client and service you have just negated one of the main motivations for using WCF (precisely that of decoupling the two areas of functionality to allow for independent deployment and/or development lifecycle).
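Put into code, the difference in discoverability looks roughly like this (UserDto stands in for an explicit, made-up contract type):

// With a DataSet the client must already know which tables and columns
// the server happened to put in the "big bag of stuff":
DataSet GetUser(DataSet criteria);

// With an explicit contract, the shape is part of the public interface:
UserDto GetUser(int userId);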

How to understand the meaning of "Contract"

I always see the word contract and it seems to have different meanings or at least it is how it looks to me (I am not a native English speaker) so when I see the word "contract" I cannot be sure what I should understand and expect from it. I don't know if anyone else is having the same trouble but this bugs me a lot. For example, when I see "interface" it makes me think "abstraction, dependency injection, inheritance, etc." and I know what I am looking for and it is getting formed in my mind nicely and easily.
But when it comes to the word contract I cannot visualize a pattern, a class, or whatever it is. Is it something formed using an interface, a class, or maybe an attribute?
For example, there is a class here (in Json.NET) called IContractResolver, and the page explains what it is used for:
The IContractResolver interface provides a way to customize how the
JsonSerializer serializes and deserializes .NET objects to JSON
without placing attributes on your classes.
The explanation is very comprehensible, but I just cannot form the idea when I see "Contract" and I cannot say:
"Umm, I am expecting some methods which do this and that so I can
override it and later I use this class here/there to change/fulfill
some functionality etc."
and this bugs me a lot. I read some articles about it, but they talk about Design by Contract, and that is not very useful for someone who has trouble with the meaning of "contract" itself.
So can someone please explain how I should understand this term and what I should expect when I see it? It would be very nice if you could add some sample code so I can visualize it.
In "Design by Contract", a contract means an agreement between the developer of a library (class, function) and the consumer.
It could be an agreement that no one will pass null for a certain parameter. It could be an agreement that a particular method always completes in less than 200ms. Or that events are raised on the thread which created the object. What's important is that there is some document full of rules and both caller and function agree to those rules.
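A small sketch of what such an agreement can look like in code (the rule itself is the contract; the check merely enforces it):

using System;

public class Greeter
{
    // Contract: 'name' must not be null; the method always returns a non-empty greeting.
    public string Greet(string name)
    {
        if (name == null)
            throw new ArgumentNullException(nameof(name)); // the caller broke the agreement

        return $"Hello, {name}!"; // the implementer's side of the agreement
    }
}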
IContractResolver sounds like it provides a data format. It is not a contract. (There may be a contract that says both endpoints of the communication will use this format for particular messages, but a format is not by itself a complete contract. A contract would need to also describe when each message should be sent, and so on.)
Contract is an agreement among at least two parties. In this context .NET contracts make perfect sense.
In design by contract context, it's similar. Designing by an agreement, you're agreeing on an interface and some verifiable obligations.
Contract is a rather broad term, but it has some specific meanings, so to understand it correctly you should know the context. The general definition is (from here):
A binding agreement between two or more persons or parties
It can be an agreement on what operations are provided (partially synonymous with protocol), like here:
The service contract specifies what operations the service supports.
Or in what format data has to be passed (here):
A data contract is a formal agreement between a service and a client
Also, an interface is sometimes called a contract - because that is exactly what it is: a binding agreement on what can be called and how.
Contracts in data-driven development are also agreements on what data can be passed, what data can be returned, and what are valid states of objects. It is essentially the same thing as in first quote: binding agreement between two different pieces of code.
So if you are not sure about context, try to use common sense. If you are not familiar with context, try to understand or ask:
What does contract define in this case?
How is it defined?
What are the parties that are involved?
Well, contract is one of those burdened words which have many meanings depending on context.
To me, it is any agreement between two parties, so it can be an interface in the sense of .NET interface (i.e. a type), or it can be a set and sequence of messages exchanged between the parties (i.e. a protocol) or in your JSON example a mapping between an object and its persisted form.
It is interesting that you mention "interface" as clearer because it is not necessarily so. I don't associate it with abstraction, dependency injection or inheritance (especially that last one), but more loosely with any kind of protocol. Maybe the reason is that I started with languages which don't have interface built in with specific meaning and as a keyword (e.g. C++). The point is that it also depends on context.

SOA Data Contract Patterns. I am sure the framework I am using introduces redundancy, anyone care to enlighten me?

I'll start in the way most people seem to have taken to, on here....
So I was....
Nah, that's gash. I'll start again (might lose a few points for not being straight to the point, but wth).
Right,
I have inherited a framework which utilises WCF to provide some operation and data contracts.
This might be irksome to some, but I haven't done enough reading on SOA or WCF to garner knowledge about effective patterns (or best practices..) and therefore, don't really have a weighted opinion on my team on this subject, as of yet.
As an example in the framework I am using, there are a bunch of models for users.
Specifically we have the following models (data contracts):
users_Loaded
users_Modify
users_Create
For all intents and purposes these data contracts are exactly the same - in so much as, other than their "type", they have the same members and properties etc., and therein lies my first problem.
The operations which utilise the data contracts have parameters which match the data contract you might want to perform some action with.
Thus the operations utilising the data contracts:
CreateUser(users_Create createdUser, ..., ...)
ModifyUser(users_Modify modifyUser, ..., ...)
GetUser(out users_Load loadedUser, .., ...) (out parameters on left most side of parameter list to boot!?)
Maybe the intent was to delineate the models and the operations from one another, but from my experience a method and its parameter list, usually give a good indication of what we are going to need to do.
Surely one data contract would have sufficed, and maybe even one operation (with an operation type parameter)?
Am I missing the point? Why would you do what I have described?
Thanks.
It sounds like the previous developer(s) were either trying to implement some bastardized Command pattern, or they flat didn't understand WCF.
Long answer short, yes, from what you've said, you should be just fine combining these into a UserDto class that is the DataContract for all three operations. svcutil, for its part, should have no trouble generating one DataContract class on the client side that will work for all three OperationContract methods (or, since you seem to control both sides of this service, just use a shared assembly containing your DTOs on both client and server).
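In code, that could look something like this sketch (UserDto and IUserService are made-up names, reusing the members the three users_* contracts already share):

using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class UserDto
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
    // ...the members the three users_* contracts already share
}

[ServiceContract]
public interface IUserService
{
    [OperationContract] void CreateUser(UserDto user);
    [OperationContract] void ModifyUser(UserDto user);
    [OperationContract] UserDto GetUser(int id);   // return value instead of an out parameter
}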

WCF: Individual methods or a generic ProcessMessage method accepting xml

My company is developing an application that receives data from another company via TCP sockets and xml messages. This is delivered to a single gateway application which then broadcasts it to multiple copies of the same internal application on various machines in our organisation.
WCF was chosen as the technology to handle the internal communications (internally bi-directional). The developers considered two methods.
1. Individual methods exposed by the WCF service for each different message received by the gateway application. The gateway application would parse the incoming external message and call the appropriate WCF service method. The incoming XML would be translated into DataContract DTOs and supplied as arguments to the appropriate WCF method.
2. The internal application exposed a WCF service with one method, "ProcessMessage", which accepted an XML string message as its argument. The internal app would parse and then deserialize the received XML and process it accordingly.
The lead developer thought option two was the better option as it was "easier" to serialize/deserialize the XML. I thought the argument didn't make sense because DataContracts are serialized and deserialized by WCF, and by using WCF we had better typing of our data. In option 2 someone could call the WCF service and pass in any string. I believe option 1 presents a neater interface and makes the application more maintainable and usable.
Both options would still require parsing and validation of the original xml string at some point, so it may also be a question where is the recommended place to perform this validation.
I was wondering what the current thoughts are for passing this kind of information and what people’s opinions on both alternatives are.
Option 1 is suitable if you can ensure that the client always sends serialized representations of data contracts to the server.
However, if you need some flexibility in the serialization/deserialization logic and don't want to be tightly coupled to DataContracts, then option 2 looks good. It is particularly useful when you want to support alternate forms of XML (say Atom representations, raw XML in a custom format, etc.).
Also in option 2 inside the ProcessMessage() method, you have the option of deciding whether or not to deserialize the incoming xml payload (based on request headers or something that is specific to your application).
In option 1, the WCF runtime will always deserialize the payload.
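For comparison, the two styles as service contracts might look like this (a sketch; the type and operation names are made up):

using System.ServiceModel;

// Option 1: one typed operation per message; WCF deserializes into DataContract DTOs.
[ServiceContract]
public interface IInternalService
{
    [OperationContract] void PriceUpdated(PriceUpdate update);
    [OperationContract] void OrderFilled(OrderFill fill);
}

// Option 2: a single generic entry point; parsing, validation and dispatch move inside.
[ServiceContract]
public interface IInternalServiceGeneric
{
    [OperationContract] void ProcessMessage(string xml);
}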
I recently asked a couple of questions around this area: XML vs Objects and XML vs Objects #2. You'll find the answers to those questions interesting.
For our particular problem we've decided on a hybrid approach, with the interface looking something like this:
// Just using public fields for simplicity and no attributes shown.
class WCFDataContract
{
    // Header details
    public int id;
    public int version;
    public DateTime writeDateTime;

    public string xmlBlob;

    // Footer details
    public int anotherBitOfInformation;
    public string andSomeMoreInfo;
    public bool andABooleanJustInCase;
}
The reason we use an xmlBlob is because we own the header and footer schema but not the blob in the middle. Also, we don't really have to process that blob, rather we just pass it to another library (created by another department). The other library returns us more strongly typed data.
Good luck - I know from experience that your option 2 can be quite seductive and can sometimes be hard to argue against without being accused of being overly pure and not pragmatic enough ;)
I hope I understood this right. I think it might make sense to have your gateway app handle all the deserialization and have your internal app expose WCF services that take actual DataContract objects.
This way, your deserialization of the TCP-based XML is more centralized at the gateway, and your internal apps don't need to worry about it, they just need to expose whatever WCF services make sense, and can deal with actual objects.
If you force the internal apps to do the deserialization, you might end up with more maintenance if the format changes or whatever.
So I think I would say option 1 (unless I misunderstood).
