I´m trying to implement session context data for a queued messaging system.
The session handling goes like this:
A front end application authenticates itself and receives a session id.
After that the session id is included in the message headers, so a message handler is provided with a context for e.g. security checks and audit logging. The client may pick up a session if it crashed and continue with it´s work.
So now we want to associate key/value pairs with the session id. But this creates many concurrency problems if the session data changes, as the session data used by the message handler should be that at the time the message was sent.
I see two possible solutions:
Put the associated session data in every message header
Store the session data versioned to the database and use a version id in the message header.
The first makes messages bigger, the second makes the session DB bigger and creates a lot of infrastructural code. I have to save the most current values to the DB in both, so a client may continue it´s work if it crashed or lost connection.
Are there any other solutions? I tend to use the first solution, but want to get some feedback first.
How do others deal with this (e.g. JMS/NServiceBus/Masstransit)?
Update based on Answer:
I´ve chosen to take the route of convincing my team members to use the session data only in the frontend and putting it into the messages if it is required for the message handler.
You didn't really go into detail about why you want to associate key/value pairs with the session concept.
Coming from NServiceBus and Udi Dahan's advice on SOA and service boundaries, this type of session concept tends to rub me the wrong way. My feeling is that message handlers should be, for the most part, fairly deterministic with respect to time. That is, it should run just as well right now, or sit in a queue for awhile and execute the exact same way at some point in the future.
So, my advice would be that for security purposes, go ahead and use message headers if necessary. In NServiceBus you can introduce message handlers from an IT/Ops Service that are configured to execute first in the handler chain, verifying security and stuff like that independent of the actual business logic. In this case, the header information just affects whether the message gets processed or rejected.
When you get to session type information, I would want to carefully analyze those requirements and put the relevant pieces in the message schema itself.
Again, it would be helpful to know the motivation behind the session data in the first place. If you edit your question, perhaps we could identify a way you could reorganize those requirements.
Related
I have a number of domain events that can be dispatched in our enterprise system. For example if someone creates or removes an address.
Should i be passing the entire entity as part of the event or should i just pass the ID. The messages are sent via service bus and consumed in parallel.
If i just send the ID then the entity may not be available on the consumer side, if a delete happening in the meanwhile. I can always just use an active flag and set that to false but what about if the entity was updated meanwhile and it changed something important.
How would i go about handling these cases?
This I believe to be a common dilemma on a service bus, and I believe there is no one perfect solution. I'm assuming the scope here is JUST the Events raised when important domain objects change states (i.e. not Transactional Commands, nor Read / Data Services)
The decision on sending just Event metadata, versus sending a full Reference Message (e.g. a new Customer Aggregate Root) probably has wider implications than just on concurrency / versioning issues relating to latency, e.g. some pros and cons of either approach:
The minimal Event metadata:
Has a much smaller payload (especially useful if you audit all messages on the bus)
Fits nicely into standard envelopes
Is reasonably secure if a delivered to an unauthorised bus endpoint (all the system gets is the knowledge that Customer XYZ has changed, not the actual details).
Whereas a full "aggregate" root Message reference update
Can be overkill if most subscribers aren't interested in the full payload.
Potential security concerns - not all subscribers on the bus may be entitled to the full payload
But is great for replenishing CQRS readstore caches, as endpoints don't need to go back to the source of truth to fetch data once they know their data is out of date - the data has already been provided.
So I guess the final decision will go with what you primarily intend doing with your EDA Events (Keeping CQRS caches updated vs Triggering BPM Workflows vs Monitoring CEP Rules etc). You might decide to go with a hybrid e.g. broadcast Event Data widely, but then route full Messages to only trusted endpoints (The event meta data can likely be projected from the full payload, so the Originating / Source of Truth system can just send one message payload to the bus after each state change).
To answer your data consistency question, I believe you will need to accept that the data will only be Eventually Consistent, and that latencies will cause temporary inconsistencies across the enterprise. I believe the best pattern here is to add a hash or timestamp to each Message obtained from the originating Source of Truth, which needs to be added to any Commands which have used this version of the data as an assumption.
Then, when the command handling system processes the command, it can then check this hash against the current 'true' version (based on the actual line of business system database, NOT against a readstore Cache), and will need to fail the command if the hash / timestamps do not match up - i.e. the optimistic concurrency pattern.
I have this scenario, and I don't really know where to start. Suppose there's a Web service-like app (might be API tho) hosted on a server. That app receives a request to proccess some data (through some method we will call processData(data theData)).
On the other side, there's a robot (might be installed on the same server) that procceses the data. So, The web-service inserts the request on a common Database (both programms have access to it), and it's supposed to wait for that row to change and send the results back.
The robot periodically check the database for new rows, proccesses the data and set some sort of flag to that row, indicating that the data was processed.
So the main problem here is, what should the method proccessData(..) do to check for the changes of the data row?.
I know one way to do it: I can build an iteration block that checks for the row every x secs. But i don't want to do that. What I want to do is to build some sort of event listener, that triggers when the row changes. I know it might involve some asynchronous programming
I might be dreaming, but is that even possible in a web enviroment.?
I've been reading about a SqlDependency class, Async and AWait classes, etc..
Depending on how much control you have over design of this distributed system, it might be better for its architecture if you take a step back and try to think outside the domain of solutions you have narrowed the problem down to so far. You have identified the "main problem" to be finding a way for the distributed services to communicate with each other through the common database. Maybe that is a thought you should challenge.
There are many potential ways for these components to communicate and if your design goal is to reduce latency and thus avoid polling, it might in fact be the right way for the service that needs to be informed of completion of this work item to be informed of it right away. However, if in the future the throughput of this system has to increase, processing work items in bulk and instead poll for the information might become the only feasible option. This is also why I have chosen to word my answer a bit more generically and discuss the design of this distributed system more abstractly.
If after this consideration your answer remains the same and you do want immediate notification, consider having the component that processes a work item to notify the component(s) that need to be notified. As a general design principle for distributed systems, it is best to have the component that is most authoritative for a given set of data to also be the component to answer requests about that data. In this case, the data you have is the completion status of your work items, so the best component to act on this would be the component completing the work items. It might be better for that component to inform calling clients and components of that completion. Here it's also important to know if you only write this data to the database for the sake of communication between components or if those rows have any value beyond the completion of a given work item, such as for reporting purposes or performance indicators (KPIs).
I think there can be valid reasons, though, why you would not want to have such a call, such as reducing coupling between components or lack of access to communicate with the other component in a direct manner. There are many communication primitives that allow such notification, such as MSMQ under Windows, or Queues in Windows Azure. There are also reasons against it, such as dependency on a third component for communication within your system, which could reduce the availability of your system and lead to outages. The questions you might want to ask yourself here are: "How much work can my component do when everything around it goes down?" and "What are my design priorities for this system in terms of reliability and availability?"
So I think the main problem you might want to really try to solve fist is a bit more abstract: how should the interface through which components of this distributed system communicate look like?
If after all of this you remain set on having the interface of communication between those components be the SQL database, you could explore using INSERT and UPDATE triggers in SQL. You can easily look up the syntax of those commands and specify Stored Procedures that then get executed. In those stored procedures you would want to check the completion flag of any new rows and possibly restrain the number of rows you check by date or have an ID for the last processed work item. To then notify the other component, you could go as far as using the built-in stored procedure XP_cmdshell to execute command lines under Windows. The command you execute could be a simple tool that pings your service for completion of the task.
I'm sorry to have initially overlooked your suggestion to use SQL Query Notifications. That is also a feasible way and works through the Service Broker component. You would define a SqlCommand, as if normally querying your database, pass this to an instance of SqlDependency and then subscribe to the event called OnChange. Once you execute the SqlCommand, you should get calls to the event handler you added to OnChange.
I am not sure, however, how to get the exact changes to the database out of the SqlNotificationEventArgs object that will be passed to your event handler, so your query might need to be specific enough for the application to tell that the work item has completed whenever the query changes, or you might have to do another round-trip to the database from your application every time you are notified to be able to tell what exactly has changed.
Are you referring to a Message Queue? The .Net framework already provides this facility. I would say let the web service manage an application level queue. The robot will request the same web service for things to do. Assuming that the data needed for the jobs are small, you can keep the whole thing in memory. I would rather not involve a database, if you don't already have one.
At our organization we have a SQL Server 2005 database and a fair number of database clients: web sites (php, zope, asp.net), rich clients (legacy fox pro). Now we need to pass certain events from the core database with other systems (MongoDb, LDAP and others). Messaging paradigm seems pretty capable of solving this kind of problem. So we decided to use RabbitMQ broker as a middleware.
The problem of consuming events from the database at first seemed to have only two possible solutions:
Poll the database for outgoing messages and pass them to a message broker.
Use triggers on certain tables to pass messages to a broker on the same machine.
I disliked the first idea due to latency issues which arise when periodical execution of sql is involved.
But event-based trigger approach has a problem which seems unsolvable to me at the moment. Consider this scenario:
A row is inserted into a table.
Trigger fires and sends a message (using a CLR Stored Procedure written in C#)
Everything is ok unless transaction which writes data is rolled back. In this case data will be consistent, but the message has already been sent and cannot be rolled back because trigger fires at the moment of writing to the database log, not at the time of transaction commit (which is a correct behaviour of a RDBMS).
I realize now that I'm asking too much of triggers and they are not suitable for tasks other than working with data.
So my questions are:
Has anyone managed to extract data events using triggers?
What other methods of consuming data events can you advise?
Is Query Notification (built on top of Service Broker) suitable in my situation?
Thanks in advance!
Lest first cut out of the of the equation the obvious misfit: Query Notification is not right technology for this, because is designed to address cache invalidation of relatively stable data. With QN you'll only know that table has changed, but you won't be able to know what had changed.
Kudos to you for figuring out why triggers invoking SQLCRL won't work: the consistency is broken on rollback.
So what does work? Consider this: BizTalk Server. In other words, there is an entire business built around this problem space, and solutions are far from trivial (otherwise nobody would buy such products).
You can get quite far though following a few principles:
decoupling. Event based triggers are OK, but do not send the message from the trigger. Aside from the consistency issue on rollback you also have the latency issue of having every DML operation now wait for an external API call (the RabbitMQ send) and the availability issue of the external API call failure (if RabbitMQ is unavailable, your DB is unavailable). The solution is to have the trigger use ordinary tables as queues, the trigger will enqueue a message in the local db queue (ie. will insert into this table) and and external process will service this queue by dequeueing the messages (ie. delete from the table) and forwarding them to RabbitMQ. This decouples the transaction from the RabbitMQ operation (the external process is able to see the message only if the original xact commits), but the cost is some obvious added latency (there is an extra hop involved, the local table acting as a queue).
idempotency. Since RabbitMQ cannot enroll in distributed transactions with the database you cannot guarantee atomicity of the DB operation (the dequeue from local table acting as queue) and the RabbitMQ operation (the send). Either one can succeed when the other failed, and there is simply no way around it w/o explicit distributed transaction enrollment support. Which implies that the application will send duplicate messages every once in a while (usually when things already go bad for some reason). And a quick heads up: enrolling into the act of explicit 'acknowledge' messages and send sequence numbers is a loosing battle as you'll quickly discover that you're reinventing TCP on top of messaging, that road is paved with bodies.
tolerance. For the same reasons as the item above every now in a while a message you believe was sent will never make it. Again, what damage this causes is entirely business specific. The issue is not how to prevent this situation (is almost impossible...) but how to detect this situation, and what to do about it. No silver bullet, I'm afraid.
You do mention in passing Service Broker (the fact that is powering Query Notification is the least interestign aspect of it...). As a messaging platform built into SQL Server which offers Exactly Once In Order delivery guarantees and is fully transacted it would solve all the above pain points (you can SEND from triggers withouth impunity, you can use Activation to solve the latency issue, you'll never see a duplicate or a missing message, there are clear error semantics) and some other pain points I did not mention before (consistency of backup/restore as the data and the messages are on the same unit of storage - the database, cosnsitnecy of HA/DR failover as SSB support both database mirroring and clustering etc). The draw back though is that SSB is only capable of talking to another SSB service, in other words it can only be used to exchange messages between two (or more) SQL Server instances. Any other use requires the parties to use a SQL Server to exchange messages. But if your endpoints are all SQL Server, then consider that there are some large scale deployments using Service Broker. Note that endpoints like php or asp.net can be considered SQL Server endpoints, they are just programming layers on top of the DB API, a different endpoint would, say, the need to send messages from handheld devices (phones) directly to the database (and eve those 99% of the time go through a web service, which means they can reach a SQL Server ultimately). Another consideration is that SSB is geared toward throughput and reliable delivery, not toward low latency. Is definitely not the technology to use to get back the response in a HTTP web request, for instance. IS the technology to use to submit for processing something triggered by a web request.
Remus's answer lays out some sound principals for generating and handling events. You can initiate the pushing of events from a trigger to achieve low latency.
You can achieve everything necessary from a trigger. We will still decouple this into two components: a trigger that generates the events and a local reader that reads the events.
The first component is the trigger.
Make a CLR trigger that prepares what needs to be done when the transaction commits.
Create a System.Transactions.IEnlistmentNotification that always agrees to be prepared, and whose void Commit(System.Transactions.Enlistment) method executes the prepared action.
In the trigger, call System.Transactions.Transaction.Current.EnlistVolatile(enlistmentNotification, System.Transactions.EnlistmentOptions.None)
You'll want your action to be short and sweet, like appending the data to a lockless queue in memory or updating some other state in memory. Don't try to communicate with other machines or processes. Don't write to a disk (if you wanted to write to a disk, just make an ordinary trigger that inserts into a queue table). You'll need to be careful to make sure your assembly is loaded only once so that any shared static state will be unique; this is easiest to do if your static state is in a top level assembly that isn't referenced by other assemblies, so no other assemblies will try to load it.
You will also need to either
initialize your state in such a way that it will be correct even if the system was restarted without sending all the previously queued messages (since a short, in memory queue will not be durable). This means you might be resending messages, so they will need to be idempotent. or
rely on the tolerance of another component to pick up on missed messages
The second component reads the state that is update by the trigger. Make a separate CLR component that reads from your queue or state, and does whatever you need done (like send an idempotent message to a messaging system, record that it was sent, whatever). If this component can fail (hint: it can), you will need some form of tolerance, which may belong in another system. You can achieve low latency by having the trigger signal the second component when new state is available.
One architectural possibility is to have the trigger put the event in memory on commit for another low-latency component to pick up and have the second component send a low-latency, low-reliability copy of an idempotent message. You can pair that with a more reliably or durable messaging system, such as SSB, that will reliably and durably, but with grater latency, send the same idempotent message later.
First of all, I am creating a something like a client/server solution using a standard ASP.NET website - I do know this method is not adviced, and most people would love to scream "COMET!" or "HTML5 Sockets!" - but please don't ;-) !
What I am doing...
I am creating an MMORPG on a website.
I have several clients whom need to be in contact at the same time. This is done by a global object in the Application scope.
My problem
I need to invoke an event to several clients. For instance, when an attack has been performed, I need to update some graphics. The attack logic is resolved in the global object, but each of the clients has to respond to this.
Right now I do the following:
fightTrace.Reciever.InvokeMoveEnded(this);
fightTrace.FiredBy.InvokeMoveEnded(this);
(This is a kind of observer pattern)
What now happends is a race condition. The one who loads the page_load event will get both of these events, and the one who is not running them, will experience no changes in the UI.
So what is it I really want?
What I really need is some genuine and nice way to create an observer pattern through the application state. I need to send an event out to every "listener" which is in this case is a client, and then do some update.
One way to do this is some session-thing, with true/false.. But I would really like some better way!
Thanks!
If I understood your context correctly then whenever the state of your application state object is changed you want to synchronize all the clients of your applications. What you are forgetting here is the stateless behavior of HTTP protocol. Once a response is sent the connection is lost you need to send an HTTP request again to be served again. However you can emulate some thing using State Management and Ajax based short and timely updates to simulate a connected environment. However I've to utter the words which you don't want to hear. Not recommended.
Instead what you can do is to save the state of application object and whenever a request comes serve the response based on updated state of your object. Any how a client has to initiate a request.
I'm currently in the process of building an ASP.NET MVC web application in c#.
I want to make sure that this application is built so that it can scale out in the future without the need for major re-factoring.
I'm quite keen on using some sort of queue to post any writes to my database base to and have a process which polls that queue asynchronously to perform the update. Once this data has been posted back to the database the client then needs to be updated with the new information. The implication here being that the process to write the data back to the database could take a short while based on business rules executing on the server.
My question is what would be the best way to handle the update from the client\browser perspective.
I'm thinking along the lines of posting the data back to the server and adding it to the queue and immediately sending a response to the client then polling at some frequency to get the updated data. Any best practices or patterns on this would be appreciated.
Also in terms of reading data from the database would you suggest using any particular techniques or would reading straight from db be sufficient given my scenario.
Update
Thought I'd post an update on this as it's been a while. We've actually ended up using Windows Azure but the solution is applicable to other platforms.
What we've ended up doing is using the Windows Azure Queue to post messages\commands to. This is a very quick process and returns immediately. We then have a worker role which processes these messages on another thread. This allows us to minimize any db writes\updates on the web role in theory allowing us to scale more easily.
We handle informing the user via emails or even silently depending on the type of data we are dealing with.
Not sure if this helps but why dont you have an auto refresh on the page every 30 seconds for example. This is sometimes how news feeds work on sports websites, saying the page will be updated every x minutes.
<meta http-equiv="refresh" content="120;url=index.aspx">
Why not let the user manually poll the status of the request? This is how your typical e-commerce app is implemented. When you purchase something online, the order is submitted to a queue for fullfillment. After it's submitted, the user is presented with a "Thank you for your order" page and a link where they can check the status of the order. The user can visit the link anytime to check the status, no need for an auto-poll mechanism.
Is your scenario so different from this?
Sorry in my previous answer I might have misunderstood. I was talking of a "queue" as something stored in a SQL DB, but it seems on reading your post again you are may be talking about a separate message queueing component like MSMQ or JMS?
I would never put a message queue in the front end, between a user and backend SQL DB. Queues are good for scaling across time, which is suitable between backend components, where variances in processing times are acceptable (e.g. order fulfillment)... when dealing with users, this variance is usually not acceptable.
While I don't know if I agree with the logic of why, I do know that something like jQuery is going to make your life a LOT easier. I would suggest making a RESTful web API that your client-side code consumes. For example, you want to post a new order to the system and have the client responsive? Make a post to www.mystore.com/order/create and have that return the new URI to access the order (i.e. order#) as a URI (www.mystore.com/order/1234). That response is then stored in the client code and a jQuery call is setup to poll for a response or stop polling on an error.
For further reading check out this Wikipedia article on the concept of REST.
Additionally you might consider the Reactive Extensions for .NET and within that check out the RxJS sub-project which has some pretty slick ways of handling with the polling problem without causing you to write the polling code yourself. Fun things to play with!
Maybe you can add a "pending transactions" area to the UI. When you queue a transaction, add it to the user's "pending transactions" list.
When it completes, show that in the user's "pending transactions" list the next time they request a new page.
You can make a completed transaction stay listed until the user clicks on it, or for a predetermined length of time.