I'm trying to design a system which reports activity events to a database via a web service. The web service and database have already been built (COTS software) - all I have to do is provide the event source.
The catch, though, is that the event source needs to be fault tolerant. We have multiple replicated databases that I can talk to, so if the web service or database I'm talking to goes down, the software can quickly switch to another one that's up.
What I need help with though is the case when all the databases are down. I've already designed a queue that will hold on to the events as they pile in (and burst them out once the connection is restored), but the queue is an in-memory structure: if my app crashes in this state, or if power is lost, etc., then all the events in the queue are lost. This is unacceptable. What I need is a way to persist the events so that when a database comes back online I can send a burst of queued-up events, even in the event of power loss or crash.
I know that I don't want to re-implement the queue itself to use the file system as a backing store. This would work (and I've tried it) - but that method slows the system down dramatically as the hard drive becomes a bottleneck. Aside from this though, I can't think of a single way to design this system such that all the events are safely stored on the hard drive only when access to the database isn't available.
Does anyone have any ideas? =)
When I need messaging with fault tolerance (and/or guaranteed delivery, which based on your description I am guessing you also need), I usually turn to MSMQ. It provides both fault tolerance (messages are stored on disk in case of machine restart) and guaranteed delivery (messages will automatically and continually resend until they are received), as well as transactional sends and receives, message journaling, poison message handling, and other features.
I have been able to achieve a throughput of several thousand messages per second using MSMQ. Frankly, I am not sure that you will get too much better than that while still being fault tolerant.
msmq. I think you could also take a look at the notion of Job object.
I would agree with guys that better to use out of the box system like MSMQ with a set of messaging patterns in hand.
Anyway, if you have to do it yourself, you can use in memory database instead of serializing data yourself, I believe it should be faster enough.
Related
We have an application that reads and writes to a third party data storage.
The code of that data storage is closed source, we do not know about it and can not change it.
There is only a slim API that allows reading and writing to it.
An pessimistic offline lock helps to span transactions and have concurrent applications work with it. That will work fine I believe.
But now we have the problem that other software will also write and read to that storage
and our application shall update when changes in that data storage happen. The data storage itself does not provide any notification. The third party software will not change some global state that indicates that something has changed.
Is there any kind of pattern or best practise to "observe" that data storage and
publish events to update all clients (of our software)?
I really do not want to periodically read, compare and publish events if it is not
absolutely the last resort. Perhaps someone has a better idea here?
A non-System implemented Pessimistic Offline Lock requires cooperation/participation/enforcement among all possible modifers of the data. This is generally not possible and is one of the two reasons that this approach is rarely taken in modern software. To do anything remotely like this (i.e., with multiple heterogenuous writers) in a useful way requires some kind help/assistance from the System facilities themselves. (The second reason is the issues of determining and resolving abandoned locks, very problematic).
As for possible solutions, then from a purely design viewpoint, either optimistic offline locks, which still need some System help, but much less, or avoid the issue altogether through more detailed state-progression/control in your data model.
My approach, however, would be to set-aside the design question (initially) recognizing that this is primarily an issue of the data-store's capabilities and start there, looking to use System-provided lock/transaction control, (which both 1: usually works and 2: is how it is usually done).
AFAIK, issues of synchronizing multi-writer access always have to start with "What tools/controls/facilities are available to constrain, divert and/or track the out-of-application writers?" What you can accomplish is practically limited by those facilities.
For instance, if you can force all access through a service of your own, then you can do almost anything. But if all you have is the OS's file-locking and file-modification-dates, then you are a lot more constrained. And if you don't have even that, then there's not much you can do.
In fact I do not have direct access to the data store, it is hosted on
some server and I have no control over the other applications that
read and write to it. Right now, the best I can think of is having a
service as a proxy which periodically queries the store, compares it
to an older state and fires update events to my clients if some other
application has altered it (and fire some other event if my
application alters it to notify my own clients, leaving the other
applications with their own problems). It sound not very good to me,
but it probably does the job.
Yep, that's about all you can do, and that only supports Optimistic Concurrency (partially), not Pessimistic. You might get improvements by adding some kind of checksum/hash to your stored data, but that's only an optimization.
At our organization we have a SQL Server 2005 database and a fair number of database clients: web sites (php, zope, asp.net), rich clients (legacy fox pro). Now we need to pass certain events from the core database with other systems (MongoDb, LDAP and others). Messaging paradigm seems pretty capable of solving this kind of problem. So we decided to use RabbitMQ broker as a middleware.
The problem of consuming events from the database at first seemed to have only two possible solutions:
Poll the database for outgoing messages and pass them to a message broker.
Use triggers on certain tables to pass messages to a broker on the same machine.
I disliked the first idea due to latency issues which arise when periodical execution of sql is involved.
But event-based trigger approach has a problem which seems unsolvable to me at the moment. Consider this scenario:
A row is inserted into a table.
Trigger fires and sends a message (using a CLR Stored Procedure written in C#)
Everything is ok unless transaction which writes data is rolled back. In this case data will be consistent, but the message has already been sent and cannot be rolled back because trigger fires at the moment of writing to the database log, not at the time of transaction commit (which is a correct behaviour of a RDBMS).
I realize now that I'm asking too much of triggers and they are not suitable for tasks other than working with data.
So my questions are:
Has anyone managed to extract data events using triggers?
What other methods of consuming data events can you advise?
Is Query Notification (built on top of Service Broker) suitable in my situation?
Thanks in advance!
Lest first cut out of the of the equation the obvious misfit: Query Notification is not right technology for this, because is designed to address cache invalidation of relatively stable data. With QN you'll only know that table has changed, but you won't be able to know what had changed.
Kudos to you for figuring out why triggers invoking SQLCRL won't work: the consistency is broken on rollback.
So what does work? Consider this: BizTalk Server. In other words, there is an entire business built around this problem space, and solutions are far from trivial (otherwise nobody would buy such products).
You can get quite far though following a few principles:
decoupling. Event based triggers are OK, but do not send the message from the trigger. Aside from the consistency issue on rollback you also have the latency issue of having every DML operation now wait for an external API call (the RabbitMQ send) and the availability issue of the external API call failure (if RabbitMQ is unavailable, your DB is unavailable). The solution is to have the trigger use ordinary tables as queues, the trigger will enqueue a message in the local db queue (ie. will insert into this table) and and external process will service this queue by dequeueing the messages (ie. delete from the table) and forwarding them to RabbitMQ. This decouples the transaction from the RabbitMQ operation (the external process is able to see the message only if the original xact commits), but the cost is some obvious added latency (there is an extra hop involved, the local table acting as a queue).
idempotency. Since RabbitMQ cannot enroll in distributed transactions with the database you cannot guarantee atomicity of the DB operation (the dequeue from local table acting as queue) and the RabbitMQ operation (the send). Either one can succeed when the other failed, and there is simply no way around it w/o explicit distributed transaction enrollment support. Which implies that the application will send duplicate messages every once in a while (usually when things already go bad for some reason). And a quick heads up: enrolling into the act of explicit 'acknowledge' messages and send sequence numbers is a loosing battle as you'll quickly discover that you're reinventing TCP on top of messaging, that road is paved with bodies.
tolerance. For the same reasons as the item above every now in a while a message you believe was sent will never make it. Again, what damage this causes is entirely business specific. The issue is not how to prevent this situation (is almost impossible...) but how to detect this situation, and what to do about it. No silver bullet, I'm afraid.
You do mention in passing Service Broker (the fact that is powering Query Notification is the least interestign aspect of it...). As a messaging platform built into SQL Server which offers Exactly Once In Order delivery guarantees and is fully transacted it would solve all the above pain points (you can SEND from triggers withouth impunity, you can use Activation to solve the latency issue, you'll never see a duplicate or a missing message, there are clear error semantics) and some other pain points I did not mention before (consistency of backup/restore as the data and the messages are on the same unit of storage - the database, cosnsitnecy of HA/DR failover as SSB support both database mirroring and clustering etc). The draw back though is that SSB is only capable of talking to another SSB service, in other words it can only be used to exchange messages between two (or more) SQL Server instances. Any other use requires the parties to use a SQL Server to exchange messages. But if your endpoints are all SQL Server, then consider that there are some large scale deployments using Service Broker. Note that endpoints like php or asp.net can be considered SQL Server endpoints, they are just programming layers on top of the DB API, a different endpoint would, say, the need to send messages from handheld devices (phones) directly to the database (and eve those 99% of the time go through a web service, which means they can reach a SQL Server ultimately). Another consideration is that SSB is geared toward throughput and reliable delivery, not toward low latency. Is definitely not the technology to use to get back the response in a HTTP web request, for instance. IS the technology to use to submit for processing something triggered by a web request.
Remus's answer lays out some sound principals for generating and handling events. You can initiate the pushing of events from a trigger to achieve low latency.
You can achieve everything necessary from a trigger. We will still decouple this into two components: a trigger that generates the events and a local reader that reads the events.
The first component is the trigger.
Make a CLR trigger that prepares what needs to be done when the transaction commits.
Create a System.Transactions.IEnlistmentNotification that always agrees to be prepared, and whose void Commit(System.Transactions.Enlistment) method executes the prepared action.
In the trigger, call System.Transactions.Transaction.Current.EnlistVolatile(enlistmentNotification, System.Transactions.EnlistmentOptions.None)
You'll want your action to be short and sweet, like appending the data to a lockless queue in memory or updating some other state in memory. Don't try to communicate with other machines or processes. Don't write to a disk (if you wanted to write to a disk, just make an ordinary trigger that inserts into a queue table). You'll need to be careful to make sure your assembly is loaded only once so that any shared static state will be unique; this is easiest to do if your static state is in a top level assembly that isn't referenced by other assemblies, so no other assemblies will try to load it.
You will also need to either
initialize your state in such a way that it will be correct even if the system was restarted without sending all the previously queued messages (since a short, in memory queue will not be durable). This means you might be resending messages, so they will need to be idempotent. or
rely on the tolerance of another component to pick up on missed messages
The second component reads the state that is update by the trigger. Make a separate CLR component that reads from your queue or state, and does whatever you need done (like send an idempotent message to a messaging system, record that it was sent, whatever). If this component can fail (hint: it can), you will need some form of tolerance, which may belong in another system. You can achieve low latency by having the trigger signal the second component when new state is available.
One architectural possibility is to have the trigger put the event in memory on commit for another low-latency component to pick up and have the second component send a low-latency, low-reliability copy of an idempotent message. You can pair that with a more reliably or durable messaging system, such as SSB, that will reliably and durably, but with grater latency, send the same idempotent message later.
I am creating a mass mailer application, where a web application sets up a email template and then queues a bunch of email address for sending. The other side will be a Windows service (or exe) that will poll this queue, picking up the messages for sending.
My question is, what would the advantage be of using SQL Service Broker (or MSMQ) over just creating my own custom queue table?
Everything I'm reading is suggesting I use Service Broker, but I really don't see what the huge advantage over a flat table (that would be a lot simpler to work with for me). For reference the application will be used to send 50,000-100,000 emails almost daily.
Do you know how to implement a queue over a flat table? This is not a silly question, implementing a queue over a table correctly is much harder than it sounds. Queue-like-tables are notoriously deadlock prone and you need to carefully consider the table design and the enqueue and dequeue operations. Also, do you know how to scale your pooling of the table? And how are you goind to handle retries and timeouts (ie. what timers are used for)?
I'm not saying you should use SSB. The lerning curve is very steep and is primarily a distributed applicaiton platform, not a local queueing product so some features, like dialogs, will actually be obstacles for you rather than advantages. I'm just saying that you must consider also the difficulties of flat-table-queues. If you never implemented a flat-table-queue then be warned, there are many dragons under that bridge.
50k-100k messages per day is nothing, is only one message per second. If you want 100k per minute, then we have something to talk about.
If you every need to port to another vendor's database, you will have less problem if you used normal tables.
As you seem to only have one reader and one write from your queue, I would tend to use a standard table until you hit problem. However if you start to feel the need to use “locking hints” etc, that the time to switch to the Service Broker Queues.
I would not use MSMQ, if both the sender and the reader need a database connection to work. MSMQ would be good if the sender did not talk to the database at all, as it lets the sender keep working when the database is down. However having to setup and maintain both the MSMQ and the database is likely to be more work then it is worth for most systems.
For advantages of Service Broker see this link:
http://msdn.microsoft.com/en-us/library/ms166063.aspx
In general we try to use a tool or standard functionality rather than building things ourselves. This lowers the cost and can make upgrading easier.
I know this is old question, but is sufficiently abstract to be relevant for long enough time.
After using both paradigms I would suggest flat table. It is surprisingly scalable and nifty. Correct hints need to be used.
Once the application goes distributed, or starts using mutiple allways on groups with different RW and RO servers, the Service Broker (or any other method of distributed communication) becomes a neccessity.
Flat table
needs only few hints (higly dependent on isolation level) to work scalably and reliably in the consumer (READPAST, UPDLOCK, ROWLOCK)
the order of message processing is not set in stone
the consumer must make sure that the message stays in the queue if the processing fails
needs some polling mechanism (job, CDC (here lies madness :)), external application...)
turn of maintenance jobs and automatic statistics for the table
Service broker
needs extremely overblown "infrastructure" (message types, contracts, services, queues, activation procedures, must be enabled after each server restart, conversations need to be correctly created and dropped...)
is extremely opaque - we have spent ages trying to make it run after it mysteriously stopped working
there is a predefined order of message processing
the tables it uses can cause deadlocks themselfs if SB is overused
is the only way (except for linked servers...) to send messages directly from database on RW server of one HA group to a database that is RO in this HA group (without any external app)
is the only way to send messages between different servers (linked servers are a big NONO (unless they become an YESYES - you know the drill - it depends)) (without any external app)
I have a CRUD winform App that uses Merge Replication to allow "disconnected" functionality. My question is; If I am doing all initializing and synchronizing programatically with RMO (like HERE) does it matter if it is a Push or Pull?
What would be a difference?
I understand the differences between the two (see HERE) but it seems that if I am only interacting through RMO the differences become a little fuzzy. If I can it seems that, even though Pull is favored for Merge Replication, I would want to use Push to make the Server bear the brunt and easier management.
Also, due to our environment, I do not need "real-time" updates. Syncing, in either case, will be fired from a UI event.
Does anyone have any experience with this?
Thanks!
We use merge replication via RMO on 20+ client systems that are occasionally connected. As far as I know, you should go with pull subscriptions. I don't know if you could make it work with push subscriptions but I don't advise trying. As you say, the client system will be requesting the sync, which fits the definition of a pull subscription.
The "Use When" section in your second link is pretty clear in its recommendation for push in this case:
Data will typically be synchronized on demand or on a schedule rather than
continuously.
The publication has a large number of Subscribers, and/or it would be too
resource-intensive to run all the
agents at the Distributor.
Subscribers are autonomous, disconnected, and/or mobile.
Subscribers will determine when they
will connect and synchronize changes.
Most often used with merge replication.
I have a table with a heavy load(many inserts/updates/deletes) in a SQL2005 database. I'd like to do some post processing for all these changes in as close to real time as possible(asynchronously so as not to lock the table in any way). I've looked a number of possible solutions but just can't seem to find that one neat solution that feels right.
The kind of post processing is fairly heavy as well, so much so that the windows listener service is actually going to pass the processing over to a number of machines. However this part of the application is already up and running, completetly asynchronous, and not what I need help with - I just wanted to mention this simply because it affects the design decision in that we couldn't just load up some CLR object in the DB to complete the processing.
So, The simple problem remains: data changes in a table, I want to do some processing in c# code on a remote server.
At present we've come up with using a sql trigger, which executes "xp_cmdshell" to lauch an exe which raises an event which the windows service is listening for. This just feels bad.
However, other solutions I've looked at online feel rather convoluted too. For instance setting up SQLCacheDependancy also involves having to setup Service broker. Another possible solution is to use a CLR trigger, which can call a webservice, but this has so many warnings online about it being a bad way to go about it, especially when performance is critical.
Idealy we wouldn't depnd on the table changes but would rather intercept the call inside our application and notify the service from there, unfortunately though we have some legacy applications making changes to the data too, and monitoring the table is the only centralised place at the moment.
Any help would be most appreciated.
Summary:
Need to respond to table data changes in real time
Performance is critical
High volume of traffic is expected
Polling and scheduled tasks are not an option(or real time)
Implementing service broker too big (but might be only solution?)
CLR code is not yet ruled out, but needs to be perfomant if suggested
Listener / monitor may be remote machine(likely to be same phyisical network)
You really don't have that many ways to detect changes in SQL 2005. You already listed most of them.
Query Notifications. This is the technology that powers SqlDependency and its derivatives, you can read more details on The Mysterious Notification. But QN is designed to invalidate results, not to pro-actively notify change content. You will only know that the table has changes, without knowing what changed. On a busy system this will not work, as the notifications will come pretty much continously.
Log reading. This is what transactional replication uses and is the least intrusive way to detect changes. Unfortunately is only available to internal components. Even if you manage to understand the log format, the problem is that you need support from the engine to mark the log as 'in use' until you read it, or it may be overwritten. Only transactional replication can do this sort of special marking.
Data compare. Rely on timestamp columns to detect changes. Is also pull based, quite intrussive and has problems detecting deletes.
Application Layer. This is the best option in theory, unless there are changes occuring to the data outside the scope of the application, in which case it crumbles. In practice there are always changes occuring outside the scope of the application.
Triggers. Ultimately, this is the only viable option. All change mechanisms based on triggers work the same way, they queue up the change notification to a component that monitors the queue.
There are always suggestions to do a tightly coupled, synchronous notification (via xp_cmdshell, xp_olecreate, CLR, notify with WCF, you name it), but all these schemes fail in practice because they are fundamentally flawed:
- they do not account for transaction consistency and rollbacks
- they introduce availability dependencies (the OLTP system cannot proceed unless the notified component is online)
- they perform horribly as each DML operation has to wait for an RPC call of some form to complete
If the triggers do not actually actively notify the listeners, but only queue up the notifications, there is a problem in monitoring the notifications queue (when I say 'queue', I mean any table that acts as a queue). Monitoring implies pulling for new entries in the queue, which means balancing the frequency of checks correctly with the load of changes, and reacting to load spikes. This is not trivial at all, actually is very difficult. However, there is one statement in SQL server that has the semantics to block, without pulling, until changes become available: WAITFOR(RECEIVE). That means Service Broker. You mentioned SSB several times in your post, but you are, rightfuly so, scared of deploying it because of the big unknown. But the reality is that it is, by far, the best fit for the task you described.
You do not have to deploy a full SSB architecture, where the notificaition is delivered all the way to the remote service (that would require a remote SQL instance anyway, even an Express one). All you need to accomplice is to decouple the moment when the change is detected (the DML trigger) from the moment when the notification is delivered (after the change is commited). For this all you need is a local SSB queue and service. In the trigger you SEND a change notification to the local service. After the original DML transaction commits, the service procedure activates and delivers the notification, using CLR for instance. You can see an example of something similar to this at Asynchronous T-SQL.
If you go down that path there are some tricks you'll need to learn to achieve high troughput and you must understant the concept of ordered delivery of messages in SSB. I reommend you read these links:
Reusing Conversations
Writing Service Broker Procedures
SQL Connections 2007 Demo
About the means to detect changes, SQL 2008 apparently adds new options: Change Data Capture and Change Tracking. I emphasizes 'apparently', since they are not really new technologies. CDC uses log reader and is based on the existing Transactional replication mechanisms. CT uses triggers and is very similar to existing Merge replication mechanisms. They are both intended for occasionally connected systems that need to sync up and hence not appropiate for real-time change notification. They can populate the change tables, but you are left with the task to monitor these tables for changes, which is exactly from where you started.
This could be done in many ways. below method is simple since you dont want to use CLR triggers and sqlcmd options.
Instead of using CLR triggers you can create the normal insert trigger which updates the dedicated tracking table on each insert.
And develop dedicated window service which actively polls on the tracking table and update the remote service if there is any change in the data and set the status in tracking table to done (so it wont be picked again)..
EDIT:
I think Microsoft sync services for ADO.Net can work for you. Check out the below links. It may help you
How to: Use SQL Server Change Tracking - sql server 2008
Use a Custom Change Tracking System - below sql server 2008
In similar circumstances we are using CLR trigger that is writing messages to the queue (MSMQ). Service written in C# is monitoring the queue and doing post-processing.
In our case it is all done on the same server, but you can send those messages directly to the remote queue, on a different machine, totally bypassing "local listener".
The code called from trigger looks like this:
public static void SendMsmqMessage(string queueName, string data)
{
//Define the queue path based on the input parameter.
string QueuePath = String.Format(".\\private$\\{0}", queueName);
try
{
if (!MessageQueue.Exists(QueuePath))
MessageQueue.Create(QueuePath);
//Open the queue with the Send access mode
MessageQueue MSMQueue = new MessageQueue(QueuePath, QueueAccessMode.Send);
//Define the queue message formatting and create message
BinaryMessageFormatter MessageFormatter = new BinaryMessageFormatter();
Message MSMQMessage = new Message(data, MessageFormatter);
MSMQueue.Send(MSMQMessage);
}
catch (Exception x)
{
// async logging: gotta return from the trigger ASAP
System.Threading.ThreadPool.QueueUserWorkItem(new WaitCallback(LogException), x);
}
}
Since you said there're many inserts running on that table, a batch processing could fit better.
Why did just create a scheduled job, which handle new data identified by a flag column, and process data in large chunks?
Use the typical trigger to fire a CLR on the database. This CLR will only start a program remotely using the Win32_Process Class:
http://motevich.blogspot.com/2007/11/execute-program-on-remote-computer.html