In an application I'm creating, I've got two components that I want the user to be able to pause/resume. I'm wondering what standard patterns, if any, exist to support pausing and resuming. Both components do a lot of network I/O. It seems like, at a high level, I have to persist the current queue of work that each component has, and persisting it is where I'm looking for standard patterns. Do I serialize the component itself? Do I serialize just the work? What format do I serialize to (XML, a database, etc.)? What does .NET have built in that might help? Are there any libraries to help with this? Are there any differences to consider if the user just pauses/resumes within the same app session versus pausing, closing the application, and resuming after opening it again? What about persisting this information across different computers?
Any suggestions from past experience or patterns that come to mind? I hope this turns into more of a discussion of the various ways of doing this and the pros/cons of each. Thanks.
By message queue I meant MSMQ or one of its brethren. All messages would be persisted in some sort of database and therefore still available when the app restarts. The primary purpose of such queues is to ensure that messages get delivered even when communication is intermittent and/or unreliable.
It sounds like you could have your communication components take work from MSMQ instead of your current queues pretty easily.
If that doesn't fit your application, it is probably as simple as serializing the objects in your existing queues on termination, and de-serializing them again at application start up. If surviving unexpected termination is important you should always serialize an object as it is added to the work queue, but at that point you may want to look again at an existing message queue system.
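As a rough illustration of "serialize an object as it is added to the work queue", here is a minimal sketch in Python rather than .NET. The class name `JournaledQueue`, the JSON-lines journal format, and the file path are all assumptions made for the example, not anything from the original application. Each add is written to an append-only file before the item enters the in-memory queue, and the file is replayed at startup.

```python
import json
import os
from collections import deque


class JournaledQueue:
    """Hypothetical in-memory work queue backed by an append-only journal file."""

    def __init__(self, path):
        self.path = path
        self.items = deque()
        # Replay the journal so work queued before a restart is restored.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:
                    try:
                        op, payload = json.loads(line)
                    except ValueError:
                        break  # truncated last record from a crash; stop replaying
                    if op == "add":
                        self.items.append(payload)
                    elif op == "done" and self.items:
                        self.items.popleft()

    def _log(self, op, payload=None):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps([op, payload]) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before returning

    def add(self, item):
        self._log("add", item)  # persist first, then enqueue in memory
        self.items.append(item)

    def peek(self):
        return self.items[0] if self.items else None

    def complete(self):
        """Record that the oldest item is finished and remove it."""
        if not self.items:
            raise IndexError("complete() called on an empty queue")
        self._log("done")
        return self.items.popleft()
```

The journal grows without bound in this sketch. A real version would compact it periodically, which is part of why an existing message queue system starts to look attractive.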
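You could run the work on its own thread and call Suspend() and Resume() on that thread as needed. Be aware that Thread.Suspend() and Thread.Resume() are marked obsolete in .NET, because a thread suspended at an arbitrary point may be holding a lock that other threads need.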
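A cooperative alternative to suspending a thread from the outside is to have the worker check a signal between work items. The sketch below is in Python and uses `threading.Event`; the class `PausableWorker` and its method names are hypothetical. In .NET a `ManualResetEvent` could play a similar role, though that mapping is my assumption and not part of the answer above.

```python
import queue
import threading

_STOP = object()  # sentinel telling the worker loop to exit


class PausableWorker:
    """Hypothetical worker that pauses between items instead of being suspended."""

    def __init__(self, handler):
        self._handler = handler
        self._work = queue.Queue()
        self._running = threading.Event()
        self._running.set()  # start in the "running" state
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def submit(self, item):
        self._work.put(item)

    def pause(self):
        self._running.clear()  # worker blocks before handling its next item

    def resume(self):
        self._running.set()

    def stop(self):
        self._work.put(_STOP)
        self._running.set()  # make sure a paused worker can see the sentinel
        self._thread.join()

    def _loop(self):
        while True:
            item = self._work.get()
            self._running.wait()  # holds the item here while paused
            if item is _STOP:
                break
            self._handler(item)
```

An item that is already being handled when `pause()` is called runs to completion, so the worker always stops at a well-defined point between items.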
We have an application that reads and writes to a third party data storage.
The code of that data storage is closed source, we do not know about it and can not change it.
There is only a slim API that allows reading and writing to it.
A pessimistic offline lock helps to span transactions and have concurrent applications work with it. That will work fine I believe.
But now we have the problem that other software will also write and read to that storage
and our application shall update when changes in that data storage happen. The data storage itself does not provide any notification. The third party software will not change some global state that indicates that something has changed.
Is there any kind of pattern or best practise to "observe" that data storage and
publish events to update all clients (of our software)?
I really do not want to periodically read, compare and publish events if it is not
absolutely the last resort. Perhaps someone has a better idea here?
A non-System implemented Pessimistic Offline Lock requires cooperation/participation/enforcement among all possible modifiers of the data. This is generally not possible and is one of the two reasons that this approach is rarely taken in modern software. To do anything remotely like this (i.e., with multiple heterogeneous writers) in a useful way requires some kind of help/assistance from the System facilities themselves. (The second reason is the problem of detecting and resolving abandoned locks, which is very problematic.)
As for possible solutions, from a purely design viewpoint you can either use optimistic offline locks, which still need some System help, but much less, or avoid the issue altogether through more detailed state-progression/control in your data model.
My approach, however, would be to set aside the design question (initially), recognizing that this is primarily an issue of the data-store's capabilities, and start there, looking to use System-provided lock/transaction control (which both 1: usually works and 2: is how it is usually done).
AFAIK, issues of synchronizing multi-writer access always have to start with "What tools/controls/facilities are available to constrain, divert and/or track the out-of-application writers?" What you can accomplish is practically limited by those facilities.
For instance, if you can force all access through a service of your own, then you can do almost anything. But if all you have is the OS's file-locking and file-modification-dates, then you are a lot more constrained. And if you don't have even that, then there's not much you can do.
In fact I do not have direct access to the data store, it is hosted on
some server and I have no control over the other applications that
read and write to it. Right now, the best I can think of is having a
service as a proxy which periodically queries the store, compares it
to an older state and fires update events to my clients if some other
application has altered it (and fire some other event if my
application alters it to notify my own clients, leaving the other
applications with their own problems). It does not sound very good to me,
but it probably does the job.
Yep, that's about all you can do, and that only supports Optimistic Concurrency (partially), not Pessimistic. You might get improvements by adding some kind of checksum/hash to your stored data, but that's only an optimization.
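The polling proxy with a checksum/hash could look roughly like the following Python sketch. Everything named here is hypothetical: `StoreWatcher`, the `read_all` callable standing in for the third-party API, and the assumption that the store can be read as a mapping of keys to JSON-serializable records. Only the hashes are kept between polls, so the previous state does not have to be stored in full.

```python
import hashlib
import json


def fingerprint(record):
    """Stable SHA-256 digest of a JSON-serializable record."""
    data = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class StoreWatcher:
    """Hypothetical poller that diffs record hashes between two reads."""

    def __init__(self, read_all, on_change):
        self._read_all = read_all    # callable returning {key: record}
        self._on_change = on_change  # callable taking (kind, key)
        self._seen = {}              # key -> digest from the previous poll

    def poll(self):
        current = {k: fingerprint(v) for k, v in self._read_all().items()}
        for key, digest in current.items():
            if key not in self._seen:
                self._on_change("added", key)
            elif self._seen[key] != digest:
                self._on_change("changed", key)
        for key in self._seen.keys() - current.keys():
            self._on_change("removed", key)
        self._seen = current
```

The first call to `poll()` reports every record as added. Calling it on a timer gives the periodic read-compare-publish loop described above; it cannot tell which application made a change, only that one happened.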
Imagine I want to have a small network of worker drones possibly on separate threads and possibly on separate processes or even on different PCs. The work items are created by a central program.
I'm looking for an existing product or service that will do this all for me. I know that there is MSMQ and also MQSeries. MQSeries is too expensive. MSMQ is notoriously unreliable. A database backed system would be fine, but I don't want to own/manage/write it. I want to use someone else's work queue system.
Related Articles:
Here is a similar question, but it's advocating building a custom queue mechanism.
The queue that I like a lot is this one from Google App Engine.
http://www.codeproject.com/KB/library/DotNetMQ.aspx
If you follow some guidelines you can use a database as a queue store with good success, see Using tables as Queues.
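To make the table-as-queue idea concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in database. It is not the technique from the linked article, which targets SQL Server; the table layout and the function names (`open_queue`, `enqueue`, `claim_next`, `complete`) are assumptions for illustration. The key point is that selecting a job and marking it claimed happen inside one write transaction, so two workers cannot claim the same row.

```python
import sqlite3


def open_queue(path):
    # isolation_level=None: autocommit mode, transactions are managed explicitly
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def enqueue(conn, payload):
    conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim_next(conn):
    """Atomically claim the oldest pending job; returns (id, payload) or None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET status = 'claimed' WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


def complete(conn, job_id):
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
```

Jobs claimed by a worker that then crashes stay in the `claimed` state here; a fuller version would record a claim time and return stale claims to `pending`.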
SQL Server comes with its own built-in message queuing, namely Service Broker. It allows you to avoid many of the MSMQ pitfalls when it comes to scalability, reliability and high availability and disaster recovery scenarios.
Service Broker is fully integrated in the database (no external store, one consistent backup/restore, one unit of failover, no need for expensive two-phase-commit DTC between message store and database, one single T-SQL API to access and program both the messages and your data) and also has some nice unique features such as transactional messaging with guaranteed Exactly-Once-In-Order delivery, correlated message locking, internal activation etc.
I have used RabbitMQ in the past for a pet project; you could add that to your list of queue systems.
As far as a framework to wrap the queues, you could take a look at http://www.nservicebus.com/ as we have done a couple of basic projects here at work with it. And here's a quick example to get started: http://meisinger2.wordpress.com/2009/11/09/nservicebus-fifteen-minutes/
I have successfully used MassTransit in the past. It supports using MSMQ as well as RabbitMQ.
I'm trying to design a system which reports activity events to a database via a web service. The web service and database have already been built (COTS software) - all I have to do is provide the event source.
The catch, though, is that the event source needs to be fault tolerant. We have multiple replicated databases that I can talk to, so if the web service or database I'm talking to goes down, the software can quickly switch to another one that's up.
What I need help with though is the case when all the databases are down. I've already designed a queue that will hold on to the events as they pile in (and burst them out once the connection is restored), but the queue is an in-memory structure: if my app crashes in this state, or if power is lost, etc., then all the events in the queue are lost. This is unacceptable. What I need is a way to persist the events so that when a database comes back online I can send a burst of queued-up events, even in the event of power loss or crash.
I know that I don't want to re-implement the queue itself to use the file system as a backing store. This would work (and I've tried it) - but that method slows the system down dramatically as the hard drive becomes a bottleneck. Aside from this though, I can't think of a single way to design this system such that all the events are safely stored on the hard drive only when access to the database isn't available.
Does anyone have any ideas? =)
When I need messaging with fault tolerance (and/or guaranteed delivery, which based on your description I am guessing you also need), I usually turn to MSMQ. It provides both fault tolerance (messages are stored on disk in case of machine restart) and guaranteed delivery (messages will automatically and continually resend until they are received), as well as transactional sends and receives, message journaling, poison message handling, and other features.
I have been able to achieve a throughput of several thousand messages per second using MSMQ. Frankly, I am not sure that you will get too much better than that while still being fault tolerant.
MSMQ. I think you could also take a look at the notion of a Job object.
I agree with the others that it is better to use an out-of-the-box system like MSMQ, with a set of messaging patterns in hand.
Anyway, if you have to do it yourself, you can use an in-memory database instead of serializing the data yourself; I believe it should be fast enough.
I want to build a windows service that will use a remote encoding service (like encoding.com, zencoder, etc.) to upload video files for encoding, download them after the encoding process is complete, and process them.
In order to do that, I was thinking about having different queues: one for files currently waiting, one for files being uploaded, one for files waiting for encoding to complete, and one more for downloading them. Each queue has a limit; for example, only 5 files can be uploaded for encoding at any one time. The queues have to be visible and able to recover after a crash. Currently we do that by writing the queue to an SQL table and managing the number of items in a separate table.
I also want the queues to run in the background, independent of each other, but able to transfer files from one queue to another as the process goes on.
My biggest question mark is about how to build the queues and manage them, and less about limiting the number of items in each queue.
I am not sure what the right approach for this is and would really appreciate any help.
Thanks!
You probably don't need to separate the work into separate queues, as long as they are logically separated in some way (tagged with different "job types" or such).
As I see it, the challenge is to not pick up and process more than a given limited number of jobs from the queue, based on the type of job. I had a somewhat similar issue a while ago which led to a question here on SO, and a subsequent blog post with my solution, both of which might give you some ideas.
In short, my solution was to keep a list of "tokens". Whenever I want to perform a job that has some sort of limitation, I first pick up a token. If no tokens are available, I need to wait for one to become available. You can then use whatever queueing mechanism is suitable to handle the queue itself.
There are various ways to approach this, and which one suits your case depends on the reliability and resilience you need versus development and maintenance cost. You need to answer questions such as: if the server crashes, is it important to carry on with what you were doing?
The queues can be implemented in MSMQ, in SQL Server, or simply in code with all queues held in memory. For the workflow you can use Windows Workflow Foundation, or implement it yourself, which would probably be easier at first but harder to change later.
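One way to picture the token idea is a counting semaphore per job type. The Python sketch below is my own illustration and not the code from the blog post mentioned above; `TokenPool`, its `token` method, and the job-type names are hypothetical.

```python
import threading
from contextlib import contextmanager


class TokenPool:
    """Hypothetical per-job-type limiter: one bounded semaphore per type."""

    def __init__(self, limits):
        # limits maps a job type to the number of jobs allowed at once
        self._tokens = {
            job_type: threading.BoundedSemaphore(count)
            for job_type, count in limits.items()
        }

    @contextmanager
    def token(self, job_type, timeout=None):
        sem = self._tokens[job_type]
        # Blocks until a token is free, or gives up after `timeout` seconds.
        if not sem.acquire(timeout=timeout):
            raise TimeoutError("no token available for " + job_type)
        try:
            yield
        finally:
            sem.release()  # hand the token back, even if the job failed
```

A worker would wrap the limited step, for example `with pool.token("upload"): ...`, where `pool = TokenPool({"upload": 5})` allows five concurrent uploads. The tokens live in memory only, so after a crash the limits reset while the queue itself is restored from wherever it is persisted.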
So if you give a few more hints, I should be able to help you better.
I have two unrelated processes that use .NET assemblies as plugins. However, either process can be started/stopped at any time. I can't rely on a particular process being the server. In fact, there may be multiple copies running of one of the processes, but only one of the other.
I initially implemented a solution based off of this article. However, this requires the one implementing the server to be running before the client.
What's the best way to implement some kind of notification to the server when the client(s) were running first?
Using shared memory is tougher because you'll have to manage the size of the shared memory buffer (or just pre-allocate enough). You'll also have to manually manage the data structures that you put in there. Once you have it tested and working though, it will be easier to use and test because of its simplicity.
If you go the remoting route, you can use the IpcChannel instead of the TCP or HTTP channels for a single system communication using Named Pipes. http://msdn.microsoft.com/en-us/library/4b3scst2.aspx. The problem with this solution is that you'll need to come up with a registry type solution (either in shared memory or some other persistent store) that processes can register their endpoints with. That way, when you're looking for them, you can find a way to query for all the endpoints that are running on the system and you can find what you're looking for. The benefits of going with Remoting are that the serialization and method calling are all pretty straightforward. Also, if you decide to move to multiple machines on a network, you could just flip the switch to use the networking channels instead. The cons are that Remoting can get frustrating unless you clearly separate what are "Remote" calls from what are "Local" calls.
I don't know much about WCF, but that also might be worth looking into. Spider sense says that it probably has a more elegant solution to this problem... maybe.
Alternatively, you can create a "server" process that is separate from all the other processes and that gets launched (use a system Mutex to make sure more than one isn't launched) to act as a go-between and registration hub for all the other processes.
One more thing to look into is the Publish-Subscribe model for events (Pub/Sub). This technique helps when you have a listener that is launched before the event source is available, but you don't want to wait to register for the event. The "server" process will handle the event registry to link up the publishers and subscribers.
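The registry at the heart of Pub/Sub is small. This Python sketch shows only that part, in a single process; `EventHub` and the topic name are hypothetical, and the cross-process transport (pipes, remoting, and so on) that a real hub process would need is left out. Note that a subscriber can register for a topic before any publisher for it exists.

```python
from collections import defaultdict


class EventHub:
    """Hypothetical in-process registry linking publishers and subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        # Works even if nothing has ever been published on this topic.
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver message to every subscriber; returns how many received it."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

Publishing to a topic with no subscribers simply delivers to nobody, so neither side has to be started first.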
Why not host the server and the client on both sides, and whoever comes up first gets to be the server? And if the server drops out, the client that is still active switches roles.
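"Whoever comes up first gets to be the server" needs some exclusive resource that only one process can hold. The Python sketch below uses a loopback TCP port for that, which is my own choice for illustration; a named system mutex or a named pipe could serve the same purpose. The function name `claim_server_role` and the port number are hypothetical.

```python
import socket


def claim_server_role(port, host="127.0.0.1"):
    """Try to become the server by binding a well-known local port.

    Returns the listening socket if this process won the role,
    or None if another process already holds the port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError:
        sock.close()
        return None
```

A process would call `claim_server_role(50555)` at startup and act as the client when it gets `None`. Because the operating system releases the port when the server process exits, a surviving client can call the function again after losing its connection and take over the role.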
There are many ways to handle IPC (.NET or not), and a TCP/HTTP tunnel is one way... but it can be a very bad choice (depending on circumstances and environment).
Shared memory and named pipes are two ways (and yes, they can be done in .NET) that might be better solutions for you. There are also the IPC classes in the .NET Framework... but I personally don't like them due to some AppDomain issues...
I agree with Garo.
Using a pub/sub service would be a great solution. This obviously means that this service would need to be up and running before either of the other two.
If you want to skip the pub/sub you can just implement the service in both applications with different end points. When either of the applications is launched it tries to access the other known object via the IPC proxy. If the proxy fails, the other object isn't up.
-Scott
I've spent 2 days meandering through all the options available for IPC while looking for a reliable, simple, and fast way to do full-duplex IPC. IPCLibrary, which I found on Codeplex.com, is so far working perfectly out of all the options that I tried. All with only 7 lines of code. :D If anyone stumbles across this trying to find a full-duplex IPC, save yourself a ton of time and give this library a try. Grab the source code, compile the data.dll and follow the examples given.
HTH,
Circ