As part of my constant learning curve into what you can do to make apps scale better, I am currently trying to get a direction to go with queuing, i.e. job queuing or workload processing whichever phrase you like.
In the distant past I used IBM MQ/Series - it worked for a financial app but quite heavy if I remember.
I know of MSMQ, and I have also heard of quite a few others.
But first, here is my context
I have a C#/.NET back-end web app which serves data etc to a Javascript (mostly jQuery etc) front-end via AJAX calls etc. I have a situation where a certain action involves uploading some files, setting up a few record entries in the database, emailing some users etc. So of course I don't want to make this process "online"/"real-time" due to the possible time delay and I am sure the overheads on the webserver/database etc.
So given the type of "messages" that I need to queue and process, what would be (I shouldn't just say easy here I guess!) a good start point? should I run with MSMQ and/or the SQL 2008 service broker stuff, or something like ZeroMQ - or should I simply create my own lightweight workload queue service?
I realise again without seeing the full picture it is hard to make full recommendations, however any start points gratefully received!
David
Don't try to make your own, please! There are so many things to take into account that you will spend more time on it than the rest of your project most probably.
I'd say go for MSMQ, it's very easy to use with WCF, the queues are transactional, have a retry mechanism, etc, and you benefit from the MSMQ UI to see the messages, move them and so on.
Related
I've read a good bit about threading with C#, but to be upfront I haven't done anything in production using it.
I have an application that has to process a bunch of documents and then send the documents via email. This may take 60 seconds to accomplish. I don't want the user of my web application to have to wait for these things to process to move on to other parts of the site.
On a button click the SendEmail function is called. What can I do to this code to make it so that my users can continue browsing the site without discontinuing the processing I need to do within the EmailPDFs function?
[Authorize]
public ActionResult SendEmail(decimal? id, decimal? id2)
{
EmailPDFs(..., ..., ...);
}
Thanks so much!
This is really the kind of thing that message queues are designed to handle. Fire off a message, and a process on a potentially separate server picks it up and processes it. When it's done, it sends a message back to a queue on your server, where a process on your server picks it up and notifies you that it's complete. You then notify your user that the work is finished.
Modern message queue systems can be backed by databases (such as Mongo, MySql, or SQL Server), and are extremely robust. The great thing about them is that they allow you to move long-running or CPU-intensive processes off onto other servers so that your web site remains nice and snappy.
You could try to add multi-threading and parallelism to your web application, by using TaskFactory and all that other stuff (for many folks, this is the route they take), but it doesn't make it very easy to separate your application if you need to, and break those big, resource-hogging pieces off if it becomes necessary.
I urge you to consider a queue-based solution.
Update:
For samples and information on how to implement this type of solution, see the following:
Reliable Messaging with MSMQ and .NET on MSDN
C#: A Message Queuing Service Application on MSDN
Also, consider glancing at this StackOverflow question for a quick crash course on the bare minimimum amount of code required.
A final note: MSMQ is built into certain flavors of Windows, and can be added to it through the Add/Remove Programs feature of the Control Panel. However, how you install it will depend on your specific flavor and version of Windows. A simple Google search will help you to find the appropriate instructions.
Good luck!
I'm an amateur programmer working on a pet project of mine and I would like some pointers on how to make a C# server application. Here's the general idea:
A client connects to the server application, which in turn fetches the necessary information from a mysql database and sends it back to the client to be displayed and wait for the next action.
I got the idea of making something like this after seeing a somewhat old IBM AS400 mainframe running a warehouse management system, and I though: "Hey, I could try developing a small version of this with a nice UI that doesn't look like it stepped out of a time machine!"
I searched around and used the tcplistener class to communicate between the server and client and managed to send some calls and responses using one thread per client. However I've read that this is not scalable for a large number of clients...
Am I looking at this problem the wrong way and I should try something else? Any input will be appreciated
You don't need to deal with TCP directly for this - WCF (Windows Communication Foundation) was written to abstract all the low level stuff from you.
Check this link out for a good example of how to create a client/server application, it has an entry level explanation, some code and a downloadable project...
http://www.codeproject.com/Articles/16765/WCF-Windows-Communication-Foundation-Example
You can find plenty of information about WCF elsewhere on the internet and here on SO.
It is a very large topic, but the situation you have described is pretty simple - so I doubt you will have problems following the example.
I'm Currenty writing a server architecture and came across this as well: faily easy to solve.
You're right, using 1 thread per client is not effiecient and is a huge waste or resources! The way around that is Thread Pooling. There are loads of different ways to do this but the way I chose to implement it on my server was to add every connection to a queue and then have x number of threads (which you can easily increase or decrease to handle demand) simply dequeue a connection, process it, then enqueue it again.
Of course using WCF will make life easier for you and will speed things up drastically but where's the challange in that!
I have an NHibernate MVC application that is using ReadCommitted Isolation.
On the site, there is a certain process that the user could initiate, and depending on the input, may take several minutes. This is because the session is per request and is open that entire time.
But while that runs, no other user can access the site (they can try, but their request won't go through unless the long-running thing is finished)
What's more, I also have a need to have a console app that also performs this long running function while connecting to the same database. It is causing the same issue.
I'm not sure what part of my setup is wrong, any feedback would be appreciated.
NHibernate is set up with fluent configuration and StructureMap.
Isolation level is set as ReadCommitted.
The session factory lifecycle is HybridLifeCycle (which on the web should be Session per request, but on the win console app would be ThreadLocal)
It sounds like your requests are waiting on database locks. Your options are really:
Break the long running process into a series of smaller transactions.
Use ReadUncommitted isolation level most of the time (this is appropriate in a lot of use cases).
Judicious use of Snapshot isolation level (Assuming you're using MS-SQL 2005 or later).
(N.B. I'm assuming the long-running function does a lot of reads/writes and the requests being blocked are primarily doing reads.)
As has been suggested, breaking your process down into multiple smaller transactions will probably be the solution.
I would suggest looking at something like Rhino Service Bus or NServiceBus (my preference is Rhino Service Bus - I find it much simpler to work with personally). What that allows you to do is separate the functionality down into small chunks, but maintain the transactional nature. Essentially with a service bus, you send a message to initiate a piece of work, the piece of work will be enlisted in a distributed transaction along with receiving the message, so if something goes wrong, the message will not just disappear, leaving your system in a potentially inconsistent state.
Depending on what you need to do, you could send an initial message to start the processing, and then after each step, send a new message to initiate the next step. This can really help to break down the transactions into much smaller pieces of work (and simplify the code). The two service buses I mentioned (there is also Mass Transit), also have things like retries built in, and error handling, so that if something goes wrong, the message ends up in an error queue and you can investigate what went wrong, hopefully fix it, and reprocess the message, thus ensuring your system remains consistent.
Of course whether this is necessary depends on the requirements of your system :)
Another, but more complex solution would be:
You build a background robot application which runs on one of the machines
this background worker robot can be receive "worker jobs" (the one initiated by the user)
then, the robot processes the jobs step & step in the background
Pitfalls are:
- you have to programm this robot very stable
- you need to watch the robot somehow
Sure, this is involves more work - on the flip side you will have the option to integrate more job-types, enabling your system to process different things in the background.
I think the design of your application /SQL statements has a problem , unless you are facebook I dont think any process it should take all this time , it is better to review your design and check where is the bottleneck are, instead of trying to make this long running process continue .
also some times ORM is not good for every scenario , did you try to use SP ?
I am creating a mass mailer application, where a web application sets up a email template and then queues a bunch of email address for sending. The other side will be a Windows service (or exe) that will poll this queue, picking up the messages for sending.
My question is, what would the advantage be of using SQL Service Broker (or MSMQ) over just creating my own custom queue table?
Everything I'm reading is suggesting I use Service Broker, but I really don't see what the huge advantage over a flat table (that would be a lot simpler to work with for me). For reference the application will be used to send 50,000-100,000 emails almost daily.
Do you know how to implement a queue over a flat table? This is not a silly question, implementing a queue over a table correctly is much harder than it sounds. Queue-like-tables are notoriously deadlock prone and you need to carefully consider the table design and the enqueue and dequeue operations. Also, do you know how to scale your pooling of the table? And how are you goind to handle retries and timeouts (ie. what timers are used for)?
I'm not saying you should use SSB. The lerning curve is very steep and is primarily a distributed applicaiton platform, not a local queueing product so some features, like dialogs, will actually be obstacles for you rather than advantages. I'm just saying that you must consider also the difficulties of flat-table-queues. If you never implemented a flat-table-queue then be warned, there are many dragons under that bridge.
50k-100k messages per day is nothing, is only one message per second. If you want 100k per minute, then we have something to talk about.
If you every need to port to another vendor's database, you will have less problem if you used normal tables.
As you seem to only have one reader and one write from your queue, I would tend to use a standard table until you hit problem. However if you start to feel the need to use “locking hints” etc, that the time to switch to the Service Broker Queues.
I would not use MSMQ, if both the sender and the reader need a database connection to work. MSMQ would be good if the sender did not talk to the database at all, as it lets the sender keep working when the database is down. However having to setup and maintain both the MSMQ and the database is likely to be more work then it is worth for most systems.
For advantages of Service Broker see this link:
http://msdn.microsoft.com/en-us/library/ms166063.aspx
In general we try to use a tool or standard functionality rather than building things ourselves. This lowers the cost and can make upgrading easier.
I know this is old question, but is sufficiently abstract to be relevant for long enough time.
After using both paradigms I would suggest flat table. It is surprisingly scalable and nifty. Correct hints need to be used.
Once the application goes distributed, or starts using mutiple allways on groups with different RW and RO servers, the Service Broker (or any other method of distributed communication) becomes a neccessity.
Flat table
needs only few hints (higly dependent on isolation level) to work scalably and reliably in the consumer (READPAST, UPDLOCK, ROWLOCK)
the order of message processing is not set in stone
the consumer must make sure that the message stays in the queue if the processing fails
needs some polling mechanism (job, CDC (here lies madness :)), external application...)
turn of maintenance jobs and automatic statistics for the table
Service broker
needs extremely overblown "infrastructure" (message types, contracts, services, queues, activation procedures, must be enabled after each server restart, conversations need to be correctly created and dropped...)
is extremely opaque - we have spent ages trying to make it run after it mysteriously stopped working
there is a predefined order of message processing
the tables it uses can cause deadlocks themselfs if SB is overused
is the only way (except for linked servers...) to send messages directly from database on RW server of one HA group to a database that is RO in this HA group (without any external app)
is the only way to send messages between different servers (linked servers are a big NONO (unless they become an YESYES - you know the drill - it depends)) (without any external app)
Say I need to design an in-memory service because of a very high load read/write system. I want to dump the results of the objects every 2 minutes. How would I access the in-memory objects/data from within a web application?
(I was thinking a Windows service would be running in the background handling the in-memory service etc.)
I want the fastest possible solution, and I would guess most people would say use a web service? What other options would I have? I just don't understand how I could hook into the Windows service's objects etc.
(Please don't ask why I would want to do this, maybe you're right and it's a bad idea but I am also curious if this type of architecture is possible.)
Update
I was looking at this site swoopo.com that I would think has a lot of hits near the end of auctions, but since the auction keeps resetting the hits to the database would be just crazy so I was thinking if they did it in memory then dumped to db every x minutes...
What you're describing is called a cache, with a facade front-end.
You write a facade to which you commit your changes and acquire your datasets. The facade queues up reads and writes and commits when the queue is full or after a certain amount of time has passed. Your web application has a single point of access to the data (the facade), and the facade is structured in such a way to avoid writing and reading from storage too often.
Most relational database management systems do this for you. They do this kind of optimization and queuing internally so writing another layer on top of it would only slow things down. So don't write a cache if you're using an RDBMS.
Regarding the specifics of accessing such a facade, you can treat it as just an object, and implement it however you want (its own thread, a thread pool, a Web service, a Windows service, whatever).
Any remoting technology would work such as sockets, pipes and the like.
Check out: www.remobjects.com
You could use a Windows Message Queues or a Service Bus, or even .NET remoting.
See http://www.nservicebus.com/, or http://code.google.com/p/masstransit/.
You could hook into the Windows Services objects by using Remoting or WCF, both offer very fast interprocess communication. Sockets are fast too but are more cumbersome to program compared to WCF. There is a ton of WCF documentation and support online.
Databases provide a level of caching for you. The advantage of an in memory golden copy such as the one you propose is that it never has to read from disk when a request comes in and if you host it on the same machine as your IIS (provided you have enough RAM for both) there is no extra network hop, making it much faster that querying a db. However, the downside to this approach is that it does not scale as well if you need to add machines to load balance.
Third party messaging providers such as TIBCO are also worth looking at.