I've written a server which interacts with an MSSQL database. It's currently written in .NET 4.0 and uses NHibernate as an ORM to retrieve information from the database. When reading about .NET 4.5 and the introduction of the async/await keywords I learned that, unfortunately, NHibernate does not have support for async/await.
I don't understand why issuing an async call to a database would be beneficial. Don't all the requests queue at the database level anyway? Wouldn't async just increase points of failure without improving anything?
In general, the benefit is that you are not blocking the currently executing thread while a possibly expensive (asynchronous) operation runs. In the context of a WPF / Windows Forms application, this means you are not blocking the UI thread (if the request originates from that thread) and your application remains responsive.
In the context of a web application (say IIS), this means you are releasing a thread back to the pool while you are awaiting the result. Since you are not blocking the thread, it can be reused to accept another request, which results in better performance in terms of accepted connections (not necessarily time per request).
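As an illustration, here is a minimal sketch of the web case as an asynchronous MVC action (the controller, the repository field and its GetOrdersAsync method are hypothetical names, not from the original answer):

public async Task<ActionResult> Orders()
{
    // While the awaited I/O is in flight, this request's thread goes back
    // to the pool and can pick up another incoming request.
    var orders = await _repository.GetOrdersAsync();
    return View(orders);
}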
Don't all the requests queue at the database level anyway?
No. Read Understanding how SQL Server executes a query. Any database server worth the name will be able to run hundreds of requests concurrently. Serialization is necessary only if the requests are correlated (eg. you need the output of query 1 to pass as a parameter to query 2) or when operating under transaction constraints (only one statement can be active at any time within a transaction).
There are at least two major advantages of async calls:
resource usage. Without considering anything else, just changing the programming model to an event-driven async model will result in an order-of-magnitude increase in the throughput your app can drive. This, of course, applies to back-end apps (e.g. a web server), not to a client user-driven app that will not be able to send anything more than what the one user initiates. Read the articles linked from High Performance Windows programs. This is also important to read, even though a bit dated: Asynchronous Pages in ASP.NET 2.0
overlapping requests. The synchronous model does not allow you to issue a query to the back end until the current one completes. A lot of the time the application has the info necessary (the parameters) to make two or more uncorrelated requests, but it simply can't. Making async calls allows the controlling thread to issue all the requests in parallel, and resume after they all complete.
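To illustrate the overlapping-requests point, here is a minimal sketch; it assumes the .NET 4.5 ADO.NET async methods and uses illustrative table names:

public static async Task<Tuple<int, int>> GetCountsAsync(string connectionString)
{
    // Two connections, so the uncorrelated queries can be in flight at the same time.
    using (var conn1 = new SqlConnection(connectionString))
    using (var conn2 = new SqlConnection(connectionString))
    {
        await conn1.OpenAsync();
        await conn2.OpenAsync();

        var cmd1 = new SqlCommand("SELECT COUNT(*) FROM Orders", conn1);
        var cmd2 = new SqlCommand("SELECT COUNT(*) FROM Customers", conn2);

        // Issue both requests, then resume once both have completed.
        Task<object> t1 = cmd1.ExecuteScalarAsync();
        Task<object> t2 = cmd2.ExecuteScalarAsync();
        await Task.WhenAll(t1, t2);

        return Tuple.Create((int)t1.Result, (int)t2.Result);
    }
}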
Neither .NET 4.5 Tasks nor NHibernate have good support for async DB programming. Good old BeginExecuteXXX is actually much more powerful, although a bit arcane to program against.
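For reference, the BeginExecuteXXX (APM) pattern can be wrapped in a Task. A hedged sketch (note that before .NET 4.5 this also requires "Asynchronous Processing=true" in the connection string):

public static Task<SqlDataReader> ExecuteReaderApmAsync(SqlCommand command)
{
    // BeginExecuteReader starts the I/O without tying up a thread;
    // EndExecuteReader completes when SQL Server returns the results.
    return Task<SqlDataReader>.Factory.FromAsync(
        command.BeginExecuteReader,
        command.EndExecuteReader,
        null);
}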
NHibernate can support true async calls. I already implemented it on my own branch:
https://github.com/ReverseBlade/nhibernate-core/tree/nh_4.5.1
You can check it out and compile it. It is compiled against .NET 4.5.1. It is compatible with standard NHibernate and passes all the tests.
Then you can use things like .ToListAsync() or GetAsync(), which will make true async calls.
If you need help you can write a comment. Good luck.
Good news: NHibernate supports async/await out of the box since v5.0.
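A short usage sketch, assuming NHibernate 5+, a using NHibernate.Linq; directive, and an illustrative mapped entity named Customer:

using (var session = sessionFactory.OpenSession())
{
    // Async counterparts are generated for most of the familiar APIs.
    var customer = await session.GetAsync<Customer>(42);
    var active = await session.Query<Customer>()
                               .Where(c => c.IsActive)
                               .ToListAsync();
}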
You may be confusing language features with design patterns; async is syntactic sugar to help you manage background tasks, while asynchronous tasks just mean that you're running two or more threads.
Just because NHibernate doesn't support async doesn't mean that you can't run asynchronously. This is very beneficial to the user because you don't want to freeze the UI while you're performing a (relatively) long-running query to a DB/service, especially if the server is bogged down.
I suppose you could count this as a point of failure, but really just a few areas:
Exceptions - You'd have this problem on one thread anyway, but you should gracefully handle any database errors that you'd encounter.
UI Management - You don't want to let the user interact with the UI in such a way as to trigger multiple queries, so you might disable a button, etc.
Result Handling - When the query is complete, you need to ensure that you marshal the data back to the UI thread. In C# this can be done via Invoke/BeginInvoke, though whether you're in WinForms or WPF determines the details.
EDIT:
Some sample skeleton code assuming WPF and at least .NET 4.0
Task.Factory.StartNew(() =>
{
    using (var client = new dbClient())
    {
        // Perform query here

        // Marshal the results back to the UI thread
        this.Dispatcher.BeginInvoke(new Action(() =>
        {
            // Set data source, etc, i.e.
            this.Items = result;
        }));
    }
}).ContinueWith(t => Logger.LogException(t.Exception), TaskContinuationOptions.OnlyOnFaulted);
You say:
Don't all the requests queue at the database level anyway?
If by "queue" you mean "single-servicing queue" than the answer is no. SQL Server is a highly asynchronous and multi-threaded service that can service many, many queries simultaneously.
Even at a physical level, queueing (i.e. physical device servicing) is simultaneously split across the number of CPU cores, and the number of physical disks the make up the disk array.
So the reason to make asynchronous calls to SQL Server is to be able to leverage some of that multi-threading/multi-servicing capacity into your own service.
I'm developing an ASP.NET Core web app (Blazor Server-side to be exact) and I was checking a .NET based embedded database called LiteDB and I noticed that it still lacks async calls while most SQLite wrappers have them (for example Dapper ORM for SQLite exposes many async methods)
If I'm not mistaken (please correct me if I'm wrong), the whole point of using asynchronous calls (let's say async/await in C#) is to free up the threads from waiting for the completion of IO operations (let's say querying a database).
The above scenario makes sense when, as in the said example, the database is on another machine or at least in another process on the same machine, because we are effectively delegating the job to something else; the executing thread can do other jobs and come back to the result when it's ready.
But what about embedded databases such as SQLite (or the one mentioned above: LiteDB)? These databases run in the same process as the main application so any database processing (let's say querying) is still done by the threads of the application itself.
If the application is a classic GUI-based app (let's say WinForms), using asynchronous calls would free up the main thread from being blocked and the app from becoming non-responsive, which is still understandable. But what about the context of an ASP.NET Core app in which every request is processed in a separate thread*?
*My question is: why use asynchronous calls when the app itself has to do the database processing too, and therefore a thread has to be kept busy anyway?
Context
Microsoft's Async limitations (from 09/15/2021) states:
SQLite doesn't support asynchronous I/O. Async ADO.NET methods will execute synchronously in Microsoft.Data.Sqlite. Avoid calling them.
Instead, use a shared cache and write-ahead logging to improve performance and concurrency.
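A hedged sketch of that suggestion using Microsoft.Data.Sqlite (the database file name is illustrative): open the connection with a shared cache and switch the database to write-ahead logging.

using (var connection = new SqliteConnection("Data Source=app.db;Cache=Shared"))
{
    connection.Open();
    using (var command = connection.CreateCommand())
    {
        // WAL is persistent: it only needs to be enabled once per database file.
        command.CommandText = "PRAGMA journal_mode=WAL;";
        command.ExecuteScalar(); // returns the journal mode now in effect
    }
}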
More
what about the context of ASP.NET Core app in which every request is processed in a separate thread*?
*My question is that why use asynchronous calling when the app itself has to do the database processing too and therefore a thread has to be kept busy anyway?
The first point is that it's not true that every request is processed in a separate thread. Using real async/await allows serving more requests than the number of available threads.
Please remember that async/await does not equal multi-threading; they are separate and different concepts, with some overlap.
It's not just the overall volume of work that decides whether using multiple threads is worth it or not. Who is doing what is very important. Even when all the cooking and serving happen in the same restaurant, you wouldn't want to dine in a busy restaurant where the waiters do all the cooking.
You're right to think that async/await is not beneficial with SQLite: because under the hood it's synchronous, the original executing thread is never freed to do other work. The point is not that the work has to be done by the application itself (it could be done by a new or dedicated thread).
Async/await is not only about freeing up threads when you have to do something on another machine. For example, when you want to write text to a file, you can do it asynchronously.
var text = "mytext";
await File.WriteAllTextAsync(@"C:\Temp\csc.txt", text);
It frees up your thread, and the work will be done by a thread from the thread pool.
The same logic applies to SQLite: you can do the work on another thread, so you can use it to improve performance, etc.
You can check the sqlite-net source code to see how it works:
https://github.com/praeclarum/sqlite-net/blob/master/src/SQLiteAsync.cs
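In other words, the query itself stays synchronous, but it can be pushed onto a thread-pool thread so the caller is not blocked. A minimal sketch (QueryCustomers is a hypothetical synchronous SQLite call, not part of the linked library):

public Task<List<Customer>> QueryCustomersAsync()
{
    // Offload the synchronous work to the thread pool; the caller can await it.
    return Task.Run(() => QueryCustomers());
}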
I'm a bit confused: my ASP.NET MVC app will be hosted on a server, so is there any point in making it multi-threaded? For example, if I want one thread to execute my translations, is this a good idea? Can someone elaborate on this for me, please? I'm a bit confused with web apps multi-threading versus desktop apps multi-threading.
There are a few things to this.
The first is that every ASP.NET application (MVC or otherwise) is inherently multi-threaded: Each request will be processed on a separate thread, so you are automatically in a multi-threading situation and must consider this with any shared access to data (e.g. statics, etc.).
Another is that with MVC it's particularly easy to write asynchronous controller methods like:
public async Task<ActionResult> Index(int id)
{
var model = await SomeMethodThatGetsModelAsync(id);
return View(model);
}
Now, if we're already multi-threaded then why bother? The benefit is (ironically in a way) to use fewer threads. Assuming that SomeMethodThatGetsModel(id) may block or otherwise hold up the thread, awaiting on SomeMethodThatGetsModelAsync(id) allows the current thread to handle another request. One of the limits on how many requests a webserver can handle is how many threads it can have handling those requests. Free up threads and you increase your throughput.
A further is that you may want some operation to happen in the background of the application as a whole, here the reason is the same as with desktop applications.
Similarly, if you have work that can be done simultaneously and which blocks (e.g. hitting a database and two web services), then your reason for doing so in a multi-threaded manner is the same as with a desktop app.
(In the last two cases though, be wary of using the default static thread pool, such as through ThreadPool.QueueUserWorkItem or Task.Run. Because this same thread pool is used for the main ASP.NET threads, if you hit it heavily you're eating from the same plate as your framework. A few such uses are absolutely fine, but if you're making heavy use of separate threads then use a separate set of threads for them, perhaps with your own pooling mechanism.)
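For the separate-thread case, a minimal sketch of using a dedicated thread rather than the shared pool (ProcessQueueItems is a hypothetical long-running worker method):

// A dedicated background thread keeps heavy work off the shared ASP.NET thread pool.
var worker = new Thread(ProcessQueueItems) { IsBackground = true };
worker.Start();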
is there any point in making it multi-threaded?
That question, as phrased, won't quite work. The real question is: does your application need multi-threading? For example, if you receive a collection of big entities that need to be preprocessed somehow before further actions, you might process each of them in a separate thread instead of in a loop.
I'm a bit confused with web apps multi-threading vs desktop apps multi-threading
Multithreading in ASP.NET and in desktop applications is the same thing and works the same way.
Async has become a buzzword in .net and MS have introduced it in Web API 2 so that more requests can be handled whilst others are waiting on IO to finish.
Whilst I can see the benefit of this, is it really a concern? An x64 architecture has 30000+ threads in the Thread Pool, so unless you have that many concurrent users on your website is async really required? Even if you have that many concurrent users, without caching I'm pretty sure SQL Server will fall over with that many requests?
Apart from it being shiny when is there a real need to have async routing on a web framework?
Many of the other answers here are coming from a UI (desktop/mobile app) perspective, not a web server perspective.
Async has become a buzzword in .net and MS have introduced it in Web API 2 so that more requests can be handled whilst others are waiting on IO to finish.
async and await were introduced in .NET 4.5 / VS 2012. However, ASP.NET has had asynchronous request capability since .NET 2.0 - a very long time ago. And there have been people using it.
What async and await bring to the table is asynchronous code that is easy to maintain.
Whilst I can see the benefit of this, is it really a concern?
The key benefit of async on the server is scalability. Simply put, async tasks scale far better than threads.
@Joshua's comment is key regarding the memory; a thread takes a significant amount of memory (and don't forget the kernel-mode stack, which cannot be paged out), while an async request literally only takes a few hundred bytes.
There's also bursting to consider. The .NET threadpool has a limited injection rate, so unless you set your minimum worker thread count to a value much higher than you normally need, when you get a burst of traffic some requests will 503 before .NET can spin up enough threads to handle them. async keeps your threads free (as much as possible) so it handles bursting traffic better.
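If you do rely on the thread pool, one mitigation for bursts is to raise the minimum thread counts at startup so the pool does not depend on its slow injection rate; a hedged sketch (the values are illustrative only):

// Called once at startup, e.g. in Application_Start.
ThreadPool.SetMinThreads(workerThreads: 200, completionPortThreads: 200);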
An x64 architecture has 30000+ threads in the Thread Pool so unless you have that many concurrent users on your website is async really required?
@Joshua is again correct when he points out that you're probably thinking of a request queue limit (which defaults to 1000 for the IIS queue and 5000 for the ASP.NET request limit). It's important to note that once this queue is filled (during bursty traffic), new requests are rejected with 503.
Even if you have that many concurrent users without caching I'm pretty sure SQL Server will fall over with that many requests?
Ah, now that's another question entirely.
I'm giving a talk at ThatConference 2013 specifically on async servers. One part of that talk is situations where async doesn't help (my Twitter update).
There's an excellent blog post here that takes the position that asynchronous db calls are just not worth the effort. It's important to note the assumptions in this post:
At the time that post was written, asynchronous web servers were difficult. These days we have async and more and more libraries are offering asynchronous APIs (e.g., Entity Framework).
The architecture assumes a single web server with a single SQL Server backend. This was a very common setup traditionally, but is quickly changing today.
Where async servers really shine is when your backend can also scale. E.g., a web service, Azure SQL, NoSQL cluster, etc. Example: I'm writing an MVC/WebAPI server that uses Azure SQL and Storage for its backend (for all practical purposes, I can act like they have infinite scalability); in that case, I'm going to make my server async. In situations like this, you can scale your server 10x or more by using async.
But if you just have a single SQL Server backend (and have no plans to change to Azure SQL), then there's no point in making your web server async because you're limited by your backend anyway.
When long operations can be efficiently executed in parallel. For instance, you have to execute two SQL queries and load three pictures: run all five operations async and await them all. In this case the overall time will be the longest duration of the five operations, not the sum of the durations.
Pre-fetch. If you can predict (with good probability) what the user will do (e.g. almost certainly he or she will want to see the details...), you may start preparing the next page (frame, window) while the user is reading the previous one.
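A minimal sketch of the pre-fetch idea (LoadDetailsAsync and the other identifiers are hypothetical):

// Start loading the details while the user is still reading the current page...
Task<OrderDetails> prefetch = LoadDetailsAsync(selectedOrderId);

// ...and await it only when the user actually navigates; by then it is often already done.
OrderDetails details = await prefetch;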
Where did you get 30000 from? I don't remember exactly, but I think ASP.NET uses 12 x (number of cores) threads.
I have to use async when an operation takes too long (upload, export, processing) and the user has to know about its progress.
You need async in the following scenarios:
1) When you are performing a very long operation and you don't want to freeze your UI.
2) When you have designed some task that needs to be completed in the background.
For example, you are rendering images from a database, but you don't want your page to freeze during that time; async is really helpful there.
Sometimes there is a lot that needs to be done when a given Action is called. Many times, there is more that needs to be done than what is needed to generate the next HTML for the user. In order to give the user a faster experience, I want to do only what I need to do to get them their next view and send it off, but still do more things afterwards. How can I do this? Multi-threading? Would I then need to worry about making sure different threads don't step on each other's feet? Is there any built-in functionality for this type of thing in ASP.NET MVC?
As others have mentioned, you can use a spawned thread to do this. I would take care to consider the 'criticality' of several edge cases:
If your background task encounters an error, and fails to do what the user expected to be done, do you have a mechanism to report this failure to the user?
Depending on how 'business critical' the various tasks are, using a robust/resilient message queue to store 'background tasks to be processed' will help protect against a scenario where the user requests some action, and the server responsible crashes, or is taken offline, or the IIS service is restarted, etc., and the background thread never completes.
Just food for thought on other issues you might need to address.
How can I do this, multi-threading?
Yes!
Would I then need to worry about making sure different threads don't step on each other's feet?
This is something you need to take care of anyway, since two different ASP.NET requests could arrive at the same time (from different clients) and be handled in two different worker threads simultaneously. So, any code accessing shared data needs to be coded in a thread-safe way anyway, even without your new feature.
Is there any built in functionality for this type of thing in ASP.NET MVC?
The standard .net multi-threading techniques should work just fine here (manually starting threads, or using the Task features, or using the Async CTP, ...).
It depends on what you want to do, and how reliable you need it to be. If the operations pending after the response was sent are OK to be lost, then .NET async calls, the ThreadPool or a new Thread are all going to work just fine. If the process crashes the pending work is lost, but you already accepted that this can happen.
If the work requires any reliability guarantee, for instance the work incurs updates in the site database, then you cannot use the .NET process threading; you need to persist the request to do the work and then process this work even after a process restart (app-pool recycles, as IIS so kindly calls them).
One way to do this is to use MSMQ. Another way is to use a database table as a queue. The most reliable way is to use the database activation mechanisms, as described in Asynchronous procedure execution.
You can start a background task, then return from the action. This example uses the Task Parallel Library, found in .NET 4.0:
public ActionResult DoSomething()
{
Task t = new Task(()=>DoSomethingAsynchronously());
t.Start();
return View();
}
I would use MSMQ for this kind of work. Rather than spawning threads in an ASP.NET application, I'd use an Asynchronous out of process way to do this. It's very simple and very clean.
In fact I've been using MSMQ in ASP.NET applications for a very long time and have never had any issues with this approach. Further, having a different process (that is, an executable in a different app domain) do the long-running work is an ideal way to handle it, since your web application is not being used to do this work. So IIS, the threadpool and your web application can continue to do what they need to, while other processes handle long-running tasks.
Maybe you should give it a try: Using an Asynchronous Controller in ASP.NET MVC
I'm looking for a good strategy to truly decouple, for parallel processing, my web application's (ASP.NET MVC/C#) non-immediate processes. I define non-immediate as everything that doesn't require to be done right away to render a page or update information.
Those processes include sending email, updating some internal statistics based on database information, fetching outside information from web services which only needs to be done periodically and so forth.
Some communication needs to exist between the main ASP.NET MVC application and those background tasks though; e.g. the MVC application needs to inform the emailing process to send something out.
What is the best strategy to do this? MSMQ? Turn all those non-immediate processes into windows services? I'm imagining a truly decoupled scenario, but I don't want a trade off that makes troubleshooting/unit testing much harder or introduces vast amounts of code.
Thank you!
Can't speak for ASP.NET as I work primarily in Python, but...luckily I can answer this one as it's more of a meta-language question.
I've typically done this with a queue-based backend daemon which runs independently. When you need to add something to the queue, you can IPC with a method of your choice (I'm partial to HTTP) and deliver a job. The daemon just knocks through the jobs one by one -- possibly delegating them to worker threads itself. You can bust out of the RESTful side of your application and fire off jobs to the backend, i.e.:
# In frontend (sorry for Python, should be clear)
...
backend_do_request("http://loadbalancer:7124/ipc", my_job)
...
# In backend (pseudo-Python)
while 1:
    job = wait_for_request()
    myqueue.append(job)
...
def workerthread():
    job = myqueue.pop()
    do_job(job)
If you later need to check in with the background daemon and ask "is job 2025 done?" you can account for that in your design.
If you want to do that with a Windows Service I would imagine you can. All it needs to do is listen on a port of your choice for whatever IPC you want to do -- I'd stick with network transports, as local IPC will assume same-machine and limit your scalability. Your unit testing shouldn't be that much harder; you can just account for the frontend and the backend as two different projects.
The ThreadPool in .NET is a queue-based worker pool; however, it is used internally by the ASP.NET host process, so if you try to utilize the ThreadPool more you may reduce the performance of the web server.
So you must create your own thread, mark it as background and let it poll every few seconds for job availability.
The best way to do this is to create a job table in the database, as follows:
Table: JobQueue
JobID (bigint, auto number)
JobType (sendemail,calcstats)
JobParams (text)
IsRunning (true/false)
IsOver (true/false)
LastError (text)
The JobThread class could look like the following:
class JobThread{
    static Thread bgThread = null;
    static AutoResetEvent arWait = new AutoResetEvent(false);

    public static void ProcessQueue(Job job)
    {
        // insert job in database
        job.InsertInDB();

        // start the worker if it is not created yet, otherwise wake it up
        if(bgThread == null){
            bgThread = new Thread(new ParameterizedThreadStart(WorkerProcess));
            bgThread.IsBackground = true;
            bgThread.Start();
        }
        else{
            arWait.Set();
        }
    }

    private static void WorkerProcess(object state){
        while(true){
            // fetch the next job where IsRunning = false and IsOver = false
            Job job = GetAvailableJob();
            if(job == null){
                arWait.WaitOne(10*1000); // wait ten seconds;
                                         // to reduce polling overhead,
                                         // increase the wait time
                continue;
            }
            job.IsRunning = true;
            job.UpdateDB();
            try{
                // depending upon job type do something...
            }
            catch(Exception ex){
                job.LastError = ex.ToString(); // important step:
                                               // this records the error in the job table
                                               // for later investigation
                job.UpdateDB();
            }
            job.IsRunning = false;
            job.IsOver = true;
            job.UpdateDB();
        }
    }
}
Note
This implementation is not recommended for high-memory-usage tasks; ASP.NET will give lots of memory-unavailability errors for big tasks. For example, we had a lot of image uploads and needed to create thumbnails and process them using Bitmap objects, and ASP.NET just won't allow you to use that much memory, so we had to create a Windows service of the same type.
By creating a Windows service you can create the same thread queue and utilize more memory easily, and for communication between ASP.NET and the Windows service you can use WCF or Mutex objects.
MSMQ
MSMQ is also great, but it increases the configuration work and it sometimes becomes difficult to trace errors. We avoid MSMQ because a lot of time gets spent looking for the answer to a problem in our code when the MSMQ configuration is actually the problem, and the errors sometimes don't give enough information about where exactly the problem is. In our custom solution we can create a full debug version with logs to trace errors. And that's the biggest advantage of managed programs; in earlier Win32 apps, the errors were really difficult to trace.
The simplest way to handle async processing in ASP.NET is to use the ThreadPool to create a worker that you hand your work off to. Be aware that if you have lots of small jobs you are trying to hand off quickly, the default ThreadPool has some annoying lock-contention issues. In that scenario, you either need to use .NET 4.0's new work-stealing ThreadPool, or you can use MindTouch's Dream library, which has a work-stealing thread pool implementation (along with tons of other async helpers) and works with 3.5.
NServiceBus sounds like it might be applicable here, though under the covers it would probably use MSMQ. Essentially it sounds like you're after doing asynchronous work, which .NET has good mechanisms for dealing with.
We've done this with the Workflow API, or if it's not imperative that it execute, you could use a simple delegate.BeginInvoke to run it on a background thread.
This is a pattern that I tend to think of as 'Offline Services', and I've usually implemented it as a Windows service that can run multiple tasks on their own schedules.
Each task implements a business process such as sending pending emails from a message queue or database table, writing queued log messages to an underlying provider, or performing some batch processing that needs to happen at regular intervals, such as archiving old data or importing data objects from incoming feeds.
The advantage of this approach is that you can build in full management capabilities into the task management service, such as tracing, impersonation, remote integration via WCF, and error handling and reporting, all while using your .NET language of choice to implement the tasks themselves.
There are a few scheduling APIs out there, such as Quartz.NET, that can be used as the starting point for this sort of system. In terms of multi-threading, my general approach is to run each task on its own worker thread, but to only allow one instance of a task to be running at a given time. If a task needs parallel execution then that is implemented in the task body as it will be entirely dependent on the work the task needs to do.
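A hedged sketch of what that starting point might look like with Quartz.NET 3.x (the job name and interval are illustrative, not from the original answer):

// One worker per task: mark the job class [DisallowConcurrentExecution] so only one
// instance of it runs at a given time, as described above.
IScheduler scheduler = await new StdSchedulerFactory().GetScheduler();
await scheduler.Start();

IJobDetail job = JobBuilder.Create<ArchiveOldDataJob>()
    .WithIdentity("archive-old-data")
    .Build();

ITrigger trigger = TriggerBuilder.Create()
    .StartNow()
    .WithSimpleSchedule(s => s.WithIntervalInHours(24).RepeatForever())
    .Build();

await scheduler.ScheduleJob(job, trigger);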
My view is that a web application should not be managing these sorts of tasks at all, as the web application's purpose is to handle requests from your users, not manage intermediate background jobs. It's a lot of work to build a system like this initially, but you'll be able to re-use it on virtually any project.
A windows service managing these tasks, using a ThreadPool, and communicating with it via an MSMQ is certainly my preferred approach. Nicely scalable as well, due to the public queue abilities.
If you can develop for .NET 4 Framework then you can decouple by using F# or the Parallel Computing features (http://msdn.microsoft.com/en-us/library/dd460693(VS.100).aspx)
F# is designed to support parallel computing so it may be a better choice than moving code into services.
Though, if you wanted, you could just use WCF and off-load everything to webservices, but that may not really solve your problem as it just moves the issues elsewhere.
EDIT: Moving the non-essential work to web services may make the most sense then, and this is a standard practice where the web server is outside of the firewall, and therefore vulnerable, so all the real work is done by other servers, and the web server is just responsible for static pages and rendering.
You can use Spring.NET for this, if you don't want to add webservices, but either way you are largely just calling a remote process to do the work.
This is scalable as you can separate the business logic to several different servers, and since the webserver is largely just the view part of MVC it can handle more requests than if all the MVC work is in the webserver.
Because it is designed for this, Spring.NET should be easier to test. Web services can also be tested, as you should test each part separately and then do functional tests, but by using Spring.NET it is easier to mock out layers.
MSMQ is an awesome way to do this. A web farm can feed requests into one or more queues. The queues can be serviced by one or more processes on one or more servers, giving you scale and redundancy. (Run MSMQ on a cluster if you want to remove the single point of failure.) We did this about 8-9 years back and it was awesome watching it all run :) And even back then MSMQ was dead simple to use (from COM) -- I have to imagine things have only gotten better with .NET.
Following sound software engineering principles will keep your unit testing complexity to a minimum. Follow SRP (Single Responsibility Principle). This is especially the case for multi-threaded code which sounds like where you're headed. Robert Martin addresses this in his book "Clean Code".
To answer your question, there are, as you've seen from the array of posts, many ways to solve background processing. MSMQ is a great way to communicate with background processes and is also a great mechanism for addressing reliability (eg., request 5 emails sent, expect 5 emails sent).
A really simple and effective way to run a background process in ASP.NET is to use a background worker. You need to understand whether the background worker (a thread) runs in the application's domain or in inetinfo. If it's in the app domain, then the trade-off is that you'll lose the thread when the app pool recycles. If you need it to be durable, then it should be carved out into its own process (e.g., a Windows Service). If you look into WCF, Microsoft addresses WS-Reliability using MSMQ. The better news is you can host WCF services in a Windows Service. One-way calls to the service suffice to eliminate blocking on the web server, which effectively gives you a background process.
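A hedged sketch of such a one-way WCF contract (the service name and operation are illustrative):

[ServiceContract]
public interface IEmailService
{
    // IsOneWay = true: the web server fires the call and returns immediately,
    // so the request thread is not blocked while the email is processed.
    [OperationContract(IsOneWay = true)]
    void QueueEmail(string to, string subject, string body);
}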
James Black mentions using Spring.NET. I agree with his recommendation for 2 reasons: 1) because Spring.NET's support for services and the web is superior to other frameworks, and 2) Spring.NET forces you to decouple, which also simplifies testing.
Back on track:
1: Background worker - tradeoff is it's closely tied to the app pool/app domain and you're not separating effectively. Good for simple one-off type jobs (image resizing, etc). In-memory queues are volatile which can mean loss of data.
2: Windows Service - tradeoff is deployment complexity (although I'll argue this is minimal). If you will have families of low-resource-utilized background processes, opt for pluggability and host all in one Windows Service. Use durable storage (MSMQ, DB, FILE) for job requests and plan for recovery in your design. If you have 100 requests in queue and the Windows service restarts, it should be written so it immediately checks the queue for work.
3: WCF hosted in IIS - about the same complexity as (2), as I would expect the Windows Service to host WCF and that would be the communication mechanism between ASP.NET and the service. I don't personally like the "dump and run" design (where ASP.NET writes to a queue) because it reduces clarity and you're ultimately tightly coupling to MSMQ.