I'm wondering if the following code has any gotchas that I'm not aware of when running on a web server. Reading through the excellent series http://reedcopsey.com/series/parallelism-in-net4/ I was unable to find anything that relates specifically to my question, and the same goes for MSDN, so I thought I'd bring it here.
Example call:
public ActionResult Index() {
    ViewBag.Message = "Welcome to ASP.NET MVC!";
    Task.Factory.StartNew(() => {
        //This is some long completing task that I don't care about
        //Say logging to the database or updating certain information
        System.Threading.Thread.Sleep(10000);
    });
    return View();
}
ASP.NET supports asynchronous pages (see Asynchronous Pages in ASP.NET), but that is a complicated programming model and does not fit MVC at all. That being said, launching asynchronous tasks from a synchronous request handler works up to a point:
If the rate at which requests add new tasks exceeds the average rate of processing, your process will eventually crash. The tasks take up live memory, and eventually they will fill the in-memory queues where they are stored and you will start getting failures to submit.
.NET Tasks are inherently unreliable because they lack persistent storage, so all tasks that are submitted asynchronously must be treated as 'abandonware', i.e. if they never complete there is no loss to the application or to the user making the request. If the task is important, it must be submitted through a reliable mechanism that guarantees execution in the presence of failures, like the one presented in Asynchronous procedure execution.
One important thing in this case is to wrap the code inside the task in a try/catch block; otherwise any exception thrown on that thread will propagate unhandled. You should also make sure that the long-running task does not access any HttpContext members such as Request, Response, or Session, as they might no longer be available by the time you access them.
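The advice above might look like the following sketch. `BackgroundWork.Start` and the `onError` callback are hypothetical names, not part of ASP.NET; `onError` stands in for whatever logging the application uses.

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundWork
{
    // Hypothetical helper: runs 'work' as a task and routes any exception
    // to 'onError' instead of letting it escape the task unobserved.
    public static Task Start(Action work, Action<Exception> onError)
    {
        return Task.Factory.StartNew(() =>
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                onError(ex);
            }
        });
    }
}
```

In an action method, any values the task needs (a user name read from the request, for example) would be copied into local variables first and captured by the lambda, so the task itself never touches HttpContext.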
Use new Thread instead of Task.Factory.StartNew. Task.Factory.StartNew uses a thread from the thread pool, and if you have many background tasks the thread pool will run out of threads and degrade your web application. Requests will be queued and your web app will eventually die :)
You can test whether your background work is executed on the thread pool using Thread.CurrentThread.IsThreadPoolThread. If you get true, the thread pool is used.
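A minimal sketch of that check, assuming the default task scheduler (as in a plain console program). The class and method names are made up for illustration.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class PoolCheck
{
    // Work started through Task.Factory.StartNew: reports whether it ran on a pool thread.
    public static bool ViaTask()
    {
        return Task.Factory.StartNew(() => Thread.CurrentThread.IsThreadPoolThread).Result;
    }

    // Work started on a dedicated thread: reports the same flag for that thread.
    public static bool ViaThread()
    {
        bool onPool = true;
        var thread = new Thread(() => onPool = Thread.CurrentThread.IsThreadPoolThread)
        {
            IsBackground = true
        };
        thread.Start();
        thread.Join();
        return onPool;
    }
}
```

With the default scheduler, `ViaTask()` reports a pool thread and `ViaThread()` does not.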
I wanted to ask you about async/await. Namely, why does it always need to be used? (all my friends say so)
Example 1.
public async Task Boo()
{
    await WriteInfoIntoFile("file.txt");
    // some other logic...
}
I have a Boo method, inside which I write something to a file and then execute some logic. Asynchrony is used here so that the thread is not blocked while the information is being written to the file. Everything is logical.
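For illustration, a method like WriteInfoIntoFile might be written as below. This is only a sketch: the body is an assumption (the question does not show it), and it relies on `File.WriteAllTextAsync`, which needs .NET Core 2.0 or later.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class FileInfoWriter
{
    // Hypothetical implementation of WriteInfoIntoFile from Example 1.
    public static async Task WriteInfoIntoFile(string path)
    {
        // The method yields here until the write has finished.
        await File.WriteAllTextAsync(path, "some info");
    }

    public static async Task<int> Boo(string path)
    {
        await WriteInfoIntoFile(path);
        // "some other logic": runs only after the write has completed
        return File.ReadAllText(path).Length;
    }
}
```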
Example 2.
public async Task Bar()
{
    var n = await GetNAsync(nId);
    _uow.NRepository.Remove(n);
    await _uow.CompleteAsync();
}
But for the second example, I have a question. Why here asynchronously get the entity, if without its presence it will still be impossible to work further?
why does it always need to be used?
It shouldn't always be used. Ideally (and especially for new code), it should be used for most I/O-based operations.
Why here asynchronously get the entity, if without its presence it will still be impossible to work further?
Asynchronous code is all about freeing up the calling thread. This brings two kinds of benefits, depending on where the code is running.
If the calling thread is a UI thread inside a GUI application, then asynchrony frees up the UI thread to handle user input. In other words, the application is more responsive.
If the calling thread is a server-side thread, e.g., an ASP.NET request thread, then asynchrony frees up that thread to handle other user requests. In other words, the server is able to scale further.
Depending on the context, you might or might not get some benefit. In case you call the second function from a desktop application, it allows the UI to stay responsive while the async code is being executed.
Why here asynchronously get the entity, if without its presence it will still be impossible to work further?
You are correct in the sense that this stream of work cannot proceed, but using async versions allows freeing up the thread to do other work:
I like this paragraph from Using Asynchronous Methods in ASP.NET MVC 4 to explain the benefits:
Processing Asynchronous Requests
In a web app that sees a large number of concurrent requests at start-up or has a bursty load (where concurrency increases suddenly), making web service calls asynchronous increases the responsiveness of the app. An asynchronous request takes the same amount of time to process as a synchronous request. If a request makes a web service call that requires two seconds to complete, the request takes two seconds whether it's performed synchronously or asynchronously. However during an asynchronous call, a thread isn't blocked from responding to other requests while it waits for the first request to complete. Therefore, asynchronous requests prevent request queuing and thread pool growth when there are many concurrent requests that invoke long-running operations.
Not sure what you mean by
without its presence it will still be impossible to work further
regarding example 2. As far as I can tell this code gets an entity by id from its repository asynchronously, removes it, then completes the transaction on its Unit of Work. Do you mean why it does not simply remove the entry by id? That would certainly be an improvement, but it would still leave you with an asynchronous method, since CompleteAsync is obviously asynchronous.
As to your general question, I don't think there is a general consensus to always use async/await.
In your second example, the async/await keywords get the value of the n variable asynchronously. This might be necessary because the GetNAsync method is likely performing some time-consuming operation, such as querying a database or calling a web service downstream, that would otherwise block the calling thread. With await, the rest of the Bar method still waits for the result, but the thread is released to do other work while the query is in progress.
But if GetNAsync just calls another local method that does some basic CPU-bound work, then async is pointless in my view. Async works well when you know you will have to wait, such as on network calls or other I/O-bound calls that will definitely add latency to your stack.
So I have been trying to get a grasp of this for quite some time now, but I couldn't see the sense in declaring every controller endpoint as an async method.
Let's look at a GET request to visualize the question.
This is my way to go with simple requests, just do the work and send the response.
[HttpGet]
public IActionResult GetUsers()
{
    // Do some work and get a list of all user-objects.
    List<User> userList = _dbService.ReadAllUsers();
    return Ok(userList);
}
Below is the async Task<IActionResult> option I see very often. It does the same as the method above, but the method itself returns a Task. One could think that this one is better because you can have multiple requests coming in and they get worked on asynchronously, BUT I tested this approach and the approach above with the same results. Both can take multiple requests at once. So why should I choose this signature instead of the one above? I can only see negative effects, like the code being transformed into a state machine because it is async.
[HttpGet]
public async Task<IActionResult> GetUsers()
{
    // Do some work and get a list of all user-objects.
    List<User> userList = _dbService.ReadAllUsers();
    return Ok(userList);
}
This approach below is also something I don't quite grasp. I see a lot of code with exactly this setup: one async method that is awaited, and then the result is returned. Awaiting like this makes the code sequential again instead of having the benefits of Multitasking/Multithreading. Am I wrong on this one?
[HttpGet]
public async Task<IActionResult> GetUsers()
{
    // Do some work and get a list of all user-objects.
    List<User> userList = await _dbService.ReadAllUsersAsync();
    return Ok(userList);
}
It would be nice if you could enlighten me with facts so I can either continue developing like I do right now or know that I have been doing it wrong due to misunderstanding the concept.
Please read the "Synchronous vs. Asynchronous Request Handling" section of Intro to Async/Await on ASP.NET.
Both can take multiple requests at once.
Yes. This is because ASP.NET is multithreaded. So, in the synchronous case, you just have multiple threads invoking the same action method (on different controller instances).
For non-multithreaded platforms (e.g., Node.js), you have to make the code asynchronous to handle multiple requests in the same process. But on ASP.NET it's optional.
Awaiting like this makes the code sequential again instead of having the benefits of Multitasking/Multithreading.
Yes, it is sequential, but it's not synchronous. It's sequential in the sense that the async method executes one statement at a time, and that request isn't complete until the async method completes. But it's not synchronous - the synchronous code is also sequential, but it blocks a thread until the method completes.
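The "sequential but not synchronous" distinction can be sketched outside ASP.NET. In the hypothetical `SequentialDemo` below, `pendingIo` stands in for an I/O operation that has not finished yet.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialDemo
{
    public static async Task<List<string>> RunAsync(Task pendingIo)
    {
        var log = new List<string>();
        log.Add("before await");
        // If pendingIo is not finished, control returns to the caller here
        // and no thread is blocked while waiting.
        await pendingIo;
        // This statement still runs strictly after the awaited operation.
        log.Add("after await");
        return log;
    }
}
```

A caller that passes in an unfinished task (for example the `Task` of a `TaskCompletionSource<bool>`) gets back an incomplete task immediately; once the pending operation completes, the log holds the two entries in order.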
So why should I choose this signature instead of the one above?
If your backend can scale, then the benefit of asynchronous action methods is scalability. Specifically, asynchronous action methods yield their thread while the asynchronous operation is in progress - in this case, GetUsers is not taking up a thread while the database is performing its query.
The benefit can be hard to see in a testing environment, because your server has threads to spare, so there's no observable difference between calling an asynchronous method 10 times (taking up 0 threads) and calling a synchronous method 10 times (taking up 10 threads, with another 54 to spare). You can artificially restrict the number of threads in your ASP.NET server and then do some tests to see the difference.
In a real-world server, you usually want to make it as asynchronous as possible so that your threads are available for handling other requests. Or, as described here:
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.
Bear in mind the "if" above; this particularly applies to existing code. If you have just a single SQL server backend, and if pretty much all your actions query the db, then changing them to be asynchronous may not be useful, since the scalability bottleneck is usually the db server and not the web server. But if your web app could use threads to handle non-db requests, or if your db backend is scalable (NoSQL, SQL Azure, etc), then changing action methods to be asynchronous would likely help.
For new code, I recommend asynchronous methods by default. Asynchronous makes better use of server resources and is more cloud-friendly (i.e., less expensive for pay-as-you-go hosting).
If your DB Service class has an async method for getting the user then you should see benefits. As soon as the request goes out to the DB then it is waiting on a network or disk response back from the service at the other end. While it is doing this the thread can be freed up to do other work, such as servicing other requests. As it stands in the non-async version the thread will just block and wait for a response.
With async controller actions you can also get a CancellationToken which will allow you to quit early if the token is signalled because the client at the other end has terminated the connection (but that may not work with all web servers).
[HttpGet]
public async Task<IActionResult> GetUsers(CancellationToken ct)
{
    ct.ThrowIfCancellationRequested();
    // Do some work and get a list of all user-objects.
    List<User> userList = await _dbService.ReadAllUsersAsync(ct);
    return Ok(userList);
}
So, if you have an expensive set of operations and the client disconnects, you can stop processing the request because the client won't ever receive it. This frees up the application's time and resources to deal with requests where the client is interested in a result coming back.
However, if you have a very simple action that requires no async calls then I probably wouldn't worry about it, and leave it as it is.
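The cancellation behaviour described above can be tried without a web server. In this sketch, `UserReader.ReadAllUsersAsync` is a hypothetical stand-in for `_dbService.ReadAllUsersAsync(ct)`, and `Task.Delay` plays the role of the database round trip.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class UserReader
{
    // Hypothetical stand-in for a database call that honours cancellation.
    public static async Task<List<string>> ReadAllUsersAsync(CancellationToken ct)
    {
        ct.ThrowIfCancellationRequested();   // bail out before starting any work
        await Task.Delay(50, ct);            // the simulated query also observes the token
        return new List<string> { "alice", "bob" };
    }
}
```

If the token is already cancelled when the method is called, the returned task ends up in the Canceled state and no further work is done; with an uncancelled token the list is returned as usual.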
You are missing something fundamental here.
When you use async Task<> you are effectively saying, "run all the I/O code asynchronously and free up processing threads to... process".
At scale, your app will be able to serve more requests/second because your I/O is not tying up your CPU intensive work and vice versa.
You're not seeing much benefit now, because you're probably testing locally with plenty of resources at hand.
See
Understanding CPU and I/O bound for asynchronous operations
Async in depth
As you know, ASP.NET is based on a multi-threaded model of request execution. To put it simply, each request is run on its own thread, in contrast to single-threaded execution approaches like the event loop in Node.js.
Now, threads are a finite resource. If you exhaust your thread pool (the available threads), you can't continue handling tasks efficiently (or, in most cases, at all). If you use synchronous execution for your action handlers, you occupy those threads from the pool even when they have nothing to do. It is important to understand that those threads handle requests; they don't do input/output. I/O is handled elsewhere, independently of the ASP.NET request threads.
So, if your action has a DB fetch operation that takes, let's say, 15 seconds to execute, you're forcing your current thread to sit idle for 15 seconds until the DB result is returned so it can continue executing code. Therefore, if you have 50 such requests, you'll have 50 threads occupied in what is basically sleep mode. Obviously, this is not very scalable and becomes a problem soon enough.
In order to use your resources efficiently, it is a good idea to preserve the state of the executing request when an I/O operation is reached and free the thread to handle another request in the meantime. When the I/O operation completes, the state is reassembled and given to a free thread in the pool to resume the handling. That's why you use async handling. It lets you use your finite resources more efficiently and prevents such thread starvation. Of course, it is not an ultimate solution. It helps you scale your application for higher load. If you don't need it, don't use it, as it just adds overhead.
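The scenario above (many requests that each wait on slow I/O) can be sketched outside ASP.NET. `SlowRequests` and its methods are hypothetical names, and `Task.Delay` stands in for the slow DB fetch, shortened to 100 ms.

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class SlowRequests
{
    // One simulated request: waits on "I/O" without holding a thread.
    public static async Task<int> HandleAsync(int id)
    {
        await Task.Delay(100); // stand-in for the database call
        return id;
    }

    // Starts 'count' requests at once and waits for all of them.
    public static async Task<int[]> HandleManyAsync(int count)
    {
        var pending = Enumerable.Range(0, count)
                                .Select(id => HandleAsync(id))
                                .ToArray();
        return await Task.WhenAll(pending);
    }
}
```

Calling `HandleManyAsync(50)` keeps 50 operations in flight without dedicating a thread to each one; replacing `Task.Delay` with `Thread.Sleep` would tie up a thread per request instead.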
The result returns a task from an asynchronous (non-blocking) operation which represents some work that should be done. The task can tell you if the work is completed and if the operation returns a result, the task gives you the result which won't be available until the task is completed. You can learn more about C# asynchronous programming from the official Microsoft Docs:
https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.task?view=net-5.0
Async operations return a Task (it is not the result, but a promise that the result will be available once the task is completed).
Async methods should be awaited so that execution waits until the task is completed. If you await an async method, the returned value won't be a task anymore; it will be the result you expected.
Imagine this async method:
async Task<SomeResult> SomeMethod()
{
    ...
}
and the following code:
var task = SomeMethod(); // task is a Task<SomeResult> which may not have completed yet
var result = await SomeMethod(); // result is of type SomeResult
Now, why do we use async methods instead of sync methods:
Imagine that a method is doing some job that may take a long time to complete. When it is done synchronously, the thread handling the request is blocked until that long-running job completes and cannot serve any other request in the meantime. When it is async, the thread is released while the operation is pending, and the method resumes (possibly on another thread) once it completes, so other requests are not held up.
I'm of the belief that you should never have to use Task.Run for any operation in a .NET Core web context. If you have a long-running or CPU-intensive task, you can offload it to a message queue for async processing. If you have a sync operation that has no equivalent async method, then offloading it to a background thread does nothing for you; it actually makes it slightly worse.
What am I missing? Is there a genuine reason to use Task.Run in a high throughput server application?
Some quick examples:
A logging system where each worker thread can write to a queue and a worker thread is responsible for dequeuing items and writing them to a log file.
To access an apartment-model COM server with expensive initialization, where it may be better to keep a single instance on its own thread.
For logic that runs on a timer, e.g. a transaction that runs every 10 minutes to update an application variable with some sort of status.
CPU-bound operations where individual response time is more important than server throughput.
Logic that must continue to run after the HTTP response has been completed, e.g. if the total processing time would otherwise exceed an HTTP response timeout.
Worker threads for system operations, e.g. a long running thread that checks for expired cache entries.
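The first item in the list (a logging queue drained by a single worker) could look roughly like this. It is a sketch under assumptions: `QueueLogger` and the `sink` callback are hypothetical, with `sink` standing in for the code that writes to the log file, and it uses a dedicated `Thread` rather than `Task.Run` for the long-lived worker.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class QueueLogger : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly Action<string> _sink;
    private readonly Thread _worker;

    public QueueLogger(Action<string> sink)
    {
        _sink = sink;
        _worker = new Thread(Drain) { IsBackground = true };
        _worker.Start();
    }

    // Called by request threads: enqueues and returns immediately.
    public void Log(string message) => _queue.Add(message);

    // Runs on the single worker thread: writes messages in arrival order.
    private void Drain()
    {
        foreach (var message in _queue.GetConsumingEnumerable())
        {
            _sink(message);
        }
    }

    // Stops accepting messages and waits until the queue has been drained.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```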
Just to back up your belief:
Do not: Call Task.Run and immediately await it. ASP.NET Core already runs app code on normal Thread Pool threads, so calling Task.Run only results in extra unnecessary Thread Pool scheduling. Even if the scheduled code would block a thread, Task.Run does not prevent that.
This is the official recommendation/best practice from Microsoft. Although it doesn't point out something you might have missed, it does tell you that it is a bad idea and why.
I'm using async/await in WebApi Controllers according to this article:
https://msdn.microsoft.com/en-us/magazine/dn802603.aspx
Have a look at this simplified code in my controller:
DataBaseData = await Task.Run( () => GetDataFunction() );
GetDataFunction is a function that will open a database connection, open a reader and read the data from the database.
In many examples I see it handled differently. The GetDataFunction itself is awaited. And within the function every single step is awaited.
For example:
connection.OpenAsync
reader.ReadAsync
reader.IsDBNullAsync
Why is this good practice? Why not just start a thread for the whole database access (with Task.Run)?
Update:
Thanks for the help. Now I get it. I had not understood that the asynchronous APIs do not start threads themselves. This really helped: blog.stephencleary.com/2013/11/there-is-no-thread.html
The article you linked to states:
You can kick off some background work by awaiting Task.Run, but there’s no point in doing so. In fact, that will actually hurt your scalability by interfering with the ASP.NET thread pool heuristics... As a general rule, don’t queue work to the thread pool on ASP.NET.
In other words, avoid Task.Run on ASP.NET.
Why is this good practice? Why not just start a thread for the whole database access (with Task.Run)?
It's because the asynchronous APIs do not use other threads. The entire point of async/await is to free up the current thread; not use another thread. I have a blog post describing how async works without needing threads.
So, the Task.Run (or custom thread) approach will use up (block) a thread while getting data from the database. Proper asynchronous methods (e.g., EF6 or the ADO.NET asynchronous APIs) do not; they allow the request thread to be used for other requests while that request is waiting for the database response.
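The difference between the two approaches can be sketched as follows. `GetData` and `GetDataAsync` are hypothetical stand-ins: `Thread.Sleep` and `Task.Delay` simulate the wait for the database, whereas real code would call the asynchronous ADO.NET or EF APIs.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class DataAccess
{
    // Synchronous version: the calling thread is blocked for the whole wait.
    public static string GetData()
    {
        Thread.Sleep(50);
        return "rows";
    }

    // Asynchronous version: no thread is held while the delay is pending.
    public static async Task<string> GetDataAsync()
    {
        await Task.Delay(50);
        return "rows";
    }

    // Wrapping the blocking call only moves the blocking to another pool thread.
    public static Task<string> ViaTaskRun() => Task.Run(() => GetData());

    // Awaiting the truly asynchronous call blocks nothing.
    public static Task<string> ViaAsync() => GetDataAsync();
}
```

Both variants return the same data; the difference is only in how many threads sit idle while waiting.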
I assume you are wondering why one would use
await Task.Run(() => GetDataFunction());
instead of
await GetDataFunction();
As you can see in the Remarks section of Task.Run(Func), it says:
The Run(Func) method is used by language compilers to support the async and await keywords. It is not intended to be called directly from user code.
That means you should use await GetDataFunction().
It's all about resource management and sharing.
When using the connection.Open() method, the thread that calls it has to wait for the connection to actually open. While waiting it does no useful work, yet it stays tied up and cannot be used for anything else. However, when you await connection.OpenAsync(), the thread is released and can be given other work.
Now, there's a catch: to actually gain from asynchronous calls, they should take longer than the time it takes for the OS to switch contexts; otherwise they will incur a drop in application performance.
The practice of awaiting connection.OpenAsync(), reader.ReadAsync() etc. is due to the fact that these are potentially long-running operations. In a system where the database resides on another machine and is under a heavy load of requests, connecting to the database and getting the results of a query will take some time. Instead of blocking a thread while waiting for the results to arrive, why not let it render the response for another client?
Now, about starting another thread for data access: don't do that! Each new thread gets allocated about 1 MB of memory, so besides tying up a thread while waiting for database operations to finish, you'll be wasting memory as well. Furthermore, such a heavy memory footprint would require a lot more runs of the garbage collector, which will freeze all of your threads while it runs.
Using Task.Run() will schedule the data access operations on a thread from the ThreadPool, but that thread is then blocked while waiting for the database server to respond, so you lose the benefit of freeing it for other work.
I'm developing an MVC web application that allows me to manage my data asynchronously through a web service.
It is my understanding this allows the CPU threads that access the app pool for the server upon which this website is running to return to the app pool after making a request so that they can be used to service other requests without stalling the entire thread.
Assuming my understanding is correct (although it may be badly worded), I got to thinking about when I should await things. Consider the function below:
public async Task<ActionResult> Index()
{
    List<User> users = new List<User>();
    using (var client = new HttpClient())
    {
        client.BaseAddress = new Uri("http://localhost:41979");
        client.DefaultRequestHeaders.Accept.Clear();
        client.DefaultRequestHeaders.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/json"));
        HttpResponseMessage response = await client.GetAsync("api/user/");
        if (response.IsSuccessStatusCode)
        {
            users = await response.Content.ReadAsAsync<List<User>>();
        }
    }
    return View(users);
}
All of my functions look similar except that they do different things with the data returned by the web service and I got to wondering, should I not await the return as well?
Something like:
return await View(users);
or
await return View(users);
I mean the website runs just fine so far except that I've had a bit of confusion to do with exactly what the web service should send back to the client website, but as I'm new to development involving a web service, I'm still wondering if I'm doing things properly and this has been eating at me for some time.
You can only await named or anonymous methods and lambda expressions which expose an asynchronous operation via Task, Task<T> or a custom awaiter. Since View by itself does nothing asynchronous, you can't await it.
What is actually the point of using await? Usually, you have some I/O-bound operations which are asynchronous by nature. By using async APIs, you allow the thread to return to the thread pool instead of blocking, so it can serve other requests. async does not change the nature of HTTP. It is still "request-response". When an async method yields control, it does not return a response to the client. It will only return once the action has completed.
View(object obj) returns a ViewResult, which in turn will transform your object into the desired output. ViewResult is not awaitable, as it doesn't expose any "promise" via an awaitable object. Thus, you can't asynchronously wait on it.
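The same rule can be shown without MVC types. In this sketch, `Render` is a hypothetical synchronous method playing the role of View(), and `fetch` stands in for the awaited web service call.

```csharp
using System.Threading.Tasks;

public static class AwaitableDemo
{
    // Synchronous, like View(): returns a plain value, so there is nothing to await.
    public static string Render(string model) => "<p>" + model + "</p>";

    public static async Task<string> HandleAsync(Task<string> fetch)
    {
        string model = await fetch;   // awaitable: it is a Task<string>
        return Render(model);         // plain value: the async method wraps it in its Task
    }
}
```

Writing `await Render(model)` would not compile, because `string` (like `ViewResult`) does not expose an awaiter.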
I got to thinking about when I should await things
It's better to always await the result of async calls.
If you don't await, you fire and forget: you don't receive a response on your end in either the success or the error case.
You "await" when you need to unwrap the async task as a value. Otherwise you can return the task and let the caller await it later if needed. Also note that .Wait() is not the same as await: Wait() will block until the task has finished (a side note, I know). Also check your code; you have a syntax error in your method signature:
public async <Task>ActionResult> Index()
Should be
public async Task<ActionResult> Index()
I think it is a very good question and also very hard to answer it especially in the case of a web site.
I've had a bit of confusion to do with exactly what the web service should send back to the client website
The most important thing to understand is that if you use async/await, your action method code is still serialized; I mean the next line will be executed only after the async operation has finished. However, there will be (at least) three threads:
The original web server worker thread on which the action method is called. Originally the MVC infrastructure got this thread from the pool and dedicated it to serving the current request.
Optional thread(s) started by the async operation that is being awaited.
A continuation thread (also a worker from the pool) on which your action method continues after the await. Note that this thread will have the same context (HttpContext, user culture, etc.) as the original worker, so it's transparent for you, but it will be another thread freshly allocated from the pool.
At first sight this makes no sense: why all this thread hocus-pocus if the operations in the action method are still serialized? It takes the same time... To understand this, we must take a look at a desktop application, say a WPF app.
In short: there is a message loop and a dedicated thread which reads the UI (and other virtual) events from a queue and executes the event handlers, such as a button click handler, in the context of this thread. If you block this thread for 2 seconds, the UI will be unresponsive for that time. That's why we would like to do many things asynchronously on another thread and let the dedicated (UI) thread return quickly and process other messages from the queue. However, this could be very inconvenient, because we sometimes want to continue with something after the async operation has ended and we have the result. The C# async/await feature made this very convenient. In this case the continuation thread is always the dedicated UI thread, but in a (much) later round of its endless loop. To wrap it up:
In an event handler you can use async/await and still execute your operations serially; the continuation will always run on the original dedicated UI thread, but during the await this thread can loop and process other messages.
Now back to the MVC action method: in this case your action method is the analogue of the event handler. It is still hard to see the use of the threading complication, though: here, blocking the action method would not block anything else (as it blocked the dedicated UI thread in WPF). Here are the facts:
Other clients' (browsers') requests will be served by other threads, so we are not blocking anything by blocking this thread (*) keep reading.
This request will not be served faster even if we use async/await, because the operations are serialized and we must wait (await) for the result of the async operation. (Do not confuse async/await with parallel processing.)
So the use of async/await and the transparent thread tricks is not as obvious as it was in the case of the desktop application.
Still: the pool is not an endless resource. Releasing the worker thread that was originally dedicated to serving the request back to the pool as soon as possible (via await), and letting the method continue on another thread after the async operation completes, may lead to better utilization of the thread pool and to better performance of the web server under extreme stress.