I have watched a few videos on YouTube by Les Jackson on how to create a .NET Core API. Les uses this approach for a method in his controller:
public ActionResult<IEnumerable<Command>> GetAllCommands()
{
var commandItems = _repository.GetAppCommands();
return Ok(commandItems);
}
However I have just read an article in a book that uses this approach
public async Task<ActionResult<IEnumerable<Data.CityDataClass>>> GetCities()
{
return await _repository.GetCities();
}
To my thinking, the second method is overkill with its "async" declaration. Is that right? The first method won't return anything until it has the results anyway.
Thanks
You can understand this better if you read this article on I/O threads
To use asynchronous and parallel features of the .NET properly, you should also understand the concept of I/O threads.
Not everything in a program consumes CPU time. When a thread tries to read data from a file on disk or sends a TCP/IP packet through network, the only thing it does is delegate the actual work to a device - disk or network adapter - and wait for results.
I/O thread is an abstraction intended to hide work with devices behind a simple and familiar concept. The main point here is that you don’t have to work with those devices in a different way, you can think of the pipeline inside them just like it’s a usual CPU consuming thread. At the same time, I/O threads are extremely cheap in comparison with CPU-bound threads, because, in fact, they are merely requests to devices
So with the first way, you are not utilizing the I/O threads but with the second you do, and it's much cheaper.
In your case, when accessing the Database, you are effectively doing an I/O (network) operation. By using asynchronous programming, you are awaiting on an I/O thread for the I/O (db read) operation to complete, without blocking a thread.
Edit (After Liam's comment)
Another interesting read is async in depth
Throughout this entire process, a key takeaway is that no thread is dedicated to running the task. Although work is executed in some context (that is, the OS does have to pass data to a device driver and respond to an interrupt), there is no thread dedicated to waiting for data from the request to come back. This allows the system to handle a much larger volume of work rather than waiting for some I/O call to finish.
For even more information check the asynchronous overhead here
If the async method completes synchronously the performance overhead is fairly small.
If the async method completes synchronously the following memory overhead will occur: for async Task methods there is no overhead, for async Task<T> methods the overhead is 88 bytes per operation (on x64 platform).
ValueTask can remove the overhead mentioned above for async methods that complete synchronously.
A ValueTask-based async method is a bit faster than a Task-based method if the method completes synchronously and a bit slower otherwise.
A performance overhead of async methods that await non-completed task is way more substantial (~300 bytes per operation on x64 platform).
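To make the ValueTask point concrete, here is a minimal sketch, assuming a hypothetical in-memory cache in front of a slow lookup. The names GetCityAsync and LoadCityAsync, and the use of Task.Delay as a stand-in for a database call, are illustrative and not from the article quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical in-memory cache; the names here are illustrative only.
var cache = new Dictionary<int, string> { [1] = "cached city" };

// On a cache hit the result is wrapped directly in a ValueTask (no Task allocation);
// on a miss the method falls back to a genuinely asynchronous path.
ValueTask<string> GetCityAsync(int id)
{
    if (cache.TryGetValue(id, out var name))
        return new ValueTask<string>(name);          // completes synchronously
    return new ValueTask<string>(LoadCityAsync(id)); // wraps a real Task
}

async Task<string> LoadCityAsync(int id)
{
    await Task.Delay(10);                            // stands in for a database call
    var name = $"city {id}";
    cache[id] = name;
    return name;
}

ValueTask<string> hit = GetCityAsync(1);
bool hitCompletedSynchronously = hit.IsCompletedSuccessfully;
string hitValue = await hit;                         // a ValueTask should be awaited only once

string missValue = await GetCityAsync(2);
Console.WriteLine($"{hitValue}, {missValue}, sync hit: {hitCompletedSynchronously}");
```

Whether this is worth it depends on how often the synchronous path is taken; for a method that almost always hits the database, a plain Task is probably simpler.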
Related
So I have been trying to get the grasp for quite some time now but couldn't see the sense in declaring every controller-endpoint as an async method.
Let's look at a GET-Request to visualize the question.
This is my way to go with simple requests, just do the work and send the response.
[HttpGet]
public IActionResult GetUsers()
{
// Do some work and get a list of all user-objects.
List<User> userList = _dbService.ReadAllUsers();
return Ok(userList);
}
Below is the async Task<IActionResult> option I see very often, it does the same as the method above but the method itself is returning a Task. One could think, that this one is better because you can have multiple requests coming in and they get worked on asynchronously BUT I tested this approach and the approach above with the same results. Both can take multiple requests at once. So why should I choose this signature instead of the one above? I can only see the negative effects of this like transforming the code into a state-machine due to being async.
[HttpGet]
public async Task<IActionResult> GetUsers()
{
// Do some work and get a list of all user-objects.
List<User> userList = _dbService.ReadAllUsers();
return Ok(userList);
}
The approach below is also something I don't quite grasp. I see a lot of code with exactly this setup: they await one async method and then return the result. Awaiting like this makes the code sequential again instead of having the benefits of multitasking/multithreading. Am I wrong on this one?
[HttpGet]
public async Task<IActionResult> GetUsers()
{
// Do some work and get a list of all user-objects.
List<User> userList = await _dbService.ReadAllUsersAsync();
return Ok(userList);
}
It would be nice if you could enlighten me with facts so I can either continue developing like I do right now or know that I have been doing it wrong due to misunderstanding the concept.
Please read the "Synchronous vs. Asynchronous Request Handling" section of Intro to Async/Await on ASP.NET.
Both can take multiple requests at once.
Yes. This is because ASP.NET is multithreaded. So, in the synchronous case, you just have multiple threads invoking the same action method (on different controller instances).
For non-multithreaded platforms (e.g., Node.js), you have to make the code asynchronous to handle multiple requests in the same process. But on ASP.NET it's optional.
Awaiting like this makes the code sequential again instead of having the benefits of Multitasking/Multithreading.
Yes, it is sequential, but it's not synchronous. It's sequential in the sense that the async method executes one statement at a time, and that request isn't complete until the async method completes. But it's not synchronous - the synchronous code is also sequential, but it blocks a thread until the method completes.
So why should I choose this signature instead of the one above?
If your backend can scale, then the benefit of asynchronous action methods is scalability. Specifically, asynchronous action methods yield their thread while the asynchronous operation is in progress - in this case, GetUsers is not taking up a thread while the database is performing its query.
The benefit can be hard to see in a testing environment, because your server has threads to spare, so there's no observable difference between calling an asynchronous method 10 times (taking up 0 threads) and calling a synchronous method 10 times (taking up 10 threads, with another 54 to spare). You can artificially restrict the number of threads in your ASP.NET server and then do some tests to see the difference.
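A rough way to see the "0 threads" claim outside a web server is the console sketch below. It assumes a hypothetical QueryAsync in which Task.Delay stands in for an asynchronous database call; it is not a benchmark of ASP.NET itself.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for an asynchronous database query:
// Task.Delay is timer-based and holds no thread while it is pending.
async Task<int> QueryAsync(int id)
{
    await Task.Delay(200);
    return id;
}

var stopwatch = Stopwatch.StartNew();

// Start 500 "requests" at once. If each one blocked a thread for 200 ms
// (for example with Thread.Sleep), finishing this quickly would need 500 threads.
int[] results = await Task.WhenAll(Enumerable.Range(0, 500).Select(id => QueryAsync(id)));

stopwatch.Stop();
Console.WriteLine($"{results.Length} queries finished in {stopwatch.ElapsedMilliseconds} ms");
```

Swapping the Task.Delay for a blocking Thread.Sleep wrapped in Task.Run, and lowering the pool limits, should make the difference visible, though the exact numbers depend on the machine.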
In a real-world server, you usually want to make it as asynchronous as possible so that your threads are available for handling other requests. Or, as described here:
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.
Bear in mind the "if" above; this particularly applies to existing code. If you have just a single SQL server backend, and if pretty much all your actions query the db, then changing them to be asynchronous may not be useful, since the scalability bottleneck is usually the db server and not the web server. But if your web app could use threads to handle non-db requests, or if your db backend is scalable (NoSQL, SQL Azure, etc), then changing action methods to be asynchronous would likely help.
For new code, I recommend asynchronous methods by default. Asynchronous makes better use of server resources and is more cloud-friendly (i.e., less expensive for pay-as-you-go hosting).
If your DB Service class has an async method for getting the user then you should see benefits. As soon as the request goes out to the DB then it is waiting on a network or disk response back from the service at the other end. While it is doing this the thread can be freed up to do other work, such as servicing other requests. As it stands in the non-async version the thread will just block and wait for a response.
With async controller actions you can also get a CancellationToken which will allow you to quit early if the token is signalled because the client at the other end has terminated the connection (but that may not work with all web servers).
[HttpGet]
public async Task<IActionResult> GetUsers(CancellationToken ct)
{
ct.ThrowIfCancellationRequested();
// Do some work and get a list of all user-objects.
List<User> userList = await _dbService.ReadAllUsersAsync(ct);
return Ok(userList);
}
So, if you have an expensive set of operations and the client disconnects, you can stop processing the request because the client won't ever receive it. This frees up the application's time and resources to deal with requests where the client is interested in a result coming back.
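The early exit can be tried outside a controller with a small console sketch. Here a CancellationTokenSource with a timeout simulates the client disconnecting, and the local ReadAllUsersAsync is a hypothetical long-running operation, not the real service method.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical long-running operation; Task.Delay stands in for the database call.
async Task<int> ReadAllUsersAsync(CancellationToken ct)
{
    int processed = 0;
    for (int i = 0; i < 50; i++)
    {
        ct.ThrowIfCancellationRequested(); // cheap check between units of work
        await Task.Delay(20, ct);          // cancellable wait
        processed++;
    }
    return processed;
}

// In ASP.NET Core the framework supplies the token for the action parameter;
// here a CancellationTokenSource simulates the client going away after 100 ms.
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(100));

bool wasCancelled = false;
try
{
    await ReadAllUsersAsync(cts.Token);
}
catch (OperationCanceledException)
{
    wasCancelled = true; // the remaining work was skipped
}

Console.WriteLine($"Cancelled: {wasCancelled}");
```

The token only helps if it is passed all the way down to the calls that do the waiting; a method that ignores it will run to completion regardless.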
However, if you have a very simple action that requires no async calls then I probably wouldn't worry about it, and leave it as it is.
You are missing something fundamental here.
When you use async Task<> you are effectively saying, "run all the I/O code asynchronously and free up processing threads to... process".
At scale, your app will be able to serve more requests/second because your I/O is not tying up your CPU intensive work and vice versa.
You're not seeing much benefit now, because you're probably testing locally with plenty of resources at hand.
See
Understanding CPU and I/O bound for asynchronous operations
Async in depth
As you know, ASP.NET is based on a multi-threaded model of request execution. To put it simply, each request is run in its own thread, contrary to other single thread execution approaches like the event-loop in Node.js.
Now, threads are a finite resource. If you exhaust your thread pool (available threads) you can't continue handling tasks efficiently (or in most cases at all). If you use synchronous execution for your action handlers, you occupy those threads from the pool (even if they don't need to be). It is important to understand that those threads handle requests, they don't do input/output. I/O is handled by separate processes which are independent of the ASP.NET request threads.
So, if in your action you have a DB fetch operation that takes let's say 15 seconds to execute, you're forcing your current thread to wait idle 15 seconds for the DB result to be returned so it can continue executing code. Therefore, if you have 50 such requests, you'll have 50 threads occupied in basically sleep mode. Obviously, this is not very scalable and becomes a problem soon enough.
In order to use your resources efficiently, it is a good idea to preserve the state of the executing request when an I/O operation is reached and free the thread to handle another request in the meantime. When the I/O operation completes, the state is reassembled and given to a free thread in the pool to resume the handling. That's why you use async handling. It lets you use your finite resources more efficiently and prevents such thread starvation.
Of course, it is not an ultimate solution. It helps you scale your application for higher load. If you don't have the need for it, don't use it as it just adds overhead.
The result returns a task from an asynchronous (non-blocking) operation which represents some work that should be done. The task can tell you if the work is completed and if the operation returns a result, the task gives you the result which won't be available until the task is completed. You can learn more about C# asynchronous programming from the official Microsoft Docs:
https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.task?view=net-5.0
Async operations return a Task. It is not the result itself, but a promise that the result will be available once the task is completed.
Async methods should be awaited so that execution waits until the task is completed. If you await an async method, the value you get is no longer a task but the result you expected.
Imagine this async method:
async Task<SomeResult> SomeMethod()
{
...
}
and the following code:
var task = SomeMethod(); // task is a Task<SomeResult> which may not have completed yet
var result = await SomeMethod(); // result is of type SomeResult
Now, why do we use async methods instead of sync methods:
Imagine that a method is doing some job that may take a long time to complete. When it runs synchronously, the thread handling the request is blocked until the job finishes and cannot serve any other request in the meantime. When it is async, the thread is released while the operation is pending, the remaining work resumes later on a (possibly different) thread, and other requests are not held up.
I am aware of how async await works. I know that when execution reaches an await, it releases the thread, and after IO completes, it fetches a thread from the thread pool and runs the remaining code. This way threads are efficiently utilized. But I am confused about some use cases:
Should we use async methods for the very fast IO method, like cache read/write method? Wouldn't they result in unnecessary context switches? If we use sync method, execution will complete on same thread and context switch may not happen.
Does async-await save only memory (by creating fewer threads), or does it save CPU as well? As far as I know, in case of sync IO, while IO takes place, thread goes into sleep mode. That means it does not consume cpu. Is this understanding correct?
I am aware of how async await works.
You are not.
I know that when execution reaches an await, it releases the thread
It does not. When execution reaches an await, the awaitable operand is evaluated, and then it is checked to see if the operation is complete. If it is not, then the remainder of the method is signed up as the continuation of the awaitable, and a task representing the work of the current method is returned to the caller.
None of that is "releasing the thread". Rather, control returns to the caller, and the caller keeps executing on the current thread. Of course, if the current caller was the only thing on this thread, then the thread is done. But there is no requirement that an async method be the only call on a thread!
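This can be observed directly with a small console sketch. It assumes a TaskCompletionSource standing in for an operation that has not finished yet; MethodAsync and the log list are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var log = new List<string>();
var pending = new TaskCompletionSource<int>(); // stands in for an operation still in flight

async Task<int> MethodAsync()
{
    log.Add("method: before await");
    int value = await pending.Task;   // not complete yet, so control returns to the caller here
    log.Add("method: after await");
    return value;
}

Task<int> task = MethodAsync();        // runs synchronously up to the first incomplete await
log.Add($"caller: continues, task completed = {task.IsCompleted}");

pending.SetResult(42);                 // the operation finishes; the rest of MethodAsync can run
int result = await task;

Console.WriteLine(string.Join(Environment.NewLine, log));
Console.WriteLine($"result = {result}");
```

The caller's line lands in the log between the two lines written by the method, which is the "control returns to the caller" behaviour described above.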
after IO completes
An awaitable need not be an IO operation, but let's suppose that it is.
it fetches a thread from the thread pool and runs the remaining code.
No. It schedules the remaining code to run on the correct context. That context might be a threadpool thread. It might be the UI thread. It might be the current thread. It might be any number of things.
Should we use async methods for the very fast IO method, like cache read/write method?
The awaitable is evaluated. If the awaitable knows that it can complete the operation in a reasonable amount of time then it is perfectly within its rights to do the operation and return a completed task. In which case there is no penalty; you're just checking a boolean to see if the task is completed.
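A minimal sketch of that fast path, assuming a hypothetical ReadFromCacheAsync that already has its value and so returns a completed task:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical cache read that already has its value: it returns a completed task.
Task<string> ReadFromCacheAsync() => Task.FromResult("cached value");

int threadBefore = Environment.CurrentManagedThreadId;

Task<string> read = ReadFromCacheAsync();
bool alreadyCompleted = read.IsCompleted;   // true: nothing is pending
string cached = await read;                 // no suspension; execution simply continues

int threadAfter = Environment.CurrentManagedThreadId;
Console.WriteLine($"completed = {alreadyCompleted}, same thread = {threadBefore == threadAfter}");
```

Because the task is already complete, the await does not schedule a continuation, and the code after it keeps running on the same thread.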
Wouldn't they result in unnecessary context switches?
Not necessarily.
If we use sync method, execution will complete on same thread and context switch may not happen.
I am confused as to why you think a context switch happens on an IO operation. IO operations run on hardware, below the level of OS threads. There's no thread sitting there servicing IO tasks.
Does async-await save only memory (by creating fewer threads)
The purpose of await is to (1) make more efficient use of expensive worker threads by allowing workflows to become more asynchronous, and thereby freeing up threads to do work while waiting for high-latency results, and (2) to make the source code for asynchronous workflows resemble the source code for synchronous workflows.
As far as I know, in case of sync IO, while IO takes place, thread goes into sleep mode. That means it does not consume cpu. Is this understanding correct?
Sure, but you have this completely backwards. YOU WANT TO CONSUME CPU. You want to be consuming as much CPU as possible all the time! The CPU is doing work on behalf of the user and if it is idle then it's not getting its work done as fast as it could. Don't hire a worker and then pay them to sleep! Hire a worker, and as soon as they are blocked on a high-latency task, put them to work doing something else so the CPU stays as hot as possible all the time. The owner of that machine paid good money for that CPU; it should be running at 100% all the time that there is work to be done!
So let's come back to your fundamental question:
Does async await increases Context switching
I know a great way to find out. Write a program using await, write another one without, run them both, and measure the number of context switches per second. Then you'll know.
But I don't see why context switches per second is a relevant metric. Let's consider two banks with lots of customers and lots of employees. At bank #1 the employees work on one task until it is complete; they never switch context. If an employee is blocked on waiting for a result from another, they go to sleep. At bank #2, employees switch from one task to another when they are blocked, and are constantly servicing customer requests. Which bank do you think has faster customer service?
Should we use async methods for the very fast IO method, like cache read/write method?
Such an IO would not block in the classical sense. "Blocking" is a loosely defined term. Normally it means that the CPU must wait for the hardware.
This type of IO is purely CPU work and there are no context switches. This would typically happen if the app reads a file or socket slower than data can be provided. Here, async IO does not help performance at all. I'm not even sure it would be suitable to unblock the UI thread since all tasks might complete synchronously.
Or does it save CPU as well?
It generally increases CPU usage in real-world loads. This is because the async machinery adds processing, allocations and synchronization. Also, we need to transition to kernel mode two times instead of once (first to initiate the IO, then to dequeue the IO completion notification).
Typical workloads run with <<100% CPU. A production server with >60% CPU would worry me since there is no margin for error. In such cases the thread pool work queues are almost always empty. Therefore, there are no context switching savings caused by processing multiple IO completions on one context switch.
That's why CPU usage generally increases (slightly), except if the machine is very high on CPU load and the work queues are often capable of delivering a new item immediately.
On the server async IO is mainly useful for saving threads. If you have ample threads available you will realize zero or negative gains. In particular any single IO will not become one bit faster.
That means it does not consume cpu.
It would be a waste to leave the CPU unavailable while an IO is in progress. To the kernel an IO is just a data structure. While it's in progress there is no CPU work to be done.
An anonymous person said:
For IO-bound tasks there may not be a major performance advantage to using separate threads just to wait for a result.
Pushing the same work to a different thread certainly does not help with throughput. This is added work, not reduced work. It's a shell game. (And async IO does not use a thread while it's running so all of this is based on a false assumption.)
A simple way to convince yourself that async IO generally costs more CPU than sync IO is to run a simple TCP ping/pong benchmark sync and async. Sync is faster. This is kind of an artificial load so it's just a hint at what's going on and not a comprehensive measurement.
See this question and answer;
Why use async controllers, when IIS already handles the request concurrency?
Ok, a thread consumes more resources than the async/await construction, but why? What is the core difference? You still need to remember all state etc, don't you?
Why would a thread pool be limited, but can you have tons of more idle async/await constructions?
Is it because async/await knows more about your application?
Well, let's imagine a web server. Most of the time, all it does is wait. It usually isn't CPU-bound but I/O-bound: it waits for network I/O, disk I/O, etc. Each time a wait ends, it has something to do (usually very short), and then it goes back to waiting. Now, the interesting part is what happens while it waits. In the most "trivial" case (which is of course not production-grade), you would create a thread to deal with every socket you have.
Now, each of those threads has its own cost: some handles, 1MB of stack space... And of course, not all of those threads can run at the same time, so the OS scheduler needs to deal with that and choose the right thread to run each time (which means A LOT of context switching). It will work for 1 client. It'll work for 10 clients. But let's imagine 10,000 clients at the same time. 10,000 threads means 10GB of memory. That's more than the average web server in the world has.
All of these resources are consumed because you dedicated a thread to each user. BUT most of these threads do nothing! They just wait for something to happen. And the OS has an API for async IO that allows you to queue an operation that will run once the IO operation has completed, without a dedicated thread waiting for it.
If you use async/await, you can write an application that easily uses fewer threads, and each thread will be utilized much more, with less "doing nothing" time.
async/await is not the only way of doing that. You could have done this before async/await was introduced. BUT, async/await allows you to write code that's very readable and very easy to write that does that, and look almost as it runs just on a single thread (not a lot of callbacks and delegates moving around like before).
By combining the easy syntax of async/await and some features of the OS like async I/O (by using IO completion port), you can write much more scalable code, without losing readability.
Another famous example is WPF/WinForms. You have the UI thread, whose only job is to process events, and which usually has nothing special to do. But you can't block it, or the GUI will hang and the user won't like it. By using async/await and splitting each "hard" piece of work into short operations, you can achieve a responsive UI and readable code. If you have to access the DB to execute a query, you start the async operation from the UI thread, and then you "await" it until it ends and you have results that you can process on the UI thread (because you need to show them to the user, for example). You could have done it before, but using async/await makes it much more readable.
Hope it helps.
Creating a new thread allocates a separate memory area exclusive for this thread holding its resources, mainly its call stack which in Windows takes up 1MB of memory.
So if you have 1000 idle threads you are using up at least 1GB of memory doing nothing.
The state for async operations takes memory as well but it's just the actual size needed for that operation and the state machine generated by the compiler and it's kept on the heap.
Moreover, using many threads and blocking them has another cost (which IMO is bigger). When a thread is blocked it is taken out of the CPU and switched with another (i.e. context-switch). That means that your threads aren't using their time-slices optimally when they get blocked. Higher rate of context switching means your machine does more overhead of context-switching and less actual work by the individual threads.
Using async-await appropriately enables using all the given time-slice since the thread, instead of blocking, goes back to the thread pool and takes another task to execute while the asynchronous operation continues concurrently.
So, in conclusion, the resources async-await frees up are CPU and memory, which allows your server to handle more requests concurrently with the same amount of resources, or the same number of requests with fewer resources.
The important thing to realize here is that a blocked thread is not usable to do any other work until it becomes unblocked. A thread that encounters an await is free to return to the threadpool and pick up other work until the value being awaited becomes available.
When you call a synchronous I/O method, the thread executing your code is blocked waiting for the I/O to complete. To handle 1000 concurrent requests, you will need 1000 threads.
When you call an asynchronous I/O method, the thread is not blocked. It initializes the I/O operation and can work on something else. It can be the rest of your method (if you don't await), or it can be some other request if you await the I/O method. The thread pool doesn't need to create new threads for new requests, as all the threads can be used optimally and keep the CPUs busy.
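The "rest of your method (if you don't await)" case can be sketched like this. FetchAsync is a hypothetical I/O call, with Task.Delay standing in for a network or disk operation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var events = new ConcurrentQueue<string>();

// Hypothetical I/O call; Task.Delay stands in for a network or disk operation.
async Task<string> FetchAsync(string name, int milliseconds)
{
    events.Enqueue($"start {name}");
    await Task.Delay(milliseconds);
    events.Enqueue($"end {name}");
    return name;
}

// Start both operations without awaiting: the thread initiates each one and moves on.
Task<string> users = FetchAsync("users", 200);
Task<string> orders = FetchAsync("orders", 200);

// Await only when the results are needed; both operations are already in flight.
string[] fetched = await Task.WhenAll(users, orders);

Console.WriteLine(string.Join(", ", events));
```

Both "start" entries are recorded before either "end" entry, because the second call is issued while the first operation is still pending.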
Async I/O operations are actually implemented asynchronously at the OS level.
I have a method which has just one task to do and has to wait for that task to complete:
public async Task<JsonResult> GetAllAsync()
{
var result = await this.GetAllDBAsync();
return Json(result, JsonRequestBehavior.AllowGet);
}
public async Task<List<TblSubjectSubset>> GetAllDBAsync()
{
return await model.TblSubjectSubsets.ToListAsync();
}
It is significantly faster than when I run it without async-await.
We know
The async and await keywords don't cause additional threads to be created. Async methods don't require multithreading because an async method doesn't run on its own thread. The method runs on the current synchronization context and uses time on the thread only when the method is active
According to this link: https://msdn.microsoft.com/en-us/library/hh191443.aspx#BKMK_Threads. What is the reason for being faster when we don't have another thread to handle the job?
"Asynchronous" does not mean "faster."
"Asynchronous" means "performs its operation in a way that it does not require a thread for the duration of the operation, thus allowing that thread to be used for other work."
In this case, you're testing a single request. The asynchronous request will "yield" its thread to the ASP.NET thread pool... which has no other use for it, since there are no other requests.
I fully expect asynchronous handlers to run slower than synchronous handlers. This is for a variety of reasons: there's the overhead of the async/await state machine, and extra work when the task completes to have its thread enter the request context. Besides this, the Win32 API layer is still heavily optimized for synchronous calls (expect this to change gradually over the next decade or so).
So, why use asynchronous handlers then?
For scalability reasons.
Consider an ASP.NET server that is serving more than one request - hundreds or thousands of requests instead of a single one. In that case, ASP.NET will be very grateful for the thread returned to it during its request processing. It can immediately use that thread to handle other requests. Asynchronous requests allow ASP.NET to handle more requests with fewer threads.
This is assuming your backend can scale, of course. If every request has to hit a single SQL Server, then your scalability bottleneck will probably be your database, not your web server.
But if your situation calls for it, asynchronous code can be a great boost to your web server scalability.
For more information, see my article on async ASP.NET.
I agree with Orbittman when he mentions the overhead involved in the application architecture. It doesn't make for a very good benchmark premise since you can't be sure if the degradation can indeed be solely attributed to the async vs non-async calls.
I've created a really simple benchmark to get a rough comparison between an async and a synchronous call and async loses every time in the overall timing actually, though the data gathering section always seems to end up the same. Have a look: https://gist.github.com/mattGuima/25cb7893616d6baaf970
Having said that, the same thought regarding the architecture applies. Frameworks handle async calls differently: Async and await - difference between console, Windows Forms and ASP.NET
The main thing to remember is to never confuse async with performance gain, because the two are completely unrelated and most often it will result in no gain at all, especially with CPU-bound code. Look at the Parallel library for that instead.
Async await is not the silver bullet that some people think it is and in your example is not required. If you were processing the result of the awaitable operation after you received it then you would be able to return a task and continue on the calling thread. You wouldn't have to then wait for the rest of the operation to complete. You would be correct to remove the async/await in the above code.
It's not really possible to answer the question without seeing the calling code either, as it depends on what the context is trying to do with the response. What you are getting back is not just a Task but a task in the context of the method that will continue when complete. See http://codeblog.jonskeet.uk/category/eduasync/ for much better information regarding the inner workings of async/await.
Lastly, I would question your timings: with an Ajax request to a database and back there are other areas with potentially greater latency, such as the HTTP request and response and the DB connection itself. I assume that you're using an ORM, and that alone can cause an overhead. I wonder whether it's the async/await that is the problem.
Okay, let me try to put it in sentences...
Let's consider an example,
where I create an async method and call it with the await keyword.
As far as my knowledge tells me,
The main thread will be released
In a separate thread, the async method will start executing
Once it has executed, the pointer will resume from the last position it left in the main thread.
Question 1 : Will it come back to main thread or it will be a new thread ?
Question 2: Does it make any difference if the async method is CPU bound or network bound ? If yes, what ?
The important question
Question 3 : Assuming that it was a CPU bound method, what did I achieve? I mean, the main thread was released, but at the same time another thread was used from the thread pool. What's the point?
async does not start a new thread. Neither does await. I recommend you read my async intro post and follow up with the resources at the bottom.
async is not about parallel programming; it's about asynchronous programming. If you need parallel programming, then use the Task Parallel Library (e.g., PLINQ, Parallel, or - in very complex cases - raw Tasks).
For example, you could have an async method that does I/O-bound operations. There's no need for another thread in this scenario, and none will be created.
If you do have a CPU-bound method, then you can use Task.Run to create an awaitable Task that executes that method on a thread pool thread. For example, you could do something like await Task.Run(() => Parallel...); to treat some parallel processing as an asynchronous operation.
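As a minimal sketch of that pattern, the snippet below uses PLINQ inside Task.Run. SumOfSquares is a hypothetical CPU-bound method chosen only because its result is easy to check.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// CPU-bound work: sum of squares computed in parallel with PLINQ.
long SumOfSquares(int n) =>
    Enumerable.Range(1, n).AsParallel().Sum(x => (long)x * x);

// Task.Run queues the work to a thread pool thread and returns an awaitable Task,
// so the caller (a UI thread, for example) is not blocked while it runs.
long total = await Task.Run(() => SumOfSquares(1000));

Console.WriteLine(total); // prints 333833500
```

This is mainly useful from a UI thread. In an ASP.NET request handler it just moves the work from one thread pool thread to another, so it is usually not worth doing there.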
Execution of the caller and async method will be entirely on the current thread. async methods don't create a new thread and using async/await does not actually create additional threads. Instead, thread completions/callbacks are used with a synchronization context and suspending/giving control (think Node.js style programming). However, when control is issued to or returns to the await statement, it may end up being on a different completion thread (this depends on your application and some other factors).
Yes, it will run slower if it is CPU or Network bound. Thus the await will take longer.
The benefit is not in terms of threads believe it or not... Asynchronous programming does not necessarily mean multiple threads. The benefit is that you can continue doing other work that doesn't require the async result, before waiting for the async result... An example is a web server HTTP listener thread pool. If you have a pool of size 20 then your limit is 20 concurrent requests... If all of these requests spend 90% of their time waiting on database work, you could async/await the database work and the time during which you await the database result callback will be freed... The thread will return to the HTTP listener thread pool and another user can access your site while the original one waits for the DB work to be done, upping your total limit.
It's really about freeing up threads that wait on externally-bound and slow operations to do other things while those operations execute... Taking advantage of built-in thread pools.
Don't forget that the async part could be some long-running job, e.g. running a giant database query over the network, downloading a file from the internet, etc.