Does async/await increase context switching?

I am aware of how async/await works. I know that when execution reaches an await, it releases the thread, and after the IO completes, it fetches a thread from the thread pool and runs the remaining code. This way threads are utilized efficiently. But I am confused about some use cases:
Should we use async methods for very fast IO, such as cache read/write methods? Wouldn't they result in unnecessary context switches? If we use a sync method, execution completes on the same thread and a context switch may not happen.
Does async/await save only memory (by creating fewer threads), or does it save CPU as well? As far as I know, in the case of sync IO, the thread goes to sleep while the IO takes place. That means it does not consume CPU. Is this understanding correct?

I am aware of how async await works.
You are not.
I know that when execution reaches to await, it release the thread
It does not. When execution reaches an await, the awaitable operand is evaluated, and then it is checked to see if the operation is complete. If it is not, then the remainder of the method is signed up as the continuation of the awaitable, and a task representing the work of the current method is returned to the caller.
None of that is "releasing the thread". Rather, control returns to the caller, and the caller keeps executing on the current thread. Of course, if the current caller was the only thing on this thread, then the thread is done. But there is no requirement that an async method be the only call on a thread!
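The point that control returns to the caller can be checked with a small sketch. The class and method names here are hypothetical, and a TaskCompletionSource stands in for an IO operation that has not completed yet; the log records the order in which things actually run.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitReturnsToCaller
{
    // Hypothetical helper: records each step so the ordering can be inspected.
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        // Stand-in for an IO operation that is still in flight.
        var pendingIo = new TaskCompletionSource<bool>();

        // WorkAsync runs synchronously until its first incomplete await,
        // then hands a Task back to us.
        Task work = WorkAsync(log, pendingIo.Task);

        // The caller keeps executing on the current thread.
        log.Add("caller continues");

        // Simulate the IO finishing; the rest of WorkAsync can now run.
        pendingIo.SetResult(true);
        await work;
        return log;
    }

    private static async Task WorkAsync(List<string> log, Task io)
    {
        log.Add("before await");
        await io;                 // not complete yet: the remainder becomes a continuation
        log.Add("after await");
    }

    public static async Task Main()
    {
        foreach (string step in await RunAsync())
            Console.WriteLine(step);
    }
}
```

The log comes out as "before await", "caller continues", "after await": the await did not park a thread, it simply returned to the caller.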
after IO completes
An awaitable need not be an IO operation, but let's suppose that it is.
it fetches thread from threadpool and run the remaining code.
No. It schedules the remaining code to run on the correct context. That context might be a threadpool thread. It might be the UI thread. It might be the current thread. It might be any number of things.
Should we use async methods for the very fast IO method, like cache read/write method?
The awaitable is evaluated. If the awaitable knows that it can complete the operation in a reasonable amount of time then it is perfectly within its rights to do the operation and return a completed task. In which case there is no penalty; you're just checking a boolean to see if the task is completed.
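A minimal sketch of that fast path, assuming a hypothetical in-memory cache (the names Cache, ReadAsync and LoadSlowlyAsync are made up for illustration). On a cache hit the method returns an already-completed task, so the await just checks a flag and carries on in the same thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class CompletedTaskFastPath
{
    private static readonly Dictionary<string, string> Cache =
        new Dictionary<string, string> { ["greeting"] = "hello" };

    // Hypothetical cache read: on a hit the returned task is already complete.
    public static Task<string> ReadAsync(string key)
    {
        if (Cache.TryGetValue(key, out string value))
            return Task.FromResult(value);
        return LoadSlowlyAsync(key);
    }

    private static async Task<string> LoadSlowlyAsync(string key)
    {
        await Task.Delay(50);     // stand-in for real IO
        return key.ToUpperInvariant();
    }

    // Returns true when the code after the await ran on the same thread
    // as the code before it.
    public static async Task<bool> StaysOnSameThreadAsync(string key)
    {
        int before = Environment.CurrentManagedThreadId;
        await ReadAsync(key);
        return before == Environment.CurrentManagedThreadId;
    }

    public static async Task Main()
    {
        Console.WriteLine(ReadAsync("greeting").IsCompleted);
        Console.WriteLine(await StaysOnSameThreadAsync("greeting"));
    }
}
```

For a key that is not cached, the await may well resume on a different thread; that depends on the context the code runs in, so the sketch makes no claim about it.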
Would not they result into unnecessarily context switch.
Not necessarily.
If we use sync method, execution will complete on same thread and context switch may not happen.
I am confused as to why you think a context switch happens on an IO operation. IO operations run on hardware, below the level of OS threads. There's no thread sitting there servicing IO tasks.
Does Async-await saves only memory consumption(by creating lesser threads)
The purpose of await is to (1) make more efficient use of expensive worker threads by allowing workflows to become more asynchronous, and thereby freeing up threads to do work while waiting for high-latency results, and (2) to make the source code for asynchronous workflows resemble the source code for synchronous workflows.
As far as I know, in case of sync IO, while IO takes place, thread goes into sleep mode. That means it does not consume cpu. Is this understanding correct?
Sure but you have this completely backwards. YOU WANT TO CONSUME CPU. You want to be consuming as much CPU as possible all the time! The CPU is doing work on behalf of the user and if it is idle then it's not getting its work done as fast as it could. Don't hire a worker and then pay them to sleep! Hire a worker, and as soon as they are blocked on a high-latency task, put them to work doing something else so the CPU stays as hot as possible all the time. The owner of that machine paid good money for that CPU; it should be running at 100% all the time that there is work to be done!
So let's come back to your fundamental question:
Does async await increases Context switching
I know a great way to find out. Write a program using await, write another one without, run them both, and measure the number of context switches per second. Then you'll know.
But I don't see why context switches per second is a relevant metric. Let's consider two banks with lots of customers and lots of employees. At bank #1 the employees work on one task until it is complete; they never switch context. If an employee is blocked on waiting for a result from another, they go to sleep. At bank #2, employees switch from one task to another when they are blocked, and are constantly servicing customer requests. Which bank do you think has faster customer service?

Should we use async methods for the very fast IO method, like cache read/write method?
Such an IO would not block in the classical sense. "Blocking" is a loosely defined term. Normally it means that the CPU must wait for the hardware.
This type of IO is purely CPU work and there are no context switches. This would typically happen if the app reads a file or socket slower than data can be provided. Here, async IO does not help performance at all. I'm not even sure it would be suitable to unblock the UI thread since all tasks might complete synchronously.
Or it also saves cpu as well?
It generally increases CPU usage in real-world loads. This is because the async machinery adds processing, allocations and synchronization. Also, we need to transition to kernel mode two times instead of once (first to initiate the IO, then to dequeue the IO completion notification).
Typical workloads run with <<100% CPU. A production server with >60% CPU would worry me since there is no margin for error. In such cases the thread pool work queues are almost always empty. Therefore, there are no context switching savings caused by processing multiple IO completions on one context switch.
That's why CPU usage generally increases (slightly), except if the machine is very high on CPU load and the work queues are often capable of delivering a new item immediately.
On the server async IO is mainly useful for saving threads. If you have ample threads available you will realize zero or negative gains. In particular any single IO will not become one bit faster.
That means it does not consume cpu.
It would be a waste to leave the CPU unavailable while an IO is in progress. To the kernel an IO is just a data structure. While it's in progress there is no CPU work to be done.
An anonymous person said:
For IO-bound tasks there may not be a major performance advantage to using separate threads just to wait for a result.
Pushing the same work to a different thread certainly does not help with throughput. This is added work, not reduced work. It's a shell game. (And async IO does not use a thread while it's running so all of this is based on a false assumption.)
A simple way to convince yourself that async IO generally costs more CPU than sync IO is to run a simple TCP ping/pong benchmark sync and async. Sync is faster. This is kind of an artificial load so it's just a hint at what's going on and not a comprehensive measurement.

Related

Should I use async/await in my controller?

I have watched a few videos on youtube by Les Jackson, on how to create a .net core api. Les uses this approach for a method in his controller
public ActionResult<IEnumerable<Command>> GetAllCommands()
{
    var commandItems = _repository.GetAppCommands();
    return Ok(commandItems);
}
However I have just read an article in a book that uses this approach
public async Task<ActionResult<IEnumerable<Data.CityDataClass>>> GetCities()
{
    return await _repository.GetCities();
}
To my thinking, is the second method overkill with its "async" declaration? The first method won't return anything until it has the results anyway.
Thanks

You can understand this better if you read this article on I/O threads
To use asynchronous and parallel features of the .NET properly, you should also understand the concept of I/O threads.
Not everything in a program consumes CPU time. When a thread tries to read data from a file on disk or sends a TCP/IP packet through network, the only thing it does is delegate the actual work to a device - disk or network adapter - and wait for results.
I/O thread is an abstraction intended to hide work with devices behind a simple and familiar concept. The main point here is that you don’t have to work with those devices in a different way, you can think of the pipeline inside them just like it’s a usual CPU consuming thread. At the same time, I/O threads are extremely cheap in comparison with CPU-bound threads, because, in fact, they are merely requests to devices
So with the first way, you are not utilizing the I/O threads but with the second you do, and it's much cheaper.
In your case, when accessing the Database, you are effectively doing an I/O (network) operation. By using asynchronous programming, you are awaiting on an I/O thread for the I/O (db read) operation to complete, without blocking a thread.
Edit (After Liam's comment)
Another interesting read is async in depth
Throughout this entire process, a key takeaway is that no thread is dedicated to running the task. Although work is executed in some context (that is, the OS does have to pass data to a device driver and respond to an interrupt), there is no thread dedicated to waiting for data from the request to come back. This allows the system to handle a much larger volume of work rather than waiting for some I/O call to finish.
For even more information check the asynchronous overhead here
If the async method completes synchronously the performance overhead is fairly small.
If the async method completes synchronously the following memory overhead will occur: for async Task methods there is no overhead, for async Task<T> methods the overhead is 88 bytes per operation (on x64 platform).
ValueTask can remove the overhead mentioned above for async methods that complete synchronously.
A ValueTask-based async method is a bit faster than a Task-based method if the method completes synchronously and a bit slower otherwise.
A performance overhead of async methods that await non-completed task is way more substantial (~300 bytes per operation on x64 platform).
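As a rough illustration of the ValueTask point, here is a sketch of a reader that usually answers from a cache. CachedReader, GetAsync and LoadAsync are hypothetical names, Task.Delay stands in for real IO, and the dictionary is not thread-safe, so this assumes a single caller. ValueTask<T> is built into .NET Core 2.1 and later; older targets need the System.Threading.Tasks.Extensions package.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class CachedReader
{
    private readonly Dictionary<int, string> _cache = new Dictionary<int, string>();

    public ValueTask<string> GetAsync(int id)
    {
        // Synchronous completion: wrap the value directly, no Task object is allocated.
        if (_cache.TryGetValue(id, out string cached))
            return new ValueTask<string>(cached);

        // Asynchronous completion: fall back to a regular Task.
        return new ValueTask<string>(LoadAsync(id));
    }

    private async Task<string> LoadAsync(int id)
    {
        await Task.Delay(10);          // stand-in for real IO
        string value = "item-" + id;
        _cache[id] = value;
        return value;
    }

    public static async Task Main()
    {
        var reader = new CachedReader();
        Console.WriteLine(await reader.GetAsync(7));   // goes through LoadAsync
        Console.WriteLine(await reader.GetAsync(7));   // served from the cache
    }
}
```

Each ValueTask is awaited exactly once here, which is the safe way to consume it.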

When should a task be considered "long running"?

When working with tasks, a rule of thumb appears to be that the thread pool - typically used by e.g. invoking Task.Run(), or Parallel.Invoke() - should be used for relatively short operations. When working with long running operations, we are supposed to use the TaskCreationOptions.LongRunning flag in order to - as far as I understand it - avoid clogging the thread pool queue, i.e. to push work to a newly-created thread.
But what exactly is a long running operation? How long is long, in terms of time? Are there other factors besides the expected task duration to be considered when deciding whether or not to use the LongRunning, like the anticipated CPU architecture (frequency, the number of cores, ...) or the number of tasks that will be attempted to be run at once from the programmer's perspective?
For example, suppose I have 500 tasks to process in a dedicated application, each taking 10-20 seconds to complete. Should I just start all 500 tasks using Task.Run (e.g. in a loop) and then await them all, perhaps as LongRunning, while leaving the default max level of concurrency? Then again, if I set LongRunning in such case, wouldn't this create 500 new threads and actually cause a lot of overhead and higher memory usage (due to extra threads being allocated) as compared to omitting LongRunning? This is assuming that no new tasks will be scheduled for execution while these 500 are being awaited.
I would guess that the decision to set LongRunning depends on the number of requests made to the thread pool in a given time interval, and that LongRunning should only be used for tasks that are expected to take significantly longer than the majority of the thread pool-placed tasks - by definition, at most a small percentage of all tasks. In other words, this appears to be a queuing and thread pool utilization optimization problem that should likely be solved case-by-case through testing, if at all. Am I correct?

It kind of doesn't matter. The problem isn't really about time, it's about what your code is doing. If you're doing asynchronous I/O, you're only using the thread for the short amount of time between individual requests. If you're doing CPU work... well, you're using the CPU. There's no "thread-pool starvation", because the CPUs are fully utilized.
The real problem is when you're doing blocking work that doesn't use the CPU. In a case like that, thread-pool starvation leads to CPU underutilization - you said "I need the CPU for my work" and then you don't actually use it.
If you're not using blocking APIs, there's no point in using Task.Run with LongRunning. If you have to run some legacy blocking code asynchronously, using LongRunning may be a good idea. Total work time isn't as important as "how often you are doing this". If you spin up one thread based on a user clicking on a GUI, the cost is tiny compared to all the latencies already included in the act of clicking a button in the first place, and you can use LongRunning just fine to avoid the thread-pool. If you're running a loop that spawns lots of blocking tasks... stop doing that. It's a bad idea :D
For example, imagine there is no asynchronous API alternative to File.Exists. So if you see that this is giving you trouble (e.g. over a faulty network connection), you'd fire it up using Task.Run - and since you're not doing CPU work, you'd use LongRunning.
In contrast, if you need to do some image manipulation that's basically 100% CPU work, it doesn't matter how long the operation takes - it's not a LongRunning thing.
And finally, the most common scenario for using LongRunning is when your "work" is actually the old-school "loop and periodically check if something should be done, do it and then loop again". Long running, but 99% of the time just blocking on some wait handle or something like that. Again, this is only useful when dealing with code that isn't CPU-bound, but that doesn't have proper asynchronous APIs. You might find something like this if you ever need to write your own SynchronizationContext, for example.
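A sketch of that "loop and mostly block" pattern, assuming a BlockingCollection as the thing being waited on (the class and method names are hypothetical). LongRunning is passed through Task.Factory.StartNew with an explicit scheduler; it is a hint that this work should get its own thread instead of occupying a pool thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningWorker
{
    // Starts a consumer that spends most of its life blocked on the queue.
    public static Task<int> StartConsumer(BlockingCollection<int> queue)
    {
        return Task.Factory.StartNew(() =>
        {
            int sum = 0;
            // GetConsumingEnumerable blocks while the queue is empty and
            // ends once CompleteAdding has been called and the queue is drained.
            foreach (int item in queue.GetConsumingEnumerable())
                sum += item;
            return sum;
        },
        CancellationToken.None,
        TaskCreationOptions.LongRunning,   // hint: do not tie up a thread-pool thread
        TaskScheduler.Default);            // be explicit about the scheduler with StartNew
    }

    public static async Task Main()
    {
        using (var queue = new BlockingCollection<int>())
        {
            Task<int> consumer = StartConsumer(queue);
            for (int i = 1; i <= 5; i++)
                queue.Add(i);
            queue.CompleteAdding();
            Console.WriteLine(await consumer);   // prints 15
        }
    }
}
```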
Now, how do we apply this to your example? Well, we can't, not without more information. If your code is CPU-bound, Parallel.For and friends are what you want - those ensure you only use enough threads to saturate the CPUs, and it's fine to use the thread-pool for that. If it's not CPU bound... you don't really have any option besides using LongRunning if you want to run the tasks in parallel. Ideally, such work would consist of asynchronous calls you can safely invoke and await Task.WhenAll(...) from your own thread.
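The two shapes mentioned above might look roughly like this. It is only a sketch: SumOfSquares stands in for real CPU-bound work, FetchAsync (with Task.Delay) stands in for a real asynchronous IO call, and all names are made up.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CpuVersusIo
{
    // CPU-bound: Parallel.For spreads the range over thread-pool threads.
    public static long SumOfSquares(int count)
    {
        long total = 0;
        Parallel.For(0, count,
            () => 0L,                                     // per-thread running total
            (i, state, local) => local + (long)i * i,     // accumulate without locking
            local => Interlocked.Add(ref total, local));  // merge each thread's total once
        return total;
    }

    // IO-bound: start every operation, then await them all together.
    public static async Task<int[]> FetchAllAsync(int count)
    {
        var pending = new Task<int>[count];
        for (int i = 0; i < count; i++)
            pending[i] = FetchAsync(i);
        return await Task.WhenAll(pending);   // results come back in input order
    }

    private static async Task<int> FetchAsync(int id)
    {
        await Task.Delay(20);                 // stand-in for an asynchronous IO call
        return id * 2;
    }

    public static async Task Main()
    {
        Console.WriteLine(SumOfSquares(1000));
        Console.WriteLine(string.Join(",", await FetchAllAsync(3)));
    }
}
```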

When working with tasks, a rule of thumb appears to be that the thread pool - typically used by e.g. invoking Task.Run(), or Parallel.Invoke() - should be used for relatively short operations. When working with long running operations, we are supposed to set the TaskCreationOptions.LongRunning to true in order to - as far as I understand it - avoid clogging the thread pool queue, i.e. to push work to a newly-created thread.
The vast majority of the time, you don't need to use LongRunning at all, because the thread pool will adjust to "losing" a thread to a long-running operation after 2 seconds.
The main problem with LongRunning is that it forces you to use the very dangerous StartNew API.
In other words, this appears to be a queuing and thread pool utilization optimization problem that should likely be solved case-by-case through testing, if at all. Am I correct?
Yes. You should never set LongRunning when first writing code. If you are seeing delays due to the thread pool injection rate, then you can carefully add LongRunning.

You should not use TaskCreationOptions.LongRunning in your case. I would use Parallel.For.
The LongRunning option is not meant to be used if you're going to create a lot of tasks, as in your case. It is meant for creating a couple of tasks that will be running for a long time.
By the way, I have never used this option in any similar scenario.

As you point out, TaskCreationOptions.LongRunning's purpose is
to allow the ThreadPool to continue to process work items even though one task is running for an extended period of time
As for when to use it:
It's not a specific length per se...You'd typically only use LongRunning if you found through performance testing that not using it was causing long delays in the processing of other work.
Source

Why does a blocking thread consume more than async/await?

See this question and answer;
Why use async controllers, when IIS already handles the request concurrency?
Ok, a thread consumes more resources than the async/await construction, but why? What is the core difference? You still need to remember all state etc, don't you?
Why would a thread pool be limited, but can you have tons of more idle async/await constructions?
Is it because async/await knows more about your application?

Well, let's imagine a web server. Most of the time, all it does is wait. It usually isn't CPU-bound but I/O-bound: it waits for network I/O, disk I/O, etc. After each wait it has something to do (usually something very short), and then it goes back to waiting. Now, the interesting part is what happens while it waits. In the most "trivial" case (which of course is absolutely not production-grade), you would create a thread to deal with every socket you have.
Now, each of those threads has its own cost: some handles, 1MB of stack space... And of course, not all of those threads can run at the same time, so the OS scheduler needs to deal with that and choose the right thread to run each time (which means A LOT of context switching). It will work for 1 client. It'll work for 10 clients. But let's imagine 10,000 clients at the same time. 10,000 threads means 10GB of memory. That's more than the average web server in the world has.
All of these resources are used because you dedicated a thread to each user. BUT most of these threads do nothing! They just wait for something to happen. And the OS has an API for async IO that lets you queue an operation that will run once the IO operation has completed, without having a dedicated thread waiting for it.
If you use async/await, you can write an application that easily uses fewer threads, and each thread will be utilized much more, with less "doing nothing" time.
async/await is not the only way of doing that. You could have done this before async/await was introduced. BUT async/await allows you to write code that does this, is very readable and easy to write, and looks almost as if it runs on a single thread (without a lot of callbacks and delegates moving around like before).
By combining the easy syntax of async/await with OS features like async I/O (by using I/O completion ports), you can write much more scalable code without losing readability.
Another famous example is WPF/WinForms. You have the UI thread, whose only job is to process events and which usually has nothing special to do. But you can't block it, or the GUI will hang and the user won't like it. By using async/await and splitting each "hard" piece of work into short operations, you can achieve a responsive UI and readable code. If you have to access the DB to execute a query, you start the async operation from the UI thread and then "await" it until it ends and you have results that you can process on the UI thread (because you need to show them to the user, for example). You could have done it before, but using async/await makes it much more readable.
Hope it helps.

Creating a new thread allocates a separate memory area exclusive for this thread holding its resources, mainly its call stack which in Windows takes up 1MB of memory.
So if you have 1,000 idle threads you are using up at least 1GB of memory doing nothing.
The state for async operations takes memory as well but it's just the actual size needed for that operation and the state machine generated by the compiler and it's kept on the heap.
Moreover, using many threads and blocking them has another cost (which IMO is bigger). When a thread is blocked it is taken off the CPU and switched with another (i.e. a context switch). That means that your threads aren't using their time slices optimally when they get blocked. A higher rate of context switching means your machine spends more time on context-switching overhead and less on actual work by the individual threads.
Using async-await appropriately makes it possible to use the whole time slice, since the thread, instead of blocking, goes back to the thread pool and takes another task to execute while the asynchronous operation continues concurrently.
So, in conclusion, the resources async-await frees up are CPU and memory, which allows your server to handle more requests concurrently with the same amount of resources, or the same number of requests with fewer resources.

The important thing to realize here is that a blocked thread is not usable to do any other work until it becomes unblocked. A thread that encounters an await is free to return to the threadpool and pick up other work until the value being awaited becomes available.

When you call a synchronous I/O method, the thread executing your code is blocked waiting for the I/O to complete. To handle 1000 concurrent requests, you will need 1000 threads.
When you call an asynchronous I/O method, the thread is not blocked. It initializes the I/O operation and can work on something else. It can be the rest of your method (if you don't await), or it can be some other request if you await the I/O method. The thread pool doesn't need to create new threads for new requests, as all the threads can be used optimally and keep the CPUs busy.
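To get a feel for this, the sketch below starts many simulated requests at once and counts how many distinct threads ended up running their continuations. Task.Delay stands in for an asynchronous I/O call and the names are hypothetical. The exact count depends on the machine and runtime, so no particular number is claimed, but it is typically far smaller than the number of requests.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ManyConcurrentWaits
{
    // Runs `requests` simulated requests concurrently and returns how many
    // distinct threads executed the code after the await.
    public static async Task<int> RunAsync(int requests)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();
        var pending = new List<Task>(requests);

        for (int i = 0; i < requests; i++)
            pending.Add(HandleAsync(threadIds));

        await Task.WhenAll(pending);
        return threadIds.Count;
    }

    private static async Task HandleAsync(ConcurrentDictionary<int, bool> threadIds)
    {
        await Task.Delay(200);   // stand-in for async I/O: no thread is blocked while waiting
        threadIds[Environment.CurrentManagedThreadId] = true;
    }

    public static async Task Main()
    {
        int threads = await RunAsync(1000);
        Console.WriteLine("1000 requests resumed on " + threads + " distinct thread(s)");
    }
}
```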
Async I/O operations are actually implemented asynchronously at the OS level.

c# webapi: Await Task.Run vs more granular await

I'm using async/await in WebApi Controllers according to this article:
https://msdn.microsoft.com/en-us/magazine/dn802603.aspx
Have a look at this simplified code in my controller:
DataBaseData = await Task.Run( () => GetDataFunction() );
GetDataFunction is a function that will open a database connection, open a reader and read the data from the database.
In many examples I see it handled differently. The GetDataFunction itself is awaited. And within the function every single step is awaited.
For example:
connection.OpenAsync
reader.ReadAsync
reader.IsDBNullAsync
Why is this good practice? Why not just start a thread for the whole database access (with Task.Run)?
Update:
Thanks for the help. Now I get it. I had not realized that the asynchronous APIs do not start threads themselves. This really helped: blog.stephencleary.com/2013/11/there-is-no-thread.html

The article you linked to states:
You can kick off some background work by awaiting Task.Run, but there’s no point in doing so. In fact, that will actually hurt your scalability by interfering with the ASP.NET thread pool heuristics... As a general rule, don’t queue work to the thread pool on ASP.NET.
In other words, avoid Task.Run on ASP.NET.
Why is this good practice? Why not just start a thread for the whole database access (with Task.Run)?
It's because the asynchronous APIs do not use other threads. The entire point of async/await is to free up the current thread; not use another thread. I have a blog post describing how async works without needing threads.
So, the Task.Run (or custom thread) approach will use up (block) a thread while getting data from the database. Proper asynchronous methods (e.g., EF6 or the ADO.NET asynchronous APIs) do not; they allow the request thread to be used for other requests while that request is waiting for the database response.
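The difference can be sketched with file I/O standing in for the database, since that keeps the example self-contained. The method names are hypothetical; File.ReadAllTextAsync and File.WriteAllTextAsync are assumed to be available (.NET Core 2.0 or later). Both methods return the same data; what differs is whether a thread is held while the read is in flight.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class TaskRunVersusAsyncApi
{
    // Wraps a blocking call: a thread-pool thread is blocked for the whole read.
    public static Task<string> ReadWithTaskRun(string path)
        => Task.Run(() => File.ReadAllText(path));

    // Uses the asynchronous API: the calling thread is released while the read is pending,
    // and no extra thread is queued just to wait.
    public static Task<string> ReadWithAsyncApi(string path)
        => File.ReadAllTextAsync(path);

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            await File.WriteAllTextAsync(path, "sample");
            Console.WriteLine(await ReadWithTaskRun(path));
            Console.WriteLine(await ReadWithAsyncApi(path));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```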

I assume you are wondering why using
await Task.Run(() => GetDataFunction());
instead of
await GetDataFunction();
As you can see in Task.Run(Func) under the section Remarks there is written:
The Run(Func) method is used by language compilers to support the async and await keywords. It is not intended to be called directly from user code.
That means, you should use await GetDataFunction.

It's all about resource management and sharing.
When using connection.Open() method the thread that calls it has to wait for the connection to actually open. While waiting it does nothing but consume the CPU slice allocated to it by the operating system. However, when you await connection.OpenAsync() the thread frees up its resources and the OS can redistribute them to other threads.
Now, there's a catch: to actually gain from asynchronous calls they should take longer than the time it takes for the OS to switch contexts otherwise it will incur a drop in application performance.
The practice of awaiting connection.OpenAsync(), reader.ReadAsync() etc. is due to the fact that these are potentially long-running operations. In a system where the database resides on another machine and is under a heavy load of requests, connecting to the database and getting the results of a query will take some time. Instead of blocking the CPU while waiting on the results to arrive why not allocate that time slice to another worker thread that waits in the scheduler queue to render the response for another client?
Now, about starting another thread for data access: don't do that! Each new thread gets allocated about 1MB of memory space, so besides wasting CPU time while waiting for database operations to finish you'll be wasting memory as well. Furthermore, having such a heavy memory footprint would require a lot more runs of the garbage collector, which will freeze all of your threads when running.
Using Task.Run() will schedule the data access operations on a thread from the ThreadPool but you will lose the benefit of sharing the CPU time with another thread while waiting for database server to respond.

Last time. Is await working with threads?

I saw some similar questions before, just want to clarify it.
In this article, it is said "There is no thread" for async calls.
However, in another one, it is said
Here, however, we’re running the callback to update the Text of textBox1 on some arbitrary thread, wherever the Task Parallel Library (TPL) implementation of ContinueWith happened to put it.
Also, in some cases, when I was calling ContinueWith in my project, I also got a "cross-thread access" exception.
So, who is right?
ANSWER: thanks to i3arnon. After reading the first article more carefully, I found this passage:
So, we see that there was no thread while the request was in flight. When the request completed, various threads were “borrowed” or had work briefly queued to them. This work is usually on the order of a millisecond or so (e.g., the APC running on the thread pool) down to a microsecond or so (e.g., the ISR). But there is no thread that was blocked, just waiting for that request to complete.

Both are. When you have code running on your CPU there's always a thread running it. The question is what happens when you don't have code to run, for example when you are waiting for an IO operation to complete.
If you use async await where you should there would be no thread idly waiting for that operation to complete, and only after it has completed a thread will be given (usually by the Thread Pool) to continue running code on your CPU.
When you don't use async-await (or a different asynchronous paradigm like Begin-End) you would hold a thread throughout the operation, even in the IO parts of it, which is a waste of resources.
It's important to add that although most asynchronous examples regard IO operations, that's not always the case. In some cases it's reasonable to treat a CPU bound operation (where you do hold a thread throughout the whole operation) asynchronously.
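A small sketch of that last case: a CPU-bound computation exposed as a task so that a caller (for example a UI event handler) can await it without blocking its own thread. CountPrimes is a made-up stand-in for real CPU work. Unlike asynchronous IO, a thread-pool thread is busy for the whole duration here.

```csharp
using System;
using System.Threading.Tasks;

public static class CpuBoundAsAsync
{
    // Plain synchronous CPU work: counts primes in [2, limit] by trial division.
    public static int CountPrimes(int limit)
    {
        int count = 0;
        for (int n = 2; n <= limit; n++)
        {
            bool isPrime = true;
            for (int d = 2; d * d <= n; d++)
            {
                if (n % d == 0) { isPrime = false; break; }
            }
            if (isPrime) count++;
        }
        return count;
    }

    // The same work offloaded to the thread pool; the caller can await it
    // and stay responsive, but a pool thread is held until it finishes.
    public static Task<int> CountPrimesAsync(int limit)
        => Task.Run(() => CountPrimes(limit));

    public static async Task Main()
    {
        Console.WriteLine(await CountPrimesAsync(100));   // prints 25
    }
}
```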
