Why does a blocking thread consume more than async/await? - c#

See this question and answer:
Why use async controllers, when IIS already handles the request concurrency?
OK, a thread consumes more resources than the async/await construction, but why? What is the core difference? You still need to remember all the state etc., don't you?
Why is a thread pool limited, while you can have tons more idle async/await constructions?
Is it because async/await knows more about your application?

Well, let's imagine a web server. Most of the time, all it does is wait. It usually isn't CPU-bound, but rather I/O-bound: it waits for network I/O, disk I/O, etc. After every wait, it has something (usually very short) to do, and then all it does is wait again. Now, the interesting part is what happens while it waits. In the most "trivial" case (which of course is absolutely not production-grade), you would create a thread to deal with every socket you have.
Now, each of those threads has its own cost: some handles, 1 MB of stack space... And of course, not all those threads can run at the same time, so the OS scheduler needs to deal with that and choose the right thread to run each time (which means a LOT of context switching). It will work for 1 client. It'll work for 10 clients. But imagine 10,000 clients at the same time: 10,000 threads means about 10 GB of memory just for stacks. That's more than the average web server has.
All of these resources are spent because you dedicated a thread to each user. But most of these threads do nothing! They just wait for something to happen. And the OS has APIs for async I/O that let you simply queue an operation to be completed once the I/O finishes, without a dedicated thread waiting for it.
If you use async/await, you can write an application that uses far fewer threads, and each thread is utilized much more, with less "doing nothing" time.
async/await is not the only way of doing this; you could do it before async/await was introduced. But async/await lets you write code that is very readable and very easy to write, and that looks almost as if it runs on a single thread (without a lot of callbacks and delegates moving around, like before).
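A hedged before/after sketch of that readability difference, reading from an arbitrary Stream (the processing comments mark where your own logic would go):

using System.IO;
using System.Threading.Tasks;

class CallbackVsAwait
{
    // Before async/await: the APM pattern, with the continuation passed
    // explicitly as a callback.
    static void ReadWithCallback(Stream stream, byte[] buffer)
    {
        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            int bytesRead = stream.EndRead(ar);
            // ...processing continues here, away from the original call site.
        }, null);
    }

    // With async/await: the same non-blocking behavior, but the code reads
    // top to bottom as if it were synchronous.
    static async Task ReadWithAwaitAsync(Stream stream, byte[] buffer)
    {
        int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);
        // ...processing continues here, in source order.
    }
}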
By combining the easy syntax of async/await with OS features like async I/O (using I/O completion ports), you can write much more scalable code without losing readability.
Another famous example is WPF/WinForms. You have the UI thread, whose whole job is to process events, and which usually has nothing special to do. But you can't block it, or the GUI will hang and the user won't like it. By using async/await and splitting each piece of "hard" work into short operations, you can achieve a responsive UI and readable code. If you have to access the DB to execute a query, you'll start the async operation from the UI thread, then "await" it until it ends and you have results you can process back on the UI thread (because you need to show them to the user, for example). You could do this before, but async/await makes it much more readable.
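A minimal sketch of that UI pattern, assuming a WPF window; QueryButton_Click, ResultsList, and QueryDatabaseAsync are hypothetical names, not anything prescribed by the answer:

using System.Collections.Generic;
using System.Threading.Tasks;
using System.Windows;

public partial class MainWindow : Window
{
    // Runs on the UI thread; the await frees that thread to keep pumping events.
    private async void QueryButton_Click(object sender, RoutedEventArgs e)
    {
        List<string> results = await QueryDatabaseAsync();
        // Execution resumes here on the UI thread, so touching controls is safe.
        ResultsList.ItemsSource = results;
    }

    // Stand-in for a real asynchronous query (e.g., EF6 or the ADO.NET async APIs).
    private Task<List<string>> QueryDatabaseAsync() =>
        Task.FromResult(new List<string> { "example row" });
}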
Hope it helps.

Creating a new thread allocates a separate memory area exclusively for that thread, holding its resources, mainly its call stack, which on Windows takes up 1 MB of memory by default.
So if you have 1,000 idle threads you are using up at least 1 GB of memory doing nothing.
The state for async operations takes memory as well, but it's just the actual size needed for that operation plus the state machine generated by the compiler, and it's kept on the heap.
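To make that concrete, a hedged sketch: for a method like the one below, the compiler generates a small state machine that captures only the locals and the awaiter, so the per-operation state is measured in bytes rather than megabytes:

using System.Net.Http;
using System.Threading.Tasks;

static class Example
{
    // The generated state machine stores url, client, and the awaiter on the
    // heap; no dedicated 1 MB stack sits blocked while the download runs.
    public static async Task<string> DownloadAsync(string url)
    {
        using var client = new HttpClient();
        return await client.GetStringAsync(url);
    }
}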
Moreover, using many threads and blocking them has another cost (which IMO is bigger). When a thread is blocked, it is taken off the CPU and switched with another (i.e., a context switch). That means your threads aren't using their time slices optimally when they get blocked, and a higher rate of context switching means your machine spends more time on context-switching overhead and gets less actual work done by the individual threads.
Using async/await appropriately enables using the entire time slice, since the thread, instead of blocking, goes back to the thread pool and takes another task to execute while the asynchronous operation continues concurrently.
So, in conclusion, the resources async/await frees up are CPU and memory, which allows your server to handle more requests concurrently with the same amount of resources, or the same number of requests with fewer resources.

The important thing to realize here is that a blocked thread cannot do any other work until it becomes unblocked. A thread that encounters an await is free to return to the thread pool and pick up other work until the value being awaited becomes available.

When you call a synchronous I/O method, the thread executing your code is blocked waiting for the I/O to complete. To handle 1,000 concurrent requests, you will need 1,000 threads.
When you call an asynchronous I/O method, the thread is not blocked. It starts the I/O operation and can work on something else: the rest of your method (if you don't await), or some other request (if you do await the I/O method). The thread pool doesn't need to create new threads for new requests, as all the threads can be used optimally to keep the CPUs busy.
Async I/O operations really are implemented asynchronously at the OS level.
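A hedged sketch of that contrast using the file APIs (File.ReadAllTextAsync is available on newer .NET versions):

using System.IO;
using System.Threading.Tasks;

class ReadDemo
{
    // Synchronous: the calling thread is blocked for the entire disk read.
    static string ReadBlocking(string path) =>
        File.ReadAllText(path);

    // Asynchronous: the read is started and the thread is free; the task
    // completes when the OS signals that the I/O has finished.
    static Task<string> ReadNonBlockingAsync(string path) =>
        File.ReadAllTextAsync(path);
}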

Related

Does async/await increase context switching?

I am aware of how async/await works. I know that when execution reaches an await, it releases the thread, and after the IO completes, it fetches a thread from the threadpool and runs the remaining code. This way threads are efficiently utilized. But I am confused about some use cases:
Should we use async methods for very fast IO methods, like cache read/write methods? Wouldn't they result in unnecessary context switches? If we use sync methods, execution will complete on the same thread and a context switch may not happen.
Does async-await save only memory consumption (by creating fewer threads)? Or does it save CPU as well? As far as I know, in the case of sync IO, while the IO takes place the thread goes into sleep mode. That means it does not consume CPU. Is this understanding correct?
I am aware of how async/await works.
You are not.
I know that when execution reaches an await, it releases the thread
It does not. When execution reaches an await, the awaitable operand is evaluated, and then it is checked to see if the operation is complete. If it is not, then the remainder of the method is signed up as the continuation of the awaitable, and a task representing the work of the current method is returned to the caller.
None of that is "releasing the thread". Rather, control returns to the caller, and the caller keeps executing on the current thread. Of course, if the current caller was the only thing on this thread, then the thread is done. But there is no requirement that an async method be the only call on a thread!
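Conceptually, that check-and-continue looks something like this (a simplified sketch, not literally what the compiler emits; ContinueFromHere is a hypothetical delegate standing in for the rest of the method):

// Simplified expansion of `await task;`:
var awaiter = task.GetAwaiter();
if (!awaiter.IsCompleted)
{
    // Sign the remainder of the method up as the continuation...
    awaiter.OnCompleted(ContinueFromHere);
    return;  // ...and return control to the caller on the current thread.
}
awaiter.GetResult();  // Already complete: keep running synchronously.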
after the IO completes
An awaitable need not be an IO operation, but let's suppose that it is.
it fetches a thread from the threadpool and runs the remaining code.
No. It schedules the remaining code to run on the correct context. That context might be a threadpool thread. It might be the UI thread. It might be the current thread. It might be any number of things.
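For example (SomeIoAsync is a hypothetical awaitable method):

await SomeIoAsync();                        // resumes on the captured context
                                            // (UI thread, request context, ...)
await SomeIoAsync().ConfigureAwait(false);  // opts out: resumes on a pool thread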
Should we use async methods for very fast IO methods, like cache read/write methods?
The awaitable is evaluated. If the awaitable knows that it can complete the operation in a reasonable amount of time then it is perfectly within its rights to do the operation and return a completed task. In which case there is no penalty; you're just checking a boolean to see if the task is completed.
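A hedged sketch of that fast path for a cache (the names and the ValueTask-based shape are illustrative, not prescribed by the answer):

using System.Collections.Concurrent;
using System.Threading.Tasks;

class CachedLookup
{
    private readonly ConcurrentDictionary<string, string> _cache = new();

    // On a cache hit the method completes synchronously; the caller's await
    // just sees IsCompleted == true and carries on, with no context switch.
    public ValueTask<string> GetAsync(string key)
    {
        if (_cache.TryGetValue(key, out var value))
            return new ValueTask<string>(value);               // completed synchronously
        return new ValueTask<string>(LoadFromStoreAsync(key)); // rare slow path
    }

    // Stand-in for a genuinely asynchronous, I/O-bound lookup.
    private Task<string> LoadFromStoreAsync(string key) =>
        Task.FromResult("...");
}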
Wouldn't they result in unnecessary context switches?
Not necessarily.
If we use sync methods, execution will complete on the same thread and a context switch may not happen.
I am confused as to why you think a context switch happens on an IO operation. IO operations run on hardware, below the level of OS threads. There's no thread sitting there servicing IO tasks.
Does async-await save only memory consumption (by creating fewer threads)?
The purpose of await is to (1) make more efficient use of expensive worker threads by allowing workflows to become more asynchronous, and thereby freeing up threads to do work while waiting for high-latency results, and (2) to make the source code for asynchronous workflows resemble the source code for synchronous workflows.
As far as I know, in the case of sync IO, while the IO takes place the thread goes into sleep mode. That means it does not consume CPU. Is this understanding correct?
Sure, but you have this completely backwards. YOU WANT TO CONSUME CPU. You want to be consuming as much CPU as possible, all the time! The CPU is doing work on behalf of the user, and if it is idle then it's not getting its work done as fast as it could. Don't hire a worker and then pay them to sleep! Hire a worker, and as soon as they are blocked on a high-latency task, put them to work doing something else so the CPU stays as hot as possible all the time. The owner of that machine paid good money for that CPU; it should be running at 100% the whole time there is work to be done!
So let's come back to your fundamental question:
Does async/await increase context switching?
I know a great way to find out. Write a program using await, write another one without, run them both, and measure the number of context switches per second. Then you'll know.
But I don't see why context switches per second is a relevant metric. Let's consider two banks with lots of customers and lots of employees. At bank #1 the employees work on one task until it is complete; they never switch context. If an employee is blocked on waiting for a result from another, they go to sleep. At bank #2, employees switch from one task to another when they are blocked, and are constantly servicing customer requests. Which bank do you think has faster customer service?
Should we use async methods for very fast IO methods, like cache read/write methods?
Such an IO would not block in the classical sense. "Blocking" is a loosely defined term. Normally it means that the CPU must wait for the hardware.
This type of IO is purely CPU work and there are no context switches. This would typically happen if the app reads a file or socket slower than data can be provided. Here, async IO does not help performance at all. I'm not even sure it would be suitable to unblock the UI thread since all tasks might complete synchronously.
Or does it save CPU as well?
It generally increases CPU usage in real-world loads. This is because the async machinery adds processing, allocations and synchronization. Also, we need to transition to kernel mode two times instead of once (first to initiate the IO, then to dequeue the IO completion notification).
Typical workloads run with <<100% CPU. A production server with >60% CPU would worry me since there is no margin for error. In such cases the thread pool work queues are almost always empty. Therefore, there are no context switching savings caused by processing multiple IO completions on one context switch.
That's why CPU usage generally increases (slightly), except if the machine is very high on CPU load and the work queues are often capable of delivering a new item immediately.
On the server async IO is mainly useful for saving threads. If you have ample threads available you will realize zero or negative gains. In particular any single IO will not become one bit faster.
That means it does not consume CPU.
It would be a waste to leave the CPU unavailable while an IO is in progress. To the kernel an IO is just a data structure. While it's in progress there is no CPU work to be done.
An anonymous person said:
For IO-bound tasks there may not be a major performance advantage to using separate threads just to wait for a result.
Pushing the same work to a different thread certainly does not help with throughput. This is added work, not reduced work. It's a shell game. (And async IO does not use a thread while it's running so all of this is based on a false assumption.)
A simple way to convince yourself that async IO generally costs more CPU than sync IO is to run a simple TCP ping/pong benchmark sync and async. Sync is faster. This is kind of an artificial load so it's just a hint at what's going on and not a comprehensive measurement.

When should a task be considered "long running"?

When working with tasks, a rule of thumb appears to be that the thread pool - typically used by e.g. invoking Task.Run() or Parallel.Invoke() - should be used for relatively short operations. When working with long-running operations, we are supposed to use the TaskCreationOptions.LongRunning flag in order to - as far as I understand it - avoid clogging the thread pool queue, i.e. to push work to a newly-created thread.
But what exactly is a long running operation? How long is long, in terms of time? Are there other factors besides the expected task duration to be considered when deciding whether or not to use the LongRunning, like the anticipated CPU architecture (frequency, the number of cores, ...) or the number of tasks that will be attempted to be run at once from the programmer's perspective?
For example, suppose I have 500 tasks to process in a dedicated application, each taking 10-20 seconds to complete. Should I just start all 500 tasks using Task.Run (e.g. in a loop) and then await them all, perhaps as LongRunning, while leaving the default max level of concurrency? Then again, if I set LongRunning in such a case, wouldn't this create 500 new threads and actually cause a lot of overhead and higher memory usage (due to the extra threads being allocated) as compared to omitting LongRunning? This is assuming that no new tasks will be scheduled for execution while these 500 are being awaited.
I would guess that the decision to set LongRunning depends on the number of requests made to the thread pool in a given time interval, and that LongRunning should only be used for tasks that are expected to take significantly longer than the majority of the thread-pool-placed tasks - by definition, at most a small percentage of all tasks. In other words, this appears to be a queuing and thread pool utilization optimization problem that should likely be solved case-by-case through testing, if at all. Am I correct?
It kind of doesn't matter. The problem isn't really about time, it's about what your code is doing. If you're doing asynchronous I/O, you're only using the thread for the short amount of time between individual requests. If you're doing CPU work... well, you're using the CPU. There's no "thread-pool starvation", because the CPUs are fully utilized.
The real problem is when you're doing blocking work that doesn't use the CPU. In cases like that, thread-pool starvation leads to CPU under-utilization - you said "I need the CPU for my work" and then you don't actually use it.
If you're not using blocking APIs, there's no point in using Task.Run with LongRunning. If you have to run some legacy blocking code asynchronously, using LongRunning may be a good idea. Total work time isn't as important as "how often you are doing this". If you spin up one thread based on a user clicking on a GUI, the cost is tiny compared to all the latencies already included in the act of clicking a button in the first place, and you can use LongRunning just fine to avoid the thread-pool. If you're running a loop that spawns lots of blocking tasks... stop doing that. It's a bad idea :D
For example, imagine there is no asynchronous API alternative to File.Exists. So if you see that it is giving you trouble (e.g. over a faulty network connection), you'd fire it up using Task.Run - and since you're not doing CPU work, you'd use LongRunning.
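A hedged sketch of that wrap (the UNC path is made up; note that LongRunning has to go through Task.Factory.StartNew, since Task.Run has no overload that accepts it):

using System.IO;
using System.Threading;
using System.Threading.Tasks;

class ExistsCheck
{
    // Wrap the blocking call so the calling thread stays responsive; LongRunning
    // requests a dedicated thread instead of tying up a pool thread.
    static Task<bool> ExistsAsync(string path) =>
        Task.Factory.StartNew(
            () => File.Exists(path),
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
}

// Usage, with a hypothetical flaky share:
// bool exists = await ExistsCheck.ExistsAsync(@"\\flaky-server\share\data.bin");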
In contrast, if you need to do some image manipulation that's basically 100% CPU work, it doesn't matter how long the operation takes - it's not a LongRunning thing.
And finally, the most common scenario for using LongRunning is when your "work" is actually the old-school "loop and periodically check if something should be done, do it and then loop again". Long running, but 99% of the time just blocking on some wait handle or something like that. Again, this is only useful when dealing with code that isn't CPU-bound, but that doesn't have proper asynchronous APIs. You might find something like this if you ever need to write your own SynchronizationContext, for example.
Now, how do we apply this to your example? Well, we can't, not without more information. If your code is CPU-bound, Parallel.For and friends are what you want - those ensure you only use enough threads to saturate the CPUs, and it's fine to use the thread pool for that. If it's not CPU-bound... you don't really have any option besides using LongRunning if you want to run the tasks in parallel. Ideally, such work would consist of asynchronous calls you can safely invoke and then await Task.WhenAll(...) from your own thread.
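For the CPU-bound branch, a minimal sketch (Transform is a hypothetical CPU-heavy function):

using System.Threading.Tasks;

class CpuBoundDemo
{
    // Hypothetical CPU-heavy work applied to each item.
    static int Transform(int value) => value * value;

    static int[] ProcessAll(int[] items)
    {
        var results = new int[items.Length];
        // Parallel.For picks a sensible degree of parallelism: enough threads
        // to saturate the cores, without spawning one thread per item.
        Parallel.For(0, items.Length, i =>
        {
            results[i] = Transform(items[i]);
        });
        return results;
    }
}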
When working with tasks, a rule of thumb appears to be that the thread pool - typically used by e.g. invoking Task.Run() or Parallel.Invoke() - should be used for relatively short operations. When working with long-running operations, we are supposed to use the TaskCreationOptions.LongRunning flag in order to - as far as I understand it - avoid clogging the thread pool queue, i.e. to push work to a newly-created thread.
The vast majority of the time, you don't need to use LongRunning at all, because the thread pool will adjust to "losing" a thread to a long-running operation after 2 seconds.
The main problem with LongRunning is that it forces you to use the very dangerous StartNew API.
In other words, this appears to be a queuing and thread pool utilization optimization problem that should likely be solved case-by-case through testing, if at all. Am I correct?
Yes. You should never set LongRunning when first writing code. If you are seeing delays due to the thread pool injection rate, then you can carefully add LongRunning.
You should not use TaskCreationOptions.LongRunning in your case. I would use Parallel.For.
The LongRunning option is not meant to be used when you are going to create a lot of tasks, as in your case. It is meant for creating a couple of tasks that will run for a long time.
By the way, I have never used this option in any similar scenario.
As you point out, TaskCreationOptions.LongRunning's purpose is
to allow the ThreadPool to continue to process work items even though one task is running for an extended period of time
As for when to use it:
It's not a specific length per se... You'd typically only use LongRunning if you found through performance testing that not using it was causing long delays in the processing of other work.
Source

c# webapi: Await Task.Run vs more granular await

I'm using async/await in WebApi Controllers according to this article:
https://msdn.microsoft.com/en-us/magazine/dn802603.aspx
Have a look at this simplified code in my controller:
DataBaseData = await Task.Run( () => GetDataFunction() );
GetDataFunction is a function that will open a database connection, open a reader and read the data from the database.
In many examples I see it handled differently: the GetDataFunction itself is awaited, and within the function every single step is awaited.
For example:
connection.OpenAsync
reader.ReadAsync
reader.IsDBNullAsync
Why is this good practice? Why not just start a thread for the whole database access (with Task.Run)?
Update:
Thanks for the help. Now I get it. I did not realize that the asynchronous APIs do not start threads themselves. This really helped: blog.stephencleary.com/2013/11/there-is-no-thread.html
The article you linked to states:
You can kick off some background work by awaiting Task.Run, but there’s no point in doing so. In fact, that will actually hurt your scalability by interfering with the ASP.NET thread pool heuristics... As a general rule, don’t queue work to the thread pool on ASP.NET.
In other words, avoid Task.Run on ASP.NET.
Why is this good practice? Why not just start a thread for the whole database access (with Task.Run)?
It's because the asynchronous APIs do not use other threads. The entire point of async/await is to free up the current thread; not use another thread. I have a blog post describing how async works without needing threads.
So, the Task.Run (or custom thread) approach will use up (block) a thread while getting data from the database. Proper asynchronous methods (e.g., EF6 or the ADO.NET asynchronous APIs) do not; they allow the request thread to be used for other requests while that request is waiting for the database response.
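A hedged sketch of GetDataFunction rewritten against the ADO.NET async APIs (the query and table are placeholders): no thread is blocked while the database round-trips are in flight.

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading.Tasks;

static async Task<List<string>> GetDataAsync(string connectionString)
{
    var rows = new List<string>();
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("SELECT Name FROM Customers", connection))
    {
        // The thread is released at each await and resumes on completion.
        await connection.OpenAsync();
        using (var reader = await command.ExecuteReaderAsync())
        {
            while (await reader.ReadAsync())
                rows.Add(reader.IsDBNull(0) ? "" : reader.GetString(0));
        }
    }
    return rows;
}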
I assume you are wondering why using
await Task.Run(() => GetDataFunction());
instead of
await GetDataFunction();
As you can see in the documentation for Task.Run(Func), under the Remarks section it is written:
The Run(Func) method is used by language compilers to support the async and await keywords. It is not intended to be called directly from user code.
That means you should use await GetDataFunction().
It's all about resource management and sharing.
When you use the connection.Open() method, the thread that calls it has to wait for the connection to actually open. While it waits, it cannot do anything useful, yet it still holds on to its resources and must be scheduled in and out by the operating system. However, when you await connection.OpenAsync(), the thread frees up its resources and the OS can redistribute them to other threads.
Now, there's a catch: to actually gain from asynchronous calls, they should take longer than the time it takes the OS to switch contexts; otherwise they will incur a drop in application performance.
The practice of awaiting connection.OpenAsync(), reader.ReadAsync(), etc. exists because these are potentially long-running operations. In a system where the database resides on another machine and is under a heavy load of requests, connecting to the database and getting the results of a query will take some time. Instead of blocking a thread while waiting for the results to arrive, why not allocate that time slice to another worker thread that waits in the scheduler queue to render the response for another client?
Now, about starting another thread for data access: don't do that! Each new thread gets allocated about 1 MB of memory space, so besides wasting CPU time while waiting for database operations to finish, you'll also be wasting memory. Furthermore, such a heavy memory footprint would require many more garbage collector runs, which freeze all of your threads while running.
Using Task.Run() will schedule the data access operations on a thread from the ThreadPool, but you will lose the benefit of freeing up that thread for other work while waiting for the database server to respond.

Threads, Task, async/await, Threadpool

I am getting really confused here about multithreading :(
I am reading about the C# async/await keywords. I often read that by using this async feature, the code gets executed "non-blocking". People put code examples into two categories, "IO-bound" and "CPU-bound", and say that I should not use a thread when I execute IO-bound things, because that thread will just wait.
I don't get it... If I do not want a user to have to wait for an operation, I have to execute that operation on another thread, right?
If I use the ThreadPool, an instance of the Thread class, delegate.BeginInvoke, or the TPL, every asynchronous execution is done on another thread (with or without a callback).
What you are missing is that not every asynchronous operation is done on another thread. Waiting on an IO operation or a web service call does not require the creation of a thread. On Windows this is done using the OS's I/O completion ports.
What happens when you call something like Stream.ReadAsync is that the OS issues a read command to the disk and then returns to the caller. Once the disk completes the read, it notifies the OS kernel, which then triggers a callback into your process. So there is no need to create a new threadpool thread that will just sit and block.
What is meant is this:
Suppose you query some data from a database (on another server): you send a request and just wait for the answer. Instead of having a thread block and wait for the result, it's better to register a callback that gets called when the data comes back; this is (more or less) what async/await does.
It frees the thread to do other things (gives it back to the pool), but once your data comes back asynchronously, it gets another thread and continues your code at the point you left off (it's really a kind of state machine that handles all this).
If your computation is really CPU-intensive (let's say you are calculating prime numbers), things are different: you are not waiting for some external IO, you are doing heavy work on the CPU. Here it's a better idea to use a separate thread so that your UI will not block.
I don't get it... If I do not want a user to have to wait for an operation, I have to execute that operation on another thread, right?
Not exactly. An operation will take however long it is going to take. When you have a single-user application, running long-running things on a separate thread lets the user interface remain responsive. At the very least this allows the UI to have something like a "Cancel" button that can take user input and cancel processing on the other thread. For some single-user applications, it makes sense to allow the user to keep doing other things while a long-running task completes (for example let them work on one file while another file is uploading or downloading).
For web applications, you do not want to block a thread from the thread pool during lengthy(ish) IO, for example while reading from a database or calling another web service. This is because there are only a limited number of threads available in the thread pool, and if they are all in use, the web server will not be able to accept additional HTTP requests.
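A hedged sketch of how that plays out in a Web API controller (the repository type and its FindAsync method are hypothetical): while the awaited call is pending, the request thread goes back to the pool and can serve other HTTP requests.

using System.Threading.Tasks;
using System.Web.Http;

public class ItemsController : ApiController
{
    private readonly IItemRepository _repository;  // hypothetical repository

    public ItemsController(IItemRepository repository) => _repository = repository;

    public async Task<IHttpActionResult> Get(int id)
    {
        // The request thread is released here until the lookup completes.
        var item = await _repository.FindAsync(id);
        return Ok(item);
    }
}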

Is it necessary to use the async Begin/End methods if already on a separate thread?

Trying to figure out whether or not I should use async methods such as:
TcpListener.BeginAcceptTcpClient
TcpListener.EndAcceptTcpClient
and
NetworkStream.BeginRead
NetworkStream.EndRead
as opposed to their synchronous TcpListener.AcceptTcpClient and NetworkStream.Read versions. I've been looking at related threads but I'm still a bit unsure about one thing:
Question: The main advantage of using an asynchronous method is that the GUI is not locked up. However, these methods will be called on separate Task threads anyway, so there is no threat of that. Also, TcpListener.AcceptTcpClient blocks the thread until a connection is made, so there are no wasted CPU cycles. Since this is the case, why do so many always recommend using the async versions? It seems like in this case the synchronous versions would be superior?
Also, another disadvantage of using asynchronous methods is the increased complexity and constant casting of objects. For example, having to do this:
private void SomeMethod()
{
    // ...
    // Begin an asynchronous accept; pass the listener as state for the callback.
    listener.BeginAcceptTcpClient(OnAcceptConnection, listener);
}

private void OnAcceptConnection(IAsyncResult asyn)
{
    // Recover the listener from the async state and complete the accept.
    TcpListener listener = (TcpListener)asyn.AsyncState;
    TcpClient client = listener.EndAcceptTcpClient(asyn);
}
As opposed to this:
TcpClient client = listener.AcceptTcpClient();
Also, it seems like the async versions would have much more overhead due to having to create another thread. (Basically, every connection would have a thread, and then when reading, that thread would also have another thread. Threadception!)
Also, there is the casting of the TcpListener and the overhead associated with creating, managing, and closing these additional threads.
Basically, where normally there would just be individual threads for handling individual client connections, now there is that plus an additional thread for each type of operation performed (reading/writing stream data and listening for new connections on the server's end).
Please correct me if I am wrong. I am still new to threading and I'm trying to understand this all. However, in this case it seems like using the normal synchronous methods and just blocking the thread would be the optimal solution?
TcpListener.AcceptTcpClient blocks the thread until a connection is made, so there are no wasted CPU cycles.
But there is also no work getting done. A thread is a very expensive operating system object, about the most expensive there is. Your program is consuming a megabyte of memory that accomplishes nothing while the thread blocks on a connection request.
However, these methods will be called on separate Task threads anyway, so there is no threat of that
A Task is not a good solution either; it uses a threadpool thread, but the thread will block. The threadpool manager tries to keep the number of running TP threads equal to the number of CPU cores on the machine. That won't work well when a TP thread blocks for a long time: it prevents other useful work from being done by the TP threads that are waiting to get their turn.
BeginAcceptTcpClient() uses a so-called I/O completion callback. No system resources are consumed while the socket is listening. As soon as a connection request comes in, the operating system runs an APC (asynchronous procedure call) which grabs a threadpool thread to make the callback. The thread itself is in use for, typically, a few microseconds. Very efficient.
This kind of code will get a lot simpler in the next version of C# with the new async and await keywords. End of the year, maybe.
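For reference, once those keywords and the Task-based socket APIs shipped, the Begin/End pair above collapses to something like this sketch (inside an async method):

// Same non-blocking accept: no callback, no cast, no blocked thread.
TcpClient client = await listener.AcceptTcpClientAsync();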
If you call AcceptTcpClient() on any thread, that thread is useless until you get a connection.
If you call BeginAcceptTcpClient(), the calling thread can return immediately, and no thread is wasted waiting.
This is particularly important when using the ThreadPool (or the TPL), since they use a limited number of pool threads.
If you have too many threads waiting for operations, you can run out of threadpool threads, so new work items will have to wait until one of the other threads finishes.
