I'm learning c# async/await methods. I don't understand how the async method keeps the reference to its caller method and how it jumps back to it correctly once its execution is done. Will the state machine or thread keeps the reference to it caller?
This is my simplified understanding of the inner workings, details are probably incorrect, but I hope the overall principle is close enough.
Lets assume you have a task from some source, and you await said task in an async method. This will register a callback delegate in the task that resumes execution of the async method, and it also captures an 'execution context'. When the task completes, regardless of success/exception/cancellation, all the registered callbacks will be called. Depending on context this might involve:
Posting a message to the main thread, asking it to invoke the callback with the result
Queuing the callback on the thread pool
Calling the callback directly
If the task is already completed then the callback may just be executed immediately.
So there will be an indirect reference from the called async method to the calling method, at least if the calling method also awaits the task. But in practice that is not something you should need to care about.
I highly recommend reading the Dissecting the async methods in C# article linked to in the comments by Mat J for a more detailed understanding. I also found Gor Nishanov article on Fibers under the magnifying glass interesting, but it covers how earlier attempts at asynchronicity worked, and how they arguably failed. There is also some articles on Continuation Passing Style (CPS) that might also be interesting, as async/await in some ways can be viewed as CPS with a easier syntax.
Related
I want to get info about all call stacks (or get all stacktraces) in my asynchronous C# application. I know, how to get stacktraces of all existing threads.
But how to get info about all call stacks released by await, which do not have a running thread on it?
CONTEXT EXAMPLE
Suppose the following code:
private static async Task Main()
{
async Task DeadlockMethod(SemaphoreSlim lock1, SemaphoreSlim lock2)
{
await lock1.WaitAsync();
await Task.Delay(500);
await lock2.WaitAsync(); // this line causes the deadlock
}
SemaphoreSlim lockA = new SemaphoreSlim(1);
SemaphoreSlim lockB = new SemaphoreSlim(1);
Task call1 = Task.Run(() => DeadlockMethod(lockA, lockB));
Task call2 = Task.Run(() => DeadlockMethod(lockB, lockA));
Task waitTask = Task.Delay(1000);
await Task.WhenAny(call1, call2, waitTask);
if (!call1.IsCompleted
&& !call2.IsCompleted)
{
// DUMP STACKTRACES to find the deadlock
}
}
I would like to dump all stacktraces, even those not having its thread currently, so that I can find the deadlock.
If line await lock2.WaitAsync(); is changed to lock2.Wait();, then it would be possible by already mentioned get stacktraces of all threads. But how to list all stacktraces without a running thread?
PREVENTION OF MISUNDERSTANDING:
The example is very simplified, it just ilustrates one of potential complications. The original problem is a complex multithreaded application, which runs on a server and many hard-to-investigate parallel-related issues may happen.
We would use the list of async/await stacktraces not only to find deadlocks, but also for other purposes. Therefore please do not advice me how to avoid deadlocks or how to write a multithreaded application - that is not the point of the question.
You can answer this generally, but also solution working at least on .Net Core 3.1 is enough.
I know, how to get stacktraces of all existing threads.
Just gonna give a bit of background here.
In Windows, threads are an OS concept. They're the unit of scheduling. So there's a definite list of threads somewhere, since that's what the OS scheduler uses.
Furthermore, each thread has a call stack. This dates back to the early days of computer programming. However, the purpose of the call stack is often misunderstood. The call stack is used as a sequence of return locations. When a method returns, it pops its call stack arguments off the stack and also the return location, and then jumps to the return location.
This is important to remember because the call stack does not represent how code got into a situation; it represents where code is going it returns from the current method. The call stack is where the code is going to, not where it came from. That is the reason the call stack exists: to direct the future code, not to assist diagnostics. Now, it does turn out that the call stack does have useful information on it for diagnostics since it gives an indication of where the code came from as well as where it's going, so that's why call stacks are on exceptions and are commonly used for diagnostics. But that's not the actual reason why the call stack exists; it's just a happy circumstance.
Now, enter asynchronous code.
In asynchronous code, the call stack still represents where the code is returning to (just like all call stacks). But in asynchronous code, the call stack no longer represents where the code came from. In the synchronous world, these two things were the same, and the call stack (which is necessary) can also be used to answer the question of "how did this code get here?". In the asynchronous world, the call stack is still necessary but only answers the question "where is this code going?" and cannot answer the question "how did this code get here?". To answer the "how did this code get here?" question you need a causality chain.
Furthermore, call stacks are necessary for correct operation (in both the synchronous and asynchronous worlds), and so the compiler/runtime ensures they exist. Causality chains are not necessary, and they are not provided out of the box. In the synchronous world, the call stack just happens to be a causality chain, which is nice, but that happy circumstance doesn't carry over to the asynchronous world.
When a thread is released by await, the stacktrace and all objects along the call stack are stored somewhere.
No; this is not the case. This would be true if async used fibers, but it doesn't. There is no call stack saved anywhere.
Because otherwise the continuation thread would lose context.
When an await resumes, it only needs sufficient context to continue executing its own method, and potentially completing the method. So, there is an async state machine structure that is boxed and placed on the heap; this structure contains references to local variables (including this and method arguments). But that is all that is necessary for program correctness; a call stack is not necessary and so it is not stored.
You can easily see this yourself by setting a breakpoint after an await and observing the call stack. You'll see that the call stack is gone after the first await yields. Or - more properly - the call stack represents the code that is continuing the async method, not the code that originally started the async method.
At the implementation level, async/await is more like callbacks than anything else. When a method hits an await, it sticks its state machine structure on the heap (if it hasn't already) and wires up a callback. That callback is triggered (invoked directly) when the task completes, and that continues executing the async method. When that async method completes, it completes its tasks, and anything awaiting those tasks are then invoked to continue executing. So, if a whole sequence of tasks complete, you actually end up with a call stack that is an inversion of the causality stack.
I would like to dump all stacktraces, even those not having its thread currently, so that I can find the deadlock.
So, there's a couple of problems here. First, there is no global list of all Task objects (or more generally, tasklike objects). And that would be a difficult thing to get.
Second, for each asynchronous method/task, there's no causality chain anyway. The compiler doesn't generate one because it's not necessary for correct operation.
That's not to say either of these problems are insurmountable - just difficult. I've done some work on the causality chain problem with my AsyncDiagnostics library. It's rather old at this point but should upgrade pretty easily to .NET Core. It uses PostSharp to modify the compiler-generated code for each method and manually track causality chains.
However, the goal of AsyncDiagnotics is to get causality chains onto exceptions. Getting a list of all tasklikes and associating causality chains with each one is another problem, likely requiring the use of an attached profiler. I'm aware of other companies who have wanted this solution, but none of them have dedicated the time necessary to create one; all of them have found it more efficient to implement code reviews, auditing, and developer training.
I marked Stephen Cleary's answer as the correct answer. He gave hints and explained deeply why it is so difficult.
I posted this alternative answer to explain, how we finally solved it and what we decided to do.
WORKAROUND SOLVING THE PROBLEM
Assumption: stacktraces including own code are enough.
Based on the assumption we can to this:
Encapsulate all called external Async methods (track their enter and leave)
Implement style check, which will warn about using any Async method out of your project namespaces
Ad 1.: Encapsulation
Suppose an external method Task ExternalObject.ExternalAsync(). We will create encapsulating extension method:
public static async Task MyExternalAsync(this ExternalObject obj)
{
using var disposable = AsyncStacktraces.MethodStarted();
await obj.ExternalAsync();
}
During the AsyncStacktraces.MethodStarted(); static call the current stacktrace will be recorded from Environment.StackTrace property into some static dictionary together with the disposable object. There will be no performance issues, since the async method itself is most probably much more expensive than stacktrace retrieval.
The disposable object will implement IDisposable interface. The .Dispose() method will remove the current stacktrace from the static dictionary at the end of the MyExternalAsync() method.
Usually only few tens of external Async methods are actually called in the solution, therefore the effort is quite low.
Ad 2.: Style check
Custom style check extension will warn when anybody uses external Async method directly. CI can be set-up so that it will not pass when this warning exists. On few places, where we will need a direct external Async method, we will use #pragma warning disable.
It seems it is impossible in c# (dotnet core 3.1) to call an asynchronous function from a non-async method without blocking the original thread. Why ?
Code example:
public async Task myMethodAsync() {
await Task.Delay(5000);
}
public void callingMethhod() {
myMethodAsync().Wait(); // all flavours of this expression, like f.ex. .Result seem to be blocking the calling thread
}
What is the technical limitation for being able to release the calling thread until the async method completes, and then continue execution from there? Is it something technically impossible?
Yes, that is expected. That is why the concept of Task etc exist. If it was possible to do what you want: they wouldn't need to - we could just wave a wand and everything would be async. The entire point of awaitable types is that releasing the calling thread while allowing later continuation is hard, and requires coordination and participation from the calling code; and from what ever calls that; etc - all the way up the stack to whatever is running the threads. Your callingMethhod code is synchronous: it has only a few things it can do:
it can run to completion - meaning: it would have to block
it can throw an exception
The impact of this is that async/await is infectious; anything that touches awaitables kinda needs to also be awaitable, which means: you would typically only call myMethodAsync from a callingMethhod that returned a [Value]Task[<T>], and which is presumably async with an await (although those bits aren't strictly necessary in the example shown).
It seems it is impossible in c# (dotnet core 3.1) to call an asynchronous function from a non-async method without blocking the original thread. Why ?
Because that's what synchronous means. Synchronous means it blocks the thread until the method is complete.
What is the technical limitation for being able to release the calling thread until the async method completes, and then continue execution from there? Is it something technically impossible?
If your calling method wants to release its thread, then make it asynchronous. Asynchronous methods are capable of releasing their calling threads. There's nothing impossible here; it's been solved by async and await.
Now, if you're asking "why can't every method implicitly be async", then that is in theory possible, but it causes a couple of major issues. It can never be done in C# for backwards compatibility reasons. The two issues that immediately come to mind are:
Limited interop. It's not possible to use an "everything is async" language to interop with anything that uses old-school threading (Win32 mutexes, et. al.).
Unexpected concurrency. There are a lot of scenarios where developers make an assumption of synchronous code. If every method is potentially asynchronous, then even code as simple as var x = this._list[0]; this._list.RemoveAt(0); is no longer safe.
I've been reading about the new async and await operators in C# and tried to figure out in which circumstances they would possibly be useful to me. I studied several MSDN articles and here's what I read between the lines:
You can use async for Windows Forms and WPF event handlers, so they can perform lengthy tasks without blocking the UI thread while the bulk of the operation is being executed.
async void button1_Click(object sender, EventArgs e)
{
// even though this call takes a while, the UI thread will not block
// while it is executing, therefore allowing further event handlers to
// be invoked.
await SomeLengthyOperationAsync();
}
A method using await must be async, which means that the usage of any async function somewhere in your code ultimately forces all methods in the calling sequence from the UI event handlers up until the lowest-level async method to be async as well.
In other words, if you create a thread with an ordinary good old ThreadStart entry point (or a Console application with good old static int Main(string[] args)), then you cannot use async and await because at one point you would have to use await, and make the method that uses it async, and hence in the calling method you also have to use await and make that one async and so on. But once you reach the thread entry point (or Main()), there's no caller to which an await would yield control to.
So basically you cannot use async and await without having a GUI that uses the standard WinForms and WPF message loop. I guess all that makes indeed sense, since MSDN states that async programming does not mean multithreading, but using the UI thread's spare time instead; when using a console application or a thread with a user defined entry point, multithreading would be necessary to perform asynchronous operations (if not using a compatible message loop).
My question is, are these assumptions accurate?
So basically you cannot use async and await without having a GUI that uses the standard WinForms and WPF message loop.
That's absolutely not the case.
In Windows Forms and WPF, async/await has the handy property of coming back to the UI thread when the asynchronous operation you were awaiting has completed, but that doesn't mean that's the only purpose to it.
If an asynchronous method executes on a thread-pool thread - e.g. in a web service - then the continuation (the rest of the asynchronous method) will simply execute in any thread-pool thread, with the context (security etc) preserved appropriately. This is still really useful for keeping the number of threads down.
For example, suppose you have a high traffic web service which mostly proxies requests to other web services. It spends most of its time waiting for other things, whether that's due to network traffic or genuine time at another service (e.g. a datbase). You shouldn't need lots of threads for that - but with blocking calls, you naturally end up with a thread per request. With async/await, you'd end up with very few threads, because very few requests would actually need any work performed for them at any one point in time, even if there were a lot of requests "in flight".
The trouble is that async/await is most easily demonstrated with UI code, because everyone knows the pain of either using background threads properly or doing too much work in the UI thread. That doesn't mean it's the only place the feature is useful though - far from it.
Various server-side technologies (MVC and WCF for example) already have support for asynchronous methods, and I'd expect others to follow suit.
A method using await must be async, which means that the usage of any async function somewhere in your code ultimately forces all methods in the calling sequence from the UI event handlers up until the lowest-level async method to be async as well.
Not true - methods marked async just mean they can use await, but callers of those methods have no restrictions. If the method returns Task or Task<T> then they can use ContinueWith or anything else you could do with tasks in 4.0
A good non-UI example is MVC4 AsyncController.
Ultimately, async/await is mostly about getting the compiler rewriting so you can write what looks like synchronous code and avoid all the callbacks like you had to do before async/await was added. It also helps with the SynchronizationContext handling, useful for scenarios with thread affinity (UI frameworks, ASP.NET), but even without those, it's still useful. Main can always do DoStuffAsync().Wait(); for instance. :)
My question is, are these assumptions accurate?
No.
You can use async for Windows Forms and WPF event handlers, so they can perform lengthy tasks without blocking the UI thread while the bulk of the operation is being executed.
True. Also true for other UI applications including Silverlight and Windows Store.
And also true for ASP.NET. In this case, it's the HTTP request thread that is not blocked.
A method using await must be async, which means that the usage of any async function somewhere in your code ultimately forces all methods in the calling sequence from the UI event handlers up until the lowest-level async method to be async as well.
This is a best practice ("async all the way down"), but it's not strictly required. You can block on the result of an asynchronous operation; many people choose to do this in Console applications.
an ordinary good old ThreadStart entry point
Well... I do have to take issue with "ordinary good old". As I explain on my blog, Thread is pretty much the worst option you have for doing background operations.
I recommend you review my introduction to async and await, and follow up with the async / await FAQ.
async-await is only wrapper for Task class manipulations, which is part of so named Tasks Parallel Library - TPL(published before async-await auto code generation tech.)
So fact is you may not use any references to UI controls within async - await.
Typically async-await is powerfull tool for any web and server relations, loading resources, sql. It works with smart waiting data with alive UI.
Typically TPL application: from simple big size loop till multi stages parallel calculations in complex calculations based on shared data (ContinueWith and so on)
Why is it possible to do this in C#?
var task = Task.Run (...);
await task;
Isn't Task.Run() supposed to be used for CPU-bound code? Does it make sense to call awaitfor this?
I.e., after calling Task.Run I understand that the task is running in another thread of the thread pool. What's the purpose of calling await? Wouldn't make more sense just to call task.Wait()?
One last question, my first impression was that await is intended to be used exclusively with async methods. Is it common to use it for task returned by Task.Run()?
EDIT. It also makes me wonder, why do we have Task.Wait () and not a Task.Await(). I mean, why a method is used for Wait() and a keyworkd for await. Wouldn't be more consistent to use a method in both cases?
There would be no point at all in using Wait. There's no point in starting up a new thread to do work if you're just going to have another thread sitting there doing nothing waiting on it. The only sensible option of those two is to await it. Awaiting a task is entirely sensible, as it's allowing the original thread to continue execution.
It's sensible to await any type of Task (in the right context), regardless of where it comes from. There's nothing special about async methods being awaited. In fact in every single asynchronous program there needs to be asynchronous methods that aren't using the async keyword; if every await is awaiting an async method then you'll never have anywhere to start.
There are several good answers here, but from a more philosophical standpoint...
If you have lots of CPU-bound work to do, the best solution is usually the Task Parallel Library, i.e., Parallel or Parallel LINQ.
If you have I/O-bound work to do, the best solution is usually async and await code that is built around naturally-asynchronous implementations (e.g., Task.Factory.FromAsync).
Task.Run is a way to execute a single piece of CPU-bound code and treat it as asynchronous from the point of view of the calling thread. I.e., if you want to do CPU-bound work but not have it interfere with the UI.
The construct await Task.Run is a way to bridge the two worlds: have the UI thread queue up CPU-bound work and treat it asynchronously. This is IMO the best way to bridge asynchronous and parallel code as well, e.g., await Task.Run(() => Parallel.ForEach(...)).
why a method is used for Wait() and a keyword for await.
One reason that await is a keyword is because they wanted to enable pattern matching. Tasks are not the only "awaitables" out there. WinRT has its own notion of "asynchronous operations" which are awaitable, Rx observable sequences are awaitable, Task.Yield returns a non-Task awaitable, and this enables you to create your own awaitables if necessary (e.g., if you want to avoid Task allocations in high-performance socket applications).
Yes, it's common and recommended. await allows to wait for a task (or any awaitable) asynchronously. It's true that it's mostly used for naturally asynchronous operations (e.g. I/O), but it's also used for offloading work to be done on a different thread using Task.Run and waiting asynchronously for it to complete.
Using Wait not only blocks the calling thread and so defeats the purpose of using Task.Run in the first place, it could also potentially lead to deadlocks in a GUI environment with a single threaded synchronization context.
One last question, my first impression was that await is intended to be used exclusively with async methods
Whether a method is actually marked with the async modifier is an implementation detail and most "root" task returning methods in .Net aren't actually async ones (Task.Delay is a good example for that).
Wouldn't make more sense just to call task.Wait()?
No, if you call Wait you're involving two threads there, one of the worker threads from the ThreadPool is working for you (given the task is CPU bound), and also your calling thread will be blocked.
Why would you block the calling thread? The result will be too bad if the calling thread is a UI thread! Also if you call Task.Run immediately followed by Task.Wait you're making it worse, too. It is no better than calling the delegate synchronously. There is no point whatsoever in calling Wait immediately after starting a Task.
You should almost never use Wait, always prefer await and release the calling thread.
It is very common and useful for cases like (greatly simplified; production code would need exception handling, for instance):
async void OnButtonClicked()
{
//... Get information from UI controls ...
var results = await Task.Run(CpuBoundWorkThatShouldntBlockUI);
//... Update UI based on results from work run in the background ...
}
Regarding the comment of your later edit, 'await' is not a method call. It is a keyword (only allowed in a method that has been marked 'async', for clarity) that the compiler uses to decide how to implement the method. Under the hood, this involves rewriting the method procedure as a state machine that can be paused every time you use the 'await' keyword, then resumed when whatever is it awaiting calls back to indicate that it is done. This is a simplified description (exception propagation and other details complicate things), but the key point is that 'await' does a lot more than just calling a method on a Task.
In previous C# versions, the closest construct to this 'async/await' magic was the use of 'yield return' to implement IEnumerable<T>. For both enumerations and async methods, you want a way to pause and resume a method. The async/await keywords (and associated compiler support) start with this basic idea of a resumable method, then add in some powerful features like automatic propagation of exceptions, dispatching callbacks via a synchronization context (mostly useful to keep code on the UI thread) and automatic implementation of all the glue code to set up the continuation callback logic.
I've been reading about the new async and await operators in C# and tried to figure out in which circumstances they would possibly be useful to me. I studied several MSDN articles and here's what I read between the lines:
You can use async for Windows Forms and WPF event handlers, so they can perform lengthy tasks without blocking the UI thread while the bulk of the operation is being executed.
async void button1_Click(object sender, EventArgs e)
{
// even though this call takes a while, the UI thread will not block
// while it is executing, therefore allowing further event handlers to
// be invoked.
await SomeLengthyOperationAsync();
}
A method using await must be async, which means that the usage of any async function somewhere in your code ultimately forces all methods in the calling sequence from the UI event handlers up until the lowest-level async method to be async as well.
In other words, if you create a thread with an ordinary good old ThreadStart entry point (or a Console application with good old static int Main(string[] args)), then you cannot use async and await because at one point you would have to use await, and make the method that uses it async, and hence in the calling method you also have to use await and make that one async and so on. But once you reach the thread entry point (or Main()), there's no caller to which an await would yield control to.
So basically you cannot use async and await without having a GUI that uses the standard WinForms and WPF message loop. I guess all that makes indeed sense, since MSDN states that async programming does not mean multithreading, but using the UI thread's spare time instead; when using a console application or a thread with a user defined entry point, multithreading would be necessary to perform asynchronous operations (if not using a compatible message loop).
My question is, are these assumptions accurate?
So basically you cannot use async and await without having a GUI that uses the standard WinForms and WPF message loop.
That's absolutely not the case.
In Windows Forms and WPF, async/await has the handy property of coming back to the UI thread when the asynchronous operation you were awaiting has completed, but that doesn't mean that's the only purpose to it.
If an asynchronous method executes on a thread-pool thread - e.g. in a web service - then the continuation (the rest of the asynchronous method) will simply execute in any thread-pool thread, with the context (security etc) preserved appropriately. This is still really useful for keeping the number of threads down.
For example, suppose you have a high traffic web service which mostly proxies requests to other web services. It spends most of its time waiting for other things, whether that's due to network traffic or genuine time at another service (e.g. a datbase). You shouldn't need lots of threads for that - but with blocking calls, you naturally end up with a thread per request. With async/await, you'd end up with very few threads, because very few requests would actually need any work performed for them at any one point in time, even if there were a lot of requests "in flight".
The trouble is that async/await is most easily demonstrated with UI code, because everyone knows the pain of either using background threads properly or doing too much work in the UI thread. That doesn't mean it's the only place the feature is useful though - far from it.
Various server-side technologies (MVC and WCF for example) already have support for asynchronous methods, and I'd expect others to follow suit.
A method using await must be async, which means that the usage of any async function somewhere in your code ultimately forces all methods in the calling sequence from the UI event handlers up until the lowest-level async method to be async as well.
Not true - methods marked async just mean they can use await, but callers of those methods have no restrictions. If the method returns Task or Task<T> then they can use ContinueWith or anything else you could do with tasks in 4.0
A good non-UI example is MVC4 AsyncController.
Ultimately, async/await is mostly about getting the compiler rewriting so you can write what looks like synchronous code and avoid all the callbacks like you had to do before async/await was added. It also helps with the SynchronizationContext handling, useful for scenarios with thread affinity (UI frameworks, ASP.NET), but even without those, it's still useful. Main can always do DoStuffAsync().Wait(); for instance. :)
My question is, are these assumptions accurate?
No.
You can use async for Windows Forms and WPF event handlers, so they can perform lengthy tasks without blocking the UI thread while the bulk of the operation is being executed.
True. Also true for other UI applications including Silverlight and Windows Store.
And also true for ASP.NET. In this case, it's the HTTP request thread that is not blocked.
A method using await must be async, which means that the usage of any async function somewhere in your code ultimately forces all methods in the calling sequence from the UI event handlers up until the lowest-level async method to be async as well.
This is a best practice ("async all the way down"), but it's not strictly required. You can block on the result of an asynchronous operation; many people choose to do this in Console applications.
an ordinary good old ThreadStart entry point
Well... I do have to take issue with "ordinary good old". As I explain on my blog, Thread is pretty much the worst option you have for doing background operations.
I recommend you review my introduction to async and await, and follow up with the async / await FAQ.
async-await is only wrapper for Task class manipulations, which is part of so named Tasks Parallel Library - TPL(published before async-await auto code generation tech.)
So fact is you may not use any references to UI controls within async - await.
Typically async-await is powerfull tool for any web and server relations, loading resources, sql. It works with smart waiting data with alive UI.
Typically TPL application: from simple big size loop till multi stages parallel calculations in complex calculations based on shared data (ContinueWith and so on)