I completely don't understand the practical point of async/await.
I just started learning async/await, and I know there are already a huge number of topics about it. If I understand correctly, async/await is only needed for operations that involve a long wait in a thread, as long as the wait isn't a long calculation: for example, a database response, a network request, or file handling. Many people write that async/await is also needed so as not to block the main thread. And here it is completely unclear to me why it should be blocked; don't block without async/await, just create a task. So I'm trying to write code that will wait a long time for a response from the network.
I created an example. I can see with my own eyes, through the Windows Task Manager, that the while (i < int.MaxValue) loops are processed first, taking up the entire processor, even though I launched the DownloadFile tasks first. Only when the processor frees up do I see the downloads making progress. On my machine, the example runs in ~54 seconds.
Question: how could I first run the DownloadFile asynchronously so that the threads do not idle uselessly, but can do while (i < int.MaxValue)?
using System.Net;

string PathProject = Directory.GetParent(Directory.GetCurrentDirectory()).Parent.Parent.Parent.FullName;

// Create folder 1 in the project folder
DirectoryInfo Path = new DirectoryInfo($"{PathProject}\\1");
Path.Create();
int Iterations = Environment.ProcessorCount * 3;
string file = "https://s182vla.storage.yandex.net/rdisk/82b08d86b9920a5e889c6947e4221eb1350374db8d799ee9161395f7195b0b0e/62f75403/geIEA69cusBRNOpxmtup5BdJ7AbRoezTJE9GH4TIzcUe-Cp7uoav-lLks4AknK2SfU_yxi16QmxiuZOGFm-hLQ==?uid=0&filename=004%20-%2002%20Lesnik.mp3&disposition=attachment&hash=e0E3gNC19eqNvFi1rXJjnP1y8SAS38sn5%2ByGEWhnzE5cwAGsEnlbazlMDWSjXpyvq/J6bpmRyOJonT3VoXnDag%3D%3D&limit=0&content_type=audio%2Fmpeg&owner_uid=160716081&fsize=3862987&hid=98984d857027117759bc5ce6092eaa6a&media_type=audio&tknv=v2&rtoken=k9xogU6296eg&force_default=no&ycrid=na-2bc914314062204f1cbf810798018afd-downloader16e&ts=5e61a6daac6c0&s=eef8b08190dc7b22befd6bad89e1393b394869a1668d9b8af3730cce4774e8ad&pb=U2FsdGVkX1__q3AvjJzgzWG4wVR80Oh8XMl-0Dlfyu9FhqAYQVVkoBV0dtBmajpmOkCXKUXPbREOS-MZCxMNu2rkAkKq_n-AXcZ85svtSFs";
List<Task> tasks = new List<Task>();

void MyMethod1(int i)
{
    WebClient client = new WebClient();
    client.DownloadFile(file, $"{Path}\\{i}.mp3");
}

void MyMethod2()
{
    int i = 0;
    while (i < int.MaxValue)
    {
        i++;
    }
}
DateTime dateTimeStart = DateTime.Now;

for (int i = 0; i < Iterations; i++)
{
    int j = i;
    tasks.Add(Task.Run(() => MyMethod1(j)));
}
for (int i = 0; i < Iterations; i++)
{
    tasks.Add(Task.Run(() => { MyMethod2(); MyMethod2(); }));
}

Task.WaitAll(tasks.ToArray());
Console.WriteLine(DateTime.Now - dateTimeStart);

while (true)
{
    Thread.Sleep(100);
    if (Path.GetFiles().Length == Iterations)
    {
        Thread.Sleep(1000);
        foreach (FileInfo f in Path.GetFiles())
        {
            f.Delete();
        }
        return;
    }
}
If there are two web servers that talk to a database, and they run on two machines with the same spec, the web server with async code will be able to handle more concurrent requests.
The following is from the 2014 article Async Programming : Introduction to Async/Await on ASP.NET:
Why Not Increase the Thread Pool Size?
At this point, a question is always asked: Why not just increase the size of the thread pool? The answer is twofold: Asynchronous code scales both further and faster than blocking thread pool threads.
Asynchronous code can scale further than blocking threads because it uses much less memory; every thread pool thread on a modern OS has a 1MB stack, plus an unpageable kernel stack. That doesn’t sound like a lot until you start getting a whole lot of threads on your server. In contrast, the memory overhead for an asynchronous operation is much smaller. So, a request with an asynchronous operation has much less memory pressure than a request with a blocked thread. Asynchronous code allows you to use more of your memory for other things (caching, for example).
Asynchronous code can scale faster than blocking threads because the thread pool has a limited injection rate. As of this writing, the rate is one thread every two seconds. This injection rate limit is a good thing; it avoids constant thread construction and destruction. However, consider what happens when a sudden flood of requests comes in. Synchronous code can easily get bogged down as the requests use up all available threads and the remaining requests have to wait for the thread pool to inject new threads. On the other hand, asynchronous code doesn’t need a limit like this; it’s “always on,” so to speak. Asynchronous code is more responsive to sudden swings in request volume.
(These days threads are added every 0.5 seconds.)
WebRequest.Create("https://192.168.1.1").GetResponse()
At some point the above code will probably hit the OS method recv(). The OS will suspend your thread until data becomes available. The state of your function, in CPU registers and the thread stack, will be preserved by the OS while the thread is suspended. In the meantime, this thread can't be used for anything else.
If you start that method via Task.Run(), then your method will consume a thread from a thread pool that has been prepared for you by the runtime. Since these threads aren't used for anything else, your program can continue handling other requests on other threads. However, creating a large number of OS threads has significant overheads.
Every OS thread must have some memory reserved for its stack, and the OS must use some memory to store the full state of the CPU for any suspended thread. Switching threads can have a significant performance cost. For maximum performance, you want to keep a small number of threads busy, rather than a large number of suspended threads that the OS must keep swapping in and out of each CPU core.
When you use async and await, the C# compiler transforms your method into a coroutine, ensuring that any state your program needs to remember is no longer stored in CPU registers or on the OS thread stack. Instead, all of that state is stored in heap memory while your task is suspended. When your task is suspended and resumed, only the data you actually need is loaded and stored, rather than the entire CPU state.
If you change your code to use .GetResponseAsync(), the runtime will call an OS method that supports overlapped I/O. While your task is suspended, no OS thread will be busy. When data is available, the runtime will continue to execute your task on a thread from the thread pool.
Is this going to impact the program you are writing today? Will you be able to tell the difference? Not until the CPU starts to become the bottleneck, when you are attempting to scale your program to thousands of concurrent requests.
If you are writing new code, look for the Async version of any I/O method. Sprinkle async & await around. It doesn't cost you anything.
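As a minimal sketch of that advice (the helper names GetPageBlocking/GetPageAsync and the url parameter are made up for illustration; the WebRequest API is the one referenced above), the change is simply to await the Async counterparts:

using System.IO;
using System.Net;
using System.Threading.Tasks;

// Blocking: the calling thread is parked inside the OS wait until the response arrives.
static string GetPageBlocking(string url)
{
    var request = WebRequest.Create(url);
    using var response = request.GetResponse();
    using var reader = new StreamReader(response.GetResponseStream());
    return reader.ReadToEnd();
}

// Asynchronous: no thread is blocked while the request is in flight.
static async Task<string> GetPageAsync(string url)
{
    var request = WebRequest.Create(url);
    using var response = await request.GetResponseAsync();
    using var reader = new StreamReader(response.GetResponseStream());
    return await reader.ReadToEndAsync();
}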
If I understand correctly, then async/await is not needed anywhere else except for operations with a long wait in a thread, if this is not related to a long calculation.
It's kind of recursive, but async is best used whenever there's something asynchronous. In other words, anything where the CPU would be wasted if it had to just spin (or block) while waiting for the operation to complete. Operations that are naturally asynchronous are generally I/O-based (as you mention, DB and other network calls, as well as file I/O), but they can be more arbitrary events, too (e.g., timers). Anything where there isn't actual code to run to get the response.
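A minimal sketch of that idea (the file path is hypothetical; File.ReadAllTextAsync requires modern .NET): both awaits below are naturally asynchronous, one I/O-based and one timer-based, and neither occupies a thread while pending.

// I/O-based: the OS signals completion; no thread spins while the read is pending.
string text = await File.ReadAllTextAsync(@"C:\temp\data.txt");

// Timer-based: a timer fires later; again, no thread is blocked for the two seconds.
await Task.Delay(TimeSpan.FromSeconds(2));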
Many people write that async/await is also needed so as not to block the main thread.
At a higher level, there are two primary benefits to async/await, depending on what kind of code you're talking about:
On the server side (e.g., web apps), async/await provides scalability by using fewer threads per request.
On the client side (e.g., UI apps), async/await provides responsiveness by keeping the UI thread free to respond to user input.
Developers tend to emphasize one or the other depending on the kind of work they normally do. So if you see an async article talking about "not blocking the main thread", they're talking about UI apps specifically.
And here it is completely unclear to me why it should be blocked. Don't block without async/await, just create a task.
That works just fine for many situations. But it doesn't work well in others.
E.g., it would be a bad idea to just Task.Run onto a background thread in a web app. The primary benefit of async in a web app is to provide scalability by using fewer threads per request, so using Task.Run does not provide any benefits at all (in fact, scalability is reduced). So, the idea of "use Task.Run instead of async/await" cannot be adopted as a universal principle.
The other problem is in resource-constrained environments, such as mobile devices. You can only have so many threads there before you start running into other problems.
But if you're talking Desktop apps (e.g., WPF and friends), then sure, you can use async/await to free up the UI thread, or you can use Task.Run to free up the UI thread. They both achieve the same goal.
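To make that concrete, here is a rough WPF-style sketch (not from the answer; the handler names, httpClient, resultText and ExpensiveCalculation are placeholders). Both handlers keep the UI thread free: one by awaiting naturally asynchronous I/O, the other by pushing CPU-bound work to the pool with Task.Run.

// Naturally asynchronous work: await it directly; the UI thread stays free while waiting.
private async void DownloadButton_Click(object sender, RoutedEventArgs e)
{
    resultText.Text = await httpClient.GetStringAsync("https://example.com/data");
}

// CPU-bound work: there is nothing to await, so push it to the thread pool with Task.Run.
private async void ComputeButton_Click(object sender, RoutedEventArgs e)
{
    resultText.Text = await Task.Run(() => ExpensiveCalculation());
}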
Question: how could I first run the DownloadFile asynchronously so that the threads do not idle uselessly, but can do while (i < int.MaxValue)?
There's nothing in your code that is asynchronous at all. So really, you're dealing with multithreading/parallelism. In general, I recommend using higher-level constructs such as Parallel for parallelism rather than Task.Run.
But regardless of the API used, the underlying problem is that you're kicking off Environment.ProcessorCount * 6 threads. You'll want to ensure that your thread pool is ready for that many threads by calling ThreadPool.SetMinThreads with the workerThreads set to a high enough number.
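For comparison, here is a rough sketch (not the answerer's code) of the download half of the question made truly asynchronous, reusing the question's file, Path, Iterations and MyMethod2. While the downloads are in flight, no thread-pool thread is blocked, so the CPU-bound loops can use all cores immediately.

// WebClient.DownloadFileTaskAsync is the awaitable counterpart of DownloadFile.
async Task MyMethod1Async(int i)
{
    using WebClient client = new WebClient();
    await client.DownloadFileTaskAsync(file, $"{Path}\\{i}.mp3");
}

var tasks = new List<Task>();
for (int i = 0; i < Iterations; i++)
{
    tasks.Add(MyMethod1Async(i));                              // starts the download, returns immediately
    tasks.Add(Task.Run(() => { MyMethod2(); MyMethod2(); }));  // CPU-bound work on the pool
}
await Task.WhenAll(tasks);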
It's not web requests but here's a toy example:
Test:
n: 1 await: 00:00:00.1373839 sleep: 00:00:00.1195186
n: 10 await: 00:00:00.1290465 sleep: 00:00:00.1086578
n: 100 await: 00:00:00.1101379 sleep: 00:00:00.6517959
n: 300 await: 00:00:00.1207069 sleep: 00:00:02.0564836
n: 500 await: 00:00:00.1211736 sleep: 00:00:02.2742309
n: 1000 await: 00:00:00.1571661 sleep: 00:00:05.3987737
Code:
using System.Diagnostics;

foreach (var n in new[] { 1, 10, 100, 300, 500, 1000 })
{
    var sw = Stopwatch.StartNew();
    var tasks = Enumerable.Range(0, n)
        .Select(i => Task.Run(async () =>
        {
            await Task.Delay(TimeSpan.FromMilliseconds(100));
        }));
    await Task.WhenAll(tasks);
    var tAwait = sw.Elapsed;

    sw = Stopwatch.StartNew();
    var tasks2 = Enumerable.Range(0, n)
        .Select(i => Task.Run(() =>
        {
            Thread.Sleep(TimeSpan.FromMilliseconds(100));
        }));
    await Task.WhenAll(tasks2);
    var tSleep = sw.Elapsed;

    Console.WriteLine($"n: {n,4} await: {tAwait} sleep: {tSleep}");
}
I am new to TPL and I am wondering: How does the asynchronous programming support that is new to C# 5.0 (via the new async and await keywords) relate to the creation of threads?
Specifically, does the use of async/await create a new thread each time that they are used? And if there many nested methods that use async/await, is a new thread created for each of those methods?
In short, NO.
From Asynchronous Programming with Async and Await : Threads
The async and await keywords don't cause additional threads to be created. Async methods don't require multithreading because an async method doesn't run on its own thread. The method runs on the current synchronization context and uses time on the thread only when the method is active. You can use Task.Run to move CPU-bound work to a background thread, but a background thread doesn't help with a process that's just waiting for results to become available.
According to MSDN : async keyword
An async method runs synchronously until it reaches its first await expression, at which point the method is suspended until the awaited task is complete. In the meantime, control returns to the caller of the method, as the example in the next section shows.
Here is some sample code to check it:
class Program
{
    static void Main(string[] args)
    {
        Program p = new Program();
        p.Run();
    }

    private void Print(string txt)
    {
        string dateStr = DateTime.Now.ToString("HH:mm:ss.fff");
        Console.WriteLine($"{dateStr} Thread #{Thread.CurrentThread.ManagedThreadId}\t{txt}");
    }

    private void Run()
    {
        Print("Program Start");
        Experiment().Wait();
        Print("Program End. Press any key to quit");
        Console.Read();
    }

    private async Task Experiment()
    {
        Print("Experiment code is synchronous before await");
        await Task.Delay(500);
        Print("Experiment code is asynchronous after first await");
    }
}
And the result:
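(The original answer showed a screenshot of the console output here; an illustrative run, with timestamps omitted and thread IDs that will vary, looks like this:)

Thread #1    Program Start
Thread #1    Experiment code is synchronous before await
Thread #4    Experiment code is asynchronous after first await
Thread #1    Program End. Press any key to quit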
We see that the code of the Experiment() method after the await executes on another thread.
But if I replace the Task.Delay with my own code (the method SomethingElse):
class Program
{
    static void Main(string[] args)
    {
        Program p = new Program();
        p.Run();
    }

    private void Print(string txt)
    {
        string dateStr = DateTime.Now.ToString("HH:mm:ss.fff");
        Console.WriteLine($"{dateStr} Thread #{Thread.CurrentThread.ManagedThreadId}\t{txt}");
    }

    private void Run()
    {
        Print("Program Start");
        Experiment().Wait();
        Print("Program End. Press any key to quit");
        Console.Read();
    }

    private async Task Experiment()
    {
        Print("Experiment code is synchronous before await");
        await SomethingElse();
        Print("Experiment code is asynchronous after first await");
    }

    private Task SomethingElse()
    {
        Print("Experiment code is asynchronous after first await");
        Thread.Sleep(500);
        return (Task.CompletedTask);
    }
}
I notice the thread remains the same!
In conclusion, I'd say async/await code can use another thread, but only if that thread is created or provided by other code, not by async/await itself.
In this case, I think it was Task.Delay that brought the other thread into play (its continuation ran on a thread-pool thread), so I can conclude that async/await does not create a new thread, as said by @Adriaan Stander.
Sorry for being late to the party.
I am new to TPL and I am wondering: How does the asynchronous programming support that is new to C# 5.0 (via the new async and await keywords) relate to the creation of threads?
async/await was not introduced for thread creation, but to utilize the current thread optimally.
Your app might read files, wait for a response from another server, or even do a computation with high memory-access latency (in short, any I/O-bound task). These tasks are not CPU-intensive (any task that will not use 100% of your thread).
Think about the case where you are processing 1000 non-CPU-intensive tasks. Creating thousands of OS-level threads might eat up more CPU and memory than doing the actual work on a single thread (around 1 MB of stack is reserved per thread on Windows by default, so 1000 threads reserve roughly 1 GB just for stacks). At the same time, if you run all the tasks sequentially, you have to wait for each I/O task to finish, which results in a long total completion time while keeping the CPU idle.
Since we require concurrency to complete multiple tasks quickly, yet all these concurrent tasks are not CPU-hungry, creating a thread per task is inefficient.
The compiler breaks the method up at each await of an async call: when the awaited operation has not yet completed, control immediately returns to the caller so other code can run, and when the awaited operation completes, execution resumes inside the async method just after the await. This is repeated again and again until all the async calls are completed and their awaiters are satisfied.
If any of the async methods does heavy CPU work without awaiting anything, then yes, your system will become unresponsive and the remaining continuations will not get called until the current work is finished.
So I've been reading up on the threading model, and async/await can certainly lead to new threads being used (not necessarily created; the pool often has threads available already). It's up to the scheduler to determine whether a new thread is needed. And as I see it, a call to an awaitable function may have internal details that increase the chances of the scheduler utilizing another thread, simply because more work means more opportunities / reasons for the scheduler to divvy out work.
WinRT async operations automatically happen on the thread pool. And typically you will be calling FROM the thread pool, except for UI-thread work (XAML/input/events).
Async operations started on XAML/UI threads have their results delivered back to the [calling] UI thread. But the results of asynchronous operations started from a thread pool thread are delivered wherever the completion happens, which may not be the same thread you were on before. The reasons behind this are that code written for the thread pool is likely to be written to be thread-safe, and that it is more efficient: Windows doesn't have to negotiate that thread switch.
So again, in answer to the OP, new threads are not necessarily created but your application can and will use multiple threads to complete asynchronous work.
I know this seems to contradict some of the literature regarding async/await, but that's because, although the async/await construct is not by itself multithreaded, awaitables are one of the mechanisms by which the scheduler can divide work and dispatch calls across threads.
This is at the limit of my knowledge right now regarding async and threading, so I might not have it exactly right, but I do think it's important to see the relationship between awaitables and threading.
Using async/await doesn't necessarily cause a new thread to be created. But the use of async/await can lead to a new thread being created, because the awaitable function may internally spawn one. And it often does, making the statement 'No, it doesn't spawn threads' almost useless in practice. For example, the following code spawns new threads.
VisualProcessor.Ctor()
{
    ...
    BuildAsync();
}

async void BuildAsync()
{
    ...
    TextureArray dudeTextures = await TextureArray.FromFilesAsync(…);
}

public static async Task<TextureArray> FromFilesAsync(...)
{
    Debug.WriteLine("TextureArray.FromFilesAsync() T1 : Thread Id = " + GetCurrentThreadId());
    List<StorageFile> files = new List<StorageFile>();
    foreach (string path in paths)
    {
        if (path != null)
            files.Add(await Package.Current.InstalledLocation.GetFileAsync(path)); // << new threads
        else
            files.Add(null);
    }
    Debug.WriteLine("TextureArray.FromFilesAsync() T2 : Thread Id = " + GetCurrentThreadId());
    ...
}
In the case of the Java Spring Framework, a method annotated with @Async runs in a separate thread. Quoting from the official guide (https://spring.io/guides/gs/async-method):
The findUser method is flagged with Spring's @Async annotation, indicating that it should run on a separate thread. The method's return type is CompletableFuture<User> instead of User, a requirement for any asynchronous service.
Of course, in the back end it uses a thread pool and a queue (where async tasks wait for a thread to come back to the pool).
Important: for anyone researching this difficult topic in Unity specifically, be sure to see another question I asked, which raised related key issues:
In Unity specifically, "where" does an await literally return to?
For C# experts, Unity is single-threaded.[1]
It's common to do calculations and such on another thread.
When you do something on another thread, you often use async/await since, uh, all the good C# programmers say that's the easy way to do that!
void TankExplodes() {
    ShowExplosion();     // ordinary Unity thread
    SoundEffects();      // ordinary Unity thread
    SendExplosionInfo(); // it goes to another thread. let's use 'async/await'
}
using System.Net.WebSockets;

async void SendExplosionInfo() {
    cws = new ClientWebSocket();
    try {
        await cws.ConnectAsync(u, CancellationToken.None);
        ...
        Scene.NewsFromServer("done!"); // class function to go back to main thread
    }
    catch (Exception e) { ... }
}
OK, so when you do this, you do everything "just as you normally do" when you launch a thread in a more conventional way in Unity/C# (so using Thread or whatever or letting a native plugin do it or the OS or whatever the case may be).
Everything works out great.
As a lame Unity programmer who only knows enough C# to get to the end of the day, I have always assumed that the async/await pattern above literally launches another thread.
In fact, does the code above literally launch another thread, or does C#/.NET use some other approach to achieve tasks when you use the natty async/await pattern?
Maybe it works differently or specifically in the Unity engine from "using C# generally"? (IDK?)
Note that in Unity, whether or not it is a thread drastically affects how you have to handle the next steps. Hence the question.
Issue: I realize there's lots of discussion about "is await a thread", but, (1) I have never seen this discussed / answered in the Unity setting (does it make any difference? IDK?) (2) I simply have never seen a clear answer!
[1] Some ancillary calculations (e.g., physics) are done on other threads, but the actual "frame based game engine" is one pure thread. (It's impossible to "access" the main engine frame thread in any way whatsoever: when programming, say, a native plugin or some calculation on another thread, you just leave markers and values for the components on the engine frame thread to look at and use when they run each frame.)
This reading: Tasks are (still) not threads and async is not parallel might help you understand what's going on under the hood.
In short in order for your task to run on a separate thread you need to call
Task.Run(() => { /* the work to be done on a separate thread */ });
Then you can await that task wherever needed.
To answer your question
"In fact, does the code above literally launch another thread, or does
c#/.Net use some other approach to achieve tasks when you use the
natty async/wait pattern?"
No - it doesn't.
If you did
await Task.Run(()=> cws.ConnectAsync(u, CancellationToken.None));
Then cws.ConnectAsync(u, CancellationToken.None) would run on a separate thread.
In answer to the comment, here is the code modified with more explanations:
async void SendExplosionInfo() {
    cws = new ClientWebSocket();
    try {
        var myConnectTask = Task.Run(() => cws.ConnectAsync(u, CancellationToken.None));
        // more code running...
        await myConnectTask; // here's where it will actually stop to wait for the completion of your task.
        Scene.NewsFromServer("done!"); // class function to go back to main thread
    }
    catch (Exception e) { ... }
}
You might not need it on a separate thread, though, because the async work you're doing is not CPU-bound (or so it seems). Thus you should be fine with:
async void SendExplosionInfo() {
    cws = new ClientWebSocket();
    try {
        var myConnectTask = cws.ConnectAsync(u, CancellationToken.None);
        // more code running...
        await myConnectTask; // here's where it will actually stop to wait for the completion of your task.
        Scene.NewsFromServer("done!"); // continue from here
    }
    catch (Exception e) { ... }
}
Sequentially it will do exactly the same thing as the code above, but on the same thread. It allows the code after ConnectAsync to start executing and only stops to wait for the completion of ConnectAsync at the await. Since ConnectAsync is not CPU-bound (the work happens somewhere else, i.e. on the network, which makes it somewhat parallel in that sense), you will have enough juice to run your tasks, unless your code in "..." also requires a lot of CPU-bound work that you'd rather run in parallel.
Also, you might want to avoid using async void; it is there only for top-level (event-handler style) functions. Try using async Task in your method signature. You can read more on this here.
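A minimal sketch of the same method with an async Task signature (the Async suffix on the name is just a convention I'm adding; cws, u and Scene come from the snippets above), so the caller can await it and observe exceptions:

async Task SendExplosionInfoAsync() {
    cws = new ClientWebSocket();
    try {
        await cws.ConnectAsync(u, CancellationToken.None);
        Scene.NewsFromServer("done!");
    }
    catch (Exception e) { /* handle or log */ }
}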
No, async/await does not mean another thread. It can start another thread, but it doesn't have to.
Here you can find quite interesting post about it: https://blogs.msdn.microsoft.com/benwilli/2015/09/10/tasks-are-still-not-threads-and-async-is-not-parallel/
Important notice
First of all, there's an issue with your question's first statement.
Unity is single-threaded
Unity is not single-threaded; in fact, Unity is a multi-threaded environment. Why? Just go to the official Unity web page and read there:
High-performance multithreaded system: Fully utilize the multicore processors available today (and tomorrow), without heavy programming. Our new foundation for enabling high-performance is made up of three sub-systems: the C# Job System, which gives you a safe and easy sandbox for writing parallel code; the Entity Component System (ECS), a model for writing high-performance code by default, and the Burst Compiler, which produces highly-optimized native code.
The Unity 3D engine uses a .NET Runtime called "Mono" which is multi-threaded by its nature. For some platforms, the managed code will be transformed into native code, so there will be no .NET Runtime. But the code itself will be multi-threaded anyway.
So please, don't state misleading and technically incorrect facts.
What you're arguing with, is simply a statement that there is a main thread in Unity which processes the core workload in a frame-based way. This is true. But it isn't something new and unique! E.g. a WPF application running on .NET Framework (or .NET Core starting with 3.0) has a main thread too (often called the UI thread), and the workload is processed on that thread in a frame-based way using the WPF Dispatcher (dispatcher queue, operations, frames etc.) But all this doesn't make the environment single-threaded! It's just a way to handle the application's logic.
An answer to your question
Please note: my answer only applies to such Unity instances that run a .NET Runtime environment (Mono). For those instances that convert the managed C# code into native C++ code and build/run native binaries, my answer is most probably at least inaccurate.
You write:
When you do something on another thread, you often use async/await since, uh, all the good C# programmers say that's the easy way to do that!
The async and await keywords in C# are just a way to use the TAP (Task-Asynchronous Pattern).
The TAP is used for arbitrary asynchronous operations. Generally speaking, there is no thread. I strongly recommend reading Stephen Cleary's article called "There is no thread". (Stephen Cleary is a renowned asynchronous programming guru, if you don't know.)
The primary cause for using the async/await feature is an asynchronous operation. You use async/await not because "you do something on another thread", but because you have an asynchronous operation you have to wait for. Whether or not there is a background thread this operation will run on does not matter to you (well, almost; see below). The TAP is an abstraction level that hides these details.
In fact, does the code above literally launch another thread, or does C#/.NET use some other approach to achieve tasks when you use the natty async/await pattern?
The correct answer is: it depends.
if ClientWebSocket.ConnectAsync throws an argument validation exception right away (e.g. an ArgumentNullException when uri is null), no new thread will be started
if the code in that method completes very quickly, the result of the method will be available synchronously, no new thread will be started
if the implementation of the ClientWebSocket.ConnectAsync method is a pure asynchronous operation with no threads involved, your calling method will be "suspended" (due to await) - so no new thread will be started
if the method implementation involves threads and the current TaskScheduler is able to schedule this work item on a running thread pool thread, no new thread will be started; instead, the work item will be queued on an already running thread pool thread
if all thread pool threads are already busy, the runtime might spawn new threads depending on its configuration and current system state, so yes - a new thread might be started and the work item will be queued on that new thread
You see, this is pretty much complex. But that's exactly the reason why the TAP pattern and the async/await keyword pair were introduced into C#. These are usually the things a developer doesn't want to bother with, so let's hide this stuff in the runtime/framework.
@agfc states something that is not quite correct:
"This won't run the method on a background thread"
await cws.ConnectAsync(u, CancellationToken.None);
"But this will"
await Task.Run(()=> cws.ConnectAsync(u, CancellationToken.None));
If ConnectAsync's synchronous part implementation is tiny, the task scheduler might run that part synchronously in both cases. So these both snippets might be exactly the same depending on the called method implementation.
Note that ConnectAsync has an Async suffix and returns a Task. This is convention-based information that the method is truly asynchronous. In such cases, you should always prefer await MethodAsync() over await Task.Run(() => MethodAsync()).
Further interesting reading:
await vs await Task.Run
return Task.Run vs await Task.Run
I don't like answering my own question, but as it turns out none of the answers here is totally correct. (However many/all of the answers here are hugely useful in different ways).
In fact, the actual answer can be stated in a nutshell:
On which thread the execution resumes after an await is controlled by SynchronizationContext.Current.
That's it.
Thus in any particular version of Unity (and note that, as of writing 2019, they are drastically changing Unity - https://unity.com/dots) - or indeed any C#/.Net environment at all - the question on this page can be answered properly.
The full information emerged at this follow-up QA:
https://stackoverflow.com/a/55614146/294884
The code after an await can continue on another thread-pool thread. This can have consequences when dealing with non-thread-safe references in a method, such as Unity objects, EF's DbContext, and many other classes, including your own custom code.
Take the following example:
[Test]
public async Task TestAsync()
{
    using (var context = new TestDbContext())
    {
        Console.WriteLine("Thread Before Async: " + Thread.CurrentThread.ManagedThreadId.ToString());
        var names = context.Customers.Select(x => x.Name).ToListAsync();
        Console.WriteLine("Thread Before Await: " + Thread.CurrentThread.ManagedThreadId.ToString());
        var result = await names;
        Console.WriteLine("Thread After Await: " + Thread.CurrentThread.ManagedThreadId.ToString());
    }
}
The output:
------ Test started: Assembly: EFTest.dll ------
Thread Before Async: 29
Thread Before Await: 29
Thread After Await: 12
1 passed, 0 failed, 0 skipped, took 3.45 seconds (NUnit 3.10.1).
Note that the code before and after the ToListAsync is running on the same thread. So prior to awaiting the result we can continue processing, though the results of the async operation will not be available yet, just the Task that is created (which can be awaited, cancelled, etc.).
Once we put an await in, the code following will be effectively split off as a continuation, and will/can come back on a different thread.
This applies when awaiting for an async operation in-line:
[Test]
public async Task TestAsync2()
{
    using (var context = new TestDbContext())
    {
        Console.WriteLine("Thread Before Async/Await: " + Thread.CurrentThread.ManagedThreadId.ToString());
        var names = await context.Customers.Select(x => x.Name).ToListAsync();
        Console.WriteLine("Thread After Async/Await: " + Thread.CurrentThread.ManagedThreadId.ToString());
    }
}
Output:
------ Test started: Assembly: EFTest.dll ------
Thread Before Async/Await: 6
Thread After Async/Await: 33
1 passed, 0 failed, 0 skipped, took 4.38 seconds (NUnit 3.10.1).
Again, the code after the await is executed on another thread from the original.
If you want to ensure that the code calling async code remains on the same thread then you need to use the Result on the Task to block the thread until the async task completes:
[Test]
public void TestAsync3()
{
    using (var context = new TestDbContext())
    {
        Console.WriteLine("Thread Before Async: " + Thread.CurrentThread.ManagedThreadId.ToString());
        var names = context.Customers.Select(x => x.Name).ToListAsync();
        Console.WriteLine("Thread After Async: " + Thread.CurrentThread.ManagedThreadId.ToString());
        var result = names.Result;
        Console.WriteLine("Thread After Result: " + Thread.CurrentThread.ManagedThreadId.ToString());
    }
}
Output:
------ Test started: Assembly: EFTest.dll ------
Thread Before Async: 20
Thread After Async: 20
Thread After Result: 20
1 passed, 0 failed, 0 skipped, took 4.16 seconds (NUnit 3.10.1).
So as far as Unity, EF, etc. goes, you should be cautious about using async liberally where these classes are not thread safe. For instance the following code may lead to unexpected behaviour:
using (var context = new TestDbContext())
{
    var ids = await context.Customers.Select(x => x.CustomerId).ToListAsync();
    foreach (var id in ids)
    {
        var orders = await context.Orders.Where(x => x.CustomerId == id).ToListAsync();
        // do stuff with orders.
    }
}
As far as the code goes this looks fine, but DbContext is not thread-safe, and the single DbContext reference may be used from a different thread when it is queried for Orders, because of the await on the initial Customer load.
Use async when there is a significant benefit to it over synchronous calls, and you're sure the continuation will access only thread-safe code.
I'm amazed there is no mention of ConfigureAwait in this thread. I was looking for a confirmation that Unity did async/await the same way that it is done for "regular" C# and from what I see above it seems to be the case.
The thing is, by default, an awaited task will resume in the same threading context after completion. If you await on the main thread, it will resume on the main thread. If you await on a thread from the ThreadPool, it will use any available thread from the thread pool. You can always create different contexts for different purposes, like a DB access context and whatnot.
This is where ConfigureAwait is interesting. If you chain a call to ConfigureAwait(false) after your await, you are telling the runtime that you do not need to resume in the same context and therefore it will resume on a thread from the ThreadPool. Omitting a call to ConfigureAwait plays it safe and will resume in the same context (main thread, DB thread, ThreadPool context, whatever the context the caller was on).
So, starting on the main thread, you can await and resume in the main thread like so:
// Main thread
await WhateverTaskAync();
// Main thread
or go to the thread pool like so:
// Main thread
await WhateverTaskAsync().ConfigureAwait(false);
// Pool thread
Likewise, starting from a thread in the pool:
// Pool thread
await WhateverTaskAsync();
// Pool thread
is equivalent to :
// Pool thread
await WhateverTaskAsync().ConfigureAwait(false);
// Pool thread
To go back to the main thread, you would use an API that transfers to the main thread:
// Pool thread
await WhateverTaskAsync().ConfigureAwait(false);
// Pool thread
RunOnMainThread(() =>
{
    // Main thread
    NextStep();
});
// Pool thread, possibly before NextStep() is run unless RunOnMainThread is synchronous (which it normally isn't)
This is why people say that calling Task.Run runs code on a pool thread. The await is superfluous though...
// Main Thread
await Task.Run(() =>
{
    // Pool thread
    WhateverTaskAsync();
    // Pool thread before WhateverTaskAsync completes because it is not awaited
});
// Main Thread before WhateverTaskAsync completes because it is not awaited
Now, calling ConfigureAwait(false) does not guarantee that the code inside the Async method is called in a separate thread. It only states that when returning from the await, you have no guarantee of being in the same threading context.
If your Async method looks like this:
private async Task WhateverTaskAsync()
{
    int blahblah = 0;
    for (int i = 0; i < 100000000; ++i)
    {
        blahblah += i;
    }
}
... because there is actually no await inside the Async method, you will get a compiler warning and the whole method will run synchronously within the calling context. Whatever its ConfigureAwait state, the code awaiting it might then resume on the same or a different context. If you want the method to run on a pool thread, you would instead write the Async method as such:
private Task WhateverTaskAsync()
{
    return Task.Run(() =>
    {
        int blahblah = 0;
        for (int i = 0; i < 100000000; ++i)
        {
            blahblah += i;
        }
    });
}
Hopefully, that clears up some things for others.
I am starting a new task from a function but I would not want it to run on the same thread. I don't care which thread it runs on as long as it is a different one (so the information given in this question does not help).
Am I guaranteed that the below code will always exit TestLock before allowing Task t to enter it again? If not, what is the recommended design pattern to prevent re-entrancy?
object TestLock = new object();

public void Test(bool stop = false) {
    Task t;
    lock (this.TestLock) {
        if (stop) return;
        t = Task.Factory.StartNew(() => { this.Test(stop: true); });
    }
    t.Wait();
}
Edit: Based on the below answer by Jon Skeet and Stephen Toub, a simple way to deterministically prevent reentrancy would be to pass a CancellationToken, as illustrated in this extension method:
public static Task StartNewOnDifferentThread(this TaskFactory taskFactory, Action action)
{
    return taskFactory.StartNew(action: action, cancellationToken: new CancellationToken());
}
I mailed Stephen Toub - a member of the PFX Team - about this question. He's come back to me really quickly, with a lot of detail - so I'll just copy and paste his text here. I haven't quoted it all, as reading a large amount of quoted text ends up getting less comfortable than vanilla black-on-white, but really, this is Stephen - I don't know this much stuff :) I've made this answer community wiki to reflect that all the goodness below isn't really my content:
If you call Wait() on a Task that's completed, there won't be any blocking (it'll just throw an exception if the task completed with a TaskStatus other than RanToCompletion, or otherwise return as a nop). If you call Wait() on a Task that's already executing, it must block as there’s nothing else it can reasonably do (when I say block, I'm including both true kernel-based waiting and spinning, as it'll typically do a mixture of both). Similarly, if you call Wait() on a Task that has the Created or WaitingForActivation status, it’ll block until the task has completed. None of those is the interesting case being discussed.
The interesting case is when you call Wait() on a Task in the WaitingToRun state, meaning that it’s previously been queued to a TaskScheduler but that TaskScheduler hasn't yet gotten around to actually running the Task's delegate yet. In that case, the call to Wait will ask the scheduler whether it's ok to run the Task then-and-there on the current thread, via a call to the scheduler's TryExecuteTaskInline method. This is called inlining. The scheduler can choose to either inline the task via a call to base.TryExecuteTask, or it can return 'false' to indicate that it is not executing the task (often this is done with logic like...
return SomeSchedulerSpecificCondition() ? false : TryExecuteTask(task);
The reason TryExecuteTask returns a Boolean is that it handles the synchronization to ensure a given Task is only ever executed once). So, if a scheduler wants to completely prohibit inlining of the Task during Wait, it can just be implemented as return false; If a scheduler wants to always allow inlining whenever possible, it can just be implemented as:
return TryExecuteTask(task);
In the current implementation (both .NET 4 and .NET 4.5, and I don’t personally expect this to change), the default scheduler that targets the ThreadPool allows for inlining if the current thread is a ThreadPool thread and if that thread was the one to have previously queued the task.
Note that there isn't arbitrary reentrancy here, in that the default scheduler won’t pump arbitrary threads when waiting for a task... it'll only allow that task to be inlined, and of course any inlining that task in turn decides to do. Also note that Wait won’t even ask the scheduler in certain conditions, instead preferring to block. For example, if you pass in a cancelable CancellationToken, or if you pass in a non-infinite timeout, it won’t try to inline because it could take an arbitrarily long amount of time to inline the task's execution, which is all or nothing, and that could end up significantly delaying the cancellation request or timeout. Overall, TPL tries to strike a decent balance here between wasting the thread that’s doing the Wait'ing and reusing that thread for too much. This kind of inlining is really important for recursive divide-and-conquer problems (e.g. QuickSort) where you spawn multiple tasks and then wait for them all to complete. If such were done without inlining, you’d very quickly deadlock as you exhaust all threads in the pool and any future ones it wanted to give to you.
Separate from Wait, it’s also (remotely) possible that the Task.Factory.StartNew call could end up executing the task then and there, iff the scheduler being used chose to run the task synchronously as part of the QueueTask call. None of the schedulers built into .NET will ever do this, and I personally think it would be a bad design for scheduler, but it’s theoretically possible, e.g.:
protected override void QueueTask(Task task, bool wasPreviouslyQueued)
{
    return TryExecuteTask(task);
}
The overload of Task.Factory.StartNew that doesn’t accept a TaskScheduler uses the scheduler from the TaskFactory, which in the case of Task.Factory targets TaskScheduler.Current. This means if you call Task.Factory.StartNew from within a Task queued to this mythical RunSynchronouslyTaskScheduler, it would also queue to RunSynchronouslyTaskScheduler, resulting in the StartNew call executing the Task synchronously. If you’re at all concerned about this (e.g. you’re implementing a library and you don’t know where you’re going to be called from), you can explicitly pass TaskScheduler.Default to the StartNew call, use Task.Run (which always goes to TaskScheduler.Default), or use a TaskFactory created to target TaskScheduler.Default.
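As a small sketch of that last point (not part of the quoted email; DoWork is a placeholder), the three options look like this:

// 1. Pass TaskScheduler.Default explicitly to StartNew.
Task t1 = Task.Factory.StartNew(
    () => DoWork(),
    CancellationToken.None,
    TaskCreationOptions.None,
    TaskScheduler.Default);

// 2. Task.Run always targets TaskScheduler.Default.
Task t2 = Task.Run(() => DoWork());

// 3. A TaskFactory created to target TaskScheduler.Default.
var factory = new TaskFactory(TaskScheduler.Default);
Task t3 = factory.StartNew(() => DoWork());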
EDIT: Okay, it looks like I was completely wrong, and a thread which is currently waiting on a task can be hijacked. Here's a simpler example of this happening:
using System;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApplication1 {
    class Program {
        static void Main() {
            for (int i = 0; i < 10; i++)
            {
                Task.Factory.StartNew(Launch).Wait();
            }
        }

        static void Launch()
        {
            Console.WriteLine("Launch thread: {0}",
                Thread.CurrentThread.ManagedThreadId);
            Task.Factory.StartNew(Nested).Wait();
        }

        static void Nested()
        {
            Console.WriteLine("Nested thread: {0}",
                Thread.CurrentThread.ManagedThreadId);
        }
    }
}
Sample output:
Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
As you can see, there are lots of times when the waiting thread is reused to execute the new task. This can happen even if the thread has acquired a lock. Nasty re-entrancy. I am suitably shocked and worried :(
Why not just design for it, rather than bend over backwards to ensure it doesn't happen?
The TPL is a red herring here; reentrancy can happen in any code provided you can create a cycle, and you don't know for sure what's going to happen 'south' of your stack frame. Synchronous reentrancy is the best outcome here - at least you can't self-deadlock yourself (as easily).
Locks manage cross thread synchronisation. They are orthogonal to managing reentrancy. Unless you are protecting a genuine single use resource (probably a physical device, in which case you should probably use a queue), why not just ensure your instance state is consistent so reentrancy can 'just work'.
(Side thought: are Semaphores reentrant without decrementing?)
You could easily test this by writing a quick app that shares a socket connection between threads / tasks.
The task would acquire a lock before sending a message down the socket and waiting for a response. Once this blocks and becomes idle (an I/O block), set another task in the same block to do the same. It should block on acquiring the lock; if it does not, and the second task is allowed to pass the lock because it is run by the same thread, then you have a problem.
The solution with new CancellationToken() proposed by Erwin did not work for me; inlining happened anyway.
So I ended up using another condition advised by Jon and Stephen ("... or if you pass in a non-infinite timeout ..."):
Task<TResult> task = Task.Run(func);
task.Wait(TimeSpan.FromHours(1)); // Whatever is enough for task to start
return task.Result;
Note: Omitting exception handling etc here for simplicity, you should mind those in production code.