async/await usage for CPU-bound computation vs I/O operations? - c#

I already know that async/await preserves the thread context and also handles exception forwarding, etc. (which helps a lot).
But consider the following example:
public async Task<int> ExampleMethodAsync()
{
    var httpClient = new HttpClient();

    // start the async task...
    Task<string> contentsTask = httpClient.GetStringAsync("http://msdn.microsoft.com");

    // wait for it and get the result...
    string contents = await contentsTask;

    // get the length...
    int exampleInt = contents.Length;

    // return the length...
    return exampleInt;
}
If the async method (httpClient.GetStringAsync) is an I/O operation (like in my sample above), then I gain these things:
Caller thread is not blocked
Worker thread is released because it is an I/O operation (I/O completion ports...) (GetStringAsync uses a TaskCompletionSource and does not open a new thread)
Preserved thread context
Exceptions are thrown back
But what if, instead of httpClient.GetStringAsync (an I/O operation), I have a Task for CalcFirstMillionsDigitsOf_PI_Async (a heavy compute-bound operation on a separate thread)?
It seems that the only things I gain here are:
Preserved thread context
Exceptions are thrown back
Caller thread is not blocked
But I still have another (parallel) thread which executes the operation, and the CPU is switching between the main thread and the operation.
Is my analysis correct?

Actually, you only get the second set of advantages in both cases. await doesn't start asynchronous execution of anything; it's simply a keyword that tells the compiler to generate code for handling completion, context, etc.
You can find a better explanation of this in '"Invoke the method with await"... ugh!' by Stephen Toub.
It's up to the asynchronous method itself to decide how it achieves the asynchronous execution:
Some methods will use a Task to run their code on a ThreadPool thread,
Some will use some IO-completion mechanism. There is even a special ThreadPool for that, which you can use with Tasks with a custom TaskScheduler
Some will wrap a TaskCompletionSource over another mechanism like events or callbacks.
In every case, it is the specific implementation that releases the thread (if one is used). The TaskScheduler releases the thread automatically when a Task finishes execution, so you get this functionality for cases #1 and #2 anyway.
What happens in case #3 for callbacks depends on how the callback is made. Most of the time the callback is made on a thread managed by some external library. In that case you have to process the callback quickly and return, to allow the library to reuse the thread.
EDIT
Using a decompiler, it's possible to see that GetStringAsync uses the third option: It creates a TaskCompletionSource that gets signalled when the operation finishes. Executing the operation is delegated to an HttpMessageHandler.
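As a rough illustration of that third pattern, here is a minimal sketch that wraps the older event-based WebClient API in a Task via TaskCompletionSource. This only demonstrates the general technique; it is not how HttpClient actually implements GetStringAsync:

using System;
using System.Net;
using System.Threading.Tasks;

static class DownloadSketch
{
    // Wraps an event/callback-based API in a Task by signalling a
    // TaskCompletionSource from the completion event. No thread is
    // blocked while the download is in flight.
    public static Task<string> DownloadStringTaskSketch(string url)
    {
        var tcs = new TaskCompletionSource<string>();
        var client = new WebClient();

        client.DownloadStringCompleted += (sender, e) =>
        {
            if (e.Error != null) tcs.TrySetException(e.Error);
            else if (e.Cancelled) tcs.TrySetCanceled();
            else tcs.TrySetResult(e.Result);

            client.Dispose();
        };

        // Starts the download; completion is reported through the event above.
        client.DownloadStringAsync(new Uri(url));
        return tcs.Task;
    }
}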

Your analysis is correct, though the wording on your second part makes it sound like async is creating a worker thread for you, which it is not.
In library code, you actually want to keep your synchronous methods synchronous. If you want to consume a synchronous method asynchronously (e.g., from a UI thread), then call it using await Task.Run(..)
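For example, a minimal sketch of that pattern (the form, the button handler and the pi method are placeholder names, and the computation itself is a stand-in):

using System;
using System.Linq;
using System.Threading.Tasks;
using System.Windows.Forms;

public partial class MainForm : Form
{
    // Library-style code: the CPU-bound method stays synchronous.
    // (Placeholder work; it does not really compute pi.)
    private static string CalcFirstMillionDigitsOfPi()
    {
        return string.Concat(Enumerable.Repeat("3.14159", 100000));
    }

    // UI code: offload the synchronous work to the thread pool with Task.Run
    // and await it, so the UI thread is not blocked while it runs.
    private async void calcButton_Click(object sender, EventArgs e)
    {
        string digits = await Task.Run(() => CalcFirstMillionDigitsOfPi());
        MessageBox.Show($"Computed {digits.Length} characters.");
    }
}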

Yes, you're correct. I cannot find any wrong statement in your question. Just the term "Preserved thread context" is unclear to me. Do you mean the "logical control flow"? In that case I'd agree.
Regarding the CPU-bound example: you'd normally not do it that way, because starting a CPU-based task and waiting for it increases overhead and decreases throughput. But this might be valid if you need the caller to be unblocked (in the case of a WinForms or WPF project, for example).


Is Async/Await using Task.Run starting a new thread asynchronously?

I have read a lot of articles and still can't understand this part.
Consider this code :
private async void button1_Click(object sender, EventArgs e)
{
    await Dosomething();
}

private async Task<string> Dosomething()
{
    await Task.Run(() => "Do Work");
    return "I am done";
}
First question:
When I click the button, it will call DoSomething and await a Task that creates a thread from the thread pool by calling Task.Run (if I am not mistaken), and all of this runs asynchronously. So I achieved creating a thread that does my work, but asynchronously? But suppose I don't need any result back; I just want the work to be done. Is there really a need to use async/await, and if so, how?
Second question:
When running a thread asynchronously, how does that work? Is it running on the main UI thread but asynchronously, or is it running on a separate thread that runs asynchronously inside that method?
The purpose of creating Async methods is so you can Await them later. Kind of like "I'm going to put this water on to boil, finish prepping the rest of my soup ingredients, and then come back to the pot and wait for the water to finish boiling so I can make dinner." You start the water boiling, which it does asynchronously while you do other things, but eventually you have to stop and wait for it. If what you want is to "fire-and-forget" then Async and Await are not necessary.
Simplest way to do a fire and forget method in C#?
Starting a new task queues that task for execution on a thread pool thread. Threads execute in the context of the process (e.g. the executable that runs your application). If this is a web application running under IIS, then that thread is created in the context of the IIS worker process. That thread executes separately from the main execution thread, so it goes off and does its thing regardless of what your main execution thread is doing, and at the same time your main execution thread moves on with its own work.
1
There's a big difference between not awaiting the Task and awaiting it:
If you don't await it: DoSomething is called, but the next statement is executed even though DoSomething's Task may not have completed yet.
If you await it: DoSomething is called, and the next statement is executed only once DoSomething's Task has completed.
So whether you need async/await depends on how you want to call DoSomething: if you don't await it, it's like calling it in a fire-and-forget way.
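A small sketch of the difference, reusing the DoSomething method from the question (the discard assignment just makes the fire-and-forget explicit):

private async void button1_Click(object sender, EventArgs e)
{
    // Fire & forget: the next line runs immediately, possibly before
    // DoSomething's task has completed, and its exceptions are not observed here.
    _ = DoSomething();
    Console.WriteLine("Runs right away");

    // Awaited: the next line runs only after DoSomething's task completes,
    // and any exception it threw is rethrown here.
    string result = await DoSomething();
    Console.WriteLine("Runs after DoSomething finished: " + result);
}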
2
Is it running on the main UI thread but asynchronously, or is it running
on a separate thread that runs asynchronously inside that
method?
Asynchronous code sometimes, but not always, means another thread (see this Q&A: Asynchronous vs Multithreading - Is there a difference?). Whether the code runs on a thread separate from the UI thread, or simply lets the UI thread keep processing until the awaited work resumes, the benefit is the same: the UI loop can keep updating the screen while other work is in progress, without freezing the UI.
An asynchronous method (i.e. an async method) is syntactic sugar that tells the compiler to treat await statements as part of a state machine. The C# compiler turns your async/await code into a state machine in which the code after an await executes once the awaited Task has completed.
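Conceptually (and only roughly; the real compiler-generated state machine is more involved), the await in DoSomething above behaves like a continuation attached to the task:

// Rough conceptual equivalent of DoSomething, expressed with ContinueWith.
// A sketch for intuition only, not the code the compiler actually emits.
private Task<string> DoSomethingWithContinuation()
{
    return Task.Run(() => "Do Work")
               .ContinueWith(antecedent => "I am done");
}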
Interesting Q&As
You might want to review these other Q&As:
Async/Await vs Threads
What's the difference between Task.Start/Wait and Async/Await?
async/await - when to return a Task vs void?
Is Async await keyword equivalent to a ContinueWith lambda?
OP said...
[...] But does this mean that "async/await" will fire off a thread and
Task.Run also fires off a thread or are they both the same thread?
Using async/await doesn't mean "I create a thread". It's just syntactic sugar to implement continuations in an elegant way. A Task may or may not involve a thread. For example, Task.FromResult(true) creates an already-completed task, which lets you implement an async method without requiring it to create a thread:
public Task<bool> SomeAsync()
{
    // This method decides for itself whether its code runs asynchronously
    // or synchronously, but the caller can await it either way!
    return Task.FromResult(true);
}
The type Task<TResult> requires you to return a TResult from your task. If you don't have anything to return, you can use Task instead (which, incidentally, is the base class of Task<TResult>).
But keep in mind that a task is not a thread. A task is a job to be done, while a thread is a worker. As your program runs, jobs and workers become available and unavailable. Behind the scenes, the library will assign your jobs to available workers and, because creating new workers is a costly operation, it will typically prefer to reuse the existing ones, through a thread pool.
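You can see that reuse by printing the pool thread IDs (a small sketch; the exact IDs and their pattern will vary from run to run):

using System;
using System.Threading;
using System.Threading.Tasks;

class ThreadReuseDemo
{
    static async Task Main()
    {
        for (int i = 0; i < 5; i++)
        {
            int job = i; // avoid closing over the loop variable
            // Each job typically runs on one of a few reused thread pool
            // threads rather than on a freshly created thread.
            await Task.Run(() => Console.WriteLine(
                $"Job {job} ran on pool thread {Thread.CurrentThread.ManagedThreadId}"));
        }
    }
}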

async/await. Where is continuation of awaitable part of method performed?

I am really curious how async/await keeps your program from being halted.
I really like the way Stephen Cleary explains async/await: "I like to think of 'await' as an 'asynchronous wait'. That is to say, the async method pauses until the awaitable is complete (so it waits), but the actual thread is not blocked (so it's asynchronous)."
I've read that an async method runs synchronously until it reaches an await keyword. If the awaitable has not completed yet, the continuation is queued and control yields to the method that called AccessTheWebAsync. OK.
Inside the caller (the event handler in this example), the processing pattern continues. The caller might do other work that doesn't depend on the result from AccessTheWebAsync before awaiting that result, or the caller might await immediately. The event handler is waiting for AccessTheWebAsync, and AccessTheWebAsync is waiting for GetStringAsync. Let's see an msdn example:
async Task<int> AccessTheWebAsync()
{
    // You need to add a reference to System.Net.Http to declare client.
    HttpClient client = new HttpClient();

    // GetStringAsync returns a Task<string>. That means that when you await the
    // task you'll get a string (urlContents).
    Task<string> getStringTask = client.GetStringAsync("http://msdn.microsoft.com");

    // You can do work here that doesn't rely on the string from GetStringAsync.
    DoIndependentWork();

    // The await operator suspends AccessTheWebAsync.
    //  - AccessTheWebAsync can't continue until getStringTask is complete.
    //  - Meanwhile, control returns to the caller of AccessTheWebAsync.
    //  - Control resumes here when getStringTask is complete.
    //  - The await operator then retrieves the string result from getStringTask.
    string urlContents = await getStringTask;

    // The return statement specifies an integer result.
    // Any methods that are awaiting AccessTheWebAsync retrieve the length value.
    return urlContents.Length;
}
Another article from the msdn blog says that async/await does not create a new thread or use other threads from the thread pool. OK.
My questions:
Where does async/await execute the awaitable code (in our example, downloading a web site), given that control yields to the next line of our program and the program just asks for the result of Task<string> getStringTask? We know that no new threads are created and the thread pool is not used.
Am I right in my naive assumption that the CLR just switches between the currently executing code and the awaitable part of the method within the scope of one thread? In that case changing the order of the addends does not change the sum, and the UI might be blocked for some unnoticeable time.
Where does async/await execute the awaitable code (in our example, downloading a web site), given that control yields to the next line of our program and the program just asks for the result of Task<string> getStringTask? We know that no new threads are created and the thread pool is not used.
If the operation is truly asynchronous, then there's no code to "execute". You can think of it as all being handled via callbacks; the HTTP request is sent (synchronously) and then the HttpClient registers a callback that will complete the Task<string>. When the download completes, the callback is invoked, completing the task. It's a bit more complex than this, but that's the general idea.
I have a blog post that goes into more detail on how asynchronous operations can be threadless.
Am I right in my naive assumption that the CLR just switches between the currently executing code and the awaitable part of the method within the scope of one thread?
That's a partially true mental model, but it's incomplete. For one thing, when an async method resumes, its (former) call stack is not resumed along with it. So async/await are very different than fibers or co-routines, even though they can be used to accomplish similar things.
Instead of thinking of await as "switch to other code", think of it as "return an incomplete task". If the calling method also calls await, then it also returns an incomplete task, etc. Eventually, you'll either return an incomplete task to a framework (e.g., ASP.NET MVC/WebAPI/SignalR, or a unit test runner); or you'll have an async void method (e.g., UI event handler).
While the operation is in progress, you end up with a "stack" of task objects. Not a real stack, just a dependency tree. Each async method is represented by a task instance, and they're all waiting for that asynchronous operation to complete.
Where is continuation of awaitable part of method performed?
When awaiting a task, await will - by default - resume its async method on a captured context. This context is SynchronizationContext.Current unless it is null, in which case it is TaskScheduler.Current. In practice, this means that an async method running on a UI thread will resume on that UI thread; an async method handling an ASP.NET request will resume handling that same ASP.NET request (possibly on a different thread); and in most other cases the async method will resume on a thread pool thread.
In the example code for your question, GetStringAsync will return an incomplete task. When the download completes, that task will complete. So, when AccessTheWebAsync calls await on that download task, (assuming the download hasn't already finished) it will capture its current context and then return an incomplete task from AccessTheWebAsync.
When the download task completes, the continuation of AccessTheWebAsync will be scheduled to that context (UI thread, ASP.NET request, thread pool, ...), and it will extract the Length of the result while executing in that context. When the AccessTheWebAsync method returns, it sets the result of the task previously returned from AccessTheWebAsync. This in turn will resume the next method, etc.
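To illustrate the captured-context behaviour described above, here is a small sketch for a UI application (loadButton_Click and resultTextBox are placeholder names; ConfigureAwait(false) opts out of resuming on the captured context):

private async void loadButton_Click(object sender, EventArgs e)
{
    var client = new HttpClient();

    // The continuation resumes on the captured UI context, so touching
    // UI controls here is safe.
    string page = await client.GetStringAsync("http://msdn.microsoft.com");
    resultTextBox.Text = page.Length.ToString();

    // With ConfigureAwait(false) the continuation may resume on a
    // thread pool thread instead; don't touch UI controls after it.
    string page2 = await client.GetStringAsync("http://msdn.microsoft.com")
                               .ConfigureAwait(false);
    Console.WriteLine(page2.Length);
}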
In general the continuation (the part of your method after await) can run anywhere. In practice it tends to run on the UI thread (e.g. in a Windows application) or on the thread pool (e.g. in an ASP .NET server). It can also run synchronously on the caller thread in some cases ... really it depends on what kind of API you're calling and what synchronization context is being used.
The blog article you linked does not say that continuations are not run on thread pool threads, it merely says that marking a method as async does not magically cause invocations of the method to run on a separate thread or on the thread pool.
That is, they're just trying to tell you that if you have a method void Foo() { Console.WriteLine(); }, changing that to async Task Foo() { Console.WriteLine(); } doesn't suddenly cause an invocation of Foo(); to behave any differently at all – it'll still be executed synchronously.
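In other words (a tiny sketch; the compiler will also warn about an async method that contains no await):

// Both of these run Console.WriteLine synchronously on the caller's thread;
// adding 'async' without any await changes nothing about where the body runs.
void Foo()            { Console.WriteLine("sync"); }
async Task FooAsync() { Console.WriteLine("still sync"); }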
If by "awaitable code" you mean the actual asynchronous operation, then you need to realize that it "executes" outside of the CPU so there's no thread needed and no code to run.
For example, when you download a web page, most of the operation happens while the request and response travel between your machine and the web server. There's no code of yours to execute while this happens. That's the reason you can "take over" the thread and do other stuff (other CPU operations) before awaiting the Task to get the actual result.
So to your questions:
It "executes" outside of the CPU (so it's not really executed). That could mean the network driver, a remote server, etc. (mostly I/O).
No. Truly asynchronous operations don't need to be executed by the CLR. They are only started and completed in the future.
A simple example is Task.Delay which creates a task that completes after an interval:
var delay = Task.Delay(TimeSpan.FromSeconds(30));
// do stuff
await delay;
Task.Delay internally creates and starts a System.Threading.Timer that will execute a callback after the interval and complete the task. System.Threading.Timer doesn't need a dedicated thread; it relies on the system clock, and its callback runs briefly on a thread pool thread when the interval elapses. So you have "awaitable code" that "executes" for 30 seconds, but nothing actually happens during that time. The operation is started and will complete 30 seconds in the future.
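A stripped-down sketch of that idea (simplified; the real Task.Delay also handles cancellation and the race between the callback and timer disposal):

using System;
using System.Threading;
using System.Threading.Tasks;

static class DelaySketch
{
    // Completes the returned task after 'interval' without blocking a thread:
    // the timer callback simply signals a TaskCompletionSource.
    public static Task Delay(TimeSpan interval)
    {
        var tcs = new TaskCompletionSource<bool>();
        Timer timer = null;
        timer = new Timer(_ =>
        {
            tcs.TrySetResult(true);
            timer.Dispose();
        }, null, interval, Timeout.InfiniteTimeSpan);
        return tcs.Task;
    }
}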

What really happens when you call an async method?

I am trying to understand why using the 'Async' methods is better than using the plain old synchronous way.
There is a small issue that I don't understand.
The synchronous way:
I have some thread that calls the method FileStream.Read(...).
Because this call is synchronous, the calling thread will wait until the IRP (I/O request packet) signals that the I/O request is finished.
Until the IRP returns, this thread is suspended (sleeping).
The asynchronous way:
I have some thread (a Task... let's call this thread 'ThreadAsync01') that calls the method FileStream.ReadAsync(...).
Because this call is asynchronous, the calling thread will not wait until the IRP (I/O request packet) signals that the I/O request is finished; the calling thread continues on to its next action.
Now, when the IRP signals that the I/O request is finished, what happens?
(The thread ThreadAsync01 is now doing something else and can't continue working with what FileStream.ReadAsync returns now.)
Will another thread continue the next action with the return value of ReadAsync?
What am I missing here?
The reason it bothers you is this mistaken assumption:
The thread ThreadAsync01 is now doing something else and can't continue
working with what FileStream.ReadAsync returns now.
In a typical application I/O is by far the most time-consuming task.
When TPL is used correctly, threads are not blocked by time-consuming operations. Instead, everything time-consuming (in other words, any I/O) is delegated via await. So when your IRP signals, the thread will either be free from work, or will be free very soon.
If there's some heavy calculation (something time-consuming which is not I/O), you need to plan accordingly, for example run it on a dedicated thread.
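For example (a sketch; fileStream, buffer and ComputeChecksum are placeholder names, and TaskCreationOptions.LongRunning is just a hint to the scheduler to use a dedicated thread):

// I/O-bound work: await it; no thread is blocked while it is in flight.
int bytesRead = await fileStream.ReadAsync(buffer, 0, buffer.Length);

// CPU-bound work: run it on a dedicated (typically non-pool) thread so it
// does not tie up a thread pool thread, then await the result.
long checksum = await Task.Factory.StartNew(
    () => ComputeChecksum(buffer),
    CancellationToken.None,
    TaskCreationOptions.LongRunning,
    TaskScheduler.Default);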
The function ReadAsync immediately returns a value, namely a Task object. Somewhere you should do something with the return value. The canonical way is to use await:
await FileStream.ReadAsync(...)
This ensures that the calling site does not continue with its work until ReadAsync has completed its job. If you want to do something in the meantime, you can await the task object later, or you can deal with the task object manually.
If you just call ReadAsync and ignore the returned task object, doing nothing with it, then your read is mostly an expensive no-op.
When a ***Async method returns a Task or Task<TResult>, you use this to track the running of the asynchronous operation. You can make the call behave synchronously with respect to the calling code by calling .Wait() on the task. Alternatively, as of .NET 4.5, you can await the task.
e.g.:
private async void DoFileRead(...)
{
    var result = await fileStream.ReadAsync(...);
    // Do follow-on tasks
}
In this scenario, any follow-on code is wrapped in a continuation by the compiler and executed when the async call completes. One requirement of using the await keyword is that the containing method be marked with the async keyword (see the example above).
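A more complete, self-contained sketch of that pattern (the file path is a placeholder; useAsync: true asks for overlapped, IRP-based I/O):

using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class AsyncReadDemo
{
    static async Task Main()
    {
        // useAsync: true opens the file for overlapped I/O, so ReadAsync can
        // complete via an IRP / I/O completion port instead of a blocked thread.
        using (var fs = new FileStream(@"C:\temp\example.txt", FileMode.Open,
                                       FileAccess.Read, FileShare.Read,
                                       bufferSize: 4096, useAsync: true))
        {
            var buffer = new byte[fs.Length];
            int bytesRead = await fs.ReadAsync(buffer, 0, buffer.Length);

            // Follow-on code: runs as the compiler-generated continuation
            // after the read completes.
            Console.WriteLine($"Read {bytesRead} bytes: " +
                Encoding.UTF8.GetString(buffer, 0, bytesRead));
        }
    }
}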

Is Task.Factory.StartNew() guaranteed to use another thread than the calling thread?

I am starting a new task from a function but I would not want it to run on the same thread. I don't care which thread it runs on as long as it is a different one (so the information given in this question does not help).
Am I guaranteed that the below code will always exit TestLock before allowing Task t to enter it again? If not, what is the recommended design pattern to prevent re-entrancy?
object TestLock = new object();
public void Test(bool stop = false) {
    Task t;
    lock (this.TestLock) {
        if (stop) return;
        t = Task.Factory.StartNew(() => { this.Test(stop: true); });
    }
    t.Wait();
}
Edit: Based on the below answer by Jon Skeet and Stephen Toub, a simple way to deterministically prevent reentrancy would be to pass a CancellationToken, as illustrated in this extension method:
public static Task StartNewOnDifferentThread(this TaskFactory taskFactory, Action action)
{
    return taskFactory.StartNew(action: action, cancellationToken: new CancellationToken());
}
I mailed Stephen Toub - a member of the PFX Team - about this question. He's come back to me really quickly, with a lot of detail - so I'll just copy and paste his text here. I haven't quoted it all, as reading a large amount of quoted text ends up getting less comfortable than vanilla black-on-white, but really, this is Stephen - I don't know this much stuff :) I've made this answer community wiki to reflect that all the goodness below isn't really my content:
If you call Wait() on a Task that's completed, there won't be any blocking (it'll just throw an exception if the task completed with a TaskStatus other than RanToCompletion, or otherwise return as a nop). If you call Wait() on a Task that's already executing, it must block as there’s nothing else it can reasonably do (when I say block, I'm including both true kernel-based waiting and spinning, as it'll typically do a mixture of both). Similarly, if you call Wait() on a Task that has the Created or WaitingForActivation status, it’ll block until the task has completed. None of those is the interesting case being discussed.
The interesting case is when you call Wait() on a Task in the WaitingToRun state, meaning that it’s previously been queued to a TaskScheduler but that TaskScheduler hasn't yet gotten around to actually running the Task's delegate yet. In that case, the call to Wait will ask the scheduler whether it's ok to run the Task then-and-there on the current thread, via a call to the scheduler's TryExecuteTaskInline method. This is called inlining. The scheduler can choose to either inline the task via a call to base.TryExecuteTask, or it can return 'false' to indicate that it is not executing the task (often this is done with logic like...
return SomeSchedulerSpecificCondition() ? false : TryExecuteTask(task);
The reason TryExecuteTask returns a Boolean is that it handles the synchronization to ensure a given Task is only ever executed once). So, if a scheduler wants to completely prohibit inlining of the Task during Wait, it can just be implemented as return false; If a scheduler wants to always allow inlining whenever possible, it can just be implemented as:
return TryExecuteTask(task);
In the current implementation (both .NET 4 and .NET 4.5, and I don’t personally expect this to change), the default scheduler that targets the ThreadPool allows for inlining if the current thread is a ThreadPool thread and if that thread was the one to have previously queued the task.
Note that there isn't arbitrary reentrancy here, in that the default scheduler won’t pump arbitrary threads when waiting for a task... it'll only allow that task to be inlined, and of course any inlining that task in turn decides to do. Also note that Wait won’t even ask the scheduler in certain conditions, instead preferring to block. For example, if you pass in a cancelable CancellationToken, or if you pass in a non-infinite timeout, it won’t try to inline because it could take an arbitrarily long amount of time to inline the task's execution, which is all or nothing, and that could end up significantly delaying the cancellation request or timeout. Overall, TPL tries to strike a decent balance here between wasting the thread that’s doing the Wait'ing and reusing that thread for too much. This kind of inlining is really important for recursive divide-and-conquer problems (e.g. QuickSort) where you spawn multiple tasks and then wait for them all to complete. If such were done without inlining, you’d very quickly deadlock as you exhaust all threads in the pool and any future ones it wanted to give to you.
Separate from Wait, it’s also (remotely) possible that the Task.Factory.StartNew call could end up executing the task then and there, iff the scheduler being used chose to run the task synchronously as part of the QueueTask call. None of the schedulers built into .NET will ever do this, and I personally think it would be a bad design for scheduler, but it’s theoretically possible, e.g.:
protected override void QueueTask(Task task)
{
    TryExecuteTask(task);
}
The overload of Task.Factory.StartNew that doesn’t accept a TaskScheduler uses the scheduler from the TaskFactory, which in the case of Task.Factory targets TaskScheduler.Current. This means if you call Task.Factory.StartNew from within a Task queued to this mythical RunSynchronouslyTaskScheduler, it would also queue to RunSynchronouslyTaskScheduler, resulting in the StartNew call executing the Task synchronously. If you’re at all concerned about this (e.g. you’re implementing a library and you don’t know where you’re going to be called from), you can explicitly pass TaskScheduler.Default to the StartNew call, use Task.Run (which always goes to TaskScheduler.Default), or use a TaskFactory created to target TaskScheduler.Default.
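In code, those three options look roughly like this (a sketch; DoWork is a placeholder for your delegate):

// 1. Explicitly target the default (thread pool) scheduler:
Task t1 = Task.Factory.StartNew(DoWork,
                                CancellationToken.None,
                                TaskCreationOptions.None,
                                TaskScheduler.Default);

// 2. Task.Run always uses TaskScheduler.Default:
Task t2 = Task.Run(() => DoWork());

// 3. Or keep a factory that is pinned to the default scheduler:
var defaultFactory = new TaskFactory(TaskScheduler.Default);
Task t3 = defaultFactory.StartNew(DoWork);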
EDIT: Okay, it looks like I was completely wrong, and a thread which is currently waiting on a task can be hijacked. Here's a simpler example of this happening:
using System;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main()
        {
            for (int i = 0; i < 10; i++)
            {
                Task.Factory.StartNew(Launch).Wait();
            }
        }

        static void Launch()
        {
            Console.WriteLine("Launch thread: {0}",
                Thread.CurrentThread.ManagedThreadId);
            Task.Factory.StartNew(Nested).Wait();
        }

        static void Nested()
        {
            Console.WriteLine("Nested thread: {0}",
                Thread.CurrentThread.ManagedThreadId);
        }
    }
}
Sample output:
Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
As you can see, there are lots of times when the waiting thread is reused to execute the new task. This can happen even if the thread has acquired a lock. Nasty re-entrancy. I am suitably shocked and worried :(
Why not just design for it, rather than bend over backwards to ensure it doesn't happen?
The TPL is a red herring here, reentrancy can happen in any code provided you can create a cycle, and you don't know for sure what's going to happen 'south' of your stack frame. Synchronous reentrancy is the best outcome here - at least you can't self-deadlock yourself (as easily).
Locks manage cross thread synchronisation. They are orthogonal to managing reentrancy. Unless you are protecting a genuine single use resource (probably a physical device, in which case you should probably use a queue), why not just ensure your instance state is consistent so reentrancy can 'just work'.
(Side thought: are Semaphores reentrant without decrementing?)
You could easily test this by writing a quick app that shares a socket connection between threads/tasks.
The task would acquire a lock before sending a message down the socket and waiting for a response. Once this blocks and becomes idle (I/O block), set another task in the same block to do the same. It should block on acquiring the lock; if it does not, and the second task is allowed past the lock because it is run by the same thread, then you have a problem.
The solution with new CancellationToken() proposed by Erwin did not work for me; inlining happened anyway.
So I ended up using the other condition mentioned by Jon and Stephen
("... or if you pass in a non-infinite timeout ..."):
Task<TResult> task = Task.Run(func);
task.Wait(TimeSpan.FromHours(1)); // Whatever is enough for task to start
return task.Result;
Note: exception handling, etc. is omitted here for simplicity; you should mind those in production code.
