I have two long running tasks that I want to execute, to be followed with a continuation. The problem I am having is that the continuation is running before the two long running tasks have signalled. Those two tasks are not erroring, they've barely started when the continuation commences.
The intent is that the first two tasks can run in parallel, but the continuation should only run once they've both finished.
I have probably overlooked something really simple, but near as I can tell I am following the patterns suggested in the question Running multiple async tasks and waiting for them all to complete. I realise I could probably dispense with the .ContinueWith() as the code following the Task.WhenAll() will implicitly become a continuation, but I prefer to keep the explicit continuation (unless it is the cause of my issue).
private async Task MyMethod()
{
var tasks = new List<Task>
{
new Task(async () => await DoLongRunningTask1()),
new Task(async () => await DoLongRunningTask2())
};
Parallel.ForEach(tasks, async task => task.Start());
await Task.WhenAll(tasks)
.ContinueWith(async t =>{
//another long running bit of code with some parallel
//execution based on the results of the first two tasks
}
, TaskContinuationOptions.OnlyOnRanToCompletion);
}
private async Task DoLongRunningTask1()
{
... long running operation ...
}
private async Task DoLongRunningTask2()
{
... another long running operation ...
}
I appreciate any help or suggestions!
Your code has multiple issues:
Creating tasks with the non-generic Task constructor, with async delegate. This constructor accepts an Action, not a Func<Task>, so your async delegate is actually an async void, which is something to avoid. You can fix this by creating nested Task<Task>s, where the outer task represents the launching of the inner task.
Using the Parallel.ForEach with async delegate. This again results in async void delegate, for the same reason. The correct API to use when you want to parallelize asynchronous delegates is the Parallel.ForEachAsync, available from .NET 6 and later.
Using parallelism to start tasks. Starting a task with Start is not CPU-bound. The Start doesn't execute the task, it just schedules it for execution, normally on the ThreadPool. So you can replace the Parallel.ForEach with a simple foreach loop or the List.ForEach.
Awaiting an unwrapped ContinueWith continuation that has an async delegate. This continuation returns a nested Task<Task>. You should Unwrap this nested task, otherwise you are awaiting the launching of the asynchronous operation, not its completion.
Here is your code "fixed":
private async Task MyMethod()
{
List<Task<Task>> tasksOfTasks = new()
{
new Task<Task>(async () => await DoLongRunningTask1()),
new Task<Task>(async () => await DoLongRunningTask2())
};
tasksOfTasks.ForEach(taskTask => taskTask.Start());
await Task.WhenAll(tasksOfTasks.Select(taskTask => taskTask.Unwrap()))
.ContinueWith(async t =>
{
//another long running bit of code with some parallel
//execution based on the results of the first two tasks
}, TaskContinuationOptions.OnlyOnRanToCompletion).Unwrap();
}
In case of failure in the DoLongRunningTask1 or the DoLongRunningTask2 methods, this code will most likely result in a TaskCanceledException, which is what happens when you pass the TaskContinuationOptions.OnlyOnRanToCompletion flag and the antecedent task does not run to completion successfully. This behavior is unlikely to be what you want.
The above code also violates the guideline CA2008: Do not create tasks without passing a TaskScheduler. This guideline says that being depended on the ambient TaskScheduler.Current is not a good idea, and specifying explicitly the TaskScheduler.Default is recommended. You'll have to specify it in both the Start and ContinueWith methods.
If you think that all this is overly complicated, you are right. That's not the kind of code that you want to write and maintain, unless you have spent months studying the Task Parallel Library, you know it in and out, and you have no other higher-level alternative. It's the kind of code that you might find in specialized libraries, not in application code.
You are overusing async. Below is more cleaner rewrite of what you have done above.
private async Task MyMethod()
{
var tasks = new Task[] {DoLongRunningTask1(), DoLongRunningTask2()];
await Task.WhenAll(tasks);
//another long running bit of code with some parallel
//execution based on the results of the first two tasks
}
private async Task DoLongRunningTask1()
{
... long running operation ...
}
private async Task DoLongRunningTask2()
{
... another long running operation ...
}
I assumed DoLongRunningTask's are async in nature, if they are syncronious then start them using Task.Run within a ThreadPool thread and use resulting Task.
Related
Ok, so basically I have a bunch of tasks (10) and I want to start them all at the same time and wait for them to complete. When completed I want to execute other tasks. I read a bunch of resources about this but I can't get it right for my particular case...
Here is what I currently have (code has been simplified):
public async Task RunTasks()
{
var tasks = new List<Task>
{
new Task(async () => await DoWork()),
//and so on with the other 9 similar tasks
}
Parallel.ForEach(tasks, task =>
{
task.Start();
});
Task.WhenAll(tasks).ContinueWith(done =>
{
//Run the other tasks
});
}
//This function perform some I/O operations
public async Task DoWork()
{
var results = await GetDataFromDatabaseAsync();
foreach (var result in results)
{
await ReadFromNetwork(result.Url);
}
}
So my problem is that when I'm waiting for tasks to complete with the WhenAll call, it tells me that all tasks are over even though none of them are completed. I tried adding Console.WriteLine in my foreach and when I have entered the continuation task, data keeps coming in from my previous Tasks that aren't really finished.
What am I doing wrong here?
You should almost never use the Task constructor directly. In your case that task only fires the actual task that you can't wait for.
You can simply call DoWork and get back a task, store it in a list and wait for all the tasks to complete. Meaning:
tasks.Add(DoWork());
// ...
await Task.WhenAll(tasks);
However, async methods run synchronously until the first await on an uncompleted task is reached. If you worry about that part taking too long then use Task.Run to offload it to another ThreadPool thread and then store that task in the list:
tasks.Add(Task.Run(() => DoWork()));
// ...
await Task.WhenAll(tasks);
If you want to run those task's parallel in different threads using TPL you may need something like this:
public async Task RunTasks()
{
var tasks = new List<Func<Task>>
{
DoWork,
//...
};
await Task.WhenAll(tasks.AsParallel().Select(async task => await task()));
//Run the other tasks
}
These approach parallelizing only small amount of code: the queueing of the method to the thread pool and the return of an uncompleted Task. Also for such small amount of task parallelizing can take more time than just running asynchronously. This could make sense only if your tasks do some longer (synchronous) work before their first await.
For most cases better way will be:
public async Task RunTasks()
{
await Task.WhenAll(new []
{
DoWork(),
//...
});
//Run the other tasks
}
To my opinion in your code:
You should not wrap your code in Task before passing to Parallel.ForEach.
You can just await Task.WhenAll instead of using ContinueWith.
Essentially you're mixing two incompatible async paradigms; i.e. Parallel.ForEach() and async-await.
For what you want, do one or the other. E.g. you can just use Parallel.For[Each]() and drop the async-await altogether. Parallel.For[Each]() will only return when all the parallel tasks are complete, and you can then move onto the other tasks.
The code has some other issues too:
you mark the method async but don't await in it (the await you do have is in the delegate, not the method);
you almost certainly want .ConfigureAwait(false) on your awaits, especially if you aren't trying to use the results immediately in a UI thread.
The DoWork method is an asynchronous I/O method. It means that you don't need multiple threads to execute several of them, as most of the time the method will asynchronously wait for the I/O to complete. One thread is enough to do that.
public async Task RunTasks()
{
var tasks = new List<Task>
{
DoWork(),
//and so on with the other 9 similar tasks
};
await Task.WhenAll(tasks);
//Run the other tasks
}
You should almost never use the Task constructor to create a new task. To create an asynchronous I/O task, simply call the async method. To create a task that will be executed on a thread pool thread, use Task.Run. You can read this article for a detailed explanation of Task.Run and other options of creating tasks.
Just also add a try-catch block around the Task.WhenAll
NB: An instance of System.AggregateException is thrown that acts as a wrapper around one or more exceptions that have occurred. This is important for methods that coordinate multiple tasks like Task.WaitAll() and Task.WaitAny() so the AggregateException is able to wrap all the exceptions within the running tasks that have occurred.
try
{
Task.WaitAll(tasks.ToArray());
}
catch(AggregateException ex)
{
foreach (Exception inner in ex.InnerExceptions)
{
Console.WriteLine(String.Format("Exception type {0} from {1}", inner.GetType(), inner.Source));
}
}
I don't quite understand the difference between Task.Wait and await.
I have something similar to the following functions in a ASP.NET WebAPI service:
public class TestController : ApiController
{
public static async Task<string> Foo()
{
await Task.Delay(1).ConfigureAwait(false);
return "";
}
public async static Task<string> Bar()
{
return await Foo();
}
public async static Task<string> Ros()
{
return await Bar();
}
// GET api/test
public IEnumerable<string> Get()
{
Task.WaitAll(Enumerable.Range(0, 10).Select(x => Ros()).ToArray());
return new string[] { "value1", "value2" }; // This will never execute
}
}
Where Get will deadlock.
What could cause this? Why doesn't this cause a problem when I use a blocking wait rather than await Task.Delay?
Wait and await - while similar conceptually - are actually completely different.
Wait will synchronously block until the task completes. So the current thread is literally blocked waiting for the task to complete. As a general rule, you should use "async all the way down"; that is, don't block on async code. On my blog, I go into the details of how blocking in asynchronous code causes deadlock.
await will asynchronously wait until the task completes. This means the current method is "paused" (its state is captured) and the method returns an incomplete task to its caller. Later, when the await expression completes, the remainder of the method is scheduled as a continuation.
You also mentioned a "cooperative block", by which I assume you mean a task that you're Waiting on may execute on the waiting thread. There are situations where this can happen, but it's an optimization. There are many situations where it can't happen, like if the task is for another scheduler, or if it's already started or if it's a non-code task (such as in your code example: Wait cannot execute the Delay task inline because there's no code for it).
You may find my async / await intro helpful.
Based on what I read from different sources:
An await expression does not block the thread on which it is executing. Instead, it causes the compiler to sign up the rest of the async method as a continuation on the awaited task. Control then returns to the caller of the async method. When the task completes, it invokes its continuation, and execution of the async method resumes where it left off.
To wait for a single task to complete, you can call its Task.Wait method. A call to the Wait method blocks the calling thread until the single class instance has completed execution. The parameterless Wait() method is used to wait unconditionally until a task completes. The task simulates work by calling the Thread.Sleep method to sleep for two seconds.
This article is also a good read.
Some important facts were not given in other answers:
async/await is more complex at CIL level and thus costs memory and CPU time.
Any task can be canceled if the waiting time is unacceptable.
In the case of async/await we do not have a handler for such a task to cancel it or monitoring it.
Using Task is more flexible than async/await.
Any sync functionality can by wrapped by async.
public async Task<ActionResult> DoAsync(long id)
{
return await Task.Run(() => { return DoSync(id); } );
}
async/await generate many problems. We do not know if await statement will be reached without runtime and context debugging. If first await is not reached, everything is blocked. Sometimes even when await seems to be reached, still everything is blocked:
https://github.com/dotnet/runtime/issues/36063
I do not see why I must live with the code duplication for sync and async method or using hacks.
Conclusion: Creating Tasks manually and controlling them is much better. Handler to Task gives more control. We can monitor Tasks and manage them:
https://github.com/lsmolinski/MonitoredQueueBackgroundWorkItem
Sorry for my english.
Ok, so basically I have a bunch of tasks (10) and I want to start them all at the same time and wait for them to complete. When completed I want to execute other tasks. I read a bunch of resources about this but I can't get it right for my particular case...
Here is what I currently have (code has been simplified):
public async Task RunTasks()
{
var tasks = new List<Task>
{
new Task(async () => await DoWork()),
//and so on with the other 9 similar tasks
}
Parallel.ForEach(tasks, task =>
{
task.Start();
});
Task.WhenAll(tasks).ContinueWith(done =>
{
//Run the other tasks
});
}
//This function perform some I/O operations
public async Task DoWork()
{
var results = await GetDataFromDatabaseAsync();
foreach (var result in results)
{
await ReadFromNetwork(result.Url);
}
}
So my problem is that when I'm waiting for tasks to complete with the WhenAll call, it tells me that all tasks are over even though none of them are completed. I tried adding Console.WriteLine in my foreach and when I have entered the continuation task, data keeps coming in from my previous Tasks that aren't really finished.
What am I doing wrong here?
You should almost never use the Task constructor directly. In your case that task only fires the actual task that you can't wait for.
You can simply call DoWork and get back a task, store it in a list and wait for all the tasks to complete. Meaning:
tasks.Add(DoWork());
// ...
await Task.WhenAll(tasks);
However, async methods run synchronously until the first await on an uncompleted task is reached. If you worry about that part taking too long then use Task.Run to offload it to another ThreadPool thread and then store that task in the list:
tasks.Add(Task.Run(() => DoWork()));
// ...
await Task.WhenAll(tasks);
If you want to run those task's parallel in different threads using TPL you may need something like this:
public async Task RunTasks()
{
var tasks = new List<Func<Task>>
{
DoWork,
//...
};
await Task.WhenAll(tasks.AsParallel().Select(async task => await task()));
//Run the other tasks
}
These approach parallelizing only small amount of code: the queueing of the method to the thread pool and the return of an uncompleted Task. Also for such small amount of task parallelizing can take more time than just running asynchronously. This could make sense only if your tasks do some longer (synchronous) work before their first await.
For most cases better way will be:
public async Task RunTasks()
{
await Task.WhenAll(new []
{
DoWork(),
//...
});
//Run the other tasks
}
To my opinion in your code:
You should not wrap your code in Task before passing to Parallel.ForEach.
You can just await Task.WhenAll instead of using ContinueWith.
Essentially you're mixing two incompatible async paradigms; i.e. Parallel.ForEach() and async-await.
For what you want, do one or the other. E.g. you can just use Parallel.For[Each]() and drop the async-await altogether. Parallel.For[Each]() will only return when all the parallel tasks are complete, and you can then move onto the other tasks.
The code has some other issues too:
you mark the method async but don't await in it (the await you do have is in the delegate, not the method);
you almost certainly want .ConfigureAwait(false) on your awaits, especially if you aren't trying to use the results immediately in a UI thread.
The DoWork method is an asynchronous I/O method. It means that you don't need multiple threads to execute several of them, as most of the time the method will asynchronously wait for the I/O to complete. One thread is enough to do that.
public async Task RunTasks()
{
var tasks = new List<Task>
{
DoWork(),
//and so on with the other 9 similar tasks
};
await Task.WhenAll(tasks);
//Run the other tasks
}
You should almost never use the Task constructor to create a new task. To create an asynchronous I/O task, simply call the async method. To create a task that will be executed on a thread pool thread, use Task.Run. You can read this article for a detailed explanation of Task.Run and other options of creating tasks.
Just also add a try-catch block around the Task.WhenAll
NB: An instance of System.AggregateException is thrown that acts as a wrapper around one or more exceptions that have occurred. This is important for methods that coordinate multiple tasks like Task.WaitAll() and Task.WaitAny() so the AggregateException is able to wrap all the exceptions within the running tasks that have occurred.
try
{
Task.WaitAll(tasks.ToArray());
}
catch(AggregateException ex)
{
foreach (Exception inner in ex.InnerExceptions)
{
Console.WriteLine(String.Format("Exception type {0} from {1}", inner.GetType(), inner.Source));
}
}
I just came across some code like:
var task = Task.Run(async () => { await Foo.StartAsync(); });
task.Wait();
(No, I don't know the inner-workings of Foo.StartAsync()). My initial reaction would be get rid of async/await and rewrite as:
var task = Foo.StartAsync();
task.Wait();
Would that be correct, or not (again, knowing nothing at all about Foo.StartAsync()). This answer to What difference does it make - running an 'async' action delegate with a Task.Run ... seems to indicate there may be cases when it might make sense, but it also says "To tell the truth, I haven't seen that many scenarios ..."
Normally, the intended usage for Task.Run is to execute CPU-bound code on a non-UI thread. As such, it would be quite rare for it to be used with an async delegate, but it is possible (e.g., for code that has both asynchronous and CPU-bound portions).
However, that's the intended usage. I think in your example:
var task = Task.Run(async () => { await Foo.StartAsync(); });
task.Wait();
It's far more likely that the original author is attempting to synchronously block on asynchronous code, and is (ab)using Task.Run to avoid deadlocks common in that situation (as I describe on my blog).
In essence, it looks like the "thread pool hack" that I describe in my article on brownfield asynchronous code.
The best solution is to not use Task.Run or Wait:
await Foo.StartAsync();
This will cause async to grow through your code base, which is the best approach, but may cause an unacceptable amount of work for your developers right now. This is presumably why your predecessor used Task.Run(..).Wait().
Mostly yes.
Using Task.Run like this is mostly used by people who don't understand how to execute an async method.
However, there is a difference. Using Task.Run means starting the async method on a ThreadPool thread.
This can be useful when the async method's synchronous part (the part before the first await) is substantial and the caller wants to make sure that method isn't blocking.
This can also be used to "get out of" the current context, for example where there isn't a SynchronizationContext.
It's worth noting that your method has to be marked async to be able to use the await keyword.
The code as written seems to be a workaround for running asynchronous code in a synchronous context. While I wouldn't say you should never ever do this, using async methods is preferable in almost every scenario.
// use this only when running Tasks in a synchronous method
// use async instead whenever possible
var task = Task.Run(async () => await Foo.StartAsync());
task.Wait();
Async methods, like your example of Foo.StartAsync(), should always return a Task object. This means that using Task.Run() to create another task is usually redundant in an async method. The task returned by the async method can simply be awaited by using the await keyword. The only reason you should use Task.Run() is when you are performing CPU-bound work that needs to be performed on a separate thread. The task returned by the async method can simply be awaited by using the await keyword. You can read more in depth details in Microsoft's guide to async programming.
In an async method, your code can be as simple as:
await Foo.StartAsync();
If you want to perform some other work while the task is running, you can assign the function to a variable and await the result (task completion) later.
For example:
var task = Foo.StartAsync();
// do some other work before waiting for task to finish
Bar();
Baz();
// now wait for the task to finish executing
await task;
With CPU-bound work that needs to be run on a separate thread, you can use Task.Run(), but you await the result instead of using the thread blocking Task.Wait():
var task = Task.Run(async () => await Foo.StartAsync());
// do some other work before waiting for task to finish
Bar();
Baz();
// now wait for the task to finish executing
await task;
I don't quite understand the difference between Task.Wait and await.
I have something similar to the following functions in a ASP.NET WebAPI service:
public class TestController : ApiController
{
public static async Task<string> Foo()
{
await Task.Delay(1).ConfigureAwait(false);
return "";
}
public async static Task<string> Bar()
{
return await Foo();
}
public async static Task<string> Ros()
{
return await Bar();
}
// GET api/test
public IEnumerable<string> Get()
{
Task.WaitAll(Enumerable.Range(0, 10).Select(x => Ros()).ToArray());
return new string[] { "value1", "value2" }; // This will never execute
}
}
Where Get will deadlock.
What could cause this? Why doesn't this cause a problem when I use a blocking wait rather than await Task.Delay?
Wait and await - while similar conceptually - are actually completely different.
Wait will synchronously block until the task completes. So the current thread is literally blocked waiting for the task to complete. As a general rule, you should use "async all the way down"; that is, don't block on async code. On my blog, I go into the details of how blocking in asynchronous code causes deadlock.
await will asynchronously wait until the task completes. This means the current method is "paused" (its state is captured) and the method returns an incomplete task to its caller. Later, when the await expression completes, the remainder of the method is scheduled as a continuation.
You also mentioned a "cooperative block", by which I assume you mean a task that you're Waiting on may execute on the waiting thread. There are situations where this can happen, but it's an optimization. There are many situations where it can't happen, like if the task is for another scheduler, or if it's already started or if it's a non-code task (such as in your code example: Wait cannot execute the Delay task inline because there's no code for it).
You may find my async / await intro helpful.
Based on what I read from different sources:
An await expression does not block the thread on which it is executing. Instead, it causes the compiler to sign up the rest of the async method as a continuation on the awaited task. Control then returns to the caller of the async method. When the task completes, it invokes its continuation, and execution of the async method resumes where it left off.
To wait for a single task to complete, you can call its Task.Wait method. A call to the Wait method blocks the calling thread until the single class instance has completed execution. The parameterless Wait() method is used to wait unconditionally until a task completes. The task simulates work by calling the Thread.Sleep method to sleep for two seconds.
This article is also a good read.
Some important facts were not given in other answers:
async/await is more complex at CIL level and thus costs memory and CPU time.
Any task can be canceled if the waiting time is unacceptable.
In the case of async/await we do not have a handler for such a task to cancel it or monitoring it.
Using Task is more flexible than async/await.
Any sync functionality can by wrapped by async.
public async Task<ActionResult> DoAsync(long id)
{
return await Task.Run(() => { return DoSync(id); } );
}
async/await generate many problems. We do not know if await statement will be reached without runtime and context debugging. If first await is not reached, everything is blocked. Sometimes even when await seems to be reached, still everything is blocked:
https://github.com/dotnet/runtime/issues/36063
I do not see why I must live with the code duplication for sync and async method or using hacks.
Conclusion: Creating Tasks manually and controlling them is much better. Handler to Task gives more control. We can monitor Tasks and manage them:
https://github.com/lsmolinski/MonitoredQueueBackgroundWorkItem
Sorry for my english.