Usage of Task.WhenAll with infinite Tasks produced by BlockingCollection - c#

I am adding Background Tasks to a Blocking Collection (added in the Background).
I am waiting with Task.WhenAll on a Enumerable returned by GetConsumingEnumerable.
My question is: Is the overload of Task.WhenAll which receives an IEnumerable "prepared" to potentially receive an endless amount of tasks?
I am simply not sure if i can do it this way or if it was meant to be used this way?
private async Task RunAsync(TimeSpan delay, CancellationToken cancellationToken)
{
using (BlockingCollection<Task> jobcollection = new BlockingCollection<Task>())
{
Task addingTask = Task.Run(async () =>
{
while (true)
{
DateTime utcNow = DateTime.UtcNow;
var jobs = Repository.GetAllJobs();
foreach (var job in GetRootJobsDue(jobs, utcNow))
{
jobcollection.Add(Task.Run(() => RunJob(job, jobs, cancellationToken, utcNow), cancellationToken), cancellationToken);
}
await Task.Delay(delay, cancellationToken);
}
}, cancellationToken);
await Task.WhenAll(jobcollection.GetConsumingEnumerable(cancellationToken));
}
}

Since your goal is merely to wait until the cancellation token is cancelled, you should do that. For reasons others have explained, using WhenAll on an infinite sequence of tasks isn't the way to go about that. There are easier ways to get a task that will never complete.
await new TaskCompletionSource<bool>().Task
.ContinueWith(t => { }, cancellationToken);

Task.WhenAll will not work with an infinite number of tasks. It will first (synchronously) wait for the enumerable to finish, and then (asynchronously) wait for them all to complete.
If you want to react to a sequence in an asynchronous way, then you need to use IObservable<Task> (Reactive Extensions). You can use a TPL Dataflow BufferBlock as a "queue" that can work with either synchronous or asynchronous code, and is easily convertible to IObservable<Task>.

I assume that Task.WhenAll will try to enumerate the collection, which means that it itself will block until the collection is completed or cancelled. If it didn't, then the code could theoretically finish awaiting before the tasks have been created. So there's going to be an extra block in there... it will block waiting for the threads to be created, and then block again until the tasks are done. I don't think that's a bad thing for your code, as it's still going to block until the same point in time.

Related

Continuation is running before antecedents have finished

I have two long running tasks that I want to execute, to be followed with a continuation. The problem I am having is that the continuation is running before the two long running tasks have signalled. Those two tasks are not erroring, they've barely started when the continuation commences.
The intent is that the first two tasks can run in parallel, but the continuation should only run once they've both finished.
I have probably overlooked something really simple, but near as I can tell I am following the patterns suggested in the question Running multiple async tasks and waiting for them all to complete. I realise I could probably dispense with the .ContinueWith() as the code following the Task.WhenAll() will implicitly become a continuation, but I prefer to keep the explicit continuation (unless it is the cause of my issue).
private async Task MyMethod()
{
var tasks = new List<Task>
{
new Task(async () => await DoLongRunningTask1()),
new Task(async () => await DoLongRunningTask2())
};
Parallel.ForEach(tasks, async task => task.Start());
await Task.WhenAll(tasks)
.ContinueWith(async t =>{
//another long running bit of code with some parallel
//execution based on the results of the first two tasks
}
, TaskContinuationOptions.OnlyOnRanToCompletion);
}
private async Task DoLongRunningTask1()
{
... long running operation ...
}
private async Task DoLongRunningTask2()
{
... another long running operation ...
}
I appreciate any help or suggestions!
Your code has multiple issues:
Creating tasks with the non-generic Task constructor, with async delegate. This constructor accepts an Action, not a Func<Task>, so your async delegate is actually an async void, which is something to avoid. You can fix this by creating nested Task<Task>s, where the outer task represents the launching of the inner task.
Using the Parallel.ForEach with async delegate. This again results in async void delegate, for the same reason. The correct API to use when you want to parallelize asynchronous delegates is the Parallel.ForEachAsync, available from .NET 6 and later.
Using parallelism to start tasks. Starting a task with Start is not CPU-bound. The Start doesn't execute the task, it just schedules it for execution, normally on the ThreadPool. So you can replace the Parallel.ForEach with a simple foreach loop or the List.ForEach.
Awaiting an unwrapped ContinueWith continuation that has an async delegate. This continuation returns a nested Task<Task>. You should Unwrap this nested task, otherwise you are awaiting the launching of the asynchronous operation, not its completion.
Here is your code "fixed":
private async Task MyMethod()
{
List<Task<Task>> tasksOfTasks = new()
{
new Task<Task>(async () => await DoLongRunningTask1()),
new Task<Task>(async () => await DoLongRunningTask2())
};
tasksOfTasks.ForEach(taskTask => taskTask.Start());
await Task.WhenAll(tasksOfTasks.Select(taskTask => taskTask.Unwrap()))
.ContinueWith(async t =>
{
//another long running bit of code with some parallel
//execution based on the results of the first two tasks
}, TaskContinuationOptions.OnlyOnRanToCompletion).Unwrap();
}
In case of failure in the DoLongRunningTask1 or the DoLongRunningTask2 methods, this code will most likely result in a TaskCanceledException, which is what happens when you pass the TaskContinuationOptions.OnlyOnRanToCompletion flag and the antecedent task does not run to completion successfully. This behavior is unlikely to be what you want.
The above code also violates the guideline CA2008: Do not create tasks without passing a TaskScheduler. This guideline says that being depended on the ambient TaskScheduler.Current is not a good idea, and specifying explicitly the TaskScheduler.Default is recommended. You'll have to specify it in both the Start and ContinueWith methods.
If you think that all this is overly complicated, you are right. That's not the kind of code that you want to write and maintain, unless you have spent months studying the Task Parallel Library, you know it in and out, and you have no other higher-level alternative. It's the kind of code that you might find in specialized libraries, not in application code.
You are overusing async. Below is more cleaner rewrite of what you have done above.
private async Task MyMethod()
{
var tasks = new Task[] {DoLongRunningTask1(), DoLongRunningTask2()];
await Task.WhenAll(tasks);
//another long running bit of code with some parallel
//execution based on the results of the first two tasks
}
private async Task DoLongRunningTask1()
{
... long running operation ...
}
private async Task DoLongRunningTask2()
{
... another long running operation ...
}
I assumed DoLongRunningTask's are async in nature, if they are syncronious then start them using Task.Run within a ThreadPool thread and use resulting Task.

How to make this sequence printed asynchronously? [duplicate]

Ok, so basically I have a bunch of tasks (10) and I want to start them all at the same time and wait for them to complete. When completed I want to execute other tasks. I read a bunch of resources about this but I can't get it right for my particular case...
Here is what I currently have (code has been simplified):
public async Task RunTasks()
{
var tasks = new List<Task>
{
new Task(async () => await DoWork()),
//and so on with the other 9 similar tasks
}
Parallel.ForEach(tasks, task =>
{
task.Start();
});
Task.WhenAll(tasks).ContinueWith(done =>
{
//Run the other tasks
});
}
//This function perform some I/O operations
public async Task DoWork()
{
var results = await GetDataFromDatabaseAsync();
foreach (var result in results)
{
await ReadFromNetwork(result.Url);
}
}
So my problem is that when I'm waiting for tasks to complete with the WhenAll call, it tells me that all tasks are over even though none of them are completed. I tried adding Console.WriteLine in my foreach and when I have entered the continuation task, data keeps coming in from my previous Tasks that aren't really finished.
What am I doing wrong here?
You should almost never use the Task constructor directly. In your case that task only fires the actual task that you can't wait for.
You can simply call DoWork and get back a task, store it in a list and wait for all the tasks to complete. Meaning:
tasks.Add(DoWork());
// ...
await Task.WhenAll(tasks);
However, async methods run synchronously until the first await on an uncompleted task is reached. If you worry about that part taking too long then use Task.Run to offload it to another ThreadPool thread and then store that task in the list:
tasks.Add(Task.Run(() => DoWork()));
// ...
await Task.WhenAll(tasks);
If you want to run those task's parallel in different threads using TPL you may need something like this:
public async Task RunTasks()
{
var tasks = new List<Func<Task>>
{
DoWork,
//...
};
await Task.WhenAll(tasks.AsParallel().Select(async task => await task()));
//Run the other tasks
}
These approach parallelizing only small amount of code: the queueing of the method to the thread pool and the return of an uncompleted Task. Also for such small amount of task parallelizing can take more time than just running asynchronously. This could make sense only if your tasks do some longer (synchronous) work before their first await.
For most cases better way will be:
public async Task RunTasks()
{
await Task.WhenAll(new []
{
DoWork(),
//...
});
//Run the other tasks
}
To my opinion in your code:
You should not wrap your code in Task before passing to Parallel.ForEach.
You can just await Task.WhenAll instead of using ContinueWith.
Essentially you're mixing two incompatible async paradigms; i.e. Parallel.ForEach() and async-await.
For what you want, do one or the other. E.g. you can just use Parallel.For[Each]() and drop the async-await altogether. Parallel.For[Each]() will only return when all the parallel tasks are complete, and you can then move onto the other tasks.
The code has some other issues too:
you mark the method async but don't await in it (the await you do have is in the delegate, not the method);
you almost certainly want .ConfigureAwait(false) on your awaits, especially if you aren't trying to use the results immediately in a UI thread.
The DoWork method is an asynchronous I/O method. It means that you don't need multiple threads to execute several of them, as most of the time the method will asynchronously wait for the I/O to complete. One thread is enough to do that.
public async Task RunTasks()
{
var tasks = new List<Task>
{
DoWork(),
//and so on with the other 9 similar tasks
};
await Task.WhenAll(tasks);
//Run the other tasks
}
You should almost never use the Task constructor to create a new task. To create an asynchronous I/O task, simply call the async method. To create a task that will be executed on a thread pool thread, use Task.Run. You can read this article for a detailed explanation of Task.Run and other options of creating tasks.
Just also add a try-catch block around the Task.WhenAll
NB: An instance of System.AggregateException is thrown that acts as a wrapper around one or more exceptions that have occurred. This is important for methods that coordinate multiple tasks like Task.WaitAll() and Task.WaitAny() so the AggregateException is able to wrap all the exceptions within the running tasks that have occurred.
try
{
Task.WaitAll(tasks.ToArray());
}
catch(AggregateException ex)
{
foreach (Exception inner in ex.InnerExceptions)
{
Console.WriteLine(String.Format("Exception type {0} from {1}", inner.GetType(), inner.Source));
}
}

Executing tasks in parallel

Ok, so basically I have a bunch of tasks (10) and I want to start them all at the same time and wait for them to complete. When completed I want to execute other tasks. I read a bunch of resources about this but I can't get it right for my particular case...
Here is what I currently have (code has been simplified):
public async Task RunTasks()
{
var tasks = new List<Task>
{
new Task(async () => await DoWork()),
//and so on with the other 9 similar tasks
}
Parallel.ForEach(tasks, task =>
{
task.Start();
});
Task.WhenAll(tasks).ContinueWith(done =>
{
//Run the other tasks
});
}
//This function perform some I/O operations
public async Task DoWork()
{
var results = await GetDataFromDatabaseAsync();
foreach (var result in results)
{
await ReadFromNetwork(result.Url);
}
}
So my problem is that when I'm waiting for tasks to complete with the WhenAll call, it tells me that all tasks are over even though none of them are completed. I tried adding Console.WriteLine in my foreach and when I have entered the continuation task, data keeps coming in from my previous Tasks that aren't really finished.
What am I doing wrong here?
You should almost never use the Task constructor directly. In your case that task only fires the actual task that you can't wait for.
You can simply call DoWork and get back a task, store it in a list and wait for all the tasks to complete. Meaning:
tasks.Add(DoWork());
// ...
await Task.WhenAll(tasks);
However, async methods run synchronously until the first await on an uncompleted task is reached. If you worry about that part taking too long then use Task.Run to offload it to another ThreadPool thread and then store that task in the list:
tasks.Add(Task.Run(() => DoWork()));
// ...
await Task.WhenAll(tasks);
If you want to run those task's parallel in different threads using TPL you may need something like this:
public async Task RunTasks()
{
var tasks = new List<Func<Task>>
{
DoWork,
//...
};
await Task.WhenAll(tasks.AsParallel().Select(async task => await task()));
//Run the other tasks
}
These approach parallelizing only small amount of code: the queueing of the method to the thread pool and the return of an uncompleted Task. Also for such small amount of task parallelizing can take more time than just running asynchronously. This could make sense only if your tasks do some longer (synchronous) work before their first await.
For most cases better way will be:
public async Task RunTasks()
{
await Task.WhenAll(new []
{
DoWork(),
//...
});
//Run the other tasks
}
To my opinion in your code:
You should not wrap your code in Task before passing to Parallel.ForEach.
You can just await Task.WhenAll instead of using ContinueWith.
Essentially you're mixing two incompatible async paradigms; i.e. Parallel.ForEach() and async-await.
For what you want, do one or the other. E.g. you can just use Parallel.For[Each]() and drop the async-await altogether. Parallel.For[Each]() will only return when all the parallel tasks are complete, and you can then move onto the other tasks.
The code has some other issues too:
you mark the method async but don't await in it (the await you do have is in the delegate, not the method);
you almost certainly want .ConfigureAwait(false) on your awaits, especially if you aren't trying to use the results immediately in a UI thread.
The DoWork method is an asynchronous I/O method. It means that you don't need multiple threads to execute several of them, as most of the time the method will asynchronously wait for the I/O to complete. One thread is enough to do that.
public async Task RunTasks()
{
var tasks = new List<Task>
{
DoWork(),
//and so on with the other 9 similar tasks
};
await Task.WhenAll(tasks);
//Run the other tasks
}
You should almost never use the Task constructor to create a new task. To create an asynchronous I/O task, simply call the async method. To create a task that will be executed on a thread pool thread, use Task.Run. You can read this article for a detailed explanation of Task.Run and other options of creating tasks.
Just also add a try-catch block around the Task.WhenAll
NB: An instance of System.AggregateException is thrown that acts as a wrapper around one or more exceptions that have occurred. This is important for methods that coordinate multiple tasks like Task.WaitAll() and Task.WaitAny() so the AggregateException is able to wrap all the exceptions within the running tasks that have occurred.
try
{
Task.WaitAll(tasks.ToArray());
}
catch(AggregateException ex)
{
foreach (Exception inner in ex.InnerExceptions)
{
Console.WriteLine(String.Format("Exception type {0} from {1}", inner.GetType(), inner.Source));
}
}

How to pass LongRunning flag specifically to Task.Run()?

I need a way to set an async task as long running without using Task.Factory.StartNew(...) and instead using Task.Run(...) or something similar.
Context:
I have Task that loops continuously until it is externally canceled that I would like to set as 'long running' (i.e. give it a dedicated thread). This can be achieved through the code below:
var cts = new CancellationTokenSource();
Task t = Task.Factory.StartNew(
async () => {
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException ex) { }
} }, cts.Token, TaskCreationOptions.LongRunning, TaskScheduler.Default);
The problem is that Task.Factory.StartNew(...) does not return the active async task that is passed in but rather a 'task of running the Action' which functionally always has taskStatus of 'RanToCompletion'. Since my code needs to be able to track the task's status to see when it becomes 'Canceled' (or 'Faulted') I need to use something like below:
var cts = new CancellationTokenSource();
Task t = Task.Run(
async () => {
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException ex) { }
} }, cts.Token);
Task.Run(...), as desired, returns the async process itself allowing me to obtain actual statuses of 'Canceled' or 'Faulted'. I cannot specify the task as long running, however. So, anyone know how to best go about running an async task while both storing that active task itself (with desired taskStatus) and setting the task to long running?
I have Task that loops continuously until it is externally canceled that I would like to set as 'long running' (i.e. give it a dedicated thread)... anyone know how to best go about running an async task while both storing that active task itself (with desired taskStatus) and setting the task to long running?
There's a few problems with this. First, "long running" does not necessarily mean a dedicated thread - it just means that you're giving the TPL a hint that the task is long-running. In the current (4.5) implementation, you will get a dedicated thread; but that's not guaranteed and could change in the future.
So, if you need a dedicated thread, you'll have to just create one.
The other problem is the notion of an "asynchronous task". What actually happens with async code running on the thread pool is that the thread is returned to the thread pool while the asynchronous operation (i.e., Task.Delay) is in progress. Then, when the async op completes, a thread is taken from the thread pool to resume the async method. In the general case, this is more efficient than reserving a thread specifically to complete that task.
So, with async tasks running on the thread pool, dedicated threads don't really make sense.
Regarding solutions:
If you do need a dedicated thread to run your async code, I'd recommend using the AsyncContextThread from my AsyncEx library:
using (var thread = new AsyncContextThread())
{
Task t = thread.TaskFactory.Run(async () =>
{
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException ex) { }
}
});
}
However, you almost certainly don't need a dedicated thread. If your code can execute on the thread pool, then it probably should; and a dedicated thread doesn't make sense for async methods running on the thread pool. More specifically, the long-running flag doesn't make sense for async methods running on the thread pool.
Put another way, with an async lambda, what the thread pool actually executes (and sees as tasks) are just the parts of the lambda in-between the await statements. Since those parts aren't long-running, the long-running flag is not required. And your solution becomes:
Task t = Task.Run(async () =>
{
while (true)
{
cts.Token.ThrowIfCancellationRequested(); // not long-running
try
{
"Running...".Dump(); // not long-running
await Task.Delay(500, cts.Token); // not executed by the thread pool
}
catch (TaskCanceledException ex) { }
}
});
Call Unwrap on the task returned from Task.Factory.StartNew this will return the inner task, which has the correct status.
var cts = new CancellationTokenSource();
Task t = Task.Factory.StartNew(
async () => {
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException ex) { }
} }, cts.Token, TaskCreationOptions.LongRunning, TaskScheduler.Default).Unwrap();
On a dedicated thread, there's nothing to yield to. Don't use async and await, use synchronous calls.
This question gives two ways to do a cancellable sleep without await:
Task.Delay(500, cts.Token).Wait(); // requires .NET 4.5
cts.WaitHandle.WaitOne(TimeSpan.FromMilliseconds(500)); // valid in .NET 4.0 and later
If part of your work does use parallelism, you can start parallel tasks, saving those into an array, and use Task.WaitAny on the Task[]. Still no use for await in the main thread procedure.
This is unnecessary and Task.Run will suffice as the Task Scheduler will set any task to LongRunning if it runs for more than 0.5 seconds.
See here why.
https://blog.stephencleary.com/2013/08/startnew-is-dangerous.html
You need to specify custom TaskCreationOptions. Let’s consider each of
the options. AttachedToParent shouldn’t be used in async tasks, so
that’s out. DenyChildAttach should always be used with async tasks
(hint: if you didn’t already know that, then StartNew isn’t the tool
you need). DenyChildAttach is passed by Task.Run. HideScheduler might
be useful in some really obscure scheduling scenarios but in general
should be avoided for async tasks. That only leaves LongRunning and
PreferFairness, which are both optimization hints that should only be
specified after application profiling. I often see LongRunning misused
in particular. In the vast majority of situations, the threadpool will
adjust to any long-running task in 0.5 seconds - without the
LongRunning flag. Most likely, you don’t really need it.
The real issue you have here is that your operation is not in fact long running. The actual work you're doing is an asynchronous operation, meaning it will return to the caller basically immediately. So not only do you not need to use the long running hint when having the task scheduler schedule it, there's no need to even use a thread pool thread to do this work, because it'll be basically instantaneous. You shouldn't be using StartNew or Run at all, let alone with the long running flag.
So rather than taking your asynchronous method and starting it in another thread, you can just start it right on the current thread by calling the asynchronous method. Offloading the starting of an already asynchronous operation is just creating more work that'll make things slower.
So your code simplifies all the way down to:
var cts = new CancellationTokenSource();
Task t = DoWork();
async Task DoWork()
{
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException) { }
}
}
I think the consideration should be not how long the thread run but how much of its time is it really working.
In your example there is short work and them await Task.Delay(...).
If this is really the case in your project you probably shouldn't use a dedicated thread for this task and let it run on the regular thread pool. Every time you'll call await on an IO operation or on Task.Delay() you'll release the thread for other tasks to use.
You should only use LongRunning when you'll decrease your thread from the thread-pool and never give it back or give it back only for a small percentage of the time. In such a case (where the work is long and Task.Delay(...) is short in comparison) using a dedicated thread for the job is a reasonable solution.
On the other hand if your thread is really working most of the time it will consume system resources (CPU time) and maybe it doesn't really matter if it holds a thread of the thread-pool since it is preventing other work from happening anyway.
Conclusion? Just use Task.Run() (without LongRunning) and use await in your long running task when and if it is possible. Revert to LongRunning only when you actually see the other approach is causing you problems and even then check your code and design to make sure it is really necessary and there isn't something else you can change in your code.

Does Task.Wait(int) stop the task if the timeout elapses without the task finishing?

I have a task and I expect it to take under a second to run but if it takes longer than a few seconds I want to cancel the task.
For example:
Task t = new Task(() =>
{
while (true)
{
Thread.Sleep(500);
}
});
t.Start();
t.Wait(3000);
Notice that after 3000 milliseconds the wait expires. Was the task canceled when the timeout expired or is the task still running?
Task.Wait() waits up to specified period for task completion and returns whether the task completed in the specified amount of time (or earlier) or not. The task itself is not modified and does not rely on waiting.
Read nice series: Parallelism in .NET, Parallelism in .NET – Part 10, Cancellation in PLINQ and the Parallel class by Reed Copsey
And: .NET 4 Cancellation Framework / Parallel Programming: Task Cancellation
Check following code:
var cts = new CancellationTokenSource();
var newTask = Task.Factory.StartNew(state =>
{
var token = (CancellationToken)state;
while (!token.IsCancellationRequested)
{
}
token.ThrowIfCancellationRequested();
}, cts.Token, cts.Token);
if (!newTask.Wait(3000, cts.Token)) cts.Cancel();
If you want to cancel a Task, you should pass in a CancellationToken when you create the task. That will allow you to cancel the Task from the outside. You could tie cancellation to a timer if you want.
To create a Task with a Cancellation token see this example:
var tokenSource = new CancellationTokenSource();
var token = tokenSource.Token;
var t = Task.Factory.StartNew(() => {
// do some work
if (token.IsCancellationRequested) {
// Clean up as needed here ....
}
token.ThrowIfCancellationRequested();
}, token);
To cancel the Task call Cancel() on the tokenSource.
The task is still running until you explicitly tell it to stop or your loop finishes (which will never happen).
You can check the return value of Wait to see this:
(from http://msdn.microsoft.com/en-us/library/dd235606.aspx)
Return Value
Type: System.Boolean
true if the Task completed execution within the allotted time; otherwise, false.
Was the task canceled when the timeout expired or is the task still running?
No and Yes.
The timeout passed to Task.Wait is for the Wait, not the task.
If your task calls any synchronous method that does any kind of I/O or other unspecified action that takes time, then there is no general way to "cancel" it.
Depending on how you try to "cancel" it, one of the following may happen:
The operation actually gets canceled and the resource it works on is in a stable state (You were lucky!)
The operation actually gets canceled and the resource it works on is in an inconsistent state (potentially causing all sorts of problems later)
The operation continues and potentially interferes with whatever your other code is doing (potentially causing all sorts of problems later)
The operation fails or causes your process to crash.
You don't know what happens, because it is undocumented
There are valid scenarios where you can and probably should cancel a task using one of the generic methods described in the other answers. But if you are here because you want to interrupt a specific synchronous method, better see the documentation of that method to find out if there is a way to interrupt it, if it has a "timeout" parameter, or if there is an interruptible variation of it.

Categories

Resources