Notify consumers when all tasks have completed without blocking the thread

Notify consumers when all tasks have completed without blocking the thread - c#

Given the code below what is the most elegant way to notify consumers when all tasks have run to completion without blocking the thread? Currently I have a solution using a counter which is incremented before the Execute(action) and decremented after the Continuation() and LogException(). If the counter is zero then it is safe to assume that no more tasks are being processed.

First off, don't use the Task constructor at all. If you want to run a delegate in a thread pool thread, use Task.Run. Next, don't use ContinueWith, use await to add continuations to tasks. It's much easier to write correct code that way, particularly with respect to proper error handling. Your code is most effectively written as:
try
{
await Task.Run(() => Execute(action));
Continuation();
}
catch(Exception e)
{
LogExceptions(e)
}
finally
{
CustomLogging();
}

Related

Two Tasks run on the same thread which invalidates lock

Edit
I find Building Async Coordination Primitives, Part 1: AsyncManualResetEvent might be related to my topic.
In the case of TaskCompletionSource, that means that synchronous continuations can happen as part of a call to {Try}Set*, which means in our AsyncManualResetEvent example, those continuations could execute as part of the Set method. Depending on your needs (and whether callers of Set may be ok with a potentially longer-running Set call as all synchronous continuations execute), this may or may not be what you want.
Many thanks to all of the answers, thank you for your knowledge and patience!
Original Question
I know that Task.Run runs on a threadpool thread and threads can have re-entrancy. But I never knew that 2 tasks can run on the same thread when they are both alive!
My Question is: is that reasonable by design? Does that mean lock inside an async method is meaningless (or say, lock cannot be trusted in async method block, if I'd like a method that doesn't allow reentrancy)?
Code:
namespace TaskHijacking
{
class Program
{
static TaskCompletionSource<bool> tcs = new TaskCompletionSource<bool>();
static object methodLock = new object();
static void MethodNotAllowReetrance(string callerName)
{
lock(methodLock)
{
Console.WriteLine($"Enter MethodNotAllowReetrance, caller: {callerName}, on thread: {Thread.CurrentThread.ManagedThreadId}");
if (callerName == "task1")
{
tcs.SetException(new Exception("Terminate tcs"));
}
Thread.Sleep(1000);
Console.WriteLine($"Exit MethodNotAllowReetrance, caller: {callerName}, on thread: {Thread.CurrentThread.ManagedThreadId}");
}
}
static void Main(string[] args)
{
var task1 = Task.Run(async () =>
{
await Task.Delay(1000);
MethodNotAllowReetrance("task1");
});
var task2 = Task.Run(async () =>
{
try
{
await tcs.Task; // await here until task SetException on tcs
}
catch
{
// Omit the exception
}
MethodNotAllowReetrance("task2");
});
Task.WaitAll(task1, task2);
Console.ReadKey();
}
}
}
Output:
Enter MethodNotAllowReetrance, caller: task1, on thread: 6
Enter MethodNotAllowReetrance, caller: task2, on thread: 6
Exit MethodNotAllowReetrance, caller: task2, on thread: 6
Exit MethodNotAllowReetrance, caller: task1, on thread: 6
The control flow of the thread 6 is shown in the figure:

You already have several solutions. I just want to describe the problem a bit more. There are several factors at play here that combine to cause the observed re-entrancy.
First, lock is re-entrant. lock is strictly about mutual exclusion of threads, which is not the same as mutual exclusion of code. I think re-entrant locks are a bad idea in the 99% case (as described on my blog), since developers generally want mutual exclusion of code and not threads. SemaphoreSlim, since it is not re-entrant, mutually excludes code. IMO re-entrant locks are a holdover from decades ago, when they were introduced as an OS concept, and the OS is just concerned about managing threads.
Next, TaskCompletionSource<T> by default invokes task continuations synchronously.
Also, await will schedule its method continuation as a synchronous task continuation (as described on my blog).
Task continuations will sometimes run asynchronously even if scheduled synchronously, but in this scenario they will run synchronously. The context captured by await is the thread pool context, and the completing thread (the one calling TCS.TrySet*) is a thread pool thread, and in that case the continuation will almost always run synchronously.
So, you end up with a thread that takes a lock, completes a TCS, thus executing the continuations of that task, which includes continuing another method, which is then able to take that same lock.
To repeat the existing solutions in other answers, to solve this you need to break that chain at some point:
(OK) Use a non-reentrant lock. SemaphoreSlim.WaitAsync will still execute the continuations while holding the lock (not a good idea), but since SemaphoreSlim isn't re-entrant, the method continuation will (asynchronously) wait for the lock to be available.
(Best) Use TaskCompletionSource.RunContinuationsAsynchronously, which will force task continuations onto a (different) thread pool thread. This is a better solution because your code is no longer invoking arbitrary code while holding a lock (i.e., the task continuations).
You can also break the chain by using a non-thread-pool context for the method awaiting the TCS. E.g., if that method had to resume on a UI thread, then it could not be run synchronously from a thread pool thread.
From a broader perspective, if you're mixing locks and TaskCompletionSource instances, it sounds like you may be building (or may need) an asynchronous coordination primitive. I have an open-source library that implements a bunch of them, if that helps.

A task is an abstraction over some amount of work. Usually this means that the work is split into parts, where the execution can be paused and resumed between parts. When resuming it may very well run on another thread. But the pausing/resuming may only be done at the await statements. Notably, while the task is 'paused', for example because it is waiting for IO, it does not consume any thread at all, it will only use a thread while it is actually running.
My Question is: is that reasonable by design? Does that mean lock inside an async method is meaningless?
locks inside a async method is far from meaningless since it allows you to ensure a section of code is only run by one thread at a time.
In your first example there can be only one thread that has the lock at a time. While the lock is held that task cannot be paused/resumed since await is not legal while in a lock body. So a single thread will execute the entire lock body, and that thread cannot do anything else until it completes the lock body. So there is no risk of re-entrancy unless you invoke some code that can call back to the same method.
In your updated example the problem occurs due to TaskCompletionSource.SetException, this is allowed to reuse the current thread to run any continuation of the task immediately. To avoid this, and many other issues, make sure you only hold the lock while running a limited amount of code. Any method calls that may run arbitrary code risks causing deadlocks, reentrancy, and many other problems.
You can solve the specific problem by using a ManualResetEvent(Slim) to do the signaling between threads instead of using a TaskCompletionSource.

So your method is basically like this:
static void MethodNotAllowReetrance()
{
lock (methodLock) tcs.SetResult();
}
...and the tcs.Task has a continuation attached that invokes the MethodNotAllowReetrance. What happens then is the same thing that would happen if your method was like this instead:
static void MethodNotAllowReetrance()
{
lock (methodLock) MethodNotAllowReetrance();
}
The moral lesson is that you must be very careful every time you invoke any method inside a lock-protected region. In this particular case you have a couple of options:
Don't complete the TaskCompletionSource while holding the lock. Defer its completion until after you have exited the protected region:
static void MethodNotAllowReetrance()
{
bool doComplete = false;
lock (methodLock) doComplete = true;
if (doComplete) tcs.SetResult();
}
Configure the TaskCompletionSource so that it invokes its continuations asynchronously, by passing the TaskCreationOptions.RunContinuationsAsynchronously in its constructor. This is an option that you don't have very often. For example when you cancel a CancellationTokenSource, you don't have the option to invoke asynchronously the callbacks registered to its associated CancellationToken.
Refactor the MethodNotAllowReetrance method in a way that it can handle reentrancy.

Use SemaphoreSlim instead of lock, since, as the documentation says:
The SemaphoreSlim class doesn't enforce thread or task identity
In your case, it would look something like this:
// Semaphore only allows one request to enter at a time
private static readonly SemaphoreSlim _semaphoreSlim = new SemaphoreSlim(1, 1);
void SyncMethod() {
_semaphoreSlim.Wait();
try {
// Do some sync work
} finally {
_semaphoreSlim.Release();
}
}
The try/finally block is optional, but it makes sure that the semaphore is released even if an exception is thrown somewhere in your code.
Note that SemaphoreSlim also has a WaitAsync() method, if you want to wait asynchronously to enter the semaphore.

Not sure if code runs in parallel. Tasks in console application

I have this code:
var dt = new DeveloperTest();
var tasks = readers.Select(dt.ProcessReaderAsync).ToList();
var printCounterTask = new Task(() => dt.DelayedPrint(output));
printCounterTask.Start();
Task.WhenAll(tasks).ContinueWith(x => dt.Print(output).ContinueWith(_ =>
{
dt.Finished = true;
})).Wait();
printCounterTask.Wait();
What this does is preparing tasks that will be run and then start a (I think ) parallel execution which starts with:
printCounterTask.Start();
this is what delayed print does:
public async Task DelayedPrint(IOutputResult output)
{
while (true)
{
if (!Finished)
{
//every 10 seconds should print.
//at least one print even if the execution is less than 10 seconds
//as this starts in paralel with the processing
Task.Delay(10 * 1000).Wait();
await Print(output);
}
else
{
#if DEBUG
Console.WriteLine("Finished with printing");
#endif
break;
}
}
}
Basically is printing some output that is delayed every 10 seconds, then when all the tasks are complete stops the infinite loop.
if you want to see the whole code is here https://github.com/velchev/Exclaimer-Test
I am not sure if this
Task.WhenAll(tasks).ContinueWith(x => dt.Print(output).ContinueWith(_ =>
{
dt.Finished = true;
})).Wait();
runs in parallel with printCounterTask.Start();
When I debut it seems it does as a breakpoint in the !Finished code is hit and then in the else clause too. As far as I know when you start a task it runs in parallel so all the tasks should run in parallel. A task is a representation of a thread which syntactically is easier to control compared to the old syntax. So all this threads running and because of the better syntax is easier to say - wait till all finish and then change the flag. Any helpful explanation will be appreciated. Thank you mates.

The code is mostly correct as written, but there are some nuances around the Task constructor and ContinueWith that make it difficult to understand, and make it easy to break. For example, printCounterTask.Wait() will not wait until DelayedPrint completes, because the Task constructor does not understand asynchronous delegates.
To make the code fully correct and much easier to read and reason about, replace new Task/Start with Task.Run, and replace ContinueWith with await:
var dt = new DeveloperTest();
var tasks = readers.Select(dt.ProcessReaderAsync).ToList();
var printCounterTask = Task.Run(() => dt.DelayedPrint(output));
await Task.WhenAll(tasks);
await dt.Print(output);
dt.Finished = true;
await printCounterTask;
You will also find your code to be clearer if you follow the convention of suffixing asynchronous methods with Async.
A task is a representation of a thread which syntactically is easier to control compared to the old syntax.
No, not at all. A task is a Future - a representation of an operation that may complete sometime in the future. This "operation" does not necessarily require a thread. Task.Run does queue work to the thread pool, but in this example, the task does not always use a thread pool thread (specifically, it doesn't use a thread pool thread during the await Task.Delay).

You are partly right.
The tasks will run in parallel with
printCounterTask
as expected.
However a task is not a representation of a thread and not a syntactic sugaring which easier to control over thread.
Here you can find a useful information:
https://www.dotnetforall.com/difference-task-and-thread/
In general it's important for you to understand that Tasks are using Threads from the ThreadPool.
A task is a representation of a method you wish to execute as a background work (you don't want to block the current execution), and a task needs a thread in order to operate, but it's not true that a task is a thread.
You may have more tasks than available threads in the thread pool, which will lead to them waiting in the queue for available thread in order to be executed.
Also take in consideration that Task.WhenAll will not execute the tasks for you, you'll have to execute them yourself (implementation of ProcessReaderAsync is missing, but if you're using Task.Run it's OK).

What does accessing a Result on a Task before calling Wait actually do?

var task = Task.Run(() => DoSomeStuff()).Result;
What happens here under the hood?
I made a tiny test:
using System;
using System.Threading.Tasks;
public class Program
{
public static void Main()
{
var r = Task.Run( () => {Thread.Sleep(5000); return 123; }).Result;
Console.WriteLine(r);
}
}
It prints "123" after 5s. So does accessing any such property on Task act as a shortcut to calling Task.Wait() i.e. is this safe to do?
Previously my code called Task.Delay(5000) which returned "123" immediately. I fixed this in my question but leave this here as comments and answers reference it.

You asked two questions. First, does accessing Result implicitly cause a synchronous Wait? Yes. But the more important question is:
is this safe to do?
It is not safe to do this.
It is very easy to get into a situation where the task that you are synchronously waiting on has scheduled work to run in the future before it completes onto the thread that you just put to sleep. Now we have a situation where the thread will not wake up until the sleeping thread does some work, which it never does because it is asleep.
If you already know that the task is complete then it is safe to synchronously wait for the result. If you do not, then it is not safe to synchronously wait.
Now, you might say, suppose I know through some other means that it is safe to wait synchronously for an incomplete task. Then is it safe to wait? Well, by the assumption of the question, yes, but it still might not be smart to wait. Remember, the whole point of asynchrony is to manage resources efficiently in a world of high latency. If you are synchronously waiting for an asynchronous task to complete then you are forcing a worker to sleep until another worker is done; the sleeping worker could be doing work! The whole point of asynchrony is to avoid situations where workers go idle, so don't force them to.
An await is an asynchronous wait. It is a wait that means "wait on running the rest of the current workflow until after this task is done, but find something to do while you are waiting". We did a lot of work to add it to the language, so use it!

So does accessing any such property on Task act as a shortcut to calling Task.Wait()?
Yes.
From the docs:
Accessing the [Result] property's get accessor blocks the calling thread until the asynchronous operation is complete; it is equivalent to calling the Wait method.
However, your test doesn't do what you think it does.
Task.Delay(..) returns a Task which completes after the specified amount of time. It doesn't block the calling thread.
So () => { Task.Delay(5000); return 123; } simply creates a new Task (which will complete in 5 seconds), then throws it away and immediately returns 123.
You can either:
Block the calling thread, by doing Task.Delay(5000).Wait() (which does the same thing as Thread.Sleep(5000))
Asynchronously wait for the Task returned from Task.Delay to complete: Task.Run(async () => { await Task.Delay(5000); return 123; })

The test doesn't wait for Task.Delay() so it returns immediatelly. It should be :
var r = Task.Run(async () => { await Task.Delay(5000); return 123; }).Result;
The behavior of Result is well defined - if the task hasn't completed, it blocks until it does. Accessing other Task properties doesn't block

Task.Run does not work like Thread.start

I've been developing an application which I need to run some methods as parallel and not blocking. first I used Task.Run, but IN DEBUG MODE, I see that the operation blocks and just waits for the result. I do not want this, I want all method , which call in a foreach loop, run asynchronously.
public async void f()
{
foreach (var item in childrenANDparents)
{
await Task.Run(() => SendUpdatedSiteInfo(item.Host,site_fr));
// foreach loop does not work until the task return and continues
}
}
So I changed the task.run to thread.start and it works great!
public async void f()
{
foreach (var item in childrenANDparents)
{
Thread t = new Thread(() => SendUpdatedSiteInfo(item.Host, site_fr));
t.Start();
// foreach loop works regardless of the method, in debug mode it shows me
// they are working in parallel
}
}
Would you explain what is the difference and why ? I expect the same behavior from both code and it seems they are different.
thanks

I want all method , which call in a foreach loop, run asynchronously.
It seems that you're confusing async/sync calls with parallelization.
A quote from MSDN:
Data parallelism: A form of parallel processing where the same
computation executes in parallel on different data. Data parallelism
is supported in the Microsoft .NET Framework by the Parallel.For and
Parallel.ForEach methods and by PLINQ. Compare to task parallelism.
Asynchronous operation: An operation that that does not block the current thread
of control when the operation starts.
Let's have a closer look at your code again:
foreach (var item in childrenANDparents)
{
await Task.Run(() => SendUpdatedSiteInfo(item.Host,site_fr));
}
The await keyword will cause compiler to create a StateMachine that will handle the method execution.
It's like if you say to compiler:"Start this async operation without blocking any threads and when it's completed - execute the rest of the stuff".
After Task finishes execution this thread will be released and returned to a ThreadPool and it will execute the rest of the code on a first available thread from a ThreadPool and will make attempt to execute it in a thread in which it had started the method execution (unless .ConfigureAwait(false) is used in which case it's more like 'fire and forget' mode when we don't really care which thread will do the continuation).
When you create a separate Thread you do parallelism by delegating some code to run in a separate Thread. So depending on the code itself it may or may not be executed asynchronously.
It's like if you say to compiler:"Take this piece of work start a new thread and do it there"
If you still want to use Tasks with parallelism you could create an array of tasks in a loop and then wait for all of them to finish execution:
var tasks = new[]
{
childrenANDparents.Select(item=> Task.Run(() => SendUpdatedSiteInfo(item.Host,site_fr)));
}
await Task.WhenAll(tasks);
P.S.
And yes you may as well use TPL (Task Parallel Library) and specifically Parallel loops.

You could use a simple Parallel.ForEach or PLinq
Parallel.ForEach(childrenANDparents, (item) =>
{
SendUpdatedSiteInfo(item.Host,site_fr)
});
To better understand async and await its best to start reading some docos, its a large topic, but its worth your while
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/

How to pass LongRunning flag specifically to Task.Run()?

I need a way to set an async task as long running without using Task.Factory.StartNew(...) and instead using Task.Run(...) or something similar.
Context:
I have Task that loops continuously until it is externally canceled that I would like to set as 'long running' (i.e. give it a dedicated thread). This can be achieved through the code below:
var cts = new CancellationTokenSource();
Task t = Task.Factory.StartNew(
async () => {
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException ex) { }
} }, cts.Token, TaskCreationOptions.LongRunning, TaskScheduler.Default);
The problem is that Task.Factory.StartNew(...) does not return the active async task that is passed in but rather a 'task of running the Action' which functionally always has taskStatus of 'RanToCompletion'. Since my code needs to be able to track the task's status to see when it becomes 'Canceled' (or 'Faulted') I need to use something like below:
var cts = new CancellationTokenSource();
Task t = Task.Run(
async () => {
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException ex) { }
} }, cts.Token);
Task.Run(...), as desired, returns the async process itself allowing me to obtain actual statuses of 'Canceled' or 'Faulted'. I cannot specify the task as long running, however. So, anyone know how to best go about running an async task while both storing that active task itself (with desired taskStatus) and setting the task to long running?

I have Task that loops continuously until it is externally canceled that I would like to set as 'long running' (i.e. give it a dedicated thread)... anyone know how to best go about running an async task while both storing that active task itself (with desired taskStatus) and setting the task to long running?
There's a few problems with this. First, "long running" does not necessarily mean a dedicated thread - it just means that you're giving the TPL a hint that the task is long-running. In the current (4.5) implementation, you will get a dedicated thread; but that's not guaranteed and could change in the future.
So, if you need a dedicated thread, you'll have to just create one.
The other problem is the notion of an "asynchronous task". What actually happens with async code running on the thread pool is that the thread is returned to the thread pool while the asynchronous operation (i.e., Task.Delay) is in progress. Then, when the async op completes, a thread is taken from the thread pool to resume the async method. In the general case, this is more efficient than reserving a thread specifically to complete that task.
So, with async tasks running on the thread pool, dedicated threads don't really make sense.
Regarding solutions:
If you do need a dedicated thread to run your async code, I'd recommend using the AsyncContextThread from my AsyncEx library:
using (var thread = new AsyncContextThread())
{
Task t = thread.TaskFactory.Run(async () =>
{
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException ex) { }
}
});
}
However, you almost certainly don't need a dedicated thread. If your code can execute on the thread pool, then it probably should; and a dedicated thread doesn't make sense for async methods running on the thread pool. More specifically, the long-running flag doesn't make sense for async methods running on the thread pool.
Put another way, with an async lambda, what the thread pool actually executes (and sees as tasks) are just the parts of the lambda in-between the await statements. Since those parts aren't long-running, the long-running flag is not required. And your solution becomes:
Task t = Task.Run(async () =>
{
while (true)
{
cts.Token.ThrowIfCancellationRequested(); // not long-running
try
{
"Running...".Dump(); // not long-running
await Task.Delay(500, cts.Token); // not executed by the thread pool
}
catch (TaskCanceledException ex) { }
}
});

Call Unwrap on the task returned from Task.Factory.StartNew this will return the inner task, which has the correct status.
var cts = new CancellationTokenSource();
Task t = Task.Factory.StartNew(
async () => {
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException ex) { }
} }, cts.Token, TaskCreationOptions.LongRunning, TaskScheduler.Default).Unwrap();

On a dedicated thread, there's nothing to yield to. Don't use async and await, use synchronous calls.
This question gives two ways to do a cancellable sleep without await:
Task.Delay(500, cts.Token).Wait(); // requires .NET 4.5
cts.WaitHandle.WaitOne(TimeSpan.FromMilliseconds(500)); // valid in .NET 4.0 and later
If part of your work does use parallelism, you can start parallel tasks, saving those into an array, and use Task.WaitAny on the Task[]. Still no use for await in the main thread procedure.

This is unnecessary and Task.Run will suffice as the Task Scheduler will set any task to LongRunning if it runs for more than 0.5 seconds.
See here why.
https://blog.stephencleary.com/2013/08/startnew-is-dangerous.html
You need to specify custom TaskCreationOptions. Let’s consider each of
the options. AttachedToParent shouldn’t be used in async tasks, so
that’s out. DenyChildAttach should always be used with async tasks
(hint: if you didn’t already know that, then StartNew isn’t the tool
you need). DenyChildAttach is passed by Task.Run. HideScheduler might
be useful in some really obscure scheduling scenarios but in general
should be avoided for async tasks. That only leaves LongRunning and
PreferFairness, which are both optimization hints that should only be
specified after application profiling. I often see LongRunning misused
in particular. In the vast majority of situations, the threadpool will
adjust to any long-running task in 0.5 seconds - without the
LongRunning flag. Most likely, you don’t really need it.

The real issue you have here is that your operation is not in fact long running. The actual work you're doing is an asynchronous operation, meaning it will return to the caller basically immediately. So not only do you not need to use the long running hint when having the task scheduler schedule it, there's no need to even use a thread pool thread to do this work, because it'll be basically instantaneous. You shouldn't be using StartNew or Run at all, let alone with the long running flag.
So rather than taking your asynchronous method and starting it in another thread, you can just start it right on the current thread by calling the asynchronous method. Offloading the starting of an already asynchronous operation is just creating more work that'll make things slower.
So your code simplifies all the way down to:
var cts = new CancellationTokenSource();
Task t = DoWork();
async Task DoWork()
{
while (true)
{
cts.Token.ThrowIfCancellationRequested();
try
{
"Running...".Dump();
await Task.Delay(500, cts.Token);
}
catch (TaskCanceledException) { }
}
}

I think the consideration should be not how long the thread run but how much of its time is it really working.
In your example there is short work and them await Task.Delay(...).
If this is really the case in your project you probably shouldn't use a dedicated thread for this task and let it run on the regular thread pool. Every time you'll call await on an IO operation or on Task.Delay() you'll release the thread for other tasks to use.
You should only use LongRunning when you'll decrease your thread from the thread-pool and never give it back or give it back only for a small percentage of the time. In such a case (where the work is long and Task.Delay(...) is short in comparison) using a dedicated thread for the job is a reasonable solution.
On the other hand if your thread is really working most of the time it will consume system resources (CPU time) and maybe it doesn't really matter if it holds a thread of the thread-pool since it is preventing other work from happening anyway.
Conclusion? Just use Task.Run() (without LongRunning) and use await in your long running task when and if it is possible. Revert to LongRunning only when you actually see the other approach is causing you problems and even then check your code and design to make sure it is really necessary and there isn't something else you can change in your code.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.