.NET Framework 4.0: Chaining tasks in a loop - c#

I want to chain multiple Tasks, so that when one ends the next one starts. I know I can do this using ContinueWith. But what if I have a large number of tasks, so that:
t1 continues with t2
t2 continues with t3
t3 continues with t4
...
Is there a nice way to do it, other than creating this chain manually using a loop?

Well, assuming you have some sort of enumerable of Action<Task> delegates (the delegate type that ContinueWith accepts) for the work you want to do, you can use LINQ to do the following:
// Create the base task. Run synchronously.
var task = new Task(() => { });
task.RunSynchronously();
// Chain them all together.
var query =
// For each action
from action in actions
// Assign the task to the continuation and
// return that.
select (task = task.ContinueWith(action));
// Get the last task to wait on.
// Note that this cannot be changed to "Last"
// because the actions enumeration could have no
// elements, meaning that Last would throw.
// That means task can be null, so a check
// would have to be performed on it before
// waiting on it (unless you are assured that
// there are items in the action enumeration).
task = query.LastOrDefault();
The above code is really your loop, just in a fancier form. It starts from a dummy "noop" task, then for each action takes the previous task and attaches the action to it with ContinueWith, assigning the resulting continuation back to task so that the next iteration chains onto it. Because LINQ queries are lazily evaluated, none of this happens until LastOrDefault enumerates the query.
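For comparison, here is a minimal sketch of the same chaining written as a plain loop. The class name ChainDemo and the method name Chain are hypothetical, chosen only for illustration; the approach assumes the actions are Action<Task> delegates, as ContinueWith requires.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ChainDemo
{
    // Chains the actions so each one starts only after the previous one finishes.
    public static Task Chain(IEnumerable<Action<Task>> actions)
    {
        // Start from an already-completed no-op task, as in the answer above.
        var task = new Task(() => { });
        task.RunSynchronously();

        foreach (var action in actions)
        {
            // Each continuation becomes the antecedent of the next one.
            task = task.ContinueWith(action);
        }

        // Never null: with no actions, this is the completed no-op task.
        return task;
    }

    public static void Main()
    {
        var actions = new List<Action<Task>>
        {
            _ => Console.WriteLine("first"),
            _ => Console.WriteLine("second"),
            _ => Console.WriteLine("third"),
        };

        // Blocks until the last continuation in the chain has run.
        Chain(actions).Wait();
    }
}
```

Unlike the LastOrDefault version, the loop never leaves task set to null when the sequence is empty, so no extra check is needed before waiting on the result.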

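You may use the TaskFactory.ContinueWhenAll method here (for example, Task.Factory.ContinueWhenAll).
It lets you pass multiple antecedent tasks and runs a continuation once all of them have completed.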
Update
You can use a chaining extension such as this:
public static class MyTaskExtensions
{
    // Recursively attaches each action as a continuation of the previous task
    // and returns the last task in the chain.
    public static Task BuildChain(this Task task,
        IEnumerable<Action<Task>> actions)
    {
        if (!actions.Any())
            return task;
        else
        {
            Task continueWith = task.ContinueWith(actions.First());
            return continueWith.BuildChain(actions.Skip(1));
        }
    }
}

Related

Do Task objects internally contain a collection of ContinueWith tasks?

I'm reading a book which says:
Task objects internally contain a collection of ContinueWith tasks.
So you can actually call ContinueWith several times using a single Task object. When the task
completes, all the ContinueWith tasks will be queued to the thread pool.
So I tried to check the source code of Task: https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs,146
I didn't find any private field that looks like a collection of ContinueWith tasks.
So my question is: do Task objects internally contain a collection of ContinueWith tasks?
And if it does, let's say we have the following code:
Task<Int32> t = Task.Run(() => Sum(10000));
Task a = t.ContinueWith(task => Console.WriteLine("The sum is: " + task.Result),
TaskContinuationOptions.OnlyOnRanToCompletion);
Task b = t.ContinueWith(task => Console.WriteLine("Sum threw: " + task.Exception.InnerException),
TaskContinuationOptions.OnlyOnFaulted);
if (a == b) {
... // false
}
Since calling ContinueWith just adds an item to a collection, a and b should point to the same Task object, so why does a == b return false?
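I guess you could say the answer is yes and no. The Task class keeps its continuations in a single object field, m_continuationObject, and uses Interlocked operations to update that field in a thread-safe manner. The field can hold nothing, a single continuation, or a list of continuations.
You can see the method for adding a continuation named AddTaskContinuationComplex(object tc, bool addBeforeOthers) at line 4727. Within this method, when the field currently holds a single continuation, a new list is created, the existing continuation is added to it, and the list is swapped into the field with Interlocked.CompareExchange.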
// Construct a new TaskContinuation list
List<object> newList = new List<object>();
// Add in the old single value
newList.Add(oldValue);
// Now CAS in the new list
Interlocked.CompareExchange(ref m_continuationObject, newList, oldValue);
Now if you take a look at the FinishContinuations (line 3595) method you can see
// Atomically store the fact that this task is completing. From this point on, the adding of continuations will
// result in the continuations being run/launched directly rather than being added to the continuation list.
object continuationObject = Interlocked.Exchange(ref m_continuationObject, s_taskCompletionSentinel);
TplEtwProvider.Log.RunningContinuation(Id, continuationObject);
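Here the continuation field is atomically swapped for a sentinel value that marks the task as completing, and the previous contents (the single continuation or the list of continuations) are returned so they can be run.
So Interlocked isn't a collection itself; it is only the mechanism used to read and update m_continuationObject in a thread-safe manner. As for a == b: each call to ContinueWith creates and returns a new Task representing that continuation, so a and b are different objects and the comparison is false.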
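A small sketch may make the last point concrete. It registers two continuations on one antecedent and shows that ContinueWith hands back a separate Task each time. The class name ContinuationDemo is hypothetical, and the options from the question are left out so that both continuations run.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    public static void Main()
    {
        Task<int> t = Task.Run(() => 21 * 2);

        // Each ContinueWith call creates and returns its own Task;
        // the antecedent only records that the continuation exists.
        Task a = t.ContinueWith(task => Console.WriteLine("a saw " + task.Result));
        Task b = t.ContinueWith(task => Console.WriteLine("b saw " + task.Result));

        // Two distinct Task objects, so this prints False.
        Console.WriteLine(ReferenceEquals(a, b));

        // Both continuations run once t completes; their relative order is not guaranteed.
        Task.WaitAll(a, b);
    }
}
```

Note that with the question's OnlyOnRanToCompletion and OnlyOnFaulted options, exactly one of the two continuation tasks would end up canceled, so waiting on both would throw.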

How to wait on Async tasks

static void Main(string[] args)
{
Action myAction = async () =>
{
await Task.Delay(5);
Console.WriteLine(Interlocked.Add(ref ExecutionCounter, 1));
};
var actions = new[] { myAction, myAction, myAction };
Task.WaitAll(actions.Select(a => Execute(a)).ToArray()); //This blocks, right?
Console.WriteLine("Done waiting on tasks.");
Console.ReadLine();
}
static int ExecutionCounter = 0;
private static Task Execute(Action a)
{
return Task.Factory.StartNew(async () =>
{
await Task.Delay(5);
a();
});
}
This seems simple enough, but naturally the output always looks like this (the order of the numbers changes, of course):
Done waiting on tasks.
2
1
3
What am I missing here? Why doesn't Task.WaitAll block like I'm expecting it to?
So there are several separate bugs here.
First, for Execute, you're using StartNew with an async lambda. StartNew doesn't have an overload that accepts a Func<Task> and unwraps the result, like Task.Run does, so what you get back is a Task<Task>. The outer task indicates when the asynchronous operation has finished starting, not when it has finished, which means that the Task returned by Execute will be completed basically right away, rather than after Delay finishes or the action you call finishes. Additionally, there's simply no reason to use StartNew or Run at all when running asynchronous methods; you can just execute them normally and await them without pushing them to a thread pool thread.
Next, Execute accepts an Action, which implies that it's a synchronous method that doesn't compute any value. What you're providing is an asynchronous method, but as the delegate doesn't return a Task (it is effectively async void), Execute can't await it. If you want Execute to handle asynchronous methods, it needs to accept a delegate that returns a Task.
So given all of that Execute should look like this.
private static async Task Execute(Func<Task> action)
{
await Task.Delay(TimeSpan.FromMilliseconds(5));
await action();
}
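To illustrate the StartNew point above, here is a hedged sketch comparing StartNew, StartNew followed by Unwrap, and Task.Run with the same async lambda. The class name StartNewDemo and the 500 ms delay are arbitrary choices for the illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class StartNewDemo
{
    public static void Main()
    {
        // StartNew wraps the async lambda's Task: the result is Task<Task>.
        Task<Task> outer = Task.Factory.StartNew(async () => await Task.Delay(500));
        outer.Wait(); // returns as soon as the lambda reaches its first await
        // Typically False here, because the inner delay is still running.
        Console.WriteLine("Inner finished after outer.Wait(): " + outer.Result.IsCompleted);

        // Unwrap() produces a proxy task that completes when the inner task does.
        Task unwrapped = Task.Factory.StartNew(async () => await Task.Delay(500)).Unwrap();
        unwrapped.Wait();
        Console.WriteLine("Unwrapped finished: " + unwrapped.IsCompleted);

        // Task.Run has Func<Task> overloads and unwraps automatically.
        Task viaRun = Task.Run(async () => await Task.Delay(500));
        viaRun.Wait();
        Console.WriteLine("Task.Run finished: " + viaRun.IsCompleted);
    }
}
```

This is only meant to show why the original Execute completed early; the rewritten Execute above avoids the issue entirely by not starting a new task at all.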
Next, onto the Main method. As mentioned before, Execute is accepting an Action when you're trying to provide an async method. This means that when the action is run, the code will continue executing before your actions have finished. You need to adjust it to use a Task-returning method.
After all of that, your code still has a race condition in it, at a conceptual level, that can prevent you from getting the results in the right order. You're performing 3 different operations in parallel, and as a result of that, they can finish in any order. While you are atomically incrementing the counter, it's possible for one thread to increment the counter, then another to run, increment the counter and print its value, and only then for the first thread to run again and print out its value, giving you output like what you have even after fixing all of the bugs mentioned above. To ensure that the values are printed in order, you need to ensure that the increment and the console write are performed together as one atomic step.
Now you can write out your Main method like so:
int ExecutionCounter = 0;
object key = new object();
Func<Task> myAction = async () =>
{
await Task.Delay(TimeSpan.FromMilliseconds(5));
lock (key)
{
Console.WriteLine(++ExecutionCounter);
}
};
var actions = new[] { myAction, myAction, myAction };
Task.WaitAll(actions.Select(a => Execute(a)).ToArray()); //This blocks, right?
And yes, as your comment mentions, calling WaitAll will block, rather than being asynchronous.

C# async aggregate and dispatch

I'm having trouble assimilating the c# Task, async and await patterns.
Windows service, .NET v4.5.2 server-side.
I have a Windows service accepting a variety of sources of incoming records, arriving externally ad-hoc via a self-hosted web api. I would like to batch up these records and then forward them on to another service. If the number of batched records exceeds a threshold, the batch should be dispatched immediately. Furthermore, the batch as it stands should also be dispatched if a time interval has elapsed. This means that a record is never held for more than N seconds.
I'm struggling to fit this into a Task based async pattern.
In days gone by, I would have created a Thread, a ManualResetEvent and a System.Threading.Timer. The Thread would loop around a Wait on the reset event. The Timer would set the event when fired, as would the code doing the aggregation when the batch size exceeded the threshold. Following the Wait, the Thread would stop the Timer, do the dispatch (an HTTP Post), reset the Timer and clear the ManualResetEvent, then loop back and Wait.
However, I am seeing folk say that this is 'bad' as the Wait just blocks a valuable thread resource, and that async/await is my panacea.
First off, are they right? Is my way out-of-date and inefficient or can I JFDI?
I've found examples here for batching and here for tasks at intervals, but not a combination of the two.
Is this requirement actually compatible with async/await?
Actually, you're almost doing the right thing, and they are also partially right.
What you should know is that you should avoid idle threads that spend a long time waiting on events or waiting for I/O to complete (waiting on locks with little contention and fast statement blocks, or spinning loops with compare-and-swap, are usually OK).
What most of them don't know is that tasks are not magic, for instance, Task.Delay uses a Timer (more exactly, a System.Threading.Timer) and waiting on a non-complete task ends up using a ManualResetEventSlim (an improvement over ManualResetEvent, as it doesn't create a Win32 event unless explicitly asked for, e.g. ((IAsyncResult)task).AsyncWaitHandle).
So yes, your requirements are achievable with async/await, or tasks in general.
Runnable example at .NET Fiddle:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
public class Record
{
private int n;
public Record(int n)
{
this.n = n;
}
public int N { get { return n; } }
}
public class RecordReceiver
{
// Arbitrary constants
// You should fetch value from configuration and define sensible defaults
private static readonly int threshold = 5;
// I chose a low value so the example wouldn't timeout in .NET Fiddle
private static readonly TimeSpan timeout = TimeSpan.FromMilliseconds(100);
// I'll use a Stopwatch to trace execution times
private readonly Stopwatch sw = Stopwatch.StartNew();
// Using a separate private object for locking
private readonly object lockObj = new object();
// The list of accumulated records to execute in a batch
private List<Record> records = new List<Record>();
// The most recent TCS to signal completion when:
// - the list count reached the threshold
// - enough time has passed
private TaskCompletionSource<IEnumerable<Record>> batchTcs;
// A CTS to cancel the timer-based task when the threshold is reached
// Not strictly necessary, but it reduces resource usage
private CancellationTokenSource delayCts;
// The task that will be completed when a batch of records has been dispatched
private Task dispatchTask;
// This method doesn't use async/await,
// because we're not doing an async flow here.
public Task ReceiveAsync(Record record)
{
Console.WriteLine("Received record {0} ({1})", record.N, sw.ElapsedMilliseconds);
lock (lockObj)
{
// When the list of records is empty, set up the next task
//
// TaskCompletionSource is just what we need, we'll complete a task
// not when we've finished some computation, but when we reach some criteria
//
// This is the main reason this method doesn't use async/await
if (records.Count == 0)
{
// I want the dispatch task to run on the thread pool
// In .NET 4.6, there's TaskCreationOptions.RunContinuationsAsynchronously
// .NET 4.6
//batchTcs = new TaskCompletionSource<IEnumerable<Record>>(TaskCreationOptions.RunContinuationsAsynchronously);
//dispatchTask = DispatchRecordsAsync(batchTcs.Task);
// Before .NET 4.6, we have to set up a continuation task using the default task scheduler
// .NET 4.5.2
batchTcs = new TaskCompletionSource<IEnumerable<Record>>();
var asyncContinuationsTask = batchTcs.Task
.ContinueWith(bt => bt.Result, TaskScheduler.Default);
dispatchTask = DispatchRecordsAsync(asyncContinuationsTask);
// Create a cancellation token source to be able to cancel the timer
//
// To be used when we reach the threshold, to release timer resources
delayCts = new CancellationTokenSource();
Task.Delay(timeout, delayCts.Token)
.ContinueWith(
dt =>
{
// When we hit the timer, take the lock and set the batch
// task as complete, moving the current records to its result
lock (lockObj)
{
// Avoid dispatching an empty list of records
//
// Also avoid a race condition by checking the cancellation token
//
// The race would be for the actual timer function to start before
// we had a chance to cancel it
if ((records.Count > 0) && !delayCts.IsCancellationRequested)
{
batchTcs.TrySetResult(new List<Record>(records));
records.Clear();
}
}
},
// Since our continuation function is fast, we want it to run
// ASAP on the same thread where the actual timer function runs
//
// Note: this is just a hint, but I trust it'll be favored most of the time
TaskContinuationOptions.ExecuteSynchronously);
// Remember that we want our batch task to have continuations
// running outside the timer thread, since dispatching records
// is probably too much work for a timer thread.
}
// Actually store the new record somewhere
records.Add(record);
// When we reach the threshold, set the batch task as complete,
// moving the current records to its result
//
// Also, cancel the timer task
if (records.Count >= threshold)
{
batchTcs.TrySetResult(new List<Record>(records));
delayCts.Cancel();
records.Clear();
}
// Return the last saved dispatch continuation task
//
// It'll start after either the timer or the threshold,
// but more importantly, it'll complete after it dispatches all records
return dispatchTask;
}
}
// This method uses async/await, since we want to use the async flow
internal async Task DispatchRecordsAsync(Task<IEnumerable<Record>> batchTask)
{
// We expect it to return a task right here, since the batch task hasn't had
// a chance to complete when the first record arrives
//
// Task.ConfigureAwait(false) allows us to run synchronously and on the same thread
// as the completer, but again, this is just a hint
//
// Remember we've set our task to run completions on the thread pool?
//
// With .NET 4.6, completing a TaskCompletionSource created with
// TaskCreationOptions.RunContinuationsAsynchronously will start scheduling
// continuations either on their captured SynchronizationContext or TaskScheduler,
// or forced to use TaskScheduler.Default
//
// Before .NET 4.6, completing a TaskCompletionSource could mean
// that continuations ran within the completer, especially when
// Task.ConfigureAwait(false) was used on an async awaiter, or when
// Task.ContinueWith(..., TaskContinuationOptions.ExecuteSynchronously) was used
// to set up a continuation
//
// That's why, before .NET 4.6, we need to actually run a task for that effect,
// and we used Task.ContinueWith without TaskContinuationOptions.ExecuteSynchronously
// and with TaskScheduler.Default, to ensure it gets scheduled
//
// So, why am I using Task.ConfigureAwait(false) here anyway?
// Because it'll make a difference if this method is run from within
// a Windows Forms or WPF thread, or any thread with a SynchronizationContext
// or TaskScheduler that schedules tasks on a dedicated thread
var batchedRecords = await batchTask.ConfigureAwait(false);
// Async methods are transformed into state machines,
// much like iterator methods, but with async specifics
//
// What await actually does is:
// - check if the awaitable is complete
// - if so, continue executing
// Note: if every awaited awaitable is complete along an async method,
// the method will complete synchronously
// This is only expectable with tasks that have already completed
// or I/O that is always ready, e.g. MemoryStream
// - if not, return a task and schedule a continuation for just after the await expression
// Note: the continuation will resume the state machine on the next state
// Note: the returned task will complete on return or on exception,
// but that is something the compiled state machine will handle
foreach (var record in batchedRecords)
{
Console.WriteLine("Dispatched record {0} ({1})", record.N, sw.ElapsedMilliseconds);
// I used Task.Yield as a replacement for actual work
//
// It'll force the async state machine to always return here
// and schedule a continuation that reenters the async state machine right afterwards
//
// This is not something you usually want on production code,
// so please replace this with the actual dispatch
await Task.Yield();
}
}
}
public class Program
{
public static void Main()
{
// Our main entry point is synchronous, so we run an async entry point and wait on it
//
// The difference between MainAsync().Result and MainAsync().GetAwaiter().GetResult()
// is in the way exceptions are thrown:
// - the former aggregates exceptions, throwing an AggregateException
// - the latter doesn't aggregate exceptions if it doesn't have to, throwing the actual exception
//
// Since I'm not combining tasks (e.g. Task.WhenAll), I'm not expecting multiple exceptions
//
// If my main method returned int, I could return the task's result
// and I'd make MainAsync return Task<int> instead of just Task
MainAsync().GetAwaiter().GetResult();
}
// Async entry point
public static async Task MainAsync()
{
var receiver = new RecordReceiver();
// I'll provide a few records:
// - a delay big enough between the 1st and the 2nd such that the 1st will be dispatched
// - 8 records in a row, such that 5 of them will be dispatched, and 3 of them will wait
// - again, a delay big enough that will provoke the last 3 records to be dispatched
// - and a final record, which will wait to be dispatched
//
// We await for Task.Delay between providing records,
// but we'll await for the records in the end only
//
// That is, we'll not await each record before the next,
// as that would mean each record would only be dispatched after at least the timeout
var t1 = receiver.ReceiveAsync(new Record(1));
await Task.Delay(TimeSpan.FromMilliseconds(300));
var t2 = receiver.ReceiveAsync(new Record(2));
var t3 = receiver.ReceiveAsync(new Record(3));
var t4 = receiver.ReceiveAsync(new Record(4));
var t5 = receiver.ReceiveAsync(new Record(5));
var t6 = receiver.ReceiveAsync(new Record(6));
var t7 = receiver.ReceiveAsync(new Record(7));
var t8 = receiver.ReceiveAsync(new Record(8));
var t9 = receiver.ReceiveAsync(new Record(9));
await Task.Delay(TimeSpan.FromMilliseconds(300));
var t10 = receiver.ReceiveAsync(new Record(10));
// I probably should have used a list of records, but this is just an example
await Task.WhenAll(t1, t2, t3, t4, t5, t6, t7, t8, t9, t10);
}
}
You can make this more interesting, for instance by returning a distinct task per record from ReceiveAsync, such as a Task<RecordDispatchReport>, which is completed by the processing part of DispatchRecordsAsync using a TaskCompletionSource for each record.

Execute set of tasks in parallel but with a group timeout

I'm currently trying to write a status checking tool with a reliable timeout value. One way I'd seen how to do this was using Task.WhenAny() and including a Task.Delay, however it doesn't seem to produce the results I expect:
public void DoIUnderstandTasksTest()
{
var checkTasks = new List<Task>();
// Create a list of dummy tasks that should just delay or "wait"
// for some multiple of the timeout
for (int i = 0; i < 10; i++)
{
checkTasks.Add(Task.Delay(_timeoutMilliseconds/2));
}
// Wrap the group of tasks in a task that will wait till they all finish
var allChecks = Task.WhenAll(checkTasks);
// I think WhenAny is supposed to return the first task that completes
bool didntTimeOut = Task.WhenAny(allChecks, Task.Delay(_timeoutMilliseconds)) == allChecks;
Assert.True(didntTimeOut);
}
What am I missing here?
I think you're confusing the workings of the When... calls with Wait....
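Task.WhenAny doesn't directly return the first task to complete among those you pass to it. Rather, it returns a new Task<Task> that will be completed when any of the supplied tasks finishes, and whose Result is that first completed task. This means your equality check will always return false: you are comparing the wrapper task with allChecks, and the wrapper will never equal it. You would have to await the wrapper (or read its Result) before comparing.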
The behavior you're expecting seems similar to Task.WaitAny, which will block current execution until any of the internal tasks complete, and return the index of the completed task.
Using WaitAny, your code will look like this:
// Wrap the group of tasks in a task that will wait till they all finish
var allChecks = Task.WhenAll(checkTasks);
var taskIndexThatCompleted = Task.WaitAny(allChecks, Task.Delay(_timeoutMilliseconds));
Assert.AreEqual(0, taskIndexThatCompleted);
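If blocking is not wanted, the same check can be written asynchronously by awaiting WhenAny and comparing the task it yields. The following is a sketch under that assumption; the names TimeoutDemo and AllCompleteWithinAsync, and the 100 ms and 2000 ms durations, are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Returns true if every task finished before the timeout elapsed.
    public static async Task<bool> AllCompleteWithinAsync(
        IEnumerable<Task> tasks, int timeoutMilliseconds)
    {
        Task allChecks = Task.WhenAll(tasks);

        // Awaiting WhenAny yields the task that completed first.
        Task first = await Task.WhenAny(allChecks, Task.Delay(timeoutMilliseconds));
        return first == allChecks;
    }

    public static void Main()
    {
        // Ten dummy checks that each take well under the timeout.
        var checkTasks = new List<Task>();
        for (int i = 0; i < 10; i++)
        {
            checkTasks.Add(Task.Delay(100));
        }

        bool didntTimeOut = AllCompleteWithinAsync(checkTasks, 2000).GetAwaiter().GetResult();
        Console.WriteLine(didntTimeOut);
    }
}
```

Keep in mind that neither variant cancels the checks that are still running when the timeout wins; they simply stop being waited on.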

c# Executing Multiple calls in Parallel

I'm looping through an Array of values, for each value I want to execute a long running process. Since I have multiple tasks to be performed that have no inter dependency I want to be able to execute them in parallel.
My code is:
List<Task<bool>> dependantTasksQuery = new List<Task<bool>>();
foreach (int dependantID in dependantIDList)
{
dependantTasksQuery.Add(WaitForDependantObject(dependantID));
}
Task<bool>[] dependantTasks = dependantTasksQuery.ToArray();
//Wait for all dependant tasks to complete
bool[] lengths = await Task.WhenAll(dependantTasks);
The WaitForDependantObject method just looks like:
async Task<bool> WaitForDependantObject(int idVal)
{
System.Threading.Thread.Sleep(20000);
bool waitDone = true;
return waitDone;
}
As you can see I've just added a sleep to highlight my issue. What is happening when debugging is that on the line:
dependantTasksQuery.Add(WaitForDependantObject(dependantID));
My code is stopping and waiting the 20 seconds for the method to complete. I did not want to start the execution until I had completed the loop and built up the Array. Can somebody point me to what I'm doing wrong? I'm pretty sure I need an await somewhere
In your case WaitForDependantObject isn't asynchronous at all even though it returns a task. If that's your goal, do as Luke Willis suggests. To make these calls both asynchronous and truly parallel, you need to offload them to a thread pool thread with Task.Run:
bool[] lengths = await Task.WhenAll(dependantIDList.Select(dependantID => Task.Run(() => WaitForDependantObject(dependantID))));
async methods run synchronously until an await on an incomplete operation is reached, and then return a task representing the asynchronous operation. In your case you don't have an await, so the methods simply execute one after the other. Task.Run uses multiple threads to enable parallelism even on these synchronous parts, on top of the concurrency of awaiting all the tasks together with Task.WhenAll.
For WaitForDependantObject to represent an async method more accurately it should look like this:
async Task<bool> WaitForDependantObject(int idVal)
{
await Task.Delay(20000);
return true;
}
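The difference between the two versions can be observed by timing how long each call takes to hand back its task. This is a rough sketch; the class and method names (SyncPartDemo, BlockingAsync, TrulyAsync) and the 500 ms durations are hypothetical stand-ins for the code above.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class SyncPartDemo
{
    // No await: the whole body runs on the caller's thread before a task is returned.
    // (The compiler warns that this async method lacks an await.)
    static async Task<bool> BlockingAsync()
    {
        Thread.Sleep(500);
        return true;
    }

    // The first await on an incomplete task returns control to the caller.
    static async Task<bool> TrulyAsync()
    {
        await Task.Delay(500);
        return true;
    }

    public static void Main()
    {
        var sw = Stopwatch.StartNew();
        Task<bool> blocking = BlockingAsync();
        // Expect roughly the full 500 ms here: the call itself blocked.
        Console.WriteLine("BlockingAsync returned after ~{0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        Task<bool> asynchronous = TrulyAsync();
        // Expect a value close to zero: the call returned at the first await.
        Console.WriteLine("TrulyAsync returned after ~{0} ms", sw.ElapsedMilliseconds);

        Task.WaitAll(blocking, asynchronous);
    }
}
```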
Use Task.Delay to make the method asynchronous; this version is a more realistic stand-in for the mocked code:
async Task<bool> WaitForDependantObject(int idVal)
{
// how long synchronous part of method takes (before first await)
System.Threading.Thread.Sleep(1000);
// method returns as soon as awaiting starts
await Task.Delay(2000); // how long IO or other async operation takes place
// simulate data processing; resumes on a thread pool thread unless a
// SynchronizationContext was captured (WPF/WinForms/ASP.NET) and ConfigureAwait(false) was not used on the await.
System.Threading.Thread.Sleep(1000);
bool waitDone = true;
return waitDone;
}
You can do this using Task.Factory.StartNew.
Replace this:
dependantTasksQuery.Add(WaitForDependantObject(dependantID));
with this:
dependantTasksQuery.Add(
Task.Factory.StartNew(
() => WaitForDependantObject(dependantID)
)
);
This will run your method within a new Task and add the task to your List.
You will also want to change the method signature of WaitForDependantObject to be:
bool WaitForDependantObject(int idVal)
You can then wait for your tasks to complete with:
Task.WaitAll(dependantTasksQuery.ToArray());
And get your results with:
bool[] lengths = dependantTasksQuery.Select(task => task.Result).ToArray();
