I want to chain Tasks, then start the chain in parallel.
This snippet is just to illustrate my question:
var taskOrig = new Task(() => { });
var task = taskOrig;
foreach (var msg in messages)
{
task=task.ContinueWith(t => Console.WriteLine(msg));
}
taskOrig.Start();
Everything works fine except a little perfectionist inside me doesn't like having empty method executed first () => { }.
Is there any way to avoid it?
I do understand It's barely affecting performance (unless you do it really often), but still. Performance matters in my case, so checking if task exists in every iteration is not the way to do it.
You could do this:
Task task = Task.FromResult<object>(null);
foreach (var msg in messages)
{
task = task.ContinueWith(t => Console.WriteLine(msg));
}
The previous solution won't work in 4.0. In 4.0 you'd need to do the following instead:
var tcs = new TaskCompletionSource<object>();
Task task = tcs.Task;
foreach (var msg in messages)
{
task = task.ContinueWith(t => Console.WriteLine(msg));
}
tcs.SetResult(null);
(You can move SetResult to before the foreach loop if you prefer.)
Technically it's not the same as the continuations will start executing while you're still adding more. That's unlikely to be a problem though.
Another option would be to use something like this:
public static Task ForEachAsync<T>(IEnumerable<T> items, Action<T> action)
{
return Task.Factory.StartNew(() =>
{
foreach (T item in items)
{
action(item);
}
});
}
An example usage would be:
ForEachAsync(messages, msg => Console.WriteLine(msg));
One way to do that, is to create task in loop if it is null, but the code you provide looks better for me:
Task task = null;
foreach (var msg in messages)
{
if (task == null)
task = new Task(() => Console.WriteLine(msg))
else
task = task.ContinueWith(t => Console.WriteLine(msg));
}
task.Start();
Perhaps this:
if(messages.Length > 0)
{
Task task = new Task(t => Console.WriteLine(messages[0]));
for(int i = 1; i < messages.Length; i++)
{
task = task.ContinueWith(t => Console.WriteLine(messages[i]));
}
task.Start();
}
Related
I have a collection of 1000 input message to process. I'm looping the input collection and starting the new task for each message to get processed.
//Assume this messages collection contains 1000 items
var messages = new List<string>();
foreach (var msg in messages)
{
Task.Factory.StartNew(() =>
{
Process(msg);
});
}
Can we guess how many maximum messages simultaneously get processed at the time (assuming normal Quad core processor), or can we limit the maximum number of messages to be processed at the time?
How to ensure this message get processed in the same sequence/order of the Collection?
You could use Parallel.Foreach and rely on MaxDegreeOfParallelism instead.
Parallel.ForEach(messages, new ParallelOptions {MaxDegreeOfParallelism = 10},
msg =>
{
// logic
Process(msg);
});
SemaphoreSlim is a very good solution in this case and I higly recommend OP to try this, but #Manoj's answer has flaw as mentioned in comments.semaphore should be waited before spawning the task like this.
Updated Answer: As #Vasyl pointed out Semaphore may be disposed before completion of tasks and will raise exception when Release() method is called so before exiting the using block must wait for the completion of all created Tasks.
int maxConcurrency=10;
var messages = new List<string>();
using(SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
List<Task> tasks = new List<Task>();
foreach(var msg in messages)
{
concurrencySemaphore.Wait();
var t = Task.Factory.StartNew(() =>
{
try
{
Process(msg);
}
finally
{
concurrencySemaphore.Release();
}
});
tasks.Add(t);
}
Task.WaitAll(tasks.ToArray());
}
Answer to Comments
for those who want to see how semaphore can be disposed without Task.WaitAll
Run below code in console app and this exception will be raised.
System.ObjectDisposedException: 'The semaphore has been disposed.'
static void Main(string[] args)
{
int maxConcurrency = 5;
List<string> messages = Enumerable.Range(1, 15).Select(e => e.ToString()).ToList();
using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
List<Task> tasks = new List<Task>();
foreach (var msg in messages)
{
concurrencySemaphore.Wait();
var t = Task.Factory.StartNew(() =>
{
try
{
Process(msg);
}
finally
{
concurrencySemaphore.Release();
}
});
tasks.Add(t);
}
// Task.WaitAll(tasks.ToArray());
}
Console.WriteLine("Exited using block");
Console.ReadKey();
}
private static void Process(string msg)
{
Thread.Sleep(2000);
Console.WriteLine(msg);
}
I think it would be better to use Parallel LINQ
Parallel.ForEach(messages ,
new ParallelOptions{MaxDegreeOfParallelism = 4},
x => Process(x);
);
where x is the MaxDegreeOfParallelism
With .NET 5.0 and Core 3.0 channels were introduced.
The main benefit of this producer/consumer concurrency pattern is that you can also limit the input data processing to reduce resource impact.
This is especially helpful when processing millions of data records.
Instead of reading the whole dataset at once into memory, you can now consecutively query only chunks of the data and wait for the workers to process it before querying more.
Code sample with a queue capacity of 50 messages and 5 consumer threads:
/// <exception cref="System.AggregateException">Thrown on Consumer Task exceptions.</exception>
public static async Task ProcessMessages(List<string> messages)
{
const int producerCapacity = 10, consumerTaskLimit = 3;
var channel = Channel.CreateBounded<string>(producerCapacity);
_ = Task.Run(async () =>
{
foreach (var msg in messages)
{
await channel.Writer.WriteAsync(msg);
// blocking when channel is full
// waiting for the consumer tasks to pop messages from the queue
}
channel.Writer.Complete();
// signaling the end of queue so that
// WaitToReadAsync will return false to stop the consumer tasks
});
var tokenSource = new CancellationTokenSource();
CancellationToken ct = tokenSource.Token;
var consumerTasks = Enumerable
.Range(1, consumerTaskLimit)
.Select(_ => Task.Run(async () =>
{
try
{
while (await channel.Reader.WaitToReadAsync(ct))
{
ct.ThrowIfCancellationRequested();
while (channel.Reader.TryRead(out var message))
{
await Task.Delay(500);
Console.WriteLine(message);
}
}
}
catch (OperationCanceledException) { }
catch
{
tokenSource.Cancel();
throw;
}
}))
.ToArray();
Task waitForConsumers = Task.WhenAll(consumerTasks);
try { await waitForConsumers; }
catch
{
foreach (var e in waitForConsumers.Exception.Flatten().InnerExceptions)
Console.WriteLine(e.ToString());
throw waitForConsumers.Exception.Flatten();
}
}
As pointed out by Theodor Zoulias:
On multiple consumer exceptions, the remaining tasks will continue to run and have to take the load of the killed tasks. To avoid this, I implemented a CancellationToken to stop all the remaining tasks and handle the exceptions combined in the AggregateException of waitForConsumers.Exception.
Side note:
The Task Parallel Library (TPL) might be good at automatically limiting the tasks based on your local resources. But when you are processing data remotely via RPC, it's necessary to manually limit your RPC calls to avoid filling the network/processing stack!
If your Process method is async you can't use Task.Factory.StartNew as it doesn't play well with an async delegate. Also there are some other nuances when using it (see this for example).
The proper way to do it in this case is to use Task.Run. Here's #ClearLogic answer modified for an async Process method.
static void Main(string[] args)
{
int maxConcurrency = 5;
List<string> messages = Enumerable.Range(1, 15).Select(e => e.ToString()).ToList();
using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
List<Task> tasks = new List<Task>();
foreach (var msg in messages)
{
concurrencySemaphore.Wait();
var t = Task.Run(async () =>
{
try
{
await Process(msg);
}
finally
{
concurrencySemaphore.Release();
}
});
tasks.Add(t);
}
Task.WaitAll(tasks.ToArray());
}
Console.WriteLine("Exited using block");
Console.ReadKey();
}
private static async Task Process(string msg)
{
await Task.Delay(2000);
Console.WriteLine(msg);
}
You can create your own TaskScheduler and override QueueTask there.
protected virtual void QueueTask(Task task)
Then you can do anything you like.
One example here:
Limited concurrency level task scheduler (with task priority) handling wrapped tasks
You can simply set the max concurrency degree like this way:
int maxConcurrency=10;
var messages = new List<1000>();
using(SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
foreach(var msg in messages)
{
Task.Factory.StartNew(() =>
{
concurrencySemaphore.Wait();
try
{
Process(msg);
}
finally
{
concurrencySemaphore.Release();
}
});
}
}
If you need in-order queuing (processing might finish in any order), there is no need for a semaphore. Old fashioned if statements work fine:
const int maxConcurrency = 5;
List<Task> tasks = new List<Task>();
foreach (var arg in args)
{
var t = Task.Run(() => { Process(arg); } );
tasks.Add(t);
if(tasks.Count >= maxConcurrency)
Task.WaitAny(tasks.ToArray());
}
Task.WaitAll(tasks.ToArray());
I ran into a similar problem where I wanted to produce 5000 results while calling apis, etc. So, I ran some speed tests.
Parallel.ForEach(products.Select(x => x.KeyValue).Distinct().Take(100), id =>
{
new ParallelOptions { MaxDegreeOfParallelism = 100 };
GetProductMetaData(productsMetaData, client, id).GetAwaiter().GetResult();
});
produced 100 results in 30 seconds.
Parallel.ForEach(products.Select(x => x.KeyValue).Distinct().Take(100), id =>
{
new ParallelOptions { MaxDegreeOfParallelism = 100 };
GetProductMetaData(productsMetaData, client, id);
});
Moving the GetAwaiter().GetResult() to the individual async api calls inside GetProductMetaData resulted in 14.09 seconds to produce 100 results.
foreach (var id in ids.Take(100))
{
GetProductMetaData(productsMetaData, client, id);
}
Complete non-async programming with the GetAwaiter().GetResult() in api calls resulted in 13.417 seconds.
var tasks = new List<Task>();
while (y < ids.Count())
{
foreach (var id in ids.Skip(y).Take(100))
{
tasks.Add(GetProductMetaData(productsMetaData, client, id));
}
y += 100;
Task.WhenAll(tasks).GetAwaiter().GetResult();
Console.WriteLine($"Finished {y}, {sw.Elapsed}");
}
Forming a task list and working through 100 at a time resulted in a speed of 7.36 seconds.
using (SemaphoreSlim cons = new SemaphoreSlim(10))
{
var tasks = new List<Task>();
foreach (var id in ids.Take(100))
{
cons.Wait();
var t = Task.Factory.StartNew(() =>
{
try
{
GetProductMetaData(productsMetaData, client, id);
}
finally
{
cons.Release();
}
});
tasks.Add(t);
}
Task.WaitAll(tasks.ToArray());
}
Using SemaphoreSlim resulted in 13.369 seconds, but also took a moment to boot to start using it.
var throttler = new SemaphoreSlim(initialCount: take);
foreach (var id in ids)
{
throttler.WaitAsync().GetAwaiter().GetResult();
tasks.Add(Task.Run(async () =>
{
try
{
skip += 1;
await GetProductMetaData(productsMetaData, client, id);
if (skip % 100 == 0)
{
Console.WriteLine($"started {skip}/{count}, {sw.Elapsed}");
}
}
finally
{
throttler.Release();
}
}));
}
Using Semaphore Slim with a throttler for my async task took 6.12 seconds.
The answer for me in this specific project was use a throttler with Semaphore Slim. Although the while foreach tasklist did sometimes beat the throttler, 4/6 times the throttler won for 1000 records.
I realize I'm not using the OPs code, but I think this is important and adds to this discussion because how is sometimes not the only question that should be asked, and the answer is sometimes "It depends on what you are trying to do."
Now to answer the specific questions:
How to limit the maximum number of parallel tasks in c#: I showed how to limit the number of tasks that are completed at a time.
Can we guess how many maximum messages simultaneously get processed at the time (assuming normal Quad core processor), or can we limit the maximum number of messages to be processed at the time? I cannot guess how many will be processed at a time unless I set an upper limit but I can set an upper limit. Obviously different computers function at different speeds due to CPU, RAM etc. and how many threads and cores the program itself has access to as well as other programs running in tandem on the same computer.
How to ensure this message get processed in the same sequence/order of the Collection? If you want to process everything in a specific order, it is synchronous programming. The point of being able to run things asynchronously is ensuring that they can do everything without an order. As you can see from my code, the time difference is minimal in 100 records unless you use async code. In the event that you need an order to what you are doing, use asynchronous programming up until that point, then await and do things synchronously from there. For example, task1a.start, task2a.start, then later task1a.await, task2a.await... then later task1b.start task1b.await and task2b.start task 2b.await.
public static void RunTasks(List<NamedTask> importTaskList)
{
List<NamedTask> runningTasks = new List<NamedTask>();
try
{
foreach (NamedTask currentTask in importTaskList)
{
currentTask.Start();
runningTasks.Add(currentTask);
if (runningTasks.Where(x => x.Status == TaskStatus.Running).Count() >= MaxCountImportThread)
{
Task.WaitAny(runningTasks.ToArray());
}
}
Task.WaitAll(runningTasks.ToArray());
}
catch (Exception ex)
{
Log.Fatal("ERROR!", ex);
}
}
you can use the BlockingCollection, If the consume collection limit has reached, the produce will stop producing until a consume process will finish. I find this pattern more easy to understand and implement than the SemaphoreSlim.
int TasksLimit = 10;
BlockingCollection<Task> tasks = new BlockingCollection<Task>(new ConcurrentBag<Task>(), TasksLimit);
void ProduceAndConsume()
{
var producer = Task.Factory.StartNew(RunProducer);
var consumer = Task.Factory.StartNew(RunConsumer);
try
{
Task.WaitAll(new[] { producer, consumer });
}
catch (AggregateException ae) { }
}
void RunConsumer()
{
foreach (var task in tasks.GetConsumingEnumerable())
{
task.Start();
}
}
void RunProducer()
{
for (int i = 0; i < 1000; i++)
{
tasks.Add(new Task(() => Thread.Sleep(1000), TaskCreationOptions.AttachedToParent));
}
}
Note that the RunProducer and RunConsumer has spawn two independent tasks.
I'm new to C# threads and tasks and I'm trying to develop a workflow but without success probably because I'm mixing tasks with for iterations...
The point is:
I've got a bunch of lists, and inside each one there are some things to do, and need to make them work as much parallel and less blocking possible, and as soon as each subBunchOfThingsTodo is done ( it means every thing to do inside it is done parallely) it has do some business(DoSomethingAfterEveryThingToDoOfThisSubBunchOfThingsAreDone()).
e.g:
bunchOfSubBunchsOfThingsTodo
subBunchOfThingsTodo
ThingToDo1
ThingToDo2
subBunchOfThingsTodo
ThingToDo1
ThingToDo2
ThingToDo3
subBunchOfThingsTodo
ThingToDo1
ThingToDo2...
This is how I'm trying but unfortunately each iteration waits the previous one bunchOfThingsToDo and I need them to work in parallel.
The same happens to the things to do , they wait the previous thing to start...
List<X> bunchOfSubBunchsOfThingsTodo = getBunchOfSubBunchsOfThingsTodo();
foreach (var subBunchOfThingsToDo in bunchOfSubBunchsOfThingsTodo)
{
int idSubBunchOfThingsToDo = subBunchOfThingsToDo.ThingsToDo.FirstOrDefault().IdSubBunchOfThingsToDo;
var parent = Task.Factory.StartNew(() =>
{
foreach (var thingToDo in subBunchOfThingsToDo.ThingsToDo)
{
var child = Task.Factory.StartNew(() =>
{
//Do some stuff with thingToDo... Here I call several business methods
});
}
});
parent.Wait();
DoSomethingAfterEveryThingToDoOfThisSubBunchOfThingsAreDone(idSubBunchOfThingsToDo);
}
You may want to try using Task.WhenAll and playing with linq to generate a collection of hot tasks:
static async void ProcessThingsToDo(IEnumerable<ThingToDo> bunchOfThingsToDo)
{
IEnumerable<Task> GetSubTasks(ThingToDo thing)
=> thing.SubBunchOfThingsToDo.Select( async subThing => await Task.Run(subThing));
var tasks = bunchOfThingsToDo
.Select(async thing => await Task.WhenAll(GetSubTasks(thing)));
await Task.WhenAll(tasks);
}
This way you are running each subThingToDo on a separate task and you get only one Task composed by all subtasks for each thingToDo
EDIT
ThingToDo is a rather simple class in this sample:
class ThingToDo
{
public IEnumerable<Action> SubBunchOfThingsToDo { get; }
}
With minimum changes of your code you can try this way:
var toWait = new List<Task>();
List<X> bunchOfSubBunchsOfThingsTodo = getBunchOfSubBunchsOfThingsTodo();
foreach (var subBunchOfThingsToDo in bunchOfSubBunchsOfThingsTodo)
{
int idSubBunchOfThingsToDo = subBunchOfThingsToDo.ThingsToDo.FirstOrDefault().IdSubBunchOfThingsToDo;
var parent = Task.Factory.StartNew(() =>
{
Parallel.ForEach(subBunchOfThingsToDo.ThingsToDo,
thingToDo =>
{
//Do some stuff with thingToDo... Here I call several business methods
});
});
//parent.Wait();
var handle = parent.ContinueWith((x) =>
{
DoSomethingAfterEveryThingToDoOfThisSubBunchOfThingsAreDone(idSubBunchOfThingsToDo);
})
.Start();
toWait.Add(handle);
}
Task.WhenAll(toWait);
Thanks to downvoters team, that advised 'good' solution:
var bunchOfSubBunchsOfThingsTodo = getBunchOfSubBunchsOfThingsTodo();
var toWait = bunchOfSubBunchsOfThingsTodo
.Select(subBunchOfThingsToDo =>
{
return Task.Run(() =>
{
int idSubBunchOfThingsToDo = subBunchOfThingsToDo.ThingsToDo.FirstOrDefault().IdSubBunchOfThingsToDo;
Parallel.ForEach(subBunchOfThingsToDo.ThingsToDo,
thingToDo =>
{
//Do some stuff with thingToDo... Here I call several business methods
});
DoSomethingAfterEveryThingToDoOfThisSubBunchOfThingsAreDone(idSubBunchOfThingsToDo);
});
});
Task.WhenAll(toWait);
I'm having trouble trying to correctly architect the most efficient way to iterate several async tasks launched from a request object and then performing some other async tasks that depend on both the request object and the result of the first async task. I'm running a C# lambda function in AWS. I've tried a model like this (error handling and such has been omitted for brevity):
public async Task MyAsyncWrapper()
{
List<Task> Tasks = new List<Task>();
foreach (var Request in Requests)
{
var Continuation = this.ExecuteAsync(Request).ContinueWith(async x => {
var KeyValuePair<bool, string> Result = x.Result;
if (Result.Key == true)
{
await this.DoSomethingElseAsync(Request.Id, Request.Name, Result.Value);
Console.WriteLine("COMPLETED");
}
}
Tasks.Add(Continuation);
}
Task.WaitAll(Tasks.ToArray());
}
This approach results in the DoSomethingElseAsync() method not really getting awaited on and in a lot of my Lambda Function calls, I never get the "COMPLETED" output. I've also approached this in this method:
public async Task MyAsyncWrapper()
{
foreach (var Request in Requests)
{
KeyValuePair<bool, string> Result = await this.ExecuteAsync(Request);
if (Result.Key == true)
{
await this.DoSomethingElseAsync(Request.Id, Request.Name, Result.Value);
Console.WriteLine("COMPLETED");
}
}
}
This works, but I think it's wasteful, since I can only execute one iteration of the loop while waiting on the asnyc's to finish. I also have referenced Interleaved Tasks but the issue is that I basically have two loops, one to populate the tasks, and another to iterate them after they've completed, where I don't have access to the original Request object anymore. So basically this:
List<Task<KeyValuePair<bool, string>>> Tasks = new List<Task<KeyValuePair<bool, string>>>();
foreach (var Request in Requests)
{
Tasks.Add(ths.ExecuteAsync(Request);
}
foreach (Task<KeyValuePair<bool, string>> ResultTask in Tasks.Interleaved())
{
KeyValuePair<bool, string> Result = ResultTask.Result;
//Can't access the original request for this method's parameters
await this.DoSomethingElseAsync(???, ???, Result.Value);
}
Any ideas on better ways to implement this type of async chaining in a foreach loop? My ideal approach wouldn't be to return the request object back as part of the response from ExecuteAsync(), so I'd like to try and find other options if possible.
I may be misinterpreting, but why not move your "iteration" into it's own function and then use Task.WhenAll to wait for all iterations in parallel.
public async Task MyAsyncWrapper()
{
var allTasks = Requests.Select(ProcessRequest);
await Task.WhenAll(allTasks);
}
private async Task ProcessRequest(Request request)
{
KeyValuePair<bool, string> Result = await this.ExecuteAsync(request);
if (Result.Key == true)
{
await this.DoSomethingElseAsync(request.Id, request.Name, Result.Value);
Console.WriteLine("COMPLETED");
}
}
Consider using TPL dataflow:
var a = new TransformBlock<Input, OutputA>(async Input i=>
{
// do something async.
return new OutputA();
});
var b = new TransformBlock<OutputA, OutputB>(async OutputA i =>
{
// do more async.
return new OutputB();
});
var c = new ActionBlock<OutputB>(async OutputB i =>
{
// do some final async.
});
a.LinkTo(b, new DataflowLinkOptions { PropogateCompletion = true });
b.LinkTo(c, new DataflowLinkOptions { PropogateCompletion = true });
// push all of the items into the dataflow.
a.Post(new Input());
a.Complete();
// wait for it all to complete.
await c.Completion;
I want to create a collection of awaitable tasks, so that I can start them together and asynchronously process the result from each one as they complete.
I have this code, and a compilation error:
> cannot assign void to an implicitly-typed variable
If I understand well, the tasks return by Select don't have a return type, even though the delegate passed returns ColetaIsisViewModel, I would think:
public MainViewModel()
{
Task.Run(LoadItems);
}
async Task LoadItems()
{
IEnumerable<Task> tasks = Directory.GetDirectories(somePath)
.Select(dir => new Task(() =>
new ItemViewModel(new ItemSerializer().Deserialize(dir))));
foreach (var task in tasks)
{
var result = await task; // <-- here I get the compilation error
DoSomething(result);
}
}
You shouldn't ever use the Task constructor.
Since you're calling synchronous code (Deserialize), you could use Task.Run:
async Task LoadItems()
{
var tasks = Directory.GetDirectories(somePath)
.Select(dir => Task.Run(() =>
new ItemViewModel(new ItemSerializer().Deserialize(dir))));
foreach (var task in tasks)
{
var result = await task;
DoSomething(result);
}
}
Alternatively, you could use Parallel or Parallel LINQ:
void LoadItems()
{
var vms = Directory.GetDirectories(somePath)
.AsParallel().Select(dir =>
new ItemViewModel(new ItemSerializer().Deserialize(dir)))
.ToList();
foreach (var vm in vms)
{
DoSomething(vm);
}
}
Or, if you make Deserialize a truly async method, then you can make it all asynchronous:
async Task LoadItems()
{
var tasks = Directory.GetDirectories(somePath)
.Select(async dir =>
new ItemViewModel(await new ItemSerializer().DeserializeAsync(dir))));
foreach (var task in tasks)
{
var result = await task;
DoSomething(result);
}
}
Also, I recommend that you do not use fire-and-forget in your constructor. There are better patterns for asynchronous constructors.
I know the question has been answered, but you can always do this too:
var serializer = new ItemSerializer();
var directories = Directory.GetDirectories(somePath);
foreach (string directory in directories)
{
await Task.Run(() => serializer.Deserialize(directory))
.ContinueWith(priorTask => DoSomething(priorTask.Result));
}
Notice I pulled out the serializer instantiation (assuming there are no side effects).
I hope this makes sense - Suppose I have the following code:
Task.Run(() =>
{
return Task.WhenAll
(
Task1,
Task2,
...
Taskn
)
.ContinueWith(tsks=>
{
TaskA (uses output from Tasks Task1 & Task2, say)
}
, ct)
.ContinueWith(res =>
{
TaskB (uses output from TaskA and Task3, say)
}
, ct);
});
So I want all my first N tasks to run concurrently (since we have no interdependencies), then only once they're all finished, to continue with a task that relies on their outputs (I get that for this, I can use the tsks.Result).
BUT THEN I want to continue with a task that relies on one of the first tasks and the result of TaskA.
I'm a bit lost how to structure my code correctly so I can access the results of my first set of tasks outside of the immediately proceeding ContinueWith.
My one thought was to assign return value to them within my method - Something like:
... declare variables outside of Tasks ...
Task.Run(() =>
{
return Task.WhenAll
(
Task.Run(() => { var1 = Task1.Result; }, ct),
...
Task.Run(() => { varn = Taskn.Result; }, ct),
)
.ContinueWith(tsks=>
{
TaskA (uses output from Tasks var1 & varn, say)
}
, ct)
.ContinueWith(res =>
{
TaskB (uses output from TaskA and var3, say)
}
, ct);
});
But, even though this works for me, I have no doubt that that is doing it wrong.
What is the correct way? Should I have a state object that contains all the necessary variables and pass that throughout all my tasks? Is there a better way in total?
Please forgive my ignorance here - I'm just VERY new to concurrency programming.
Since Task1, Task2, ... , TaskN are in scope for the call of WhenAll, and because by the time ContinueWith passes control to your next task all the earlier tasks are guaranteed to finish, it is safe to use TaskX.Result inside the code implementing continuations:
.ContinueWith(tsks=>
{
var resTask1 = Task1.Result;
...
}
, ct)
You are guaranteed to get the result without blocking, because the task Task1 has finished running.
Here is a way to do it with ConcurrentDictionary, which sounds like it might be applicable in your use case. Also, since you're new to concurrency, it shows you the Interlocked class as well:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Executing...");
var numOfTasks = 50;
var tasks = new List<Task>();
for (int i = 0; i < numOfTasks; i++)
{
var iTask = Task.Run(() =>
{
var counter = Interlocked.Increment(ref _Counter);
Console.WriteLine(counter);
if (counter == numOfTasks - 1)
{
Console.WriteLine("Waiting {0} ms", 5000);
Task.Delay(5000).Wait(); // to simulate a longish running task
}
_State.AddOrUpdate(counter, "Updated Yo!", (k, v) =>
{
throw new InvalidOperationException("This shouldn't occure more than once.");
});
});
tasks.Add(iTask);
}
Task.WhenAll(tasks)
.ContinueWith(t =>
{
var longishState = _State[numOfTasks - 1];
Console.WriteLine(longishState);
Console.WriteLine("Complete. longishState: " + longishState);
});
Console.ReadKey();
}
static int _Counter = -1;
static ConcurrentDictionary<int, string> _State = new ConcurrentDictionary<int, string>();
}
You get output similar to this (though it the Waiting line won't always be last before the continuation):
An elegant way to solve this is to use Barrier class.
Like this:
var nrOfTasks = ... ;
ConcurrentDictionary<int, ResultType> Results = new ConcurrentDictionary<int, ResultType>();
var barrier = new Barrier(nrOfTasks, (b) =>
{
// here goes the work of TaskA
// and immediatley
// here goes the work of TaskB, having the results of TaskA and any other task you might need
});
Task.Run(() => { Results[1] = Task1.Result; barrier.SignalAndWait(); }, ct),
...
Task.Run(() => { Results[nrOfTasks] = Taskn.Result; barrier.SignalAndWait(); }, ct