Process a list of tasks without concurrency

Given this list:
var tasks = new List<Task>
{
MyMethod1(),
MyMethod2(),
MyMethod3()
};
And these methods:
async Task MyMethod1()
{
await SomeOtherMethod1();
}
async Task MyMethod2()
{
await SomeOtherMethod2();
}
async Task MyMethod3()
{
await SomeOtherMethod3();
}
Is it possible to do this where each task completes in order (no concurrency):
foreach (var task in tasks)
{
await task;
}
I haven't been able to find any way or examples where this is done. foreach just fires them all off. There is await foreach, but Task doesn't contain a public instance or extension definition for GetAsyncEnumerator.

foreach doesn't start the tasks. By convention, tasks are returned "hot" - that is, already running. So, the very fact that you have a list of tasks implies they are already running concurrently.
If you want to create a list of asynchronous actions to execute in the future (i.e., asynchronous delegates), then you want to use a List<Func<Task>> instead of a List<Task>, and then you can foreach over each delegate, invoking it and awaiting the returned task.
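The delegate-based approach described above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the StepAsync helper, the step names, and the delays are hypothetical, chosen only to make the ordering visible. Nothing starts until a delegate is invoked inside the loop.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialDemo
{
    // Hypothetical stand-in for MyMethod1..3: records when it starts and ends.
    private static async Task StepAsync(string name, int delayMs, List<string> log)
    {
        log.Add($"{name} start");
        await Task.Delay(delayMs);
        log.Add($"{name} end");
    }

    public static async Task<List<string>> RunInOrderAsync()
    {
        var log = new List<string>();

        // Delegates, not tasks: nothing runs while the list is being built.
        var steps = new List<Func<Task>>
        {
            () => StepAsync("first", 30, log),
            () => StepAsync("second", 20, log),
            () => StepAsync("third", 10, log),
        };

        foreach (var step in steps)
        {
            // Invoking the delegate starts the work; awaiting the returned task
            // lets it finish before the next delegate is invoked.
            await step();
        }

        return log;
    }
}
```

Even though the later steps have shorter delays, each "start" entry in the log is followed by its own "end" entry, because a step is only invoked after the previous one has been awaited.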

Do you mean you want to wait for them to finish one by one, in order?
You can do something simple like a regular for loop (foreach should give the same result) that starts each task and calls .Wait() on it before moving on to the next one. If that is not what you are trying to achieve, I may have misunderstood the question.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace TestStuff
{
class Program
{
static void Main(string[] args)
{
Task t1 = new Task(() => {
Thread.Sleep(2000);
Console.WriteLine("I'm T1! I'm done!");
});
Task t2 = new Task(() => {
Thread.Sleep(1000);
Console.WriteLine("I'm T2! I'm done!");
});
Task t3 = new Task(() => {
Thread.Sleep(500);
Console.WriteLine("I'm T3! I'm done!");
});
List<Task> tasks = new List<Task>() { t1, t2, t3 };
for (int i = 0; i < tasks.Count; i++)
{
tasks[i].Start();
tasks[i].Wait();
}
}
}
}

Thank you everyone for the comments and answers. The answers really got me to a solution. I upvoted both answers. I'm putting an answer here, because it updates the code in the question to a working version that I'm using.
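var tasks = new List<Func<Task>>
{
    // The parentheses matter: each lambda calls the method when invoked.
    async () => await MyMethod1(),
    async () => await MyMethod2(),
    async () => await MyMethod3()
};
foreach (var task in tasks)
{
    // Each delegate is started and awaited before the next one begins.
    await Task.Run(task);
}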

Related

Task.WhenAll but process results one by one

Let's say I have a list of Tasks, and I want to run them in parallel. But I don't need all of them to finish to continue, I can move on with just one. The following code waits for all the tasks to finish to move on. Is there a way for me to individually continue with the task that has completed while waiting for the other ones to finish?
List<string>[] x = await Task.WhenAll(new Task<List<string>>[] { task1, task2 });
// When task1 finishes, I want to process the result immediately
// instead of waiting on task2.

You're probably looking for Task.WhenAny.
I've used it for setting off a pile of tasks and then processing each of them as they become ready, but I suppose you could also just wait for one to finish and continue without the loop if you don't care about dealing with the rest.
while(tasks.Count() > 0)
{
var task = await Task.WhenAny(tasks);
tasks.Remove(task);
var taskresult = await task;
// process result
}

If you are using C# 8 and .NET Core you can take advantage of IAsyncEnumerable to hide this complexity from the consuming side.
Just like this:
static async Task Main(string[] args)
{
await foreach (var data in GetData())
{
Console.WriteLine(data);
}
Console.ReadLine();
}
static async IAsyncEnumerable<string> GetData()
{
List<Task<string>> tasks = new List<Task<string>> {GetData1(), GetData3(), GetData2()};
while (tasks.Any())
{
var finishedTask = await Task.WhenAny(tasks);
tasks.Remove(finishedTask);
yield return await finishedTask;
}
}
static async Task<string> GetData1()
{
await Task.Delay(5000);
return "Data1";
}
static async Task<string> GetData2()
{
await Task.Delay(3000);
return "Data2";
}
static async Task<string> GetData3()
{
await Task.Delay(2000);
return "Data3";
}

You can use Task.WhenAny instead.
Example "stolen" from Stephen Cleary's Blog:
var client = new HttpClient();
string results = await await Task.WhenAny(
client.GetStringAsync("http://example.com"),
client.GetStringAsync("http://microsoft.com"));
// results contains the HTML for whichever website responded first.
Responding to comment
You absolutely can keep track of the other tasks:
// supposing you have a list of Tasks in `myTasks`:
while( myTasks.Count > 0 )
{
var finishedTask = await Task.WhenAny(myTasks);
myTasks.Remove(finishedTask);
handleFinishedTask(finishedTask); // assuming this is a method that
// does the work on finished tasks.
}
The only thing you'd have to watch out for is this:
> The returned task will always end in the RanToCompletion state with its Result set to the first task to complete. This is true even if the first task to complete ended in the Canceled or Faulted state.
Source: Remarks in the WhenAny docs (emphasis mine).
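To make the remark above concrete, here is a small sketch. FailFastAsync and SlowOkAsync are hypothetical operations invented for this illustration. Awaiting Task.WhenAny does not throw even when the first task to finish has faulted; the exception surfaces only when that task itself is awaited.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAnyFaultDemo
{
    // Hypothetical operation that fails quickly.
    private static async Task<string> FailFastAsync()
    {
        await Task.Delay(10);
        throw new InvalidOperationException("boom");
    }

    // Hypothetical operation that succeeds much later.
    private static async Task<string> SlowOkAsync()
    {
        await Task.Delay(500);
        return "ok";
    }

    public static async Task<string> DescribeFirstAsync()
    {
        // This await does not throw, even though the winner has faulted.
        Task<string> first = await Task.WhenAny(FailFastAsync(), SlowOkAsync());

        try
        {
            // The failure is only observed when the winning task is awaited.
            return await first;
        }
        catch (InvalidOperationException ex)
        {
            return $"first task faulted: {ex.Message}";
        }
    }
}
```

In a processing loop like the ones above, this means the `await task` (or `.Result`) on the finished task is the place to put a try/catch.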

In case you want to process the results in order of the completion of the tasks, there is the OrderByCompletion extension method that does exactly that in Stephen Cleary's Nito.AsyncEx library, with the signature below:
// Creates a new collection of tasks that complete in order.
public static List<Task<T>> OrderByCompletion<T>(this IEnumerable<Task<T>> @this);
Usage example:
Task<string>[] tasks = new[] { task1, task2, task3, task4 };
foreach (var task in tasks.OrderByCompletion())
{
string result = await task;
// Do something with result
}
If you prefer not having external dependencies, the source code is here.

Based on Peter Csala's answer, here is an extension method that returns an IAsyncEnumerable:
public static async IAsyncEnumerable<T> OrderedByCompletion<T>(this IEnumerable<Task<T>> tasks)
{
List<Task<T>> taskList = new List<Task<T>>(tasks);
while (taskList.Count > 0)
{
var finishedTask = await Task.WhenAny(taskList);
taskList.Remove(finishedTask);
yield return await finishedTask;
}
}

Switch new Task(()=>{ }) for Func<Task>

In an answer to one of my other questions, I was told that use of new Task(() => { }) is not something that is a normal use case. I was advised to use Func<Task> instead. I have tried to make that work, but I can't seem to figure it out. (Rather than drag it out in the comments, I am asking a separate question here.)
My specific scenario is that I need the Task to not start right when it is declared and to be able to wait for it later.
Here is a LinqPad example using new Task(() => { }). NOTE: This works perfectly! (Except that it uses new Task.)
static async void Main(string[] args)
{
// Line that I need to swap to a Func<Task> somehow.
// note that this is "cold" not started task
Task startupDone = new Task(() => { });
var runTask = DoStuff(() =>
{
//+++ This is where we want to task to "start"
startupDone.Start();
});
//+++ Here we wait for the task to possibly start and finish. Or timeout.
// Note that this times out at 1000ms even if "blocking = 10000" below.
var didStartup = startupDone.Wait(1000);
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
await runTask;
Console.Read();
}
public static async Task DoStuff(Action action)
{
// Swap to 1000 to simulate starting up blocking
var blocking = 1; //1000;
await Task.Delay(500 + blocking);
action();
// Do the rest of the stuff...
await Task.Delay(1000);
}
I tried swapping the second line with:
Func<Task> startupDone = new Func<Task>(async () => { });
But then the lines below the comments with +++ in them don't work right.
I swapped the startupDone.Start() with startupDone.Invoke().
But startupDone.Wait needs the task. Which is only returned in the lambda. I am not sure how to get access to the task outside the lambda so I can Wait for it.
How can use a Func<Task> and start it in one part of my code and do a Wait for it in another part of my code? (Like I can with new Task(() => { })).

The code you posted cannot be refactored to make use of a Func<Task> instead of a cold task, because the method that needs to await the task (the Main method) is not the same method that controls the creation/starting of the task (the lambda parameter of the DoStuff method). This could make the use of the Task constructor legitimate in this case, depending on whether the design decision to delegate the starting of the task to a lambda is justified. In this particular example the startupDone is used as a synchronization primitive, to signal that a condition has been met and the program can continue. This could be achieved equally well by using a specialized synchronization primitive, like for example a SemaphoreSlim:
static async Task Main(string[] args)
{
var startupSemaphore = new SemaphoreSlim(0);
Task runTask = RunAsync(startupSemaphore);
bool startupFinished = await startupSemaphore.WaitAsync(1000);
Console.WriteLine(startupFinished ? "Startup Finished" : "Startup Timed Out");
await runTask;
}
public static async Task RunAsync(SemaphoreSlim startupSemaphore)
{
await Task.Delay(500);
startupSemaphore.Release(); // Signal that the startup is done
await Task.Delay(1000);
}
In my opinion using a SemaphoreSlim is more meaningful in this case, and makes the intent of the code clearer. It also lets you await the signal asynchronously with a timeout, via WaitAsync(Int32), which is not something that you get from a Task out of the box (it is doable though).
Using cold tasks may be tempting in some cases, but when you revisit your code after a month or two you'll find yourself confused, because of how rare and unexpected it is to have to deal with tasks that may or may not have been started yet.
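As a sketch of the "doable though" remark, one common way to await an arbitrary Task with a timeout is to race it against Task.Delay using Task.WhenAny. The helper name CompletesWithinAsync is hypothetical, and this is only one possible approach, not the answer author's code.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Returns true if 'task' completed within the timeout, false otherwise.
    public static async Task<bool> CompletesWithinAsync(Task task, int timeoutMs)
    {
        // Whichever of the two finishes first is returned by WhenAny.
        Task winner = await Task.WhenAny(task, Task.Delay(timeoutMs));
        if (winner != task)
        {
            return false; // the delay finished first
        }

        await task; // propagate any exception from the completed task
        return true;
    }
}
```

Note that the original task keeps running after a timeout; this sketch only stops waiting for it, and the delay timer is not cancelled when the task wins.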

I always try my hardest to never have blocking behavior when dealing with anything async or any type that represents potential async behavior such as Task. You can slightly modify your DoStuff to facilitate waiting on your Action.
static async Task Main(string[] args)
{
Func<CancellationToken,Task> startupTask = async(token)=>
{
Console.WriteLine("Waiting");
await Task.Delay(3000, token);
Console.WriteLine("Completed");
};
using var source = new CancellationTokenSource(2000);
var runTask = DoStuff(() => startupTask(source.Token), source.Token);
var didStartup = await runTask;
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
Console.Read();
}
public static async Task<bool> DoStuff(Func<Task> action, CancellationToken token)
{
var blocking = 10000;
try
{
await Task.Delay(500 + blocking, token);
await action();
}
catch(TaskCanceledException ex)
{
return false;
}
await Task.Delay(1000);
return true;
}

First, the type of your "do this later" object is going to become Func<Task>. Then, when the task is started (by invoking the function), you get back a Task that represents the operation:
static async Task Main(string[] args)
{
    Func<Task> startupDoneDelegate = async () => { };
    Task startupDoneTask = null;
    // DoStuff returns a plain Task, so there is no result to assign here.
    await DoStuff(() =>
    {
        // Invoking the delegate starts the operation and hands back its task.
        startupDoneTask = startupDoneDelegate();
    });
    var didStartup = startupDoneTask.Wait(1000);
    Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
}

Task.Continue with task.IsFaulted not capturing exceptions [duplicate]

I'm trying to use Task.WaitAll on a list of tasks. The problem is that the tasks are created from an async lambda, which breaks Task.WaitAll: it never waits.
Here is an example code block:
List<Task> tasks = new List<Task>();
tasks.Add(Task.Factory.StartNew(async () =>
{
    using (dbContext = new DatabaseContext())
    {
        var records = await dbContext.Where(r => r.Id == 100).ToListAsync();
        //do long cpu process here...
    }
}));
Task.WaitAll(tasks.ToArray());
//do more stuff here
This doesn't wait because of the async lambda. So how am I supposed to await I/O operations in my lambda?

Task.Factory.StartNew doesn't recognise async delegates as there is no overload that accepts a function returning a Task.
This plus other reasons (see StartNew is dangerous) is why you should be using Task.Run here:
tasks.Add(Task.Run(async () => ...
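The difference can be seen in a small sketch. The delays and variable names are hypothetical and chosen only for illustration. With an async lambda, the task returned by Task.Factory.StartNew completes as soon as the lambda reaches its first await, whereas the task returned by Task.Run completes only when the lambda body has finished.

```csharp
using System;
using System.Threading.Tasks;

public static class StartNewDemo
{
    // Reports whether the lambda body had finished when each outer await returned.
    public static async Task<(bool AfterStartNew, bool AfterRun)> CompareAsync()
    {
        bool startNewBodyDone = false;
        Task<Task> outer = Task.Factory.StartNew(async () =>
        {
            await Task.Delay(200);
            startNewBodyDone = true;
        });
        await outer;                           // returns once the lambda hits its first await
        bool afterStartNew = startNewBodyDone; // the 200 ms delay has not elapsed yet
        await outer.Unwrap();                  // now the lambda body has really finished

        bool runBodyDone = false;
        await Task.Run(async () =>
        {
            await Task.Delay(200);
            runBodyDone = true;
        });
        bool afterRun = runBodyDone;           // Task.Run waited for the whole body

        return (afterStartNew, afterRun);
    }
}
```

Under these assumptions, AfterStartNew comes back false and AfterRun comes back true, which mirrors why Task.WaitAll appears not to wait in the question.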

> This doesn't wait because of the async lambda. So how am I supposed to await I/O operations in my lambda?
The reason Task.WaitAll doesn't wait for the completion of the IO work presented by your async lambda is because Task.Factory.StartNew actually returns a Task<Task>. Since your list is a List<Task> (and Task<T> derives from Task), you wait on the outer task started by StartNew, while ignoring the inner one created by the async lambda. This is why they say Task.Factory.StartNew is dangerous with respect to async.
How could you fix this? You could explicitly call Task<Task>.Unwrap() in order to get the inner task:
List<Task> tasks = new List<Task>();
tasks.Add(Task.Factory.StartNew(async () =>
{
using (dbContext = new DatabaseContext())
{
var records = await dbContext.Where(r => r.Id == 100).ToListAsync();
//do long cpu process here...
}
}).Unwrap());
Or like others said, you could call Task.Run instead:
tasks.Add(Task.Run(async () => /* lambda */));
Also, since you want to be doing things right, you'll want to use Task.WhenAll, which can be awaited asynchronously, instead of Task.WaitAll, which blocks synchronously:
await Task.WhenAll(tasks);

You can do it like this:
void Something()
{
List<Task> tasks = new List<Task>();
tasks.Add(ReadAsync());
Task.WaitAll(tasks.ToArray());
}
async Task ReadAsync() {
using (dbContext = new DatabaseContext())
{
var records = await dbContext.Where(r => r.Id == 100).ToListAsync();
//do long cpu process here...
}
}

You have to use the Task.ContinueWith method, like this:
List<Task> tasks = new List<Task>();
tasks.Add(Task.Factory.StartNew(() =>
{
    var dbContext = new DatabaseContext();
    return dbContext.Where(r => r.Id == 100).ToListAsync().ContinueWith(t =>
    {
        // Dispose the context only after the query has completed.
        using (dbContext)
        {
            var records = t.Result;
            // do long cpu process here...
        }
    });
}).Unwrap()); // Unwrap so the list holds the inner task, not the outer Task<Task>

Creating awaitable tasks using LINQ

I want to create a collection of awaitable tasks, so that I can start them together and asynchronously process the result from each one as they complete.
I have this code, and a compilation error:
> cannot assign void to an implicitly-typed variable
If I understand correctly, the tasks returned by Select have no result type, although I would have expected one, since the delegate I pass returns an ItemViewModel:
public MainViewModel()
{
Task.Run(LoadItems);
}
async Task LoadItems()
{
IEnumerable<Task> tasks = Directory.GetDirectories(somePath)
.Select(dir => new Task(() =>
new ItemViewModel(new ItemSerializer().Deserialize(dir))));
foreach (var task in tasks)
{
var result = await task; // <-- here I get the compilation error
DoSomething(result);
}
}

You shouldn't ever use the Task constructor.
Since you're calling synchronous code (Deserialize), you could use Task.Run:
async Task LoadItems()
{
var tasks = Directory.GetDirectories(somePath)
.Select(dir => Task.Run(() =>
new ItemViewModel(new ItemSerializer().Deserialize(dir))));
foreach (var task in tasks)
{
var result = await task;
DoSomething(result);
}
}
Alternatively, you could use Parallel or Parallel LINQ:
void LoadItems()
{
var vms = Directory.GetDirectories(somePath)
.AsParallel().Select(dir =>
new ItemViewModel(new ItemSerializer().Deserialize(dir)))
.ToList();
foreach (var vm in vms)
{
DoSomething(vm);
}
}
Or, if you make Deserialize a truly async method, then you can make it all asynchronous:
async Task LoadItems()
{
var tasks = Directory.GetDirectories(somePath)
.Select(async dir =>
new ItemViewModel(await new ItemSerializer().DeserializeAsync(dir)));
foreach (var task in tasks)
{
var result = await task;
DoSomething(result);
}
}
Also, I recommend that you do not use fire-and-forget in your constructor. There are better patterns for asynchronous constructors.
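One commonly used alternative to fire-and-forget in a constructor is an asynchronous factory method. The sketch below is only an illustration of that idea, not necessarily the pattern the answer has in mind. The class name MainViewModelSketch, the Items property, and the LoadItemsAsync helper are all hypothetical.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class MainViewModelSketch
{
    public IReadOnlyList<string> Items { get; }

    // Private constructor: callers must go through CreateAsync,
    // so an instance never exists in a half-loaded state.
    private MainViewModelSketch(IReadOnlyList<string> items) => Items = items;

    public static async Task<MainViewModelSketch> CreateAsync()
    {
        // Any exception thrown while loading reaches the caller of CreateAsync.
        var items = await LoadItemsAsync();
        return new MainViewModelSketch(items);
    }

    // Hypothetical loader standing in for the directory scan in the question.
    private static async Task<IReadOnlyList<string>> LoadItemsAsync()
    {
        await Task.Delay(10);
        return new[] { "a", "b" };
    }
}
```

A caller would write `var vm = await MainViewModelSketch.CreateAsync();`, which keeps loading errors observable instead of losing them in an unawaited Task.Run.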

I know the question has been answered, but you can always do this too:
var serializer = new ItemSerializer();
var directories = Directory.GetDirectories(somePath);
foreach (string directory in directories)
{
await Task.Run(() => serializer.Deserialize(directory))
.ContinueWith(priorTask => DoSomething(priorTask.Result));
}
Notice I pulled out the serializer instantiation (assuming there are no side effects).

How to throttle multiple asynchronous tasks?

I have some code of the following form:
static async Task DoSomething(int n)
{
...
}
static void RunThreads(int totalThreads, int throttle)
{
var tasks = new List<Task>();
for (var n = 0; n < totalThreads; n++)
{
var task = DoSomething(n);
tasks.Add(task);
}
Task.WhenAll(tasks).Wait(); // all threads must complete
}
Trouble is, if I don't throttle the threads, things start falling apart. Now, I want to launch a maximum of throttle threads, and only start the new thread when an old one is complete. I've tried a few approaches and none so far has worked. Problems I have encountered include:
The tasks collection must be fully populated with all tasks, whether active or awaiting execution, otherwise the final .Wait() call only looks at the threads that it started with.
Chaining the execution seems to require use of Task.Run() or the like. But I need a reference to each task from the outset, and instantiating a task seems to kick it off automatically, which is what I don't want.
How to do this?

First, abstract away from threads. Especially since your operation is asynchronous, you shouldn't be thinking about "threads" at all. In the asynchronous world, you have tasks, and you can have a huge number of tasks compared to threads.
Throttling asynchronous code can be done using SemaphoreSlim:
static async Task DoSomething(int n);
static void RunConcurrently(int total, int throttle)
{
var mutex = new SemaphoreSlim(throttle);
var tasks = Enumerable.Range(0, total).Select(async item =>
{
await mutex.WaitAsync();
try { await DoSomething(item); }
finally { mutex.Release(); }
});
Task.WhenAll(tasks).Wait();
}

The simplest option IMO is to use TPL Dataflow. You just create an ActionBlock, limit it to the desired degree of parallelism, and start posting items into it. It makes sure that only a certain number of tasks run at the same time, and when a task completes, it starts executing the next item:
async Task RunAsync(int totalThreads, int throttle)
{
var block = new ActionBlock<int>(
DoSomething,
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = throttle });
for (var n = 0; n < totalThreads; n++)
{
block.Post(n);
}
block.Complete();
await block.Completion;
}

If I understand correctly, you want to start at most the number of tasks given by the throttle parameter and wait for them to finish before starting more.
To wait for all started tasks to complete before starting new ones, use the following implementation:
static async Task RunThreads(int totalThreads, int throttle)
{
var tasks = new List<Task>();
for (var n = 0; n < totalThreads; n++)
{
var task = DoSomething(n);
tasks.Add(task);
if (tasks.Count == throttle)
{
await Task.WhenAll(tasks);
tasks.Clear();
}
}
await Task.WhenAll(tasks); // wait for remaining
}
To start a new task as soon as a running one completes, you can use the following code:
static async Task RunThreads(int totalThreads, int throttle)
{
var tasks = new List<Task>();
for (var n = 0; n < totalThreads; n++)
{
var task = DoSomething(n);
tasks.Add(task);
if (tasks.Count == throttle)
{
var completed = await Task.WhenAny(tasks);
tasks.Remove(completed);
}
}
await Task.WhenAll(tasks); // all threads must complete
}

Stephen Toub gives the following example for throttling in his The Task-based Asynchronous Pattern document.
const int CONCURRENCY_LEVEL = 15;
Uri [] urls = …;
int nextIndex = 0;
var imageTasks = new List<Task<Bitmap>>();
while(nextIndex < CONCURRENCY_LEVEL && nextIndex < urls.Length)
{
imageTasks.Add(GetBitmapAsync(urls[nextIndex]));
nextIndex++;
}
while(imageTasks.Count > 0)
{
try
{
Task<Bitmap> imageTask = await Task.WhenAny(imageTasks);
imageTasks.Remove(imageTask);
Bitmap image = await imageTask;
panel.AddImage(image);
}
catch(Exception exc) { Log(exc); }
if (nextIndex < urls.Length)
{
imageTasks.Add(GetBitmapAsync(urls[nextIndex]));
nextIndex++;
}
}

Microsoft's Reactive Extensions (Rx) - NuGet "Rx-Main" - has this problem sorted very nicely.
Just do this:
static void RunThreads(int totalThreads, int throttle)
{
Observable
.Range(0, totalThreads)
.Select(n => Observable.FromAsync(() => DoSomething(n)))
.Merge(throttle)
.Wait();
}
Job done.

.NET 6 introduces Parallel.ForEachAsync. You could rewrite your code like this:
static async ValueTask DoSomething(int n)
{
...
}
static Task RunThreads(int totalThreads, int throttle)
=> Parallel.ForEachAsync(Enumerable.Range(0, totalThreads), new ParallelOptions() { MaxDegreeOfParallelism = throttle }, (i, _) => DoSomething(i));
Notes:
I had to change the return type of your DoSomething function from Task to ValueTask.
You probably want to avoid the .Wait() call, so I made the RunThreads method return a Task that the caller can await.
It is not obvious from your example why you need access to the individual tasks. This code does not give you access to the tasks, but might still be helpful in many cases.

Here are some extension method variations that build on Sriram Sakthivel's answer.
In the usage example, calls to DoSomething are being wrapped in an explicitly cast closure to allow passing arguments.
public static async Task RunMyThrottledTasks()
{
var myArgsSource = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
await myArgsSource
.Select(a => (Func<Task<object>>)(() => DoSomething(a)))
.Throttle(2);
}
public static async Task<object> DoSomething(int arg)
{
// Await some async calls that need arg..
// ..then return result async Task..
return new object();
}
public static async Task<IEnumerable<T>> Throttle<T>(this IEnumerable<Func<Task<T>>> toRun, int throttleTo)
{
    var running = new List<Task<T>>(throttleTo);
    var completed = new List<Task<T>>(toRun.Count());
    foreach (var taskToRun in toRun)
    {
        running.Add(taskToRun());
        if (running.Count == throttleTo)
        {
            var comTask = await Task.WhenAny(running);
            running.Remove(comTask);
            completed.Add(comTask);
        }
    }
    // Wait for the tasks still in flight so their results are not dropped.
    await Task.WhenAll(running);
    completed.AddRange(running);
    return completed.Select(t => t.Result);
}
public static async Task Throttle(this IEnumerable<Func<Task>> toRun, int throttleTo)
{
    var running = new List<Task>(throttleTo);
    foreach (var taskToRun in toRun)
    {
        running.Add(taskToRun());
        if (running.Count == throttleTo)
        {
            var comTask = await Task.WhenAny(running);
            running.Remove(comTask);
        }
    }
    // Wait for the tasks still in flight before reporting completion.
    await Task.WhenAll(running);
}

What you need is a custom task scheduler. You can derive a class from System.Threading.Tasks.TaskScheduler and implement two major functions GetScheduledTasks(), QueueTask(), along with other functions to gain complete control over throttling tasks. Here is a well documented example.
https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskscheduler?view=net-5.0

You can actually emulate the Parallel.ForEachAsync method introduced as part of .NET 6. In order to emulate the same you can use the following code.
public static Task ForEachAsync<T>(IEnumerable<T> source, int maxDegreeOfParallelism, Func<T, Task> body) {
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(maxDegreeOfParallelism)
select Task.Run(async delegate {
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}));
}
