How can I use Async with ForEach? - c#

Is it possible to use Async when using ForEach? Below is the code I am trying:
using (DataContext db = new DataLayer.DataContext())
{
db.Groups.ToList().ForEach(i => async {
await GetAdminsFromGroup(i.Gid);
});
}
I am getting the error:
The name 'Async' does not exist in the current context
The method the using statement is enclosed in is set to async.

List<T>.ForEach doesn't play particularly well with async (neither does LINQ-to-objects, for the same reasons).
In this case, I recommend projecting each element into an asynchronous operation, and you can then (asynchronously) wait for them all to complete.
using (DataContext db = new DataLayer.DataContext())
{
var tasks = db.Groups.ToList().Select(i => GetAdminsFromGroupAsync(i.Gid));
var results = await Task.WhenAll(tasks);
}
The benefits of this approach over giving an async delegate to ForEach are:
Error handling is more proper. Exceptions from async void cannot be caught with catch; this approach will propagate exceptions at the await Task.WhenAll line, allowing natural exception handling.
You know that the tasks are complete at the end of this method, since it does an await Task.WhenAll. If you use async void, you cannot easily tell when the operations have completed.
This approach has a natural syntax for retrieving the results. GetAdminsFromGroupAsync sounds like it's an operation that produces a result (the admins), and such code is more natural if such operations can return their results rather than setting a value as a side effect.
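The three points above can be seen in a small, self-contained sketch. GetAdminsFromGroupAsync here is a hypothetical stand-in (it fabricates one admin name per group and throws for negative ids), since the real method is not shown in the question; integer group ids are likewise an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical stand-in for the asker's method: returns fake admin names.
    public static async Task<string[]> GetAdminsFromGroupAsync(int gid)
    {
        await Task.Delay(10); // simulate I/O
        if (gid < 0) throw new ArgumentOutOfRangeException(nameof(gid));
        return new[] { $"admin-of-{gid}" };
    }

    public static async Task<List<string>> GetAllAdminsAsync(IEnumerable<int> groupIds)
    {
        // Start every operation, then wait for all of them together.
        var tasks = groupIds.Select(id => GetAdminsFromGroupAsync(id)).ToList();

        // Results come back in input order; a failure is rethrown at this await.
        string[][] results = await Task.WhenAll(tasks);
        return results.SelectMany(r => r).ToList();
    }
}
```

A caller can wrap `await WhenAllSketch.GetAllAdminsAsync(ids)` in an ordinary try/catch, which is not possible with an async void lambda passed to ForEach.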

This little extension method should give you exception-safe async iteration:
public static async Task ForEachAsync<T>(this List<T> list, Func<T, Task> func)
{
foreach (var value in list)
{
await func(value);
}
}
Since we're changing the return type of the lambda from void to Task, exceptions will propagate up correctly. This will allow you to write something like this in practice:
await db.Groups.ToList().ForEachAsync(async i => {
await GetAdminsFromGroup(i.Gid);
});

Starting with C# 8.0, you can create and consume streams asynchronously.
private async void button1_Click(object sender, EventArgs e)
{
IAsyncEnumerable<int> enumerable = GenerateSequence();
await foreach (var i in enumerable)
{
Debug.WriteLine(i);
}
}
public static async IAsyncEnumerable<int> GenerateSequence()
{
for (int i = 0; i < 20; i++)
{
await Task.Delay(100);
yield return i;
}
}

The simple answer is to use the foreach keyword instead of the ForEach() method of List<T>.
using (DataContext db = new DataLayer.DataContext())
{
foreach(var i in db.Groups)
{
await GetAdminsFromGroup(i.Gid);
}
}

Here is an actual working version of the above async foreach variants with sequential processing:
public static async Task ForEachAsync<T>(this List<T> enumerable, Action<T> action)
{
foreach (var item in enumerable)
await Task.Run(() => { action(item); }).ConfigureAwait(false);
}
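Here is how to use it:
public async Task SequentialAsync()
{
var list = new List<Action>();
Action action1 = () => {
//do stuff 1
};
Action action2 = () => {
//do stuff 2
};
list.Add(action1);
list.Add(action2);
// Each element is itself an Action, so the callback just invokes it.
await list.ForEachAsync(action => action());
}
What's the key difference? Each action runs on the thread pool through Task.Run, one after another, and .ConfigureAwait(false) tells each await not to resume on the captured context (for example the UI thread), so the rest of the loop continues on a thread-pool thread.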

This is an old question, but .NET 6 introduced Parallel.ForEachAsync:
var collectionToIterate = db.Groups.ToList();
await Parallel.ForEachAsync(collectionToIterate, async (i, token) =>
{
await GetAdminsFromGroup(i.Gid);
});
ForEachAsync also accepts a ParallelOptions object, but usually you don't need to change the MaxDegreeOfParallelism property:
ParallelOptions parallelOptions = new ParallelOptions { MaxDegreeOfParallelism = 4 };
var collectionToIterate = db.Groups.ToList();
await Parallel.ForEachAsync(collectionToIterate, parallelOptions, async (i, token) =>
{
await GetAdminsFromGroup(i.Gid);
});
From Microsoft Docs: https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.paralleloptions.maxdegreeofparallelism?view=net-6.0
By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used.
Generally, you do not need to modify this setting....
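To see what MaxDegreeOfParallelism actually limits, here is a minimal sketch that records how many bodies run at the same time. The class and method names are illustrative, and Task.Delay stands in for real I/O. Note that the quoted documentation describes For and ForEach; for Parallel.ForEachAsync the default degree of parallelism is Environment.ProcessorCount, so an explicit limit can matter for I/O-bound work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForEachAsyncSketch
{
    // Runs itemCount simulated operations and returns the highest number
    // that were in flight at the same moment.
    public static async Task<int> MeasurePeakConcurrencyAsync(int itemCount, int maxDegree)
    {
        int running = 0, peak = 0;
        var gate = new object();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        await Parallel.ForEachAsync(Enumerable.Range(0, itemCount), options, async (i, token) =>
        {
            lock (gate) { running++; peak = Math.Max(peak, running); }
            await Task.Delay(50, token); // simulated asynchronous work
            lock (gate) { running--; }
        });

        return peak;
    }
}
```

Calling `await ParallelForEachAsyncSketch.MeasurePeakConcurrencyAsync(20, 4)` should never report more than 4 concurrent bodies.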

Add this extension method
public static class ForEachAsyncExtension
{
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate
{
using (partition)
while (partition.MoveNext())
await body(partition.Current).ConfigureAwait(false);
}));
}
}
And then use like so:
Task.Run(async () =>
{
var s3 = new AmazonS3Client(Config.Instance.Aws.Credentials, Config.Instance.Aws.RegionEndpoint);
var buckets = await s3.ListBucketsAsync();
foreach (var s3Bucket in buckets.Buckets)
{
if (s3Bucket.BucketName.StartsWith("mybucket-"))
{
log.Information("Bucket => {BucketName}", s3Bucket.BucketName);
ListObjectsResponse objects;
try
{
objects = await s3.ListObjectsAsync(s3Bucket.BucketName);
}
catch
{
log.Error("Error getting objects. Bucket => {BucketName}", s3Bucket.BucketName);
continue;
}
// ForEachAsync (4 is how many tasks you want to run in parallel)
await objects.S3Objects.ForEachAsync(4, async s3Object =>
{
try
{
log.Information("Bucket => {BucketName} => {Key}", s3Bucket.BucketName, s3Object.Key);
await s3.DeleteObjectAsync(s3Bucket.BucketName, s3Object.Key);
}
catch
{
log.Error("Error deleting bucket {BucketName} object {Key}", s3Bucket.BucketName, s3Object.Key);
}
});
try
{
await s3.DeleteBucketAsync(s3Bucket.BucketName);
}
catch
{
log.Error("Error deleting bucket {BucketName}", s3Bucket.BucketName);
}
}
}
}).Wait();

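If you are using Entity Framework Core, there is a ForEachAsync extension method. It enumerates the query asynchronously, but its callback is a synchronous Action<T>: an async lambda passed to it compiles to async void and is not awaited. For asynchronous per-item work, enumerate with await foreach instead.
The example usage looks like this:
using Microsoft.EntityFrameworkCore;
using System.Threading.Tasks;
public class Example
{
private readonly DbContext _dbContext;
public Example(DbContext dbContext)
{
_dbContext = dbContext;
}
// Return Task rather than void so callers can await and observe exceptions.
public async Task LogicMethod()
{
// ForEachAsync: asynchronous enumeration, synchronous callback.
await _dbContext.Set<dbTable>().ForEachAsync(x =>
{
//synchronous logic
});
// await foreach: asynchronous enumeration with awaited per-item work.
await foreach (var x in _dbContext.Set<dbTable>().AsAsyncEnumerable())
{
await AsyncTask(x);
}
}
public Task<bool> AsyncTask(object x)
{
//other logic
return Task.FromResult(true);
}
}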

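I would like to add that there is a built-in Parallel class. Note that Parallel.ForEach is meant for synchronous work and does not await async delegates; for asynchronous bodies, use Parallel.ForEachAsync (.NET 6 and later).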

The problem was that the async keyword needs to appear before the lambda, not before the body:
db.Groups.ToList().ForEach(async (i) => {
await GetAdminsFromGroup(i.Gid);
});
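Moving the async keyword makes the code compile, but the lambda is then an async void delegate, so ForEach returns before any of the awaited work has finished. The sketch below makes that visible by counting completions; the names are illustrative and Task.Delay stands in for GetAdminsFromGroup.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncVoidSketch
{
    public static async Task<(int AfterForEach, int AfterWhenAll)> RunAsync()
    {
        var items = new List<int> { 1, 2, 3 };
        int fireAndForget = 0;
        int awaited = 0;

        // async void: ForEach returns as soon as each lambda hits its first await.
        items.ForEach(async i =>
        {
            await Task.Delay(200);
            Interlocked.Increment(ref fireAndForget);
        });
        int afterForEach = Volatile.Read(ref fireAndForget); // nothing has completed yet

        // Task-returning lambdas: the caller can wait for all of them.
        var tasks = items.Select(async i =>
        {
            await Task.Delay(200);
            Interlocked.Increment(ref awaited);
        });
        await Task.WhenAll(tasks);
        int afterWhenAll = Volatile.Read(ref awaited); // every item has completed

        return (afterForEach, afterWhenAll);
    }
}
```

The first count is read while the delays are still pending, which is why the ForEach version cannot tell the caller when the work is done.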

This is a method I created to handle async scenarios with ForEach.
If one of the tasks fails, the other tasks continue their execution.
You can supply a function that is executed on every exception.
Exceptions are collected into an AggregateException that is thrown at the end.
It also accepts a CancellationToken.
public static class ParallelExecutor
{
/// <summary>
/// Executes asynchronously given function on all elements of given enumerable with task count restriction.
/// Executor will continue starting new tasks even if one of the tasks throws. If at least one of the tasks threw an exception then <see cref="AggregateException"/> is thrown at the end of the method run.
/// </summary>
/// <typeparam name="T">Type of elements in enumerable</typeparam>
/// <param name="maxTaskCount">The maximum task count.</param>
/// <param name="enumerable">The enumerable.</param>
/// <param name="asyncFunc">asynchronous function that will be executed on every element of the enumerable. MUST be thread safe.</param>
/// <param name="onException">Action that will be executed on every exception that would be thrown by asyncFunc. CAN be thread unsafe.</param>
/// <param name="cancellationToken">The cancellation token.</param>
public static async Task ForEachAsync<T>(int maxTaskCount, IEnumerable<T> enumerable, Func<T, Task> asyncFunc, Action<Exception> onException = null, CancellationToken cancellationToken = default)
{
using var semaphore = new SemaphoreSlim(initialCount: maxTaskCount, maxCount: maxTaskCount);
// This `lockObject` is used only in `catch { }` block.
object lockObject = new object();
var exceptions = new List<Exception>();
var tasks = new Task[enumerable.Count()];
int i = 0;
try
{
foreach (var t in enumerable)
{
await semaphore.WaitAsync(cancellationToken);
tasks[i++] = Task.Run(
async () =>
{
try
{
await asyncFunc(t);
}
catch (Exception e)
{
if (onException != null)
{
lock (lockObject)
{
onException.Invoke(e);
}
}
// Rethrow so the task faults; the exception is collected at the end of ForEachAsync to build the AggregateException.
throw;
}
finally
{
semaphore.Release();
}
}, cancellationToken);
if (cancellationToken.IsCancellationRequested)
{
break;
}
}
}
catch (OperationCanceledException e)
{
exceptions.Add(e);
}
foreach (var t in tasks)
{
if (cancellationToken.IsCancellationRequested)
{
break;
}
// Exception handling in this case is actually pretty fast.
// https://gist.github.com/shoter/d943500eda37c7d99461ce3dace42141
try
{
await t;
}
#pragma warning disable CA1031 // Do not catch general exception types - we want to throw that exception later as aggregate exception. Nothing wrong here.
catch (Exception e)
#pragma warning restore CA1031 // Do not catch general exception types
{
exceptions.Add(e);
}
}
if (exceptions.Any())
{
throw new AggregateException(exceptions);
}
}
}

Related

How to put async method in a list and invoke them iteratively?

Recently I wanted to implement a health check for a list of service calls. They are all async tasks (e.g. Task<IHttpOperationResponse<XXX_Model>> method_name(...)).
I would like to put all of them into a list. I followed the answer to this post: Storing a list of methods in C#. However, they are async methods.
I put it like this, as a collection of async methods:
List<Action> _functions = new List<Action> {
() => accountDetailsServiceProvider.GetEmployer(EmployerId),
() => accountServiceProvider.GetAccountStatus(EmployerId)
}
Can someone direct me to the right way to implement putting async methods in to a list and invoke them iteratively?
Thanks in advance!
First, you need to make your methods async. That means they must return a Task. For example:
public static async Task Foo()
{
await Task.Delay(1);
Console.WriteLine("Foo!");
}
public static async Task Bar()
{
await Task.Delay(1);
Console.WriteLine("Bar!");
}
Then to put them in a list, you must define the list as containing the right type. Since an async method actually returns something, it's a Func, not an action. It returns a Task.
var actions = new List<Func<Task>>
{
Foo, Bar
};
To invoke them, Select over the list (using LINQ). This produces a lazy sequence of Tasks in place of the list of Funcs; the methods are actually invoked when the sequence is enumerated.
var tasks = actions.Select( x => x() );
Then just await them:
await Task.WhenAll(tasks);
Full example:
public static async Task MainAsync()
{
var actions = new List<Func<Task>>
{
Foo, Bar
};
var tasks = actions.Select( x => x() );
await Task.WhenAll(tasks);
}
Output:
Foo!
Bar!
If your methods return a Boolean value, then the return type becomes Task<bool> and the rest follows suit:
public static async Task<bool> Foo()
{
await Task.Delay(1);
Console.WriteLine("Foo!");
return true;
}
public static async Task<bool> Bar()
{
await Task.Delay(1);
Console.WriteLine("Bar!");
return true;
}
var actions = new List<Func<Task<bool>>>
{
Foo, Bar
};
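var tasks = actions.Select( x => x() ).ToList();
bool[] results = await Task.WhenAll(tasks);
Task.WhenAll already returns the results in order, so no further conversion is needed. If you read them from the tasks instead, materialize the query first (as with ToList() above); re-enumerating a lazy Select would invoke every method a second time:
List<bool> resultList = tasks.Select( task => task.Result ).ToList();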
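One detail worth checking when a Select over a list of delegates is enumerated more than once: the query is lazy, so each enumeration calls the delegates again. The following sketch counts invocations to show the difference; the class name and the two trivial delegates are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LazySelectSketch
{
    public static async Task<(int LazyCalls, int MaterializedCalls)> CountInvocationsAsync()
    {
        int calls = 0;
        var actions = new List<Func<Task<bool>>>
        {
            async () => { Interlocked.Increment(ref calls); await Task.Delay(1); return true; },
            async () => { Interlocked.Increment(ref calls); await Task.Delay(1); return true; },
        };

        // Lazy query: every enumeration invokes the delegates again.
        var lazy = actions.Select(x => x());
        await Task.WhenAll(lazy);          // first enumeration: 2 calls
        var again = lazy.ToList();         // second enumeration: 2 more calls
        await Task.WhenAll(again);
        int lazyCalls = calls;

        // Materialized list: the delegates run once, the tasks are reused.
        calls = 0;
        var tasks = actions.Select(x => x()).ToList();
        bool[] results = await Task.WhenAll(tasks);
        int materializedCalls = calls;

        return (lazyCalls, materializedCalls);
    }
}
```

For a health check this matters, because a second enumeration would call each service twice.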
I think you are just looking for something simple like this?
var myList = new List<Action>()
{
async() => { await Foo.GetBarAsync(); },
...
};
I would recommend you to change the type from Action to Func<Task> like so instead.
var myList = new List<Func<Task>>()
{
async() => { await Foo.GetBarAsync(); },
};
You can read more about why here: https://blogs.msdn.microsoft.com/pfxteam/2012/02/08/potential-pitfalls-to-avoid-when-passing-around-async-lambdas/
To invoke (simplified)
foreach (var action in myList)
{
await action.Invoke();
}
Based on the comments:
However, my task requires a boolean value for each method call,
because I have to report the status to the frontend whether the
service is down or not
Create a wrapper method that runs the call and returns the required boolean value:
public async Task<Result> Check(string name, Func<Task> execute)
{
try
{
await execute();
return new Result(name, true, string.Empty);
}
catch (Exception ex)
{
return new Result(name, false, ex.Message);
}
}
public class Result
{
public string Name { get; }
public bool Success { get; }
public string Message { get; }
public Result(string name, bool success, string message)
=> (Name, Success, Message) = (name, success, message);
}
Then you don't need a collection of delegates; instead you will have a collection of Tasks.
var tasks = new[]
{
Check(nameof(details.GetEmployer), () => details.GetEmployer(Id)),
Check(nameof(accounts.GetAccountStatus), () => accounts.GetAccountStatus(Id)),
};
var completed = await Task.WhenAll(tasks);
foreach (var task in completed)
{
Console.WriteLine($"Task: {task.Name}, Success: {task.Success};");
}

Semaphore wrapper method

Based on some questions on SO, mainly this one:
Throttling asynchronous tasks
I have implemented the SemaphoreSlim object to concurrently process requests over a range of methods in my application. Most of these methods take in lists of IDs and get a single byte array back per ID, concurrently, from the web. The implementation looks like this:
using (var semaphore = new SemaphoreSlim(MaxConcurrency))
{
var tasks = fileMetadata.GroupBy(x => x.StorageType).Select(async storageTypeFileMetadata=>
{
await semaphore.WaitAsync();
try
{
var fileManager = FileManagerFactory.CreateFileManager((StorageType)storageTypeFileMetadata.Key);
await fileManager.UpdateFilesAsync(storageTypeFileMetadata);
}
finally
{
semaphore.Release();
}
});
await Task.WhenAll(tasks);
}
Is there a way to abstract out a method or some reusable code snippet for the semaphore code, and pass in the work I need done, so it can be reused without re-writing the semaphore code each time? The only difference amongst multiple methods using this same semaphore pattern is the list I am iterating and the work it is doing in the try{}.
I am thinking of something like passing list.Select(x => my task method with my work in it) to a semaphore method that contains all the wrapper semaphore code.
So I'm guessing something like:
public static class Extension
{
public static async Task ExecuteAsync<T>(this IEnumerable<T> items, Func<T, Task> task, int concurrency)
{
var tasks = new List<Task>();
using (var semaphore = new SemaphoreSlim(concurrency))
{
foreach (var item in items)
{
tasks.Add(ExecuteInSemaphore(semaphore, task, item));
}
await Task.WhenAll(tasks);
}
}
private static async Task ExecuteInSemaphore<T>(SemaphoreSlim semaphore, Func<T, Task> task, T item)
{
await semaphore.WaitAsync();
try
{
await task(item);
}
finally
{
semaphore.Release();
}
}
}
Then you would use it like:
await fileMetadata.GroupBy(x => x.StorageType).ExecuteAsync(storageTypeFileMetadata =>
{
var fileManager = FileManagerFactory.CreateFileManager((StorageType)storageTypeFileMetadata.Key);
return fileManager.UpdateFilesAsync(storageTypeFileMetadata);
}, 4);
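To check that a wrapper of this kind really caps concurrency, a small sketch can count how many work items are in flight at once. The extension below has the same shape as ExecuteAsync above, repeated in compact form so the example compiles on its own; ThrottledDemo and its Task.Delay workload are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledExtensions
{
    // Runs work for every item, with at most `concurrency` items in flight.
    public static async Task ExecuteAsync<T>(this IEnumerable<T> items, Func<T, Task> work, int concurrency)
    {
        using var semaphore = new SemaphoreSlim(concurrency);
        var tasks = items.Select(async item =>
        {
            await semaphore.WaitAsync();
            try { await work(item); }
            finally { semaphore.Release(); }
        }).ToList();
        await Task.WhenAll(tasks);
    }
}

public static class ThrottledDemo
{
    // Returns the highest number of work items that ran at the same time.
    public static async Task<int> PeakConcurrencyAsync(int itemCount, int concurrency)
    {
        int running = 0, peak = 0;
        var gate = new object();
        await Enumerable.Range(0, itemCount).ExecuteAsync(async _ =>
        {
            lock (gate) { running++; peak = Math.Max(peak, running); }
            await Task.Delay(20); // simulated asynchronous work
            lock (gate) { running--; }
        }, concurrency);
        return peak;
    }
}
```

With ten items and a limit of three, the first three items enter the semaphore immediately and the rest wait, so the observed peak equals the limit.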

Anonymous asynchronicity, what is the right way?

I have a simple class that does a synchronous thing,
public static class Synchronous
{
public static void DoTheWholeThing()
{
AStuff aStuff;
using (var a = new A())
{
aStuff = a.GetStuff();
}
BStuff bStuff;
using (var b = new B())
{
bStuff = b.GetStuff();
}
var combination = CombineStuff(aStuff, bStuff);
}
private static Combination CombineStuff(AStuff aStuff, BStuff bStuff)
{
//// Magic Here
}
}
Obviously, this code is not fully defined but it does illustrate my question.
Now, the classes A and B are both responsible for retrieving data from different remote sources. Consequently, the developers of A and B have implemented asynchronous entry points called GetStuffAsync which return Task<AStuff> and Task<BStuff> respectively.
I want to take maximum advantage of the asynchronous methods and call them concurrently so I can reduce the overall wait time of my code.
Here is what I've concocted, so far.
public static class Asynchronous
{
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
var getAStuffTask = new Func<Task<AStuff>>(
async () =>
{
using (var a = new A())
{
return await a.GetStuffAsync(cancellationToken);
}
})();
var getBStuffTask = new Func<Task<BStuff>>(
async () =>
{
using (var b = new B())
{
return await b.GetStuffAsync(cancellationToken);
}
})();
var combination = CombineStuff(
await getAStuffTask,
await getBStuffTask);
}
private static Combination CombineStuff(AStuff aStuff, BStuff bStuff)
{
//// Magic Here
}
}
Aside from this code looking curiously like the JavaScript module pattern, is this the correct approach? I don't think I should be using Task.Run, as this code is clearly not CPU bound.
It seems a bit "clunky" that I need to instantiate typed delegates to do this. Is there a better way?
EDIT
following two good answers I'm in a quandary between named functions and continuations.
The code becomes radically simpler when you simply extract the anonymous methods out into named methods:
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
var getAStuffTask = GetAStuffAsync(cancellationToken);
var getBStuffTask = GetBStuffAsync(cancellationToken);
var combination = CombineStuff(
await getAStuffTask,
await getBStuffTask);
}
private static async Task<AStuff> GetAStuffAsync(CancellationToken cancellationToken)
{
using (var a = new A())
{
return await a.GetStuffAsync(cancellationToken);
}
}
private static async Task<BStuff> GetBStuffAsync(CancellationToken cancellationToken)
{
using (var b = new B())
{
return await b.GetStuffAsync(cancellationToken);
}
}
That said, if you really want to stick with the anonymous methods, you can create a helper method that will allow generic type inference and lambdas to implicitly figure out the type of the delegate:
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
var getAStuffTask = Start(async () =>
{
using (var a = new A())
{
return await a.GetStuffAsync(cancellationToken);
}
});
var getBStuffTask = Start(async () =>
{
using (var b = new B())
{
return await b.GetStuffAsync(cancellationToken);
}
});
var combination = CombineStuff(
await getAStuffTask,
await getBStuffTask);
}
public static Task<T> Start<T>(Func<Task<T>> asyncOperation)
{
return asyncOperation();
}
Use TPL continuations to call Dispose as soon as the task is complete.
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
var a = new A();
var b = new B();
// start the tasks and store them for awaiting later
var getAStuffTask = a.GetStuffAsync(cancellationToken);
var getBStuffTask = b.GetStuffAsync(cancellationToken);
// queue up continuations to dispose of the resource as soon as it is not needed
getAStuffTask.ContinueWith(_ => a.Dispose());
getBStuffTask.ContinueWith(_ => b.Dispose());
// await as normal
var combination = CombineStuff(
await getAStuffTask,
await getBStuffTask);
}
I am unsure if wrapping the whole method in an additional using block will accomplish anything, but it may provide peace of mind.
You don't need to wrap your async calls in delegates to get them to execute immediately. If you call the GetStuffAsync methods directly without awaiting them you will have the same result.
public static class Asynchronous
{
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
using (var a = new A())
using (var b = new B()) {
var taskA = a.GetStuffAsync(cancellationToken);
var taskB = b.GetStuffAsync(cancellationToken);
await Task.WhenAll(taskA, taskB);
var combination = CombineStuff(taskA.Result, taskB.Result);
}
}
private static Combination CombineStuff(AStuff aStuff, BStuff bStuff)
{
//// Magic Here
}
}
Note that this does keep the a and b objects alive during the call to CombineStuff, as @Servy notes. If that is a problem, the declaration of the Task objects can be moved outside of the using blocks as below:
public static class Asynchronous
{
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
Task<AStuff> taskA;
Task<BStuff> taskB;
using (var a = new A())
using (var b = new B()) {
taskA = a.GetStuffAsync(cancellationToken);
taskB = b.GetStuffAsync(cancellationToken);
await Task.WhenAll(taskA, taskB);
}
var combination = CombineStuff(taskA.Result, taskB.Result);
}
private static Combination CombineStuff(AStuff aStuff, BStuff bStuff)
{
//// Magic Here
}
}
Although this still holds onto a and b as long as both tasks are running, rather than disposing of each as they return.

awaitable Task based queue

I'm wondering if there exists an implementation/wrapper for ConcurrentQueue, similar to BlockingCollection where taking from the collection does not block, but is instead asynchronous and will cause an async await until an item is placed in the queue.
I've come up with my own implementation, but it does not seem to be performing as expected. I'm wondering if I'm reinventing something that already exists.
Here's my implementation:
public class MessageQueue<T>
{
ConcurrentQueue<T> queue = new ConcurrentQueue<T>();
ConcurrentQueue<TaskCompletionSource<T>> waitingQueue =
new ConcurrentQueue<TaskCompletionSource<T>>();
object queueSyncLock = new object();
public void Enqueue(T item)
{
queue.Enqueue(item);
ProcessQueues();
}
public async Task<T> Dequeue()
{
TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
waitingQueue.Enqueue(tcs);
ProcessQueues();
return tcs.Task.IsCompleted ? tcs.Task.Result : await tcs.Task;
}
private void ProcessQueues()
{
TaskCompletionSource<T> tcs=null;
T firstItem=default(T);
while (true)
{
bool ok;
lock (queueSyncLock)
{
ok = waitingQueue.TryPeek(out tcs) && queue.TryPeek(out firstItem);
if (ok)
{
waitingQueue.TryDequeue(out tcs);
queue.TryDequeue(out firstItem);
}
}
if (!ok) break;
tcs.SetResult(firstItem);
}
}
}
I don't know of a lock-free solution, but you can take a look at the new Dataflow library, part of the Async CTP. A simple BufferBlock<T> should suffice, e.g.:
BufferBlock<int> buffer = new BufferBlock<int>();
Production and consumption are most easily done via extension methods on the dataflow block types.
Production is as simple as:
buffer.Post(13);
and consumption is async-ready:
int item = await buffer.ReceiveAsync();
I do recommend you use Dataflow if possible; making such a buffer both efficient and correct is more difficult than it first appears.
Simple approach with C# 8.0 IAsyncEnumerable and Dataflow library
// Instantiate an async queue
var queue = new AsyncQueue<int>();
// Then, loop through the elements of queue.
// This loop won't stop until it is canceled or broken out of
// (for that, use queue.WithCancellation(..) or break;)
await foreach(int i in queue) {
// Writes a line as soon as some other Task calls queue.Enqueue(..)
Console.WriteLine(i);
}
With an implementation of AsyncQueue as follows:
public class AsyncQueue<T> : IAsyncEnumerable<T>
{
private readonly SemaphoreSlim _enumerationSemaphore = new SemaphoreSlim(1);
private readonly BufferBlock<T> _bufferBlock = new BufferBlock<T>();
public void Enqueue(T item) =>
_bufferBlock.Post(item);
public async IAsyncEnumerator<T> GetAsyncEnumerator(CancellationToken token = default)
{
// We lock this so we only ever enumerate once at a time.
// That way we ensure all items are returned in a continuous
// fashion with no 'holes' in the data when two foreach compete.
// Pass the token so an enumerator waiting for its turn can be canceled too.
await _enumerationSemaphore.WaitAsync(token);
try {
// Return new elements until cancellationToken is triggered.
while (true) {
// Make sure to throw on cancellation so the Task will transfer into a canceled state
token.ThrowIfCancellationRequested();
yield return await _bufferBlock.ReceiveAsync(token);
}
} finally {
_enumerationSemaphore.Release();
}
}
}
There is an official way to do this now: System.Threading.Channels. It's built into the core runtime on .NET Core 3.0 and higher (including .NET 5.0 and 6.0), and it's also available as a NuGet package for .NET Standard 2.0 and 2.1. The official System.Threading.Channels documentation covers the details.
var channel = System.Threading.Channels.Channel.CreateUnbounded<int>();
To enqueue work:
// This will succeed and finish synchronously if the channel is unbounded.
channel.Writer.TryWrite(42);
To complete the channel:
channel.Writer.TryComplete();
To read from the channel:
var i = await channel.Reader.ReadAsync();
Or, if you have .NET Core 3.0 or higher:
await foreach (int i in channel.Reader.ReadAllAsync())
{
// whatever processing on i...
}
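Putting the pieces above together, a minimal end-to-end sketch might look like this. It assumes .NET Core 3.0 or later with C# 9 top-level statements; the item count and variable names are only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

var channel = Channel.CreateUnbounded<int>();

// Producer: writes five items, then completes the channel so the
// reader loop below can finish.
Task producer = Task.Run(async () =>
{
    for (int n = 1; n <= 5; n++)
    {
        await channel.Writer.WriteAsync(n);
    }
    channel.Writer.Complete();
});

// Consumer: awaits items without blocking a thread while the channel is empty.
var received = new List<int>();
await foreach (int item in channel.Reader.ReadAllAsync())
{
    received.Add(item);
}
await producer;

Console.WriteLine(string.Join(", ", received)); // 1, 2, 3, 4, 5
```

Without the call to Complete, the await foreach loop would keep waiting for more items after the last one has been read.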
One simple and easy way to implement this is with a SemaphoreSlim:
public class AwaitableQueue<T>
{
private SemaphoreSlim semaphore = new SemaphoreSlim(0);
private readonly object queueLock = new object();
private Queue<T> queue = new Queue<T>();
public void Enqueue(T item)
{
lock (queueLock)
{
queue.Enqueue(item);
semaphore.Release();
}
}
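public T WaitAndDequeue(TimeSpan timeSpan, CancellationToken cancellationToken)
{
// Wait returns false on timeout. Dequeuing anyway would throw on an empty queue
// or take an item that another waiter has already been signaled for.
if (!semaphore.Wait(timeSpan, cancellationToken))
throw new TimeoutException();
lock (queueLock)
{
return queue.Dequeue();
}
}
public async Task<T> WhenDequeue(TimeSpan timeSpan, CancellationToken cancellationToken)
{
if (!await semaphore.WaitAsync(timeSpan, cancellationToken))
throw new TimeoutException();
lock (queueLock)
{
return queue.Dequeue();
}
}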
}
The beauty of this is that the SemaphoreSlim handles all of the complexity of implementing the Wait() and WaitAsync() functionality. The downside is that the queue length is tracked twice, by the semaphore's count and by the queue itself, and correctness depends on the two staying in sync.
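SemaphoreSlim.Wait and WaitAsync return false when a timeout elapses instead of throwing, so keeping the two counts in sync means dequeuing only when the wait returned true. A minimal sketch of that rule, using local functions rather than a class (the names Enqueue and TryDequeueAsync here are illustrative, and it assumes C# 9 top-level statements):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var semaphore = new SemaphoreSlim(0);
var queue = new Queue<int>();
var gate = new object();

// Each enqueue adds exactly one item and releases exactly one permit.
void Enqueue(int item)
{
    lock (gate)
    {
        queue.Enqueue(item);
        semaphore.Release();
    }
}

// Returns null on timeout; an item is dequeued only if a permit was acquired.
async Task<int?> TryDequeueAsync(TimeSpan timeout)
{
    if (!await semaphore.WaitAsync(timeout))
    {
        return null;
    }
    lock (gate)
    {
        return queue.Dequeue();
    }
}

int? first = await TryDequeueAsync(TimeSpan.FromMilliseconds(50)); // nothing queued yet
Enqueue(7);
int? second = await TryDequeueAsync(TimeSpan.FromMilliseconds(50));

Console.WriteLine($"{first.HasValue} {second} {semaphore.CurrentCount} {queue.Count}");
// False 7 0 0
```

After the timed-out call, both the semaphore count and the queue length are still zero, which is the invariant the class relies on.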
My attempt (it raises an event when a "promise" is created, which an external producer can use to know when to produce more items):
public class AsyncQueue<T>
{
private ConcurrentQueue<T> _bufferQueue;
private ConcurrentQueue<TaskCompletionSource<T>> _promisesQueue;
private object _syncRoot = new object();
public AsyncQueue()
{
_bufferQueue = new ConcurrentQueue<T>();
_promisesQueue = new ConcurrentQueue<TaskCompletionSource<T>>();
}
/// <summary>
/// Enqueues the specified item.
/// </summary>
/// <param name="item">The item.</param>
public void Enqueue(T item)
{
TaskCompletionSource<T> promise;
do
{
if (_promisesQueue.TryDequeue(out promise) &&
!promise.Task.IsCanceled &&
promise.TrySetResult(item))
{
return;
}
}
while (promise != null);
lock (_syncRoot)
{
if (_promisesQueue.TryDequeue(out promise) &&
!promise.Task.IsCanceled &&
promise.TrySetResult(item))
{
return;
}
_bufferQueue.Enqueue(item);
}
}
/// <summary>
/// Dequeues an item asynchronously.
/// </summary>
/// <param name="cancellationToken">The cancellation token.</param>
/// <returns>A task that completes with the next item, or is canceled via the token.</returns>
public Task<T> DequeueAsync(CancellationToken cancellationToken)
{
T item;
if (!_bufferQueue.TryDequeue(out item))
{
lock (_syncRoot)
{
if (!_bufferQueue.TryDequeue(out item))
{
var promise = new TaskCompletionSource<T>();
cancellationToken.Register(() => promise.TrySetCanceled());
_promisesQueue.Enqueue(promise);
PromiseAdded?.Invoke(this, EventArgs.Empty);
return promise.Task;
}
}
}
return Task.FromResult(item);
}
/// <summary>
/// Gets a value indicating whether this instance has promises.
/// </summary>
/// <value>
/// <c>true</c> if this instance has promises; otherwise, <c>false</c>.
/// </value>
public bool HasPromises
{
get { return _promisesQueue.Any(p => !p.Task.IsCanceled); }
}
/// <summary>
/// Occurs when a new promise
/// is generated by the queue
/// </summary>
public event EventHandler PromiseAdded;
}
It may be overkill for your use case (given the learning curve), but Reactive Extensions provides all the glue you could ever want for asynchronous composition.
You essentially subscribe to changes and they are pushed to you as they become available, and you can have the system push the changes on a separate thread.
Check out https://github.com/somdoron/AsyncCollection. It lets you dequeue asynchronously and also supports C# 8.0 IAsyncEnumerable.
The API is very similar to BlockingCollection.
AsyncCollection<int> collection = new AsyncCollection<int>();
var t = Task.Run(async () =>
{
while (!collection.IsCompleted)
{
var item = await collection.TakeAsync();
// process
}
});
for (int i = 0; i < 1000; i++)
{
collection.Add(i);
}
collection.CompleteAdding();
t.Wait();
With IAsyncEnumerable:
AsyncCollection<int> collection = new AsyncCollection<int>();
var t = Task.Run(async () =>
{
await foreach (var item in collection)
{
// process
}
});
for (int i = 0; i < 1000; i++)
{
collection.Add(i);
}
collection.CompleteAdding();
t.Wait();
Here's the implementation I'm currently using.
public class MessageQueue<T>
{
ConcurrentQueue<T> queue = new ConcurrentQueue<T>();
ConcurrentQueue<TaskCompletionSource<T>> waitingQueue =
new ConcurrentQueue<TaskCompletionSource<T>>();
object queueSyncLock = new object();
public void Enqueue(T item)
{
queue.Enqueue(item);
ProcessQueues();
}
public async Task<T> DequeueAsync(CancellationToken ct)
{
TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
ct.Register(() =>
{
lock (queueSyncLock)
{
tcs.TrySetCanceled();
}
});
waitingQueue.Enqueue(tcs);
ProcessQueues();
return tcs.Task.IsCompleted ? tcs.Task.Result : await tcs.Task;
}
private void ProcessQueues()
{
TaskCompletionSource<T> tcs = null;
T firstItem = default(T);
lock (queueSyncLock)
{
while (true)
{
if (waitingQueue.TryPeek(out tcs) && queue.TryPeek(out firstItem))
{
waitingQueue.TryDequeue(out tcs);
if (tcs.Task.IsCanceled)
{
continue;
}
queue.TryDequeue(out firstItem);
}
else
{
break;
}
tcs.SetResult(firstItem);
}
}
}
}
It works well enough, but there's quite a lot of contention on queueSyncLock, as I make heavy use of the CancellationToken to cancel some of the waiting tasks. Of course, this still leads to considerably less blocking than I would see with a BlockingCollection, but...
I'm wondering if there is a smoother, lock-free means of achieving the same end.
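One way to avoid hand-rolling the cancellation bookkeeping is to let System.Threading.Channels handle it, since ChannelReader<T>.ReadAsync accepts a CancellationToken. The sketch below is not a drop-in replacement for the class above and makes no claim about being lock-free internally; it only shows that a canceled wait does not consume an item. It assumes .NET Core 3.0 or later with C# 9 top-level statements, and the 100 ms timeout is arbitrary.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

var channel = Channel.CreateUnbounded<string>();

// A waiting reader that gives up after 100 ms because nothing was written.
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(100));
bool wasCanceled = false;
try
{
    await channel.Reader.ReadAsync(cts.Token);
}
catch (OperationCanceledException)
{
    wasCanceled = true;
}

// The canceled reader took nothing, so an item written later
// is still delivered to the next reader.
channel.Writer.TryWrite("hello");
string item = await channel.Reader.ReadAsync();

Console.WriteLine($"{wasCanceled} {item}"); // True hello
```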
Well, 8 years later I hit this very question and was about to implement the Microsoft AsyncQueue<T> class found in the NuGet package/namespace Microsoft.VisualStudio.Threading.
Thanks to @Theodor Zoulias for mentioning that this API may be outdated and that the Dataflow library would be a good alternative.
So I edited my AsyncQueue<> implementation to use BufferBlock<>. It's almost the same but works better.
I use this in an ASP.NET Core background thread and it runs fully async.
protected async Task MyRun()
{
BufferBlock<MyObj> queue = new BufferBlock<MyObj>();
Task enqueueTask = StartDataIteration(queue);
while (await queue.OutputAvailableAsync())
{
var myObj = queue.Receive();
// do something with myObj
}
}
public async Task StartDataIteration(BufferBlock<MyObj> queue)
{
var cursor = await RunQuery();
while(await cursor.Next()) {
queue.Post(cursor.Current);
}
queue.Complete(); // <<< signals the consumer when queue.Count reaches 0
}
I found that using queue.OutputAvailableAsync() fixed the issue I had with AsyncQueue<>: determining when the queue was complete without having to inspect the dequeue task.
You could just use a BlockingCollection (using the default ConcurrentQueue) and wrap the call to Take in a Task so you can await it:
var bc = new BlockingCollection<T>();
T element = await Task.Run( () => bc.Take() );
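As a runnable sketch of the same idea (assuming C# 9 top-level statements, with int standing in for T):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var bc = new BlockingCollection<int>();

// Take() blocks, so Task.Run parks a thread-pool thread until an item arrives.
// The caller awaits without blocking, but one pool thread is held per pending Take.
Task<int> pending = Task.Run(() => bc.Take());

bc.Add(42);
int element = await pending;

Console.WriteLine(element); // 42
```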
