awaitable Task based queue - c#

I'm wondering if there exists an implementation/wrapper for ConcurrentQueue, similar to BlockingCollection where taking from the collection does not block, but is instead asynchronous and will cause an async await until an item is placed in the queue.
I've come up with my own implementation, but it does not seem to be performing as expected. I'm wondering if I'm reinventing something that already exists.
Here's my implementation:
public class MessageQueue<T>
{
ConcurrentQueue<T> queue = new ConcurrentQueue<T>();
ConcurrentQueue<TaskCompletionSource<T>> waitingQueue =
new ConcurrentQueue<TaskCompletionSource<T>>();
object queueSyncLock = new object();
public void Enqueue(T item)
{
queue.Enqueue(item);
ProcessQueues();
}
public async Task<T> Dequeue()
{
TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
waitingQueue.Enqueue(tcs);
ProcessQueues();
return tcs.Task.IsCompleted ? tcs.Task.Result : await tcs.Task;
}
private void ProcessQueues()
{
TaskCompletionSource<T> tcs=null;
T firstItem=default(T);
while (true)
{
bool ok;
lock (queueSyncLock)
{
ok = waitingQueue.TryPeek(out tcs) && queue.TryPeek(out firstItem);
if (ok)
{
waitingQueue.TryDequeue(out tcs);
queue.TryDequeue(out firstItem);
}
}
if (!ok) break;
tcs.SetResult(firstItem);
}
}
}

I don't know of a lock-free solution, but you can take a look at the new Dataflow library, part of the Async CTP. A simple BufferBlock<T> should suffice, e.g.:
BufferBlock<int> buffer = new BufferBlock<int>();
Production and consumption are most easily done via extension methods on the dataflow block types.
Production is as simple as:
buffer.Post(13);
and consumption is async-ready:
int item = await buffer.ReceiveAsync();
I do recommend you use Dataflow if possible; making such a buffer both efficient and correct is more difficult than it first appears.

Simple approach with C# 8.0 IAsyncEnumerable and Dataflow library
// Instatiate an async queue
var queue = new AsyncQueue<int>();
// Then, loop through the elements of queue.
// This loop won't stop until it is canceled or broken out of
// (for that, use queue.WithCancellation(..) or break;)
await foreach(int i in queue) {
// Writes a line as soon as some other Task calls queue.Enqueue(..)
Console.WriteLine(i);
}
With an implementation of AsyncQueue as follows:
public class AsyncQueue<T> : IAsyncEnumerable<T>
{
private readonly SemaphoreSlim _enumerationSemaphore = new SemaphoreSlim(1);
private readonly BufferBlock<T> _bufferBlock = new BufferBlock<T>();
public void Enqueue(T item) =>
_bufferBlock.Post(item);
public async IAsyncEnumerator<T> GetAsyncEnumerator(CancellationToken token = default)
{
// We lock this so we only ever enumerate once at a time.
// That way we ensure all items are returned in a continuous
// fashion with no 'holes' in the data when two foreach compete.
await _enumerationSemaphore.WaitAsync();
try {
// Return new elements until cancellationToken is triggered.
while (true) {
// Make sure to throw on cancellation so the Task will transfer into a canceled state
token.ThrowIfCancellationRequested();
yield return await _bufferBlock.ReceiveAsync(token);
}
} finally {
_enumerationSemaphore.Release();
}
}
}

There is an official way to do this now: System.Threading.Channels. It's built into the core runtime on .NET Core 3.0 and higher (including .NET 5.0 and 6.0), but it's also available as a NuGet package on .NET Standard 2.0 and 2.1. You can read through the docs here.
var channel = System.Threading.Channels.Channel.CreateUnbounded<int>();
To enqueue work:
// This will succeed and finish synchronously if the channel is unbounded.
channel.Writer.TryWrite(42);
To complete the channel:
channel.Writer.TryComplete();
To read from the channel:
var i = await channel.Reader.ReadAsync();
Or, if you have .NET Core 3.0 or higher:
await foreach (int i in channel.Reader.ReadAllAsync())
{
// whatever processing on i...
}

One simple and easy way to implement this is with a SemaphoreSlim:
public class AwaitableQueue<T>
{
private SemaphoreSlim semaphore = new SemaphoreSlim(0);
private readonly object queueLock = new object();
private Queue<T> queue = new Queue<T>();
public void Enqueue(T item)
{
lock (queueLock)
{
queue.Enqueue(item);
semaphore.Release();
}
}
public T WaitAndDequeue(TimeSpan timeSpan, CancellationToken cancellationToken)
{
semaphore.Wait(timeSpan, cancellationToken);
lock (queueLock)
{
return queue.Dequeue();
}
}
public async Task<T> WhenDequeue(TimeSpan timeSpan, CancellationToken cancellationToken)
{
await semaphore.WaitAsync(timeSpan, cancellationToken);
lock (queueLock)
{
return queue.Dequeue();
}
}
}
The beauty of this is that the SemaphoreSlim handles all of the complexity of implementing the Wait() and WaitAsync() functionality. The downside is that queue length is tracked by both the semaphore and the queue itself, and they both magically stay in sync.

My atempt (it have an event raised when a "promise" is created, and it can be used by an external producer to know when to produce more items):
public class AsyncQueue<T>
{
private ConcurrentQueue<T> _bufferQueue;
private ConcurrentQueue<TaskCompletionSource<T>> _promisesQueue;
private object _syncRoot = new object();
public AsyncQueue()
{
_bufferQueue = new ConcurrentQueue<T>();
_promisesQueue = new ConcurrentQueue<TaskCompletionSource<T>>();
}
/// <summary>
/// Enqueues the specified item.
/// </summary>
/// <param name="item">The item.</param>
public void Enqueue(T item)
{
TaskCompletionSource<T> promise;
do
{
if (_promisesQueue.TryDequeue(out promise) &&
!promise.Task.IsCanceled &&
promise.TrySetResult(item))
{
return;
}
}
while (promise != null);
lock (_syncRoot)
{
if (_promisesQueue.TryDequeue(out promise) &&
!promise.Task.IsCanceled &&
promise.TrySetResult(item))
{
return;
}
_bufferQueue.Enqueue(item);
}
}
/// <summary>
/// Dequeues the asynchronous.
/// </summary>
/// <param name="cancellationToken">The cancellation token.</param>
/// <returns></returns>
public Task<T> DequeueAsync(CancellationToken cancellationToken)
{
T item;
if (!_bufferQueue.TryDequeue(out item))
{
lock (_syncRoot)
{
if (!_bufferQueue.TryDequeue(out item))
{
var promise = new TaskCompletionSource<T>();
cancellationToken.Register(() => promise.TrySetCanceled());
_promisesQueue.Enqueue(promise);
this.PromiseAdded.RaiseEvent(this, EventArgs.Empty);
return promise.Task;
}
}
}
return Task.FromResult(item);
}
/// <summary>
/// Gets a value indicating whether this instance has promises.
/// </summary>
/// <value>
/// <c>true</c> if this instance has promises; otherwise, <c>false</c>.
/// </value>
public bool HasPromises
{
get { return _promisesQueue.Where(p => !p.Task.IsCanceled).Count() > 0; }
}
/// <summary>
/// Occurs when a new promise
/// is generated by the queue
/// </summary>
public event EventHandler PromiseAdded;
}

It may be overkill for your use case (given the learning curve), but Reactive Extentions provides all the glue you could ever want for asynchronous composition.
You essentially subscribe to changes and they are pushed to you as they become available, and you can have the system push the changes on a separate thread.

Check out https://github.com/somdoron/AsyncCollection, you can both dequeue asynchronously and use C# 8.0 IAsyncEnumerable.
The API is very similar to BlockingCollection.
AsyncCollection<int> collection = new AsyncCollection<int>();
var t = Task.Run(async () =>
{
while (!collection.IsCompleted)
{
var item = await collection.TakeAsync();
// process
}
});
for (int i = 0; i < 1000; i++)
{
collection.Add(i);
}
collection.CompleteAdding();
t.Wait();
With IAsyncEnumeable:
AsyncCollection<int> collection = new AsyncCollection<int>();
var t = Task.Run(async () =>
{
await foreach (var item in collection)
{
// process
}
});
for (int i = 0; i < 1000; i++)
{
collection.Add(i);
}
collection.CompleteAdding();
t.Wait();

Here's the implementation I'm currently using.
public class MessageQueue<T>
{
ConcurrentQueue<T> queue = new ConcurrentQueue<T>();
ConcurrentQueue<TaskCompletionSource<T>> waitingQueue =
new ConcurrentQueue<TaskCompletionSource<T>>();
object queueSyncLock = new object();
public void Enqueue(T item)
{
queue.Enqueue(item);
ProcessQueues();
}
public async Task<T> DequeueAsync(CancellationToken ct)
{
TaskCompletionSource<T> tcs = new TaskCompletionSource<T>();
ct.Register(() =>
{
lock (queueSyncLock)
{
tcs.TrySetCanceled();
}
});
waitingQueue.Enqueue(tcs);
ProcessQueues();
return tcs.Task.IsCompleted ? tcs.Task.Result : await tcs.Task;
}
private void ProcessQueues()
{
TaskCompletionSource<T> tcs = null;
T firstItem = default(T);
lock (queueSyncLock)
{
while (true)
{
if (waitingQueue.TryPeek(out tcs) && queue.TryPeek(out firstItem))
{
waitingQueue.TryDequeue(out tcs);
if (tcs.Task.IsCanceled)
{
continue;
}
queue.TryDequeue(out firstItem);
}
else
{
break;
}
tcs.SetResult(firstItem);
}
}
}
}
It works good enough, but there's quite a lot of contention on queueSyncLock, as I am making quite a lot of use of the CancellationToken to cancel some of the waiting tasks. Of course, this leads to considerably less blocking I would see with a BlockingCollection but...
I'm wondering if there is a smoother, lock free means of achieving the same end

Well 8 years later I hit this very question and was about to implement the MS AsyncQueue<T> class found in nuget package/namespace: Microsoft.VisualStudio.Threading
Thanks to #Theodor Zoulias for mentioning this api may be outdated and the DataFlow lib would be a good alternative.
So I edited my AsyncQueue<> implementation to use BufferBlock<>. Almost the same but works better.
I use this in an AspNet Core background thread and it runs fully async.
protected async Task MyRun()
{
BufferBlock<MyObj> queue = new BufferBlock<MyObj>();
Task enqueueTask = StartDataIteration(queue);
while (await queue.OutputAvailableAsync())
{
var myObj = queue.Receive();
// do something with myObj
}
}
public async Task StartDataIteration(BufferBlock<MyObj> queue)
{
var cursor = await RunQuery();
while(await cursor.Next()) {
queue.Post(cursor.Current);
}
queue.Complete(); // <<< signals the consumer when queue.Count reaches 0
}
I found that using the queue.OutputAvailableAsync() fixed the issue that I had with AsyncQueue<> -- trying to determine when the queue was complete and not having to inspect the dequeue task.

You could just use a BlockingCollection ( using the default ConcurrentQueue ) and wrap the call to Take in a Task so you can await it:
var bc = new BlockingCollection<T>();
T element = await Task.Run( () => bc.Take() );

Related

ForEach() gives "A second operation started on this context before a previous operation completed". foreach loop does not [duplicate]

Is it possible to use Async when using ForEach? Below is the code I am trying:
using (DataContext db = new DataLayer.DataContext())
{
db.Groups.ToList().ForEach(i => async {
await GetAdminsFromGroup(i.Gid);
});
}
I am getting the error:
The name 'Async' does not exist in the current context
The method the using statement is enclosed in is set to async.
List<T>.ForEach doesn't play particularly well with async (neither does LINQ-to-objects, for the same reasons).
In this case, I recommend projecting each element into an asynchronous operation, and you can then (asynchronously) wait for them all to complete.
using (DataContext db = new DataLayer.DataContext())
{
var tasks = db.Groups.ToList().Select(i => GetAdminsFromGroupAsync(i.Gid));
var results = await Task.WhenAll(tasks);
}
The benefits of this approach over giving an async delegate to ForEach are:
Error handling is more proper. Exceptions from async void cannot be caught with catch; this approach will propagate exceptions at the await Task.WhenAll line, allowing natural exception handling.
You know that the tasks are complete at the end of this method, since it does an await Task.WhenAll. If you use async void, you cannot easily tell when the operations have completed.
This approach has a natural syntax for retrieving the results. GetAdminsFromGroupAsync sounds like it's an operation that produces a result (the admins), and such code is more natural if such operations can return their results rather than setting a value as a side effect.
This little extension method should give you exception-safe async iteration:
public static async Task ForEachAsync<T>(this List<T> list, Func<T, Task> func)
{
foreach (var value in list)
{
await func(value);
}
}
Since we're changing the return type of the lambda from void to Task, exceptions will propagate up correctly. This will allow you to write something like this in practice:
await db.Groups.ToList().ForEachAsync(async i => {
await GetAdminsFromGroup(i.Gid);
});
Starting with C# 8.0, you can create and consume streams asynchronously.
private async void button1_Click(object sender, EventArgs e)
{
IAsyncEnumerable<int> enumerable = GenerateSequence();
await foreach (var i in enumerable)
{
Debug.WriteLine(i);
}
}
public static async IAsyncEnumerable<int> GenerateSequence()
{
for (int i = 0; i < 20; i++)
{
await Task.Delay(100);
yield return i;
}
}
More
The simple answer is to use the foreach keyword instead of the ForEach() method of List().
using (DataContext db = new DataLayer.DataContext())
{
foreach(var i in db.Groups)
{
await GetAdminsFromGroup(i.Gid);
}
}
Here is an actual working version of the above async foreach variants with sequential processing:
public static async Task ForEachAsync<T>(this List<T> enumerable, Action<T> action)
{
foreach (var item in enumerable)
await Task.Run(() => { action(item); }).ConfigureAwait(false);
}
Here is the implementation:
public async void SequentialAsync()
{
var list = new List<Action>();
Action action1 = () => {
//do stuff 1
};
Action action2 = () => {
//do stuff 2
};
list.Add(action1);
list.Add(action2);
await list.ForEachAsync();
}
What's the key difference? .ConfigureAwait(false); which keeps the context of main thread while async sequential processing of each task.
This is not an old question, but .Net 6 introduced Parallel.ForeachAsync:
var collectionToIterate = db.Groups.ToList();
await Parallel.ForEachAsync(collectionToIterate, async (i, token) =>
{
await GetAdminsFromGroup(i);
});
ForeachAsync also accepts a ParallelOptions object, but usually you don't want to mess with the MaxDegreeOfParallelism property:
ParallelOptions parallelOptions = new ParallelOptions { MaxDegreeOfParallelism = 4 };
var collectionToIterate = db.Groups.ToList();
await Parallel.ForEachAsync(collectionToIterate, parallelOptions , async (i, token) =>
{
await GetAdminsFromGroup(i);
});
From Microsoft Docs: https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.paralleloptions.maxdegreeofparallelism?view=net-6.0
By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used.
Generally, you do not need to modify this setting....
Add this extension method
public static class ForEachAsyncExtension
{
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate
{
using (partition)
while (partition.MoveNext())
await body(partition.Current).ConfigureAwait(false);
}));
}
}
And then use like so:
Task.Run(async () =>
{
var s3 = new AmazonS3Client(Config.Instance.Aws.Credentials, Config.Instance.Aws.RegionEndpoint);
var buckets = await s3.ListBucketsAsync();
foreach (var s3Bucket in buckets.Buckets)
{
if (s3Bucket.BucketName.StartsWith("mybucket-"))
{
log.Information("Bucket => {BucketName}", s3Bucket.BucketName);
ListObjectsResponse objects;
try
{
objects = await s3.ListObjectsAsync(s3Bucket.BucketName);
}
catch
{
log.Error("Error getting objects. Bucket => {BucketName}", s3Bucket.BucketName);
continue;
}
// ForEachAsync (4 is how many tasks you want to run in parallel)
await objects.S3Objects.ForEachAsync(4, async s3Object =>
{
try
{
log.Information("Bucket => {BucketName} => {Key}", s3Bucket.BucketName, s3Object.Key);
await s3.DeleteObjectAsync(s3Bucket.BucketName, s3Object.Key);
}
catch
{
log.Error("Error deleting bucket {BucketName} object {Key}", s3Bucket.BucketName, s3Object.Key);
}
});
try
{
await s3.DeleteBucketAsync(s3Bucket.BucketName);
}
catch
{
log.Error("Error deleting bucket {BucketName}", s3Bucket.BucketName);
}
}
}
}).Wait();
If you are using EntityFramework.Core there is an extension method ForEachAsync.
The example usage looks like this:
using Microsoft.EntityFrameworkCore;
using System.Threading.Tasks;
public class Example
{
private readonly DbContext _dbContext;
public Example(DbContext dbContext)
{
_dbContext = dbContext;
}
public async void LogicMethod()
{
await _dbContext.Set<dbTable>().ForEachAsync(async x =>
{
//logic
await AsyncTask(x);
});
}
public async Task<bool> AsyncTask(object x)
{
//other logic
return await Task.FromResult<bool>(true);
}
}
I would like to add that there is a Parallel class with ForEach function built in that can be used for this purpose.
The problem was that the async keyword needs to appear before the lambda, not before the body:
db.Groups.ToList().ForEach(async (i) => {
await GetAdminsFromGroup(i.Gid);
});
This is method I created to handle async scenarios with ForEach.
If one of tasks fails then other tasks will continue their execution.
You have ability to add function that will be executed on every exception.
Exceptions are being collected as aggregateException at the end and are available for you.
Can handle CancellationToken
public static class ParallelExecutor
{
/// <summary>
/// Executes asynchronously given function on all elements of given enumerable with task count restriction.
/// Executor will continue starting new tasks even if one of the tasks throws. If at least one of the tasks throwed exception then <see cref="AggregateException"/> is throwed at the end of the method run.
/// </summary>
/// <typeparam name="T">Type of elements in enumerable</typeparam>
/// <param name="maxTaskCount">The maximum task count.</param>
/// <param name="enumerable">The enumerable.</param>
/// <param name="asyncFunc">asynchronous function that will be executed on every element of the enumerable. MUST be thread safe.</param>
/// <param name="onException">Acton that will be executed on every exception that would be thrown by asyncFunc. CAN be thread unsafe.</param>
/// <param name="cancellationToken">The cancellation token.</param>
public static async Task ForEachAsync<T>(int maxTaskCount, IEnumerable<T> enumerable, Func<T, Task> asyncFunc, Action<Exception> onException = null, CancellationToken cancellationToken = default)
{
using var semaphore = new SemaphoreSlim(initialCount: maxTaskCount, maxCount: maxTaskCount);
// This `lockObject` is used only in `catch { }` block.
object lockObject = new object();
var exceptions = new List<Exception>();
var tasks = new Task[enumerable.Count()];
int i = 0;
try
{
foreach (var t in enumerable)
{
await semaphore.WaitAsync(cancellationToken);
tasks[i++] = Task.Run(
async () =>
{
try
{
await asyncFunc(t);
}
catch (Exception e)
{
if (onException != null)
{
lock (lockObject)
{
onException.Invoke(e);
}
}
// This exception will be swallowed here but it will be collected at the end of ForEachAsync method in order to generate AggregateException.
throw;
}
finally
{
semaphore.Release();
}
}, cancellationToken);
if (cancellationToken.IsCancellationRequested)
{
break;
}
}
}
catch (OperationCanceledException e)
{
exceptions.Add(e);
}
foreach (var t in tasks)
{
if (cancellationToken.IsCancellationRequested)
{
break;
}
// Exception handling in this case is actually pretty fast.
// https://gist.github.com/shoter/d943500eda37c7d99461ce3dace42141
try
{
await t;
}
#pragma warning disable CA1031 // Do not catch general exception types - we want to throw that exception later as aggregate exception. Nothing wrong here.
catch (Exception e)
#pragma warning restore CA1031 // Do not catch general exception types
{
exceptions.Add(e);
}
}
if (exceptions.Any())
{
throw new AggregateException(exceptions);
}
}
}

A simple data producer/consumer implementation using IProducerConsumerCollection? [duplicate]

I would like to await on the result of BlockingCollection<T>.Take() asynchronously, so I do not block the thread. Looking for anything like this:
var item = await blockingCollection.TakeAsync();
I know I could do this:
var item = await Task.Run(() => blockingCollection.Take());
but that kinda kills the whole idea, because another thread (of ThreadPool) gets blocked instead.
Is there any alternative?
There are four alternatives that I know of.
The first is Channels, which provides a threadsafe queue that supports asynchronous Read and Write operations. Channels are highly optimized and optionally support dropping some items if a threshold is reached.
The next is BufferBlock<T> from TPL Dataflow. If you only have a single consumer, you can use OutputAvailableAsync or ReceiveAsync, or just link it to an ActionBlock<T>. For more information, see my blog.
The last two are types that I created, available in my AsyncEx library.
AsyncCollection<T> is the async near-equivalent of BlockingCollection<T>, capable of wrapping a concurrent producer/consumer collection such as ConcurrentQueue<T> or ConcurrentBag<T>. You can use TakeAsync to asynchronously consume items from the collection. For more information, see my blog.
AsyncProducerConsumerQueue<T> is a more portable async-compatible producer/consumer queue. You can use DequeueAsync to asynchronously consume items from the queue. For more information, see my blog.
The last three of these alternatives allow synchronous and asynchronous puts and takes.
...or you can do this:
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
public class AsyncQueue<T>
{
private readonly SemaphoreSlim _sem;
private readonly ConcurrentQueue<T> _que;
public AsyncQueue()
{
_sem = new SemaphoreSlim(0);
_que = new ConcurrentQueue<T>();
}
public void Enqueue(T item)
{
_que.Enqueue(item);
_sem.Release();
}
public void EnqueueRange(IEnumerable<T> source)
{
var n = 0;
foreach (var item in source)
{
_que.Enqueue(item);
n++;
}
_sem.Release(n);
}
public async Task<T> DequeueAsync(CancellationToken cancellationToken = default(CancellationToken))
{
for (; ; )
{
await _sem.WaitAsync(cancellationToken);
T item;
if (_que.TryDequeue(out item))
{
return item;
}
}
}
}
Simple, fully functional asynchronous FIFO queue.
Note: SemaphoreSlim.WaitAsync was added in .NET 4.5 before that, this was not all that straightforward.
Here is a very basic implementation of a BlockingCollection that supports awaiting, with lots of missing features. It uses the AsyncEnumerable library, that makes asynchronous enumeration possible for C# versions older than 8.0.
public class AsyncBlockingCollection<T>
{ // Missing features: cancellation, boundedCapacity, TakeAsync
private Queue<T> _queue = new Queue<T>();
private SemaphoreSlim _semaphore = new SemaphoreSlim(0);
private int _consumersCount = 0;
private bool _isAddingCompleted;
public void Add(T item)
{
lock (_queue)
{
if (_isAddingCompleted) throw new InvalidOperationException();
_queue.Enqueue(item);
}
_semaphore.Release();
}
public void CompleteAdding()
{
lock (_queue)
{
if (_isAddingCompleted) return;
_isAddingCompleted = true;
if (_consumersCount > 0) _semaphore.Release(_consumersCount);
}
}
public IAsyncEnumerable<T> GetConsumingEnumerable()
{
lock (_queue) _consumersCount++;
return new AsyncEnumerable<T>(async yield =>
{
while (true)
{
lock (_queue)
{
if (_queue.Count == 0 && _isAddingCompleted) break;
}
await _semaphore.WaitAsync();
bool hasItem;
T item = default;
lock (_queue)
{
hasItem = _queue.Count > 0;
if (hasItem) item = _queue.Dequeue();
}
if (hasItem) await yield.ReturnAsync(item);
}
});
}
}
Usage example:
var abc = new AsyncBlockingCollection<int>();
var producer = Task.Run(async () =>
{
for (int i = 1; i <= 10; i++)
{
await Task.Delay(100);
abc.Add(i);
}
abc.CompleteAdding();
});
var consumer = Task.Run(async () =>
{
await abc.GetConsumingEnumerable().ForEachAsync(async item =>
{
await Task.Delay(200);
await Console.Out.WriteAsync(item + " ");
});
});
await Task.WhenAll(producer, consumer);
Output:
1 2 3 4 5 6 7 8 9 10
Update: With the release of C# 8, asynchronous enumeration has become a build-in language feature. The required classes (IAsyncEnumerable, IAsyncEnumerator) are embedded in .NET Core 3.0, and are offered as a package for .NET Framework 4.6.1+ (Microsoft.Bcl.AsyncInterfaces).
Here is an alternative GetConsumingEnumerable implementation, featuring the new C# 8 syntax:
public async IAsyncEnumerable<T> GetConsumingEnumerable()
{
lock (_queue) _consumersCount++;
while (true)
{
lock (_queue)
{
if (_queue.Count == 0 && _isAddingCompleted) break;
}
await _semaphore.WaitAsync();
bool hasItem;
T item = default;
lock (_queue)
{
hasItem = _queue.Count > 0;
if (hasItem) item = _queue.Dequeue();
}
if (hasItem) yield return item;
}
}
Note the coexistence of await and yield in the same method.
Usage example (C# 8):
var consumer = Task.Run(async () =>
{
await foreach (var item in abc.GetConsumingEnumerable())
{
await Task.Delay(200);
await Console.Out.WriteAsync(item + " ");
}
});
Note the await before the foreach.
If you don't mind a little hack, you can try these extensions.
public static async Task AddAsync<TEntity>(
this BlockingCollection<TEntity> Bc, TEntity item, CancellationToken abortCt)
{
while (true)
{
try
{
if (Bc.TryAdd(item, 0, abortCt))
return;
else
await Task.Delay(100, abortCt);
}
catch (Exception)
{
throw;
}
}
}
public static async Task<TEntity> TakeAsync<TEntity>(
this BlockingCollection<TEntity> Bc, CancellationToken abortCt)
{
while (true)
{
try
{
TEntity item;
if (Bc.TryTake(out item, 0, abortCt))
return item;
else
await Task.Delay(100, abortCt);
}
catch (Exception)
{
throw;
}
}
}

Custom thread pool supporting async actions

I would like to have a custom thread pool satisfying the following requirements:
Real threads are preallocated according to the pool capacity. The actual work is free to use the standard .NET thread pool, if needed to spawn concurrent tasks.
The pool must be able to return the number of idle threads. The returned number may be less than the actual number of the idle threads, but it must not be greater. Of course, the more accurate the number the better.
Queuing work to the pool should return a corresponding Task, which should place nice with the Task based API.
NEW The max job capacity (or degree of parallelism) should be adjustable dynamically. Trying to reduce the capacity does not have to take effect immediately, but increasing it should do so immediately.
The rationale for the first item is depicted below:
The machine is not supposed to be running more than N work items concurrently, where N is relatively small - between 10 and 30.
The work is fetched from the database and if K items are fetched then we want to make sure that there are K idle threads to start the work right away. A situation where work is fetched from the database, but remains waiting for the next available thread is unacceptable.
The last item also explains the reason for having the idle thread count - I am going to fetch that many work items from the database. It also explains why the reported idle thread count must never be higher than the actual one - otherwise I might fetch more work that can be immediately started.
Anyway, here is my implementation along with a small program to test it (BJE stands for Background Job Engine):
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace TaskStartLatency
{
public class BJEThreadPool
{
private sealed class InternalTaskScheduler : TaskScheduler
{
private int m_idleThreadCount;
private readonly BlockingCollection<Task> m_bus;
public InternalTaskScheduler(int threadCount, BlockingCollection<Task> bus)
{
m_idleThreadCount = threadCount;
m_bus = bus;
}
public void RunInline(Task task)
{
Interlocked.Decrement(ref m_idleThreadCount);
try
{
TryExecuteTask(task);
}
catch
{
// The action is responsible itself for the error handling, for the time being...
}
Interlocked.Increment(ref m_idleThreadCount);
}
public int IdleThreadCount
{
get { return m_idleThreadCount; }
}
#region Overrides of TaskScheduler
protected override void QueueTask(Task task)
{
m_bus.Add(task);
}
protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
{
return TryExecuteTask(task);
}
protected override IEnumerable<Task> GetScheduledTasks()
{
throw new NotSupportedException();
}
#endregion
public void DecrementIdleThreadCount()
{
Interlocked.Decrement(ref m_idleThreadCount);
}
}
private class ThreadContext
{
private readonly InternalTaskScheduler m_ts;
private readonly BlockingCollection<Task> m_bus;
private readonly CancellationTokenSource m_cts;
public readonly Thread Thread;
public ThreadContext(string name, InternalTaskScheduler ts, BlockingCollection<Task> bus, CancellationTokenSource cts)
{
m_ts = ts;
m_bus = bus;
m_cts = cts;
Thread = new Thread(Start)
{
IsBackground = true,
Name = name
};
Thread.Start();
}
private void Start()
{
try
{
foreach (var task in m_bus.GetConsumingEnumerable(m_cts.Token))
{
m_ts.RunInline(task);
}
}
catch (OperationCanceledException)
{
}
m_ts.DecrementIdleThreadCount();
}
}
private readonly InternalTaskScheduler m_ts;
private readonly CancellationTokenSource m_cts = new CancellationTokenSource();
private readonly BlockingCollection<Task> m_bus = new BlockingCollection<Task>();
private readonly List<ThreadContext> m_threadCtxs = new List<ThreadContext>();
public BJEThreadPool(int threadCount)
{
m_ts = new InternalTaskScheduler(threadCount, m_bus);
for (int i = 0; i < threadCount; ++i)
{
m_threadCtxs.Add(new ThreadContext("BJE Thread " + i, m_ts, m_bus, m_cts));
}
}
public void Terminate()
{
m_cts.Cancel();
foreach (var t in m_threadCtxs)
{
t.Thread.Join();
}
}
public Task Run(Action<CancellationToken> action)
{
return Task.Factory.StartNew(() => action(m_cts.Token), m_cts.Token, TaskCreationOptions.DenyChildAttach, m_ts);
}
public Task Run(Action action)
{
return Task.Factory.StartNew(action, m_cts.Token, TaskCreationOptions.DenyChildAttach, m_ts);
}
public int IdleThreadCount
{
get { return m_ts.IdleThreadCount; }
}
}
class Program
{
static void Main()
{
const int THREAD_COUNT = 32;
var pool = new BJEThreadPool(THREAD_COUNT);
var tcs = new TaskCompletionSource<bool>();
var tasks = new List<Task>();
var allRunning = new CountdownEvent(THREAD_COUNT);
for (int i = pool.IdleThreadCount; i > 0; --i)
{
var index = i;
tasks.Add(pool.Run(cancellationToken =>
{
Console.WriteLine("Started action " + index);
allRunning.Signal();
tcs.Task.Wait(cancellationToken);
Console.WriteLine(" Ended action " + index);
}));
}
Console.WriteLine("pool.IdleThreadCount = " + pool.IdleThreadCount);
allRunning.Wait();
Debug.Assert(pool.IdleThreadCount == 0);
int expectedIdleThreadCount = THREAD_COUNT;
Console.WriteLine("Press [c]ancel, [e]rror, [a]bort or any other key");
switch (Console.ReadKey().KeyChar)
{
case 'c':
Console.WriteLine("Cancel All");
tcs.TrySetCanceled();
break;
case 'e':
Console.WriteLine("Error All");
tcs.TrySetException(new Exception("Failed"));
break;
case 'a':
Console.WriteLine("Abort All");
pool.Terminate();
expectedIdleThreadCount = 0;
break;
default:
Console.WriteLine("Done All");
tcs.TrySetResult(true);
break;
}
try
{
Task.WaitAll(tasks.ToArray());
}
catch (AggregateException exc)
{
Console.WriteLine(exc.Flatten().InnerException.Message);
}
Debug.Assert(pool.IdleThreadCount == expectedIdleThreadCount);
pool.Terminate();
Console.WriteLine("Press any key");
Console.ReadKey();
}
}
}
It is a very simple implementation and it appears to be working. However, there is a problem - the BJEThreadPool.Run method does not accept asynchronous methods. I.e. my implementation does not allow me to add the following overloads:
public Task Run(Func<CancellationToken, Task> action)
{
return Task.Factory.StartNew(() => action(m_cts.Token), m_cts.Token, TaskCreationOptions.DenyChildAttach, m_ts).Unwrap();
}
public Task Run(Func<Task> action)
{
return Task.Factory.StartNew(action, m_cts.Token, TaskCreationOptions.DenyChildAttach, m_ts).Unwrap();
}
The pattern I use in InternalTaskScheduler.RunInline does not work in this case.
So, my question is how to add the support for asynchronous work items? I am fine with changing the entire design as long as the requirements outlined at the beginning of the post are upheld.
EDIT
I would like to clarify the intented usage of the desired pool. Please, observe the following code:
if (pool.IdleThreadCount == 0)
{
return;
}
foreach (var jobData in FetchFromDB(pool.IdleThreadCount))
{
pool.Run(CreateJobAction(jobData));
}
Notes:
The code is going to be run periodically, say every 1 minute.
The code is going to be run concurrently by multiple machines watching the same database.
FetchFromDB is going to use the technique described in Using SQL Server as a DB queue with multiple clients to atomically fetch and lock the work from the DB.
CreateJobAction is going to invoke the code denoted by jobData (the job code) and close the work upon the completion of that code. The job code is out of my control and it could be pretty much anything - heavy CPU bound code or light asynchronous IO bound code, badly written synchronous IO bound code or a mix of all. It could run for minutes and it could run for hours. Closing the work is my code and it would by asynchronous IO bound code. Because of this, the signature of the returned job action is that of an asynchronous method.
Item 2 underlines the importance of correctly identifying the amount of idle threads. If there are 900 pending work items and 10 agent machines I cannot allow an agent to fetch 300 work items and queue them on the thread pool. Why? Because, it is most unlikely that the agent will be able to run 300 work items concurrently. It will run some, sure enough, but others will be waiting in the thread pool work queue. Suppose it will run 100 and let 200 wait (even though 100 is probably far fetched). This wields 3 fully loaded agents and 7 idle ones. But only 300 work items out of 900 are actually being processed concurrently!!!
My goal is to maximize the spread of the work amongst the available agents. Ideally, I should evaluate the load of an agent and the "heaviness" of the pending work, but it is a formidable task and is reserved for the future versions. Right now, I wish to assign each agent the max job capacity with the intention to provide the means to increase/decrease it dynamically without restarting the agents.
Next observation. The work can take quite a long time to run and it could be all synchronous code. As far as I understand it is undesirable to utilize thread pool threads for such kind of work.
EDIT 2
There is a statement that TaskScheduler is only for the CPU bound work. But what if I do not know the nature of the work? I mean it is a general purpose Background Job Engine and it runs thousands of different kinds of jobs. I do not have means to tell "that job is CPU bound" and "that on is synchronous IO bound" and yet another one is asynchronous IO bound. I wish I could, but I cannot.
EDIT 3
At the end, I do not use the SemaphoreSlim, but neither do I use the TaskScheduler - it finally trickled down my thick skull that it is unappropriate and plain wrong, plus it makes the code overly complex.
Still, I failed to see how SemaphoreSlim is the way. The proposed pattern:
public async Task Enqueue(Func<Task> taskGenerator)
{
await semaphore.WaitAsync();
try
{
await taskGenerator();
}
finally
{
semaphore.Release();
}
}
Expects taskGenerator either be an asynchronous IO bound code or open a new thread otherwise. However, I have no means to determine whether the work to be executed is one or another. Plus, as I have learned from SemaphoreSlim.WaitAsync continuation code if the semaphore is unlocked, the code following the WaitAsync() is going to run on the same thread, which is not very good for me.
Anyway, below is my implementation, in case anyone fancies. Unfortunately, I am yet to understand how to reduce the pool thread count dynamically, but this is a topic for another question.
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace TaskStartLatency
{
public interface IBJEThreadPool
{
void SetThreadCount(int threadCount);
void Terminate();
Task Run(Action action);
Task Run(Action<CancellationToken> action);
Task Run(Func<Task> action);
Task Run(Func<CancellationToken, Task> action);
int IdleThreadCount { get; }
}
public class BJEThreadPool : IBJEThreadPool
{
private interface IActionContext
{
Task Run(CancellationToken ct);
TaskCompletionSource<object> TaskCompletionSource { get; }
}
private class ActionContext : IActionContext
{
private readonly Action m_action;
public ActionContext(Action action)
{
m_action = action;
TaskCompletionSource = new TaskCompletionSource<object>();
}
#region Implementation of IActionContext
public Task Run(CancellationToken ct)
{
m_action();
return null;
}
public TaskCompletionSource<object> TaskCompletionSource { get; private set; }
#endregion
}
private class CancellableActionContext : IActionContext
{
private readonly Action<CancellationToken> m_action;
public CancellableActionContext(Action<CancellationToken> action)
{
m_action = action;
TaskCompletionSource = new TaskCompletionSource<object>();
}
#region Implementation of IActionContext
public Task Run(CancellationToken ct)
{
m_action(ct);
return null;
}
public TaskCompletionSource<object> TaskCompletionSource { get; private set; }
#endregion
}
private class AsyncActionContext : IActionContext
{
private readonly Func<Task> m_action;
public AsyncActionContext(Func<Task> action)
{
m_action = action;
TaskCompletionSource = new TaskCompletionSource<object>();
}
#region Implementation of IActionContext
public Task Run(CancellationToken ct)
{
return m_action();
}
public TaskCompletionSource<object> TaskCompletionSource { get; private set; }
#endregion
}
private class AsyncCancellableActionContext : IActionContext
{
private readonly Func<CancellationToken, Task> m_action;
public AsyncCancellableActionContext(Func<CancellationToken, Task> action)
{
m_action = action;
TaskCompletionSource = new TaskCompletionSource<object>();
}
#region Implementation of IActionContext
public Task Run(CancellationToken ct)
{
return m_action(ct);
}
public TaskCompletionSource<object> TaskCompletionSource { get; private set; }
#endregion
}
private readonly CancellationTokenSource m_ctsTerminateAll = new CancellationTokenSource();
private readonly BlockingCollection<IActionContext> m_bus = new BlockingCollection<IActionContext>();
private readonly LinkedList<Thread> m_threads = new LinkedList<Thread>();
private int m_idleThreadCount;
private static int s_threadCount;
public BJEThreadPool(int threadCount)
{
ReserveAdditionalThreads(threadCount);
}
private void ReserveAdditionalThreads(int n)
{
for (int i = 0; i < n; ++i)
{
var index = Interlocked.Increment(ref s_threadCount) - 1;
var t = new Thread(Start)
{
IsBackground = true,
Name = "BJE Thread " + index
};
Interlocked.Increment(ref m_idleThreadCount);
t.Start();
m_threads.AddLast(t);
}
}
private void Start()
{
try
{
foreach (var actionContext in m_bus.GetConsumingEnumerable(m_ctsTerminateAll.Token))
{
RunWork(actionContext).Wait();
}
}
catch (OperationCanceledException)
{
}
catch
{
// Should never happen - log the error
}
Interlocked.Decrement(ref m_idleThreadCount);
}
private async Task RunWork(IActionContext actionContext)
{
Interlocked.Decrement(ref m_idleThreadCount);
try
{
var task = actionContext.Run(m_ctsTerminateAll.Token);
if (task != null)
{
await task;
}
actionContext.TaskCompletionSource.SetResult(null);
}
catch (OperationCanceledException)
{
actionContext.TaskCompletionSource.TrySetCanceled();
}
catch (Exception exc)
{
actionContext.TaskCompletionSource.TrySetException(exc);
}
Interlocked.Increment(ref m_idleThreadCount);
}
private Task PostWork(IActionContext actionContext)
{
m_bus.Add(actionContext);
return actionContext.TaskCompletionSource.Task;
}
#region Implementation of IBJEThreadPool
public void SetThreadCount(int threadCount)
{
if (threadCount > m_threads.Count)
{
ReserveAdditionalThreads(threadCount - m_threads.Count);
}
else if (threadCount < m_threads.Count)
{
throw new NotSupportedException();
}
}
public void Terminate()
{
m_ctsTerminateAll.Cancel();
foreach (var t in m_threads)
{
t.Join();
}
}
public Task Run(Action action)
{
return PostWork(new ActionContext(action));
}
public Task Run(Action<CancellationToken> action)
{
return PostWork(new CancellableActionContext(action));
}
public Task Run(Func<Task> action)
{
return PostWork(new AsyncActionContext(action));
}
public Task Run(Func<CancellationToken, Task> action)
{
return PostWork(new AsyncCancellableActionContext(action));
}
public int IdleThreadCount
{
get { return m_idleThreadCount; }
}
#endregion
}
public static class Extensions
{
public static Task WithCancellation(this Task task, CancellationToken token)
{
return task.ContinueWith(t => t.GetAwaiter().GetResult(), token);
}
}
class Program
{
static void Main()
{
const int THREAD_COUNT = 16;
var pool = new BJEThreadPool(THREAD_COUNT);
var tcs = new TaskCompletionSource<bool>();
var tasks = new List<Task>();
var allRunning = new CountdownEvent(THREAD_COUNT);
for (int i = pool.IdleThreadCount; i > 0; --i)
{
var index = i;
tasks.Add(pool.Run(async ct =>
{
Console.WriteLine("Started action " + index);
allRunning.Signal();
await tcs.Task.WithCancellation(ct);
Console.WriteLine(" Ended action " + index);
}));
}
Console.WriteLine("pool.IdleThreadCount = " + pool.IdleThreadCount);
allRunning.Wait();
Debug.Assert(pool.IdleThreadCount == 0);
int expectedIdleThreadCount = THREAD_COUNT;
Console.WriteLine("Press [c]ancel, [e]rror, [a]bort or any other key");
switch (Console.ReadKey().KeyChar)
{
case 'c':
Console.WriteLine("ancel All");
tcs.TrySetCanceled();
break;
case 'e':
Console.WriteLine("rror All");
tcs.TrySetException(new Exception("Failed"));
break;
case 'a':
Console.WriteLine("bort All");
pool.Terminate();
expectedIdleThreadCount = 0;
break;
default:
Console.WriteLine("Done All");
tcs.TrySetResult(true);
break;
}
try
{
Task.WaitAll(tasks.ToArray());
}
catch (AggregateException exc)
{
Console.WriteLine(exc.Flatten().InnerException.Message);
}
Debug.Assert(pool.IdleThreadCount == expectedIdleThreadCount);
pool.Terminate();
Console.WriteLine("Press any key");
Console.ReadKey();
}
}
}
Asynchronous "work items" are often based on async IO. Async IO does not use threads while it runs. Task schedulers are used to execute CPU work (tasks based on a delegate). The concept TaskScheduler does not apply. You cannot use a custom TaskScheduler to influence what async code does.
Make your work items throttle themselves:
static SemaphoreSlim sem = new SemaphoreSlim(maxDegreeOfParallelism); //shared object
async Task MyWorkerFunction()
{
await sem.WaitAsync();
try
{
MyWork();
}
finally
{
sem.Release();
}
}
As mentioned in another answer by usr you can't do this with a TaskScheduler as that is only for CPU bound work, not limiting the level of parallelization of all types of work, whether parallel or not. He also shows you how you can use a SemaphoreSlim to asynchronously limit the degrees of parallelism.
You can expand on this to generalize these concepts in a few ways. The one that seems like it would be the most beneficial to you would be to create a special type of queue that takes operations that return a Task and executes them in such a way that a given max degree of parallelization is achieved.
public class FixedParallelismQueue
{
private SemaphoreSlim semaphore;
public FixedParallelismQueue(int maxDegreesOfParallelism)
{
semaphore = new SemaphoreSlim(maxDegreesOfParallelism,
maxDegreesOfParallelism);
}
public async Task<T> Enqueue<T>(Func<Task<T>> taskGenerator)
{
await semaphore.WaitAsync();
try
{
return await taskGenerator();
}
finally
{
semaphore.Release();
}
}
public async Task Enqueue(Func<Task> taskGenerator)
{
await semaphore.WaitAsync();
try
{
await taskGenerator();
}
finally
{
semaphore.Release();
}
}
}
This allows you to create a queue for your application (you can even have several separate queues if you want) that has a fixed degree of parallelization. You can then provide operations returning a Task when they complete and the queue will schedule it when it can and return a Task representing when that unit of work has finished.

How can I use Async with ForEach?

Is it possible to use Async when using ForEach? Below is the code I am trying:
using (DataContext db = new DataLayer.DataContext())
{
db.Groups.ToList().ForEach(i => async {
await GetAdminsFromGroup(i.Gid);
});
}
I am getting the error:
The name 'Async' does not exist in the current context
The method the using statement is enclosed in is set to async.
List<T>.ForEach doesn't play particularly well with async (neither does LINQ-to-objects, for the same reasons).
In this case, I recommend projecting each element into an asynchronous operation, and you can then (asynchronously) wait for them all to complete.
using (DataContext db = new DataLayer.DataContext())
{
var tasks = db.Groups.ToList().Select(i => GetAdminsFromGroupAsync(i.Gid));
var results = await Task.WhenAll(tasks);
}
The benefits of this approach over giving an async delegate to ForEach are:
Error handling is more proper. Exceptions from async void cannot be caught with catch; this approach will propagate exceptions at the await Task.WhenAll line, allowing natural exception handling.
You know that the tasks are complete at the end of this method, since it does an await Task.WhenAll. If you use async void, you cannot easily tell when the operations have completed.
This approach has a natural syntax for retrieving the results. GetAdminsFromGroupAsync sounds like it's an operation that produces a result (the admins), and such code is more natural if such operations can return their results rather than setting a value as a side effect.
This little extension method should give you exception-safe async iteration:
public static async Task ForEachAsync<T>(this List<T> list, Func<T, Task> func)
{
foreach (var value in list)
{
await func(value);
}
}
Since we're changing the return type of the lambda from void to Task, exceptions will propagate up correctly. This will allow you to write something like this in practice:
await db.Groups.ToList().ForEachAsync(async i => {
await GetAdminsFromGroup(i.Gid);
});
Starting with C# 8.0, you can create and consume streams asynchronously.
private async void button1_Click(object sender, EventArgs e)
{
IAsyncEnumerable<int> enumerable = GenerateSequence();
await foreach (var i in enumerable)
{
Debug.WriteLine(i);
}
}
public static async IAsyncEnumerable<int> GenerateSequence()
{
for (int i = 0; i < 20; i++)
{
await Task.Delay(100);
yield return i;
}
}
More
The simple answer is to use the foreach keyword instead of the ForEach() method of List().
using (DataContext db = new DataLayer.DataContext())
{
foreach(var i in db.Groups)
{
await GetAdminsFromGroup(i.Gid);
}
}
Here is an actual working version of the above async foreach variants with sequential processing:
public static async Task ForEachAsync<T>(this List<T> enumerable, Action<T> action)
{
foreach (var item in enumerable)
await Task.Run(() => { action(item); }).ConfigureAwait(false);
}
Here is the implementation:
public async void SequentialAsync()
{
var list = new List<Action>();
Action action1 = () => {
//do stuff 1
};
Action action2 = () => {
//do stuff 2
};
list.Add(action1);
list.Add(action2);
await list.ForEachAsync();
}
What's the key difference? .ConfigureAwait(false); which keeps the context of main thread while async sequential processing of each task.
This is not an old question, but .Net 6 introduced Parallel.ForeachAsync:
var collectionToIterate = db.Groups.ToList();
await Parallel.ForEachAsync(collectionToIterate, async (i, token) =>
{
await GetAdminsFromGroup(i);
});
ForeachAsync also accepts a ParallelOptions object, but usually you don't want to mess with the MaxDegreeOfParallelism property:
ParallelOptions parallelOptions = new ParallelOptions { MaxDegreeOfParallelism = 4 };
var collectionToIterate = db.Groups.ToList();
await Parallel.ForEachAsync(collectionToIterate, parallelOptions , async (i, token) =>
{
await GetAdminsFromGroup(i);
});
From Microsoft Docs: https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.paralleloptions.maxdegreeofparallelism?view=net-6.0
By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used.
Generally, you do not need to modify this setting....
Add this extension method
public static class ForEachAsyncExtension
{
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate
{
using (partition)
while (partition.MoveNext())
await body(partition.Current).ConfigureAwait(false);
}));
}
}
And then use like so:
Task.Run(async () =>
{
var s3 = new AmazonS3Client(Config.Instance.Aws.Credentials, Config.Instance.Aws.RegionEndpoint);
var buckets = await s3.ListBucketsAsync();
foreach (var s3Bucket in buckets.Buckets)
{
if (s3Bucket.BucketName.StartsWith("mybucket-"))
{
log.Information("Bucket => {BucketName}", s3Bucket.BucketName);
ListObjectsResponse objects;
try
{
objects = await s3.ListObjectsAsync(s3Bucket.BucketName);
}
catch
{
log.Error("Error getting objects. Bucket => {BucketName}", s3Bucket.BucketName);
continue;
}
// ForEachAsync (4 is how many tasks you want to run in parallel)
await objects.S3Objects.ForEachAsync(4, async s3Object =>
{
try
{
log.Information("Bucket => {BucketName} => {Key}", s3Bucket.BucketName, s3Object.Key);
await s3.DeleteObjectAsync(s3Bucket.BucketName, s3Object.Key);
}
catch
{
log.Error("Error deleting bucket {BucketName} object {Key}", s3Bucket.BucketName, s3Object.Key);
}
});
try
{
await s3.DeleteBucketAsync(s3Bucket.BucketName);
}
catch
{
log.Error("Error deleting bucket {BucketName}", s3Bucket.BucketName);
}
}
}
}).Wait();
If you are using EntityFramework.Core there is an extension method ForEachAsync.
The example usage looks like this:
using Microsoft.EntityFrameworkCore;
using System.Threading.Tasks;
public class Example
{
private readonly DbContext _dbContext;
public Example(DbContext dbContext)
{
_dbContext = dbContext;
}
public async void LogicMethod()
{
await _dbContext.Set<dbTable>().ForEachAsync(async x =>
{
//logic
await AsyncTask(x);
});
}
public async Task<bool> AsyncTask(object x)
{
//other logic
return await Task.FromResult<bool>(true);
}
}
I would like to add that there is a Parallel class with ForEach function built in that can be used for this purpose.
The problem was that the async keyword needs to appear before the lambda, not before the body:
db.Groups.ToList().ForEach(async (i) => {
await GetAdminsFromGroup(i.Gid);
});
This is method I created to handle async scenarios with ForEach.
If one of tasks fails then other tasks will continue their execution.
You have ability to add function that will be executed on every exception.
Exceptions are being collected as aggregateException at the end and are available for you.
Can handle CancellationToken
public static class ParallelExecutor
{
/// <summary>
/// Executes asynchronously given function on all elements of given enumerable with task count restriction.
/// Executor will continue starting new tasks even if one of the tasks throws. If at least one of the tasks throwed exception then <see cref="AggregateException"/> is throwed at the end of the method run.
/// </summary>
/// <typeparam name="T">Type of elements in enumerable</typeparam>
/// <param name="maxTaskCount">The maximum task count.</param>
/// <param name="enumerable">The enumerable.</param>
/// <param name="asyncFunc">asynchronous function that will be executed on every element of the enumerable. MUST be thread safe.</param>
/// <param name="onException">Acton that will be executed on every exception that would be thrown by asyncFunc. CAN be thread unsafe.</param>
/// <param name="cancellationToken">The cancellation token.</param>
public static async Task ForEachAsync<T>(int maxTaskCount, IEnumerable<T> enumerable, Func<T, Task> asyncFunc, Action<Exception> onException = null, CancellationToken cancellationToken = default)
{
using var semaphore = new SemaphoreSlim(initialCount: maxTaskCount, maxCount: maxTaskCount);
// This `lockObject` is used only in `catch { }` block.
object lockObject = new object();
var exceptions = new List<Exception>();
var tasks = new Task[enumerable.Count()];
int i = 0;
try
{
foreach (var t in enumerable)
{
await semaphore.WaitAsync(cancellationToken);
tasks[i++] = Task.Run(
async () =>
{
try
{
await asyncFunc(t);
}
catch (Exception e)
{
if (onException != null)
{
lock (lockObject)
{
onException.Invoke(e);
}
}
// This exception will be swallowed here but it will be collected at the end of ForEachAsync method in order to generate AggregateException.
throw;
}
finally
{
semaphore.Release();
}
}, cancellationToken);
if (cancellationToken.IsCancellationRequested)
{
break;
}
}
}
catch (OperationCanceledException e)
{
exceptions.Add(e);
}
foreach (var t in tasks)
{
if (cancellationToken.IsCancellationRequested)
{
break;
}
// Exception handling in this case is actually pretty fast.
// https://gist.github.com/shoter/d943500eda37c7d99461ce3dace42141
try
{
await t;
}
#pragma warning disable CA1031 // Do not catch general exception types - we want to throw that exception later as aggregate exception. Nothing wrong here.
catch (Exception e)
#pragma warning restore CA1031 // Do not catch general exception types
{
exceptions.Add(e);
}
}
if (exceptions.Any())
{
throw new AggregateException(exceptions);
}
}
}

How to create a thread/Task with a continuous loop?

I am looking for the correct way/structure to create a loop in a Thread/Task...
The reason for this is, i need to check the DB every 15sec for report requests.
This is what i tried so far, but i get OutOfMemoryException:
private void ViewBase_Loaded(object sender, RoutedEventArgs e)
{
//On my main view loaded start thread to check report requests.
Task.Factory.StartNew(() => CreateAndStartReportRequestTask());
}
private void CreateAndStartReportRequestTask()
{
bool noRequest = false;
do
{
//Starting thread to Check Report Requests And Generate Reports
//Also need the ability to Wait/Sleep when there are noRequest.
reportRequestTask = Task.Factory.StartNew(() => noRequest = CheckReportRequestsAndGenerateReports());
if (noRequest)
{
//Sleep 15sec
reportRequestTask.Wait(15000);
reportRequestTask = null;
}
else
{
if (reportRequestTask.IsCompleted)
{
reportRequestTask = null;
}
else
{
//Don't want the loop to continue until the first request is done
//Reason for this is, losts of new threads being create in CheckReportRequestsAndGenerateReports()
//Looping until first request is done.
do
{
} while (!reportRequestTask.IsCompleted);
reportRequestTask = null;
}
}
} while (true);
}
private bool CheckReportRequestsAndGenerateReports()
{
var possibleReportRequest = //Some linq query to check for new requests
if (possibleReportRequest != null)
{
//Processing report here - lots of new threads/task in here as well
return false;
}
else
{
return true;
}
}
What am i doing wrong?
Is this correct way or am i total off?
EDIT:
Most important, my UI must still be responsive!
Something like this would work:
var cancellationTokenSource = new CancellationTokenSource();
var task = Repeat.Interval(
TimeSpan.FromSeconds(15),
() => CheckDatabaseForNewReports(), cancellationTokenSource.Token);
The Repeat class looks like this:
internal static class Repeat
{
public static Task Interval(
TimeSpan pollInterval,
Action action,
CancellationToken token)
{
// We don't use Observable.Interval:
// If we block, the values start bunching up behind each other.
return Task.Factory.StartNew(
() =>
{
for (;;)
{
if (token.WaitCancellationRequested(pollInterval))
break;
action();
}
}, token, TaskCreationOptions.LongRunning, TaskScheduler.Default);
}
}
static class CancellationTokenExtensions
{
public static bool WaitCancellationRequested(
this CancellationToken token,
TimeSpan timeout)
{
return token.WaitHandle.WaitOne(timeout);
}
}
Sounds like you want something like this. Please correct me if I am misinterpretting your intentions...
First, in your kick-off, set as a long running task so it doesn't consume a thread from the thread pool but creates a new one...
private void ViewBase_Loaded(object sender, RoutedEventArgs e)
{
// store this references as a private member, call Cancel() on it if UI wants to stop
_cancelationTokenSource = new CancellationTokenSource();
new Task(() => CreateAndStartReportRequestTask(), _cancelationTokenSource.Token, TaskCreationOptions.LongRunning).Start();
}
Then, in your report watching thread, loop until IsCancelRequested has been set. If there is no work, just wait on the cancel token for 15 seconds (this way if cancelled will wake sooner).
private bool CheckReportRequestsAndGenerateReports()
{
while (!_cancellationTokenSource.Token.IsCancelRequested)
{
var possibleReportRequest = //Some linq query
var reportRequestTask = Task.Factory.StartNew(() => noRequest = CheckReportRequestsAndGenerateReports(), _cancellationTokenSource.Token);
if (noRequest)
{
// it looks like if no request, you want to sleep 15 seconds, right?
// so we'll wait to see if cancelled in next 15 seconds.
_cancellationTokenSource.Token.WaitHandle.WaitOne(15000);
}
else
{
// otherwise, you just want to wait till the task is completed, right?
reportRequestTask.Wait(_cancellationTokenSource.Token);
}
}
}
I'd also be wary of having your task kick off more tasks. I have a feeling you are spinning up so many you're consuming too many resources. I think the main reason your program was failing was that you had:
if (noRequest)
{
reportRequestTask.Wait(15000);
reportRequestTask = null;
}
This will return immediately and not wait 15s, because the thread is already complete at this point. Switching it to the cancel token (or a Thread.Sleep(), but then you can't abort it as easily) will give you the processing wait you need.
Hope this helps, let me know if i'm off on my assumptions.
I've made a work-around starting from #Roger's answer. (A friend of mine has given good advices regarding this too)... I copy it here I guess it could be useful:
/// <summary>
/// Recurrent Cancellable Task
/// </summary>
public static class RecurrentCancellableTask
{
/// <summary>
/// Starts a new task in a recurrent manner repeating it according to the polling interval.
/// Whoever use this method should protect himself by surrounding critical code in the task
/// in a Try-Catch block.
/// </summary>
/// <param name="action">The action.</param>
/// <param name="pollInterval">The poll interval.</param>
/// <param name="token">The token.</param>
/// <param name="taskCreationOptions">The task creation options</param>
public static void StartNew(Action action,
TimeSpan pollInterval,
CancellationToken token,
TaskCreationOptions taskCreationOptions = TaskCreationOptions.None)
{
Task.Factory.StartNew(
() =>
{
do
{
try
{
action();
if (token.WaitHandle.WaitOne(pollInterval)) break;
}
catch
{
return;
}
}
while (true);
},
token,
taskCreationOptions,
TaskScheduler.Default);
}
}
feeling adventurous?
internal class Program
{
private static void Main(string[] args)
{
var ct = new CancellationTokenSource();
new Task(() => Console.WriteLine("Running...")).Repeat(ct.Token, TimeSpan.FromSeconds(1));
Console.WriteLine("Starting. Hit Enter to Stop.. ");
Console.ReadLine();
ct.Cancel();
Console.WriteLine("Stopped. Hit Enter to exit.. ");
Console.ReadLine();
}
}
public static class TaskExtensions
{
public static void Repeat(this Task taskToRepeat, CancellationToken cancellationToken, TimeSpan intervalTimeSpan)
{
var action = taskToRepeat
.GetType()
.GetField("m_action", BindingFlags.NonPublic | BindingFlags.Instance)
.GetValue(taskToRepeat) as Action;
Task.Factory.StartNew(() =>
{
while (true)
{
if (cancellationToken.WaitHandle.WaitOne(intervalTimeSpan))
break;
if (cancellationToken.IsCancellationRequested)
break;
Task.Factory.StartNew(action, cancellationToken);
}
}, cancellationToken);
}
}

Categories

Resources