I've got this pattern for preventing calling into an async method before it has had a chance to complete previously.
My solution involving needing a flag, and then needing to lock around the flag, feels pretty verbose. Is there a more natural way of achieving this?
public class MyClass
{
private object SyncIsFooRunning = new object();
private bool IsFooRunning { get; set;}
public async Task FooAsync()
{
try
{
lock(SyncIsFooRunning)
{
if(IsFooRunning)
return;
IsFooRunning = true;
}
// Use a semaphore to enforce maximum number of Tasks which are able to run concurrently.
var semaphoreSlim = new SemaphoreSlim(5);
var trackedTasks = new List<Task>();
for(int i = 0; i < 100; i++)
{
await semaphoreSlim.WaitAsync();
trackedTasks.Add(Task.Run(() =>
{
// DoTask();
semaphoreSlim.Release();
}));
}
// Using await makes try/catch/finally possible.
await Task.WhenAll(trackedTasks);
}
finally
{
lock(SyncIsFooRunning)
{
IsFooRunning = false;
}
}
}
}
As noted in the comments, you can use Interlocked.CompareExchange() if you prefer:
public class MyClass
{
private int _flag;
public async Task FooAsync()
{
try
{
if (Interlocked.CompareExchange(ref _flag, 1, 0) == 1)
{
return;
}
// do stuff
}
finally
{
Interlocked.Exchange(ref _flag, 0);
}
}
}
That said, I think it's overkill. Nothing wrong with using lock in this type of scenario, especially if you don't expect a lot of contention on the method. What I do think would be better is to wrap the method so that the caller can always await on the result, whether a new asynchronous operation was started or not:
public class MyClass
{
private readonly object _lock = new object();
private Task _task;
public Task FooAsync()
{
lock (_lock)
{
return _task != null ? _task : (_task = FooAsyncImpl());
}
}
public async Task FooAsyncImpl()
{
try
{
// do async stuff
}
finally
{
lock (_lock) _task = null;
}
}
}
Finally, in the comments, you say this:
Seems a bit odd that all the return types are still valid for Task?
Not clear to me what you mean by that. In your method, the only valid return types would be void and Task. If your return statement(s) returned an actual value, you'd have to use Task<T> where T is the type returned by the return statement(s).
Related
I working on real-time search. At this moment on property setter which is bounded to edit text, I call a method which calls API and then fills the list with the result it looks like this:
private string searchPhrase;
public string SearchPhrase
{
get => searchPhrase;
set
{
SetProperty(ref searchPhrase, value);
RunOnMainThread(SearchResult.Clear);
isAllFriends = false;
currentPage = 0;
RunInAsync(LoadData);
}
}
private async Task LoadData()
{
var response = await connectionRepository.GetConnections(currentPage,
pageSize, searchPhrase);
foreach (UserConnection uc in response)
{
if (uc.Type != UserConnection.TypeEnum.Awaiting)
{
RunOnMainThread(() =>
SearchResult.Add(new ConnectionUser(uc)));
}
}
}
But this way is totally useless because of it totally mashup list of a result if a text is entering quickly. So to prevent this I want to run this method async in a property but if a property is changed again I want to kill the previous Task and star it again. How can I achieve this?
Some informations from this thread:
create a CancellationTokenSource
var ctc = new CancellationTokenSource();
create a method doing the async work
private static Task ExecuteLongCancellableMethod(CancellationToken token)
{
return Task.Run(() =>
{
token.ThrowIfCancellationRequested();
// more code here
// check again if this task is canceled
token.ThrowIfCancellationRequested();
// more code
}
}
It is important to have this checks for cancel in the code.
Execute the function:
var cancellable = ExecuteLongCancellableMethod(ctc.Token);
To stop the long running execution use
ctc.Cancel();
For further details please consult the linked thread.
This question can be answered in many different ways. However IMO I would look at creating a class that
Delays itself automatically for X (ms) before performing the seach
Has the ability to be cancelled at any time as the search request changes.
Realistically this will change your code design, and should encapsulate the logic for both 1 & 2 in a separate class.
My initial thoughts are (and none of this is tested and mostly pseudo code).
class ConnectionSearch
{
public ConnectionSearch(string phrase, Action<object> addAction)
{
_searchPhrase = phrase;
_addAction = addAction;
_cancelSource = new CancellationTokenSource();
}
readonly string _searchPhrase = null;
readonly Action<object> _addAction;
readonly CancellationTokenSource _cancelSource;
public void Cancel()
{
_cancelSource?.Cancel();
}
public async void PerformSearch()
{
await Task.Delay(300); //await 300ms between keystrokes
if (_cancelSource.IsCancellationRequested)
return;
//continue your code keep checking for
//loop your dataset
//call _addAction?.Invoke(uc);
}
}
This is basic, really just encapsulates the logic for both points 1 & 2, you will need to adapt the code to do the search.
Next you could change your property to cancel a previous running instance, and then start another instance immediatly after something like below.
ConnectionSearch connectionSearch;
string searchPhrase;
public string SearchPhrase
{
get => searchPhrase;
set
{
//do your setter work
if(connectionSearch != null)
{
connectionSearch.Cancel();
}
connectionSearch = new ConnectionSearch(value, addConnectionUser);
connectionSearch.PerformSearch();
}
}
void addConnectionUser(object uc)
{
//pperform your add logic..
}
The code is pretty straight forward, however you will see in the setter is simply cancelling an existing request and then creating a new request. You could put some disposal cleanup logic in place but this should get you started.
You can implement some sort of debouncer which will encapsulate the logics of task result debouncing, i.e. it will assure if you run many tasks, then only the latest task result will be used:
public class TaskDebouncer<TResult>
{
public delegate void TaskDebouncerHandler(TResult result, object sender);
public event TaskDebouncerHandler OnCompleted;
public event TaskDebouncerHandler OnDebounced;
private Task _lastTask;
private object _lock = new object();
public void Run(Task<TResult> task)
{
lock (_lock)
{
_lastTask = task;
}
task.ContinueWith(t =>
{
if (t.IsFaulted)
throw t.Exception;
lock (_lock)
{
if (_lastTask == task)
{
OnCompleted?.Invoke(t.Result, this);
}
else
{
OnDebounced?.Invoke(t.Result, this);
}
}
});
}
public async Task WaitLast()
{
await _lastTask;
}
}
Then, you can just do:
private readonly TaskDebouncer<Connections[]> _connectionsDebouncer = new TaskDebouncer<Connections[]>();
public ClassName()
{
_connectionsDebouncer.OnCompleted += OnConnectionUpdate;
}
public void OnConnectionUpdate(Connections[] connections, object sender)
{
RunOnMainThread(SearchResult.Clear);
isAllFriends = false;
currentPage = 0;
foreach (var conn in connections)
RunOnMainThread(() => SearchResult.Add(new ConnectionUser(conn)));
}
private string searchPhrase;
public string SearchPhrase
{
get => searchPhrase;
set
{
SetProperty(ref searchPhrase, value);
_connectionsDebouncer.Add(RunInAsync(LoadData));
}
}
private async Task<Connection[]> LoadData()
{
return await connectionRepository
.GetConnections(currentPage, pageSize, searchPhrase)
.Where(conn => conn.Type != UserConnection.TypeEnum.Awaiting)
.ToArray();
}
It is not pretty clear what RunInAsync and RunOnMainThread methods are.
I guess, you don't actually need them.
I would like to have a custom thread pool satisfying the following requirements:
Real threads are preallocated according to the pool capacity. The actual work is free to use the standard .NET thread pool, if needed to spawn concurrent tasks.
The pool must be able to return the number of idle threads. The returned number may be less than the actual number of the idle threads, but it must not be greater. Of course, the more accurate the number the better.
Queuing work to the pool should return a corresponding Task, which should place nice with the Task based API.
NEW The max job capacity (or degree of parallelism) should be adjustable dynamically. Trying to reduce the capacity does not have to take effect immediately, but increasing it should do so immediately.
The rationale for the first item is depicted below:
The machine is not supposed to be running more than N work items concurrently, where N is relatively small - between 10 and 30.
The work is fetched from the database and if K items are fetched then we want to make sure that there are K idle threads to start the work right away. A situation where work is fetched from the database, but remains waiting for the next available thread is unacceptable.
The last item also explains the reason for having the idle thread count - I am going to fetch that many work items from the database. It also explains why the reported idle thread count must never be higher than the actual one - otherwise I might fetch more work that can be immediately started.
Anyway, here is my implementation along with a small program to test it (BJE stands for Background Job Engine):
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace TaskStartLatency
{
public class BJEThreadPool
{
private sealed class InternalTaskScheduler : TaskScheduler
{
private int m_idleThreadCount;
private readonly BlockingCollection<Task> m_bus;
public InternalTaskScheduler(int threadCount, BlockingCollection<Task> bus)
{
m_idleThreadCount = threadCount;
m_bus = bus;
}
public void RunInline(Task task)
{
Interlocked.Decrement(ref m_idleThreadCount);
try
{
TryExecuteTask(task);
}
catch
{
// The action is responsible itself for the error handling, for the time being...
}
Interlocked.Increment(ref m_idleThreadCount);
}
public int IdleThreadCount
{
get { return m_idleThreadCount; }
}
#region Overrides of TaskScheduler
protected override void QueueTask(Task task)
{
m_bus.Add(task);
}
protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
{
return TryExecuteTask(task);
}
protected override IEnumerable<Task> GetScheduledTasks()
{
throw new NotSupportedException();
}
#endregion
public void DecrementIdleThreadCount()
{
Interlocked.Decrement(ref m_idleThreadCount);
}
}
private class ThreadContext
{
private readonly InternalTaskScheduler m_ts;
private readonly BlockingCollection<Task> m_bus;
private readonly CancellationTokenSource m_cts;
public readonly Thread Thread;
public ThreadContext(string name, InternalTaskScheduler ts, BlockingCollection<Task> bus, CancellationTokenSource cts)
{
m_ts = ts;
m_bus = bus;
m_cts = cts;
Thread = new Thread(Start)
{
IsBackground = true,
Name = name
};
Thread.Start();
}
private void Start()
{
try
{
foreach (var task in m_bus.GetConsumingEnumerable(m_cts.Token))
{
m_ts.RunInline(task);
}
}
catch (OperationCanceledException)
{
}
m_ts.DecrementIdleThreadCount();
}
}
private readonly InternalTaskScheduler m_ts;
private readonly CancellationTokenSource m_cts = new CancellationTokenSource();
private readonly BlockingCollection<Task> m_bus = new BlockingCollection<Task>();
private readonly List<ThreadContext> m_threadCtxs = new List<ThreadContext>();
public BJEThreadPool(int threadCount)
{
m_ts = new InternalTaskScheduler(threadCount, m_bus);
for (int i = 0; i < threadCount; ++i)
{
m_threadCtxs.Add(new ThreadContext("BJE Thread " + i, m_ts, m_bus, m_cts));
}
}
public void Terminate()
{
m_cts.Cancel();
foreach (var t in m_threadCtxs)
{
t.Thread.Join();
}
}
public Task Run(Action<CancellationToken> action)
{
return Task.Factory.StartNew(() => action(m_cts.Token), m_cts.Token, TaskCreationOptions.DenyChildAttach, m_ts);
}
public Task Run(Action action)
{
return Task.Factory.StartNew(action, m_cts.Token, TaskCreationOptions.DenyChildAttach, m_ts);
}
public int IdleThreadCount
{
get { return m_ts.IdleThreadCount; }
}
}
class Program
{
static void Main()
{
const int THREAD_COUNT = 32;
var pool = new BJEThreadPool(THREAD_COUNT);
var tcs = new TaskCompletionSource<bool>();
var tasks = new List<Task>();
var allRunning = new CountdownEvent(THREAD_COUNT);
for (int i = pool.IdleThreadCount; i > 0; --i)
{
var index = i;
tasks.Add(pool.Run(cancellationToken =>
{
Console.WriteLine("Started action " + index);
allRunning.Signal();
tcs.Task.Wait(cancellationToken);
Console.WriteLine(" Ended action " + index);
}));
}
Console.WriteLine("pool.IdleThreadCount = " + pool.IdleThreadCount);
allRunning.Wait();
Debug.Assert(pool.IdleThreadCount == 0);
int expectedIdleThreadCount = THREAD_COUNT;
Console.WriteLine("Press [c]ancel, [e]rror, [a]bort or any other key");
switch (Console.ReadKey().KeyChar)
{
case 'c':
Console.WriteLine("Cancel All");
tcs.TrySetCanceled();
break;
case 'e':
Console.WriteLine("Error All");
tcs.TrySetException(new Exception("Failed"));
break;
case 'a':
Console.WriteLine("Abort All");
pool.Terminate();
expectedIdleThreadCount = 0;
break;
default:
Console.WriteLine("Done All");
tcs.TrySetResult(true);
break;
}
try
{
Task.WaitAll(tasks.ToArray());
}
catch (AggregateException exc)
{
Console.WriteLine(exc.Flatten().InnerException.Message);
}
Debug.Assert(pool.IdleThreadCount == expectedIdleThreadCount);
pool.Terminate();
Console.WriteLine("Press any key");
Console.ReadKey();
}
}
}
It is a very simple implementation and it appears to be working. However, there is a problem - the BJEThreadPool.Run method does not accept asynchronous methods. I.e. my implementation does not allow me to add the following overloads:
public Task Run(Func<CancellationToken, Task> action)
{
return Task.Factory.StartNew(() => action(m_cts.Token), m_cts.Token, TaskCreationOptions.DenyChildAttach, m_ts).Unwrap();
}
public Task Run(Func<Task> action)
{
return Task.Factory.StartNew(action, m_cts.Token, TaskCreationOptions.DenyChildAttach, m_ts).Unwrap();
}
The pattern I use in InternalTaskScheduler.RunInline does not work in this case.
So, my question is how to add the support for asynchronous work items? I am fine with changing the entire design as long as the requirements outlined at the beginning of the post are upheld.
EDIT
I would like to clarify the intented usage of the desired pool. Please, observe the following code:
if (pool.IdleThreadCount == 0)
{
return;
}
foreach (var jobData in FetchFromDB(pool.IdleThreadCount))
{
pool.Run(CreateJobAction(jobData));
}
Notes:
The code is going to be run periodically, say every 1 minute.
The code is going to be run concurrently by multiple machines watching the same database.
FetchFromDB is going to use the technique described in Using SQL Server as a DB queue with multiple clients to atomically fetch and lock the work from the DB.
CreateJobAction is going to invoke the code denoted by jobData (the job code) and close the work upon the completion of that code. The job code is out of my control and it could be pretty much anything - heavy CPU bound code or light asynchronous IO bound code, badly written synchronous IO bound code or a mix of all. It could run for minutes and it could run for hours. Closing the work is my code and it would by asynchronous IO bound code. Because of this, the signature of the returned job action is that of an asynchronous method.
Item 2 underlines the importance of correctly identifying the amount of idle threads. If there are 900 pending work items and 10 agent machines I cannot allow an agent to fetch 300 work items and queue them on the thread pool. Why? Because, it is most unlikely that the agent will be able to run 300 work items concurrently. It will run some, sure enough, but others will be waiting in the thread pool work queue. Suppose it will run 100 and let 200 wait (even though 100 is probably far fetched). This wields 3 fully loaded agents and 7 idle ones. But only 300 work items out of 900 are actually being processed concurrently!!!
My goal is to maximize the spread of the work amongst the available agents. Ideally, I should evaluate the load of an agent and the "heaviness" of the pending work, but it is a formidable task and is reserved for the future versions. Right now, I wish to assign each agent the max job capacity with the intention to provide the means to increase/decrease it dynamically without restarting the agents.
Next observation. The work can take quite a long time to run and it could be all synchronous code. As far as I understand it is undesirable to utilize thread pool threads for such kind of work.
EDIT 2
There is a statement that TaskScheduler is only for the CPU bound work. But what if I do not know the nature of the work? I mean it is a general purpose Background Job Engine and it runs thousands of different kinds of jobs. I do not have means to tell "that job is CPU bound" and "that on is synchronous IO bound" and yet another one is asynchronous IO bound. I wish I could, but I cannot.
EDIT 3
At the end, I do not use the SemaphoreSlim, but neither do I use the TaskScheduler - it finally trickled down my thick skull that it is unappropriate and plain wrong, plus it makes the code overly complex.
Still, I failed to see how SemaphoreSlim is the way. The proposed pattern:
public async Task Enqueue(Func<Task> taskGenerator)
{
await semaphore.WaitAsync();
try
{
await taskGenerator();
}
finally
{
semaphore.Release();
}
}
Expects taskGenerator either be an asynchronous IO bound code or open a new thread otherwise. However, I have no means to determine whether the work to be executed is one or another. Plus, as I have learned from SemaphoreSlim.WaitAsync continuation code if the semaphore is unlocked, the code following the WaitAsync() is going to run on the same thread, which is not very good for me.
Anyway, below is my implementation, in case anyone fancies. Unfortunately, I am yet to understand how to reduce the pool thread count dynamically, but this is a topic for another question.
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace TaskStartLatency
{
public interface IBJEThreadPool
{
void SetThreadCount(int threadCount);
void Terminate();
Task Run(Action action);
Task Run(Action<CancellationToken> action);
Task Run(Func<Task> action);
Task Run(Func<CancellationToken, Task> action);
int IdleThreadCount { get; }
}
public class BJEThreadPool : IBJEThreadPool
{
private interface IActionContext
{
Task Run(CancellationToken ct);
TaskCompletionSource<object> TaskCompletionSource { get; }
}
private class ActionContext : IActionContext
{
private readonly Action m_action;
public ActionContext(Action action)
{
m_action = action;
TaskCompletionSource = new TaskCompletionSource<object>();
}
#region Implementation of IActionContext
public Task Run(CancellationToken ct)
{
m_action();
return null;
}
public TaskCompletionSource<object> TaskCompletionSource { get; private set; }
#endregion
}
private class CancellableActionContext : IActionContext
{
private readonly Action<CancellationToken> m_action;
public CancellableActionContext(Action<CancellationToken> action)
{
m_action = action;
TaskCompletionSource = new TaskCompletionSource<object>();
}
#region Implementation of IActionContext
public Task Run(CancellationToken ct)
{
m_action(ct);
return null;
}
public TaskCompletionSource<object> TaskCompletionSource { get; private set; }
#endregion
}
private class AsyncActionContext : IActionContext
{
private readonly Func<Task> m_action;
public AsyncActionContext(Func<Task> action)
{
m_action = action;
TaskCompletionSource = new TaskCompletionSource<object>();
}
#region Implementation of IActionContext
public Task Run(CancellationToken ct)
{
return m_action();
}
public TaskCompletionSource<object> TaskCompletionSource { get; private set; }
#endregion
}
private class AsyncCancellableActionContext : IActionContext
{
private readonly Func<CancellationToken, Task> m_action;
public AsyncCancellableActionContext(Func<CancellationToken, Task> action)
{
m_action = action;
TaskCompletionSource = new TaskCompletionSource<object>();
}
#region Implementation of IActionContext
public Task Run(CancellationToken ct)
{
return m_action(ct);
}
public TaskCompletionSource<object> TaskCompletionSource { get; private set; }
#endregion
}
private readonly CancellationTokenSource m_ctsTerminateAll = new CancellationTokenSource();
private readonly BlockingCollection<IActionContext> m_bus = new BlockingCollection<IActionContext>();
private readonly LinkedList<Thread> m_threads = new LinkedList<Thread>();
private int m_idleThreadCount;
private static int s_threadCount;
public BJEThreadPool(int threadCount)
{
ReserveAdditionalThreads(threadCount);
}
private void ReserveAdditionalThreads(int n)
{
for (int i = 0; i < n; ++i)
{
var index = Interlocked.Increment(ref s_threadCount) - 1;
var t = new Thread(Start)
{
IsBackground = true,
Name = "BJE Thread " + index
};
Interlocked.Increment(ref m_idleThreadCount);
t.Start();
m_threads.AddLast(t);
}
}
private void Start()
{
try
{
foreach (var actionContext in m_bus.GetConsumingEnumerable(m_ctsTerminateAll.Token))
{
RunWork(actionContext).Wait();
}
}
catch (OperationCanceledException)
{
}
catch
{
// Should never happen - log the error
}
Interlocked.Decrement(ref m_idleThreadCount);
}
private async Task RunWork(IActionContext actionContext)
{
Interlocked.Decrement(ref m_idleThreadCount);
try
{
var task = actionContext.Run(m_ctsTerminateAll.Token);
if (task != null)
{
await task;
}
actionContext.TaskCompletionSource.SetResult(null);
}
catch (OperationCanceledException)
{
actionContext.TaskCompletionSource.TrySetCanceled();
}
catch (Exception exc)
{
actionContext.TaskCompletionSource.TrySetException(exc);
}
Interlocked.Increment(ref m_idleThreadCount);
}
private Task PostWork(IActionContext actionContext)
{
m_bus.Add(actionContext);
return actionContext.TaskCompletionSource.Task;
}
#region Implementation of IBJEThreadPool
public void SetThreadCount(int threadCount)
{
if (threadCount > m_threads.Count)
{
ReserveAdditionalThreads(threadCount - m_threads.Count);
}
else if (threadCount < m_threads.Count)
{
throw new NotSupportedException();
}
}
public void Terminate()
{
m_ctsTerminateAll.Cancel();
foreach (var t in m_threads)
{
t.Join();
}
}
public Task Run(Action action)
{
return PostWork(new ActionContext(action));
}
public Task Run(Action<CancellationToken> action)
{
return PostWork(new CancellableActionContext(action));
}
public Task Run(Func<Task> action)
{
return PostWork(new AsyncActionContext(action));
}
public Task Run(Func<CancellationToken, Task> action)
{
return PostWork(new AsyncCancellableActionContext(action));
}
public int IdleThreadCount
{
get { return m_idleThreadCount; }
}
#endregion
}
public static class Extensions
{
public static Task WithCancellation(this Task task, CancellationToken token)
{
return task.ContinueWith(t => t.GetAwaiter().GetResult(), token);
}
}
class Program
{
static void Main()
{
const int THREAD_COUNT = 16;
var pool = new BJEThreadPool(THREAD_COUNT);
var tcs = new TaskCompletionSource<bool>();
var tasks = new List<Task>();
var allRunning = new CountdownEvent(THREAD_COUNT);
for (int i = pool.IdleThreadCount; i > 0; --i)
{
var index = i;
tasks.Add(pool.Run(async ct =>
{
Console.WriteLine("Started action " + index);
allRunning.Signal();
await tcs.Task.WithCancellation(ct);
Console.WriteLine(" Ended action " + index);
}));
}
Console.WriteLine("pool.IdleThreadCount = " + pool.IdleThreadCount);
allRunning.Wait();
Debug.Assert(pool.IdleThreadCount == 0);
int expectedIdleThreadCount = THREAD_COUNT;
Console.WriteLine("Press [c]ancel, [e]rror, [a]bort or any other key");
switch (Console.ReadKey().KeyChar)
{
case 'c':
Console.WriteLine("ancel All");
tcs.TrySetCanceled();
break;
case 'e':
Console.WriteLine("rror All");
tcs.TrySetException(new Exception("Failed"));
break;
case 'a':
Console.WriteLine("bort All");
pool.Terminate();
expectedIdleThreadCount = 0;
break;
default:
Console.WriteLine("Done All");
tcs.TrySetResult(true);
break;
}
try
{
Task.WaitAll(tasks.ToArray());
}
catch (AggregateException exc)
{
Console.WriteLine(exc.Flatten().InnerException.Message);
}
Debug.Assert(pool.IdleThreadCount == expectedIdleThreadCount);
pool.Terminate();
Console.WriteLine("Press any key");
Console.ReadKey();
}
}
}
Asynchronous "work items" are often based on async IO. Async IO does not use threads while it runs. Task schedulers are used to execute CPU work (tasks based on a delegate). The concept TaskScheduler does not apply. You cannot use a custom TaskScheduler to influence what async code does.
Make your work items throttle themselves:
static SemaphoreSlim sem = new SemaphoreSlim(maxDegreeOfParallelism); //shared object
async Task MyWorkerFunction()
{
await sem.WaitAsync();
try
{
MyWork();
}
finally
{
sem.Release();
}
}
As mentioned in another answer by usr you can't do this with a TaskScheduler as that is only for CPU bound work, not limiting the level of parallelization of all types of work, whether parallel or not. He also shows you how you can use a SemaphoreSlim to asynchronously limit the degrees of parallelism.
You can expand on this to generalize these concepts in a few ways. The one that seems like it would be the most beneficial to you would be to create a special type of queue that takes operations that return a Task and executes them in such a way that a given max degree of parallelization is achieved.
public class FixedParallelismQueue
{
private SemaphoreSlim semaphore;
public FixedParallelismQueue(int maxDegreesOfParallelism)
{
semaphore = new SemaphoreSlim(maxDegreesOfParallelism,
maxDegreesOfParallelism);
}
public async Task<T> Enqueue<T>(Func<Task<T>> taskGenerator)
{
await semaphore.WaitAsync();
try
{
return await taskGenerator();
}
finally
{
semaphore.Release();
}
}
public async Task Enqueue(Func<Task> taskGenerator)
{
await semaphore.WaitAsync();
try
{
await taskGenerator();
}
finally
{
semaphore.Release();
}
}
}
This allows you to create a queue for your application (you can even have several separate queues if you want) that has a fixed degree of parallelization. You can then provide operations returning a Task when they complete and the queue will schedule it when it can and return a Task representing when that unit of work has finished.
I have some problems with ReaderWriterLockSlim. I cannot understand how it's magic working.
My code:
private async Task LoadIndex()
{
if (!File.Exists(FileName + ".index.txt"))
{
return;
}
_indexLock.EnterWriteLock();// <1>
_index.Clear();
using (TextReader index = File.OpenText(FileName + ".index.txt"))
{
string s;
while (null != (s = await index.ReadLineAsync()))
{
var ss = s.Split(':');
_index.Add(ss[0], Convert.ToInt64(ss[1]));
}
}
_indexLock.ExitWriteLock();<2>
}
When I enter write lock at <1>, in debugger I can see that _indexLock.IsWriteLockHeld is true, but when execution steps to <2> I see _indexLock.IsWriteLockHeld is false
and _indexLock.ExitWriteLock throws an exception SynchronizationLockException with message "The write lock is being released without being held". What I doing wrong?
ReaderWriterLockSlim is a thread-affine lock type, so it usually cannot be used with async and await.
You should either use SemaphoreSlim with WaitAsync, or (if you really need a reader/writer lock), use my AsyncReaderWriterLock from AsyncEx or Stephen Toub's AsyncReaderWriterLock.
You can safely emulate a reader/writer locking mechanism using the reliable and lightweight SemaphoreSlim and keep the benefits of async/await. Create the SemaphoreSlim giving it the number of available locks equivalent to the number of routines that will lock your resource for reading simultaneously. Each one will request one lock as usual. For your writing routine, make sure it requests all the available locks before doing its thing.That way, your writing routine will always run alone while your reading routines might share the resource only between themselves.For example, suppose you have 2 reading routines and 1 writing routine.
SemaphoreSlim semaphore = new SemaphoreSlim(2);
async void Reader1()
{
await semaphore.WaitAsync();
try
{
// ... reading stuff ...
}
finally
{
semaphore.Release();
}
}
async void Reader2()
{
await semaphore.WaitAsync();
try
{
// ... reading other stuff ...
}
finally
{
semaphore.Release();
}
}
async void ExclusiveWriter()
{
// the exclusive writer must request all locks
// to make sure the readers don't have any of them
// (I wish we could specify the number of locks
// instead of spamming multiple calls!)
await semaphore.WaitAsync();
await semaphore.WaitAsync();
try
{
// ... writing stuff ...
}
finally
{
// release all locks here
semaphore.Release(2);
// (oh here we don't need multiple calls, how about that)
}
}
Obviously this method only works if you know beforehand how many reading routines you could have running at the same time. Admittedly, too much of them would make this code very ugly.
Some time ago I implemented for my project class AsyncReaderWriterLock based on two SemaphoreSlim. Hope it can help. It is implemented the same logic (Multiple Readers and Single Writer) and at the same time support async/await pattern. Definitely, it does not support recursion and has no protection from incorrect usage:
var rwLock = new AsyncReaderWriterLock();
await rwLock.AcquireReaderLock();
try
{
// ... reading ...
}
finally
{
rwLock.ReleaseReaderLock();
}
await rwLock.AcquireWriterLock();
try
{
// ... writing ...
}
finally
{
rwLock.ReleaseWriterLock();
}
public sealed class AsyncReaderWriterLock : IDisposable
{
private readonly SemaphoreSlim _readSemaphore = new SemaphoreSlim(1, 1);
private readonly SemaphoreSlim _writeSemaphore = new SemaphoreSlim(1, 1);
private int _readerCount;
public async Task AcquireWriterLock(CancellationToken token = default)
{
await _writeSemaphore.WaitAsync(token).ConfigureAwait(false);
await SafeAcquireReadSemaphore(token).ConfigureAwait(false);
}
public void ReleaseWriterLock()
{
_readSemaphore.Release();
_writeSemaphore.Release();
}
public async Task AcquireReaderLock(CancellationToken token = default)
{
await _writeSemaphore.WaitAsync(token).ConfigureAwait(false);
if (Interlocked.Increment(ref _readerCount) == 1)
{
try
{
await SafeAcquireReadSemaphore(token).ConfigureAwait(false);
}
catch
{
Interlocked.Decrement(ref _readerCount);
throw;
}
}
_writeSemaphore.Release();
}
public void ReleaseReaderLock()
{
if (Interlocked.Decrement(ref _readerCount) == 0)
{
_readSemaphore.Release();
}
}
private async Task SafeAcquireReadSemaphore(CancellationToken token)
{
try
{
await _readSemaphore.WaitAsync(token).ConfigureAwait(false);
}
catch
{
_writeSemaphore.Release();
throw;
}
}
public void Dispose()
{
_writeSemaphore.Dispose();
_readSemaphore.Dispose();
}
}
https://learn.microsoft.com/en-us/dotnet/api/system.threading.readerwriterlockslim?view=net-5.0
From source:
ReaderWriterLockSlim has managed thread affinity; that is, each Thread
object must make its own method calls to enter and exit lock modes. No
thread can change the mode of another thread.
So here expected behavour. Async / await does not guarantee continuation in the same thread, so you can catch exception when you enter in write lock in one thread and try to exit in other thread.
Better to use other lock mechanisms from other answers like SemaphoreSlim.
Like Stephen Cleary says, ReaderWriterLockSlim is a thread-affine lock type, so it usually cannot be used with async and await.
You have to build a mechanism to avoid readers and writers accessing shared data at the same time. This algorithm should follow a few rules.
When requesting a readerlock:
Is there a writerlock active?
Is there anything already queued? (I think it's nice to execute it in order of requests)
When requesting a writerlock:
Is there a writerlock active? (because writerlocks shouldn't execute parallel)
Are there any reader locks active?
Is there anything already queued?
If any of these creteria answers yes, the execution should be queued and executed later. You could use TaskCompletionSources to continue awaits.
When any reader or writer is done, You should evaluate the queue and continue execute items when possible.
For example (nuget): AsyncReaderWriterLock
I have been inspired by #xtadex's answer. However, it seems to me, one instance of SemaphoreSlim is enougth to implement the same behaviour:
internal sealed class ReadWriteSemaphore : IDisposable
{
private readonly SemaphoreSlim _semaphore;
private int _count;
public ReadWriteSemaphore()
{
_semaphore = new SemaphoreSlim(2, 2);
_count = 0;
}
public void Dispose() => _semaphore.Dispose();
public async Task<IDisposable> WaitReadAsync(CancellationToken cancellation = default)
{
await _semaphore.WaitAsync(cancellation);
if (Interlocked.Increment(ref _count) > 1)
_semaphore.Release();
return new Token(() =>
{
if (Interlocked.Decrement(ref _count) == 0)
_semaphore.Release();
});
}
public async Task<IDisposable> WaitWriteAsync(CancellationToken cancellation = default)
{
int count = 0;
try
{
while (await _semaphore.WaitAsync(Timeout.Infinite, cancellation) && ++count < 2)
continue;
}
catch (OperationCanceledException) when (count > 0)
{
_semaphore.Release(count);
throw;
}
return new Token(() => _semaphore.Release(count));
}
private sealed class Token : IDisposable
{
private readonly Action _action;
public Token(Action action)
{
_action = action;
}
public void Dispose() => _action.Invoke();
}
}
It allows you to synchronize access to resource with using:
using (await _semaphore.WaitReadAsync()) // _semaphore is an instance of ReadWriteSemaphore
{
// read something
}
And for writhing:
using (await _semaphore.WaitWriteAsync()) // _semaphore is an instance of ReadWriteSemaphore
{
// write something
}
I adopted my implementation of parallel/consumer based on the code in this question
class ParallelConsumer<T> : IDisposable
{
private readonly int _maxParallel;
private readonly Action<T> _action;
private readonly TaskFactory _factory = new TaskFactory();
private CancellationTokenSource _tokenSource;
private readonly BlockingCollection<T> _entries = new BlockingCollection<T>();
private Task _task;
public ParallelConsumer(int maxParallel, Action<T> action)
{
_maxParallel = maxParallel;
_action = action;
}
public void Start()
{
try
{
_tokenSource = new CancellationTokenSource();
_task = _factory.StartNew(
() =>
{
Parallel.ForEach(
_entries.GetConsumingEnumerable(),
new ParallelOptions { MaxDegreeOfParallelism = _maxParallel, CancellationToken = _tokenSource.Token },
(item, loopState) =>
{
Log("Taking" + item);
if (!_tokenSource.IsCancellationRequested)
{
_action(item);
Log("Finished" + item);
}
else
{
Log("Not Taking" + item);
_entries.CompleteAdding();
loopState.Stop();
}
});
},
_tokenSource.Token);
}
catch (OperationCanceledException oce)
{
System.Diagnostics.Debug.WriteLine(oce);
}
}
private void Log(string message)
{
Console.WriteLine(message);
}
public void Stop()
{
Dispose();
}
public void Enqueue(T entry)
{
Log("Enqueuing" + entry);
_entries.Add(entry);
}
public void Dispose()
{
if (_task == null)
{
return;
}
_tokenSource.Cancel();
while (!_task.IsCanceled)
{
}
_task.Dispose();
_tokenSource.Dispose();
_task = null;
}
}
And here is a test code
class Program
{
static void Main(string[] args)
{
TestRepeatedEnqueue(100, 1);
}
private static void TestRepeatedEnqueue(int itemCount, int parallelCount)
{
bool[] flags = new bool[itemCount];
var consumer = new ParallelConsumer<int>(parallelCount,
(i) =>
{
flags[i] = true;
}
);
consumer.Start();
for (int i = 0; i < itemCount; i++)
{
consumer.Enqueue(i);
}
Thread.Sleep(1000);
Debug.Assert(flags.All(b => b == true));
}
}
The test always fails - it always stuck at around 93th-item from the 100 tested. Any idea which part of my code caused this issue, and how to fix it?
You cannot use Parallel.Foreach() with BlockingCollection.GetConsumingEnumerable(), as you have discovered.
For an explanation, see this blog post:
https://devblogs.microsoft.com/pfxteam/parallelextensionsextras-tour-4-blockingcollectionextensions/
Excerpt from the blog:
BlockingCollection’s GetConsumingEnumerable implementation is using BlockingCollection’s internal synchronization which already supports multiple consumers concurrently, but ForEach doesn’t know that, and its enumerable-partitioning logic also needs to take a lock while accessing the enumerable.
As such, there’s more synchronization here than is actually necessary, resulting in a potentially non-negligable performance hit.
[Also] the partitioning algorithm employed by default by both Parallel.ForEach and PLINQ use chunking in order to minimize synchronization costs: rather than taking the lock once per element, it'll take the lock, grab a group of elements (a chunk), and then release the lock.
While this design can help with overall throughput, for scenarios that are focused more on low latency, that chunking can be prohibitive.
That blog also provides the source code for a method called GetConsumingPartitioner() which you can use to solve the problem.
public static class BlockingCollectionExtensions
{
public static Partitioner<T> GetConsumingPartitioner<T>(this BlockingCollection<T> collection)
{
return new BlockingCollectionPartitioner<T>(collection);
}
public class BlockingCollectionPartitioner<T> : Partitioner<T>
{
private BlockingCollection<T> _collection;
internal BlockingCollectionPartitioner(BlockingCollection<T> collection)
{
if (collection == null)
throw new ArgumentNullException("collection");
_collection = collection;
}
public override bool SupportsDynamicPartitions
{
get { return true; }
}
public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
{
if (partitionCount < 1)
throw new ArgumentOutOfRangeException("partitionCount");
var dynamicPartitioner = GetDynamicPartitions();
return Enumerable.Range(0, partitionCount).Select(_ => dynamicPartitioner.GetEnumerator()).ToArray();
}
public override IEnumerable<T> GetDynamicPartitions()
{
return _collection.GetConsumingEnumerable();
}
}
}
The reason for failure is because of the following reason as explained here
The partitioning algorithm employed by default by both
Parallel.ForEach and PLINQ use chunking in order to minimize
synchronization costs: rather than taking the lock once per element,
it'll take the lock, grab a group of elements (a chunk), and then
release the lock.
To get it to work, you can add a method on your ParallelConsumer<T> class to indicate that the adding is completed, as below
public void StopAdding()
{
_entries.CompleteAdding();
}
And now call this method after your for loop , as below
consumer.Start();
for (int i = 0; i < itemCount; i++)
{
consumer.Enqueue(i);
}
consumer.StopAdding();
Otherwise, Parallel.ForEach() would wait for the threshold to be reached so as to grab the chunk and start processing.
I have a method that queues some work to be executed asynchronously. I'd like to return some sort of handle to the caller that can be polled, waited on, or used to fetch the return value from the operation, but I can't find a class or interface that's suitable for the task.
BackgroundWorker comes close, but it's geared to the case where the worker has its own dedicated thread, which isn't true in my case. IAsyncResult looks promising, but the provided AsyncResult implementation is also unusable for me. Should I implement IAsyncResult myself?
Clarification:
I have a class that conceptually looks like this:
class AsyncScheduler
{
private List<object> _workList = new List<object>();
private bool _finished = false;
public SomeHandle QueueAsyncWork(object workObject)
{
// simplified for the sake of example
_workList.Add(workObject);
return SomeHandle;
}
private void WorkThread()
{
// simplified for the sake of example
while (!_finished)
{
foreach (object workObject in _workList)
{
if (!workObject.IsFinished)
{
workObject.DoSomeWork();
}
}
Thread.Sleep(1000);
}
}
}
The QueueAsyncWork function pushes a work item onto the polling list for a dedicated work thread, of which there will only over be one. My problem is not with writing the QueueAsyncWork function--that's fine. My question is, what do I return to the caller? What should SomeHandle be?
The existing .Net classes for this are geared towards the situation where the asynchronous operation can be encapsulated in a single method call that returns. That's not the case here--all of the work objects do their work on the same thread, and a complete work operation might span multiple calls to workObject.DoSomeWork(). In this case, what's a reasonable approach for offering the caller some handle for progress notification, completion, and getting the final outcome of the operation?
Yes, implement IAsyncResult (or rather, an extended version of it, to provide for progress reporting).
public class WorkObjectHandle : IAsyncResult, IDisposable
{
private int _percentComplete;
private ManualResetEvent _waitHandle;
public int PercentComplete {
get {return _percentComplete;}
set
{
if (value < 0 || value > 100) throw new InvalidArgumentException("Percent complete should be between 0 and 100");
if (_percentComplete = 100) throw new InvalidOperationException("Already complete");
if (value == 100 && Complete != null) Complete(this, new CompleteArgs(WorkObject));
_percentComplete = value;
}
public IWorkObject WorkObject {get; private set;}
public object AsyncState {get {return WorkObject;}}
public bool IsCompleted {get {return _percentComplete == 100;}}
public event EventHandler<CompleteArgs> Complete; // CompleteArgs in a usual pattern
// you may also want to have Progress event
public bool CompletedSynchronously {get {return false;}}
public WaitHandle
{
get
{
// initialize it lazily
if (_waitHandle == null)
{
ManualResetEvent newWaitHandle = new ManualResetEvent(false);
if (Interlocked.CompareExchange(ref _waitHandle, newWaitHandle, null) != null)
newWaitHandle.Dispose();
}
return _waitHandle;
}
}
public void Dispose()
{
if (_waitHandle != null)
_waitHandle.Dispose();
// dispose _workObject too, if needed
}
public WorkObjectHandle(IWorkObject workObject)
{
WorkObject = workObject;
_percentComplete = 0;
}
}
public class AsyncScheduler
{
private Queue<WorkObjectHandle> _workQueue = new Queue<WorkObjectHandle>();
private bool _finished = false;
public WorkObjectHandle QueueAsyncWork(IWorkObject workObject)
{
var handle = new WorkObjectHandle(workObject);
lock(_workQueue)
{
_workQueue.Enqueue(handle);
}
return handle;
}
private void WorkThread()
{
// simplified for the sake of example
while (!_finished)
{
WorkObjectHandle handle;
lock(_workQueue)
{
if (_workQueue.Count == 0) break;
handle = _workQueue.Dequeue();
}
try
{
var workObject = handle.WorkObject;
// do whatever you want with workObject, set handle.PercentCompleted, etc.
}
finally
{
handle.Dispose();
}
}
}
}
If I understand correctly you have a collection of work objects (IWorkObject) that each complete a task via multiple calls to a DoSomeWork method. When an IWorkObject object has finished its work you'd like to respond to that somehow and during the process you'd like to respond to any reported progress?
In that case I'd suggest you take a slightly different approach. You could take a look at the Parallel Extension framework (blog). Using the framework, you could write something like this:
public void QueueWork(IWorkObject workObject)
{
Task.TaskFactory.StartNew(() =>
{
while (!workObject.Finished)
{
int progress = workObject.DoSomeWork();
DoSomethingWithReportedProgress(workObject, progress);
}
WorkObjectIsFinished(workObject);
});
}
Some things to note:
QueueWork now returns void. The reason for this is that the actions that occur when progress is reported or when the task completes have become part of the thread that executes the work. You could of course return the Task that the factory creates and return that from the method (to enable polling for example).
The progress-reporting and finish-handling are now part of the thread because you should always avoid polling when possible. Polling is more expensive because usually you either poll too frequently (too early) or not often enough (too late). There is no reason you can't report on the progress and finishing of the task from within the thread that is running the task.
The above could also be implemented using the (lower level) ThreadPool.QueueUserWorkItem method.
Using QueueUserWorkItem:
public void QueueWork(IWorkObject workObject)
{
ThreadPool.QueueUserWorkItem(() =>
{
while (!workObject.Finished)
{
int progress = workObject.DoSomeWork();
DoSomethingWithReportedProgress(workObject, progress);
}
WorkObjectIsFinished(workObject);
});
}
The WorkObject class can contain the properties that need to be tracked.
public class WorkObject
{
public PercentComplete { get; private set; }
public IsFinished { get; private set; }
public void DoSomeWork()
{
// work done here
this.PercentComplete = 50;
// some more work done here
this.PercentComplete = 100;
this.IsFinished = true;
}
}
Then in your example:
Change the collection from a List to a Dictionary that can hold Guid values (or any other means of uniquely identifying the value).
Expose the correct WorkObject's properties by having the caller pass the Guid that it received from QueueAsyncWork.
I'm assuming that you'll start WorkThread asynchronously (albeit, the only asynchronous thread); plus, you'll have to make retrieving the dictionary values and WorkObject properties thread-safe.
private Dictionary<Guid, WorkObject> _workList =
new Dictionary<Guid, WorkObject>();
private bool _finished = false;
public Guid QueueAsyncWork(WorkObject workObject)
{
Guid guid = Guid.NewGuid();
// simplified for the sake of example
_workList.Add(guid, workObject);
return guid;
}
private void WorkThread()
{
// simplified for the sake of example
while (!_finished)
{
foreach (WorkObject workObject in _workList)
{
if (!workObject.IsFinished)
{
workObject.DoSomeWork();
}
}
Thread.Sleep(1000);
}
}
// an example of getting the WorkObject's property
public int GetPercentComplete(Guid guid)
{
WorkObject workObject = null;
if (!_workList.TryGetValue(guid, out workObject)
throw new Exception("Unable to find Guid");
return workObject.PercentComplete;
}
The simplest way to do this is described here. Suppose you have a method string DoSomeWork(int). You then create a delegate of the correct type, for example:
Func<int, string> myDelegate = DoSomeWork;
Then you call the BeginInvoke method on the delegate:
int parameter = 10;
myDelegate.BeginInvoke(parameter, Callback, null);
The Callback delegate will be called once your asynchronous call has completed. You can define this method as follows:
void Callback(IAsyncResult result)
{
var asyncResult = (AsyncResult) result;
var #delegate = (Func<int, string>) asyncResult.AsyncDelegate;
string methodReturnValue = #delegate.EndInvoke(result);
}
Using the described scenario, you can also poll for results or wait on them. Take a look at the url I provided for more info.
Regards,
Ronald
If you don't want to use async callbacks, you can use an explicit WaitHandle, such as a ManualResetEvent:
public abstract class WorkObject : IDispose
{
ManualResetEvent _waitHandle = new ManualResetEvent(false);
public void DoSomeWork()
{
try
{
this.DoSomeWorkOverride();
}
finally
{
_waitHandle.Set();
}
}
protected abstract DoSomeWorkOverride();
public void WaitForCompletion()
{
_waitHandle.WaitOne();
}
public void Dispose()
{
_waitHandle.Dispose();
}
}
And in your code you could say
using (var workObject = new SomeConcreteWorkObject())
{
asyncScheduler.QueueAsyncWork(workObject);
workObject.WaitForCompletion();
}
Don't forget to call Dispose on your workObject though.
You can always use alternate implementations which create a wrapper like this for every work object, and who call _waitHandle.Dispose() in WaitForCompletion(), you can lazily instantiate the wait handle (careful: race conditions ahead), etc. (That's pretty much what BeginInvoke does for delegates.)