Multiple worker threads vs One worker with async/await - c#

My code currently starts the following 10 worker threads. Each worker thread keeps polling for a job from the queue and then processes the long-running job.
for (int k=0; k<10; k++)
{
Task.Factory.StartNew(() => DoPollingThenWork(), TaskCreationOptions.LongRunning);
}
void DoPollingThenWork()
{
while (true)
{
var msg = Poll();
if (msg != null)
{
Thread.Sleep(3000); // process the I/O bound job
}
}
}
I am refactoring the underlying code to use the async/await pattern. I think I can rewrite the above code as follows. It uses one main thread that keeps creating async tasks, and uses SemaphoreSlim to throttle the number of concurrent tasks to 10.
Task.Factory.StartNew(() => WorkerMainAsync(), TaskCreationOptions.LongRunning);
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
while (true)
{
await ss.WaitAsync();
Task.Run(async () =>
{
await DoPollingThenWorkAsync();
ss.Release();
});
}
}
async Task DoPollingThenWorkAsync()
{
var msg = Poll();
if (msg != null)
{
await Task.Delay(3000); // process the I/O-bound job
}
}
Both should behave the same, but I think the second option is better because it doesn't block a thread. The downside is that I can't call Wait (to gracefully stop the task), since each task is effectively fire-and-forget. Is the second option the right way to replace the traditional worker threads pattern?

When you have code that's asynchronous, you usually have no reason to use Task.Run() (or, even worse, Task.Factory.StartNew()). This means that you can change your code to something like this:
await WorkerMainAsync();
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
while (true)
{
await ss.WaitAsync();
// you should probably store this task somewhere and then await it
var task = DoPollingThenWorkAsync(ss);
}
}
async Task DoPollingThenWorkAsync(SemaphoreSlim semaphore)
{
var msg = Poll();
if (msg != null)
{
await Task.Delay(3000); // process the I/O-bound job
}
// this assumes you don't have to worry about exceptions
// otherwise consider try-finally
semaphore.Release();
}
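The two comments in the code above (store the task and await it, and use try-finally around the release) can be combined into one loop. The following is only a minimal sketch under stated assumptions: the CancellationToken, the Worker class, the Processed counter and the stubbed Poll() are hypothetical additions for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Worker
{
    public static int Processed; // counts finished jobs, for illustration only

    // Hypothetical stand-in for the question's Poll(); always returns a message.
    static string Poll() => "job";

    public static async Task WorkerMainAsync(CancellationToken token)
    {
        var semaphore = new SemaphoreSlim(10);
        var running = new List<Task>();

        while (!token.IsCancellationRequested)
        {
            try
            {
                // Waits here whenever 10 jobs are already in flight.
                await semaphore.WaitAsync(token);
            }
            catch (OperationCanceledException)
            {
                break; // stop requested while waiting for a free slot
            }

            // Drop finished tasks so the list does not grow forever.
            // (A faulted task dropped here goes unobserved; real code should log it.)
            running.RemoveAll(t => t.IsCompleted);
            running.Add(DoPollingThenWorkAsync(semaphore));
        }

        // Graceful stop: let the jobs that are already in flight finish.
        await Task.WhenAll(running);
    }

    static async Task DoPollingThenWorkAsync(SemaphoreSlim semaphore)
    {
        try
        {
            var msg = Poll();
            if (msg != null)
            {
                await Task.Delay(100); // simulated I/O-bound job
                Interlocked.Increment(ref Processed);
            }
        }
        finally
        {
            // Released even if the job throws, so a slot is never leaked.
            semaphore.Release();
        }
    }

    public static async Task Main()
    {
        using (var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(350)))
        {
            await WorkerMainAsync(cts.Token);
        }
        Console.WriteLine($"Processed {Processed} jobs before stopping.");
    }
}
```

Because WorkerMainAsync returns a Task that completes only after the in-flight jobs finish, the caller can await it to stop gracefully, which addresses the fire-and-forget concern from the question.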

Usually you don't use async/await inside a CPU-bound task. The method that starts such a task (WorkerMainAsync) can use async/await, but you should be tracking pending tasks:
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
List<Task> trackedTasks = new List<Task>();
while (DoMore())
{
await ss.WaitAsync();
trackedTasks.Add(Task.Run(() =>
{
DoPollingThenWorkAsync();
ss.Release();
}));
}
await Task.WhenAll(trackedTasks);
}
void DoPollingThenWorkAsync()
{
var msg = Poll();
if (msg != null)
{
Thread.Sleep(2000); // process the long running CPU-bound job
}
}
Another exercise would be to remove tasks from trackedTasks as they finish. For example, you could use ContinueWith to remove a finished task (in this case, remember to use lock to protect trackedTasks from simultaneous access).
If you really need to use await inside DoPollingThenWorkAsync, the code wouldn't change a lot:
trackedTasks.Add(Task.Run(async () =>
{
await DoPollingThenWorkAsync();
ss.Release();
}));
Note that in this case, you'd be dealing with a nested task here for the async lambda, which Task.Run will automatically unwrap for you.
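The exercise of removing finished tasks from the tracked list could look roughly like the sketch below. It assumes a simulated 50 ms job in place of the real work, and the TrackedWorker, Gate, Track and TrackedCount names are hypothetical. The task is added to the list before the continuation is attached, so the removal can never run before the add.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class TrackedWorker
{
    static readonly object Gate = new object();
    static readonly List<Task> TrackedTasks = new List<Task>();

    public static int TrackedCount
    {
        get { lock (Gate) { return TrackedTasks.Count; } }
    }

    static void Track(Task task)
    {
        // Add first, then attach the continuation, so Remove never precedes Add.
        lock (Gate) { TrackedTasks.Add(task); }
        task.ContinueWith(
            t => { lock (Gate) { TrackedTasks.Remove(t); } },
            CancellationToken.None,
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default); // explicit scheduler avoids the ambient default
    }

    public static async Task RunAsync(int jobCount)
    {
        var semaphore = new SemaphoreSlim(10);
        for (int i = 0; i < jobCount; i++)
        {
            await semaphore.WaitAsync();
            Track(Task.Run(() =>
            {
                try { Thread.Sleep(50); }        // simulated CPU-bound job
                finally { semaphore.Release(); } // free the slot even on failure
            }));
        }

        // Snapshot under the lock, then wait for whatever is still pending.
        Task[] pending;
        lock (Gate) { pending = TrackedTasks.ToArray(); }
        await Task.WhenAll(pending);
    }

    public static async Task Main()
    {
        await RunAsync(25);
        Console.WriteLine($"Tasks still tracked: {TrackedCount}");
    }
}
```

Passing TaskScheduler.Default explicitly is a design choice here: it keeps the cleanup continuation off whatever scheduler happens to be current when Track is called.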

Related

thread safe processing of a queue

I am putting tasks from the UI thread into a queue, so that they can be processed in another thread. If there is nothing to do, the thread should wait on an AutoResetEvent. Obviously all of this should be thread-safe.
I am putting tasks in the queue from the UI thread like this:
lock (_syncObject)
{
_queue.Enqueue(new FakeTask());
}
_autoResetEvent.Set();
Here is how my processing thread loop looks so far:
while (true)
{
FakeTask task = null;
lock (_syncObject)
{
if (_queue.Count > 0)
{
task = _queue.Dequeue();
}
}
if (task != null)
{
task.Run();
Thread.Sleep(1000); //just here for testing purposes
}
if (_queue.Count == 0)
{
_autoResetEvent.WaitOne();
}
}
I am afraid that the last part, where I check whether something else is in the queue, is not thread-safe, and I would like to know how I can make it so. Thanks!
In the simple case, try using BlockingCollection, which was designed specifically for implementing the producer/consumer pattern:
private async Task Process() {
using (BlockingCollection<FakeTask> _queue = new BlockingCollection<FakeTask>()) {
Task producer = Task.Run(() => {
while (!completed) {
//TODO: put relevant code here
_queue.Add(new FakeTask());
}
_queue.CompleteAdding();
});
Task consumer = Task.Run(() => {
foreach (FakeTask task in _queue.GetConsumingEnumerable()) {
//TODO: process task - put relevant code here
}
});
await Task.WhenAll(producer, consumer).ConfigureAwait(false);
}
}
Edit: if producer is UI thread itself:
private async Task Process() {
using (BlockingCollection<FakeTask> _queue = new BlockingCollection<FakeTask>()) {
Task consumer = Task.Run(() => {
foreach (FakeTask task in _queue.GetConsumingEnumerable()) {
//TODO: process task - put relevant code here
}
});
// If we produce in UI we don't want any separate Task
while (!completed) {
//TODO: put relevant code here
_queue.Add(new FakeTask());
}
_queue.CompleteAdding();
await consumer.ConfigureAwait(false);
}
}
In the case of an entangled mesh (e.g. producers #1 and #2 generate tasks for consumers #1, #2 and #3, which in turn create tasks for...), try TPL Dataflow.
It's not useful to create a thread just so that it can spend basically all of its time sitting around doing nothing waiting for work to do.
All you need to do is use an asynchronous locking mechanism around the UI task's scheduling of background work to be done. SemaphoreSlim provides such an asynchronous synchronization mechanism.
await semaphoreSlim.WaitAsync();
try
{
await Task.Run(() => DoWork());
}
finally
{
semaphoreSlim.Release();
}
Not only is this a lot less code, and much simpler code that more accurately reflects the business logic of the application, but it also consumes far fewer system resources.
And of course if different background operations can safely run in parallel, then just use the thread pool, rather than your own message loop, and the code becomes even simpler.

Should I do a 'Task.Wait()' in a C# loop of sync and async methods

I have two methods I want to call within a loop. Step1() has to complete before Step2() is called. But in a loop, Step1() can start while Step2() is asynchronously executing. Should I simply wait for the Step2 task before allowing any other 'Step2' task to be executed, as I do in the code below?
public MainViewModel()
{
StartCommand = new RelayCommand(Start);
}
public ICommand StartCommand { get; set; }
private async void Start()
{
await Task.Factory.StartNew(() =>
{
Console.WriteLine($"{DateTime.Now:hh:mm:ss.fff} - Started processing.");
for (int i = 0; i < 10; i++)
{
_counter++;
string result = Step1(i);
_step2Task?.Wait(); //Is this OK to do???
Step2(result).ConfigureAwait(false);
}
_step2Task?.Wait();
Console.WriteLine($"{DateTime.Now:hh:mm:ss.fff} - Finished processing.");
});
}
private string Step1(int i)
{
Thread.Sleep(5000); //simulates time-consuming task
Console.WriteLine($"{DateTime.Now:hh:mm:ss.fff} - Step 1 completed - Iteration {i}.");
return $"Step1Result{i}";
}
private async Task Step2(string result)
{
_step2Task = Task.Run(() =>
{
Thread.Sleep(4000); //simulates time-consuming task
Console.WriteLine($"{DateTime.Now:hh:mm:ss.fff} - Step 2 completed. - {result}");
});
await _step2Task;
}
Don't do any of this stuff; you will risk getting deadlocks all over the place. Also, don't move stuff onto threads unless it is CPU bound.
Start over:
Find every long-running synchronous method that is CPU intensive and write an async wrapper around it. The async wrapper should grab a worker thread, execute the CPU intensive task, and complete when the execution is done. Now you consistently have an abstraction in terms of tasks, not threads.
Move all of your control flow logic onto the UI thread.
Put an await everywhere that you mean "the code that comes after this must not execute until the awaited task is complete".
If we do that, your code gets a lot simpler:
// Return Task, not void
// Name async methods accordingly
private async Task StartAsync()
{
Console.WriteLine($"{DateTime.Now:hh:mm:ss.fff} - Started processing.");
Task task2 = null;
for (int i = 0; i < 10; i++)
{
// We cannot do Step2Async until Step1Async's task
// completes, so await it.
string result = await Step1Async(i);
// We can't run a new Step2Async until the old one is done:
if (task2 != null) {
await task2;
task2 = null;
}
// Now run a new Step2Async:
task2 = Step2Async(result);
// But *do not await it*. We don't care if a new Step1Async
// starts up before Step2Async is done.
}
// Finally, don't complete StartAsync until any pending Step2 is done.
if (task2 != null) {
await task2;
task2 = null;
}
Console.WriteLine($"{DateTime.Now:hh:mm:ss.fff} - Finished processing.");
}
private string Step1(int i)
{
// TODO: CPU intensive work here
}
private Task<string> Step1Async(int i)
{
// Run CPU-intensive Step1(i) on a worker thread and return a
// Task<string> representing that work; it completes when the
// work is done.
return Task.Run(() => Step1(i));
}
private void Step2(string result)
{
// TODO: CPU-intensive work here
}
private Task Step2Async(string result)
{
// Again, run Step2 on a worker thread; the returned task
// is signaled when the work is complete.
return Task.Run(() => Step2(result));
}
Remember, await is the sequencing operation on workflows. It means don't proceed with this workflow until this task is complete; go find some other workflow.
Exercise: How would you write the code to represent the workflow:
Step1 must complete before Step2
Any number of Step2 may be running at the same time
All the Step2 must complete before Start completes
?
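One possible answer to the exercise, offered as a sketch rather than the author's own solution: start each Step2 without awaiting it, collect the tasks in a list, and await them all at the end. The Workflow class, the Step2Completed counter and the simulated sleeps are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Workflow
{
    public static int Step2Completed; // for illustration only

    static string Step1(int i)
    {
        Thread.Sleep(50); // simulated CPU-intensive work
        return $"Step1Result{i}";
    }

    static void Step2(string result)
    {
        Thread.Sleep(200); // simulated CPU-intensive work
        Interlocked.Increment(ref Step2Completed);
    }

    // Async wrappers: run the synchronous work on a worker thread.
    static Task<string> Step1Async(int i) => Task.Run(() => Step1(i));
    static Task Step2Async(string result) => Task.Run(() => Step2(result));

    public static async Task StartAsync(int iterations)
    {
        var step2Tasks = new List<Task>();
        for (int i = 0; i < iterations; i++)
        {
            // Step1 must complete before its Step2 starts, so await it.
            string result = await Step1Async(i);

            // Any number of Step2 may run at once: start it, but do not await yet.
            step2Tasks.Add(Step2Async(result));
        }

        // All the Step2 must complete before Start completes.
        await Task.WhenAll(step2Tasks);
    }

    public static async Task Main()
    {
        await StartAsync(5);
        Console.WriteLine($"Step 2 ran {Step2Completed} times.");
    }
}
```

Compared with the single task2 variable used earlier, the list removes the wait on the previous Step2, so the only ordering that remains is Step1 before its own Step2.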

Do multiple awaits to the same Task from a single thread resume in FIFO order?

Supposing a Task is created and awaited multiple times from a single thread. Is the resume order FIFO?
Simplistic example: Is the Debug.Assert() really an invariant?
Task _longRunningTask;
async void ButtonStartSomething_Click()
{
// Wait for any previous runs to complete before starting the next
if (_longRunningTask != null) await _longRunningTask;
// Check our invariant
Debug.Assert(_longRunningTask == null, "This assumes awaits resume in FIFO order");
// Initialize
_longRunningTask = Task.Delay(10000);
// Yield and wait for completion
await _longRunningTask;
// Clean up
_longRunningTask = null;
}
Initialize and Clean up are kept to a bare minimum for the sake of simplicity, but the general idea is that the previous Clean up MUST be complete before the next Initialize runs.
The short answer is: no, it's not guaranteed.
Furthermore, you should not use ContinueWith; among other problems, it has a confusing default scheduler (more details on my blog). You should use await instead:
private async void ButtonStartSomething_Click()
{
// Wait for any previous runs to complete before starting the next
if (_longRunningTask != null) await _longRunningTask;
_longRunningTask = LongRunningTaskAsync();
await _longRunningTask;
}
private async Task LongRunningTaskAsync()
{
// Initialize
await Task.Delay(10000);
// Clean up
_longRunningTask = null;
}
Note that this could still have "interesting" semantics if the button can be clicked many times while the tasks are still running.
The standard way to prevent the multiple-execution problem for UI applications is to disable the button:
private async void ButtonStartSomething_Click()
{
ButtonStartSomething.Enabled = false;
await LongRunningTaskAsync();
ButtonStartSomething.Enabled = true;
}
private async Task LongRunningTaskAsync()
{
// Initialize
await Task.Delay(10000);
// Clean up
}
This forces your users into a one-operation-at-a-time queue.
The order of execution is predefined; however, there is a potential race condition on the _longRunningTask variable if ButtonStartSomething_Click() is called concurrently from more than one thread (which is unlikely).
Alternatively, you can explicitly schedule tasks using a queue. As a bonus, work can be scheduled from non-async methods, too:
void ButtonStartSomething_Click()
{
_scheduler.Add(async() =>
{
// Do something
await Task.Delay(10000);
// Do something else
});
}
Scheduler _scheduler;
class Scheduler
{
public Scheduler()
{
_queue = new ConcurrentQueue<Func<Task>>();
_state = STATE_IDLE;
}
public void Add(Func<Task> func)
{
_queue.Enqueue(func);
ScheduleIfNeeded();
}
public Task Completion
{
get
{
var t = _messageLoopTask;
if (t != null)
{
return t;
}
else
{
return Task.FromResult<bool>(true);
}
}
}
void ScheduleIfNeeded()
{
if (_queue.IsEmpty)
{
return;
}
if (Interlocked.CompareExchange(ref _state, STATE_RUNNING, STATE_IDLE) == STATE_IDLE)
{
_messageLoopTask = Task.Run(new Func<Task>(RunMessageLoop));
}
}
async Task RunMessageLoop()
{
Func<Task> item;
while (_queue.TryDequeue(out item))
{
await item();
}
var oldState = Interlocked.Exchange(ref _state, STATE_IDLE);
System.Diagnostics.Debug.Assert(oldState == STATE_RUNNING);
if (!_queue.IsEmpty)
{
ScheduleIfNeeded();
}
}
volatile Task _messageLoopTask;
ConcurrentQueue<Func<Task>> _queue;
int _state; // per-instance, like the queue it guards
const int STATE_IDLE = 0;
const int STATE_RUNNING = 1;
}
Found the answer under Task.ContinueWith(). It appears to be: no.
Presuming await is just Task.ContinueWith() under the hood, there's documentation for TaskContinuationOptions.PreferFairness that reads:
A hint to a TaskScheduler to schedule task in the order in which they were scheduled, so that tasks scheduled sooner are more likely to run sooner, and tasks scheduled later are more likely to run later.
(bold-facing added)
This suggests there's no guarantee of any sort, inherent or otherwise.
Correct ways to do this
For the sake of someone like me (OP), here's a look at the more correct ways to do this.
Based on Stephen Cleary's answer:
private async void ButtonStartSomething_Click()
{
// Wait for any previous runs to complete before starting the next
if (_longRunningTask != null) await _longRunningTask;
// Initialize
_longRunningTask = ((Func<Task>)(async () =>
{
await Task.Delay(10);
// Clean up
_longRunningTask = null;
}))();
// Yield and wait for completion
await _longRunningTask;
}
Suggested by Raymond Chen's comment:
private async void ButtonStartSomething_Click()
{
// Wait for any previous runs to complete before starting the next
if (_longRunningTask != null) await _longRunningTask;
// Initialize
_longRunningTask = Task.Delay(10000)
.ContinueWith(task =>
{
// Clean up
_longRunningTask = null;
}, TaskContinuationOptions.OnlyOnRanToCompletion);
// Yield and wait for completion
await _longRunningTask;
}
Suggested by Kirill Shlenskiy's comment:
readonly SemaphoreSlim _taskSemaphore = new SemaphoreSlim(1);
async void ButtonStartSomething_Click()
{
// Wait for any previous runs to complete before starting the next
await _taskSemaphore.WaitAsync();
try
{
// Do some initialization here
// Yield and wait for completion
await Task.Delay(10000);
// Do any clean up here
}
finally
{
_taskSemaphore.Release();
}
}
(Please -1 or comment if I've messed something up in either.)
Handling exceptions
Using continuations made me realize one thing: awaiting at multiple places gets complicated really quickly if _longRunningTask can throw exceptions.
If I'm going to use continuations, it looks like I need to top it off by handling all exceptions within the continuation as well.
i.e.
_longRunningTask = Task.Delay(10000)
.ContinueWith(task =>
{
// Clean up
_longRunningTask = null;
}, TaskContinuationOptions.OnlyOnRanToCompletion)
.ContinueWith(task =>
{
// Consume or handle exceptions here
}, TaskContinuationOptions.OnlyOnFaulted);
// Yield and wait for completion
await _longRunningTask;
If I use a SemaphoreSlim, I can do the same thing in the try-catch, and have the added option of bubbling exceptions directly out of ButtonStartSomething_Click.
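A minimal sketch of the SemaphoreSlim variant with exception handling might look like this. The SerializedOperation class, the LongRunningAsync stand-in and the fail/rethrow flags are hypothetical and exist only to exercise both the handled and the bubbling path.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SerializedOperation
{
    static readonly SemaphoreSlim TaskSemaphore = new SemaphoreSlim(1, 1);
    public static int Failures; // for illustration only

    // Hypothetical long-running operation; throws on request to exercise the catch.
    static async Task LongRunningAsync(bool fail)
    {
        await Task.Delay(50);
        if (fail) throw new InvalidOperationException("simulated failure");
    }

    public static async Task StartSomethingAsync(bool fail, bool rethrow)
    {
        // Wait for any previous run to complete before starting the next.
        await TaskSemaphore.WaitAsync();
        try
        {
            // Initialization would go here.
            await LongRunningAsync(fail);
            // Clean-up would go here.
        }
        catch (InvalidOperationException)
        {
            // Consume or handle the exception here...
            Interlocked.Increment(ref Failures);
            // ...or let it bubble out to the caller.
            if (rethrow) throw;
        }
        finally
        {
            // Always release, otherwise the next caller waits forever.
            TaskSemaphore.Release();
        }
    }

    public static async Task Main()
    {
        await StartSomethingAsync(fail: true, rethrow: false);
        // This second call would hang if the failed run had leaked the semaphore.
        await StartSomethingAsync(fail: false, rethrow: false);
        Console.WriteLine($"Failures handled: {Failures}");
    }
}
```

Keeping the release in finally means the handling policy in the catch block can change without affecting whether the next run is allowed to start.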

How to have multiple threads await a single Task?

I've read this: Is it ok to await the same task from multiple threads - is await thread safe? and I don't feel clear about the answer, so here's a specific use case.
I have a method that performs some async network I/O. Multiple threads can hit this method at once, and I don't want them all to invoke a network request. If a request is already in progress, I want to block/await the second and later threads, and have them all resume once the single I/O operation has completed.
How should I write the following pseudocode?
I'm guessing each calling thread really needs to get its own Task, so each can get its own continuation, so instead of returning currentTask I should return a new Task which is completed by the "inner" Task from DoAsyncNetworkIO.
Is there a clean way to do this, or do I have to hand roll it?
static object mutex = new object();
static Task currentTask;
async Task Fetch()
{
lock(mutex)
{
if(currentTask != null)
return currentTask;
}
currentTask = DoAsyncNetworkIO();
await currentTask;
lock(mutex)
{
var task = currentTask;
currentTask = null;
return task;
}
}
You could use a SemaphoreSlim to ensure that only one thread actually executes the background task.
Assume your base task (the one actually doing the IO) is in a method called baseTask(), which I shall emulate like so:
static async Task baseTask()
{
Console.WriteLine("Starting long method.");
await Task.Delay(1000);
Console.WriteLine("Finished long method.");
}
Then you can initialise a SemaphoreSlim like so, to act a bit like an AutoResetEvent with initial state set to true:
static readonly SemaphoreSlim signal = new SemaphoreSlim(1, 1);
Then wrap the call to baseTask() in a method that checks signal to see if this is the first thread to try to run baseTask(), like so:
static async Task<bool> taskWrapper()
{
bool firstIn = await signal.WaitAsync(0);
if (firstIn)
{
await baseTask();
signal.Release();
}
else
{
await signal.WaitAsync();
signal.Release();
}
return firstIn;
}
Then your multiple threads would await taskWrapper() rather than awaiting baseTask() directly.
Putting that all together in a compilable console application:
using System;
using System.Threading;
using System.Threading.Tasks;
namespace Demo
{
static class Program
{
static void Main()
{
for (int it = 0; it < 10; ++it)
{
Console.WriteLine($"\nStarting iteration {it}");
Task[] tasks = new Task[5];
for (int i = 0; i < 5; ++i)
tasks[i] = Task.Run(demoTask);
Task.WaitAll(tasks);
}
Console.WriteLine("\nFinished");
Console.ReadLine();
}
static async Task demoTask()
{
int id = Thread.CurrentThread.ManagedThreadId;
Console.WriteLine($"Thread {id} starting");
bool firstIn = await taskWrapper();
Console.WriteLine($"Task {id}: executed: {firstIn}");
}
static async Task<bool> taskWrapper()
{
bool firstIn = await signal.WaitAsync(0);
if (firstIn)
{
await baseTask();
signal.Release();
}
else
{
await signal.WaitAsync();
signal.Release();
}
return firstIn;
}
static async Task baseTask()
{
Console.WriteLine("Starting long method.");
await Task.Delay(1000);
Console.WriteLine("Finished long method.");
}
static readonly SemaphoreSlim signal = new SemaphoreSlim(1, 1);
}
}
(The methods are all static because they are in a console app; in real code they would be non-static methods.)
await doesn't necessarily use continuations (the Task.ContinueWith kind) at all. Even when it does, you can have multiple continuations on one Task - they just can't all run synchronously (and you might run into some issues if you have a synchronization context).
Do note that your pseudo-code isn't thread-safe, though - you can't just do currentTask = DoAsyncNetworkIO(); outside of a lock. Only the await itself is thread-safe, and even then, only because the Task class that you're awaiting implements the await contract in a thread-safe way. Anyone can write their own awaiter/awaitable, so make sure to pay attention :)
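Following the point above about the lock, one way the question's pseudocode could be made thread-safe is to both start and publish the shared task inside the lock, and to hand every caller that same Task. This is a sketch under assumptions: DoAsyncNetworkIO is simulated with a delay, and the SharedFetch class, RunAndClearAsync helper and RequestCount counter are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SharedFetch
{
    static readonly object Mutex = new object();
    static Task _currentTask;
    public static int RequestCount; // counts real requests, for illustration only

    // Hypothetical stand-in for the real network call.
    static async Task DoAsyncNetworkIO()
    {
        Interlocked.Increment(ref RequestCount);
        await Task.Delay(200);
    }

    public static Task Fetch()
    {
        lock (Mutex)
        {
            // A request is already in flight: every caller awaits the same task.
            if (_currentTask != null)
                return _currentTask;

            Task task = RunAndClearAsync();
            // Publish only if still running; a task that finished synchronously
            // has already executed its clean-up and must not be cached.
            if (!task.IsCompleted)
                _currentTask = task;
            return task;
        }
    }

    static async Task RunAndClearAsync()
    {
        try
        {
            await DoAsyncNetworkIO();
        }
        finally
        {
            // Runs before the task completes, so callers that resume afterwards
            // will start a fresh request on their next Fetch().
            lock (Mutex) { _currentTask = null; }
        }
    }

    public static async Task Main()
    {
        Task first = Fetch();
        Task second = Fetch(); // issued while the first request is still pending
        Console.WriteLine($"Same task shared: {ReferenceEquals(first, second)}");
        await Task.WhenAll(first, second);
        Console.WriteLine($"Network requests made: {RequestCount}");
    }
}
```

No per-caller wrapper task is needed, since a single Task can be awaited by any number of callers; the lock only protects the read and write of the shared field.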

Awaiting task array which is executed in lock

TestAwaitTaskArrayAsync() can be called from several places in code. I need to lock execution of taskArray and wait asynchronously until all of its tasks have finished before the next call starts executing taskArray. Here is the code:
private async Task TestAwaitTaskArrayAsync()
{
Task[] taskArray;
lock (_lock_taskArray)
{
taskArray = new Task[]
{
Task.Run(() =>
{
SomeMethod1();
}),
Task.Run(() =>
{
SomeMethod2();
})
};
}
await Task.WhenAll(taskArray);
}
Await inside a lock is not allowed, so I could use AsyncLock if necessary, but I am trying to keep it simple. Is this code correct and thread-safe? I am not sure whether await Task.WhenAll(taskArray); can be outside of the lock. Should I use AsyncLock instead?
The lock you're using has almost no effect because creating the tasks is very fast and does not conflict with anything. The way you achieve mutual exclusion in an async setting is with the SemaphoreSlim class. It is a lock that supports the Task-async pattern.
SemaphoreSlim sem = new SemaphoreSlim(1);
private async Task TestAwaitTaskArrayAsync()
{
await sem.WaitAsync();
try {
Task[] taskArray = new Task[]
{
Task.Run(() =>
{
SomeMethod1();
}),
Task.Run(() =>
{
SomeMethod2();
})
};
await Task.WhenAll(taskArray);
}
finally { sem.Release(); }
}
In a synchronous way this would have been easier:
lock (_lock_taskArray)
Parallel.Invoke(() => SomeMethod1(), () => SomeMethod2());
Done.
You can also use AsyncLock if you like. That should allow you to use the using construct to release the lock reliably.
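AsyncLock is not a type in the base class library, so the following is a hypothetical minimal version built on SemaphoreSlim, written only to illustrate the using construct mentioned above. Library implementations differ in detail, and the AsyncLockDemo class with its counters is likewise an assumption.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical minimal async lock: LockAsync() returns a disposable releaser.
public sealed class AsyncLock
{
    private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1, 1);

    public async Task<IDisposable> LockAsync()
    {
        await _semaphore.WaitAsync().ConfigureAwait(false);
        return new Releaser(_semaphore);
    }

    private sealed class Releaser : IDisposable
    {
        private SemaphoreSlim _semaphore;
        public Releaser(SemaphoreSlim semaphore) { _semaphore = semaphore; }

        public void Dispose()
        {
            // Release at most once, even if Dispose is called twice.
            Interlocked.Exchange(ref _semaphore, null)?.Release();
        }
    }
}

public static class AsyncLockDemo
{
    static readonly AsyncLock Lock = new AsyncLock();
    static int _active;
    public static int MaxActive; // highest number of callers seen inside the lock

    static void SomeMethod1() => Thread.Sleep(30);
    static void SomeMethod2() => Thread.Sleep(30);

    public static async Task TestAwaitTaskArrayAsync()
    {
        using (await Lock.LockAsync())
        {
            int now = Interlocked.Increment(ref _active);
            MaxActive = Math.Max(MaxActive, now);
            try
            {
                await Task.WhenAll(
                    Task.Run(() => SomeMethod1()),
                    Task.Run(() => SomeMethod2()));
            }
            finally
            {
                Interlocked.Decrement(ref _active);
            }
        } // the lock is released here, after both tasks have finished
    }

    public static async Task Main()
    {
        await Task.WhenAll(
            TestAwaitTaskArrayAsync(),
            TestAwaitTaskArrayAsync(),
            TestAwaitTaskArrayAsync());
        Console.WriteLine($"Max callers inside the lock at once: {MaxActive}");
    }
}
```

The using block plays the role of try-finally from the SemaphoreSlim version: disposing the releaser is what frees the lock, including when the awaited work throws.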
