I'm currently reading in data via a SerialPort connection in an asynchronous Task in a console application that will theoretically run forever (always picking up new serial data as it comes in).
I have a separate Task that is responsible for pulling that serial data out of a HashSet type that gets populated from my "producer" task above and then it makes an API request with it. Since the "producer" will run forever, I need the "consumer" task to run forever as well to process it.
Here's a contrived example:
TagItems = new HashSet<Tag>();
Sem = new SemaphoreSlim(1, 1);
SerialPort = new SerialPort("COM3", 115200, Parity.None, 8, StopBits.One);
// serialport settings...
try
{
var producer = StartProducerAsync(cancellationToken);
var consumer = StartConsumerAsync(cancellationToken);
await producer; // this feels weird
await consumer; // this feels weird
}
catch (Exception e)
{
Console.WriteLine(e); // when I manually throw an error in the consumer, this never triggers for some reason
}
Here's the producer / consumer methods:
private async Task StartProducerAsync(CancellationToken cancellationToken)
{
using var reader = new StreamReader(SerialPort.BaseStream);
while (SerialPort.IsOpen)
{
var readData = await reader.ReadLineAsync()
.WaitAsync(cancellationToken)
.ConfigureAwait(false);
var tag = new Tag {Data = readData};
await Sem.WaitAsync(cancellationToken);
TagItems.Add(tag);
Sem.Release();
await Task.Delay(100, cancellationToken);
}
reader.Close();
}
private async Task StartConsumerAsync(CancellationToken cancellationToken)
{
while (!cancellationToken.IsCancellationRequested)
{
await Sem.WaitAsync(cancellationToken);
if (TagItems.Any())
{
foreach (var item in TagItems)
{
await SendTagAsync(item, cancellationToken);
}
}
Sem.Release();
await Task.Delay(1000, cancellationToken);
}
}
I think there are multiple problems with my solution but I'm not quite sure how to make it better. For instance, I want my "data" to be unique so I'm using a HashSet, but that data type isn't concurrent-friendly so I'm having to lock with a SemaphoreSlim which I'm guessing could present performance issues with large amounts of data flowing through.
I'm also not sure why my catch block never triggers when an exception is thrown in my StartConsumerAsync method.
Finally, are there better / more modern patterns I can be using to solve this same problem in a better way? I noticed that Channels might be an option but a lot of producer/consumer examples I've seen start with a producer having a fixed number of items that it has to "produce", whereas in my example the producer needs to stay alive forever and potentially produces infinitely.
First things first, starting multiple asynchronous operations and awaiting them one by one is wrong:
// Wrong
await producer;
await consumer;
The reason is that if the first operation fails, the second operation becomes fire-and-forget. Allowing tasks to escape your supervision and keep running unattended can only contribute to your program's instability. Nothing good can come of that.
// Correct
await Task.WhenAll(producer, consumer);
Now regarding your main issue, which is how to make sure that a failure in one task will cause the timely completion of the other task. My suggestion is to hook the failure of each task with the cancellation of a CancellationTokenSource. In addition, both tasks should watch the associated CancellationToken, and complete cooperatively as soon as possible after they receive a cancellation signal.
var cts = new CancellationTokenSource();
Task producer = StartProducerAsync(cts.Token).OnErrorCancel(cts);
Task consumer = StartConsumerAsync(cts.Token).OnErrorCancel(cts);
await Task.WhenAll(producer, consumer);
Here is the OnErrorCancel extension method:
public static Task OnErrorCancel(this Task task, CancellationTokenSource cts)
{
return task.ContinueWith(t =>
{
if (t.IsFaulted) cts.Cancel();
return t;
}, default, TaskContinuationOptions.DenyChildAttach, TaskScheduler.Default).Unwrap();
}
Instead of doing this, you can also just add an all-enclosing try/catch block inside each task, and call cts.Cancel() in the catch.
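The try/catch alternative could look roughly like the sketch below. It is only an illustration under stated assumptions: LinkedTasks and RunLinkedAsync are hypothetical names, and the two delegates stand in for StartProducerAsync and StartConsumerAsync.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LinkedTasks
{
    // Hypothetical helper (not from the question): runs both loops and cancels
    // the shared token as soon as either one throws.
    public static async Task RunLinkedAsync(
        Func<CancellationToken, Task> producer,
        Func<CancellationToken, Task> consumer,
        CancellationToken externalToken = default)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(externalToken);

        async Task Guard(Func<CancellationToken, Task> work)
        {
            try
            {
                await work(cts.Token).ConfigureAwait(false);
            }
            catch
            {
                cts.Cancel(); // tell the sibling loop to stop
                throw;        // keep the original exception observable
            }
        }

        // WhenAll completes only after both loops have finished.
        await Task.WhenAll(Guard(producer), Guard(consumer)).ConfigureAwait(false);
    }
}
```

Usage would be something like await LinkedTasks.RunLinkedAsync(StartProducerAsync, StartConsumerAsync, cancellationToken); inside the existing try block, so that a failure in either loop reaches the catch once the other loop has stopped.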
We are using BlockingCollection to implement the producer-consumer pattern in a real-time application, i.e.
BlockingCollection<T> collection = new BlockingCollection<T>();
CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
// Starting up consumer
Task.Run(() => consumer(this.cancellationTokenSource.Token));
…
void Producer(T item)
{
collection.Add(item);
}
…
void consumer(CancellationToken token)
{
while (true)
{
// Blocks until an item is available; throws OperationCanceledException on cancellation
var item = this.collection.Take(token);
process(item);
}
}
To be sure, this is a very simplified version of the actual production code.
Sometimes when the application is under heavy load, we observe that the consuming part is lagging behind the producing part. Since the application logic is very complex, it involves interaction with other applications over network, as well as with SQL databases. Delays could be occurring in many places; they could occur in the calls to process(), which might in principle explain why the consuming part can be slow.
All the above considerations aside, is there something inherent in using BlockingCollection, which could explain this phenomenon? Are there more efficient options in .Net to realise producer-consumer pattern?
First of all, BlockingCollection isn't the best choice for producer/consumer scenarios. There are at least two better options (Dataflow, Channels) and the choice depends on the actual application scenario - which is missing from the question.
It's also possible to create a producer/consumer pipeline without a buffer, by using async streams and IAsyncEnumerable.
Async Streams
In this case, the producer can be an async iterator. The consumer will receive the IAsyncEnumerable and iterate over it until it completes. It could also produce its own IAsyncEnumerable output, which can be passed to the next method in the pipeline:
The producer can be :
public static async IAsyncEnumerable<Message> ProducerAsync(CancellationToken token)
{
while(!token.IsCancellationRequested)
{
var msg=await Task.Run(()=>SomeHeavyWork());
yield return msg;
}
}
And the consumer :
async Task ConsumeAsync(IAsyncEnumerable<Message> source)
{
await foreach(var msg in source)
{
await consumeMessage(msg);
}
}
There's no buffering in this case, and the producer can't emit a new message until the consumer consumes the current one. The consumer can be parallelized with Parallel.ForEachAsync. Finally, the System.Linq.Async package provides LINQ operators for async streams, allowing us to write e.g.:
List<OtherMsg> results=await ProducerAsync(cts.Token)
.Select(msg=>consumeAndReturn(msg))
.ToListAsync();
Dataflow - ActionBlock
Dataflow blocks can be used to construct entire processing pipelines, with each block receiving a message (data) from the previous one, processing it and passing it to the next block. Most blocks have input and where appropriate output buffers. Each block uses a single worker task but can be configured to use more. The application code doesn't have to handle the tasks though.
In the simplest case, a single ActionBlock can process messages posted to it by one or more producers, acting as a consumer:
async Task ConsumeAsync(Message message)
{
//Do something with the message
}
...
ExecutionDataflowBlockOptions _options= new () {
MaxDegreeOfParallelism=4,
BoundedCapacity=5
};
ActionBlock<Message> _block=new ActionBlock<Message>(ConsumeAsync,_options);
async Task ProduceAsync(CancellationToken token)
{
while(!token.IsCancellationRequested)
{
var msg=await produceNewMessageAsync();
await _block.SendAsync(msg);
}
_block.Complete();
await _block.Completion;
}
In this example the block uses up to 4 worker tasks. Once its bounded capacity of 5 messages is reached, SendAsync doesn't block a thread; it waits asynchronously until the block has room again.
BufferBlock as a producer/consumer queue
A BufferBlock is an inactive block that's used as a buffer by other blocks. It can be used as an asynchronous producer/consumer collection, as shown in How to: Implement a producer-consumer dataflow pattern. In this case, the code needs to receive messages explicitly, and threading is up to the developer:
static void Produce(ITargetBlock<byte[]> target)
{
var rand = new Random();
for (int i = 0; i < 100; ++ i)
{
var buffer = new byte[1024];
rand.NextBytes(buffer);
target.Post(buffer);
}
target.Complete();
}
static async Task<int> ConsumeAsync(ISourceBlock<byte[]> source)
{
int bytesProcessed = 0;
while (await source.OutputAvailableAsync())
{
byte[] data = await source.ReceiveAsync();
bytesProcessed += data.Length;
}
return bytesProcessed;
}
static async Task Main()
{
var buffer = new BufferBlock<byte[]>();
var consumerTask = ConsumeAsync(buffer);
Produce(buffer);
var bytesProcessed = await consumerTask;
Console.WriteLine($"Processed {bytesProcessed:#,#} bytes.");
}
Parallelized consumer
In .NET 6 the consumer can be simplified by using await foreach and ReceiveAllAsync :
static async Task<int> ConsumeAsync(IReceivableSourceBlock<byte[]> source)
{
int bytesProcessed = 0;
await foreach(var data in source.ReceiveAllAsync())
{
bytesProcessed += data.Length;
}
return bytesProcessed;
}
And processed concurrently using Parallel.ForEachAsync :
static async Task ConsumeAsync(IReceivableSourceBlock<byte[]> source)
{
var msgs=source.ReceiveAllAsync();
await Parallel.ForEachAsync(msgs,
new ParallelOptions { MaxDegreeOfParallelism = 4},
async (msg, ct) => await ConsumeMsgAsync(msg)); // the body must take (item, CancellationToken) and return a ValueTask
}
By default, Parallel.ForEachAsync will use as many worker tasks as there are cores.
Channels
Channels are similar to Go's channels. They are built specifically for producer/consumer scenarios and allow creating pipelines at a lower level than the Dataflow library. If the Dataflow library was built today, it would be built on top of Channels.
A channel can't be accessed directly, only through its Reader or Writer interfaces. This is intentional, and allows easy pipelining of methods. A very common pattern is for a producer method to create a channel it owns and return only a ChannelReader. Consuming methods accept that reader as input. This way, the producer can control the channel's lifetime without worrying whether other producers are writing to it.
With channels, a producer would look like this :
ChannelReader<Message> Producer(CancellationToken token)
{
var channel=Channel.CreateBounded<Message>(5);
var writer=channel.Writer;
_ = Task.Run(async ()=>{
while(!token.IsCancellationRequested)
{
...
// WriteAsync waits asynchronously while the channel is full
await writer.WriteAsync(msg, token);
}
},token)
.ContinueWith(t=>writer.TryComplete(t.Exception));
return channel.Reader;
}
The unusual .ContinueWith(t=>writer.TryComplete(t.Exception)); is used to signal completion to the writer. This signals readers to complete as well, so completion propagates from one method to the next. Any exceptions are propagated too.
writer.TryComplete(t.Exception) doesn't block or perform any significant work, so it doesn't matter what thread it executes on. This means there's no need to use await on the worker task, which would complicate the code by rethrowing any exceptions.
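To make the completion flow concrete, here is a small self-contained sketch of the same pattern with a finite producer. The names ChannelSketch, Produce and SumAsync are made up for illustration, and the capacity of 5 is arbitrary.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelSketch
{
    // Writes the numbers 1..count, then completes the channel it owns.
    public static ChannelReader<int> Produce(int count, CancellationToken token = default)
    {
        var channel = Channel.CreateBounded<int>(5);
        var writer = channel.Writer;

        _ = Task.Run(async () =>
        {
            for (var i = 1; i <= count; i++)
            {
                // Waits asynchronously while the channel is full (back-pressure).
                await writer.WriteAsync(i, token);
            }
        }, token)
        // Completes the writer, passing along the exception if the worker faulted.
        .ContinueWith(t => writer.TryComplete(t.Exception));

        return channel.Reader;
    }

    // ReadAllAsync ends the loop once the writer has completed and the channel is empty.
    public static async Task<int> SumAsync(ChannelReader<int> reader)
    {
        var sum = 0;
        await foreach (var item in reader.ReadAllAsync())
        {
            sum += item;
        }
        return sum;
    }
}
```

Calling await ChannelSketch.SumAsync(ChannelSketch.Produce(10)) returns 55. An endless producer works the same way; the reader's loop simply keeps going until the writer completes or fails.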
A consuming method only needs the ChannelReader as source.
async Task ConsumerAsync(ChannelReader<Message> source)
{
await Parallel.ForEachAsync(source.ReadAllAsync(),
new ParallelOptions { MaxDegreeOfParallelism = 4},
async (msg, ct) => await consumeMessageAsync(msg)
);
}
A method may read from one channel and publish new data to another using the producer pattern :
ChannelReader<OtherMessage> ConsumerAsync(ChannelReader<Message> source)
{
// The capacity of 5 is an arbitrary choice, matching the producer above
var channel=Channel.CreateBounded<OtherMessage>(5);
var writer=channel.Writer;
_ = Parallel.ForEachAsync(source.ReadAllAsync(),
new ParallelOptions { MaxDegreeOfParallelism = 4},
async (msg, ct)=>{
var newMsg=await consumeMessageAsync(msg);
await writer.WriteAsync(newMsg, ct);
})
.ContinueWith(t=>writer.TryComplete(t.Exception));
return channel.Reader;
}
You could look at using the Dataflow library. I'm not sure if it is more performant than a BlockingCollection. As others have said, there is no guarantee that you can consume faster than produce, so it is always possible to fall behind.
I have a method that runs multiple async methods within it. I have to iterate over a list of devices, and pass the device to this method. I am noticing that this is taking a long time to complete so I am thinking of using Parallel.ForEach so it can run this process against multiple devices at the same time.
Let's say this is my method.
public async Task ProcessDevice(Device device) {
var dev = await _deviceService.LookupDeviceIndbAsNoTracking(device);
var result = await DoSomething(dev);
await DoSomething2(dev);
}
Then DoSomething2 also calls an async method.
public async Task DoSomething2(Device dev) {
foreach(var obj in dev.Objects) {
await DoSomething3(obj);
}
}
The list of devices continuously gets larger over time, so the more this list grows, the longer it takes the program to finish running ProcessDevice() against each device. I would like to process more than one device at a time. So I have been looking into using Parallel.ForEach.
Parallel.ForEach(devices, async device => {
try {
await ProcessDevice(device);
} catch (Exception ex) {
throw ex;
}
});
It appears that the program is finishing before the device is fully processed. I have also tried creating a list of tasks, and then foreach device, add a new task running ProcessDevice to that list and then awaiting Task.WhenAll(listOfTasks);
var listOfTasks = new List<Task>();
foreach(var device in devices) {
var task = Task.Run(async () => await ProcessDevice(device));
listOfTasks.Add(task);
}
await Task.WhenAll(listOfTasks);
But it appears that the task is marked as completed before ProcessDevice() is actually finished running.
Please excuse my ignorance on this issue as I am new to parallel processing and not sure what is going on. What is happening to cause this behavior and is there any documentation that you could provide that could help me better understand what to do?
You can't mix async with Parallel.ForEach. Since your underlying operation is asynchronous, you'd want to use asynchronous concurrency, not parallelism. Asynchronous concurrency is most easily expressed with WhenAll:
var listOfTasks = devices.Select(ProcessDevice).ToList();
await Task.WhenAll(listOfTasks);
In your last example there's a few problems:
var listOfTasks = new List<Task>();
foreach (var device in devices)
{
await Task.Run(async () => await ProcessDevice(device));
}
await Task.WhenAll(listOfTasks);
Doing await Task.Run(async () => await ProcessDevice(device)); means you are not moving to the next iteration of the foreach loop until the previous one is done. Essentially, you're still doing them one at a time.
Additionally, you aren't adding any tasks to listOfTasks so it remains empty and therefore Task.WhenAll(listOfTasks) completes instantly because there's no tasks to await.
Try this:
var listOfTasks = new List<Task>();
foreach (var device in devices)
{
var task = Task.Run(async () => await ProcessDevice(device));
listOfTasks.Add(task);
}
await Task.WhenAll(listOfTasks);
I can explain the problem with Parallel.ForEach. An important thing to understand is that when the await keyword acts on an incomplete Task, it returns. It will return its own incomplete Task if the method signature allows (if it's not void). Then it is up to the caller to use that Task object to wait for the job to finish.
But the second parameter in Parallel.ForEach is an Action<T>, which is a void method, which means no Task can be returned, which means the caller (Parallel.ForEach in this case) has no way to wait until the job has finished.
So in your case, as soon as it hits await ProcessDevice(device), it returns and nothing waits for it to finish so it starts the next iteration. By the time Parallel.ForEach is finished, all it has done is started all the tasks, but not waited for them.
So don't use Parallel.ForEach with asynchronous code.
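The behavior described above can be reproduced with a small sketch. AsyncVoidDemo is a made-up name, and the 500 ms delay merely stands in for real asynchronous work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncVoidDemo
{
    // Returns how many bodies had finished by the time Parallel.ForEach returned.
    public static (int CompletedAtReturn, int Total) Run()
    {
        var completed = 0;
        var items = Enumerable.Range(0, 4).ToArray();

        // The async lambda is converted to Action<int>, i.e. it becomes async void.
        Parallel.ForEach(items, async _ =>
        {
            await Task.Delay(500);                // control returns to Parallel.ForEach here
            Interlocked.Increment(ref completed); // runs later, with nothing waiting for it
        });

        return (Volatile.Read(ref completed), items.Length);
    }
}
```

CompletedAtReturn will typically be 0 even though all four bodies were started, which matches the symptom in the question: the loop finishes long before the devices have been processed.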
Stephen's answer is more appropriate. You can also use WSC's answer, but that can be dangerous with larger lists. Creating hundreds or thousands of new threads all at once will not help your performance.
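If the device list grows large, the number of operations in flight can be capped. Below is one possible sketch using SemaphoreSlim; Throttled.ForEachAsync is a hypothetical helper, not a framework API, and a suitable limit depends on what ProcessDevice actually does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttled
{
    // Hypothetical helper: runs body for every item, with at most
    // maxConcurrency invocations in flight at any moment.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> body)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        async Task RunOne(T item)
        {
            await gate.WaitAsync().ConfigureAwait(false);
            try
            {
                await body(item).ConfigureAwait(false);
            }
            finally
            {
                gate.Release(); // free the slot even if body throws
            }
        }

        // Task.WhenAll enumerates the sequence, which starts every RunOne call;
        // the semaphore decides how many of them proceed at once.
        await Task.WhenAll(items.Select(item => RunOne(item))).ConfigureAwait(false);
    }
}
```

With this in place, the call site becomes await Throttled.ForEachAsync(devices, 8, ProcessDevice); where 8 is an arbitrary limit. On .NET 6 and later, Parallel.ForEachAsync with MaxDegreeOfParallelism serves the same purpose.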
I'm not sure if this is what you are asking for, but I can give an example of how we start an async process:
private readonly Func<Worker> _worker;
private void StartWorkers(IEnumerable<Props> props){
Parallel.ForEach(props, timestamp => { _worker.Invoke().Consume(timestamp); });
}
Would recommend reading about Parallel.ForEach as it will do some part for you.
I have this code :
ManualResetEvent EventListenerStopped;
...
while (true)
{
IAsyncResult iar = this.ListenerHttp.BeginGetContext(ProcessRequest, null);
if (WaitHandle.WaitAny(new[] { this.EventListenerStopped, iar.AsyncWaitHandle }) == 0)
return;
}
Basically it waits for any of two events :
if a request is received, it processes it and wait for the next one.
if EventListenerStopped is raised, it exits the loop.
This code has been running in production beautifully for quite some time now.
I wanted to try and convert it to the new await/async mechanism and can't seem to find a good simple way to do it.
I tried with a boolean the caller can turn to false. It obviously does not work as it exits the loop only after a new request has been received and processed :
bool RunLoop;
...
while (this.RunLoop)
{
HttpListenerContext listenerContext = await this.ListenerHttp.GetContextAsync();
ProcessRequest(listenerContext);
}
I'm wondering if it's even possible to rewrite my simple old-style loop with async/await. If yes, would someone be willing to show me how ?
It's not specific to async-await, but you're probably looking for CancellationToken (which is used with a lot of async-await code anyway):
http://blogs.msdn.com/b/pfxteam/archive/2009/05/22/9635790.aspx
The 'BlockingOperation' example code seems similar to what you're trying to do:
void BlockingOperation(CancellationToken token)
{
ManualResetEvent mre = new ManualResetEvent(false);
//register a callback that will set the MRE
CancellationTokenRegistration registration =
token.Register(() => mre.Set());
using (registration)
{
mre.WaitOne();
if (token.IsCancellationRequested) //did cancellation wake us?
throw new OperationCanceledException(token);
} //dispose the registration, which performs the deregistration.
}
Well, first I must point out that the old code is not quite correct. When dealing with the Begin/End pattern, you must always call End, even if you want to (or did) cancel the operation. End is often used to dispose resources.
If you do want to use cancellation, a CancellationToken is likely the best approach:
while (true)
{
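// Throws an OperationCanceledException when cancellationToken is canceled.
// HttpListener.GetContextAsync has no overload that accepts a token, so the wait itself is canceled via Task.WaitAsync (.NET 6+).
var request = await this.ListenerHttp.GetContextAsync().WaitAsync(cancellationToken);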
ProcessRequest(request);
}
There are alternatives - it's possible to do something like Task.WhenAny, and there are even implementations of AsyncManualResetEvent, so it's possible to create an almost line-by-line equivalent to the old code, but IMO the cancellation token approach would be cleaner.
For example, using AsyncManualResetEvent from my AsyncEx library:
AsyncManualResetEvent eventListenerStopped;
while (true)
{
var task = GetContextAndProcessRequestAsync();
if (await Task.WhenAny(eventListenerStopped.WaitAsync(), task) != task)
return;
}
async Task GetContextAndProcessRequestAsync()
{
var request = await this.ListenerHttp.GetContextAsync();
ProcessRequest(request);
}
But personally, I would change to use CancellationToken.
So here's the situation: I need to make a call to a web site that starts a search. This search continues for an unknown amount of time, and the only way I know if the search has finished is by periodically querying the website to see if there's a "Download Data" link somewhere on it (it uses some strange ajax call on a javascript timer to check the backend and update the page, I think).
So here's the trick: I have hundreds of items I need to search for, one at a time. So I have some code that looks a little bit like this:
var items = getItems();
Parallel.ForEach(items, item =>
{
startSearch(item);
var finished = isSearchFinished(item);
while(finished == false)
{
finished = isSearchFinished(item); //<--- How do I delay this action 30 Secs?
}
downloadData(item);
}
Now, obviously this isn't the real code, because there could be things that cause isSearchFinished to always be false.
Obvious infinite loop danger aside, how would I correctly keep isSearchFinished() from calling over and over and over, but instead call every, say, 30 seconds or 1 minute?
I know Thread.Sleep() isn't the right solution, and I think the solution might be accomplished by using Threading.Timer() but I'm not very familiar with it, and there are so many threading options that I'm just not sure which to use.
It's quite easy to implement with tasks and async/await, as noted by @KevinS in the comments:
async Task<ItemData> ProcessItemAsync(Item item)
{
while (true)
{
if (await isSearchFinishedAsync(item))
break;
await Task.Delay(30 * 1000);
}
return await downloadDataAsync(item);
}
// ...
var items = getItems();
var tasks = items.Select(i => ProcessItemAsync(i)).ToArray();
await Task.WhenAll(tasks);
var data = tasks.Select(t => t.Result);
This way, you don't block ThreadPool threads in vain for what is mostly a bunch of I/O-bound network operations. If you're not familiar with async/await, the async-await tag wiki might be a good place to start.
I assume you can convert your synchronous methods isSearchFinished and downloadData to asynchronous versions using something like HttpClient for non-blocking HTTP requests and returning a Task<>. If you are unable to do so, you can still simply wrap them with Task.Run, as await Task.Run(() => isSearchFinished(item)) and await Task.Run(() => downloadData(item)). Normally this is not recommended, but as you have hundreds of items, it would still give you a much better level of concurrency than Parallel.ForEach in this case, because you won't be blocking pool threads for 30s, thanks to the asynchronous Task.Delay.
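To address the infinite-loop risk raised in the question, the polling loop in ProcessItemAsync could be bounded and made cancellable. A minimal sketch, assuming a hypothetical helper named WaitUntilAsync; the attempt limit is an illustrative choice and not part of the original answer.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Polling
{
    // Polls isDone every interval. Returns true as soon as it reports true,
    // or false after maxAttempts unsuccessful polls.
    public static async Task<bool> WaitUntilAsync(
        Func<Task<bool>> isDone,
        TimeSpan interval,
        int maxAttempts,
        CancellationToken token = default)
    {
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (await isDone().ConfigureAwait(false))
                return true;

            // No point in sleeping after the final attempt.
            if (attempt < maxAttempts)
                await Task.Delay(interval, token).ConfigureAwait(false);
        }
        return false;
    }
}
```

Inside ProcessItemAsync the while loop would then shrink to a single call such as await Polling.WaitUntilAsync(() => isSearchFinishedAsync(item), TimeSpan.FromSeconds(30), 120), followed by a check of the returned flag before downloading.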
You can also write a generic function using TaskCompletionSource and Threading.Timer to return a Task that becomes complete once a specified retry function succeeds.
public static Task RetryAsync(Func<bool> retryFunc, TimeSpan retryInterval)
{
return RetryAsync(retryFunc, retryInterval, CancellationToken.None);
}
public static Task RetryAsync(Func<bool> retryFunc, TimeSpan retryInterval, CancellationToken cancellationToken)
{
var tcs = new TaskCompletionSource<object>();
cancellationToken.Register(() => tcs.TrySetCanceled());
var timer = new Timer((state) =>
{
var taskCompletionSource = (TaskCompletionSource<object>) state;
try
{
if (retryFunc())
{
taskCompletionSource.TrySetResult(null);
}
}
catch (Exception ex)
{
taskCompletionSource.TrySetException(ex);
}
}, tcs, TimeSpan.FromMilliseconds(0), retryInterval);
// Once the task is complete, dispose of the timer so it doesn't keep firing. Also captures the timer
// in a closure so it does not get disposed.
tcs.Task.ContinueWith(t => timer.Dispose(),
CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
return tcs.Task;
}
You can then use RetryAsync like this:
var searchTasks = new List<Task>();
searchTasks.AddRange(items.Select(
downloadItem => RetryAsync( () => isSearchFinished(downloadItem), TimeSpan.FromSeconds(2)) // retry interval
.ContinueWith(t => downloadData(downloadItem),
CancellationToken.None,
TaskContinuationOptions.OnlyOnRanToCompletion,
TaskScheduler.Default)));
await Task.WhenAll(searchTasks.ToArray());
The ContinueWith part specifies what you do once the task has completed successfully. In this case it will run your downloadData method on a thread pool thread because we specified TaskScheduler.Default and the continuation will only execute if the task ran to completion, i.e. it was not canceled and no exception was thrown.
I've discovered that TaskCompletionSource.SetResult(); invokes the code awaiting the task before returning. In my case that results in a deadlock.
This is a simplified version that is started in an ordinary Thread
void ReceiverRun()
{
while (true)
{
var msg = ReadNextMessage();
TaskCompletionSource<Response> task = requests[msg.RequestID];
if(msg.Error == null)
task.SetResult(msg);
else
task.SetException(new Exception(msg.Error));
}
}
The "async" part of the code looks something like this.
await SendAwaitResponse("first message");
SendAwaitResponse("second message").Wait();
The Wait is actually nested inside non-async calls.
The SendAwaitResponse(simplified)
public static Task<Response> SendAwaitResponse(string msg)
{
var t = new TaskCompletionSource<Response>();
requests.Add(GetID(msg), t);
stream.Write(msg);
return t.Task;
}
My assumption was that the second SendAwaitResponse would execute in a ThreadPool thread but it continues in the thread created for ReceiverRun.
Is there any way to set the result of a task without continuing its awaited code?
The application is a console application.
I've discovered that TaskCompletionSource.SetResult(); invokes the code awaiting the task before returning. In my case that results in a deadlock.
Yes, I have a blog post documenting this (AFAIK it's not documented on MSDN). The deadlock happens because of two things:
1. There's a mixture of async and blocking code (i.e., an async method is calling Wait).
2. Task continuations are scheduled using TaskContinuationOptions.ExecuteSynchronously.
I recommend starting with the simplest possible solution: removing the first thing (1). I.e., don't mix async and Wait calls:
await SendAwaitResponse("first message");
SendAwaitResponse("second message").Wait();
Instead, use await consistently:
await SendAwaitResponse("first message");
await SendAwaitResponse("second message");
If you need to, you can Wait at an alternative point further up the call stack (not in an async method).
That's my most-recommended solution. However, if you want to try removing the second thing (2), you can do a couple of tricks: either wrap the SetResult in a Task.Run to force it onto a separate thread (my AsyncEx library has *WithBackgroundContinuations extension methods that do exactly this), or give your thread an actual context (such as my AsyncContext type) and specify ConfigureAwait(false), which will cause the continuation to ignore the ExecuteSynchronously flag.
But those solutions are much more complex than just separating the async and blocking code.
As a side note, take a look at TPL Dataflow; it sounds like you may find it useful.
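On .NET Framework 4.6 and later (and on .NET Core / .NET), there is also a built-in way to remove the second thing (2): creating the TaskCompletionSource with TaskCreationOptions.RunContinuationsAsynchronously, so that SetResult no longer runs the awaiting code inline. The sketch below only demonstrates that effect; TcsSketch and ContinuationRanElsewhere are illustrative names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TcsSketch
{
    // Returns true when the awaiting code resumed on a different thread
    // than the one that called SetResult.
    public static bool ContinuationRanElsewhere()
    {
        var tcs = new TaskCompletionSource<int>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        var continuationThread = -1;
        var setterThread = -1;
        var done = new ManualResetEventSlim();

        async Task AwaiterAsync()
        {
            await tcs.Task.ConfigureAwait(false);
            continuationThread = Environment.CurrentManagedThreadId;
            done.Set();
        }

        _ = AwaiterAsync(); // the task is not complete yet, so this registers a continuation

        // A dedicated thread plays the role of ReceiverRun.
        var setter = new Thread(() =>
        {
            setterThread = Environment.CurrentManagedThreadId;
            tcs.SetResult(42); // returns without running the continuation inline
        });
        setter.Start();
        setter.Join();

        done.Wait(); // the continuation was queued to the thread pool instead
        return continuationThread != setterThread;
    }
}
```

In the question's code this would mean constructing the source in SendAwaitResponse as new TaskCompletionSource<Response>(TaskCreationOptions.RunContinuationsAsynchronously). Blocking with Wait inside asynchronous code is still best avoided.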
As your app is a console app, it runs on the default synchronization context, where the await continuation callback will be called on the same thread the awaiting task has become completed on. If you want to switch threads after await SendAwaitResponse, you can do so with await Task.Yield():
await SendAwaitResponse("first message");
await Task.Yield();
// will be continued on a pool thread
// ...
SendAwaitResponse("second message").Wait(); // so no deadlock
You could further improve this by storing Thread.CurrentThread.ManagedThreadId inside Task.Result and comparing it to the current thread's id after the await. If you're still on the same thread, do await Task.Yield().
While I understand that SendAwaitResponse is a simplified version of your actual code, it's still completely synchronous inside (the way you showed it in your question). Why would you expect any thread switch in there?
Anyway, you probably should redesign your logic the way it doesn't make assumptions about what thread you are currently on. Avoid mixing await and Task.Wait() and make all of your code asynchronous. Usually, it's possible to stick with just one Wait() somewhere on the top level (e.g. inside Main).
[EDITED] Calling task.SetResult(msg) from ReceiverRun actually transfers the control flow to the point where you await on the task - without a thread switch, because of the default synchronization context's behavior. So, your code which does the actual message processing is taking over the ReceiverRun thread. Eventually, SendAwaitResponse("second message").Wait() is called on the same thread, causing the deadlock.
Below is a console app code, modeled after your sample. It uses await Task.Yield() inside ProcessAsync to schedule the continuation on a separate thread, so the control flow returns to ReceiverRun and there's no deadlock.
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
namespace ConsoleApplication
{
class Program
{
class Worker
{
public struct Response
{
public string message;
public int threadId;
}
CancellationToken _token;
readonly ConcurrentQueue<string> _messages = new ConcurrentQueue<string>();
readonly ConcurrentDictionary<string, TaskCompletionSource<Response>> _requests = new ConcurrentDictionary<string, TaskCompletionSource<Response>>();
public Worker(CancellationToken token)
{
_token = token;
}
string ReadNextMessage()
{
// using Thread.Sleep(100) for test purposes here,
// should be using ManualResetEvent (or similar synchronization primitive),
// depending on how messages arrive
string message;
while (!_messages.TryDequeue(out message))
{
Thread.Sleep(100);
_token.ThrowIfCancellationRequested();
}
return message;
}
public void ReceiverRun()
{
LogThread("Enter ReceiverRun");
while (true)
{
var msg = ReadNextMessage();
LogThread("ReadNextMessage: " + msg);
var tcs = _requests[msg];
tcs.SetResult(new Response { message = msg, threadId = Thread.CurrentThread.ManagedThreadId });
_token.ThrowIfCancellationRequested(); // this is how we terminate the loop
}
}
Task<Response> SendAwaitResponse(string msg)
{
LogThread("SendAwaitResponse: " + msg);
var tcs = new TaskCompletionSource<Response>();
_requests.TryAdd(msg, tcs);
_messages.Enqueue(msg);
return tcs.Task;
}
public async Task ProcessAsync()
{
LogThread("Enter Worker.ProcessAsync");
var task1 = SendAwaitResponse("first message");
await task1;
LogThread("result1: " + task1.Result.message);
// avoid deadlock for task2.Wait() with Task.Yield()
// comment this out and task2.Wait() will dead-lock
if (task1.Result.threadId == Thread.CurrentThread.ManagedThreadId)
await Task.Yield();
var task2 = SendAwaitResponse("second message");
task2.Wait();
LogThread("result2: " + task2.Result.message);
var task3 = SendAwaitResponse("third message");
// still on the same thread as with result 2, no deadlock for task3.Wait()
task3.Wait();
LogThread("result3: " + task3.Result.message);
var task4 = SendAwaitResponse("fourth message");
await task4;
LogThread("result4: " + task4.Result.message);
// avoid deadlock for task5.Wait() with Task.Yield()
// comment this out and task5.Wait() will dead-lock
if (task4.Result.threadId == Thread.CurrentThread.ManagedThreadId)
await Task.Yield();
var task5 = SendAwaitResponse("fifth message");
task5.Wait();
LogThread("result5: " + task5.Result.message);
LogThread("Leave Worker.ProcessAsync");
}
public static void LogThread(string message)
{
Console.WriteLine("{0}, thread: {1}", message, Thread.CurrentThread.ManagedThreadId);
}
}
static void Main(string[] args)
{
Worker.LogThread("Enter Main");
var cts = new CancellationTokenSource(5000); // cancel after 5s
var worker = new Worker(cts.Token);
Task receiver = Task.Run(() => worker.ReceiverRun());
Task main = worker.ProcessAsync();
try
{
Task.WaitAll(main, receiver);
}
catch (Exception e)
{
Console.WriteLine("Exception: " + e.Message);
}
Worker.LogThread("Leave Main");
Console.ReadLine();
}
}
}
This is not much different from doing Task.Run(() => task.SetResult(msg)) inside ReceiverRun. The only advantage I can think of is that you have an explicit control over when to switch threads. This way, you can stay on the same thread for as long as possible (e.g., for task2, task3, task4, but you still need another thread switch after task4 to avoid a deadlock on task5.Wait()).
Both solutions would eventually make the thread pool grow, which is bad in terms of performance and scalability.
Now, if we replace task.Wait() with await task everywhere inside ProcessAsync in the above code, we will not have to use await Task.Yield and there still will be no deadlocks. However, the whole chain of await calls after the 1st await task1 inside ProcessAsync will actually be executed on the ReceiverRun thread. As long as we don't block this thread with other Wait()-style calls and don't do a lot of CPU-bound work as we're processing messages, this approach might work OK (asynchronous IO-bound await-style calls still should be OK, and they may actually trigger an implicit thread switch).
That said, I think you'd need a separate thread with a serializing synchronization context installed on it for processing messages (similar to WindowsFormsSynchronizationContext). That's where your asynchronous code containing awaits should run. You'd still need to avoid using Task.Wait on that thread. And if an individual message processing takes a lot of CPU-bound work, you should use Task.Run for such work. For async IO-bound calls, you could stay on the same thread.
You may want to look at ActionDispatcher/ActionDispatcherSynchronizationContext from @StephenCleary's Nito Asynchronous Library for your asynchronous message processing logic. Hopefully, Stephen jumps in and provides a better answer.
"My assumption was that the second SendAwaitResponse would execute in a ThreadPool thread but it continues in the thread created for ReceiverRun."
It depends entirely on what you do within SendAwaitResponse. Asynchrony and concurrency are not the same thing.
Check out: C# 5 Async/Await - is it *concurrent*?
A little late to the party, but here's my solution, which I think adds value.
I've been struggling with this too. I solved it by capturing the SynchronizationContext in the method that is awaited.
It would look something like:
// just a default sync context
private readonly SynchronizationContext _defaultContext = new SynchronizationContext();
void ReceiverRun()
{
while (true) // <-- i would replace this with a cancellation token
{
var msg = ReadNextMessage();
TaskWithContext<Response> task = requests[msg.RequestID];
// if it wasn't a winforms/wpf thread, it would be null
// we choose our default context (threadpool)
var context = task.Context ?? _defaultContext;
// execute it on the context which was captured where it was added. So it won't get completed on this thread.
context.Post(state =>
{
if (msg.Error == null)
task.TaskCompletionSource.SetResult(msg);
else
task.TaskCompletionSource.SetException(new Exception(msg.Error));
}, null); // SynchronizationContext.Post requires a state argument
}
}
public static Task<Response> SendAwaitResponse(string msg)
{
// The key is here! Save the current synchronization context.
var t = new TaskWithContext<Response>(SynchronizationContext.Current);
requests.Add(GetID(msg), t);
stream.Write(msg);
return t.TaskCompletionSource.Task;
}
// class to hold a task and context
public class TaskWithContext<TResult>
{
public SynchronizationContext Context { get; }
public TaskCompletionSource<TResult> TaskCompletionSource { get; } = new TaskCompletionSource<TResult>();
public TaskWithContext(SynchronizationContext context)
{
Context = context;
}
}