I'm working on a system that involves accepting commands over a TCP network connection, then sending responses upon execution of those commands. Fairly basic stuff, but I'm looking to support a few requirements:
Multiple clients can connect at the same time and establish separate sessions. Sessions can last as long or as short as desired, with the same client IP able to establish multiple parallel sessions, if desired.
Each session can process multiple commands at the same time, as some of the requested operations can be performed in parallel.
I'd like to implement this cleanly using async/await and, based on what I've read, TPL Dataflow sounds like a good way to cleanly break up the processing into nice chunks that can run on the thread pool instead of tying up threads for different sessions/commands, blocking on wait handles.
This is what I'm starting with (some parts stripped out to simplify, such as details of exception handling; I've also omitted a wrapper that provides an efficient awaitable for the network I/O):
private readonly Task _serviceTask;
private readonly Task _commandsTask;
private readonly CancellationTokenSource _cancellation;
private readonly BufferBlock<Command> _pendingCommands;
public NetworkService(ICommandProcessor commandProcessor)
{
_commandProcessor = commandProcessor;
IsRunning = true;
_cancellation = new CancellationTokenSource();
_pendingCommands = new BufferBlock<Command>();
_serviceTask = Task.Run((Func<Task>)RunService);
_commandsTask = Task.Run((Func<Task>)RunCommands);
}
public bool IsRunning { get; private set; }
private async Task RunService()
{
_listener = new TcpListener(IPAddress.Any, ServicePort);
_listener.Start();
while (IsRunning)
{
Socket client = null;
try
{
client = await _listener.AcceptSocketAsync();
client.Blocking = false;
var session = RunSession(client);
lock (_sessions)
{
_sessions.Add(session);
}
}
catch (Exception ex)
{ //Handling here...
}
}
}
private async Task RunCommands()
{
while (IsRunning)
{
var command = await _pendingCommands.ReceiveAsync(_cancellation.Token);
var task = Task.Run(() => RunCommand(command));
}
}
private async Task RunCommand(Command command)
{
try
{
var response = await _commandProcessor.RunCommand(command.Content);
Send(command.Client, response);
}
catch (Exception ex)
{
//Deal with general command exceptions here...
}
}
private async Task RunSession(Socket client)
{
while (client.Connected)
{
var reader = new DelimitedCommandReader(client);
try
{
var content = await reader.ReceiveCommand();
_pendingCommands.Post(new Command(client, content));
}
catch (Exception ex)
{
//Exception handling here...
}
}
}
The basics seem straightforward, but one part is tripping me up: how do I make sure that when I'm shutting down the application, I wait for all pending command tasks to complete? I get the Task object when I use Task.Run to execute the command, but how do I keep track of pending commands so that I can make sure that all of them are complete before allowing the service to shut down?
I've considered using a simple List, with removal of commands from the List as they finish, but I'm wondering if I'm missing some basic tools in TPL Dataflow that would allow me to accomplish this more cleanly.
EDIT:
Reading more about TPL Dataflow, I'm wondering if what I should be using is a TransformBlock with an increased MaxDegreeOfParallelism to allow processing parallel commands? This sets an upper limit on the number of commands that can run in parallel, but that's a sensible limitation for my system, I think. I'm curious to hear from those who have experience with TPL Dataflow to know if I'm on the right track.
Yeah, so... you're kinda half using the power of TPL here. The fact that you're still manually receiving items from the BufferBlock in your own while loop in a background Task is not the "way" you want to do it if you're subscribing to the TPL DataFlow style.
What you would do is link an ActionBlock to the BufferBlock and do your command processing/sending from within that. This is also the block where you would set the MaxDegreeOfParallelism to control just how many concurrent commands you want to process. So that setup might look something like this:
// Initialization logic to build up the TPL flow
_pendingCommands = new BufferBlock<Command>();
_commandProcessor = new ActionBlock<Command>(this.ProcessCommand);
_pendingCommands.LinkTo(_commandProcessor);
private Task ProcessCommand(Command command)
{
var response = await _commandProcessor.RunCommand(command.Content);
this.Send(command.Client, response);
}
Then, in your shutdown code, you would need to signal that you're done adding items into the pipeline by calling Complete on the _pipelineCommands BufferBlock and then wait on the _commandProcessor ActionBlock to complete to ensure that all items have made their way through the pipeline. You do this by grabbing the Task returned by the block's Completion property and calling Wait on it:
_pendingCommands.Complete();
_commandProcessor.Completion.Wait();
If you want to go for bonus points, you can even separate the command processing from the command sending. This would allow you to configure those steps separately from one another. For example, maybe you need to limit the number of threads processing commands, but want to have more sending out the responses. You would do this by simply introducing a TransformBlock into the middle of the flow:
_pendingCommands = new BufferBlock<Command>();
_commandProcessor = new TransformBlock<Command, Tuple<Client, Response>>(this.ProcessCommand);
_commandSender = new ActionBlock<Tuple<Client, Response>(this.SendResponseToClient));
_pendingCommands.LinkTo(_commandProcessor);
_commandProcessor.LinkTo(_commandSender);
private Task ProcessCommand(Command command)
{
var response = await _commandProcessor.RunCommand(command.Content);
return Tuple.Create(command, response);
}
private Task SendResponseToClient(Tuple<Client, Response> clientAndResponse)
{
this.Send(clientAndResponse.Item1, clientAndResponse.Item2);
}
You probably want to use your own data structure instead of Tuple, it was just for illustrative purposes, but the point is this is exactly the kind of structure you want to use to break up the pipeline so that you can control the various aspects of it exactly how you might need to.
Tasks are by default background, which means that when application terminates they are also immediately terminated. You should use a Thread not a Task. Then you can set:
Thread.IsBackground = false;
This will prevent your application from terminating while the worker thread is running.
Although of course this will require some changes in your above code.
What's more you, when executing the shutdown method, you could also just wait for any outstanding tasks from the main thread.
I do not see a better solution to this.
Related
I'm currently reading in data via a SerialPort connection in an asynchronous Task in a console application that will theoretically run forever (always picking up new serial data as it comes in).
I have a separate Task that is responsible for pulling that serial data out of a HashSet type that gets populated from my "producer" task above and then it makes an API request with it. Since the "producer" will run forever, I need the "consumer" task to run forever as well to process it.
Here's a contrived example:
TagItems = new HashSet<Tag>();
Sem = new SemaphoreSlim(1, 1);
SerialPort = new SerialPort("COM3", 115200, Parity.None, 8, StopBits.One);
// serialport settings...
try
{
var producer = StartProducerAsync(cancellationToken);
var consumer = StartConsumerAsync(cancellationToken);
await producer; // this feels weird
await consumer; // this feels weird
}
catch (Exception e)
{
Console.WriteLine(e); // when I manually throw an error in the consumer, this never triggers for some reason
}
Here's the producer / consumer methods:
private async Task StartProducerAsync(CancellationToken cancellationToken)
{
using var reader = new StreamReader(SerialPort.BaseStream);
while (SerialPort.IsOpen)
{
var readData = await reader.ReadLineAsync()
.WaitAsync(cancellationToken)
.ConfigureAwait(false);
var tag = new Tag {Data = readData};
await Sem.WaitAsync(cancellationToken);
TagItems.Add(tag);
Sem.Release();
await Task.Delay(100, cancellationToken);
}
reader.Close();
}
private async Task StartConsumerAsync(CancellationToken cancellationToken)
{
while (!cancellationToken.IsCancellationRequested)
{
await Sem.WaitAsync(cancellationToken);
if (TagItems.Any())
{
foreach (var item in TagItems)
{
await SendTagAsync(tag, cancellationToken);
}
}
Sem.Release();
await Task.Delay(1000, cancellationToken);
}
}
I think there are multiple problems with my solution but I'm not quite sure how to make it better. For instance, I want my "data" to be unique so I'm using a HashSet, but that data type isn't concurrent-friendly so I'm having to lock with a SemaphoreSlim which I'm guessing could present performance issues with large amounts of data flowing through.
I'm also not sure why my catch block never triggers when an exception is thrown in my StartConsumerAsync method.
Finally, are there better / more modern patterns I can be using to solve this same problem in a better way? I noticed that Channels might be an option but a lot of producer/consumer examples I've seen start with a producer having a fixed number of items that it has to "produce", whereas in my example the producer needs to stay alive forever and potentially produces infinitely.
First things first, starting multiple asynchronous operations and awaiting them one by one is wrong:
// Wrong
await producer;
await consumer;
The reason is that if the first operation fails, the second operation will become fire-and-forget. And allowing tasks to escape your supervision and continue running unattended, can only contribute to your program's instability. Nothing good can come out from that.
// Correct
await Task.WhenAll(producer, consumer)
Now regarding your main issue, which is how to make sure that a failure in one task will cause the timely completion of the other task. My suggestion is to hook the failure of each task with the cancellation of a CancellationTokenSource. In addition, both tasks should watch the associated CancellationToken, and complete cooperatively as soon as possible after they receive a cancellation signal.
var cts = new CancellationTokenSource();
Task producer = StartProducerAsync(cts.Token).OnErrorCancel(cts);
Task consumer = StartConsumerAsync(cts.Token).OnErrorCancel(cts);
await Task.WhenAll(producer, consumer)
Here is the OnErrorCancel extension method:
public static Task OnErrorCancel(this Task task, CancellationTokenSource cts)
{
return task.ContinueWith(t =>
{
if (t.IsFaulted) cts.Cancel();
return t;
}, default, TaskContinuationOptions.DenyChildAttach, TaskScheduler.Default).Unwrap();
}
Instead of doing this, you can also just add an all-enclosing try/catch block inside each task, and call cts.Cancel() in the catch.
I've been working on a project and saw the below code. I am new to the async/await world. As far as I know, only a single task is performing in the method then why it is decorated with async/await. What benefits I am getting by using async/await and what is the drawback if I remove async/await i.e make it synchronous I am a little bit confused so any help will be appreciated.
[Route("UpdatePersonalInformation")]
public async Task<DataTransferObject<bool>> UpdatePersonalInformation([FromBody] UserPersonalInformationRequestModel model)
{
DataTransferObject<bool> transfer = new DataTransferObject<bool>();
try
{
model.UserId = UserIdentity;
transfer = await _userService.UpdateUserPersonalInformation(model);
}
catch (Exception ex)
{
transfer.TransactionStatusCode = 500;
transfer.ErrorMessage = ex.Message;
}
return transfer;
}
Service code
public async Task<DataTransferObject<bool>> UpdateUserPersonalInformation(UserPersonalInformationRequestModel model)
{
DataTransferObject<bool> transfer = new DataTransferObject<bool>();
await Task.Run(() =>
{
try
{
var data = _userProfileRepository.FindBy(x => x.AspNetUserId == model.UserId)?.FirstOrDefault();
if (data != null)
{
var userProfile = mapper.Map<UserProfile>(model);
userProfile.UpdatedBy = model.UserId;
userProfile.UpdateOn = DateTime.UtcNow;
userProfile.CreatedBy = data.CreatedBy;
userProfile.CreatedOn = data.CreatedOn;
userProfile.Id = data.Id;
userProfile.TypeId = data.TypeId;
userProfile.AspNetUserId = data.AspNetUserId;
userProfile.ProfileStatus = data.ProfileStatus;
userProfile.MemberSince = DateTime.UtcNow;
if(userProfile.DOB==DateTime.MinValue)
{
userProfile.DOB = null;
}
_userProfileRepository.Update(userProfile);
transfer.Value = true;
}
else
{
transfer.Value = false;
transfer.Message = "Invalid User";
}
}
catch (Exception ex)
{
transfer.ErrorMessage = ex.Message;
}
});
return transfer;
}
What benefits I am getting by using async/await
Normally, on ASP.NET, the benefit of async is that your server is more scalable - i.e., can handle more requests than it otherwise could. The "Synchronous vs. Asynchronous Request Handling" section of this article goes into more detail, but the short explanation is that async/await frees up a thread so that it can handle other requests while the asynchronous work is being done.
However, in this specific case, that's not actually what's going on. Using async/await in ASP.NET is good and proper, but using Task.Run on ASP.NET is not. Because what happens with Task.Run is that another thread is used to run the delegate within UpdateUserPersonalInformation. So this isn't asynchronous; it's just synchronous code running on a background thread. UpdateUserPersonalInformation will take another thread pool thread to run its synchronous repository call and then yield the request thread by using await. So it's just doing a thread switch for no benefit at all.
A proper implementation would make the repository asynchronous first, and then UpdateUserPersonalInformation can be implemented without Task.Run at all:
public async Task<DataTransferObject<bool>> UpdateUserPersonalInformation(UserPersonalInformationRequestModel model)
{
DataTransferObject<bool> transfer = new DataTransferObject<bool>();
try
{
var data = _userProfileRepository.FindBy(x => x.AspNetUserId == model.UserId)?.FirstOrDefault();
if (data != null)
{
...
await _userProfileRepository.UpdateAsync(userProfile);
transfer.Value = true;
}
else
{
transfer.Value = false;
transfer.Message = "Invalid User";
}
}
catch (Exception ex)
{
transfer.ErrorMessage = ex.Message;
}
return transfer;
}
The await keyword only indicates that the execution of the current function is halted until the Task which is being awaited is completed. This means if you remove the async, the method will continue execution and therefore immediately return the transfer object, even if the UpdateUserPersonalInformation Task is not finished.
Take a look at this example:
private void showInfo()
{
Task.Delay(1000);
MessageBox.Show("Info");
}
private async void showInfoAsync()
{
await Task.Delay(1000);
MessageBox.Show("Info");
}
In the first method, the MessageBox is immediately displayed, since the newly created Task (which only waits a specified amount of time) is not awaited. However, the second method specifies the await keyword, therefore the MessageBox is displayed only after the Task is finished (in the example, after 1000ms elapsed).
But, in both cases the delay Task is ran asynchronously in the background, so the main thread (for example the UI) will not freeze.
The usage of async-await mechanism mainly used
when you have some long calculation process which takes some time and you want it to be on the background
in UI when you don't want to make the main thread stuck which will be reflected on UI performance.
you can read more here:
https://learn.microsoft.com/en-us/dotnet/csharp/async
Time Outs
The main usages of async and await operates preventing TimeOuts by waiting for long operations to complete. However, there is another less known, but very powerful one.
If you don't await long operation, you will get a result back, such as a null, even though the actual request as not completed yet.
Cancellation Tokens
Async requests have a default parameter you can add:
public async Task<DataTransferObject<bool>> UpdatePersonalInformation(
[FromBody] UserPersonalInformationRequestModel model,
CancellationToken cancellationToken){..}
A CancellationToken allows the request to stop when the user changes pages or interrupts the connection. A good example of this is a user has a search box, and every time a letter is typed you filter and search results from your API. Now imagine the user types a very long string with say 15 characters. That means that 15 requests are sent and 15 requests need to be completed. Even if the front end is not awaiting the first 14 results, the API is still doing all the 15 requests.
A cancellation token simply tells the API to drop the unused threads.
I would like to chime in on this because most answers although good, do not point to a definite time when to use and when not.
From my experience, if you are developing anything with a front-end, add async/await to your methods when expecting output from other threads to be input to your UI. This is the best strategy for handling multithread output and Microsoft should be commended to come out with this when they did. Without async/await you would have to add more code to handle thread output to UI (e.g Event, Event Handler, Delegate, Event Subscription, Marshaller).
Don't need it anywhere else except if using strategically for slow peripherals.
I am using the below code snippet to try and run jobs (selected by user from UI) non-blocking from main thread (asynchronously) and concurrently w.r.t each other, with some throttling set up to prevent too many jobs hogging all RAM. I used many sources such as Stephen Cleary's blog, this link on ActionBlock as well as this one from #i3arnon
public class ActionBlockJobsAsyncImpl {
private ActionBlock<Job> qJobs;
private Dictionary<Job, CancellationTokenSource> cTokSrcs;
public ActionBlockJobsAsyncImpl () {
qJobs = new ActionBlock<Job>(
async a_job => await RunJobAsync(a_job),
new ExecutionDataflowBlockOptions
{
BoundedCapacity = boundedCapacity,
MaxDegreeOfParallelism = maxDegreeOfParallelism,
});
cTokSrcs = new Dictionary<Job, CancellationTokenSource>();
}
private async Task<bool> RunJobAsync(Job a_job) {
JobArgs args = JobAPI.GetJobArgs(a_job);
bool ok = await JobAPI.RunJobAsync(args, cTokSrcs[a_job].Token);
return ok;
}
private async Task Produce(IEnumerable<Job> jobs) {
foreach (var job in jobs)
{
await qJobs.SendAsync(job);
}
//qJobs.Complete();
}
public override async Task SubmitJobs(IEnumerable<Job> jobs) {
//-Establish new cancellation token and task status
foreach (var job in jobs) {
cTokSrcs[job] = new CancellationTokenSource();
}
// Start the producer.
var producer = Produce(jobs);
// Wait for everything to complete.
await Task.WhenAll(producer);
}
}
The reason I commented out the qJobs.Complete() method call was because the user should be able to submit jobs continuously from the UI (same ones or different ones), and I learnt from implementing and testing in my first pass using BufferBlock that I shouldn't have that Complete() call if I wanted such a continuous producer/consumer queue. But BufferBlock as I learnt doesn't support running concurrent jobs; hence this is my second pass with ActionBlock instead.
In the above code using ActionBlock, when the user selects jobs and clicks to run from UI, this calls the SubmitJobs method. The int parameters boundedCapacity=8 and maxDegreeOfParallelism=DataflowBlockOptions.Unbounded But the code as is, currently does nothing (i.e., it doesn't run any job) - my analogous BufferBlock implementation on the other hand, used to at least run the jobs asynchronously, albeit sequentially w.r.t each other. Here, it never runs any of the jobs and I don't see any error messages either. Appreciate any ideas on what I'm doing wrong and perhaps some useful ideas on how to fix the problem. Thanks for your interest.
I'm trying to learn those "new" keywords and tried to implement a simple async udp server.
public class UdpServerSync
{
private CancellationTokenSource _cts;
private CancellationToken _token;
private UdpClient _client;
public void Start()
{
Console.WriteLine("Start server");
_cts = new CancellationTokenSource();
_token = _cts.Token;
var ipAddress = IPAddress.Parse("192.168.0.25");
var ip = new IPEndPoint(ipAddress, 7070);
try
{
Task.Run(async () =>
{
using (_client = new UdpClient(ip))
{
while (!_token.IsCancellationRequested)
{
var receivedData = await _client.ReceiveAsync();
var msg = Encoding.ASCII.GetString(receivedData.Buffer);
// Process request e.g ProcessRequest(msg);
Console.WriteLine(msg);
}
}
}, _token).ConfigureAwait(false);
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
public void Stop()
{
Console.WriteLine("Stop server");
if (_cts != null) _cts.Cancel();
}
And then use it like this (for testing purpose):
var server = new UdpServerSync();
server.Start();
await Task.Delay(5000);
server.Stop();
The above code is just a proof of concept, its not about code review. By a simple udp server I mean a while loop with a udpclient listening for udp messages and writing them to the console - no processing or error handling.
The reason for the Task.Delay is just because its a proof of concept of calling the server's start() and stop() methods.
To narrow down my questions:
1) if i was going to call the Start() and Stop() methods from e.g a WPF applicationĀ“s start button, should I use server.Start() or Task.Run ? I don't want to await the call since there's no way to know how long the user is going to want the server started.
2) in the server code "ProcessRequest(msg), if that was a void method in another library, should I use Task.Run() to execute it to avoid the server thread being blocked or is there a better way ?
3) When we do async/await, does the code in the await statement execute in a new thread from the thread pool ?
4) Can I specify that the UdpServer is a long running process or it doesn't matter to the thread pool ?
Hope my question is more clear now, thanks guys :)
1) if i was going to call the Start() and Stop() methods from e.g a WPF applicationĀ“s start button, should I use server.Start() or Task.Run ?
With the code you have now, you can just call Start (which calls Task.Run). IMO, the call to Task.Run is unnecessary. And the call to ConfigureAwait is definitely unnecessary since there's no await to configure.
I don't want to await the call since there's no way to know how long the user is going to want the server started.
But you probably do want to know about any exceptions. So, think about how to handle those. One solution is to save the returned Task as a property. Or, you could just await it (remember, it doesn't matter "how long" it runs, because it's an asynchronous wait).
2) in the server code "ProcessRequest(msg), if that was a void method in another library, should I use Task.Run() to execute it to avoid the server thread being blocked or is there a better way ?
There's no "server thread". But if ProcessRequest takes a long time, then you might want to consider using Task.Run so that your code will accept the next request while processing that one.
3) When we do async/await, does the code in the await statement execute in a new thread from the thread pool ?
No. I have an async intro on my blog that goes into more detail.
4) Can I specify that the UdpServer is a long running process or it doesn't matter to the thread pool ?
It doesn't matter.
I'm writing a Windows Service that will kick off multiple worker threads that will listen to Amazon SQS queues and process messages. There will be about 20 threads listening to 10 queues.
The threads will have to be always running and that's why I'm leaning towards to actually using actual threads for the worker loops rather than threadpool threads.
Here is a top level implementation. Windows service will kick off multiple worker threads and each will listen to it's queue and process messages.
protected override void OnStart(string[] args)
{
for (int i = 0; i < _workers; i++)
{
new Thread(RunWorker).Start();
}
}
Here is the implementation of the work
public async void RunWorker()
{
while(true)
{
// .. get message from amazon sqs sync.. about 20ms
var message = sqsClient.ReceiveMessage();
try
{
await PerformWebRequestAsync(message);
await InsertIntoDbAsync(message);
}
catch(SomeExeception)
{
// ... log
//continue to retry
continue;
}
sqsClient.DeleteMessage();
}
}
I know I can perform the same operation with Task.Run and execute it on the threadpool thread rather than starting individual thread, but I don't see a reason for that since each thread will always be running.
Do you see any problems with this implementation? How reliable would it be to leave threads always running in this fashion and what can I do to make sure that each thread is always running?
One problem with your existing solution is that you call your RunWorker in a fire-and-forget manner, albeit on a new thread (i.e., new Thread(RunWorker).Start()).
RunWorker is an async method, it will return to the caller when the execution point hits the first await (i.e. await PerformWebRequestAsync(message)). If PerformWebRequestAsync returns a pending task, RunWorker returns and the new thread you just started terminates.
I don't think you need a new thread here at all, just use AmazonSQSClient.ReceiveMessageAsync and await its result. Another thing is that you shouldn't be using async void methods unless you really don't care about tracking the state of the asynchronous task. Use async Task instead.
Your code might look like this:
List<Task> _workers = new List<Task>();
CancellationTokenSource _cts = new CancellationTokenSource();
protected override void OnStart(string[] args)
{
for (int i = 0; i < _MAX_WORKERS; i++)
{
_workers.Add(RunWorkerAsync(_cts.Token));
}
}
public async Task RunWorkerAsync(CancellationToken token)
{
while(true)
{
token.ThrowIfCancellationRequested();
// .. get message from amazon sqs sync.. about 20ms
var message = await sqsClient.ReceiveMessageAsync().ConfigureAwait(false);
try
{
await PerformWebRequestAsync(message);
await InsertIntoDbAsync(message);
}
catch(SomeExeception)
{
// ... log
//continue to retry
continue;
}
sqsClient.DeleteMessage();
}
}
Now, to stop all pending workers, you could simple do this (from the main "request dispatcher" thread):
_cts.Cancel();
try
{
Task.WaitAll(_workers.ToArray());
}
catch (AggregateException ex)
{
ex.Handle(inner => inner is OperationCanceledException);
}
Note, ConfigureAwait(false) is optional for Windows Service, because there's no synchronization context on the initial thread, by default. However, I'd keep it that way to make the code independent of the execution environment (for cases where there is synchronization context).
Finally, if for some reason you cannot use ReceiveMessageAsync, or you need to call another blocking API, or simply do a piece of CPU intensive work at the beginning of RunWorkerAsync, just wrap it with Task.Run (as opposed to wrapping the whole RunWorkerAsync):
var message = await Task.Run(
() => sqsClient.ReceiveMessage()).ConfigureAwait(false);
Well, for one I'd use a CancellationTokenSource instantiated in the service and passed down to the workers. Your while statement would become:
while(!cancellationTokenSource.IsCancellationRequested)
{
//rest of the code
}
This way you can cancel all your workers from the OnStop service method.
Additionally, you should watch for:
If you're playing with thread states from outside of the thread, then a ThreadStateException, or ThreadInterruptedException or one of the others might be thrown. So, you want to handle a proper thread restart.
Do the workers need to run without pause in-between iterations? I would throw in a sleep in there (even a few ms's) just so they don't keep the CPU up for nothing.
You need to handle ThreadStartException and restart the worker, if it occurs.
Other than that there's no reason why those 10 treads can't run for as long as the service runs (days, weeks, months at a time).