Using .NET TPL DataFlow blocks.
Is there any way to timeout the processing of a message?
For example, let's say I have a BufferBlock<T>. Is it possible to link it to another block that processes one message at a time (MaxDegreeOfParallelism = 1) and force a timeout if the processing runs for too long?
Or is it even possible to do using the BufferBlock only?
I suspect I can use a cancellation token somehow and a delay, but not sure how this would be done.
Also, how expensive would such a timeout be? Would it add much overhead to the message processing time?
Many methods of BufferBlock<T> do accept CancellationToken, and I believe that would be the proper way of timing out an operation. E.g.:
var cts = new CancellationTokenSource(5000); // cancel in 5s
// Alternatively: cts.CancelAfter(5000);
try
{
    var output = await bufferBlock.ReceiveAsync(cts.Token);
}
catch (Exception ex)
{
    // check if ex is OperationCanceledException,
    // which could be wrapped with AggregateException
}
IMO, the only way of evaluating its efficiency would be to run some profiling tests.
[UPDATE] Based upon the comments, if you're looking to time out the pipeline processing, you can probably do that when you construct your ActionBlock object and provide it with an instance of ExecutionDataflowBlockOptions. At that point, you can supply DataflowBlockOptions.CancellationToken and use it in the same way as described above. Note that this token cancels the block as a whole rather than a single message, and that DataflowLinkOptions (the options type accepted by LinkTo) has no CancellationToken property, so the token has to be supplied through the block options.
Once you've provided the pipeline with a CancellationToken, you can track the status of ActionBlock.Completion/TransformBlock.Completion, which is a Task, so you can await it and catch a cancellation exception, or use ContinueWith with it (if that's what you mean by some way to tell if the "processing" of the message times out).
Disclaimer: I haven't tried this myself and would be interested to know whether it works as expected.
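Since the block-level token cancels the whole block, a per-message timeout probably has to live inside the delegate you hand to the block. Here is a minimal sketch of that idea, assuming your handler accepts a CancellationToken and actually honours it. PerMessageTimeout, Wrap and onTimeout are names I made up for illustration; they are not part of TPL Dataflow.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PerMessageTimeout
{
    // Hypothetical helper: turns a cancellable handler into a plain
    // Func<T, Task> in which every message gets its own timeout.
    public static Func<T, Task> Wrap<T>(
        Func<T, CancellationToken, Task> handler,
        TimeSpan timeout,
        Action<T> onTimeout)
    {
        return async item =>
        {
            // One short-lived source per message; cancels itself after 'timeout'.
            using (var cts = new CancellationTokenSource(timeout))
            {
                try
                {
                    await handler(item, cts.Token);
                }
                catch (OperationCanceledException) when (cts.IsCancellationRequested)
                {
                    // Swallow the cancellation so the exception does not escape
                    // the delegate (an unhandled exception would fault the block).
                    onTimeout(item);
                }
            }
        };
    }
}
```

The returned delegate can be passed to the ActionBlock<T> constructor together with an ExecutionDataflowBlockOptions whose MaxDegreeOfParallelism is 1. The cost per message is one CancellationTokenSource plus a timer registration; whether that matters for your throughput is something only profiling can tell.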
I already have some experience in working with threads in Windows but most of that experience comes from using Win32 API functions in C/C++ applications. When it comes to .NET applications however, I am often not sure about how to properly deal with multithreading. There are threads, tasks, the TPL and all sorts of other things I can use for multithreading but I never know when to use which of those options.
I am currently working on a C# based Windows service which needs to periodically validate different groups of data from different data sources. Implementing the validation itself is not really an issue for me but I am unsure about how to handle all of the validations running simultaneously.
I need a solution for this which allows me to do all of the following things:
Run the validations at different (predefined) intervals.
Control all of the different validations from one place so I can pause and/or stop them if necessary, for example when a user stops or restarts the service.
Use the system resources as efficiently as possible to avoid performance issues.
So far I've only had one similar project before where I simply used Thread objects combined with a ManualResetEvent and a Thread.Join call with a timeout to notify the threads about when the service is stopped. The logic inside those threads to do something periodically then looked like this:
while (!shutdownEvent.WaitOne(0))
{
    if (DateTime.Now > nextExecutionTime)
    {
        // Do something
        nextExecutionTime = nextExecutionTime.AddMinutes(interval);
    }
    Thread.Sleep(1000);
}
While this did work as expected, I've often heard that using threads directly like this is considered "oldschool" or even a bad practice. I also think that this solution does not use threads very efficiently as they are just sleeping most of the time. How can I achieve something like this in a more modern and efficient way?
If this question is too vague or opinion-based then please let me know and I will try my best to make it as specific as possible.
The question feels a bit broad, but we can take the provided code and try to improve it.
Indeed, the problem with the existing code is that for most of the time it keeps a thread blocked while doing nothing useful (sleeping). The thread also wakes up every second only to check the interval and, in most cases, goes back to sleep because it's not validation time yet. Why does it do that? Because if you sleep for a longer period, you might block for a long time when you signal shutdownEvent and then join the thread. Thread.Sleep doesn't provide a way to be interrupted on request.
To solve both problems we can use:
Cooperative cancellation mechanism in form of CancellationTokenSource + CancellationToken.
Task.Delay instead of Thread.Sleep.
For example:
async Task ValidationLoop(CancellationToken ct) {
    while (!ct.IsCancellationRequested) {
        try {
            var now = DateTime.Now;
            if (now >= _nextExecutionTime) {
                // do something
                _nextExecutionTime = _nextExecutionTime.AddMinutes(1);
            }
            var waitFor = _nextExecutionTime - now;
            if (waitFor.Ticks > 0) {
                await Task.Delay(waitFor, ct);
            }
        }
        catch (OperationCanceledException) {
            // expected, just exit
            // otherwise, let it go and handle cancelled task
            // at the caller of this method (returned task will be cancelled).
            return;
        }
        catch (Exception) {
            // either have global exception handler here
            // or expect the task returned by this method to fail
            // and handle this condition at the caller
        }
    }
}
Now we do not hold a thread any more, because await Task.Delay doesn't block one. Instead, after the specified time interval, it executes the subsequent code on a free thread pool thread (it's more complicated than this, but we won't go into details here).
We also don't need to wake up every second for no reason, because Task.Delay accepts a cancellation token as a parameter. When that token is signalled, Task.Delay is immediately interrupted with an exception, which we expect and use to break out of the validation loop.
To stop the provided loop you need to use CancellationTokenSource:
private readonly CancellationTokenSource _cts = new CancellationTokenSource();
Pass its token, _cts.Token, into the method above. Then, when you want to signal cancellation, just do:
_cts.Cancel();
To further improve resource management: IF your validation code performs any IO (reading files from disk, network calls, database access, etc.), use the Async versions of those operations. Then no threads are held blocked waiting while the IO is in progress either.
Now you don't need to manage threads yourself any more; instead you operate in terms of the tasks you need to perform, letting the framework / OS manage threads for you.
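To tie the pieces together, here is a rough sketch of how one such loop could be started and stopped from a single place, for example from the service's OnStart/OnStop. PeriodicJob and StopAsync are made-up names, and the fixed interval is a simplification of the _nextExecutionTime logic above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper: one instance per validation job.
public sealed class PeriodicJob
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private readonly Task _loop;

    public PeriodicJob(TimeSpan interval, Action work)
    {
        // Note: the first call to work() runs synchronously on the calling
        // thread, because an async method runs until its first real await.
        _loop = RunAsync(interval, work, _cts.Token);
    }

    private static async Task RunAsync(TimeSpan interval, Action work, CancellationToken ct)
    {
        try
        {
            while (!ct.IsCancellationRequested)
            {
                work();
                await Task.Delay(interval, ct); // throws once ct is cancelled
            }
        }
        catch (OperationCanceledException)
        {
            // Expected on shutdown.
        }
    }

    // Signals the loop and completes once it has actually exited.
    public async Task StopAsync()
    {
        _cts.Cancel();
        await _loop; // rethrows if work() failed with an unexpected exception
        _cts.Dispose();
    }
}
```

A service could keep a list of these, one per validation group with its own interval, and await StopAsync on each of them when it is stopped.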
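You should use Microsoft's Reactive Framework (aka Rx): add the System.Reactive NuGet package and using System.Reactive.Linq; (the code below also needs using System.Reactive; for Unit and using System.Reactive.Subjects; for Subject<T>). Then you can do this: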
Subject<bool> starter = new Subject<bool>();
IObservable<Unit> query =
    starter
        .StartWith(true)
        .Select(x => x
            ? Observable.Interval(TimeSpan.FromSeconds(5.0)).SelectMany(y => Observable.Start(() => Validation()))
            : Observable.Never<Unit>())
        .Switch();
IDisposable subscription = query.Subscribe();
That fires off the Validation() method every 5.0 seconds.
When you need to pause and resume, do this:
starter.OnNext(false);
// Now paused
starter.OnNext(true);
// Now restarted.
When you want to stop it all call subscription.Dispose().
Background:
I have a web application which kicks off long running (and stateless) tasks:
var task = Task.Run(() => DoWork(foo));
task.Wait();
Because they are long running, I need to be able to cancel them from a separate web request.
For this, I would like to use a CancellationToken and just throw an exception as soon as the token is canceled. However, from what I've read, Task cancellation is cooperative, meaning the code the task is running must explicitly check the token to see if a cancellation request has been made (for example, with CancellationToken.ThrowIfCancellationRequested()).
I would like to avoid checking CancellationToken.ThrowIfCancellationRequested() all over the place, since the task is quite long and goes through many functions. I think I can accomplish what I want by creating an explicit Thread, but I would really like to avoid manual thread management. That said...
Question:
Is it possible to automatically throw an exception in the task when it has been canceled, and if not, are there any good alternatives (patterns, etc.) to reduce polluting the code with CancellationToken.ThrowIfCancellationRequested()?
I'd like to avoid something like this:
async Task<Bar> DoWork(Foo foo, CancellationToken token)
{
    token.ThrowIfCancellationRequested();
    await DoStuff1();
    token.ThrowIfCancellationRequested();
    await DoStuff2();
    token.ThrowIfCancellationRequested();
    await DoStuff3();
    ...
}
I feel that this question is sufficiently different from this one because I'm explicitly asking for a way to minimize calls to check the cancellation token, to which the accepted answer responds "Every now and then, inside the functions, call token.ThrowIfCancellationRequested()"
Is it possible to automatically throw an exception in the task when it has been canceled, and if not, are there any good alternatives (patterns, etc.) to reduce polluting the code with CancellationToken.ThrowIfCancellationRequested()?
No, and no. All cancellation is cooperative. The best way to cancel code is to have the code respond to a cancellation request. This is the only good pattern.
I think I can accomplish what I want creating an explicit Thread
Not really.
At this point, the question is "how do I cancel uncancelable code?" And the answer to that depends on how stable you want your system to be:
1. Run the code in a separate Thread and Abort the thread when it is no longer necessary. This is the easiest to implement but the most dangerous in terms of application instability. To put it bluntly, if you ever call Abort anywhere in your app, you should regularly restart that app, in addition to standard practices like heartbeat/smoketest checks.
2. Run the code in a separate AppDomain and Unload that AppDomain when it is no longer necessary. This is harder to implement (you have to use remoting), and isn't an option in the Core world. And it turns out that AppDomains don't even protect the containing application like they were supposed to, so any apps using this technique also need to be regularly restarted.
3. Run the code in a separate Process and Kill that process when it is no longer necessary. This is the most complex to implement, since you'll also need to implement some form of inter-process communication. But it is the only reliable solution to cancel uncancelable code.
If you discard the unstable solutions (1) and (2), then the only remaining solution (3) is a ton of work - way, way more than making the code cancelable.
TL;DR: Just use the cancellation APIs the way they were designed to be used. That is the simplest and most effective solution.
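In practice, "the way they were designed to be used" mostly means passing the token down into every awaited call, so that the leaf operations throw for you and explicit checks are only needed around long CPU-bound stretches. A small sketch, assuming the DoStuff methods can be changed to accept a token; here Task.Delay stands in for whatever real I/O they perform, and Worker is a made-up class name.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Worker
{
    // Stand-ins for the question's DoStuff1..3 (assumed to be cancellable I/O).
    static Task DoStuff1(CancellationToken ct) { return Task.Delay(100, ct); }
    static Task DoStuff2(CancellationToken ct) { return Task.Delay(100, ct); }
    static Task DoStuff3(CancellationToken ct) { return Task.Delay(100, ct); }

    public static async Task<string> DoWork(CancellationToken ct)
    {
        // No ThrowIfCancellationRequested calls here: each awaited operation
        // observes the token itself and throws OperationCanceledException.
        await DoStuff1(ct);
        await DoStuff2(ct);
        await DoStuff3(ct);
        return "done";
    }
}
```

Most async APIs in the framework (streams, HttpClient, ADO.NET commands, Task.Delay) have overloads that take a CancellationToken, which is what makes this pattern workable.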
If you actually just have a bunch of method calls you are calling one after the other, you can implement a method runner that runs them in sequence and checks in between for the cancellation.
Something like this:
public static void WorkUntilFinishedOrCancelled(CancellationToken token, params Action[] work)
{
    foreach (var workItem in work)
    {
        token.ThrowIfCancellationRequested();
        workItem();
    }
}
You could use it like this:
async Task<Bar> DoWork(Foo foo)
{
    WorkUntilFinishedOrCancelled([YourCancellationToken], DoStuff1, DoStuff2, DoStuff3, ...);
}
This would essentially do what you want.
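The runner above takes synchronous Action delegates, while the steps in the question are awaited. An async variant might look like the sketch below; SequentialRunner and RunAsync are names invented here, and each step is assumed to be a parameterless function returning a Task.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SequentialRunner
{
    // Runs the steps one after another, checking the token before each one.
    public static async Task RunAsync(CancellationToken token, params Func<Task>[] steps)
    {
        foreach (var step in steps)
        {
            token.ThrowIfCancellationRequested();
            await step();
        }
    }
}
```

Keep in mind that this only checks between steps: a step that is already running is not interrupted unless it observes the token itself.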
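If you are OK with the implications of Thread.Abort (disposables not disposed, locks not released, application state corrupted), then here is how you could implement non-cooperative cancellation by aborting the task's dedicated thread. Note that this only works on .NET Framework; on .NET Core and .NET 5+, Thread.Abort throws PlatformNotSupportedException.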
private static Task<TResult> RunAbortable<TResult>(Func<TResult> function,
    CancellationToken cancellationToken)
{
    var tcs = new TaskCompletionSource<TResult>();
    var thread = new Thread(() =>
    {
        try
        {
            TResult result;
            using (cancellationToken.Register(Thread.CurrentThread.Abort))
            {
                result = function();
            }
            tcs.SetResult(result);
        }
        catch (ThreadAbortException)
        {
            tcs.TrySetCanceled();
        }
        catch (Exception ex)
        {
            tcs.TrySetException(ex);
        }
    });
    thread.IsBackground = true;
    thread.Start();
    return tcs.Task;
}
Usage example:
var cts = new CancellationTokenSource();
var task = RunAbortable(() => DoWork(foo), cts.Token);
task.Wait();
I would like to write a timeout function for the BasicPublish method of the RabbitMQ C# client. For many reasons sometimes the queue is blocked, or rabbit is down or whatever. But I want to detect when the publish is failing right away. I do not want to block the site for any reason.
I'm worried that implementing a timeout with Task or threads would add overhead to a simple publish that we perform millions of times in production.
Does anyone have an idea how to write a quick timeout for a fast blocking method such as BasicPublish?
Clarification: I'm also working in .NET 4, so I do not have async.
Polly has a TimeoutPolicy aimed at exactly this scenario.
Polly's TimeoutStrategy.Optimistic is close to @ThiagoCustodio's answer, but it also disposes the CancellationTokenSource correctly. However, RabbitMQ's C# client doesn't (at time of writing) offer a BasicPublish() overload taking a CancellationToken, so this approach is not relevant.
Polly's TimeoutStrategy.Pessimistic is aimed at scenarios such as BasicPublish(), where you want to impose a timeout on a delegate which doesn't have CancellationToken support.
Polly's TimeoutStrategy.Pessimistic:
[1] allows the calling thread to time-out on (walk away from waiting for) the execution, even when the executed delegate doesn't support cancellation.
[2] does so at the cost of an extra task/thread (in synchronous executions), and manages this for you.
[3] also captures the timed-out Task (the task you have walked away from). This can be valuable for logging, and is essential to avoid UnobservedTaskExceptions - particularly in .NET4.0, where an UnobservedTaskException can bring down your entire process.
Simple example:
Policy.Timeout(TimeSpan.FromSeconds(10), TimeoutStrategy.Pessimistic).Execute(() => BasicPublish(...));
Full example properly avoiding UnobservedTaskExceptions:
Policy timeoutPolicy = Policy.Timeout(TimeSpan.FromSeconds(10), TimeoutStrategy.Pessimistic, (context, timespan, task) =>
{
    task.ContinueWith(t => { // ContinueWith important!: the abandoned task may very well still be executing, when the caller times out on waiting for it!
        if (t.IsFaulted)
        {
            logger.Error($"{context.PolicyKey} at {context.ExecutionKey}: execution timed out after {timespan.TotalSeconds} seconds, eventually terminated with: {t.Exception}.");
        }
        else
        {
            // extra logic (if desired) for tasks which complete, despite the caller having 'walked away' earlier due to timeout.
        }
    });
});
timeoutPolicy.Execute(() => BasicPublish(...));
To avoid building up too many concurrent pending tasks/threads in the case where RabbitMQ becomes unavailable, you can use a Bulkhead Isolation policy to limit parallelization and/or a CircuitBreaker to prevent putting calls through for a period, once you detect a certain level of failures. These can be combined with the TimeoutPolicy using PolicyWrap.
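For anyone who wants to see what the pessimistic strategy boils down to, points [1] to [3] above can be approximated with plain .NET 4 types. This is only my rough sketch of the idea, not Polly's actual implementation; PessimisticTimeout and Execute are invented names.

```csharp
using System;
using System.Threading.Tasks;

public static class PessimisticTimeout
{
    // Runs 'action' on a thread-pool task and walks away after 'timeout'.
    // The action itself is NOT stopped; it keeps running in the background.
    public static void Execute(Action action, TimeSpan timeout)
    {
        Task task = Task.Factory.StartNew(action);

        // Wait returns true if the task finished in time; if the task
        // faulted, Wait throws an AggregateException to the caller.
        if (task.Wait(timeout))
        {
            return;
        }

        // Observe any exception of the abandoned task so it never surfaces
        // as an UnobservedTaskException (fatal by default on .NET 4.0).
        task.ContinueWith(
            t => { var ignored = t.Exception; },
            TaskContinuationOptions.OnlyOnFaulted);

        throw new TimeoutException("The operation did not complete within " + timeout + ".");
    }
}
```

The overhead is one thread-pool work item plus a blocking wait per call. At millions of publishes that is worth measuring, and it is the reason the bulkhead and circuit-breaker advice above matters: every timed-out call leaves a thread-pool thread blocked until the underlying publish returns.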
I would say that the easiest way is using tasks / cancellation token. Do you think it's an overhead?
public static async Task WithTimeoutAfterStart(
    Func<CancellationToken, Task> operation, TimeSpan timeout)
{
    var source = new CancellationTokenSource();
    var task = operation(source.Token);
    source.CancelAfter(timeout);
    await task;
}
Usage:
await WithTimeoutAfterStart(
    ct => SomeOperationAsync(ct), TimeSpan.FromMilliseconds(n));
Multiple similar questions have been asked here before.
MSDN states as an important note that one should always dispose the CancellationTokenSource when done with it.
OK, but it becomes a little complicated with multithreaded applications and async-await model.
I'm developing a library. The problem I ran into is that in several places I'm using CreateLinkedTokenSource on a CancellationToken received from the user. In short, I'm doing it so that I can cancel an operation myself if it takes longer than some time.
Example
public async Task<Result> DoAsync(CancellationToken cancellationToken)
{
    using (var linkedTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
    {
        // here pass linkedTokenSource.Token further down the line
        var resultTask = sender.DoAsync(linkedTokenSource.Token);
        var timeoutTask = Task.Delay(timeout);
        var completed = await Task.WhenAny(resultTask, timeoutTask);
        if (completed == timeoutTask)
        {
            linkedTokenSource.Cancel();
            throw new TimeoutException();
        }
        return await resultTask;
        // from the point of view of this piece of code
        // we're done with the cancellationTokenSource right?
        // so I need to dispose the source (done here via `using`)
    }
}
However, further down the line, in different code sections, race conditions mean that some threads try to call CreateLinkedTokenSource on linkedTokenSource.Token, which results in an ObjectDisposedException because linkedTokenSource has already been disposed after the TimeoutException was thrown.
This will end up in an UnobservedTaskException, which will confuse the user if they listen for unobserved exceptions.
Putting a try-catch around every CreateLinkedTokenSource call and silencing the ObjectDisposedException also seems strange to me.
My questions are:
Why does CreateLinkedTokenSource throw this exception? Is there an explanation for this? Since CancellationToken is a struct, why shouldn't I be able to create a new source out of it (even if the cancellationToken is cancelled)?
Any suggestions on how I should handle disposing in this scenario?
This will end up in an UnobservedTaskException, which will confuse the user if they listen for unobserved exceptions.
Well, sort of. UnobservedTaskException is pretty much always going to be a fact of life whenever you use Task.WhenAny (and abandon the incomplete task, which is the vast majority of the time Task.WhenAny is used).
So, they may get an ObjectDisposedException reported to UnobservedTaskException instead of an OperationCanceledException. Meh; in the async world, if you're using Task.WhenAny, you really need to ignore UnobservedTaskException anyway. Besides, a lot of "not easily cancelable" endpoints will close the underlying handle on cancellation requests, which cause (IIRC) ObjectDisposedException anyway.
Why does CreateLinkedTokenSource throw this exception? Is there an explanation for this?
It's part of those really, really old Microsoft design guidelines that were written with an '80s OOP mindset. I never agreed with MS's Dispose guidelines, preferring a much simpler model that covers all the same use cases with significantly less mental overhead.
Any suggestions on how I should handle disposing in this scenario?
Just keep it as-is. UnobservedTaskException isn't a big deal.
As far as I can see, you've removed the rest of the code that creates the task. Nevertheless, it's very likely that your code is something like this:
public async Task<Result> DoAsync(CancellationToken cancellationToken)
{
    using (var linkedTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
    {
        await Task.Run(() =>
        {
            // here pass linkedTokenSource.Token further down the line
            // check if further processing reached some timeout and if it did:
            if (timeout)
            {
                linkedTokenSource.Cancel();
                throw new TimeoutException();
            }
            // from the point of view of this piece of code
            // we're done with the cancellationTokenSource right?
            // so I need to dispose the source (done here via `using`)
        });
    }
}
So you capture a variable in a closure while that variable is declared in a using statement. It works like this:
1. You enter the using block
2. You start your task
3. You immediately return from the method
4. You execute the finally block for your method, which calls the .Dispose() method on your linkedTokenSource variable
5. You get a timeout
6. You access a disposed closure variable
You have to rewrite your code to dispose the linkedTokenSource manually after you are done with the whole task, not when you're done with starting it.
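One way to make sure the source is disposed only after the whole operation is done is to let the linked source carry the timeout itself and to await the operation before leaving the using block. A sketch under the assumption that the inner operation honours the token it is given; TimeoutHelper and WithTimeout are made-up names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutHelper
{
    public static async Task<T> WithTimeout<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan timeout,
        CancellationToken callerToken)
    {
        using (var linked = CancellationTokenSource.CreateLinkedTokenSource(callerToken))
        {
            linked.CancelAfter(timeout);
            try
            {
                // The operation is awaited to the end, so nothing is still
                // using linked.Token when the source gets disposed below.
                return await operation(linked.Token);
            }
            catch (OperationCanceledException)
                when (linked.IsCancellationRequested && !callerToken.IsCancellationRequested)
            {
                // Our own timer fired, not the caller's token.
                throw new TimeoutException();
            }
        }
    }
}
```

Compared with the Task.WhenAny version there is no abandoned task, hence nothing left behind to touch a disposed source. The trade-off is that an operation which ignores the token will not time out at all.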
I've designed and made a prototype application for a high performance, multi-threaded mail merge to run as a Windows Service (C#). This question refers to one sticky part of the problem, what to do if the process hangs on a database call. I have researched this a lot. I have read a lot of articles about thread cancellation and I ultimately only see one way to do this, thread.Abort(). Yes, I know, absolutely do not use Thread.Abort(), so I have been researching for days how to do it another way and as I see it, there is no alternative. I will tell you why and hopefully you can tell me why I am wrong.
FYI, these are meant as long running threads, so the TPL would make them outside the ThreadPool anyway.
TPL is just a nice wrapper for a Thread, so I see absolutely nothing a Task can do that a Thread cannot. It's just done differently.
Using a thread, you have two choices for stopping it.
1. Have the thread poll in a processing loop to see if a flag has requested cancellation and just end the processing and let the thread die. No problem.
2. Call Thread.Abort() (then catch the exception, do a Join and worry about Finally, etc.)
This is a database call in the thread, so polling will not work once it is started.
On the other hand, if you use TPL and a CancellationToken, it seems to me that you're still polling and then creating an exception. It looks like the same thing I described in case 1 with the thread. Once I start that database call (I also intend to put a async / await around it), there is no way I can test for a change in the CancellationToken. For that matter, the TPL is worse as calling the CancellationToken during a Db read will do exactly nothing, far less than a Thread.Abort() would do.
I cannot believe this is a unique problem, but I have not found a real solution and I have read a lot. Whether a Thread or Task, the worker thread has to poll to know it should stop and then stop (not possible when connected to a Db. It's not in a loop.) or else the thread must be aborted, throwing a ThreadAbortException or a TaskCanceledException.
My current plan is to start each job as a longrunning thread. If the thread exceeds the time limit, I will call Thread.Abort, catch the exception in the thread and then do a Join() on the thread after the Abort().
I am very, very open to suggestions... Thanks, Mike
I will put this link here because it claims to do this, but I'm having trouble figuring it out, and there are no replies to make me think it will work:
multi-threading-cross-class-cancellation-with-tpl
Oh, this looked like a good possibility, but I don't know about it either Treating a Thread as a Service
You can't actually cancel the DB operation. The request is sent across the network; it's "out there" now, there's no pulling it back. The best you can really do is ignore the response that comes back, and continue on executing whatever code you would have executed had the operation actually completed. It's important to recognize what this is though; this isn't actually cancelling anything, it's just moving on even though you're not done. It's a very important distinction.
If you have some task, and you want it to instead become cancelled when you want it to be, you can create a continuation that uses a CancellationToken such that the continuation will be marked as canceled when the token indicates it should be, or it'll be completed when the task completes. You can then use that continuation's Task in place of the actual underlying tasks for all of your continuations, and the task will be cancelled if the token is cancelled.
// Extension methods must live in a static class.
public static class TaskCancellationExtensions
{
    public static Task WithCancellation(this Task task,
        CancellationToken token)
    {
        return task.ContinueWith(t => t.GetAwaiter().GetResult(), token);
    }

    public static Task<T> WithCancellation<T>(this Task<T> task,
        CancellationToken token)
    {
        return task.ContinueWith(t => t.GetAwaiter().GetResult(), token);
    }
}
You can then take a given task, pass in a cancellation token, and get back a task that will have the same result except with altered cancellation semantics.
You have several other options for your thread cancellation. For example, your thread could make an asynchronous database call and then wait on that and on the cancellation token. For example:
// cmd is a SqlCommand object
// token is a cancellation token
IAsyncResult ia = cmd.BeginExecuteNonQuery(); // starts an async request
WaitHandle[] handles = new WaitHandle[]{token.WaitHandle, ia.AsyncWaitHandle};
var ix = WaitHandle.WaitAny(handles);
if (ix == 0)
{
    // cancellation was requested
}
else if (ix == 1)
{
    // async database operation is done. Harvest the result.
}
There's no need to throw an exception if the operation was canceled. And there's no need for Thread.Abort.
This all becomes much cleaner with Task, but it's essentially the same thing. Task handles common errors and helps you to do a better job fitting all the pieces together.
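As an illustration of the Task-based equivalent, the WaitAny over two handles can be expressed as "await whichever finishes first: the operation or the token". This is a sketch of a commonly used pattern; TaskCancellation and WaitOrCancel are names I picked, and the task passed in would be something like the one returned by an async database call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskCancellation
{
    // Completes with the task's result, or throws OperationCanceledException
    // as soon as the token is cancelled. The underlying operation is not
    // stopped by this; the caller merely stops waiting for it.
    public static async Task<T> WaitOrCancel<T>(Task<T> task, CancellationToken token)
    {
        var cancelSignal = new TaskCompletionSource<bool>();
        using (token.Register(() => cancelSignal.TrySetResult(true)))
        {
            Task first = await Task.WhenAny(task, cancelSignal.Task);
            if (first != task)
            {
                throw new OperationCanceledException(token);
            }
        }
        return await task; // already completed; propagates result or exception
    }
}
```

Where the API offers a token-accepting overload (for example, the ExecuteNonQueryAsync overload that takes a CancellationToken), passing the token directly is preferable, since the provider can then at least attempt to cancel the command.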
You said:
TPL is just a nice wrapper for a Thread, so I see absolutely nothing a Task can do that a Thread cannot. It's just done differently.
That's true, as far as it goes. After all, C# is just a nice wrapper for an assembly language program, so I see absolutely nothing a C# program can do that I can't do in assembly language. But it's a whole lot easier and faster to do it with C#.
Same goes for the difference between TPL or Tasks, and managing your own threads. You can do all manner of stuff managing your own threads, or you can let the TPL handle all the details and be more likely to get it right.