I recently had to respond to a production incident where a background thread that executed some code on a loop had stopped executing, but a related thread was continuing to do it's job, in spite of the second thread being explicitly cancelled.
As I read the code, I believe that because of the finally statement (See below) the second thread should have been cancelled and thus further executions stopped. It's worth noting, that the code in the implementation of IBackgroundJob failed with a TaskCancelledException, due to timeouts of > 100 seconds (default for Http I believe).
Can anyone clarify why, when the code exits the inner while loop (either because of cancellation token source state or an exception), the executing code on the second thread (KeepRenewingLock) would continue to run? Would the leaseCancellation?.Cancel() not cancel the token being used by the KeepRenewingLock and thus prevent further lock renewals (incidentally, the lock item behind the scenes continued to be renewed. We could see this from the timestamps on the lock item)
public class Leasing
{
private Task _task;
private readonly CancellationTokenSource _cts;
private readonly IBackgroundJob _backgroundJob;
public Leasing(IBackgroundJob backgroundJob)
{
_backgroundJob = backgroundJob;
_cts = new CancellationTokenSource();
}
public Task StartAsync(CancellationToken cancellationToken)
{
_task = Task.Run(async () =>
{
await RunAsync(_cts.Token);
}, cancellationToken);
return Task.CompletedTask;
}
private async Task RunAsync(CancellationToken cancellationToken)
{
while (!cancellationToken.IsCancellationRequested)
{
CancellationTokenSource leaseCancellation = null;
try
{
leaseCancellation = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
var maintainLockTask = Task.Run(() => KeepRenewingLock(leaseCancellation.Token), leaseCancellation.Token);
while (!cancellationToken.IsCancellationRequested && !maintainLockTask.IsCompleted)
{
await RunBackgroundJobSafely(cancellationToken);
await Task.Delay(TimeSpan.FromMinutes(2), cancellationToken);
}
}
catch (Exception e)
{
//code to log error
}
finally
{
leaseCancellation?.Cancel();
leaseCancellation?.Dispose();
}
}
}
private async Task RunBackgroundJobSafely(CancellationToken cancellationToken)
{
try
{
await _backgroundJob.ExecuteAsync(cancellationToken);
}
catch (Exception ex) when (!_cts.IsCancellationRequested)
{
//log error
}
}
private void KeepRenewingLock(CancellationToken cancellationToken)
{
try
{
while (!cancellationToken.IsCancellationRequested)
{
//task delay here
//code to renew lease
}
}
catch(Exception ex)
{
//code to log error
}
}
}
public interface IBackgroundJob
{
Task ExecuteAsync(CancellationToken cancellationToken);
}
Related
I expected the following Connected and Disconnected handlers are called alternately.
But it wasn't. It is not 100% reproducible but sometimes the test failed. Repository is here.
I attach simplified code here because full code will be long. Please refer to the repository for the reproducible example.
public class ClientStressTest3 {
[Fact]
public async Task TestAsync() {
var client = new Client();
int openCloseDifference = 0;
var failures = Channel.CreateUnbounded<string>();
client.Connected += () => {
Interlocked.Increment(ref openCloseDifference);
int difference = openCloseDifference;
Debug.WriteLine("Connected: {}", difference);
if (Math.Abs(difference) > 1) {
// Failure point. Why enter here?
_ = failures.Writer.WriteAsync($"open close difference {difference}");
}
};
client.Disconnected += () => {
Interlocked.Decrement(ref openCloseDifference);
int difference = openCloseDifference;
Debug.WriteLine("Disconnected: {}", difference);
if (Math.Abs(difference) > 1) {
_ = failures.Writer.WriteAsync($"open close difference {difference}");
}
};
var tasks = new List<Task>();
for (int i = 0; i < 625; i++) {
tasks.Add(Task.Run(async () => {
try {
await client.ConnectAsync().ConfigureAwait(false);
}
catch (Exception ex) { }
}));
tasks.Add(Task.Run(async () => {
try {
await client.CloseAsync().ConfigureAwait(false);
}
catch (Exception ex) { }
}));
}
await Task.WhenAll(tasks).ConfigureAwait(false);
while (await failures.Reader.WaitToReadAsync().ConfigureAwait(false)) {
string failure = await failures.Reader.ReadAsync().ConfigureAwait(false);
throw new Exception(failure);
}
}
}
class Client {
public event Action Connected = delegate { };
public event Action Disconnected = delegate { };
private readonly SemaphoreSlim _connectSemaphore = new(1);
private Task _dispatch = Task.CompletedTask;
private WebSocket _clientSocket;
public async Task ConnectAsync() {
await _connectSemaphore.WaitAsync().ConfigureAwait(false);
// _dispatch may be same among multiple threads if restored at the same time.
try {
await _clientSocket.ConnectAsync().ConfigureAwait(false);
if (_dispatch != null) {
_dispatch = _dispatch.ContinueWith((_) => DispatchEventAsync(), TaskScheduler.Default);
}
}
finally {
_connectSemaphore.Release();
}
}
private async Task DispatchEventAsync() {
try {
Connected();
}
catch (Exception exception) { }
try {
while (await events.WaitToReadAsync().ConfigureAwait(false)) {
DispatchEvent(await events.ReadAsync().ConfigureAwait(false));
}
}
catch (Exception exception) { }
try {
Disconnected();
}
catch (Exception ex) { }
}
}
I think _connectSemaphore will guard the inner code execution, and only one thread will execute DispatchEventAsync.
But it seems sometimes two threads enter DispatchEventAsync at the same time. I can't understand this.
ContinueWith is a low-level method with dangerous default behavior (link is to my blog). In this case, your code is not behaving how you think it should because ContinueWith (like StartNew) doesn't understand asynchronous delegates.
Specifically, the task returned from ContinueWith (the same task stored in _dispatch) will complete when DispatchEventAsync asynchronously yields (i.e., hits its first await that asynchronously waits). This is likely the call to WaitToReadAsync, which is after Connected and before Disconnected. So the _dispatch task completes after Connected and before Disconnected.
Ideally, you should avoid ContinueWith completely. Sometimes a local asynchronous method helps, e.g.:
public async Task ConnectAsync() {
await _connectSemaphore.WaitAsync().ConfigureAwait(false);
try {
await _clientSocket.ConnectAsync().ConfigureAwait(false);
if (_dispatch != null) {
_dispatch = ChainAsync(_dispatch);
}
}
finally {
_connectSemaphore.Release();
}
static async Task ChainAsync(Task dispatch)
{
await dispatch;
await DispatchEventAsync();
}
}
If you do want to continue using ContinueWith for some reason, then you can use Unwrap:
public async Task ConnectAsync() {
await _connectSemaphore.WaitAsync().ConfigureAwait(false);
try {
await _clientSocket.ConnectAsync().ConfigureAwait(false);
if (_dispatch != null) {
_dispatch = _dispatch.ContinueWith(_ => DispatchEventAsync(), TaskScheduler.Default)
.Unwrap();
}
}
finally {
_connectSemaphore.Release();
}
}
Using either of these approaches, the _dispatch task will now complete at the end of DispatchEventAsync.
I think _connectSemaphore will guard the inner code execution, and only one thread will execute DispatchEventAsync.
Nope. Because you don't wait or await _dispatch. The semaphore only protects starting the task. The execution of the task happens in the background sometime after you release the semaphore.
But I'm chaining it with ContinueWith. Isn't it sufficien
No. That just adds another task that runs after; it doesn't actually run the task or wait for it to complete. If you want to run the _dispatch and then another task while holding the semaphore, just
await _dispatch;
await DispatchEventAsync();
I have two worker classes which executes a task.So basically both the tasks are scheduled where 1 task runs every hour and another every minute.
What i want is when the hourly task is running the minute task should pause its execution. I am not sure how i do this?
I have the below code in the worker classes
public override Task StopAsync(CancellationToken cancellationToken)
{
_logger.LogInformation("iLevel Refresh delta cache background service is stopped.");
return base.StopAsync(cancellationToken);
}
public override async Task StartAsync(CancellationToken cancellationToken)
{
await RefreshCache();
await base.StartAsync(cancellationToken);
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
while (!stoppingToken.IsCancellationRequested)
{
await Task.Delay(AppSettings.Current.CacheRefreshDetlaDelay, stoppingToken);
await RefreshCache();
}
}
public async Task RefreshCache()
{
_logger.LogInformation("Starting delta cache refresh...");
var isExceptionLogEnabled = true;
while (true)
try
{
using var scope = _scopeFactory.CreateScope();
using var iLevel = scope.ServiceProvider.GetService<ILevel>();
var cache = scope.ServiceProvider.GetService<CacheService>();
var referenceDataService = new ReferenceDataService(iLevel, cache);
await referenceDataService.LoadReferenceDeltaDataFromILevel;
_logger.LogInformation("Cache refreshed successfully");
return;
}
catch (Exception ex)
{
if (isExceptionLogEnabled)
{
_logger.LogError(ex, "Failed to refresh cache");
isExceptionLogEnabled = false;
}
await Task.Delay(TimeSpan.FromSeconds(1));
}
}
}
}
How do i pause this task when a similar task is running in another worker class.I tried adding a static variable and checking if that variable is true or false. But is there any other way to achieve this?
You could use one of many synchronization primitives and techniques. Since this is async the simplest approach would be a SemaphoreSlim (as you can't use a lock). This will ensure only one task implementation (term used loosely) can run concurrently
private SemaphoreSlim _sync = new SemaphoreSlim(1, 1);
public async Task SomeThing()
{
await _sync.WaitAsync();
try
{
// implementation here
}
finally
{
_sync.Release();
}
}
Note : If this is a long running task, then SemaphoreSlim may not be the best solution as it's based on a SpinWait as such will tend to chew cpu cycles waiting to acquire the lock, if this is the case you could use a Semaphore or check out Overview of synchronization primitives for other alternatives.
My assertion of acceptor.IsStarted.Should().BeTrue(); (see unit test below) always fails, as it's getting evaluated too early. The call to await task returns immediately and doesn't give this.acceptor.Start() enough time to spin up.
I would like to make the startup of my FixAcceptor() more deterministic and therefor introduced the parameter TimeSpan startupDelay.
However I simply have no clue where and how I can delay the startup.
Putting an additional Thread.Sleep(startupDelay) between this.acceptor.Start() and this.IsStarted = true won't help as it will only block the worker task itself, but not the calling thread.
I hope it's clear what I'd like to archive and what I am struggling with. Thanks in advance.
public class FixAcceptor
{
// Type provided by QuickFix.net
private readonly ThreadedSocketAcceptor acceptor;
public FixAcceptor(IFixSettings settings)
{
// Shortened
}
public bool IsStarted { get; private set; }
public async void Run(CancellationToken cancellationToken, TimeSpan startupDelay)
{
var task = Task.Run(() =>
{
cancellationToken.ThrowIfCancellationRequested();
this.acceptor.Start();
this.IsStarted = true;
while (true)
{
// Stop if token has been canceled
if (cancellationToken.IsCancellationRequested)
{
this.acceptor.Stop();
this.IsStarted = false;
cancellationToken.ThrowIfCancellationRequested();
}
// Save some CPU cycles
Thread.Sleep(TimeSpan.FromSeconds(1));
}
}, cancellationToken);
try
{
await task;
}
catch (OperationCanceledException e)
{
Debug.WriteLine(e.Message);
}
}
}
And the corresponding consumer code
[Fact]
public void Should_Run_Acceptor_And_Stop_By_CancelationToken()
{
// Arrange
var acceptor = new FixAcceptor(new FixAcceptorSettings("test_acceptor.cfg", this.logger));
var tokenSource = new CancellationTokenSource();
// Act
tokenSource.CancelAfter(TimeSpan.FromSeconds(10));
acceptor.Run(tokenSource.Token, TimeSpan.FromSeconds(3));
// Assert
acceptor.IsStarted.Should().BeTrue();
IsListeningOnTcpPort(9823).Should().BeTrue();
// Wait for cancel event to occur
Thread.Sleep(TimeSpan.FromSeconds(15));
acceptor.IsStarted.Should().BeFalse();
}
Adding time delays to achieve determinism is not a recommended practice. You can achieve 100% determinism by using a TaskCompletionSource for controlling the completion of a task at just the right moment:
public Task<bool> Start(CancellationToken cancellationToken)
{
var startTcs = new TaskCompletionSource<bool>();
var task = Task.Run(() =>
{
cancellationToken.ThrowIfCancellationRequested();
this.acceptor.Start();
this.IsStarted = true;
startTcs.TrySetResult(true); // Signal that the starting phase is completed
while (true)
{
// ...
}
}, cancellationToken);
HandleTaskCompletion();
return startTcs.Task;
async void HandleTaskCompletion() // async void method = should never throw
{
try
{
await task;
}
catch (OperationCanceledException ex)
{
Debug.WriteLine(ex.Message);
startTcs.TrySetResult(false); // Signal that start failed
}
catch
{
startTcs.TrySetResult(false); // Signal that start failed
}
}
}
Then replace this line in your test:
acceptor.Run(tokenSource.Token, TimeSpan.FromSeconds(3));
...with this one:
bool startResult = await acceptor.Start(tokenSource.Token);
Another issue that caught my eye is the bool IsStarted property which is mutated from one thread and observed by another, without synchronization. This is not really a problem because you could rely on the undocumented memory barrier that is inserted automatically on every await, and be quite confident that you'll not have visibility issues, but if you want to be extra sure you could synchronize the access by using a lock (most robust), or backup the property with a volatile private field like this:
private volatile bool _isStarted;
public bool IsStarted => _isStarted;
I would recommend that you structure your FixAcceptor.Run() methode a little bit different
public async Task Run(CancellationToken cancellationToken, TimeSpan startupDelay)
{
var task = Task.Run(async () =>
{
try
{
cancellationToken.ThrowIfCancellationRequested();
this.acceptor.Start();
this.IsStarted = true;
while (true)
{
// Stop if token has been canceled
if (cancellationToken.IsCancellationRequested)
{
this.acceptor.Stop();
this.IsStarted = false;
cancellationToken.ThrowIfCancellationRequested();
}
// Save some CPU cycles
await Task.Delay(TimeSpan.FromSeconds(1));
}
}
catch (OperationCanceledException e)
{
Debut.WriteLine(e.Message);
}
}, cancellationToken);
await Task.Delay(startupDelay);
}
so the exception handling is in the inner task and the Run methode returns a Task that completes after the startupDelay.
(I also exchanged the Thread.Sleep() with a Task.Delay())
Then in the test methode you can await the Task returned by Run
[Fact]
public async Task Should_Run_Acceptor_And_Stop_By_CancelationToken()
{
// Arrange
var acceptor = new FixAcceptor(new FixAcceptorSettings("test_acceptor.cfg", this.logger));
var tokenSource = new CancellationTokenSource();
// Act
tokenSource.CancelAfter(TimeSpan.FromSeconds(10));
await acceptor.Run(tokenSource.Token, TimeSpan.FromSeconds(3));
// Assert
acceptor.IsStarted.Should().BeTrue();
IsListeningOnTcpPort(9823).Should().BeTrue();
// Wait for cancel event to occur
Thread.Sleep(TimeSpan.FromSeconds(15));
acceptor.IsStarted.Should().BeFalse();
}
It should be okay to make the mehtode async (it seams like you use xunit)
I created a small wrapper around CancellationToken and CancellationTokenSource. The problem I have is that the CancelAsync method of CancellationHelper doesn't work as expected.
I'm experiencing the problem with the ItShouldThrowAExceptionButStallsInstead method. To cancel the running task, it calls await coordinator.CancelAsync();, but the task is not cancelled actually and doesn't throw an exception on task.Wait
ItWorksWellAndThrowsException seems to be working well and it uses coordinator.Cancel, which is not an async method at all.
The question why is the task is not cancelled when I call CancellationTokenSource's Cancel method in async method?
Don't let the waitHandle confuse you, it's only for not letting the task finish early.
Let the code speak for itself:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace TestCancellation
{
class Program
{
static void Main(string[] args)
{
ItWorksWellAndThrowsException();
//ItShouldThrowAExceptionButStallsInstead();
}
private static void ItShouldThrowAExceptionButStallsInstead()
{
Task.Run(async () =>
{
var coordinator = new CancellationHelper();
var waitHandle = new ManualResetEvent(false);
var task = Task.Run(() =>
{
waitHandle.WaitOne();
//this works well though - it throws
//coordinator.ThrowIfCancellationRequested();
}, coordinator.Token);
await coordinator.CancelAsync();
//waitHandle.Set(); -- with or without this it will throw
task.Wait();
}).Wait();
}
private static void ItWorksWellAndThrowsException()
{
Task.Run(() =>
{
var coordinator = new CancellationHelper();
var waitHandle = new ManualResetEvent(false);
var task = Task.Run(() => { waitHandle.WaitOne(); }, coordinator.Token);
coordinator.Cancel();
task.Wait();
}).Wait();
}
}
public class CancellationHelper
{
private CancellationTokenSource cancellationTokenSource;
private readonly List<Task> tasksToAwait;
public CancellationHelper()
{
cancellationTokenSource = new CancellationTokenSource();
tasksToAwait = new List<Task>();
}
public CancellationToken Token
{
get { return cancellationTokenSource.Token; }
}
public void AwaitOnCancellation(Task task)
{
if (task == null) return;
tasksToAwait.Add(task);
}
public void Reset()
{
tasksToAwait.Clear();
cancellationTokenSource = new CancellationTokenSource();
}
public void ThrowIfCancellationRequested()
{
cancellationTokenSource.Token.ThrowIfCancellationRequested();
}
public void Cancel()
{
cancellationTokenSource.Cancel();
Task.WaitAll(tasksToAwait.ToArray());
}
public async Task CancelAsync()
{
cancellationTokenSource.Cancel();
try
{
await Task.WhenAll(tasksToAwait.ToArray());
}
catch (AggregateException ex)
{
ex.Handle(p => p is OperationCanceledException);
}
}
}
}
Cancellation in .NET is cooperative.
That means that the one holding the CancellationTokenSource signals cancellation and the one holding the CancellationToken needs to check whether cancellation was signaled (either by polling the CancellationToken or by registering a delegate to run when it is signaled).
In your Task.Run you use the CancellationToken as a parameter, but you don't check it inside the task itself so the task will only be cancelled if the token was signaled before the task had to a chance to start.
To cancel the task while it's running you need to check the CancellationToken:
var task = Task.Run(() =>
{
token.ThrowIfCancellationRequested();
}, token);
In your case you block on a ManualResetEvent so you wouldn't be able to check the CancellationToken. You can register a delegate to the CancellationToken that frees up the reset event:
token.Register(() => waitHandle.Set())
I'm implementing a worker engine with an upper limit to concurrency. I'm using a semaphore to wait until concurrency drops below the maximum, then use Task.Factory.StartNew to wrap the async handler in a try/catch, with a finally which releases the semaphore.
I realise this creates threads on the thread pool - but my question is, when one of those task-running threads actually awaits (on a real IO call or wait handle), is the thread returned to the pool, as I'd hope it would be?
If there's a better way to implement a task scheduler with limited concurrency where the work handler is an async method (returns Task), I'd love to hear it too. Or, let's say ideally, if there's a way to queue up an async method (again, it's a Task-returning async method) that feels less dodgy than wrapping it in a synchronous delegate and passing it to Task.Factory.StartNew, that would seem perfect..?
(This also makes me think that there are two kinds of parallelism here: how many tasks are being processed overall, but also how many continuations are running on different threads concurrently. Might be cool to have configurable options for both, though not a fixed requirement..)
Edit: snippet:
concurrencySemaphore.Wait(cancelToken);
deferRelease = false;
try
{
var result = GetWorkItem();
if (result == null)
{ // no work, wait for new work or exit signal
signal = WaitHandle.WaitAny(signals);
continue;
}
deferRelease = true;
tasks.Add(Task.Factory.StartNew(() =>
{
try
{
DoWorkHereAsync(result); // guess I'd think to .GetAwaiter().GetResult() here.. not run this yet
}
finally
{
concurrencySemaphore.Release();
}
}, cancelToken));
}
finally
{
if (!deferRelease)
{
concurrencySemaphore.Release();
}
}
Here an example of a TaskWorker, that will not produce countless worker threads.
The magic is done by awaiting SemaphoreSlim.WaitAsync() which is an IO task (and there is no thread).
class TaskWorker
{
private readonly SemaphoreSlim _semaphore;
public TaskWorker(int maxDegreeOfParallelism)
{
if (maxDegreeOfParallelism <= 0)
{
throw new ArgumentOutOfRangeException(nameof(maxDegreeOfParallelism));
}
_semaphore = new SemaphoreSlim(maxDegreeOfParallelism, maxDegreeOfParallelism);
}
public async Task RunAsync(Func<Task> taskFactory, CancellationToken cancellationToken = default(CancellationToken))
{
// No ConfigureAwait(false) here to keep the SyncContext if any
// for the real task
await _semaphore.WaitAsync(cancellationToken);
try
{
await taskFactory().ConfigureAwait(false);
}
finally
{
_semaphore.Release(1);
}
}
public async Task<T> RunAsync<T>(Func<Task<T>> taskFactory, CancellationToken cancellationToken = default(CancellationToken))
{
await _semaphore.WaitAsync(cancellationToken);
try
{
return await taskFactory().ConfigureAwait(false);
}
finally
{
_semaphore.Release(1);
}
}
}
and a simple console app to test
class Program
{
static void Main(string[] args)
{
var worker = new TaskWorker(1);
var cts = new CancellationTokenSource();
var token = cts.Token;
var tasks = Enumerable.Range(1, 10)
.Select(e => worker.RunAsync(() => SomeWorkAsync(e, token), token))
.ToArray();
Task.WhenAll(tasks).GetAwaiter().GetResult();
}
static async Task SomeWorkAsync(int id, CancellationToken cancellationToken)
{
Console.WriteLine($"Some Started {id}");
await Task.Delay(2000, cancellationToken).ConfigureAwait(false);
Console.WriteLine($"Some Finished {id}");
}
}
Update
TaskWorker implementing IDisposable
class TaskWorker : IDisposable
{
private readonly CancellationTokenSource _cts = new CancellationTokenSource();
private readonly SemaphoreSlim _semaphore;
private readonly int _maxDegreeOfParallelism;
public TaskWorker(int maxDegreeOfParallelism)
{
if (maxDegreeOfParallelism <= 0)
{
throw new ArgumentOutOfRangeException(nameof(maxDegreeOfParallelism));
}
_maxDegreeOfParallelism = maxDegreeOfParallelism;
_semaphore = new SemaphoreSlim(maxDegreeOfParallelism, maxDegreeOfParallelism);
}
public async Task RunAsync(Func<Task> taskFactory, CancellationToken cancellationToken = default(CancellationToken))
{
ThrowIfDisposed();
using (var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, _cts.Token))
{
// No ConfigureAwait(false) here to keep the SyncContext if any
// for the real task
await _semaphore.WaitAsync(cts.Token);
try
{
await taskFactory().ConfigureAwait(false);
}
finally
{
_semaphore.Release(1);
}
}
}
public async Task<T> RunAsync<T>(Func<Task<T>> taskFactory, CancellationToken cancellationToken = default(CancellationToken))
{
ThrowIfDisposed();
using (var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, _cts.Token))
{
await _semaphore.WaitAsync(cts.Token);
try
{
return await taskFactory().ConfigureAwait(false);
}
finally
{
_semaphore.Release(1);
}
}
}
private void ThrowIfDisposed()
{
if (disposedValue)
{
throw new ObjectDisposedException(this.GetType().FullName);
}
}
#region IDisposable Support
private bool disposedValue = false;
protected virtual void Dispose(bool disposing)
{
if (!disposedValue)
{
if (disposing)
{
_cts.Cancel();
// consume all semaphore slots
for (int i = 0; i < _maxDegreeOfParallelism; i++)
{
_semaphore.WaitAsync().GetAwaiter().GetResult();
}
_semaphore.Dispose();
_cts.Dispose();
}
disposedValue = true;
}
}
public void Dispose()
{
Dispose(true);
}
#endregion
}
You can think that thread is returned to a ThreadPool even thought it is not actauly a return. The thread simply picks next queued item when async operation starts.
I would suggest you to look at Task.Run instead of Task.Factory.StartNew Task.Run vs Task.Factory.StartNew.
And also have a look at TPL DataFlow. I think it will match your requirements.