EventProcessorClient delays between events

EventProcessorClient delays between events - c#

I'm trying to create a way to handle spikes of events in the eventhub. My current poc solution is just to fire and forget tasks as I'm consuming events, instead of awaiting them and then throttle parallel task amount using semaphore to avoid resource starvation.
Utility that throttles things:
public class ThrottledParallelTaskFactory
{
...
public Task StartNew(Func<Task> func)
{
_logger.LogDebug("Available semaphore count {AvailableDataConsumerCount} out of total {DataConsumerCountLimit}", _semaphore.CurrentCount, _limit);
_semaphoreSlim.Wait(_timeout);
_ = Task.Run(func)
.ContinueWith(t =>
{
if (t.Status is TaskStatus.Faulted or TaskStatus.Canceled or TaskStatus.RanToCompletion)
{
_semaphoreSlim.Release();
_logger.LogDebug("Available semaphore count {AvailableDataConsumerCount} out of total {DataConsumerCountLimit}", _semaphore.CurrentCount, _limit);
}
if (t.Status is TaskStatus.Canceled or TaskStatus.Faulted)
{
_logger?.LogError(t.Exception, "Parallel task failed");
}
});
return Task.CompletedTask;
}
}
My EventProcessorClient.ProcessEventAsync delegate:
private Task ProcessEvent(ProcessEventArgs arg)
{
var sw = Stopwatch.StartNew();
try
{
_throttledParallelTaskFactory.StartNew(async () => await Task.Delay(1000));
}
catch (Exception e)
{
_logger.LogError(e, "Failed to process event");
}
_logger.LogDebug($"Took {sw.ElapsedMilliseconds} ms");
return Task.CompletedTask;
}
After running this setup for a while, I noticed that my throttler's Semaphore maxes out at 2-3 tasks running in parallel, when my configured limit is 15. This kind of suggests to me that my handler takes 333-500ms to finish, but Stopwatch inside the handler says that the whole handler takes 0 ms to execute. I later added timestamp logging of when handler starts/ends to confirm it and it does take 0-1ms, but there's a mystery 300-600ms gap between them. NOTE: For current tests, this client is processing a backlog of millions of events, it's not processing live data, which could cause similar delays between events.
Does by any chance EventProcessorClient checkpoint internally after every single event? 300-500ms seems massive in my head.
I have both used default cached event/prefetch counts and increased ones without much difference.
Edit:
It ended up being not implementation related networking issue

You are not measuring the right thing and basically you are using async/await & Task wrong.
private Task ProcessEvent(ProcessEventArgs arg)
{
var sw = Stopwatch.StartNew();
try
{
_throttledParallelTaskFactory.StartNew(async () => await Task.Delay(1000));
}
catch (Exception e)
{
_logger.LogError(e, "Failed to process event");
}
_logger.LogDebug($"Took {sw.ElapsedMilliseconds} ms");
return Task.CompletedTask;
}
In the above code the call to _throttledParallelTaskFactory.StartNew is not awaited. So the stopwatch has nothing to measure. Furthermore, since the call is not awaited any exception won't be caught.
You should move the exception handling and time measurement to the StartNew method like this:
private Task ProcessEvent(ProcessEventArgs arg)
{
_throttledParallelTaskFactory.StartNew(() => Task.Delay(1000));
return Task.CompletedTask;
}
public class ThrottledParallelTaskFactory
{
public async Task StartNew(Func<Task> func)
{
var sw = Stopwatch.StartNew();
_logger.LogDebug("Available semaphore count {AvailableDataConsumerCount} out of total {DataConsumerCountLimit}", _semaphore.CurrentCount, _limit);
_semaphoreSlim.Wait(_timeout);
try
{
await func.Invoke();
}
catch
{
_logger.LogError(e, "Failed to process event");
_logger?.LogError(t.Exception, "Parallel task failed");
}
finally
{
_semaphoreSlim.Release();
_logger.LogDebug("Available semaphore count {AvailableDataConsumerCount} out of total {DataConsumerCountLimit}", _semaphore.CurrentCount, _limit);
_logger.LogDebug($"Took {sw.ElapsedMilliseconds} ms");
}
}
}
See how we got rid of the call to ContinueWith? Also, since the func already represents a Task there is no need to wrap the code in a call to Task.Run.
Does by any chance EventProcessorClient checkpoint internally after every single event?
No, it does not. You have to do checkpointing manually.

Related

How do you run a variable number of concurrent parametrizable infinite loop type of threads in C#?

I am creating my first multithreading C#/.NET based app that will run on a Azure Service Fabric cluster. As the title says, I wish to run a variable number of concurrent parametrizable infinite-loop type of threads, that will utilize the RunAsync method.
Each child thread looks something like this:
public async Task childThreadCall(...argument list...)
{
while (true)
{
try
{
//long running code
//do something useful here
//sleep for an independently parameterizable period, then wake up and repeat
}
catch (Exception e)
{
//Exception Handling
}
}
}
There are a variable number of such child threads that are called in the RunAsync method. I want to do something like this:
protected override async Task RunAsync(CancellationToken cancellationToken)
{
try
{
for (int i = 0; i < input.length; i++)
{
ThreadStart ts[i] = new ThreadStart(childThreadCall(...argument list...));
Thread tc[i] = new Thread(ts);
tc[i].Start();
}
}
catch (Exception e)
{
//Exception Handling
}
}
So basically each of the child threads run independently from the others, and keep doing so forever. Is it possible to do such a thing? Could someone point me in the right direction? Are there any pitfalls to be aware of?

The RunAsync method is called upon start of the service. So yes it can be used to do what you want. I suggest using Tasks, as they play nicely with the cancelation token. Here is a rough draft:
protected override async Task RunAsync(CancellationToken cancellationToken)
{
var tasks = new List<Task>();
try
{
for (int i = 0; i < input.length; i++)
{
tasks.Add(MyTask(cancellationToken, i);
}
await Task.WhenAll(tasks);
}
catch (Exception e)
{
//Exception Handling
}
}
public async Task MyTask(CancellationToken cancellationToken, int a)
{
while (true)
{
cancellationToken.ThrowIfCancellationRequested();
try
{
//long running code, if possible check for cancellation using the token
//do something useful here
await SomeUseFullTask(cancellationToken);
//sleep for an independently parameterizable period, then wake up and repeat
await Task.Delay(TimeSpan.FromHours(1), cancellationToken);
}
catch (Exception e)
{
//Exception Handling
}
}
}
Regarding pitfalls, there is a nice list of things to think of in general when using Tasks.
Do mind that Tasks are best suited for I/O bound work. If you can post what exactly is done in the long running process please do, then I can maybe improve the answer to best suit your use case.
One important thing it to respect the cancellation token passed to the RunAsync method as it indicates the service is about to stop. It gives you the opportunity to gracefully stop your work. From the docs:
Make sure cancellationToken passed to RunAsync(CancellationToken) is honored and once it has been signaled, RunAsync(CancellationToken) exits gracefully as soon as possible. Please note that if RunAsync(CancellationToken) has finished its intended work, it does not need to wait for cancellationToken to be signaled and can return gracefully.
As you can see in my code I pass the CancellationToken to child methods so they can react on a possible cancellation. In your case there will be a cancellation because of the endless loop.

Using Task.Delay within Task.Run

I have a Windows Service that monitors my application by running a couple of tests every second. A bug report has been submitted that said that the service stoppes running after a while, and I'm trying to figure out why.
I suspect that the code below is the culprit, but I have trouble understanding exactly how it works. The ContinueWith statement has recently been commented out, but I dont know if it is needed
private Task CreateTask(Action action)
{
var ct = _cts.Token;
return Task.Run(async () =>
{
ct.ThrowIfCancellationRequested();
var sw = new Stopwatch();
while (true)
{
sw.Restart();
action();
if (ct.IsCancellationRequested)
{
_logger.Debug("Cancellation requested");
break;
}
var wait = _settings.loopStepFrequency - sw.ElapsedMilliseconds;
if (wait <= 0) // No need to delay
continue;
// If ContinueWith is needed wrap this in an ugly try/catch
// handling the exception
await Task.Delay(
(int)(_settings.loopStepFrequency - sw.ElapsedMilliseconds),
ct); //.ContinueWith(tsk => { }, ct);
}
_logger.Debug("Task was cancelled");
}, _cts.Token);
}
Are there any obvious problems with this code?

Are there any obvious problems with this code?
The one that jumps out to me is the calculation for the number of milliseconds to delay. Specifically, there's no floor. If action() takes an unusually long time, then the task could fail in a possibly unexpected way.
There are several ways for the task to complete in either a cancelled or failed state, or it can delay forever:
The task can be cancelled before the delegate begins, due to the cancellation token passed to Task.Run.
The task can be cancelled by the ThrowIfCancellationRequested call.
The task can complete successfully after being cancelled, due to the IsCancellationRequested logic.
The task can be cancelled by the cancellation token passed to Task.Delay.
The task may fail with an ArgumentOutOfRangeException if _settings.loopStepFrequency - sw.ElapsedMilliseconds is less than -1. This is probably a bug.
The task may delay indefinitely (until cancelled) if _settings.loopStepFrequency - sw.ElapsedMilliseconds happens to be exactly -1. This is probably a bug.
To fix this code, I recommend two things:
The code is probably intending to do await Task.Delay((int) wait, ct); instead of await Task.Delay((int)(_settings.loopStepFrequency - sw.ElapsedMilliseconds), ct);. This will remove the last two conditions above.
Choose one method of cancellation. The standard pattern to express cancellation is via OperationCanceledExcpetion; this is the pattern used by ThrowIfCancellationRequested and by Task.Delay. The IsCancellationRequested check is using a different pattern; it will successfully complete the task on cancellation, instead of cancelling it.

There are so many problems with this code, that makes more sense to rewrite it than attempt to fix it. Here is a possible way to rewrite this method, with some (possibly superfluous) argument validation added:
private Task CreateTask(Action action)
{
if (action == null) throw new ArgumentNullException(nameof(action));
var ct = _cts.Token;
var delayMsec = _settings.loopStepFrequency;
if (delayMsec <= 0) throw new ArgumentOutOfRangeException("loopStepFrequency");
return Task.Run(async () =>
{
while (true)
{
var delayTask = Task.Delay(delayMsec, ct);
action();
await delayTask;
}
}, ct);
}
The responsibility for logging a possible exception/cancellation belongs now to the caller of the method, that (hopefully) awaits the created task.
var task = CreateTask(TheAction);
try
{
await task; // If the caller is async
//task.GetAwaiter().GetResult(); // If the caller is sync
_logger.Info("The task completed successfully");
}
catch (OperationCanceledException)
{
_logger.Info("The task was canceled");
}
catch (Exception ex)
{
_logger.Error("The task failed", ex);
}

Recursively call a method with the same thread

I have the following method:
public async Task ScrapeObjects(int page = 1)
{
try
{
while (!isObjectSearchCompleted)
{
..do calls..
}
}
catch (HttpRequestException ex)
{
Thread.Sleep(TimeSpan.FromSeconds(60));
ScrapeObjects(page);
Log.Fatal(ex, ex.Message);
}
}
I call this long running method async and I don't wait for it to finish. Thing is that an exception my occur and in that case I want to handle it. But then I want to start from where I left and with the same thread. At the current state a new thread gets used when I recursively call the method after handling the exception. I would like to keep using the same thread. Is there a way to do so? Thank you!

You probably need to move the try/catch block inside the while loop, and add a counter with the errors occurred, to bail out in case of continuous faulted attempts.
public async Task ScrapeObjects()
{
int failedCount = 0;
int page = 1;
while (!isObjectSearchCompleted)
{
try
{
//..do calls..
}
catch (HttpRequestException ex)
{
failedCount++;
if (failedCount < 3)
{
Log.Info(ex, ex.Message);
await Task.Delay(TimeSpan.FromSeconds(60));
}
else
{
Log.Fatal(ex, ex.Message);
throw; // or return;
}
}
}
}
As a side note it is generally better to await Task.Delay instead of Thread.Sleep inside asynchronous methods, to avoid blocking a thread without a reason.

One simple question before you read the long answer below:
Why you need the same thread? Are you accessing thread static / contextual data?
If yes, there will be ways to solve that easily than limiting your tasks to run on the same thread.
How to limit tasks to run on a single thread
As long as you use async calls on the default synchronization context, and as soon as the code is resumed from an await, it is possible that the thread can change after an await. This is because the default context schedules tasks to the next available thread in the thread pool. Like in the below case, before can be different from after:
public async Task ScrapeObjects(int page = 1)
{
var before = Thread.CurrentThread.ManagedThreadId;
await Task.Delay(1000);
var after = Thread.CurrentThread.ManagedThreadId;
}
The only reliable way to guarantee that your code could come back on the same thread is to schedule your async code onto a single threaded synchronization context:
class SingleThreadSynchronizationContext : SynchronizationContext
{
private readonly BlockingCollection<Action> _actions = new BlockingCollection<Action>();
private readonly Thread _theThread;
public SingleThreadSynchronizationContext()
{
_theThread = new Thread(DoWork);
_theThread.IsBackground = true;
_theThread.Start();
}
public override void Send(SendOrPostCallback d, object state)
{
// Send requires run the delegate immediately.
d(state);
}
public override void Post(SendOrPostCallback d, object state)
{
// Schedule the action by adding to blocking collection.
_actions.Add(() => d(state));
}
private void DoWork()
{
// Keep picking up actions to run from the collection.
while (!_actions.IsAddingCompleted)
{
try
{
var action = _actions.Take();
action();
}
catch (InvalidOperationException)
{
break;
}
}
}
}
And you need to schedule ScrapeObjects to the custom context:
SynchronizationContext.SetSynchronizationContext(new SingleThreadSynchronizationContext());
await Task.Factory.StartNew(
() => ScrapeObjects(),
CancellationToken.None,
TaskCreationOptions.DenyChildAttach | TaskCreationOptions.LongRunning,
TaskScheduler.FromCurrentSynchronizationContext()
).Unwrap();
By doing that, all your async code shall be scheduled to the same context, and run by the thread on that context.
However
This is typically dangerous, as you suddenly lose the ability to use the thread pool. If you block the thread, the entire async operation is blocked, meaning you will have deadlocks.

Another way to cancel Task

I have a simple console application
class Program
{
private static void MyTask(object obj)
{
var cancellationToken = (CancellationToken) obj;
if(cancellationToken.IsCancellationRequested)
cancellationToken.ThrowIfCancellationRequested();
Console.WriteLine("MyTask() started");
for (var i = 0; i < 10; i++)
{
try
{
if (cancellationToken.IsCancellationRequested)
cancellationToken.ThrowIfCancellationRequested();
}
catch (Exception ex)
{
return;
}
Console.WriteLine($"Counter in MyTask() = {i}");
Thread.Sleep(500);
}
Console.WriteLine("MyTask() finished");
}
static void Main(string[] args)
{
var cancelationTokenSource = new CancellationTokenSource();
var task = Task.Factory.StartNew(MyTask, cancelationTokenSource.Token,
cancelationTokenSource.Token);
Thread.Sleep(3000);
try
{
cancelationTokenSource.Cancel();
task.Wait();
}
catch (Exception ex)
{
if(task.IsCanceled)
Console.WriteLine("Task has been cancelled");
Console.WriteLine(ex.Message);
}
finally
{
cancelationTokenSource.Dispose();
task.Dispose();
}
Console.WriteLine("Main finished");
Console.ReadLine();
}
}
I'm trying to start new Task and after some time cancel it. Is there any other way to achieve this result instead of using this
if(cancellationToken.IsCancellationRequested)
cancellationToken.ThrowIfCancellationRequested();
on every iteration in for loop? Why do we have to check cancellationToken.IsCancellationRequested on every iteration, maybe we can to use something else?

In this specific case you could avoid the .ThrowIfCancellationRequested(), and instead simply use a break to stop the execution of the loop and then finish the Task. The ThrowIfCancellationRequested is more useful in deeper task trees, where there are many more descendants and it is more difficult to maintain cancellation.
if (cancellationToken.IsCancellationRequested)
{
break;
}
Stephen Toub has a good explanation on how the throwing of the OCE is more of an acknowledgement.
If the body of the task is also monitoring the cancellation token and throws an OperationCanceledException containing that token (which is what ThrowIfCancellationRequested does), then when the task sees that OCE, it checks whether the OCE's token matches the Task's token. If it does, that exception is viewed as an acknowledgement of cooperative cancellation and the Task transitions to the Canceled state (rather than the Faulted state).

Not sure what your objection to checking on every iteration is, but if you do not want to check on every iteration, for whatever reason, check the value of i:
E.g. this checks every 10th loop:
if (i % 10 == 0 && cancellationToken.IsCancellationRequested)
The reason you must check is so that you can decide where is best to stop the task so that work is not left in an inconsistent state. Here though all you will achieve by checking less frequently is a task that is slower to cancel, maybe that will lead to a less responsive UX. E.g. where the task would end in 500ms before, now it can take up to 10x that, 5 secs.
However if each loop was very fast, and checking the flag every loop proves to significantly increase the overall time the task takes, then checking every n loops makes sence.

Cancellation Token source and nested tasks

I have a doubt with cancellation token source which I am using as shown in the below code:
void Process()
{
//for the sake of simplicity I am taking 1, in original implementation it is more than 1
var cancellationToken = _cancellationTokenSource.Token;
Task[] tArray = new Task[1];
tArray[0] = Task.Factory.StartNew(() =>
{
cancellationToken.ThrowIfCancellationRequested();
//do some work here
MainTaskRoutine();
}, cancellationToken);
try
{
Task.WaitAll(tArray);
}
catch (Exception ex)
{
//do error handling here
}
}
void MainTaskRoutine()
{
//for the sake of simplicity I am taking 1, in original implementation it is more than 1
//this method shows that a nested task is created
var cancellationToken = _cancellationTokenSource.Token;
Task[] tArray = new Task[1];
tArray[0] = Task.Factory.StartNew(() =>
{
cancellationToken.ThrowIfCancellationRequested();
//do some work here
}, cancellationToken);
try
{
Task.WaitAll(tArray);
}
catch (Exception ex)
{
//do error handling here
}
}
Edit: further elaboration
End objective is: when a user cancels the operation, all the immediate pending tasks(either children or grand children) should cancel.
Scenario:
As per above code:
1. I first check whether user has asked for cancellation
2. If user has not asked for cancellation then only continue with the task (Please see Process method).
sample code shows only one task here but actually there can be three or more
Lets say that CPU started processing Task1 while other tasks are still in the Task queue waiting for some CPU to come and execute them.
User requests cancellation: Task 2,3 in Process method are immediately cancelled, but Task 1 will continue to work since it is already undergoing processing.
In Task 1 it calls method MainTaskRoutine, which in turn creates more tasks.
In the function of MainTaskRoutine I have written: cancellationToken.ThrowIfCancellationRequested();
So the question is: is it correct way of using CancellationTokenSource as it is dependent on Task.WaitAll()?

[EDITED] As you use an array in your code, I assume there could be multiple tasks, not just one. I also assume that within each task that you're starting from Process you want to do some CPU-bound work first (//do some work here), and then run MainTaskRoutine.
How you handle task cancellation exceptions is determined by your project design workflow. E.g., you could do it inside Process method, or from where you call Process. If your only concern is to remove Task objects from the array where you keep track of the pending tasks, this can be done using Task.ContinueWith. The continuation will be executed regardless of the task's completion status (Cancelled, Faulted or RanToCompletion):
Task Process(CancellationToken cancellationToken)
{
var tArray = new List<Task>();
var tArrayLock = new Object();
var task = Task.Run(() =>
{
cancellationToken.ThrowIfCancellationRequested();
//do some work here
return MainTaskRoutine(cancellationToken);
}, cancellationToken);
// add the task to the array,
// use lock as we may remove tasks from this array on a different thread
lock (tArrayLock)
tArray.Add(task);
task.ContinueWith((antecedentTask) =>
{
if (antecedentTask.IsCanceled || antecedentTask.IsFaulted)
{
// handle cancellation or exception inside the task
// ...
}
// remove task from the array,
// could be on a different thread from the Process's thread, use lock
lock (tArrayLock)
tArray.Remove(antecedentTask);
}, TaskContinuationOptions.ExecuteSynchronously);
// add more tasks like the above
// ...
// Return aggregated task
Task[] allTasks = null;
lock (tArrayLock)
allTasks = tArray.ToArray();
return Task.WhenAll(allTasks);
}
Your MainTaskRoutine can be structured in exactly the same way as Process, and have the same method signature (return a Task).
Then you may want to perform a blocking wait on the aggregated task returned by Process, or handle its completion asynchronously, e.g:
// handle the completion asynchronously with a blocking wait
void RunProcessSync()
{
try
{
Process(_cancellationTokenSource.Token).Wait();
MessageBox.Show("Process complete");
}
catch (Exception e)
{
MessageBox.Show("Process cancelled (or faulted): " + e.Message);
}
}
// handle the completion asynchronously using ContinueWith
Task RunProcessAync()
{
return Process(_cancellationTokenSource.Token).ContinueWith((task) =>
{
// check task.Status here
MessageBox.Show("Process complete (or cancelled, or faulted)");
}, TaskScheduler.FromCurrentSynchronizationContext());
}
// handle the completion asynchronously with async/await
async Task RunProcessAync()
{
try
{
await Process(_cancellationTokenSource.Token);
MessageBox.Show("Process complete");
}
catch (Exception e)
{
MessageBox.Show("Process cancelled (or faulted): " + e.Message);
}
}

After doing some research I found this link.
The code now looks like this:
see the usage of CancellationTokenSource.CreateLinkedTokenSource in below code
void Process()
{
//for the sake of simplicity I am taking 1, in original implementation it is more than 1
var cancellationToken = _cancellationTokenSource.Token;
Task[] tArray = new Task[1];
tArray[0] = Task.Factory.StartNew(() =>
{
cancellationToken.ThrowIfCancellationRequested();
//do some work here
MainTaskRoutine(cancellationToken);
}, cancellationToken);
try
{
Task.WaitAll(tArray);
}
catch (Exception ex)
{
//do error handling here
}
}
void MainTaskRoutine(CancellationToken cancellationToken)
{
//for the sake of simplicity I am taking 1, in original implementation it is more than 1
//this method shows that a nested task is created
using (var cancellationTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
{
var cancelToken = cancellationTokenSource.Token;
Task[] tArray = new Task[1];
tArray[0] = Task.Factory.StartNew(() =>
{
cancelToken.ThrowIfCancellationRequested();
//do some work here
}, cancelToken);
try
{
Task.WaitAll(tArray);
}
catch (Exception ex)
{
//do error handling here
}
}
}
Note: I haven't used it, but I will let You know once it is done :)

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.