Implementing correct completion of a retryable block - C#

Teaser: this question is not about how to implement a retry policy. It's about correctly completing a TPL Dataflow block.
This question is mostly a continuation of my previous question Retry policy within ITargetBlock. The answer to that question was svick's smart solution that utilizes a TransformBlock (source) and a TransformManyBlock (target). The only problem left is to complete this block the right way: wait for all the retries to complete first, and then complete the target block. Here is what I ended up with (it's just a snippet; don't pay too much attention to the non-thread-safe retries set):
var retries = new HashSet<RetryableMessage<TInput>>();
TransformManyBlock<RetryableMessage<TInput>, TOutput> target = null;
target = new TransformManyBlock<RetryableMessage<TInput>, TOutput>(
    async message =>
    {
        try
        {
            var result = new[] { await transform(message.Data) };
            retries.Remove(message);
            return result;
        }
        catch (Exception ex)
        {
            message.Exceptions.Add(ex);
            if (message.RetriesRemaining == 0)
            {
                if (failureHandler != null)
                    failureHandler(message.Exceptions);
                retries.Remove(message);
            }
            else
            {
                retries.Add(message);
                message.RetriesRemaining--;
                Task.Delay(retryDelay)
                    .ContinueWith(_ => target.Post(message));
            }
            return null;
        }
    }, dataflowBlockOptions);
source.LinkTo(target);
source.Completion.ContinueWith(async _ =>
{
    while (target.InputCount > 0 || retries.Any())
        await Task.Delay(100);
    target.Complete();
});
The idea is to perform some kind of polling and verify whether there are still messages waiting to be processed and whether there are messages that require retrying. But I don't like the idea of polling in this solution.
Yes, I can encapsulate the logic of adding/removing retries into a separate class, and even, for example, perform some action when the set of retries becomes empty, but how do I deal with the target.InputCount > 0 condition? There is no callback that gets called when there are no pending messages for the block, so it seems that checking target.InputCount in a loop with a small delay is the only option.
Does anybody know a smarter way to achieve this?

Maybe a ManualResetEvent can do the trick for you.
Add a public property to TransformManyBlock
private ManualResetEvent _signal = new ManualResetEvent(false);
public ManualResetEvent Signal { get { return _signal; } }
And here you go:
var retries = new HashSet<RetryableMessage<TInput>>();
TransformManyBlock<RetryableMessage<TInput>, TOutput> target = null;
target = new TransformManyBlock<RetryableMessage<TInput>, TOutput>(
    async message =>
    {
        try
        {
            var result = new[] { await transform(message.Data) };
            retries.Remove(message);
            // Sets the state of the event to signaled, allowing one or more waiting threads to proceed
            if (!retries.Any()) Signal.Set();
            return result;
        }
        catch (Exception ex)
        {
            message.Exceptions.Add(ex);
            if (message.RetriesRemaining == 0)
            {
                if (failureHandler != null)
                    failureHandler(message.Exceptions);
                retries.Remove(message);
                // Sets the state of the event to signaled, allowing one or more waiting threads to proceed
                if (!retries.Any()) Signal.Set();
            }
            else
            {
                retries.Add(message);
                message.RetriesRemaining--;
                Task.Delay(retryDelay)
                    .ContinueWith(_ => target.Post(message));
            }
            return null;
        }
    }, dataflowBlockOptions);
source.LinkTo(target);
source.Completion.ContinueWith(async _ =>
{
    // Blocks the current thread until the current WaitHandle receives a signal.
    target.Signal.WaitOne();
    target.Complete();
});
I am not sure where your target.InputCount is set. So at the place where you change target.InputCount you can add the following code:
if(InputCount == 0) Signal.Set();
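Note that TransformManyBlock is sealed, so the Signal property can't literally be added to the block itself; it has to live on a wrapper or companion object that the transform delegate closes over. A minimal sketch of that idea (all names here are illustrative, not part of TPL Dataflow):
// Companion object tracking outstanding work; the transform delegate closes over it.
public class WorkTracker
{
    private readonly ManualResetEvent _signal = new ManualResetEvent(false);
    private int _pending; // messages queued or awaiting a retry

    public ManualResetEvent Signal { get { return _signal; } }

    // Call when posting (or re-posting) a message to the block.
    public void MessageEnqueued()
    {
        _signal.Reset();
        Interlocked.Increment(ref _pending);
    }

    // Call when a message succeeds or runs out of retries.
    public void MessageDone()
    {
        if (Interlocked.Decrement(ref _pending) == 0)
            _signal.Set(); // no work left: wake the completion continuation
    }
}
The completion continuation would then wait on tracker.Signal instead of a property on the block.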

Combining hwcverwe's answer and JamieSee's comment could be the ideal solution.
First, you need to create more than one event:
var signal = new ManualResetEvent(false);
var completedEvent = new ManualResetEvent(false);
Then, you have to create an observer and subscribe to the TransformManyBlock, so you are notified when a relevant event happens:
var observer = new RetryingBlockObserver<TOutput>(completedEvent);
var observable = target.AsObservable();
observable.Subscribe(observer);
The observer can be quite simple:
private class RetryingBlockObserver<T> : IObserver<T>
{
    private ManualResetEvent completedEvent;

    public RetryingBlockObserver(ManualResetEvent completedEvent)
    {
        this.completedEvent = completedEvent;
    }

    public void OnCompleted()
    {
        completedEvent.Set();
    }

    public void OnError(Exception error)
    {
        //TODO
    }

    public void OnNext(T value)
    {
        //TODO
    }
}
And you can wait for either the signal, or completion (exhaustion of all the source items), or both
source.Completion.ContinueWith(async _ =>
{
    WaitHandle.WaitAll(new WaitHandle[] { completedEvent, signal });
    // Or WaitHandle.WaitAny, depending on your needs!
    target.Complete();
});
If you use WaitAny, you can inspect its return value (the index of the first handle that was signalled) to understand which event was set, and react accordingly.
You can also add other events to the code, passing them to the observer so that it can set them when needed. You can differentiate your behaviour and respond differently when an error is raised, for example.
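For instance, a variation of the observer that signals a separate event on error might look like this (a sketch; the extra errorEvent is my assumption, not part of the answer above):
private class RetryingBlockObserver<T> : IObserver<T>
{
    private readonly ManualResetEvent completedEvent;
    private readonly ManualResetEvent errorEvent; // hypothetical extra event

    public RetryingBlockObserver(ManualResetEvent completedEvent, ManualResetEvent errorEvent)
    {
        this.completedEvent = completedEvent;
        this.errorEvent = errorEvent;
    }

    public void OnCompleted()
    {
        completedEvent.Set();
    }

    public void OnError(Exception error)
    {
        errorEvent.Set(); // lets the waiter distinguish failure from normal completion
    }

    public void OnNext(T value) { }
}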

Related

C# Abortable Asynchronous Fifo Queue - leaking massive amounts of memory

I need to process data from a producer in FIFO fashion with the ability to abort processing if the same producer produces a new bit of data.
So I implemented an abortable FIFO queue based on Stephen Cleary's AsyncCollection (called AsyncCollectionAbortableFifoQueue in my sample) and one on TPL's BufferBlock (BufferBlockAbortableAsyncFifoQueue in my sample). Here's the implementation based on AsyncCollection:
public class AsyncCollectionAbortableFifoQueue<T> : IExecutableAsyncFifoQueue<T>
{
    private AsyncCollection<AsyncWorkItem<T>> taskQueue = new AsyncCollection<AsyncWorkItem<T>>();
    private readonly CancellationToken stopProcessingToken;

    public AsyncCollectionAbortableFifoQueue(CancellationToken cancelToken)
    {
        stopProcessingToken = cancelToken;
        _ = processQueuedItems();
    }

    public Task<T> EnqueueTask(Func<Task<T>> action, CancellationToken? cancelToken)
    {
        var tcs = new TaskCompletionSource<T>();
        var item = new AsyncWorkItem<T>(tcs, action, cancelToken);
        taskQueue.Add(item);
        return tcs.Task;
    }

    protected virtual async Task processQueuedItems()
    {
        while (!stopProcessingToken.IsCancellationRequested)
        {
            try
            {
                var item = await taskQueue.TakeAsync(stopProcessingToken).ConfigureAwait(false);
                if (item.CancelToken.HasValue && item.CancelToken.Value.IsCancellationRequested)
                    item.TaskSource.SetCanceled();
                else
                {
                    try
                    {
                        T result = await item.Action().ConfigureAwait(false);
                        item.TaskSource.SetResult(result); // Indicate completion
                    }
                    catch (Exception ex)
                    {
                        if (ex is OperationCanceledException && ((OperationCanceledException)ex).CancellationToken == item.CancelToken)
                            item.TaskSource.SetCanceled();
                        else
                            item.TaskSource.SetException(ex);
                    }
                }
            }
            catch (Exception) { }
        }
    }
}
public interface IExecutableAsyncFifoQueue<T>
{
    Task<T> EnqueueTask(Func<Task<T>> action, CancellationToken? cancelToken);
}
processQueuedItems is the task that dequeues AsyncWorkItems from the queue and executes them unless cancellation has been requested.
The asynchronous action to execute gets wrapped into an AsyncWorkItem, which looks like this:
internal class AsyncWorkItem<T>
{
    public readonly TaskCompletionSource<T> TaskSource;
    public readonly Func<Task<T>> Action;
    public readonly CancellationToken? CancelToken;

    public AsyncWorkItem(TaskCompletionSource<T> taskSource, Func<Task<T>> action, CancellationToken? cancelToken)
    {
        TaskSource = taskSource;
        Action = action;
        CancelToken = cancelToken;
    }
}
Then there's a task that watches the queue, dequeuing items and either processing them or aborting if the CancellationToken has been triggered.
That all works just fine - data gets processed, and if a new piece of data is received, processing of the old is aborted. My problem now stems from these queues leaking massive amounts of memory if I crank up the usage (the producer producing a lot more than the consumer processes). Given that it's abortable, the data that is not processed should be discarded and eventually disappear from memory.
So let's look at how I'm using these queues. I have a 1:1 match of producer and consumer. Every consumer handles data of a single producer. Whenever I get a new data item and it doesn't match the previous one, I fetch the queue for the given producer (User.UserId) or create a new one (the 'executor' in the code snippet). Then I have a ConcurrentDictionary that holds a CancellationTokenSource per producer/consumer combo. If there's a previous CancellationTokenSource, I call Cancel on it and Dispose it 20 seconds later (immediate disposal would cause exceptions in the queue). I then enqueue processing of the new data. The queue returns a task that I can await so I know when processing of the data is complete, and I then return the result.
Here's that in code
internal class SimpleLeakyConsumer
{
    private ConcurrentDictionary<string, IExecutableAsyncFifoQueue<bool>> groupStateChangeExecutors = new ConcurrentDictionary<string, IExecutableAsyncFifoQueue<bool>>();
    private readonly ConcurrentDictionary<string, CancellationTokenSource> userStateChangeAborters = new ConcurrentDictionary<string, CancellationTokenSource>();
    protected CancellationTokenSource serverShutDownSource;
    private readonly int operationDuration = 1000;

    internal SimpleLeakyConsumer(CancellationTokenSource serverShutDownSource, int operationDuration)
    {
        this.serverShutDownSource = serverShutDownSource;
        this.operationDuration = operationDuration * 1000; // convert from seconds to milliseconds
    }

    internal async Task<bool> ProcessStateChange(string userId)
    {
        var executor = groupStateChangeExecutors.GetOrAdd(userId, new AsyncCollectionAbortableFifoQueue<bool>(serverShutDownSource.Token));
        CancellationTokenSource oldSource = null;
        using (var cancelSource = userStateChangeAborters.AddOrUpdate(userId, new CancellationTokenSource(), (key, existingValue) =>
        {
            oldSource = existingValue;
            return new CancellationTokenSource();
        }))
        {
            if (oldSource != null && !oldSource.IsCancellationRequested)
            {
                oldSource.Cancel();
                _ = delayedDispose(oldSource);
            }
            try
            {
                var executionTask = executor.EnqueueTask(async () => { await Task.Delay(operationDuration, cancelSource.Token).ConfigureAwait(false); return true; }, cancelSource.Token);
                var result = await executionTask.ConfigureAwait(false);
                userStateChangeAborters.TryRemove(userId, out var aborter);
                return result;
            }
            catch (Exception e)
            {
                if (e is TaskCanceledException || e is OperationCanceledException)
                    return true;
                else
                {
                    userStateChangeAborters.TryRemove(userId, out var aborter);
                    return false;
                }
            }
        }
    }

    private async Task delayedDispose(CancellationTokenSource src)
    {
        try
        {
            await Task.Delay(20 * 1000).ConfigureAwait(false);
        }
        finally
        {
            try
            {
                src.Dispose();
            }
            catch (ObjectDisposedException) { }
        }
    }
}
In this sample implementation, all that is done is wait, then return true.
To test this mechanism, I wrote the following data producer class:
internal class SimpleProducer
{
    //variables defining the test
    readonly int nbOfusers = 10;
    readonly int minimumDelayBetweenTest = 1; // seconds
    readonly int maximumDelayBetweenTests = 6; // seconds
    readonly int operationDuration = 3; // number of seconds an operation takes in the tester

    private readonly Random rand;
    private List<User> users;
    private readonly SimpleLeakyConsumer consumer;
    protected CancellationTokenSource serverShutDownSource, testAbortSource;
    private CancellationToken internalToken = CancellationToken.None;

    internal SimpleProducer()
    {
        rand = new Random();
        testAbortSource = new CancellationTokenSource();
        serverShutDownSource = new CancellationTokenSource();
        generateTestObjects(nbOfusers, 0, false);
        consumer = new SimpleLeakyConsumer(serverShutDownSource, operationDuration);
    }

    internal void StartTests()
    {
        if (internalToken == CancellationToken.None || internalToken.IsCancellationRequested)
        {
            internalToken = testAbortSource.Token;
            foreach (var user in users)
                _ = setNewUserPresence(internalToken, user);
        }
    }

    internal void StopTests()
    {
        testAbortSource.Cancel();
        try
        {
            testAbortSource.Dispose();
        }
        catch (ObjectDisposedException) { }
        testAbortSource = new CancellationTokenSource();
    }

    internal void Shutdown()
    {
        serverShutDownSource.Cancel();
    }

    private async Task setNewUserPresence(CancellationToken token, User user)
    {
        while (!token.IsCancellationRequested)
        {
            var nextInterval = rand.Next(minimumDelayBetweenTest, maximumDelayBetweenTests);
            try
            {
                await Task.Delay(nextInterval * 1000, testAbortSource.Token).ConfigureAwait(false);
            }
            catch (TaskCanceledException)
            {
                break;
            }
            //now randomly generate a new state and submit it to the tester class
            UserState? status;
            var nbStates = Enum.GetValues(typeof(UserState)).Length;
            if (user.CurrentStatus == null)
            {
                var newInt = rand.Next(nbStates);
                status = (UserState)newInt;
            }
            else
            {
                do
                {
                    var newInt = rand.Next(nbStates);
                    status = (UserState)newInt;
                }
                while (status == user.CurrentStatus);
            }
            _ = sendUserStatus(user, status.Value);
        }
    }

    private async Task sendUserStatus(User user, UserState status)
    {
        await consumer.ProcessStateChange(user.UserId).ConfigureAwait(false);
    }

    private void generateTestObjects(int nbUsers, int nbTeams, bool addAllUsersToTeams = false)
    {
        users = new List<User>();
        for (int i = 0; i < nbUsers; i++)
        {
            var usr = new User
            {
                UserId = $"User_{i}",
                Groups = new List<Team>()
            };
            users.Add(usr);
        }
    }
}
It uses the variables at the beginning of the class to control the test. You can define the number of users (nbOfusers - every user is a producer that produces new data), the minimum (minimumDelayBetweenTest) and maximum (maximumDelayBetweenTests) delay between a user producing the next data and how long it takes the consumer to process the data (operationDuration).
StartTests starts the actual test, and StopTests stops the tests again.
I'm calling these as follows
static void Main(string[] args)
{
    var tester = new SimpleProducer();
    Console.WriteLine("Test successfully started, type exit to stop");
    string str;
    do
    {
        str = Console.ReadLine();
        if (str == "start")
            tester.StartTests();
        else if (str == "stop")
            tester.StopTests();
    }
    while (str != "exit");
    tester.Shutdown();
}
So, if I run my tester and type 'start', the producer class starts producing states that are consumed by the consumer. And memory usage starts to grow and grow and grow. The sample is configured to the extreme; the real-life scenario I'm dealing with is less intensive, but one action of the producer could trigger multiple actions on the consumer side which also have to be executed in the same asynchronous, abortable FIFO fashion - so worst case, one set of data produced triggers an action for ~10 consumers (that last part I stripped out for brevity).
When I have 100 producers, each producing a new data item every 1-6 seconds (randomly; the data produced is also random), and consuming the data takes 3 seconds, there are plenty of cases where a new set of data arrives before the old one has been properly processed.
Looking at two consecutive memory dumps, it's obvious where the memory usage is coming from: it's all fragments that have to do with the queue. Given that I'm disposing every CancellationTokenSource and not keeping any references to the produced data (or the AsyncWorkItem it's wrapped in), I'm at a loss to explain why this keeps eating up my memory, and I'm hoping somebody can show me the error of my ways. You can also abort testing by typing 'stop'; you'll see that memory is no longer being eaten, but even if you pause and trigger GC, memory is not freed either.
The source code of the project in runnable form is on GitHub. After starting it, you have to type start (plus Enter) in the console to tell the producer to start producing data, and you can stop producing data by typing stop (plus Enter).
Your code has so many issues that it's impossible to find the leak through debugging. But here are several things that are already an issue and should be fixed first:
It looks like getQueue creates a new queue for the same user each time processUserStateUpdateAsync gets called and does not reuse existing queues:
var executor = groupStateChangeExecutors.GetOrAdd(user.UserId, getQueue());
A CancellationTokenSource leaks on each call of the code below, as a new value is created each time the method AddOrUpdate is called; it should not be passed there that way:
userStateChangeAborters.AddOrUpdate(user.UserId, new CancellationTokenSource(), (key, existingValue
Also, the code below should return the same cts that you pass in as the new value, in case the dictionary has no entry for the specific user.UserId:
return new CancellationTokenSource();
There is also a potential leak of the cancelSource variable, as it gets bound to a delegate which can live longer than you want; it's better to pass a concrete CancellationToken there:
executor.EnqueueTask(() => processUserStateUpdateAsync(user, state, previousState,
cancelSource.Token));
For some reason you do not dispose aborter here and in one more place:
userStateChangeAborters.TryRemove(user.UserId, out var aborter);
Creation of the Channel can also have potential leaks:
taskQueue = Channel.CreateBounded<AsyncWorkItem<T>>(new BoundedChannelOptions(1)
You picked the option FullMode = BoundedChannelFullMode.DropOldest, which removes the oldest values if there are any, so I assume that stops queued items from being processed, as they are never read. It's a hypothesis, but I assume that if an old item is removed without being handled, then processUserStateUpdateAsync won't get called and the associated resources won't be freed.
You can start with these found issues, and it should be easier to find the real cause after that.
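To test the DropOldest hypothesis: newer versions of System.Threading.Channels offer a CreateBounded overload taking a callback that is invoked for every dropped item, and completing the dropped item's TaskCompletionSource there would release its awaiter and any captured resources. A sketch, assuming the AsyncWorkItem<T> shape shown earlier:
// Sketch: the itemDropped callback completes evicted items so their awaiters
// and captured state are released instead of lingering in memory.
taskQueue = Channel.CreateBounded<AsyncWorkItem<T>>(
    new BoundedChannelOptions(1)
    {
        FullMode = BoundedChannelFullMode.DropOldest
    },
    droppedItem => droppedItem.TaskSource.TrySetCanceled());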

C# event exception not caught in parent method

I'm working on implementing a get method for a cache. This method will return to the caller if a maximum wait time has passed (in my case 100 ms for tests).
My issue is that the exception NEVER reaches the catch after the timer triggers the event.
Please help me understand why. (I read that events are executed on the same thread, so that shouldn't be the issue.)
public static T Get<T>(string key, int? maxMilisecondsForResponse = null)
{
    var result = default(T);
    try
    {
        // Return default if time expired
        if (maxMilisecondsForResponse.HasValue)
        {
            var timer = new System.Timers.Timer(maxMilisecondsForResponse.Value);
            timer.Elapsed += OnTimerElapsed;
            timer.AutoReset = false;
            timer.Enabled = true; // start the timer
        }
        var externalCache = new CacheServiceClient(BindingName);
        Thread.Sleep(3000); // just for testing
    }
    catch (Exception ex)
    {
        // why is the exception not caught here?
    }
    return result;
}

private static void OnTimerElapsed(object source, System.Timers.ElapsedEventArgs e)
{
    throw new Exception("Timer elapsed");
}
The timer fires on its own thread. You can read more about it in this answer.
The answer to your question is to use async methods that can be cancelled. Then you can use a cancellation token source and do it the proper way instead of homebrewing a solution with timers.
You can find a good overview here.
For example:
cts = new CancellationTokenSource();
cts.CancelAfter(2500);
await Task.Delay(10000, cts.Token);
This would cancel the waiting task after 2500 ms (of the 10000) because it took too long. Obviously you need to insert your own logic in a task instead of just waiting.
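Applied to the cache scenario from the question, that could look roughly like the sketch below; FetchFromCacheAsync is a hypothetical cancellable call standing in for whatever CacheServiceClient actually exposes:
public static async Task<T> GetAsync<T>(string key, int? maxMillisecondsForResponse = null)
{
    try
    {
        using (var cts = new CancellationTokenSource())
        {
            if (maxMillisecondsForResponse.HasValue)
                cts.CancelAfter(maxMillisecondsForResponse.Value);

            // FetchFromCacheAsync is illustrative: any cache lookup that accepts a token.
            return await FetchFromCacheAsync<T>(key, cts.Token);
        }
    }
    catch (OperationCanceledException)
    {
        return default(T); // time expired: fall back to the default value
    }
}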
From MSDN:
"The Timer component catches and suppresses all exceptions thrown by event handlers for the Elapsed event. This behavior is subject to change in future releases of the .NET Framework."
And it continues:
"Note, however, that this is not true of event handlers that execute asynchronously and include the await operator (in C#) or the Await operator (in Visual Basic). Exceptions thrown in these event handlers are propagated back to the calling thread."
Please take a look at Exception Handling (Task Parallel Library).
An applied example below:
public class Program
{
    static void Main()
    {
        Console.WriteLine("Begin");
        Get<string>("key", 1000);
        Console.WriteLine("End");
    }

    public static T Get<T>(string key, int? maxMilisecondsForResponse = null)
    {
        var result = default(T);
        try
        {
            var task = Task.Run(async () =>
            {
                await Task.Delay(maxMilisecondsForResponse.Value);
                throw new Exception("Timer elapsed");
            });
            task.Wait();
        }
        catch (Exception ex)
        {
            // here the exception IS caught, wrapped in an AggregateException by task.Wait()
            Console.WriteLine(ex);
        }
        return result;
    }
}
The timer executes on its own thread, so you can't catch the exception at the caller level. Using a timer is therefore not a good approach in this case; you can change it by creating a Task operation instead.
var result = default(T);
CacheServiceClient externalCache;
if (!Task.Run(() =>
{
    externalCache = new CacheServiceClient(BindingName);
    return externalCache;
}).Wait(100)) // wait up to 100 ms for the operation to complete
{
    throw new Exception("Task is not completed!");
}
// Do something
return result;

Task-based idle detection

It's not unusual to want a limit on the interval between certain events, and take action if the limit is exceeded. For example, a heartbeat message between network peers for detecting that the other end is alive.
In the C# async/await style, it is possible to implement that by replacing the timeout task each time the heartbeat arrives:
var client = new TcpClient { ... };
await client.ConnectAsync(...);
Task heartbeatLost = Task.Delay(HEARTBEAT_LOST_THRESHOLD);
while (...)
{
    Task<int> readTask = client.GetStream().ReadAsync(buffer, 0, buffer.Length);
    Task first = await Task.WhenAny(heartbeatLost, readTask);
    if (first == readTask)
    {
        if (ProcessData(buffer, 0, readTask.Result).HeartbeatFound)
        {
            heartbeatLost = Task.Delay(HEARTBEAT_LOST_THRESHOLD);
        }
    }
    else if (first == heartbeatLost)
    {
        TellUserPeerIsDown();
        break;
    }
}
This is convenient, but each instance of the delay Task owns a Timer, and if many heartbeat packets arrive in less time than the threshold, that's a lot of Timer objects loading the threadpool. Also, the completion of each Timer will run code on the threadpool, whether there's any continuation still linked to it or not.
You can't free the old Timer by calling heartbeatLost.Dispose(); that'll give an exception
InvalidOperationException: A task may only be disposed if it is in a completion state
One could create a CancellationTokenSource and use it to cancel the old delay task, but it seems suboptimal to create even more objects to accomplish this, when timers themselves have the feature of being reschedulable.
What's the best way to integrate timer rescheduling, so that the code could be structured more like this?
var client = new TcpClient { ... };
await client.ConnectAsync(...);
var idleTimeout = new TaskDelayedCompletionSource(HEARTBEAT_LOST_THRESHOLD);
Task heartbeatLost = idleTimeout.Task;
while (...)
{
    Task<int> readTask = client.GetStream().ReadAsync(buffer, 0, buffer.Length);
    Task first = await Task.WhenAny(heartbeatLost, readTask);
    if (first == readTask)
    {
        if (ProcessData(buffer, 0, readTask.Result).HeartbeatFound)
        {
            idleTimeout.ResetDelay(HEARTBEAT_LOST_THRESHOLD);
        }
    }
    else if (first == heartbeatLost)
    {
        TellUserPeerIsDown();
        break;
    }
}
Seems pretty straightforward to me; the name of your hypothetical class gets you most of the way there. All you need is a TaskCompletionSource and a single timer you keep resetting.
public class TaskDelayedCompletionSource
{
    private TaskCompletionSource<bool> _completionSource;
    private readonly System.Threading.Timer _timer;
    private readonly object _lockObject = new object();

    public TaskDelayedCompletionSource(int interval)
    {
        _completionSource = CreateCompletionSource();
        _timer = new Timer(OnTimerCallback);
        _timer.Change(interval, Timeout.Infinite);
    }

    private static TaskCompletionSource<bool> CreateCompletionSource()
    {
        // Note: the TaskCompletionSource constructor only accepts AttachedToParent and
        // RunContinuationsAsynchronously; other flags throw ArgumentOutOfRangeException.
        return new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
    }

    private void OnTimerCallback(object state)
    {
        // Cache a copy of the completion source before we enter the lock,
        // so we don't complete the wrong source if ResetDelay is in the middle of being called.
        var completionSource = _completionSource;
        lock (_lockObject)
        {
            completionSource.TrySetResult(true);
        }
    }

    public void ResetDelay(int interval)
    {
        lock (_lockObject)
        {
            var oldSource = _completionSource;
            _timer.Change(interval, Timeout.Infinite);
            _completionSource = CreateCompletionSource();
            oldSource.TrySetCanceled();
        }
    }

    public Task Task => _completionSource.Task;
}
This will only create a single timer and update it; the task completes when the timer fires.
You will need to change your code slightly: because a new TaskCompletionSource gets created every time you update the end time, you need to move the Task heartbeatLost = idleTimeout.Task; assignment inside the while loop.
var client = new TcpClient { ... };
await client.ConnectAsync(...);
var idleTimeout = new TaskDelayedCompletionSource(HEARTBEAT_LOST_THRESHOLD);
while (...)
{
    Task heartbeatLost = idleTimeout.Task;
    Task<int> readTask = client.GetStream().ReadAsync(buffer, 0, buffer.Length);
    Task first = await Task.WhenAny(heartbeatLost, readTask);
    if (first == readTask)
    {
        if (ProcessData(buffer, 0, readTask.Result).HeartbeatFound)
        {
            idleTimeout.ResetDelay(HEARTBEAT_LOST_THRESHOLD);
        }
    }
    else if (first == heartbeatLost)
    {
        TellUserPeerIsDown();
        break;
    }
}
EDIT: If you were concerned about the object creation of the completion sources (for example, you are programming in a game engine where GC collection is a large concern), you may be able to add extra logic to OnTimerCallback and ResetDelay to reuse the completion source if the callback has not happened yet and you know for sure you are not inside of a ResetDelay.
You will likely need to switch from using a lock to a SemaphoreSlim and change the callback to
private void OnTimerCallback(object state)
{
    if (_semaphore.Wait(0))
    {
        _completionSource.TrySetResult(true);
    }
}
I may update this answer later to include what ResetDelay would look like too, but I don't have time right now.
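For completeness, one possible shape of the matching ResetDelay under that SemaphoreSlim scheme (my sketch, not the answerer's code): hold the semaphore so the callback can't fire mid-reset, and only allocate a new source if the old one has already completed.
public void ResetDelay(int interval)
{
    _semaphore.Wait();
    try
    {
        // If the callback already completed the old source, hand out a fresh one;
        // otherwise the still-pending source can simply be reused (no allocation).
        if (_completionSource.Task.IsCompleted)
            _completionSource = CreateCompletionSource();
        _timer.Change(interval, Timeout.Infinite);
    }
    finally
    {
        _semaphore.Release();
    }
}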

Run X number of Task<T> at any given time while keeping UI responsive

I have a C# WinForms (.NET 4.5.2) app utilizing the TPL. The tool has a synchronous function which is passed over to a task factory X times (with different input parameters), where X is a number declared by the user before commencing the process. The tasks are started and stored in a List<Task>.
Assuming the user entered 5, we have this in an async button click handler:
for (int i = 0; i < X; i++)
{
    var progress = Progress(); // returns a new IProgress<T>
    var task = Task<int>.Factory.StartNew(() => MyFunction(progress), TaskCreationOptions.LongRunning);
    TaskList.Add(task);
}
Each progress instance updates the UI.
Now, as soon as a task is finished, I want to fire up a new one. Essentially, the process should run indefinitely, having X tasks running at any given time, unless the user cancels via the UI (I'll use cancellation tokens for this). I try to achieve this using the following:
while (TaskList.Count > 0)
{
    var completed = await Task.WhenAny(TaskList.ToArray());
    if (completed.Exception == null)
    {
        // report success
    }
    else
    {
        // flatten AggregateException, print out, etc
    }
    // update some labels/textboxes in the UI, and then:
    TaskList.Remove(completed);
    var task = Task<int>.Factory.StartNew(() => MyFunction(progress), TaskCreationOptions.LongRunning);
    TaskList.Add(task);
}
This is bogging down the UI. Is there a better way of achieving this functionality while keeping the UI responsive?
A suggestion was made in the comments to use TPL Dataflow, but due to time constraints and specs, alternative solutions are welcome.
Update
I'm not sure whether the progress reporting might be the problem. Here's what it looks like:
private IProgress<string> Progress()
{
    return new Progress<string>(msg =>
    {
        txtMsg.AppendText(msg);
    });
}
Now, as soon as a task is finished, I want to fire up a new one. Essentially, the process should run indefinitely, having X tasks running at any given time
It sounds to me like you want an infinite loop inside your task:
for (int i = 0; i < X; i++)
{
    var progress = Progress(); // returns a new IProgress<T>
    var task = RunIndefinitelyAsync(progress);
    TaskList.Add(task);
}
private async Task RunIndefinitelyAsync(IProgress<string> progress)
{
    while (true)
    {
        try
        {
            await Task.Run(() => MyFunction(progress));
            // handle success
        }
        catch (Exception ex)
        {
            // handle exceptions
        }
        // update some labels/textboxes in the UI
    }
}
However, I suspect that the "bogging down the UI" is probably in the // handle success and/or // handle exceptions code. If my suspicion is correct, then push as much of the logic into the Task.Run as possible.
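Concretely, that means doing the result/exception handling inside the awaited delegate and leaving only a cheap UI update outside; a sketch of that shape (lblStatus and the message format are illustrative, not from the question):
private async Task RunIndefinitelyAsync(IProgress<string> progress)
{
    while (true)
    {
        // Heavy work plus success/exception handling stay on the threadpool...
        string summary = await Task.Run(() =>
        {
            try
            {
                int result = MyFunction(progress);
                return "Success: " + result; // any expensive formatting/logging belongs here too
            }
            catch (Exception ex)
            {
                return "Failed: " + ex.Message;
            }
        });
        // ...so only this cheap assignment runs on the UI thread.
        lblStatus.Text = summary;
    }
}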
As I understand it, you simply need parallel execution with a defined degree of parallelism. There are a lot of ways to implement what you want. I suggest using a BlockingCollection and the Parallel class instead of tasks.
So when the user clicks the button, you need to create a new blocking collection which will be your data source:
BlockingCollection<IProgress<string>> queue = new BlockingCollection<IProgress<string>>();
CancellationTokenSource source = new CancellationTokenSource();
Now you need a runner that will execute your work in parallel:
Task.Factory.StartNew(() =>
    Parallel.For(0, X, i =>
    {
        foreach (IProgress<string> p in queue.GetConsumingEnumerable(source.Token))
        {
            MyFunction(p);
        }
    }), source.Token);
Or you can choose the more correct way with a partitioner. You'll need a partitioner class:
private class BlockingPartitioner<T> : Partitioner<T>
{
    private readonly BlockingCollection<T> _Collection;
    private readonly CancellationToken _Token;

    public BlockingPartitioner(BlockingCollection<T> collection, CancellationToken token)
    {
        _Collection = collection;
        _Token = token;
    }

    public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
    {
        throw new NotImplementedException();
    }

    public override IEnumerable<T> GetDynamicPartitions()
    {
        return _Collection.GetConsumingEnumerable(_Token);
    }

    public override bool SupportsDynamicPartitions
    {
        get { return true; }
    }
}
And the runner will look like this:
ParallelOptions options = new ParallelOptions();
options.MaxDegreeOfParallelism = X;
Task.Factory.StartNew(
    () => Parallel.ForEach(
        new BlockingPartitioner<IProgress<string>>(queue, source.Token),
        options,
        p => MyFunction(p)));
So all you need now is to fill the queue with the necessary data; you can do that whenever you want.
And a final touch: when the user cancels the operation, you have two options:
first, you can break execution with a source.Cancel call,
or you can gracefully stop execution by marking the collection complete (queue.CompleteAdding); in that case the runner will process all the already queued data and finish.
Of course you need additional code to handle exceptions, progress, state and so on. But the main idea is here.
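Filling the queue and stopping could then look like this (a sketch, reusing the queue and source declared above; Progress() is the question's factory method):
// Producer side: hand work items to the runners whenever you like.
for (int i = 0; i < 100; i++)
    queue.Add(Progress());

// Option 1 - hard stop: abandons items that are still queued.
source.Cancel();

// Option 2 - graceful stop: runners drain the queue, then finish.
queue.CompleteAdding();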

Ensure a long running task is only fired once and subsequent request are queued but with only one entry in the queue

I have a compute-intensive method Calculate that may run for a few seconds; requests come from multiple threads.
Only one Calculate should be executing at a time; a subsequent request should be queued until the initial request completes. If there is already a request queued, then the subsequent request can be discarded (as the queued request will be sufficient).
There seem to be lots of potential solutions, but I just need the simplest.
UPDATE: Here's my rudimentary attempt:
private int _queueStatus;
private readonly object _queueStatusSync = new Object();

public void Calculate()
{
    lock (_queueStatusSync)
    {
        if (_queueStatus == 2) return;
        _queueStatus++;
        if (_queueStatus == 2) return;
    }
    for (;;)
    {
        CalculateImpl();
        lock (_queueStatusSync)
            if (--_queueStatus == 0) return;
    }
}

private void CalculateImpl()
{
    // long running process will take a few seconds...
}
The simplest, cleanest solution IMO is to use TPL Dataflow (as always) with a BufferBlock acting as the queue. BufferBlock is thread-safe, supports async-await, and more importantly, has TryReceiveAll to get all the items at once. It also has OutputAvailableAsync so you can wait asynchronously for items to be posted to the buffer. When multiple requests are posted you simply take the last and forget about the rest:
var buffer = new BufferBlock<Request>();
var task = Task.Run(async () =>
{
    while (await buffer.OutputAvailableAsync())
    {
        IList<Request> requests;
        buffer.TryReceiveAll(out requests);
        Calculate(requests.Last());
    }
});
Usage:
buffer.Post(new Request());
buffer.Post(new Request());
Edit: If you don't have any input or output for the Calculate method, you can simply use a boolean to act as a switch. If it's true, you can turn it off and calculate; if it becomes true again while Calculate is running, then calculate again:
public bool _shouldCalculate;

public void Producer()
{
    _shouldCalculate = true;
}

public async Task Consumer()
{
    while (true)
    {
        if (!_shouldCalculate)
        {
            await Task.Delay(1000);
        }
        else
        {
            _shouldCalculate = false;
            Calculate();
        }
    }
}
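One caveat: a plain bool field read in a tight loop can be cached by the reader, so the consumer may never observe the producer's write. A variation of the same switch using Interlocked avoids that risk (a sketch):
private int _shouldCalculate; // 0 = idle, 1 = calculation requested

public void Producer()
{
    Interlocked.Exchange(ref _shouldCalculate, 1);
}

public async Task Consumer()
{
    while (true)
    {
        // Atomically read-and-clear the flag.
        if (Interlocked.Exchange(ref _shouldCalculate, 0) == 1)
            Calculate();
        else
            await Task.Delay(1000);
    }
}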
A BlockingCollection that only takes one item at a time. The trick is to skip adding if there are already items in the collection.
I would go with the answer from I3aron (+1).
This is (maybe) a BlockingCollection solution:
public static void BC_AddTakeCompleteAdding()
{
    using (BlockingCollection<int> bc = new BlockingCollection<int>(1))
    {
        // Spin up a Task to populate the BlockingCollection
        using (Task t1 = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 100; i++)
            {
                if (bc.TryAdd(i))
                {
                    Debug.WriteLine(" add " + i.ToString());
                }
                else
                {
                    Debug.WriteLine(" skip " + i.ToString());
                }
                Thread.Sleep(30);
            }
            bc.CompleteAdding();
        }))
        {
            // Spin up a Task to consume the BlockingCollection
            using (Task t2 = Task.Factory.StartNew(() =>
            {
                try
                {
                    // Consume the BlockingCollection
                    while (true)
                    {
                        Debug.WriteLine("take " + bc.Take());
                        Thread.Sleep(100);
                    }
                }
                catch (InvalidOperationException)
                {
                    // An InvalidOperationException means that Take() was called on a completed collection
                    Console.WriteLine("That's All!");
                }
            }))
            {
                Task.WaitAll(t1, t2);
            }
        }
    }
}
It sounds like a classic producer-consumer scenario. I'd recommend looking into BlockingCollection<T>. It is part of the System.Collections.Concurrent namespace. On top of that you can implement your queuing logic.
You may supply a BlockingCollection with any internal structure to hold its data, such as a ConcurrentBag<T>, ConcurrentQueue<T>, etc. The latter is the default structure used.
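For example, swapping in a stack gives LIFO consumption instead of the default FIFO (a small sketch):
// Default backing store is a ConcurrentQueue<int> (FIFO)...
var fifo = new BlockingCollection<int>();

// ...but any IProducerConsumerCollection<T> works, e.g. a stack for LIFO,
// optionally with a bounded capacity (here: 1, matching the question).
var lifo = new BlockingCollection<int>(new ConcurrentStack<int>(), 1);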
