Firing Recurring Tasks At The Same Time - c#

I am trying to get 2 tasks to fire at the same time at a specific point in time, then do it all over again. For example, below is a task that waits 1 minute and a second task that waits 5 minutes. The 1 minute task should fire 5 times in 5 minutes and the 5 minute task 1 time, the 1 minute task should fire 10 times in 10 minutes and the 5 minute task 2 times, on and on and on. However, I need the 1 minute task to fire at the same time as the 5 minute.
I was able to do this with System.Timers but that did not play well with the multithreading that I eventually needed. System.Threading did not seem to have anything equivalent to the System.Timers AutoReset, unless I'm missing something.
In the code below, both delays start at the same time, BUT t1 only fires 1 time and not 5. Essentially, it needs to keep going until the program is stopped, not just X times.
int i = 0;
while (i < 1)
{
Task t1 = Task.Run(async delegate
{
await Task.Delay(TimeSpan.FromMinutes(1));
TaskWorkers.OneMinuteTasks();
});
//t1.Wait();
Task t2 = Task.Run(async delegate
{
await Task.Delay(TimeSpan.FromMinutes(5));
TaskWorkers.FiveMinuteTasks();
});
t2.Wait();
}
Update
I first read John's comment below about just adding an inner loop to the Task. The code below works as I wanted. Simple fix. I know I said I would want this to run for as long as the program runs, but I was able to calculate the max number of loops I would actually need. x < 10 is just a number I chose.
Task t1 = Task.Run(async delegate
{
for(int x = 0; x < 10; x++)
{
await Task.Delay(TimeSpan.FromMinutes(1));
TaskWorkers.OneMinuteTasks();
}
});
Task t2 = Task.Run(async delegate
{
for (int x = 0; x < 10; x++)
{
await Task.Delay(TimeSpan.FromMinutes(5));
TaskWorkers.FiveMinuteTasks();
}
});
As far as I can tell no gross usage of CPU or memory.

You could have a single loop that periodically fires the tasks in a coordinated fashion:
async Task LoopAsync(CancellationToken token)
{
while (true)
{
Task a = DoAsync_A(); // Every 5 minutes
for (int i = 0; i < 5; i++)
{
var delayTask = Task.Delay(TimeSpan.FromMinutes(1), token);
Task b = DoAsync_B(); // Every 1 minute
await Task.WhenAll(b, delayTask);
if (a.IsCompleted) await a;
}
await a;
}
}
This implementation awaits both the B task and the Task.Delay task to complete before starting a new 1-minute loop, so if the B task is extremely long-running, the schedule will slip. This is probably a desirable behavior, unless you are OK with the possibility of overlapping tasks.
In case of an exception in either the A or B task, the loop will report failure at the one minute checkpoints. This is not ideal, but making the loop perfectly responsive on errors would make the code quite complicated.
Update: Here is an advanced version that is more responsive in case of an exception. It uses a linked CancellationTokenSource that is automatically canceled when either of the two tasks fails, which then results in the immediate cancellation of the delay task.
async Task LoopAsync(CancellationToken token)
{
using (var linked = CancellationTokenSource.CreateLinkedTokenSource(token))
{
while (true)
{
Task a = DoAsync_A(); // Every 5 minutes
await WithCompletionAsync(a, async () =>
{
OnErrorCancel(a, linked);
for (int i = 0; i < 5; i++)
{
var delayTask = Task.Delay(TimeSpan.FromMinutes(1),
linked.Token);
await WithCompletionAsync(delayTask, async () =>
{
Task b = DoAsync_B(); // Every 1 minute
OnErrorCancel(b, linked);
await b;
if (a.IsCompleted) await a;
});
}
});
}
}
}
async void OnErrorCancel(Task task, CancellationTokenSource cts)
{
try
{
await task.ConfigureAwait(false);
}
catch
{
cts.Cancel();
//try { cts.Cancel(); } catch { } // Safer alternative
}
}
async Task WithCompletionAsync(Task task, Func<Task> body)
{
try
{
await body().ConfigureAwait(false);
}
catch (OperationCanceledException)
{
await task.ConfigureAwait(false);
throw; // The task isn't faulted. Propagate the exception of the body.
}
catch
{
try
{
await task.ConfigureAwait(false);
}
catch { } // Suppress the task's exception
throw; // Propagate the exception of the body
}
await task.ConfigureAwait(false);
}
The logic of this version is significantly more convoluted than the initial simple version (which makes it more error prone). The introduction of the CancellationTokenSource creates the need for disposing it, which in turn makes it mandatory to ensure that all tasks have completed at every exit point of the asynchronous method. This is the reason for using the WithCompletionAsync method to enclose all code that follows every task inside the LoopAsync method.

I think timers or something like Vasily's suggestion would be the way to go, as these solutions are designed to handle recurring tasks more than just using threads. However, you could do this using threads, saying something like:
void TriggerTimers()
{
new Thread(() =>
{
while (true)
{
new Thread(()=> TaskA()).Start();
Thread.Sleep(60 * 1000); //start taskA every minute
}
}).Start();
new Thread(() =>
{
while (true)
{
new Thread(() => TaskB()).Start();
Thread.Sleep(5 * 60 * 1000); //start taskB every five minutes
}
}).Start();
}
void TaskA() { }
void TaskB() { }
Note that this solution will drift by a small amount if used over a very long period of time, although this shouldn't be significant unless you're dealing with very delicate margins or a very overloaded computer. Also, this solution has no contingency for the situation John mentioned. It's fairly lightweight, but also quite understandable.
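The drift mentioned above comes from sleeping for a fixed time after each run. A minimal sketch of one way to avoid it, assuming .NET 6 or later where PeriodicTimer is available: a single timer ticks at a fixed period, the one-minute work runs on every tick and the five-minute work on every fifth tick, so both fire at the same moment. The class name CoordinatedLoop and its parameters are hypothetical, and the period is a parameter only so the sketch can be tried with short intervals.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CoordinatedLoop
{
    // Hypothetical helper: runs oneMinuteWork on every tick and
    // fiveMinuteWork on every fifth tick of the same timer.
    public static async Task RunAsync(
        TimeSpan period, Action oneMinuteWork, Action fiveMinuteWork,
        CancellationToken token)
    {
        using var timer = new PeriodicTimer(period);
        long tick = 0;
        try
        {
            // WaitForNextTickAsync completes once per period until canceled.
            while (await timer.WaitForNextTickAsync(token))
            {
                tick++;
                oneMinuteWork();
                if (tick % 5 == 0) fiveMinuteWork();
            }
        }
        catch (OperationCanceledException)
        {
            // Cancellation is the normal way to stop the loop.
        }
    }
}
```

With period set to TimeSpan.FromMinutes(1) this matches the schedule in the question. The work runs on the loop itself here, so work that takes longer than one period delays the following tick; wrapping the calls in Task.Run would allow overlap instead.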

Related

TPL Dataflow: How to start the next async action when the current one hasn't finished yet, preserving the execution order?

Consider the following program, which uses TPL Dataflow. Hence, ActionBlock comes from the Dataflow library.
internal static class Program
{
public static async Task Main(string[] args)
{
var actionBlock = new ActionBlock<int>(async i =>
{
Console.WriteLine($"Started with {i}");
await DoSomethingAsync(i);
Console.WriteLine($"Done with {i}");
});
for (int i = 0; i < 5; i++)
{
actionBlock.Post(i);
}
actionBlock.Complete();
await actionBlock.Completion;
}
private static async Task DoSomethingAsync(int i)
{
await Task.Delay(1000);
}
}
The output of this program is:
Started with 0
Done with 0
Started with 1
Done with 1
Started with 2
Done with 2
Started with 3
Done with 3
Started with 4
Done with 4
The reason is that the ActionBlock only starts processing the next item when the previous asynchronous action has finished.
How can I force it to start processing the next item, even though the previous one hasn't fully finished? MaxDegreeOfParallelism isn't an option, as that can mess up the order.
So I'd like the output to be:
Started with 0
Started with 1
Started with 2
Started with 3
Started with 4
Done with 0
Done with 1
Done with 2
Done with 3
Done with 4
I could get rid of the async/await and replace it with ContinueWith. But that has two disadvantages:
1. The ActionBlock thinks it's done with the message immediately. A call to Complete() would result in the pipeline completing right away, instead of after the asynchronous action has completed.
2. I'd like to add a BoundedCapacity to limit the number of messages still waiting to be fully finished. But because of 1., this BoundedCapacity has no effect.
In situations like this I would try to remove the requirement that things get processed in order, so that you can process in parallel, and then report sequentially.
//The transform block can process everything in parallel,
//but by default the inputs and outputs remain ordered
var processStuff = new TransformBlock<int, string>(async i =>
{
Console.WriteLine($"Started with {i}");
await DoSomethingAsync(i);
return $"Done with {i}";
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 5 });
//This action block is your reporting block that uses the results from
//the transform block, and it will be executed in order.
var useStuff = new ActionBlock<string>(result =>
{
Console.WriteLine(result);
});
//when linking make sure to propagate completion.
processStuff.LinkTo(useStuff, new DataflowLinkOptions { PropagateCompletion = true });
for (int i = 0; i < 5; i++)
{
Console.WriteLine("Posting {0}", i);
processStuff.Post(i);
}
//mark the top of your pipeline as complete, and that will propagate
//to the end.
processStuff.Complete();
//wait on your last block to finish processing everything.
await useStuff.Completion;
This code produced the following output as an example. Notice that the "Started with" statements are not necessarily even in the order of the postings.
Posting 0
Posting 1
Posting 2
Posting 3
Posting 4
Started with 1
Started with 0
Started with 2
Started with 4
Started with 3
Done with 0
Done with 1
Done with 2
Done with 3
Done with 4
I did, in the meantime, find a solution/workaround, by using two blocks, and passing the asynchronous Task from the first block to the next block, where it is waited for synchronously using .Wait().
So, like this:
using System.Reactive.Linq;
using System.Threading.Tasks.Dataflow;
internal static class Program
{
public static async Task Main(string[] args)
{
var transformBlock = new TransformBlock<int, Task<int>>(async i =>
{
Console.WriteLine($"Started with {i}");
await DoSomethingAsync(i);
return i;
});
var actionBlock = new ActionBlock<Task<int>>(task =>
{
task.Wait();
Console.WriteLine($"Done with {task.Result}");
});
transformBlock.LinkTo(actionBlock, new DataflowLinkOptions { PropagateCompletion = true });
for (int i = 0; i < 5; i++)
{
transformBlock.Post(i);
}
transformBlock.Complete();
await actionBlock.Completion;
}
private static Task DoSomethingAsync(int i)
{
return Task.Delay(1000);
}
}
This way the first block just considers itself done with a message almost instantly and is able to handle, in order, the next message which calls DoSomethingAsync directly, without waiting for the response of the previous call.
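The same two-block idea can also be written without blocking a thread in the second block. A minimal sketch, assuming the System.Threading.Tasks.Dataflow library is available: the first block starts the work and hands the Task on, and the second block awaits each Task in the order it was received. OrderedPipeline, RunAsync and WrapAsync are hypothetical names, and the log list exists only so the ordering can be inspected.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class OrderedPipeline
{
    // Starts work for each input without waiting for the previous one,
    // then reports completions in input order. Returns the log lines.
    public static async Task<List<string>> RunAsync(int count, Func<int, Task> work)
    {
        var log = new List<string>();
        var start = new TransformBlock<int, Task<int>>(i =>
        {
            lock (log) log.Add($"Started with {i}");
            return WrapAsync(i, work); // hand the running task downstream
        });
        var report = new ActionBlock<Task<int>>(async task =>
        {
            int i = await task; // asynchronous wait instead of task.Wait()
            lock (log) log.Add($"Done with {i}");
        });
        start.LinkTo(report, new DataflowLinkOptions { PropagateCompletion = true });
        for (int i = 0; i < count; i++) start.Post(i);
        start.Complete();
        await report.Completion;
        return log;
    }

    private static async Task<int> WrapAsync(int i, Func<int, Task> work)
    {
        await work(i);
        return i;
    }
}
```

Because the second block keeps its default parallelism of one, the "Done with" lines are logged in posting order even if a later task finishes first.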

Return data from long running Task on demand

I want to create a Task, which may run for many minutes, collecting data via an API call to another system. At some point in the future I need to stop the task and return the collected data. This future point is unknown at the time of starting the task.
I have read many questions about returning data from tasks, but I can't find any that answer this scenario. I may be missing a trick, but all of the examples actually seem to wait in the main thread for the task to finish before continuing. This seems counter-intuitive; surely the purpose of a task is to hand off an activity whilst continuing with other activities in your main thread?
Here is one of those many examples, taken from DotNetPearls:
namespace TaskBasedAsynchronousProgramming
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine($"Main Thread Started");
Task<double> task1 = Task.Run(() =>
{
return CalculateSum(10);
});
Console.WriteLine($"Sum is: {task1.Result}");
Console.WriteLine($"Main Thread Completed");
Console.ReadKey();
}
static double CalculateSum(int num)
{
double sum = 0;
for (int count = 1; count <= num; count++)
{
sum += count;
}
return sum;
}
}
}
Is it possible to do what I need, and have a long-running task running in parallel, stop it and return the data at an arbitrary future point?
Here is a sample application showing how you can do that:
static double partialResult = -1;
static void Main()
{
CancellationTokenSource calculationEndSignal = new(TimeSpan.FromSeconds(3));
Task meaningOfLife = Task.Run(() =>
GetTheMeaningOfLife(calculationEndSignal.Token),
calculationEndSignal.Token);
calculationEndSignal.Token.Register(() => Console.WriteLine(partialResult));
Console.ReadLine();
}
static async Task GetTheMeaningOfLife(CancellationToken cancellationToken)
{
foreach (var semiResult in Enumerable.Range(1, 42))
{
partialResult = semiResult;
cancellationToken.ThrowIfCancellationRequested();
await Task.Delay(1000);
}
}
- partialResult is a shared variable between the two threads
  - The worker thread (GetTheMeaningOfLife) only writes it
  - The main thread (Main) only reads it
  - The read operation is performed only after cancellation has been requested
- calculationEndSignal is used to cancel the long-running operation
  - I have specified a timeout, but you can call the Cancel method if you want
- meaningOfLife is the Task which represents the long-running operation call
  - I have passed the CancellationToken to GetTheMeaningOfLife and to Task.Run as well
  - For this very simple example Task.Run does not need to receive the token, but it is generally good practice to pass it there as well
- Register receives a callback which is called after the token is cancelled
- ReadLine can be any other computation
  - I've used ReadLine to keep the application running
- GetTheMeaningOfLife simply increments the partialResult shared variable
  - either until it reaches the meaning of life
  - or until it is cancelled
Here is one approach. It features a CancellationTokenSource that is used as a stopping mechanism, instead of its normal usage as a cancellation mechanism. That's because you want to get the partial results, and a canceled Task does not propagate results:
CancellationTokenSource stoppingTokenSource = new();
Task<List<int>> longRunningTask = Task.Run(() =>
{
List<int> list = new();
for (int i = 1; i <= 60; i++)
{
if (stoppingTokenSource.IsCancellationRequested) break;
// Simulate a synchronous operation that has 1 second duration.
Thread.Sleep(1000);
list.Add(i);
}
return list;
});
Then, somewhere else in your program, you can send a stopping signal to the task, and then await asynchronously until the task acknowledges the signal and completes successfully. The await will also propagate the partial results:
stoppingTokenSource.Cancel();
List<int> partialResults = await longRunningTask;
Or, if you are not in an asynchronous workflow, you can wait synchronously until the partial results are available:
stoppingTokenSource.Cancel();
List<int> partialResults = longRunningTask.Result;

Number of Request before DDOSing. Limiting # of async Tasks [duplicate]

I am using the HTTPClient in System.Net.Http to make requests against an API. The API is limited to 10 requests per second.
My code is roughly like so:
List<Task> taskList = new List<Task>();
foreach (var i in items) taskList.Add(ProcessItem(i));
try
{
await Task.WhenAll(taskList.ToArray());
}
catch (Exception ex)
{
}
The ProcessItem method does a few things but always calls the API using the following:
await SendRequestAsync(..blah). Which looks like:
private async Task<Response> SendRequestAsync(HttpRequestMessage request, CancellationToken token)
{
token.ThrowIfCancellationRequested();
var response = await HttpClient
.SendAsync(request: request, cancellationToken: token).ConfigureAwait(continueOnCapturedContext: false);
token.ThrowIfCancellationRequested();
return await Response.BuildResponse(response);
}
Originally the code worked fine but when I started using Task.WhenAll I started getting 'Rate Limit Exceeded' messages from the API. How can I limit the rate at which requests are made?
It's worth noting that ProcessItem can make between 1-4 API calls depending on the item.
The API is limited to 10 requests per second.
Then just have your code do a batch of 10 requests, ensuring they take at least one second:
Items[] items = ...;
int index = 0;
while (index < items.Length)
{
var timer = Task.Delay(TimeSpan.FromSeconds(1.2)); // ".2" to make sure
var tasks = items.Skip(index).Take(10).Select(i => ProcessItemsAsync(i));
var tasksAndTimer = tasks.Concat(new[] { timer });
await Task.WhenAll(tasksAndTimer);
index += 10;
}
Update
My ProcessItems method makes 1-4 API calls depending on the item.
In this case, batching is not an appropriate solution. You need to limit an asynchronous method to a certain number, which implies a SemaphoreSlim. The tricky part is that you want to allow more calls over time.
I haven't tried this code, but the general idea I would go with is to have a periodic function that releases the semaphore up to 10 times. So, something like this:
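// The second argument (maxCount) is needed; without it Release never throws SemaphoreFullException.
private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(10, 10);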
private async Task<Response> ThrottledSendRequestAsync(HttpRequestMessage request, CancellationToken token)
{
await _semaphore.WaitAsync(token);
return await SendRequestAsync(request, token);
}
private async Task PeriodicallyReleaseAsync(Task stop)
{
while (true)
{
var timer = Task.Delay(TimeSpan.FromSeconds(1.2));
if (await Task.WhenAny(timer, stop) == stop)
return;
// Release the semaphore at most 10 times.
for (int i = 0; i != 10; ++i)
{
try
{
_semaphore.Release();
}
catch (SemaphoreFullException)
{
break;
}
}
}
}
Usage:
// Start the periodic task, with a signal that we can use to stop it.
var stop = new TaskCompletionSource<object>();
var periodicTask = PeriodicallyReleaseAsync(stop.Task);
// Wait for all item processing.
await Task.WhenAll(taskList);
// Stop the periodic task.
stop.SetResult(null);
await periodicTask;
The answer is similar to this one.
Instead of using a list of tasks and WhenAll, use Parallel.ForEach and use ParallelOptions to limit the number of concurrent tasks to 10, and make sure each one takes at least 1 second:
Parallel.ForEach(
items,
new ParallelOptions { MaxDegreeOfParallelism = 10 },
item => {
ProcessItems(item);
Thread.Sleep(1000); // Parallel.ForEach does not await async lambdas, so block instead
}
);
Or if you want to make sure each item takes as close to 1 second as possible:
Parallel.ForEach(
searches,
new ParallelOptions { MaxDegreeOfParallelism = 10 },
item => {
var watch = Stopwatch.StartNew();
ProcessItems(item);
watch.Stop();
// Parallel.ForEach does not await async lambdas, so block for the remainder of the second
if (watch.ElapsedMilliseconds < 1000) Thread.Sleep((int)(1000 - watch.ElapsedMilliseconds));
}
);
Or:
Parallel.ForEach(
searches,
new ParallelOptions { MaxDegreeOfParallelism = 10 },
item => {
// Block until both the one-second delay and the work have finished
Task.WaitAll(
Task.Delay(1000),
Task.Run(() => { ProcessItems(item); })
);
}
);
UPDATED ANSWER
My ProcessItems method makes 1-4 API calls depending on the item. So with a batch size of 10 I still exceed the rate limit.
You need to implement a rolling window in SendRequestAsync. A queue containing timestamps of each request is a suitable data structure. You dequeue entries with a timestamp older than 10 seconds. As it so happens, there is an implementation as an answer to a similar question on SO.
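The rolling window described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the linked implementation: RollingWindowLimiter is a hypothetical class, the limit and window length are constructor parameters, and timestamps are taken when a request is admitted (not when it completes).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: admits at most 'limit' callers per rolling 'window'.
public sealed class RollingWindowLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Queue<DateTime> _stamps = new Queue<DateTime>();
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public RollingWindowLimiter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    public async Task WaitAsync(CancellationToken token = default)
    {
        // The gate serializes callers so the queue is touched by one at a time.
        await _gate.WaitAsync(token).ConfigureAwait(false);
        try
        {
            while (true)
            {
                DateTime now = DateTime.UtcNow;
                // Dequeue timestamps that have left the window.
                while (_stamps.Count > 0 && now - _stamps.Peek() >= _window)
                    _stamps.Dequeue();
                if (_stamps.Count < _limit)
                {
                    _stamps.Enqueue(now);
                    return;
                }
                // Window is full: wait until the oldest entry expires, then re-check.
                TimeSpan wait = _stamps.Peek() + _window - now;
                await Task.Delay(wait, token).ConfigureAwait(false);
            }
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

A shared instance would be awaited at the top of SendRequestAsync, for example `await limiter.WaitAsync(token);`, so every API call is counted no matter how many calls a single item makes.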
ORIGINAL ANSWER
May still be useful to others
One straightforward way to handle this is to batch your requests in groups of 10, run those concurrently, and then wait until a total of 10 seconds has elapsed (if it hasn't already). This will bring you in right at the rate limit if the batch of requests can complete in 10 seconds, but is less than optimal if the batch of requests takes longer. Have a look at the .Batch() extension method in MoreLinq. Code would look approximately like
foreach (var taskList in tasks.Batch(10))
{
Stopwatch sw = Stopwatch.StartNew(); // From System.Diagnostics
await Task.WhenAll(taskList.ToArray());
if (sw.Elapsed.TotalSeconds < 10.0)
{
// Calculate how long you still have to wait and sleep that long
// You might want to wait 10.5 or 11 seconds just in case the rate
// limiting on the other side isn't perfectly implemented
}
}
https://github.com/thomhurst/EnumerableAsyncProcessor
I've written a library to help with this sort of logic.
Usage would be:
var responses = await AsyncProcessorBuilder.WithItems(items) // Or Extension Method: items.ToAsyncProcessorBuilder()
.SelectAsync(item => ProcessItem(item), CancellationToken.None)
.ProcessInParallel(levelOfParallelism: 10, TimeSpan.FromSeconds(1));

How to execute tasks in parallel but not more than N tasks per T seconds?

I need to run many tasks in parallel as fast as possible. But if my program runs more than 30 tasks per 1 second, it will be blocked. How to ensure that tasks run no more than 30 per any 1-second interval?
In other words, we must prevent the new task from starting if 30 tasks were completed in the last 1-second interval.
My ugly possible solution:
private async Task Process(List<Task> taskList, int maxIntervalCount, int timeIntervalSeconds)
{
var timeList = new List<DateTime>();
var sem = new Semaphore(maxIntervalCount, maxIntervalCount);
var tasksToRun = taskList.Select(async task =>
{
do
{
sem.WaitOne();
}
while (HasAllowance(timeList, maxIntervalCount, timeIntervalSeconds));
await task;
timeList.Add(DateTime.Now);
sem.Release();
});
await Task.WhenAll(tasksToRun);
}
private bool HasAllowance(List<DateTime> timeList, int maxIntervalCount, int timeIntervalSeconds)
{
return timeList.Count <= maxIntervalCount
|| DateTime.Now.Subtract(TimeSpan.FromSeconds(timeIntervalSeconds)) > timeList[timeList.Count - maxIntervalCount];
}
User code should never have to control how tasks are scheduled directly. For one thing, it can't: controlling how tasks run is the job of the TaskScheduler. When user code calls .Start(), it simply adds a task to a threadpool queue for execution. await doesn't start anything either; it only awaits tasks that are already executing.
The TaskScheduler samples show how to create limited concurrency schedulers, but again, there are better, high-level options.
The question's code doesn't throttle the queued tasks anyway, it limits how many of them can be awaited. They are all running already. This is similar to batching the previous asynchronous operation in a pipeline, allowing only a limited number of messages to pass to the next level.
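To make the point about already-running tasks concrete, here is a small sketch that accepts task factories (Func<Task>) instead of Task instances, so an operation only starts once a slot is free. It limits concurrency only, not the rate per second; LazyStart, RunAsync and RunOneAsync are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class LazyStart
{
    // Starts at most maxConcurrent operations at a time. Each operation is
    // passed as a factory, so nothing runs until the factory is invoked.
    public static async Task RunAsync(
        IEnumerable<Func<Task>> factories, int maxConcurrent)
    {
        using var gate = new SemaphoreSlim(maxConcurrent, maxConcurrent);
        var running = new List<Task>();
        foreach (var factory in factories)
        {
            await gate.WaitAsync();            // wait for a free slot
            running.Add(RunOneAsync(factory, gate));
        }
        await Task.WhenAll(running);
    }

    private static async Task RunOneAsync(Func<Task> factory, SemaphoreSlim gate)
    {
        try { await factory(); }               // the operation starts here
        finally { gate.Release(); }            // free the slot when it ends
    }
}
```

Passing a List<Task> instead would defeat the limit, because every task in the list would already be in flight before the first WaitAsync call.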
ActionBlock with delay
The easy, out-of-the-box way would be to use an ActionBlock with a limited MaxDegreeOfParallelism, to ensure no more than N concurrent operations can run at the same time. If we know how long each operation takes, we could add a bit of delay to ensure we don't overshoot the throttle limit.
In this case, 7 concurrent workers perform at most 4 requests/second each (one per 250 ms delay), for a maximum of 28 requests per second. The BoundedCapacity means that only up to 7 items will be stored in the input buffer before downloader.SendAsync has to wait. This way we avoid flooding the ActionBlock if the operations take too long.
var downloader = new ActionBlock<string>(
async url => {
await Task.Delay(250);
var response=await httpClient.GetStringAsync(url);
//Do something with it.
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 7, BoundedCapacity=7 }
);
//Start posting to the downloader
foreach(var item in urls)
{
await downloader.SendAsync(item);
}
downloader.Complete();
await downloader.Completion;
ActionBlock with SemaphoreSlim
Another option would be to combine this with a SemaphoreSlim that gets reset periodically by a timer.
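var semaphore = new SemaphoreSlim(30, 30);
//Every tick, top the semaphore back up to 30 permits
var refreshTimer = new Timer(_ =>
{
int missing = 30 - semaphore.CurrentCount;
if (missing > 0) semaphore.Release(missing);
});
var downloader = new ActionBlock<string>(
async url => {
//Take one permit per request; permits are only handed back by the timer
await semaphore.WaitAsync();
var response=await httpClient.GetStringAsync(url);
//Do something with it.
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 5, BoundedCapacity=5 }
);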
//Start the timer right before we start posting
refreshTimer.Change(1000,1000);
foreach(....)
{
}
This is the snippet:
var tasks = new List<Task>();
foreach(var item in listNeedInsert)
{
var task = TaskToRun(item);
tasks.Add(task);
if(tasks.Count == 100)
{
await Task.WhenAll(tasks);
tasks.Clear();
}
}
// Wait for anything left to finish
await Task.WhenAll(tasks);
Notice that I add each task to a List<Task>, await the whole list once it holds 100 tasks, and after the loop await whatever is left in the same List<Task>.
What you do here:
var tasks = taskList.Select(async task =>
{
do
{
sem.WaitOne();
}
while (timeList.Count <= maxIntervalCount
|| DateTime.Now.Subtract(TimeSpan.FromSeconds(timeIntervalSeconds)) > timeList[timeList.Count - maxIntervalCount]);
await task;
is blocking until the task finishes its work, thus making this call:
Task.WhenAll(tasks).Wait();
completely redundant. Furthermore, this line Task.WhenAll(tasks).Wait(); is performing unnecessary blocking on the WhenAll method.
Is the blocking due to some server/firewall/hardware limit or it is based on observation?
You should try to use BlockingCollection<Task> or similar thread safe collections especially if the job of your tasks are I/O-bound. You can even set the capacity to 30:
var collection = new BlockingCollection<Task>(30);
Then you can start 2 methods, one producing and one consuming:
var population = Task.Run(() => Populate());
var processing = Task.Run(() => Dequeue());
await Task.WhenAll(population, processing);
void Populate()
{
foreach (...)
collection.Add(...);
collection.CompleteAdding();
}
async Task Dequeue()
{
while(!collection.IsCompleted)
await collection.Take(); //consider using TryTake()
}
If the limit persists due to some true limitation (should be very rare), change Populate() as follows:
var stopper = Stopwatch.StartNew();
for (var i = ....) //instead of foreach
{
if (i % 30 == 0)
{
var elapsed = stopper.ElapsedMilliseconds; //read once to avoid a race between the check and the wait
if (elapsed < 1000)
Thread.Sleep((int)(1000 - elapsed)); //an un-awaited Task.Delay would not pause anything
stopper.Restart();
}
collection.Add(...);
}
collection.CompleteAdding();
I think that this problem can be solved by a SemaphoreSlim limited to the number of maximum tasks per interval, and also by a Task.Delay that delays the release of the SemaphoreSlim after each task's completion, for an interval equal to the required throttling interval. Below is an implementation based on this idea. The rate limiting can be applied in two ways:
With includeAsynchronousDuration: false the rate limit affects how many operations can be started during the specified time span. The duration of each operation is not taken into account.
With includeAsynchronousDuration: true the rate limit affects how many operations can be counted as "active" during the specified time span, and is more restrictive (makes the enumeration slower). Instead of counting each operation as a moment in time (when started), it is counted as a time span (between start and completion). An operation is counted as "active" for a specified time span, if and only if its own time span intersects with the specified time span.
/// <summary>
/// Applies an asynchronous transformation for each element of a sequence,
/// limiting the number of transformations that can start or be active during
/// the specified time span.
/// </summary>
public static async Task<TResult[]> ForEachAsync<TSource, TResult>(
this IEnumerable<TSource> source,
Func<TSource, Task<TResult>> action,
int maxActionsPerTimeUnit,
TimeSpan timeUnit,
bool includeAsynchronousDuration = false,
bool onErrorContinue = false, /* Affects only asynchronous errors */
bool executeOnCapturedContext = false)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (action == null) throw new ArgumentNullException(nameof(action));
if (maxActionsPerTimeUnit < 1)
throw new ArgumentOutOfRangeException(nameof(maxActionsPerTimeUnit));
if (timeUnit < TimeSpan.Zero || timeUnit.TotalMilliseconds > Int32.MaxValue)
throw new ArgumentOutOfRangeException(nameof(timeUnit));
using var semaphore = new SemaphoreSlim(maxActionsPerTimeUnit,
maxActionsPerTimeUnit);
using var cts = new CancellationTokenSource();
var tasks = new List<Task<TResult>>();
var releaseTasks = new List<Task>();
try // Watch for exceptions thrown by the source enumerator
{
foreach (var item in source)
{
try
{
await semaphore.WaitAsync(cts.Token)
.ConfigureAwait(executeOnCapturedContext);
}
catch (OperationCanceledException) { break; }
// Exceptions thrown synchronously by invoking the action are breaking
// the loop unconditionally (the onErrorContinue has no effect on them).
var task = action(item);
if (!onErrorContinue) task = ObserveFailureAsync(task);
tasks.Add(task);
releaseTasks.Add(ScheduleSemaphoreReleaseAsync(task));
}
}
catch (Exception ex) { tasks.Add(Task.FromException<TResult>(ex)); }
cts.Cancel(); // Cancel all release tasks
Task<TResult[]> whenAll = Task.WhenAll(tasks);
try { return await whenAll.ConfigureAwait(false); }
catch (OperationCanceledException) when (whenAll.IsCanceled) { throw; }
catch { whenAll.Wait(); throw; } // Propagate AggregateException
finally { await Task.WhenAll(releaseTasks); }
async Task<TResult> ObserveFailureAsync(Task<TResult> task)
{
try { return await task.ConfigureAwait(false); }
catch { cts.Cancel(); throw; }
}
async Task ScheduleSemaphoreReleaseAsync(Task<TResult> task)
{
if (includeAsynchronousDuration)
try { await task.ConfigureAwait(false); } catch { } // Ignore exceptions
// Release only if the Task.Delay completed successfully
try { await Task.Delay(timeUnit, cts.Token).ConfigureAwait(false); }
catch (OperationCanceledException) { return; }
semaphore.Release();
}
}
Usage example:
int[] results = await ForEachAsync(Enumerable.Range(1, 100), async n =>
{
await Task.Delay(500); // Simulate some asynchronous I/O-bound operation
return n;
}, maxActionsPerTimeUnit: 30, timeUnit: TimeSpan.FromSeconds(1.0),
includeAsynchronousDuration: true);
The reasons for propagating an AggregateException using the catch+Wait technique, are explained here.

Random tasks from Task.Factory.StartNew never finishes

I am using Async await with Task.Factory method.
public async Task<JobDto> ProcessJob(JobDto jobTask)
{
try
{
var T = Task.Factory.StartNew(() =>
{
JobWorker jobWorker = new JobWorker();
jobWorker.Execute(jobTask);
});
await T;
}
This method I am calling inside a loop like this
for(int i=0; i < jobList.Count(); i++)
{
tasks[i] = ProcessJob(jobList[i]);
}
What I notice is that new tasks open up inside Process Explorer and they also start working (based on the log file). However, out of 10, sometimes only 8 or 7 finish. The rest of them just never come back.
Why would that be happening?
Are they timing out? Where can I set a timeout for my tasks?
UPDATE
Basically, I would like each Task to start running as soon as it is called and to wait for the response at the await T statement. I am assuming here that once they finish, each of them will come back at await T and do the next action. I am already seeing this result for 7 out of 10 tasks, but 3 of them are not coming back.
Thanks
It is hard to say what the issue is without the rest of the code, but your code can be simplified by making ProcessJob synchronous and then calling Task.Run with it.
public JobDto ProcessJob(JobDto jobTask)
{
JobWorker jobWorker = new JobWorker();
return jobWorker.Execute(jobTask);
}
Start tasks and wait for all tasks to finish. Prefer using Task.Run rather than Task.Factory.StartNew as it provides more favourable defaults for pushing work to the background. See here.
for(int i=0; i < jobList.Count(); i++)
{
var job = jobList[i]; // copy to a local; the lambda must not capture the loop variable i
tasks[i] = Task.Run(() => ProcessJob(job));
}
try
{
await Task.WhenAll(tasks);
}
catch(Exception ex)
{
// handle exception
}
First, let's make a reproducible version of your code. This is NOT the best way to achieve what you are doing, but to show you what is happening in your code!
I'll keep the code almost same as your code, except I'll use simple int rather than your JobDto and on completion of the job Execute() I'll write in a file that we can verify later. Here's the code
public class SomeMainClass
{
public void StartProcessing()
{
var jobList = Enumerable.Range(1, 10).ToArray();
var tasks = new Task[10];
//[1] start 10 jobs, one-by-one
for (int i = 0; i < jobList.Count(); i++)
{
tasks[i] = ProcessJob(jobList[i]);
}
//[4] here we have 10 awaitable Task in tasks
//[5] do all other unrelated operations
Thread.Sleep(1500); //assume it works for 1.5 sec
// Task.WaitAll(tasks); //[6] wait for tasks to complete
// The PROCESS IS COMPLETE here
}
public async Task ProcessJob(int jobTask)
{
try
{
//[2] start job in a ThreadPool, Background thread
var T = Task.Factory.StartNew(() =>
{
JobWorker jobWorker = new JobWorker();
jobWorker.Execute(jobTask);
});
//[3] await here will keep context of calling thread
await T; //... and release the calling thread
}
catch (Exception) { /*handle*/ }
}
}
public class JobWorker
{
static object locker = new object();
const string _file = @"C:\YourDirectory\out.txt";
public void Execute(int jobTask) //on complete, writes in file
{
Thread.Sleep(500); //let's assume does something for 0.5 sec
lock(locker)
{
File.AppendAllText(_file,
Environment.NewLine + "Writing the value-" + jobTask);
}
}
}
After running just the StartProcessing(), this is what I get in the file
Writing the value-4
Writing the value-2
Writing the value-3
Writing the value-1
Writing the value-6
Writing the value-7
Writing the value-8
Writing the value-5
So, 8/10 jobs have completed. Obviously, every time you run this, the number and order might change. But the point is, not all of the jobs completed!
Now, if I un-comment the step [6] Task.WaitAll(tasks);, this is what I get in my file
Writing the value-2
Writing the value-3
Writing the value-4
Writing the value-1
Writing the value-5
Writing the value-7
Writing the value-8
Writing the value-6
Writing the value-9
Writing the value-10
So, all my jobs completed here!
Why the code is behaving like this, is already explained in the code-comments. The main thing to note is, your tasks run in ThreadPool based Background threads. So, if you do not wait for them, they will be killed when the MAIN process ends and the main thread exits!!
If you still don't want to await the tasks there, you can return the list of tasks from this first method and await the tasks at the very end of the process, something like this
public Task[] StartProcessing()
{
...
for (int i = 0; i < jobList.Count(); i++)
{
tasks[i] = ProcessJob(jobList[i]);
}
...
return tasks;
}
//in the MAIN METHOD of your application/process
var tasks = new SomeMainClass().StartProcessing();
// do all other stuffs here, and just at the end of process
Task.WaitAll(tasks);
Hope this clears all confusion.
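If the calling code can be asynchronous, the same guarantee can be had without blocking in Task.WaitAll. A minimal sketch, assuming each job can be represented by a delegate run on the thread pool; JobRunner and RunAllAsync are hypothetical names, and Thread.Sleep stands in for the real job.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class JobRunner
{
    // Starts 'count' jobs on the thread pool and returns how many completed.
    // The method does not return until every job has finished.
    public static async Task<int> RunAllAsync(int count, int jobMilliseconds)
    {
        int completed = 0;
        Task[] tasks = Enumerable.Range(1, count)
            .Select(job => Task.Run(() =>
            {
                Thread.Sleep(jobMilliseconds);        // simulated work
                Interlocked.Increment(ref completed); // thread-safe counter
            }))
            .ToArray();
        await Task.WhenAll(tasks); // keeps the caller alive until all jobs end
        return completed;
    }
}
```

Calling `await JobRunner.RunAllAsync(10, 500)` from an `async Task Main` keeps the process running until all ten jobs are done, which is the behavior step [6] provides in the example above.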
It's possible your code is swallowing exceptions. I would add a ContinueWith call to the end of the part of the code that starts the new task. Something like this untested code:
var T = Task.Factory.StartNew(() =>
{
JobWorker jobWorker = new JobWorker();
jobWorker.Execute(jobTask);
}).ContinueWith(tsk =>
{
var flattenedException = tsk.Exception.Flatten();
Console.WriteLine("Exception! " + flattenedException);
}, TaskContinuationOptions.OnlyOnFaulted); //Only call if task is faulted
//Note: T is now the continuation, which ends up canceled when the job does not fault
Another possibility is that something in one of the tasks is timing out (like you mentioned) or deadlocking. To track down whether a timeout (or maybe deadlock) is the root cause, you could add some timeout logic (as described in this SO answer):
int timeout = 1000; //set to something much greater than the time it should take your task to complete (at least for testing)
var task = TheMethodWhichWrapsYourAsyncLogic(cancellationToken);
if (await Task.WhenAny(task, Task.Delay(timeout, cancellationToken)) == task)
{
// Task completed within timeout.
// Consider that the task may have faulted or been canceled.
// We re-await the task so that any exceptions/cancellation is rethrown.
await task;
}
else
{
// timeout/cancellation logic
}
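On .NET 6 or later, the same check can be written with Task.WaitAsync, which throws a TimeoutException when the timeout elapses first. A minimal sketch; TimeoutProbe and CompletesWithinAsync are hypothetical names, and the sketch assumes the task itself does not fail with a TimeoutException of its own.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutProbe
{
    // Returns true if the task finished in time, false if the timeout
    // elapsed first. Other exceptions from the task are rethrown.
    public static async Task<bool> CompletesWithinAsync(Task task, TimeSpan timeout)
    {
        try
        {
            await task.WaitAsync(timeout);
            return true;
        }
        catch (TimeoutException)
        {
            return false; // the task is still running; only the wait gave up
        }
    }
}
```

As with the WhenAny version, a timeout here does not stop the underlying work. It only tells you that the task had not completed yet, which is enough to spot a hang or deadlock while testing.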
Check out the documentation on exception handling in the TPL on MSDN.
