Get progress for multiple tasks while they are running - C#

I have created a background job that processes users (e.g. 2000 users, calling other APIs and adding them to a database). I have divided the users into a few chunks (e.g. 10 chunks) and process each chunk in its own task.
I need to get the number of users processed so far from each individual task, so that I can sum the counts and report the total as progress.
How can I do that?
private async Task CallingMethod()
{
List<Task> allTasks = new List<Task>();
allTasks.Add(Task.Run(() => ProcessUsers(usersList1)));
allTasks.Add(Task.Run(() => ProcessUsers(usersList2)));
allTasks.Add(Task.Run(() => ProcessUsers(usersList3)));
await Task.WhenAll(allTasks);
// int GetCountsFromTasks???? (how do I get the counts from the tasks?)
}
private async Task ProcessUsers(List<User> usersList)
{
int processedUsers = 0;
foreach(var user in usersList)
{
// Processing of users
//End
processedUsers++; // Need to send this count to calling method
// after each user completion
}
// send results object as well to calling method which contains usernames,
// success/fail for each user at end of task
}

Progress updates are normally done via IProgress<T>. There isn't much built in for combining IProgress<T> instances, but you can build your own solution. Results are normally returned from the asynchronous methods themselves, and you can retrieve them from await Task.WhenAll(allTasks) if the methods return them.
Something like this:
private async Task CallingMethod()
{
int usersProcessed1 = 0;
int usersProcessed2 = 0;
int usersProcessed3 = 0;
// Each callback stores the latest count reported by its task
var progress1 = new Progress<int>(value => usersProcessed1 = value);
var progress2 = new Progress<int>(value => usersProcessed2 = value);
var progress3 = new Progress<int>(value => usersProcessed3 = value);
var allTasks = new List<Task<IReadOnlyList<UserResult>>>();
allTasks.Add(Task.Run(() => ProcessUsers(usersList1, progress1)));
allTasks.Add(Task.Run(() => ProcessUsers(usersList2, progress2)));
allTasks.Add(Task.Run(() => ProcessUsers(usersList3, progress3)));
// One result list per task, in the order the tasks were added
var results = await Task.WhenAll(allTasks);
}
private async Task<IReadOnlyList<UserResult>> ProcessUsers(List<User> usersList, IProgress<int> progress)
{
var results = new List<UserResult>();
int processedUsers = 0;
foreach(var user in usersList)
{
// Processing of users; add a UserResult for each user to results
processedUsers++;
progress?.Report(processedUsers);
}
return results;
}
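One way to "build your own" combined progress, as a minimal sketch under a few assumptions: the CountingProgress and ProgressDemo names are hypothetical, users are plain strings, Task.Delay stands in for the real work, and Enumerable.Chunk requires .NET 6 or later. Each task reports a delta of 1 rather than a running total, so a single thread-safe counter can sum across all tasks:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: sums "n more users done" reports from any number of tasks.
sealed class CountingProgress : IProgress<int>
{
    private int _total;

    public int Total => Volatile.Read(ref _total);

    // Runs synchronously on the reporting thread, so it must be thread-safe.
    public void Report(int delta) => Interlocked.Add(ref _total, delta);
}

static class ProgressDemo
{
    static async Task<List<string>> ProcessUsers(IReadOnlyList<string> users, IProgress<int> progress)
    {
        var results = new List<string>();
        foreach (var user in users)
        {
            await Task.Delay(1);          // stand-in for the real per-user work
            results.Add(user + ": ok");
            progress.Report(1);           // report a delta, not a running total
        }
        return results;
    }

    public static async Task<int> RunAsync()
    {
        var users = Enumerable.Range(1, 30).Select(i => "user" + i).ToList();
        var progress = new CountingProgress();

        // Three chunks of ten users, each processed by its own task.
        var tasks = users.Chunk(10)
            .Select(chunk => Task.Run(() => ProcessUsers(chunk, progress)))
            .ToList();

        await Task.WhenAll(tasks);
        return progress.Total;            // all 30 users have been reported
    }
}
```

Unlike Progress<T>, which posts its callback to a captured context or the thread pool, this class updates the counter synchronously on the worker thread. A caller could read Total from a timer while the tasks are still running.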

Related

Elegant way to get a task for async code without running the task immediately

I have the following code that does what I want but I had to resort to using .GetAwaiter().GetResult() in the middle of asynchronous code to get it. I am wondering if there is an elegant way to achieve this without resorting to such hacks.
This is a simplified version of the code I have.
public async Task<string[]> GetValues(int[] keys)
{
List<int> keysNotYetActivelyRequested = null;
// don't start the task at this point because the
// keysNotYetActivelyRequested is not yet populated
var taskToCreateWithoutStarting = new Task<Dictionary<int, string>>(
() => GetValuesFromApi(keysNotYetActivelyRequested.ToArray())
.GetAwaiter().GetResult() /*not the best idea*/);
(var allTasksToAwait, keysNotYetActivelyRequested) = GetAllTasksToAwait(
keys, taskToCreateWithoutStarting);
if (keysNotYetActivelyRequested.Any())
{
// keysNotYetActivelyRequested will be empty when all keys
// are already part of another active request
taskToCreateWithoutStarting.Start(TaskScheduler.Current);
}
var allResults = await Task.WhenAll(allTasksToAwait);
var theReturn = new string[keys.Length];
for (int i = 0; i < keys.Length; i++)
{
foreach (var result in allResults)
{
if (result.TryGetValue(keys[i], out var value))
{
theReturn[i] = value;
}
}
}
if (keysNotYetActivelyRequested.Any())
{
taskToCreateWithoutStarting.Dispose();
}
return theReturn;
}
// all active requests indexed by the key, used to avoid generating
// multiple requests for the same key
private Dictionary<int, Task<Dictionary<int, string>>> _activeRequests = new();
private (HashSet<Task<Dictionary<int, string>>> allTasksToAwait,
List<int> keysNotYetActivelyRequested) GetAllTasksToAwait(
int[] keys, Task<Dictionary<int, string>> taskToCreateWithoutStarting)
{
var keysNotYetActivelyRequested = new List<int>();
// a HashSet because each task will have multiple keys hence _activeRequests
// will have the same task multiple times
var allTasksToAwait = new HashSet<Task<Dictionary<int, string>>>();
// add cleanup to the task to remove the requested keys from _activeRequests
// once it completes
var taskWithCleanup = taskToCreateWithoutStarting.ContinueWith(_ =>
{
lock (_activeRequests)
{
foreach (var key in keysNotYetActivelyRequested)
{
_activeRequests.Remove(key);
}
}
});
lock (_activeRequests)
{
foreach (var key in keys)
{
// use CollectionsMarshal to avoid a lookup for the same key twice
ref var refToTask = ref CollectionsMarshal.GetValueRefOrAddDefault(
_activeRequests, key, out var exists);
if (exists)
{
allTasksToAwait.Add(refToTask);
}
else
{
refToTask = taskToCreateWithoutStarting;
allTasksToAwait.Add(taskToCreateWithoutStarting);
keysNotYetActivelyRequested.Add(key);
}
}
}
return (allTasksToAwait, keysNotYetActivelyRequested);
}
// not the actual code
private async Task<Dictionary<int, string>> GetValuesFromApi(int[] keys)
{
// request duration dependent on the number of keys
await Task.Delay(keys.Length);
return keys.ToDictionary(k => k, k => k.ToString());
}
And a test method:
[Test]
public void TestGetValues()
{
var random = new Random();
var allTasks = new Task[10];
for (int i = 0; i < 10; i++)
{
var arrayofRandomInts = Enumerable.Repeat(random, random.Next(1, 100))
.Select(r => r.Next(1, 100)).ToArray();
allTasks[i] = GetValues(arrayofRandomInts);
}
Assert.DoesNotThrowAsync(() => Task.WhenAll(allTasks));
Assert.That(_activeRequests.Count, Is.EqualTo(0));
}
Instead of:
Task<Something> coldTask = new(() => GetAsync().GetAwaiter().GetResult());
You can do it like this:
Task<Task<Something>> coldTaskTask = new(() => GetAsync());
Task<Something> proxyTask = coldTaskTask.Unwrap();
The nested task coldTaskTask is the task that you will later Start (or RunSynchronously).
The unwrapped task proxyTask is a proxy that represents both the invocation of the GetAsync method, as well as the completion of the Task<Something> that this method generates.
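A small sketch of the nested-task approach, to show that nothing runs until Start is called. The UnwrapDemo class and its GetAsync method are hypothetical stand-ins, not the asker's code:

```csharp
using System;
using System.Threading.Tasks;

static class UnwrapDemo
{
    // Hypothetical asynchronous operation.
    static async Task<int> GetAsync()
    {
        await Task.Delay(10);
        return 42;
    }

    public static async Task<(bool invokedBeforeStart, int result)> RunAsync()
    {
        bool invoked = false;

        // Cold outer task: the delegate (and so GetAsync) does not run until Start.
        var coldTaskTask = new Task<Task<int>>(() =>
        {
            invoked = true;
            return GetAsync();
        });
        Task<int> proxyTask = coldTaskTask.Unwrap();

        bool invokedBeforeStart = invoked;          // still false at this point

        coldTaskTask.Start(TaskScheduler.Default);
        int result = await proxyTask;               // completes when GetAsync's task does
        return (invokedBeforeStart, result);
    }
}
```

The proxy can be handed out to other callers before the outer task is started, which is the property the question relies on.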
You should never use the task constructor.
If you want to refer to some code to execute later, use a delegate. Just like you would with synchronous code. The delegate types for asynchronous code are slightly different, but they're still just delegates.
Func<Task<Dictionary<int, string>>> getValuesAsync = () => GetValuesFromApi(keysNotYetActivelyRequested.ToArray());
...
var result = await getValuesAsync();
Also, I strongly recommend replacing ContinueWith with await.
All links are to my blog.

Async generator, previous iterations await a future iteration?

I want to generate an enumerable of tasks, the tasks will complete at different times.
How can I make a generator in C# that:
yields tasks
every few iterations, resolves previously yielded tasks with results that are only now known
The reason I want to do this is because I am processing a long iterable of inputs, and every so often I accumulate enough data from these inputs to send a batch API request and finalise my outputs.
Pseudocode:
IEnumerable<Task<Output>> Process(IEnumerable<Input> inputs)
{
var queuedInputs = Queue<Input>();
var cumulativeLength = 0;
foreach (var input in inputs)
{
yield return waiting task for this input
queuedInputs.Enqueue(input);
cumulativeLength += input.Length;
if (cumulativeLength > 10)
{
cumulativeLength = 0
GetFromAPI(queue).ContinueWith((apiTask) => {
Queue<BatchResult> batchResults = apiTask.Result;
while (queuedInputs.Count > 0)
{
batchResult = batchResults.Dequeue();
historicalInput = queuedInputs.Dequeue();
var output = MakeOutput(historicalInput, batchResult);
resolve earlier input's task with this output
}
});
}
}
}
The shape of your solution is going to be driven by the shape of your problem. There's a couple of questions I have because your problem domain seems odd:
Are all your inputs known at the outset? The (synchronous) IEnumerable<Input> implies they are.
Are you sure you want to wait for a batch of inputs before sending any query? What about the "remainder" if you're batching by 10 but have 55 inputs?
Assuming you do have synchronous inputs, and that you want to batch with remainders, you can just accumulate all your inputs immediately, batch them, and walk the batches, asynchronously providing outputs:
async IAsyncEnumerable<Output> Process(IEnumerable<Input> inputs)
{
foreach (var batchedInput in inputs.Batch(10))
{
var batchResults = await GetFromAPI(batchedInput);
for (int i = 0; i != batchedInput.Count; ++i)
yield return MakeOutput(batchedInput[i], batchResults[i]);
}
}
public static IEnumerable<IReadOnlyList<TSource>> Batch<TSource>(this IEnumerable<TSource> source, int size)
{
List<TSource>? batch = null;
foreach (var item in source)
{
batch ??= new List<TSource>(capacity: size);
batch.Add(item);
if (batch.Count == size)
{
yield return batch;
batch = null;
}
}
if (batch?.Count > 0)
yield return batch;
}
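To make the "remainder" behaviour concrete, here is a small sketch that repeats the Batch method so it compiles on its own and applies it to the 55-input case mentioned above. The BatchDemo class name is hypothetical:

```csharp
using System.Collections.Generic;
using System.Linq;

static class BatchDemo
{
    // Same logic as the Batch method above, repeated so this sketch is self-contained.
    public static IEnumerable<IReadOnlyList<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        List<T>? batch = null;
        foreach (var item in source)
        {
            batch ??= new List<T>(capacity: size);
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = null;
            }
        }
        // Emit the final, possibly shorter, batch.
        if (batch?.Count > 0)
            yield return batch;
    }

    // 55 inputs in batches of 10: five full batches, then a remainder of 5.
    public static List<int> BatchSizes() =>
        Enumerable.Range(1, 55).Batch(10).Select(b => b.Count).ToList();
}
```

So with this approach the remainder is simply sent as a final, smaller batch.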
Update:
If you want to start the API calls immediately, you can move those out of the loop:
async IAsyncEnumerable<Output> Process(IEnumerable<Input> inputs)
{
var batchedInputs = inputs.Batch(10).ToList();
var apiCallTasks = batchedInputs.Select(GetFromAPI).ToList();
for (int i = 0; i != apiCallTasks.Count; ++i)
{
var batchResults = await apiCallTasks[i];
var batchedInput = batchedInputs[i];
for (int j = 0; j != batchedInput.Count; ++j)
yield return MakeOutput(batchedInput[j], batchResults[j]);
}
}
One approach is to use the TPL Dataflow library. This library offers a variety of components named "blocks" (TransformBlock, ActionBlock etc), where each block is processing its input data, and then propagates the results to the next block. The blocks are linked together so that the completion of the previous block in the pipeline triggers the completion of the next block etc, until the final block which is usually an ActionBlock<T> with no output. Here is an example:
var block1 = new TransformBlock<int, string>(item =>
{
Thread.Sleep(1000); // Simulate synchronous work
return item.ToString();
}, new()
{
MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded,
EnsureOrdered = false
});
var block2 = new BatchBlock<string>(batchSize: 10);
var block3 = new ActionBlock<string[]>(async batch =>
{
await Task.Delay(1000); // Simulate asynchronous work
}); // The default MaxDegreeOfParallelism is 1
block1.LinkTo(block2, new() { PropagateCompletion = true });
block2.LinkTo(block3, new() { PropagateCompletion = true });
// Provide some input in the pipeline
block1.Post(1);
block1.Post(2);
block1.Post(3);
block1.Post(4);
block1.Post(5);
block1.Complete(); // Mark the first block as completed
await block3.Completion; // Await the completion of the last block
The TPL Dataflow library is powerful and flexible, but it has a weak point in the propagation of exceptions. There is no built-in way to instruct block1 to stop working if block3 fails. You can read more about this issue here. It might not be a serious issue if you don't expect your blocks to fail very often.
Assuming MyGenerator() returns List<Task<T>>, and the number of tasks is relatively small (even in the hundreds is probably fine) then you can use Task.WhenAny(), which returns the first Task that completes. Then remove that Task from the list, process the result, and move on to the next:
var tasks = MyGenerator();
while (tasks.Count > 0) {
var t = await Task.WhenAny(tasks); // asynchronously wait for the first task to finish
tasks.Remove(t);
var result = await t; // this won't actually wait since the task is already done
// Do something with result
}
There is a good discussion of this in an article by Stephen Toub, which explains in more detail, and gives alternatives if your task list is in the thousands: Processing tasks as they complete
There's also this article, but I think Stephen's is better written: Process asynchronous tasks as they complete (C#)
Using TaskCompletionSource:
IEnumerable<Task<Output>> Process(IEnumerable<Input> inputs)
{
var tcss = new List<TaskCompletionSource<Output>>();
var queue = new Queue<(Input, TaskCompletionSource<Output>)>();
var cumulativeLength = 0;
foreach (var input in inputs)
{
var tcs = new TaskCompletionSource<Output>();
queue.Enqueue((input, tcs));
tcss.Add(tcs);
cumulativeLength += input.Length;
if (cumulativeLength > 10)
{
cumulativeLength = 0;
var queueClone = new Queue<(Input, TaskCompletionSource<Output>)>(queue);
queue.Clear();
GetFromAPI(queueClone.Select(x => x.Item1)).ContinueWith((apiTask) => {
Queue<BatchResult> batchResults = apiTask.Result;
while (queueClone.Count > 0)
{
var batchResult = batchResults.Dequeue();
var (queuedInput, queuedTcs) = queueClone.Dequeue();
var output = MakeOutput(queuedInput, batchResult);
queuedTcs.SetResult(output);
}
});
}
}
GetFromAPI(queue.Select(x => x.Item1)).ContinueWith((apiTask) => {
Queue<BatchResult> batchResults = apiTask.Result;
while (queue.Count > 0)
{
var batchResult = batchResults.Dequeue();
var (queuedInput, queuedTcs) = queue.Dequeue();
var output = MakeOutput(queuedInput, batchResult);
queuedTcs.SetResult(output);
}
});
foreach (var tcs in tcss)
{
yield return tcs.Task;
}
}

C# Multithreading with slots

I have this function that checks proxy servers. Currently it starts a fixed number of threads and waits for all of them to finish before starting the next set. Is it possible to start a new thread as soon as one of the running ones finishes, up to the maximum allowed?
for (int i = 0; i < listProxies.Count(); i+=nThreadsNum)
{
for (nCurrentThread = 0; nCurrentThread < nThreadsNum; nCurrentThread++)
{
if (nCurrentThread < nThreadsNum)
{
string strProxyIP = listProxies[i + nCurrentThread].sIPAddress;
int nPort = listProxies[i + nCurrentThread].nPort;
tasks.Add(Task.Factory.StartNew<ProxyAddress>(() => CheckProxyServer(strProxyIP, nPort, nCurrentThread)));
}
}
Task.WaitAll(tasks.ToArray());
foreach (var tsk in tasks)
{
ProxyAddress result = tsk.Result;
UpdateProxyDBRecord(result.sIPAddress, result.bOnlineStatus);
}
tasks.Clear();
}
This seems much simpler:
int numberProcessed = 0;
Parallel.ForEach(listProxies,
new ParallelOptions { MaxDegreeOfParallelism = nThreadsNum },
(p)=> {
var result = CheckProxyServer(p.sIPAddress, p.nPort, Thread.CurrentThread.ManagedThreadId);
UpdateProxyDBRecord(result.sIPAddress, result.bOnlineStatus);
Interlocked.Increment(ref numberProcessed);
});
With slots:
var obj = new Object();
var slots = new List<int>();
Parallel.ForEach(listProxies,
new ParallelOptions { MaxDegreeOfParallelism = nThreadsNum },
(p)=> {
int threadId = Thread.CurrentThread.ManagedThreadId;
int slot = slots.IndexOf(threadId);
if (slot == -1)
{
lock(obj)
{
slots.Add(threadId);
}
slot = slots.IndexOf(threadId);
}
var result = CheckProxyServer(p.sIPAddress, p.nPort, slot);
UpdateProxyDBRecord(result.sIPAddress, result.bOnlineStatus);
});
I took a few shortcuts with thread safety there. You don't have to do the normal check-lock-check dance, because two threads will never attempt to add the same thread ID to the list, so the second check would always fail and isn't needed. For the same reason, I don't believe you ever need to lock around the outer IndexOf either. That makes this an efficient concurrent routine that rarely locks (it should only lock nThreadsNum times) no matter how many items are in the enumerable. One caveat: List<T> is only documented as safe for multiple readers while no thread is modifying it, so for strict correctness take the lock around the IndexOf calls as well.
Another solution is to use a SemaphoreSlim, or the producer-consumer pattern with a BlockingCollection<T>. Both solutions support cancellation.
SemaphoreSlim
private async Task CheckProxyServerAsync(IEnumerable<ProxyInfo> proxies)
{
var tasks = new List<Task>();
int currentThreadNumber = 0;
int maxNumberOfThreads = 8;
using (var semaphore = new SemaphoreSlim(maxNumberOfThreads, maxNumberOfThreads))
{
foreach (var proxy in proxies)
{
// Asynchronously wait until thread is available if thread limit reached
await semaphore.WaitAsync();
string proxyIP = proxy.IPAddress;
int port = proxy.Port;
tasks.Add(Task.Run(() => CheckProxyServer(proxyIP, port, Interlocked.Increment(ref currentThreadNumber)))
.ContinueWith(
(task) =>
{
ProxyAddress result = task.Result;
// Method call must be thread-safe!
UpdateProxyDbRecord(result.IPAddress, result.OnlineStatus);
Interlocked.Decrement(ref currentThreadNumber);
// Allow to start next thread if thread limit was reached
semaphore.Release();
},
TaskContinuationOptions.OnlyOnRanToCompletion));
}
// Asynchronously wait until all tasks are completed
// to prevent premature disposal of semaphore
await Task.WhenAll(tasks);
}
}
Producer-Consumer Pattern
// Uses a fixed number of same threads
private async Task CheckProxyServerAsync(IEnumerable<ProxyInfo> proxies)
{
var pipe = new BlockingCollection<ProxyInfo>();
int maxNumberOfThreads = 8;
var tasks = new List<Task>();
// Create all threads (count == maxNumberOfThreads)
for (int currentThreadNumber = 0; currentThreadNumber < maxNumberOfThreads; currentThreadNumber++)
{
// Copy the loop variable so each task captures its own value
int threadNumber = currentThreadNumber;
tasks.Add(
Task.Run(() => ConsumeProxyInfo(pipe, threadNumber)));
}
proxies.ToList().ForEach(pipe.Add);
pipe.CompleteAdding();
await Task.WhenAll(tasks);
}
private void ConsumeProxyInfo(BlockingCollection<ProxyInfo> proxiesPipe, int currentThreadNumber)
{
while (!proxiesPipe.IsCompleted)
{
if (proxiesPipe.TryTake(out ProxyInfo proxy))
{
int port = proxy.Port;
string proxyIP = proxy.IPAddress;
ProxyAddress result = CheckProxyServer(proxyIP, port, currentThreadNumber);
// Method call must be thread-safe!
UpdateProxyDbRecord(result.IPAddress, result.OnlineStatus);
}
}
}
If I'm understanding your question properly, this is actually fairly simple to do with await Task.WhenAny. Basically, you keep a collection of all of the running tasks. Once you reach a certain number of tasks running, you wait for one or more of your tasks to finish, and then you remove the tasks that were completed from your collection and continue to add more tasks.
Here's an example of what I mean below:
var tasks = new List<Task>();
for (int i = 0; i < 20; i++)
{
// I want my list of tasks to contain at most 5 tasks at once
if (tasks.Count == 5)
{
// Wait for at least one of the tasks to complete
await Task.WhenAny(tasks.ToArray());
// Remove all of the completed tasks from the list
tasks = tasks.Where(t => !t.IsCompleted).ToList();
}
// Add some task to the list
// Task.Run unwraps the async delegate, so the task completes only when the delegate does
tasks.Add(Task.Run(async () =>
{
await Task.Delay(1000);
}));
}
I suggest changing your approach slightly. Instead of starting and stopping threads, put your proxy server data in a concurrent queue, one item for each proxy server. Then create a fixed number of threads (or async tasks) to work on the queue. This is more likely to provide smooth performance (you aren't starting and stopping threads over and over, which has overhead) and is a lot easier to code, in my opinion.
A simple example:
class ProxyChecker
{
private ConcurrentQueue<ProxyInfo> _masterQueue = new ConcurrentQueue<ProxyInfo>();
public ProxyChecker(IEnumerable<ProxyInfo> listProxies)
{
foreach (var proxy in listProxies)
{
_masterQueue.Enqueue(proxy);
}
}
public async Task RunChecks(int maximumConcurrency)
{
// Never start more workers than there are items to process
var count = Math.Min(maximumConcurrency, _masterQueue.Count);
// Task.Run moves each synchronous worker loop onto a thread pool thread
var tasks = Enumerable.Range(0, count).Select( i => Task.Run(() => WorkerTask()) ).ToList();
await Task.WhenAll(tasks);
}
private void WorkerTask()
{
ProxyInfo proxyInfo;
while ( _masterQueue.TryDequeue(out proxyInfo))
{
DoTheTest(proxyInfo.IP, proxyInfo.Port);
}
}
}

Immediately process asynchronous results in the order they were requested

Suppose I kick off 5 async tasks, and I want to print the results in the order they were requested:
public async void RunTasks()
{
var tasks = new List<Task<int>>();
for(int i=1; i<=5; i++)
{
tasks.Add(DoSomething(i));
}
var results = await Task.WhenAll(tasks);
Console.WriteLine(String.Join(',', results));
}
public async Task<int> DoSomething(int taskNumber)
{
var random = new Random();
await Task.Delay(random.Next(5000));
return taskNumber;
}
This will always print "1,2,3,4,5" - because Task.WhenAll() orders the results by the order requested, not by the order in which they finished.
Unfortunately this means I have to wait for ALL Tasks to finish until I can print anything.
How might I instead print the result of each task as soon as it's finished, but still respecting the order they were requested?
So I should always see "1,2,3,4,5" - but it may arrive gradually:
"1"
"1,2,3"
"1,2,3,4"
"1,2,3,4,5"
(no need to worry about the actual reasoning for doing this, treat it as a fun problem)
var tasks = new List<Task<int>>();
for(int i=1; i<=5; i++)
{
tasks.Add(DoSomething(i));
}
foreach (var task in tasks)
{
var result = await task;
Console.WriteLine(result);
}
We kick off all of the tasks first, then loop over them in order, awaiting each in turn. If the task being awaited has previously completed, the await just returns its result. Otherwise we wait until it completes.
Try a TransformBlock. By default it outputs the items it processes one by one, in the order they were received, even if the elements are processed in parallel.
public async Task Order()
{
var tBlock = new TransformBlock<int, string>(async x =>
{
await Task.Delay(100);
return x.ToString();
}, new ExecutionDataflowBlockOptions() { MaxDegreeOfParallelism = 10 });
var sub = tBlock.AsObservable().Subscribe(x => Console.Write(x));
foreach (var num in Enumerable.Range(0, 10))
{
tBlock.Post(num);
}
tBlock.Complete();
await tBlock.Completion;
sub.Dispose();
}
Output:
0123456789

Task Parallel Library WaitAny with specified result

I'm trying to write some code that will make a web service call to a number of different servers in parallel, so TPL seems like the obvious choice to use.
Only one of my web service calls will ever return the result I want and all the others won't. I'm trying to work out a way of effectively having a Task.WaitAny but only unblocking when the first Task that matches a condition returns.
I tried with WaitAny but couldn't work out where to put the filter. I got this far:
public void SearchServers()
{
var servers = new[] {"server1", "server2", "server3", "server4"};
var tasks = servers
.Select(s => Task<bool>.Factory.StartNew(server => CallServer((string)server), s))
.ToArray();
Task.WaitAny(tasks); //how do I say "WaitAny where the result is true"?
//Omitted: cancel any outstanding tasks since the correct server has been found
}
private bool CallServer(string server)
{
//... make the call to the server and return the result ...
}
Edit: Quick clarification just in case there's any confusion above. I'm trying to do the following:
For each server, start a Task to check it
Either, wait until a server returns true (only a max of 1 server will ever return true)
Or, wait until all servers have returned false, i.e. there was no match.
The best of what I can think of is specifying a ContinueWith for each Task, checking the result, and if true cancelling the other tasks. For cancelling tasks you may want to use CancellationToken.
var tasks = servers
.Select(s => Task.Run(...)
.ContinueWith(t =>
{
if (t.Result) {
// cancel other threads
}
})
).ToArray();
UPDATE: An alternative solution would be to WaitAny until the right task completed (but it has some drawbacks, e.g. removing the finished tasks from the list and creating a new array out of the remaining ones is quite a heavy operation):
List<Task<bool>> tasks = servers.Select(s => Task<bool>.Factory.StartNew(server => CallServer((string)server), s)).ToList();
bool result;
do {
int idx = Task.WaitAny(tasks.ToArray());
result = tasks[idx].Result;
tasks.RemoveAt(idx);
} while (!result && tasks.Count > 0);
// cancel other tasks
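The same loop can be sketched without blocking, using await Task.WhenAny and a CancellationTokenSource. This is only an illustration under stated assumptions: FirstMatchDemo, FindServerAsync and FakeCallServer are hypothetical names, and the fake call simply delays and returns true for "server3":

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class FirstMatchDemo
{
    // Returns the first server whose check returns true, or null if none does.
    public static async Task<string?> FindServerAsync(
        IEnumerable<string> servers,
        Func<string, CancellationToken, Task<bool>> callServer)
    {
        using var cts = new CancellationTokenSource();

        // Start every check up front and remember which task belongs to which server.
        var pending = servers.ToDictionary(s => callServer(s, cts.Token), s => s);

        while (pending.Count > 0)
        {
            Task<bool> finished = await Task.WhenAny(pending.Keys);
            if (await finished)          // rethrows here if this check faulted
            {
                cts.Cancel();            // ask the remaining checks to stop
                return pending[finished];
            }
            pending.Remove(finished);
        }
        return null;                     // every server answered false
    }

    // Hypothetical stand-in for the real web service call.
    static async Task<bool> FakeCallServer(string server, CancellationToken ct)
    {
        await Task.Delay(server == "server3" ? 20 : 200, ct);
        return server == "server3";
    }

    public static Task<string?> RunAsync() =>
        FindServerAsync(new[] { "server1", "server2", "server3", "server4" }, FakeCallServer);
}
```

Keying the dictionary by task avoids rebuilding an array on every iteration, though Task.WhenAny still registers a continuation on each pending task every time it is called, so this is best suited to a small number of servers.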
UPDATE 2: Nowadays I would do it with Rx:
[Fact]
public async Task AwaitFirst()
{
var servers = new[] { "server1", "server2", "server3", "server4" };
var server = await servers
.Select(s => Observable
.FromAsync(ct => CallServer(s, ct))
.Where(p => p)
.Select(_ => s)
)
.Merge()
.FirstAsync();
output.WriteLine($"Got result from {server}");
}
private async Task<bool> CallServer(string server, CancellationToken ct)
{
try
{
if (server == "server1")
{
await Task.Delay(TimeSpan.FromSeconds(1), ct);
output.WriteLine($"{server} finished");
return false;
}
if (server == "server2")
{
await Task.Delay(TimeSpan.FromSeconds(2), ct);
output.WriteLine($"{server} finished");
return false;
}
if (server == "server3")
{
await Task.Delay(TimeSpan.FromSeconds(3), ct);
output.WriteLine($"{server} finished");
return true;
}
if (server == "server4")
{
await Task.Delay(TimeSpan.FromSeconds(4), ct);
output.WriteLine($"{server} finished");
return true;
}
}
catch(OperationCanceledException)
{
output.WriteLine($"{server} Cancelled");
throw;
}
throw new ArgumentOutOfRangeException(nameof(server));
}
The test takes 3.32 seconds on my machine (that means it didn't wait for the 4th server) and I got the following output:
server1 finished
server2 finished
server3 finished
server4 Cancelled
Got result from server3
You can use OrderByCompletion() from the AsyncEx library, which returns the tasks as they complete. Your code could look something like:
var tasks = servers
.Select(s => Task.Factory.StartNew(server => CallServer((string)server), s))
.OrderByCompletion();
foreach (var task in tasks)
{
if (task.Result)
{
Console.WriteLine("found");
break;
}
Console.WriteLine("not found yet");
}
// cancel any outstanding tasks since the correct server has been found
Using Interlocked.CompareExchange will do just that: only one task will be able to write to serverReturnedData.
public void SearchServers()
{
ResultClass serverReturnedData = null;
var servers = new[] {"server1", "server2", "server3", "server4"};
// Assumes CallServer returns a ResultClass here (null when the server has no match)
var tasks = servers.Select(s => Task.Factory.StartNew(server =>
{
var result = CallServer((string)server);
Interlocked.CompareExchange(ref serverReturnedData, result, null);
}, s)).ToArray();
Task.WaitAny(tasks); //how do I say "WaitAny where the result is true"?
//
// use serverReturnedData as you want.
//
}
EDIT: As Jasd said, the above code can return before serverReturnedData has a valid value (this can happen if a server returns null). To guard against that, you could wrap the result in a custom object.
Here's a generic solution based on svick's answer:
public static async Task<T> GetFirstResult<T>(
this IEnumerable<Func<CancellationToken, Task<T>>> taskFactories,
Action<Exception> exceptionHandler,
Predicate<T> predicate)
{
T ret = default(T);
var cts = new CancellationTokenSource();
var proxified = taskFactories.Select(tf => tf(cts.Token)).ProxifyByCompletion();
int i;
for (i = 0; i < proxified.Length; i++)
{
try
{
ret = await proxified[i].ConfigureAwait(false);
}
catch (Exception e)
{
exceptionHandler(e);
continue;
}
if (predicate(ret))
{
break;
}
}
if (i == proxified.Length)
{
throw new InvalidOperationException("No task returned the expected value");
}
cts.Cancel(); //we have our value, so we can cancel the rest of the tasks
for (int j = i+1; j < proxified.Length; j++)
{
//observe remaining tasks to prevent process crash
proxified[j].ContinueWith(
t => exceptionHandler(t.Exception), TaskContinuationOptions.OnlyOnFaulted)
.Forget();
}
return ret;
}
Where ProxifyByCompletion is implemented as:
public static Task<T>[] ProxifyByCompletion<T>(this IEnumerable<Task<T>> tasks)
{
var inputTasks = tasks.ToArray();
var buckets = new TaskCompletionSource<T>[inputTasks.Length];
var results = new Task<T>[inputTasks.Length];
for (int i = 0; i < buckets.Length; i++)
{
buckets[i] = new TaskCompletionSource<T>();
results[i] = buckets[i].Task;
}
int nextTaskIndex = -1;
foreach (var inputTask in inputTasks)
{
inputTask.ContinueWith(completed =>
{
var bucket = buckets[Interlocked.Increment(ref nextTaskIndex)];
if (completed.IsFaulted)
{
Trace.Assert(completed.Exception != null);
bucket.TrySetException(completed.Exception.InnerExceptions);
}
else if (completed.IsCanceled)
{
bucket.TrySetCanceled();
}
else
{
bucket.TrySetResult(completed.Result);
}
}, CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
}
return results;
}
And Forget is an empty method to suppress CS4014:
public static void Forget(this Task task) //suppress CS4014
{
}
