C# How to remove a completed Task from Task[] - c#

I have an array of tasks running like this: (the number of tasks in the array is not fixed to 2, it can vary from time to time).
Task[] templates = new Task[2] {
Task.Factory.StartNew( () => { x.foo(); }),
Task.Factory.StartNew( () => { y.bar(); })
};
Then I wait for any of them to finish.
int taskID;
taskID = Task.WaitAny(templates);
As soon a task has finished I inspect various parameters inside the object (x or y) and if I don't have the result I expected I want to remove the finished task from the template array. something like this
template[taskID].RemoveThisTaskFromArray() // This is pseudo code to describe what I want.
taskID = Task.WaitAny(templates);
I will do this in a loop until the task array template[] is empty or I decide to abort the operation and break out of the loop.
int taskID;
do {
taskID = Task.WaitAny(templates);
if(template[taskID].valid == true) { // <-- example of test for success
// stop the remaining task in the template[] array.
break;
}
// remove the completed task so we can use WaitAany() again.
} while (template.Length > 0);
It seems like Task[] doesn't support array members so There is no Task[].RemoveAt() function.
I hope you get my point. Any way to solve this?

Just use a List<Task> instead. But also, use Task.Run instead of Task.Factory.StartNew for reasons described in detail here.
var templates = new List<Task> {
Task.Run( () => { x.foo(); }),
Task.Run( () => { y.bar(); })
};
int taskID;
do {
taskID = Task.WaitAny(templates.ToArray());
if(templates[taskID].valid == true) { // <-- example of test for success
// stop the remaining task in the template[] array.
break;
}
templates.RemoveAt(taskID);
} while (template.Count > 0);
This calls .ToArray(), which will creates a new array (preserving order) so is slightly inefficient, but it works. This is really the easiest way to give Task.WaitAll the array it's looking for, while preserving your ability to easily remove items.
But I don't know what you're trying to do with template[taskID].valid. Did you mean to use your templates (with an 's') array? (that's what I assumed) But Task doesn't have any property called valid. Did you mean to look at the Status property?
However, you can consider using async and await and use Task.WhenAny to avoid blocking the current thread while you wait. Task.WhenAll also accepts any IEnumerable, so you don't need to call .ToArray().
do {
var doneTask = await Task.WhenAny(templates);
// This will return any result, but also throw any exception that
// might have happened inside the task.
await doneTask;
templates.Remove(doneTask);
} while (templates.Count > 0);
There are some very well-written articles about asynchronous programming, starting here: Asynchronous programming with async and await

Remove completed task from list of Tasks can be done using linq like this
templates.RemoveAll(x => x.IsCompleted);

Related

Task.WaitAny() — Checking for results

I have a series of Tasks in an array. If a Task is "Good" it returns a string. If it's "Bad": it return a null.
I want to be able to run all the Tasks in parallel, and once the first one comes back that is "Good", then cancel the others and get the "Good" result.
I am doing this now, but the problem is that all the tasks need to run, then I loop through them looking for the first good result.
List<Task<string>> tasks = new List<Task<string>>();
Task.WaitAll(tasks.ToArray());
I want to be able to run all the Tasks in parallel, and once the first one comes back that is "Good", then cancel the others and get the "Good" result.
This is misunderstanding, since Cancellation in TPL is co-operative, so once the Task is started, there's no way to Cancel it. CancellationToken can work before Task is started or later to throw an exception, if Cancellation is requested, which is meant to initiate and take necessary action, like throw custom exception from the logic
Check the following query, it has many interesting answers listed, but none of them Cancel. Following is also a possible option:
public static class TaskExtension<T>
{
public static async Task<T> FirstSuccess(IEnumerable<Task<T>> tasks, T goodResult)
{
// Create a List<Task<T>>
var taskList = new List<Task<T>>(tasks);
// Placeholder for the First Completed Task
Task<T> firstCompleted = default(Task<T>);
// Looping till the Tasks are available in the List
while (taskList.Count > 0)
{
// Fetch first completed Task
var currentCompleted = await Task.WhenAny(taskList);
// Compare Condition
if (currentCompleted.Status == TaskStatus.RanToCompletion
&& currentCompleted.Result.Equals(goodResult))
{
// Assign Task and Clear List
firstCompleted = currentCompleted;
break;
}
else
// Remove the Current Task
taskList.Remove(currentCompleted);
}
return (firstCompleted != default(Task<T>)) ? firstCompleted.Result : default(T);
}
}
Usage:
var t1 = new Task<string>(()=>"bad");
var t2 = new Task<string>(()=>"bad");
var t3 = new Task<string>(()=>"good");
var t4 = new Task<string>(()=>"good");
var taskArray = new []{t1,t2,t3,t4};
foreach(var tt in taskArray)
tt.Start();
var finalTask = TaskExtension<string>.FirstSuccess(taskArray,"good");
Console.WriteLine(finalTask.Result);
You may even return Task<Task<T>>, instead of Task<T> for necessary logical processing
You can achieve your desired results using following example.
List<Task<string>> tasks = new List<Task<string>>();
// ***Use ToList to execute the query and start the tasks.
List<Task<string>> goodBadTasks = tasks.ToList();
// ***Add a loop to process the tasks one at a time until none remain.
while (goodBadTasks.Count > 0)
{
// Identify the first task that completes.
Task<string> firstFinishedTask = await Task.WhenAny(goodBadTasks);
// ***Remove the selected task from the list so that you don't
// process it more than once.
goodBadTasks.Remove(firstFinishedTask);
// Await the completed task.
string firstFinishedTaskResult = await firstFinishedTask;
if(firstFinishedTaskResult.Equals("good")
// do something
}
EDIT : If you want to terminate all the tasks you can use CancellationToken.
For more detail read the docs.
I was looking into Task.WhenAny() which will trigger on the first "completed" task. Unfortunately, a completed task in this sense is basically anything... even an exception is considered "completed". As far as I can tell there is no other way to check for what you call a "good" value.
While I don't believe there is a satisfactory answer for your question I think there may be an alternative solution to your problem. Consider using Parallel.ForEach.
Parallel.ForEach(tasks, (task, state) =>
{
if (task.Result != null)
state.Stop();
});
The state.Stop() will cease the execution of the Parallel loop when it finds a non-null result.
Besides having the ability to cease execution when it finds a "good" value, it will perform better under many (but not all) scenarios.
Use Task.WhenAny It returns the finished Task. Check if it's null. If it is, remove it from the List and call Task.WhenAny Again.
If it's good, Cancel all Tasks in the List (they should all have a CancellationTokenSource.Token.
Edit:
All Tasks should use the same CancellationTokenSource.Token. Then you only need to cancel once.
Here is some code to clarify:
private async void button1_Click(object sender, EventArgs e)
{
CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
List<Task<string>> tasks = new List<Task<string>>();
tasks.Add(Task.Run<string>(() => // run your tasks
{
while (true)
{
if (cancellationTokenSource.Token.IsCancellationRequested)
{
return null;
}
return "Result"; //string or null
}
}));
while (tasks.Count > 0)
{
Task<string> resultTask = await Task.WhenAny(tasks);
string result = await resultTask;
if (result == null)
{
tasks.Remove(resultTask);
}
else
{
// success
cancellationTokenSource.Cancel(); // will cancel all tasks
}
}
}

C# Parallel - Adding items to the collection being iterated over, or equivalent?

Right now, I've got a C# program that performs the following steps on a recurring basis:
Grab current list of tasks from the database
Using Parallel.ForEach(), do work for each task
However, some of these tasks are very long-running. This delays the processing of other pending tasks because we only look for new ones at the start of the program.
Now, I know that modifying the collection being iterated over isn't possible (right?), but is there some equivalent functionality in the C# Parallel framework that would allow me to add work to the list while also processing items in the list?
Generally speaking, you're right that modifying a collection while iterating it is not allowed. But there are other approaches you could be using:
Use ActionBlock<T> from TPL Dataflow. The code could look something like:
var actionBlock = new ActionBlock<MyTask>(
task => DoWorkForTask(task),
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });
while (true)
{
var tasks = GrabCurrentListOfTasks();
foreach (var task in tasks)
{
actionBlock.Post(task);
await Task.Delay(someShortDelay);
// or use Thread.Sleep() if you don't want to use async
}
}
Use BlockingCollection<T>, which can be modified while consuming items from it, along with GetConsumingParititioner() from ParallelExtensionsExtras to make it work with Parallel.ForEach():
var collection = new BlockingCollection<MyTask>();
Task.Run(async () =>
{
while (true)
{
var tasks = GrabCurrentListOfTasks();
foreach (var task in tasks)
{
collection.Add(task);
await Task.Delay(someShortDelay);
}
}
});
Parallel.ForEach(collection.GetConsumingPartitioner(), task => DoWorkForTask(task));
Here is an example of an approach you could try. I think you want to get away from Parallel.ForEaching and do something with asynchronous programming instead because you need to retrieve results as they finish, rather than in discrete chunks that could conceivably contain both long running tasks and tasks that finish very quickly.
This approach uses a simple sequential loop to retrieve results from a list of asynchronous tasks. In this case, you should be safe to use a simple non-thread safe mutable list because all of the mutation of the list happens sequentially in the same thread.
Note that this approach uses Task.WhenAny in a loop which isn't very efficient for large task lists and you should consider an alternative approach in that case. (See this blog: http://blogs.msdn.com/b/pfxteam/archive/2012/08/02/processing-tasks-as-they-complete.aspx)
This example is based on: https://msdn.microsoft.com/en-GB/library/jj155756.aspx
private async Task<ProcessResult> processTask(ProcessTask task)
{
// do something intensive with data
}
private IEnumerable<ProcessTask> GetOutstandingTasks()
{
// retreive some tasks from db
}
private void ProcessAllData()
{
List<Task<ProcessResult>> taskQueue =
GetOutstandingTasks()
.Select(tsk => processTask(tsk))
.ToList(); // grab initial task queue
while(taskQueue.Any()) // iterate while tasks need completing
{
Task<ProcessResult> firstFinishedTask = await Task.WhenAny(taskQueue); // get first to finish
taskQueue.Remove(firstFinishedTask); // remove the one that finished
ProcessResult result = await firstFinishedTask; // get the result
// do something with task result
taskQueue.AddRange(GetOutstandingTasks().Select(tsk => processData(tsk))) // add more tasks that need performing
}
}

WaitAll for Changing List<Task>

Updated to explain things more clearly
I've got an application that runs a number of tasks. Some are created initially and other can be added later. I need need a programming structure that will wait on all the tasks to complete. Once the all the tasks complete some other code should run that cleans things up and does some final processing of data generated by the other tasks.
I've come up with a way to do this, but wouldn't call it elegant. So I'm looking to see if there is a better way.
What I do is keep a list of the tasks in a ConcurrentBag (a thread safe collection). At the start of the process I create and add some tasks to the ConcurrentBag. As the process does its thing if a new task is created that also needs to finish before the final steps I also add it to the ConcurrentBag.
Task.Wait accepts an array of Tasks as its argument. I can convert the ConcurrentBag into an array, but that array won't include any Tasks added to the Bag after Task.Wait was called.
So I have a two step wait process in a do while loop. In the body of the loop I do a simple Task.Wait on the array generated from the Bag. When it completes it means all the original tasks are done. Then in the while test I do a quick 1 millisecond test of a new array generated from the ConcurrentBag. If no new tasks were added, or any new tasks also completed it will return true, so the not condition exits the loop.
If it returns false (because a new task was added that didn't complete) we go back and do a non-timed Task.Wait. Then rinse and repeat until all new and old tasks are done.
// defined on the class, perhaps they should be properties
CancellationTokenSource Source = new CancellationTokenSource();
CancellationToken Token = Source.Token;
ConcurrentBag<Task> ToDoList = new ConcurrentBag<Task>();
public void RunAndWait() {
// start some tasks add them to the list
for (int i = 0; i < 12; i++)
{
Task task = new Task(() => SillyExample(Token), Token);
ToDoList.Add(task);
task.Start();
}
// now wait for those task, and any other tasks added to ToDoList to complete
try
{
do
{
Task.WaitAll(ToDoList.ToArray(), Token);
} while (! Task.WaitAll(ToDoList.ToArray(), 1, Token));
}
catch (OperationCanceledException e)
{
// any special handling of cancel we might want to do
}
// code that should only run after all tasks complete
}
Is there a more elegant way to do this?
I'd recommend using a ConcurrentQueue and removing items as you wait for them. Due to the first-in-first-out nature of queues, if you get to the point where there's nothing left in the queue, you know that you've waited for all the tasks that have been added up to that point.
ConcurrentQueue<Task> ToDoQueue = new ConcurrentQueue<Task>();
...
while(ToDoQueue.Count > 0 && !Token.IsCancellationRequested)
{
Task task;
if(ToDoQueue.TryDequeue(out task))
{
task.Wait(Token);
}
}
Here's a very cool way using Microsoft's Reactive Framework (NuGet "Rx-Main").
var taskSubject = new Subject<Task>();
var query = taskSubject.Select(t => Observable.FromAsync(() => t)).Merge();
var subscription =
query.Subscribe(
u => { /* Each Task Completed */ },
() => Console.WriteLine("All Tasks Completed."));
Now, to add tasks, just do this:
taskSubject.OnNext(Task.Run(() => { }));
taskSubject.OnNext(Task.Run(() => { }));
taskSubject.OnNext(Task.Run(() => { }));
And then to signal completion:
taskSubject.OnCompleted();
It is important to note that signalling completion doesn't complete the query immediately, it will wait for all of the tasks to finish too. Signalling completion just says that you will no longer add any new tasks.
Finally, if you want to cancel, then just do this:
subscription.Dispose();

Speculative execution using the TPL

I have a List<Task<bool>> that I want to enumerate in parallel finding the first task to complete with a result of true and not waiting for or observe exceptions on any of the other tasks still pending.
var tasks = new List<Task<bool>>
{
Task.Delay(2000).ContinueWith(x => false),
Task.Delay(0).ContinueWith(x => true),
};
I have tried to use PLINQ to do something like:
var task = tasks.AsParallel().FirstOrDefault(t => t.Result);
Which executes in parallel, but doesn't return as soon as it finds a satisfying result. because accessing the Result property is blocking. In order for this to work using PLINQ, I'd have to write this aweful statement:
var cts = new CancellationTokenSource();
var task = tasks.AsParallel()
.FirstOrDefault(t =>
{
try
{
t.Wait(cts.Token);
if (t.Result)
{
cts.Cancel();
}
return t.Result;
}
catch (OperationCanceledException)
{
return false;
}
} );
I've written up an extension method that yields tasks as they complete like so.
public static class Exts
{
public static IEnumerable<Task<T>> InCompletionOrder<T>(this IEnumerable<Task<T>> source)
{
var tasks = source.ToList();
while (tasks.Any())
{
var t = Task.WhenAny(tasks);
yield return t.Result;
tasks.Remove(t.Result);
}
}
}
// and run like so
var task = tasks.InCompletionOrder().FirstOrDefault(t => t.Result);
But it feels like this is something common enough that there is a better way. Suggestions?
Maybe something like this?
var tcs = new TaskCompletionSource<Task<bool>>();
foreach (var task in tasks)
{
task.ContinueWith((t, state) =>
{
if (t.Result)
{
((TaskCompletionSource<Task<bool>>)state).TrySetResult(t);
}
},
tcs,
TaskContinuationOptions.OnlyOnRanToCompletion |
TaskContinuationOptions.ExecuteSynchronously);
}
var firstTaskToComplete = tcs.Task;
Perhaps you could try the Rx.Net library. Its very good for in effect providing Linq to Work.
Try this snippet in LinqPad after you reference the Microsoft Rx.Net assemblies.
using System
using System.Linq
using System.Reactive.Concurrency
using System.Reactive.Linq
using System.Reactive.Threading.Tasks
using System.Threading.Tasks
void Main()
{
var tasks = new List<Task<bool>>
{
Task.Delay(2000).ContinueWith(x => false),
Task.Delay(0).ContinueWith(x => true),
};
var observable = (from t in tasks.ToObservable()
//Convert task to an observable
let o = t.ToObservable()
//SelectMany
from x in o
select x);
var foo = observable
.SubscribeOn(Scheduler.Default) //Run the tasks on the threadpool
.ToList()
.First();
Console.WriteLine(foo);
}
First, I don't understand why are you trying to use PLINQ here. Enumerating a list of Tasks shouldn't take long, so I don't think you're going to gain anything from parallelizing it.
Now, to get the first Task that already completed with true, you can use the (non-blocking) IsCompleted property:
var task = tasks.FirstOrDefault(t => t.IsCompleted && t.Result);
If you wanted to get a collection of Tasks, ordered by their completion, have a look at Stephen Toub's article Processing tasks as they complete. If you want to list those that return true first, you would need to modify that code. If you don't want to modify it, you can use a version of this approach from Stephen Cleary's AsyncEx library.
Also, in the specific case in your question, you could “fix” your code by adding .WithMergeOptions(ParallelMergeOptions.NotBuffered) to the PLINQ query. But doing so still wouldn't work most of the time and can waste threads a lot even when it does. That's because PLINQ uses a constant number of threads and partitioning and using Result would block those threads most of the time.

Async/await tasks and WaitHandle

Say I have 10N items(I need to fetch them via http protocol), in the code N Tasks are started to get data, each task takes 10 items in sequence. I put the items in a ConcurrentQueue<Item>. After that, the items are processed in a thread-unsafe method one by one.
async Task<Item> GetItemAsync()
{
//fetch one item from the internet
}
async Task DoWork()
{
var tasks = new List<Task>();
var items = new ConcurrentQueue<Item>();
var handles = new List<ManualResetEvent>();
for i 1 -> N
{
var handle = new ManualResetEvent(false);
handles.Add(handle);
tasks.Add(Task.Factory.StartNew(async delegate
{
for j 1 -> 10
{
var item = await GetItemAsync();
items.Enqueue(item);
}
handle.Set();
});
}
//begin to process the items when any handle is set
WaitHandle.WaitAny(handles);
while(true)
{
if (all handles are set && items collection is empty) //***
break;
//in another word: all tasks are really completed
while(items.TryDequeue(out item))
{
AThreadUnsafeMethod(item); //process items one by one
}
}
}
I don't know what if condition can be placed in the statement marked ***. I can't use Task.IsCompleted property here, because I use await in the task, so the task is completed very soon. And a bool[] that indicates whether the task is executed to the end looks really ugly, because I think ManualResetEvent can do the same work. Can anyone give me a suggestion?
Well, you could build this yourself, but I think it's tons easier with TPL Dataflow.
Something like:
static async Task DoWork()
{
// By default, ActionBlock uses MaxDegreeOfParallelism == 1,
// so AThreadUnsafeMethod is not called in parallel.
var block = new ActionBlock<Item>(AThreadUnsafeMethod);
// Start off N tasks, each asynchronously acquiring 10 items.
// Each item is sent to the block as it is received.
var tasks = Enumerable.Range(0, N).Select(Task.Run(
async () =>
{
for (int i = 0; i != 10; ++i)
block.Post(await GetItemAsync());
})).ToArray();
// Complete the block when all tasks have completed.
Task.WhenAll(tasks).ContinueWith(_ => { block.Complete(); });
// Wait for the block to complete.
await block.Completion;
}
You can do a WaitOne with a timeout of zero to check the state. Something like this should work:
if (handles.All(handle => handle.WaitOne(TimeSpan.Zero)) && !items.Any())
break;
http://msdn.microsoft.com/en-us/library/cc190477.aspx
Thanks all. At last I found CountDownEvent is very suitable for this scenario. The general implementation looks like this:(for others' information)
for i 1 -> N
{
//start N tasks
//invoke CountDownEvent.Signal() at the end of each task
}
//see if CountDownEvent.IsSet here

Categories

Resources