Updated to explain things more clearly
I've got an application that runs a number of tasks. Some are created initially and others can be added later. I need a programming structure that will wait for all the tasks to complete. Once all the tasks complete, some other code should run that cleans things up and does some final processing of the data generated by the other tasks.
I've come up with a way to do this, but wouldn't call it elegant. So I'm looking to see if there is a better way.
What I do is keep a list of the tasks in a ConcurrentBag (a thread safe collection). At the start of the process I create and add some tasks to the ConcurrentBag. As the process does its thing if a new task is created that also needs to finish before the final steps I also add it to the ConcurrentBag.
Task.WaitAll accepts an array of Tasks as its argument. I can convert the ConcurrentBag into an array, but that array won't include any Tasks added to the Bag after Task.WaitAll was called.
So I have a two-step wait process in a do-while loop. In the body of the loop I do a simple Task.WaitAll on the array generated from the Bag. When it completes, it means all the original tasks are done. Then, in the while test, I do a quick 1 millisecond Task.WaitAll on a new array generated from the ConcurrentBag. If no new tasks were added, or any new tasks have also completed, it returns true, so the negated condition exits the loop.
If it returns false (because a new task was added that hasn't completed), we go back and do a non-timed Task.WaitAll. Then rinse and repeat until all new and old tasks are done.
// defined on the class, perhaps they should be properties
CancellationTokenSource Source = new CancellationTokenSource();
CancellationToken Token => Source.Token; // a property: an instance field initializer cannot reference another instance field
ConcurrentBag<Task> ToDoList = new ConcurrentBag<Task>();
public void RunAndWait() {
// start some tasks add them to the list
for (int i = 0; i < 12; i++)
{
Task task = new Task(() => SillyExample(Token), Token);
ToDoList.Add(task);
task.Start();
}
// now wait for those task, and any other tasks added to ToDoList to complete
try
{
do
{
Task.WaitAll(ToDoList.ToArray(), Token);
} while (! Task.WaitAll(ToDoList.ToArray(), 1, Token));
}
catch (OperationCanceledException e)
{
// any special handling of cancel we might want to do
}
// code that should only run after all tasks complete
}
Is there a more elegant way to do this?
I'd recommend using a ConcurrentQueue and removing items as you wait for them. Due to the first-in-first-out nature of queues, if you get to the point where there's nothing left in the queue, you know that you've waited for all the tasks that have been added up to that point.
ConcurrentQueue<Task> ToDoQueue = new ConcurrentQueue<Task>();
...
while(ToDoQueue.Count > 0 && !Token.IsCancellationRequested)
{
Task task;
if(ToDoQueue.TryDequeue(out task))
{
task.Wait(Token);
}
}
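To make the queue idea concrete, here is a minimal self-contained sketch. The class name TaskTracker and its members are hypothetical, not from the answer above. It assumes every late task is added either before waiting starts or from inside a task that is itself still being tracked; under that assumption the new task is always in the queue before the task that created it finishes, so the loop cannot exit early.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper illustrating the ConcurrentQueue approach.
public sealed class TaskTracker
{
    private readonly ConcurrentQueue<Task> _queue = new ConcurrentQueue<Task>();

    // Register a task that must finish before WaitForAll returns.
    public void Add(Task task) => _queue.Enqueue(task);

    // Dequeue and wait on tasks until the queue is empty.
    // Task.Wait(token) throws OperationCanceledException if the token is
    // cancelled, and AggregateException if the task itself faulted.
    public void WaitForAll(CancellationToken token)
    {
        while (_queue.TryDequeue(out Task task))
        {
            task.Wait(token);
        }
    }
}
```

Looping on TryDequeue alone avoids the separate Count check, which could otherwise race with the dequeue.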
Here's a very cool way using Microsoft's Reactive Framework (NuGet "Rx-Main", since republished as "System.Reactive").
var taskSubject = new Subject<Task>();
var query = taskSubject.Select(t => Observable.FromAsync(() => t)).Merge();
var subscription =
query.Subscribe(
u => { /* Each Task Completed */ },
() => Console.WriteLine("All Tasks Completed."));
Now, to add tasks, just do this:
taskSubject.OnNext(Task.Run(() => { }));
taskSubject.OnNext(Task.Run(() => { }));
taskSubject.OnNext(Task.Run(() => { }));
And then to signal completion:
taskSubject.OnCompleted();
It is important to note that signalling completion doesn't complete the query immediately; the query still waits for all of the tasks to finish. Signalling completion just says that you will no longer add any new tasks.
Finally, if you want to cancel, then just do this:
subscription.Dispose();
Related
I need your help. I'm a complete zero with threads. All I need is to create a particular thread and stop it on command, BUT I don't create each thread in advance, as there will be a lot of them. I do it like this:
Thread thread = new Thread(() => Go(..... many many variables that are taken from the listview ......));
thread.Start();
So, as noted above, the variables are taken from the listview, which I load from a file, and then I start the threads I need. BUT the work in each thread is an infinite loop that only ends when I close the program completely, and I would like to stop a thread the same way I started it (right-click on the desired row, start/stop). As I said, I have never worked with threads and thought it would be simple: when you start a thread you assign it an ID, and you end it using the same ID. Alas, no. I have searched all over Google and have not found an EXAMPLE that suits me (I will repeat for the third time: I have never worked with threads, and I don't need to be told "go read about TPL"), so I am asking for help, preferably with an example.
I have a very bad idea: the list has an invisible column in which an id is generated at the start. When I send the command to start a thread, a unique variable is created, named for example int id1 = 0, and it is passed to the thread. On each iteration of the loop the thread checks whether id1 is 0 or 1: if 0 it continues, if 1 it stops. Logically, clicking the stop button changes the value to 1. But something tells me the holy spirit of multithreading will punish me for this once there are 100+ threads. I read this idea somewhere, so don't swear at me.
You do not need hundreds of threads for this. Your worker "threads" are performing HTTP requests, which can be done asynchronously without requiring a new thread. Also, hundreds of threads wouldn't really help you unless you have hundreds of CPU cores (you don't).
For this sort of work, I'd recommend the following:
Write a method that does all the work your thread does, but also checks a CancellationToken with each iteration.
Call the method in a loop, once for each account, and store the resulting tasks in an array or list. Or use LINQ (as I do in this example) to create the list.
When your program terminates, activate the CancellationToken.
After cancelling, you have to await all the tasks in order to observe any possible exceptions and exit cleanly.
For example
public async Task DoTheWork(Account account, CancellationToken token)
{
while (!token.IsCancellationRequested)
{
var result = await httpClient.GetAsync(account.Url);
await DoSomethingWithResult(result);
await Task.Delay(1000);
}
}
//Main program
var accounts = GetAccountList();
var source = new CancellationTokenSource();
var tasks = accounts.Select( x => DoTheWork(x, source.Token) ).ToList();
//When exiting
source.Cancel();
await Task.WhenAll( tasks );
source.Dispose();
Individual cancellation
Here's another approach that keeps a list of the accounts and a delegate that can be used for cancelling the task for that specific account.
//Declare this somewhere it will persist for the duration of the program
//The key to this dictionary is the account you wish to cancel
//The value is a delegate that you can call to cancel its task
Dictionary<Account, Func<Task>> _tasks = new Dictionary<Account, Func<Task>>();
async Task CreateTasks()
{
var accounts = GetAccounts();
foreach (var account in accounts)
{
var source = new CancellationTokenSource();
var task = DoTheWork(account, source.Token);
_tasks.Add(account, () => { source.Cancel(); return task; });
}
}
//Retrieve the delegate from the dictionary and call it to cancel its task
//Then await the task to observe any exceptions
//Then remove it from the list
async Task CancelTask(Account account)
{
var cancelAction = _tasks[account];
var task = cancelAction();
await task;
_tasks.Remove(account);
}
async Task CancelAllTasks()
{
var tasks = _tasks.Select(x => x.Value()).ToList();
await Task.WhenAll(tasks);
}
Right now, I've got a C# program that performs the following steps on a recurring basis:
Grab current list of tasks from the database
Using Parallel.ForEach(), do work for each task
However, some of these tasks are very long-running. This delays the processing of other pending tasks because we only look for new ones at the start of the program.
Now, I know that modifying the collection being iterated over isn't possible (right?), but is there some equivalent functionality in the C# Parallel framework that would allow me to add work to the list while also processing items in the list?
Generally speaking, you're right that modifying a collection while iterating it is not allowed. But there are other approaches you could be using:
Use ActionBlock<T> from TPL Dataflow. The code could look something like:
var actionBlock = new ActionBlock<MyTask>(
task => DoWorkForTask(task),
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });
while (true)
{
var tasks = GrabCurrentListOfTasks();
foreach (var task in tasks)
{
actionBlock.Post(task);
await Task.Delay(someShortDelay);
// or use Thread.Sleep() if you don't want to use async
}
}
Use BlockingCollection<T>, which can be modified while consuming items from it, along with GetConsumingPartitioner() from ParallelExtensionsExtras to make it work with Parallel.ForEach():
var collection = new BlockingCollection<MyTask>();
Task.Run(async () =>
{
while (true)
{
var tasks = GrabCurrentListOfTasks();
foreach (var task in tasks)
{
collection.Add(task);
await Task.Delay(someShortDelay);
}
}
});
Parallel.ForEach(collection.GetConsumingPartitioner(), task => DoWorkForTask(task));
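If pulling in ParallelExtensionsExtras is not an option, a similar producer/consumer shape can be sketched with the built-in GetConsumingEnumerable() and a few consumer tasks instead of Parallel.ForEach. This is a swapped-in variant rather than the code above; the class name, the int work items, and the doubling "work" are placeholders.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BlockingCollectionSketch
{
    // Starts consumerCount consumers, then feeds them items while they run.
    // Returns the processed results in ascending order.
    public static List<int> ProcessAll(IEnumerable<int> workItems, int consumerCount)
    {
        using (var collection = new BlockingCollection<int>())
        {
            var results = new ConcurrentBag<int>();

            // Each consumer blocks in GetConsumingEnumerable until an item
            // arrives, and its loop ends once CompleteAdding has been called
            // and the collection is empty.
            Task[] consumers = Enumerable.Range(0, consumerCount)
                .Select(_ => Task.Run(() =>
                {
                    foreach (int item in collection.GetConsumingEnumerable())
                    {
                        results.Add(item * 2); // placeholder for the real work
                    }
                }))
                .ToArray();

            // Producer: items can be added while consumers are already working.
            foreach (int item in workItems)
            {
                collection.Add(item);
            }
            collection.CompleteAdding();

            Task.WaitAll(consumers);
            return results.OrderBy(x => x).ToList();
        }
    }
}
```

In a long-running service the producer would keep polling the database and never call CompleteAdding until shutdown.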
Here is an example of an approach you could try. I think you want to get away from Parallel.ForEaching and do something with asynchronous programming instead because you need to retrieve results as they finish, rather than in discrete chunks that could conceivably contain both long running tasks and tasks that finish very quickly.
This approach uses a simple sequential loop to retrieve results from a list of asynchronous tasks. In this case, you should be safe to use a simple non-thread safe mutable list because all of the mutation of the list happens sequentially in the same thread.
Note that this approach uses Task.WhenAny in a loop which isn't very efficient for large task lists and you should consider an alternative approach in that case. (See this blog: http://blogs.msdn.com/b/pfxteam/archive/2012/08/02/processing-tasks-as-they-complete.aspx)
This example is based on: https://msdn.microsoft.com/en-GB/library/jj155756.aspx
private async Task<ProcessResult> processTask(ProcessTask task)
{
// do something intensive with data
}
private IEnumerable<ProcessTask> GetOutstandingTasks()
{
// retrieve some tasks from db
}
private async Task ProcessAllData() // must be async because the body uses await
{
List<Task<ProcessResult>> taskQueue =
GetOutstandingTasks()
.Select(tsk => processTask(tsk))
.ToList(); // grab initial task queue
while(taskQueue.Any()) // iterate while tasks need completing
{
Task<ProcessResult> firstFinishedTask = await Task.WhenAny(taskQueue); // get first to finish
taskQueue.Remove(firstFinishedTask); // remove the one that finished
ProcessResult result = await firstFinishedTask; // get the result
// do something with task result
taskQueue.AddRange(GetOutstandingTasks().Select(tsk => processTask(tsk))); // add more tasks that need performing
}
}
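For large task lists, where repeated Task.WhenAny calls become expensive, one alternative is to attach a single continuation to each task and hand the finished tasks out through pre-allocated slots. The following is a sketch of that idea under my own naming (CompletionOrderSketch and InCompletionOrder are hypothetical); it is not taken from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CompletionOrderSketch
{
    // Returns one wrapper task per input. Wrapper [0] completes with whichever
    // input finishes first, wrapper [1] with the second, and so on.
    public static Task<Task<T>>[] InCompletionOrder<T>(IEnumerable<Task<T>> tasks)
    {
        List<Task<T>> inputs = tasks.ToList();
        var slots = new TaskCompletionSource<Task<T>>[inputs.Count];
        var results = new Task<Task<T>>[inputs.Count];
        for (int i = 0; i < slots.Length; i++)
        {
            slots[i] = new TaskCompletionSource<Task<T>>();
            results[i] = slots[i].Task;
        }

        int nextSlot = -1;
        foreach (Task<T> input in inputs)
        {
            // Each finished task claims the next free slot exactly once.
            input.ContinueWith(
                done => { slots[Interlocked.Increment(ref nextSlot)].TrySetResult(done); },
                CancellationToken.None,
                TaskContinuationOptions.ExecuteSynchronously,
                TaskScheduler.Default);
        }
        return results;
    }
}
```

A caller would loop over the returned array with `var finished = await slot; var result = await finished;`, which costs one continuation per task rather than one WhenAny registration per task per iteration. Unlike the loop above, this version works on a fixed set of tasks, so newly discovered work would need a fresh call.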
First of all I am totally new to threading in C#. I have created multiple threads as shown below.
if (flag)
{
foreach (string empNo in empList)
{
Thread thrd = new Thread(()=>ComputeSalary(empNo));
threadList.Add(thrd);
thrd.Start();
}
}
Before proceeding further, I need to check whether at least one thread has completed its execution so that I can perform additional operations.
I also tried creating a list of threads and adding each thread to it, so that I could check whether at least one thread had completed. I tried thrd.IsAlive, but it only gives me the status of that one thread at that moment.
Is there any other way to check whether at least one thread has completed its execution?
You can use AutoResetEvent.
var reset = new AutoResetEvent(false); // ComputeSalary should have access to reset
.....
....
if (flag)
{
foreach (string empNo in empList)
{
Thread thrd = new Thread(()=>ComputeSalary(empNo));
threadList.Add(thrd);
thrd.Start();
}
reset.WaitOne();
}
.....
.....
void ComputeSalary(string empNo) // empNo is a string in the foreach above
{
.....
reset.Set(); // signal that this thread has finished
}
Other options are a callback function, an event, or a flag/counter (this is not advised).
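A self-contained version of the AutoResetEvent idea might look like the sketch below. The class and method names are hypothetical, and the work is passed in as plain Action delegates so the example does not depend on ComputeSalary.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class FirstThreadSketch
{
    // Starts one thread per work item and blocks until at least one of them
    // has finished. Returns all threads so the caller can Join the rest later.
    public static List<Thread> StartAllAndWaitForFirst(IEnumerable<Action> workItems)
    {
        // Deliberately not disposed here: slower threads still call Set()
        // after this method has returned.
        var firstDone = new AutoResetEvent(false);
        var threads = new List<Thread>();

        foreach (Action work in workItems)
        {
            var thread = new Thread(() =>
            {
                try { work(); }
                finally { firstDone.Set(); } // signal even if the work throws
            });
            threads.Add(thread);
            thread.Start();
        }

        if (threads.Count > 0)
        {
            firstDone.WaitOne(); // returns as soon as any thread has signalled
        }
        return threads;
    }
}
```

Calling Set() in a finally block means a thread that fails still counts as finished; whether that is the desired behaviour depends on the application.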
Here is a solution based on the Task Parallel Library:
// Create a list of tasks for each string in empList
List<Task> empTaskList = empList.Select(emp => Task.Run(() => ComputeSalary(emp)))
.ToList();
// Give me the task that finished first.
var firstFinishedTask = await Task.WhenAny(empTaskList);
A couple of things to note:
In order to use await inside your method, you will have to declare it as async Task or async Task<T>, where T is the desired return type
Task.Run is your equivalent of new Thread().Start(). The difference is Task.Run will use the ThreadPool (unless you explicitly tell it not to), and the Thread class will construct an entirely new thread.
Notice the use of await. This tells the compiler to yield control back to the caller until Task.WhenAny returns the first task that finished.
You should read more about async-await here
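Putting those notes together, a minimal end-to-end sketch could look like this. ComputeSalaryAsync is a stand-in that only simulates work with Task.Delay, and the dictionary of per-employee delays is an assumption made so the example has something to race.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAnySketch
{
    // Stand-in for the real salary computation: waits, then returns the employee number.
    private static async Task<string> ComputeSalaryAsync(string empNo, int delayMs)
    {
        await Task.Delay(delayMs);
        return empNo;
    }

    // Returns the employee number whose computation finished first.
    public static async Task<string> FirstFinishedAsync(IDictionary<string, int> delaysByEmployee)
    {
        List<Task<string>> tasks = delaysByEmployee
            .Select(pair => ComputeSalaryAsync(pair.Key, pair.Value))
            .ToList();

        Task<string> first = await Task.WhenAny(tasks);
        string winner = await first;   // rethrows if the first task faulted

        await Task.WhenAll(tasks);     // observe the remaining tasks as well
        return winner;
    }
}
```

Awaiting the task returned by WhenAny is what surfaces its result or exception; WhenAny itself never throws for a faulted input.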
So here's the situation: I need to make a call to a web site that starts a search. This search continues for an unknown amount of time, and the only way I know if the search has finished is by periodically querying the website to see if there's a "Download Data" link somewhere on it (it uses some strange ajax call on a javascript timer to check the backend and update the page, I think).
So here's the trick: I have hundreds of items I need to search for, one at a time. So I have some code that looks a little bit like this:
var items = getItems();
Parallel.ForEach(items, item =>
{
startSearch(item);
var finished = isSearchFinished(item);
while(finished == false)
{
finished = isSearchFinished(item); //<--- How do I delay this action 30 Secs?
}
downloadData(item);
});
Now, obviously this isn't the real code, because there could be things that cause isSearchFinished to always be false.
Obvious infinite loop danger aside, how would I correctly keep isSearchFinished() from calling over and over and over, but instead call every, say, 30 seconds or 1 minute?
I know Thread.Sleep() isn't the right solution, and I think the solution might be accomplished by using Threading.Timer() but I'm not very familiar with it, and there are so many threading options that I'm just not sure which to use.
It's quite easy to implement with tasks and async/await, as noted by @KevinS in the comments:
async Task<ItemData> ProcessItemAsync(Item item)
{
while (true)
{
if (await isSearchFinishedAsync(item))
break;
await Task.Delay(30 * 1000);
}
return await downloadDataAsync(item);
}
// ...
var items = getItems();
var tasks = items.Select(i => ProcessItemAsync(i)).ToArray();
await Task.WhenAll(tasks);
var data = tasks.Select(t => t.Result);
This way, you don't block ThreadPool threads in vain for what is mostly a bunch of I/O-bound network operations. If you're not familiar with async/await, the async-await tag wiki might be a good place to start.
I assume you can convert your synchronous methods isSearchFinished and downloadData to asynchronous versions using something like HttpClient for non-blocking HTTP requests, returning a Task<>. If you are unable to do so, you can still simply wrap them with Task.Run, as await Task.Run(() => isSearchFinished(item)) and await Task.Run(() => downloadData(item)). Normally this is not recommended, but as you have hundreds of items, it would still give you a much better level of concurrency than Parallel.ForEach in this case, because you won't be blocking pool threads for 30s, thanks to the asynchronous Task.Delay.
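The infinite-loop danger raised in the question can be bounded by passing a CancellationToken to Task.Delay. Below is a small sketch of such a polling helper; PollUntilAsync is a hypothetical name, and the check is passed in as a delegate so the example does not depend on the real isSearchFinished.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PollingSketch
{
    // Calls isFinished repeatedly, waiting `interval` between calls, until it
    // returns true. If the token is cancelled while waiting, Task.Delay throws
    // an OperationCanceledException and the polling stops.
    public static async Task PollUntilAsync(
        Func<Task<bool>> isFinished,
        TimeSpan interval,
        CancellationToken token)
    {
        while (!await isFinished())
        {
            await Task.Delay(interval, token);
        }
    }
}
```

A caller could create the token with new CancellationTokenSource(TimeSpan.FromMinutes(30)) (the 30 minutes being an arbitrary illustration) so that a search that never finishes is eventually abandoned instead of polling forever.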
You can also write a generic function using TaskCompletionSource and Threading.Timer to return a Task that becomes complete once a specified retry function succeeds.
public static Task RetryAsync(Func<bool> retryFunc, TimeSpan retryInterval)
{
return RetryAsync(retryFunc, retryInterval, CancellationToken.None);
}
public static Task RetryAsync(Func<bool> retryFunc, TimeSpan retryInterval, CancellationToken cancellationToken)
{
var tcs = new TaskCompletionSource<object>();
cancellationToken.Register(() => tcs.TrySetCanceled());
var timer = new Timer((state) =>
{
var taskCompletionSource = (TaskCompletionSource<object>) state;
try
{
if (retryFunc())
{
taskCompletionSource.TrySetResult(null);
}
}
catch (Exception ex)
{
taskCompletionSource.TrySetException(ex);
}
}, tcs, TimeSpan.FromMilliseconds(0), retryInterval);
// Once the task is complete, dispose of the timer so it doesn't keep firing. The closure also keeps a
// reference to the timer so it is not garbage collected while the task is pending.
tcs.Task.ContinueWith(t => timer.Dispose(),
CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
return tcs.Task;
}
You can then use RetryAsync like this:
var searchTasks = new List<Task>();
searchTasks.AddRange(items.Select(
downloadItem => RetryAsync( () => isSearchFinished(downloadItem), TimeSpan.FromSeconds(2)) // retry interval
.ContinueWith(t => downloadData(downloadItem),
CancellationToken.None,
TaskContinuationOptions.OnlyOnRanToCompletion,
TaskScheduler.Default)));
await Task.WhenAll(searchTasks.ToArray());
The ContinueWith part specifies what you do once the task has completed successfully. In this case it will run your downloadData method on a thread pool thread because we specified TaskScheduler.Default and the continuation will only execute if the task ran to completion, i.e. it was not canceled and no exception was thrown.
Here is sample code for starting multiple tasks:
Task.Factory.StartNew(() =>
{
//foreach (KeyValuePair<string, string> entry in dicList)
Parallel.ForEach(dicList,
entry =>
{
//create and add the Progress in UI thread
var ucProgress = (Progress)fpPanel.Invoke(createProgress, entry);
//execute ucProgress.Process(); in non-UI thread in parallel.
//the .Process(); must update UI by using *Invoke
ucProgress.Process();
System.Threading.Thread.SpinWait(5000000);
});
}) // no semicolon here, so that .ContinueWith below chains onto the task
.ContinueWith(task =>
{
//to handle exceptions use task.Exception member
var progressBar = (ProgressBar)task.AsyncState;
if (!task.IsCancelled)
{
//hide progress bar here and reset pb.Value = 0
}
},
TaskScheduler.FromCurrentSynchronizationContext() //update UI from UI thread
);
When we start multiple tasks using Task.Factory.StartNew(), we can use a .ContinueWith() block to determine when each task finishes; that is, the ContinueWith block fires once for each task that completes. I just want to know whether there is any mechanism in the TPL for the following: if I start 10 tasks using Task.Factory.StartNew(), how can I be notified when all 10 tasks have finished? Please give some insight with sample code.
if i start 10 task using Task.Factory.StartNew() so how do i notify after when 10 task will be finish
Three options:
The blocking Task.WaitAll call, which only returns when all the given tasks have completed
The async Task.WhenAll call, which returns a task which completes when all the given tasks have completed. (Introduced in .NET 4.5.)
TaskFactory.ContinueWhenAll, which adds a continuation task which will run when all the given tasks have completed.
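The three options can be compared side by side in a short sketch. The helper StartTen and the squaring "work" are placeholders chosen so that each variant produces a checkable sum.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class AllDoneSketch
{
    // Starts ten tasks; task i returns i * i.
    private static Task<int>[] StartTen() =>
        Enumerable.Range(1, 10)
            .Select(i => Task.Factory.StartNew(() => i * i))
            .ToArray();

    // Option 1: Task.WaitAll blocks the calling thread until all tasks finish.
    public static int SumBlocking()
    {
        Task<int>[] tasks = StartTen();
        Task.WaitAll(tasks);
        return tasks.Sum(t => t.Result);
    }

    // Option 2: Task.WhenAll returns a task that completes when all tasks finish.
    public static async Task<int> SumAsync()
    {
        int[] results = await Task.WhenAll(StartTen());
        return results.Sum();
    }

    // Option 3: ContinueWhenAll schedules a continuation that runs after all tasks finish.
    public static Task<int> SumWithContinuation()
    {
        return Task.Factory.ContinueWhenAll(StartTen(), done => done.Sum(t => t.Result));
    }
}
```

Options 2 and 3 do not block the caller; option 1 does, which matters on a UI thread.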
if i start 10 task using Task.Factory.StartNew() so how do i notify after when 10 task will be finish
You can use Task.WaitAll. This call will block current thread until all tasks are finished.
Side note: you seem to be using Task, Parallel and Thread.SpinWait, which makes your code complex. I would spend a bit of time analysing if that complexity is really necessary.
You can use WaitAll(). Example:
Func<bool> DummyMethod = () =>{
// When ready, send back complete!
return true;
};
// Create list of tasks
System.Threading.Tasks.Task<bool>[] tasks = new System.Threading.Tasks.Task<bool>[2];
// First task
var firstTask = System.Threading.Tasks.Task.Factory.StartNew(() => DummyMethod(), TaskCreationOptions.LongRunning);
tasks[0] = firstTask;
// Second task
var secondTask = System.Threading.Tasks.Task.Factory.StartNew(() => DummyMethod(), TaskCreationOptions.LongRunning);
tasks[1] = secondTask;
// Block until both tasks have finished (StartNew has already launched them)
System.Threading.Tasks.Task.WaitAll(tasks);
Another solution:
After all the operations inside Parallel.For(...) have completed, it returns a ParallelLoopResult object. From the documentation:
For returns a System.Threading.Tasks.ParallelLoopResult object when
all threads have completed. This return value is useful when you are
stopping or breaking loop iteration manually, because the
ParallelLoopResult stores information such as the last iteration that
ran to completion. If one or more exceptions occur on one of the
threads, a System.AggregateException will be thrown.
The ParallelLoopResult struct has an IsCompleted property that is false when Stop() or Break() has been called.
Example:
ParallelLoopResult result = Parallel.For(...);
if (result.IsCompleted)
{
//Start another task
}
Note that this is advised only when breaking or stopping the loop manually (otherwise just use WaitAll, WhenAll, etc.).
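To see the difference between a loop that ran to the end and one that was broken out of, here is a small sketch. The Run method and its breakAt parameter are made up for the illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class LoopResultSketch
{
    // Runs a parallel loop over [0, count). If breakAt has a value, the
    // iteration with that index calls Break(); otherwise the loop runs fully.
    public static ParallelLoopResult Run(int count, int? breakAt)
    {
        return Parallel.For(0, count, (i, state) =>
        {
            if (breakAt.HasValue && i == breakAt.Value)
            {
                state.Break(); // iterations with a lower index still run
            }
        });
    }
}
```

With Run(100, null) the result's IsCompleted is true. With Run(100, 10) it is false, and LowestBreakIteration reports 10, the lowest index at which Break() was called.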