This is probably a pretty basic question, but it's something I wanted to make sure I had right in my head.
Today I was digging into the TPL and found that there are two ways of creating an instance of the Task class.
Way I
Task<int> t1 = Task.Factory.StartNew(() =>
{
//Some code
return 100;
});
Way II
TaskCompletionSource<int> task = new TaskCompletionSource<int>();
Task t2 = task.Task;
task.SetResult(100);
Now, I just wanted to know:
Is there any difference between these instances?
If so, what is it?
The second example does not create a "real" task, i.e. there is no delegate that does anything.
You use it mostly to present a Task interface to the caller. Look at the example on MSDN:
TaskCompletionSource<int> tcs1 = new TaskCompletionSource<int>();
Task<int> t1 = tcs1.Task;
// Start a background task that will complete tcs1.Task
Task.Factory.StartNew(() =>
{
Thread.Sleep(1000);
tcs1.SetResult(15);
});
// The attempt to get the result of t1 blocks the current thread until the completion source gets signaled.
// It should be a wait of ~1000 ms.
Stopwatch sw = Stopwatch.StartNew();
int result = t1.Result;
sw.Stop();
Console.WriteLine("(ElapsedTime={0}): t1.Result={1} (expected 15) ", sw.ElapsedMilliseconds, result);
As you are not starting any asynchronous operation in Way I above, you are wasting time by consuming another thread from the thread pool (assuming you haven't changed the default TaskScheduler).
In Way II, however, you are producing an already-completed task, and you do it on the same thread you are running on. A TaskCompletionSource task can also be seen as a threadless task (probably not the most accurate description, but one that several devs use).
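As a rough illustration of the "present a Task interface to the caller" idea, here is a minimal sketch that adapts a callback-style API to a Task<int>. FetchNumber and FetchNumberAsync are hypothetical names invented for this example; only TaskCompletionSource<int> and its SetResult/SetException methods come from the framework.

```csharp
using System;
using System.Threading.Tasks;

public static class TcsDemo
{
    // Hypothetical callback-style API, invented for this example.
    public static void FetchNumber(Action<int> onDone, Action<Exception> onError, bool fail)
    {
        if (fail) onError(new InvalidOperationException("fetch failed"));
        else onDone(100);
    }

    // Exposes the callback API as a Task<int>. No delegate is scheduled on
    // the thread pool; the task simply completes when a callback fires.
    public static Task<int> FetchNumberAsync(bool fail)
    {
        var tcs = new TaskCompletionSource<int>();
        FetchNumber(r => tcs.SetResult(r), e => tcs.SetException(e), fail);
        return tcs.Task;
    }
}
```

The caller only ever sees a Task<int>, so it can wait on it or attach continuations exactly as it would with a task created by StartNew.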
Related
I'm currently trying to write a status checking tool with a reliable timeout value. One way I'd seen how to do this was using Task.WhenAny() and including a Task.Delay, however it doesn't seem to produce the results I expect:
public void DoIUnderstandTasksTest()
{
var checkTasks = new List<Task>();
// Create a list of dummy tasks that should just delay or "wait"
// for some multiple of the timeout
for (int i = 0; i < 10; i++)
{
checkTasks.Add(Task.Delay(_timeoutMilliseconds/2));
}
// Wrap the group of tasks in a task that will wait till they all finish
var allChecks = Task.WhenAll(checkTasks);
// I think WhenAny is supposed to return the first task that completes
bool didntTimeOut = Task.WhenAny(allChecks, Task.Delay(_timeoutMilliseconds)) == allChecks;
Assert.True(didntTimeOut);
}
What am I missing here?
I think you're confusing the workings of the When... calls with Wait....
Task.WhenAny doesn't return the first completed task directly. It returns a new Task<Task> that completes when any of the supplied tasks finishes, and whose Result is that first finished task. This means your equality check will always be false: the wrapper task is never the same object as allChecks. You would have to compare the wrapper's Result (or the awaited value) instead.
The behavior you're expecting seems similar to Task.WaitAny, which will block current execution until any of the internal tasks complete, and return the index of the completed task.
Using WaitAny, your code will look like this:
// Wrap the group of tasks in a task that will wait till they all finish
var allChecks = Task.WhenAll(checkTasks);
var taskIndexThatCompleted = Task.WaitAny(allChecks, Task.Delay(_timeoutMilliseconds));
Assert.AreEqual(0, taskIndexThatCompleted);
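If you would rather keep Task.WhenAny, a sketch along the following lines should also work: take the Result of the task that WhenAny returns and compare that with the work task. CompletesWithin is a hypothetical helper name, and blocking on Result is used here only to keep the example short; in an async method you would await instead.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Hypothetical helper: returns true if 'work' finishes before the timeout.
    public static bool CompletesWithin(Task work, int timeoutMilliseconds)
    {
        Task timeout = Task.Delay(timeoutMilliseconds);
        // WhenAny yields a Task<Task>; its Result is the inner task
        // that finished first.
        Task first = Task.WhenAny(work, timeout).Result;
        return first == work;
    }
}
```

For example, CompletesWithin(Task.WhenAll(checkTasks), _timeoutMilliseconds) would express the original test's intent.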
I've the following code:
static void Main(string[] args)
{
IEnumerable<int> threadsIds = Enumerable.Range(1, 1000);
DateTime globalStart = DateTime.Now;
Console.WriteLine("{0:s.fff} Starting tasks", globalStart);
Parallel.ForEach(threadsIds, (threadsId) =>
{
DateTime taskStart = DateTime.Now;
const int sleepDuration = 1000;
Console.WriteLine("{1:s.fff} Starting task {0}, sleeping for {2}", threadsId, taskStart, sleepDuration);
Thread.Sleep(sleepDuration);
DateTime taskFinish = DateTime.Now;
Console.WriteLine("{1:s.fff} Ending task {0}, task duration {2}", threadsId, taskFinish, taskFinish- taskStart);
});
DateTime globalFinish= DateTime.Now;
Console.WriteLine("{0:s.fff} Tasks finished. Total duration: {1}", globalFinish, globalFinish-globalStart);
Console.ReadLine();
}
Currently, it takes ~60 seconds to run. From what I understand, that's because .NET doesn't create one thread per task but shares a few threads among all the tasks, and when I call Thread.Sleep, I prevent that thread from executing other tasks.
In my real case, I have some work to do in parallel, and in case of failure I have to wait some amount of time before trying again.
I'm looking for something other than Thread.Sleep that would allow other tasks to run during the "sleep time" of a task.
Unfortunately, I'm currently running .NET 4, which prevents me from using async and await (which I guess could have helped me in this case).
PS: I got the same results by:
putting Task.Delay(sleepDuration).Wait()
Not using Parallel.Foreach, but a foreach with a Task.Factory.StartNew
PS2: I know that I can do my real case differently, but I'm very interested in understanding how it could be achieved this way.
You are on the right path. Task.Delay(timespan) is the solution for your problem. Since you cannot use async/await, you have to write a bit more code to achieve the desired result.
Think about using Task.ContinueWith() method, for example:
Task.Run(() => { /* code before Thread.Sleep */ })
.ContinueWith(task => Task.Delay(sleepDuration)
.ContinueWith(task2 => { /* code after Thread.Sleep */ }));
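You will also need a way to share state between the subtasks. Lambdas nested in the same method capture its local variables automatically; if the continuations live in separate methods, put the shared values in a small class.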
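To make the chaining idea a bit more concrete, here is a small sketch that flattens the nested continuation with Unwrap(), so the caller gets a single task that finishes only after the code following the pause has run. RunWithPause and the log contents are made up for illustration; note that Task.Run and Task.Delay require .NET 4.5 or later.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;

public static class DelayChainDemo
{
    // Hypothetical helper: runs "before" code, pauses without blocking a
    // thread, then runs "after" code and returns what was logged.
    public static Task<string> RunWithPause(int pauseMilliseconds)
    {
        var log = new StringBuilder(); // local captured by the lambdas below

        return Task.Run(() => { log.Append("before;"); })
            // Returning the delay task makes this a Task<Task> ...
            .ContinueWith(_ => Task.Delay(pauseMilliseconds))
            // ... which Unwrap() turns into a Task that completes
            // only once the delay itself has elapsed.
            .Unwrap()
            .ContinueWith(_ => log.Append("after").ToString());
    }
}
```

No thread sleeps during the pause: the delay is driven by a timer, and the final continuation is scheduled only when it fires.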
If you want to create a task that will run polling every second some condition, you could try the following code:
Task PollTask(Func<bool> condition)
{
TaskCompletionSource<bool> tcs = new TaskCompletionSource<bool>();
PollTaskImpl(tcs, condition);
return tcs.Task;
}
void PollTaskImpl(TaskCompletionSource<bool> tcs, Func<bool> condition)
{
if (condition())
tcs.SetResult(true);
else
Task.Delay(1000).ContinueWith(_ => PollTaskImpl(tcs, condition));
}
Don't worry about creating a new task every second: ContinueWith and async/await do the same thing internally.
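The polling code relies on Task.Delay, which only exists from .NET 4.5 onwards. On plain .NET 4, a comparable delay task can be approximated with a TaskCompletionSource and a System.Threading.Timer. This is only a sketch; Net4Delay.After is a hypothetical name, not a framework API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Net4Delay
{
    // Hypothetical stand-in for Task.Delay: completes after the given time
    // without blocking any thread while waiting.
    public static Task After(int milliseconds)
    {
        var tcs = new TaskCompletionSource<bool>();
        Timer timer = null;
        // Create the timer disabled so that 'timer' is assigned before
        // the callback can possibly run.
        timer = new Timer(_ =>
        {
            timer.Dispose();
            tcs.TrySetResult(true);
        }, null, Timeout.Infinite, Timeout.Infinite);
        timer.Change(milliseconds, Timeout.Infinite); // fire once
        return tcs.Task;
    }
}
```

With this in place, Net4Delay.After(1000).ContinueWith(...) could be used in the same position as Task.Delay(1000).ContinueWith(...).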
Updated to explain things more clearly
I've got an application that runs a number of tasks. Some are created initially and others can be added later. I need a programming structure that will wait for all the tasks to complete. Once all the tasks complete, some other code should run that cleans things up and does some final processing of the data generated by the other tasks.
I've come up with a way to do this, but wouldn't call it elegant. So I'm looking to see if there is a better way.
What I do is keep a list of the tasks in a ConcurrentBag (a thread safe collection). At the start of the process I create and add some tasks to the ConcurrentBag. As the process does its thing if a new task is created that also needs to finish before the final steps I also add it to the ConcurrentBag.
Task.WaitAll accepts an array of Tasks as its argument. I can convert the ConcurrentBag into an array, but that array won't include any Tasks added to the Bag after Task.WaitAll was called.
So I have a two-step wait process in a do-while loop. In the body of the loop I do a simple Task.WaitAll on the array generated from the Bag. When it completes, all the original tasks are done. Then in the while test I do a quick 1-millisecond timed Task.WaitAll on a new array generated from the ConcurrentBag. If no new tasks were added, or any new tasks have also completed, it returns true, so the negated condition exits the loop.
If it returns false (because a new task was added that hasn't completed), we go back and do a non-timed Task.WaitAll. Rinse and repeat until all new and old tasks are done.
// defined on the class
CancellationTokenSource Source = new CancellationTokenSource();
// a property, because a field initializer cannot refer to another instance field
CancellationToken Token { get { return Source.Token; } }
ConcurrentBag<Task> ToDoList = new ConcurrentBag<Task>();
public void RunAndWait() {
// start some tasks add them to the list
for (int i = 0; i < 12; i++)
{
Task task = new Task(() => SillyExample(Token), Token);
ToDoList.Add(task);
task.Start();
}
// now wait for those task, and any other tasks added to ToDoList to complete
try
{
do
{
Task.WaitAll(ToDoList.ToArray(), Token);
} while (! Task.WaitAll(ToDoList.ToArray(), 1, Token));
}
catch (OperationCanceledException e)
{
// any special handling of cancel we might want to do
}
// code that should only run after all tasks complete
}
Is there a more elegant way to do this?
I'd recommend using a ConcurrentQueue and removing items as you wait for them. Due to the first-in-first-out nature of queues, if you get to the point where there's nothing left in the queue, you know that you've waited for all the tasks that have been added up to that point.
ConcurrentQueue<Task> ToDoQueue = new ConcurrentQueue<Task>();
...
while(ToDoQueue.Count > 0 && !Token.IsCancellationRequested)
{
Task task;
if(ToDoQueue.TryDequeue(out task))
{
task.Wait(Token);
}
}
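Here is a self-contained sketch of the queue approach in which a running task adds a follow-up task. It assumes, as a simplification, that every task enqueues its follow-ups before it finishes; otherwise the loop could drain the queue too early. QueueWaitDemo and its counter are illustrative names only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class QueueWaitDemo
{
    // Returns how many tasks had completed once the queue was drained.
    public static int Run()
    {
        var queue = new ConcurrentQueue<Task>();
        int completed = 0;

        // The first task enqueues a follow-up task before it finishes.
        queue.Enqueue(Task.Run(() =>
        {
            queue.Enqueue(Task.Run(() => { Interlocked.Increment(ref completed); }));
            Interlocked.Increment(ref completed);
        }));

        // Wait for tasks in FIFO order until nothing is left in the queue.
        Task task;
        while (queue.TryDequeue(out task))
        {
            task.Wait();
        }
        return completed;
    }
}
```

Because the follow-up is already in the queue by the time the first Wait returns, the loop picks it up on the next iteration.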
Here's a very cool way using Microsoft's Reactive Framework (NuGet "Rx-Main", since renamed to "System.Reactive").
var taskSubject = new Subject<Task>();
var query = taskSubject.Select(t => Observable.FromAsync(() => t)).Merge();
var subscription =
query.Subscribe(
u => { /* Each Task Completed */ },
() => Console.WriteLine("All Tasks Completed."));
Now, to add tasks, just do this:
taskSubject.OnNext(Task.Run(() => { }));
taskSubject.OnNext(Task.Run(() => { }));
taskSubject.OnNext(Task.Run(() => { }));
And then to signal completion:
taskSubject.OnCompleted();
It is important to note that signalling completion doesn't complete the query immediately; the query still waits for all of the tasks to finish. Signalling completion just says that you will not add any more tasks.
Finally, if you want to cancel, then just do this:
subscription.Dispose();
I have sample code that compares the processing time of the Parallel approach and the Task approach. The goal of this experiment is to understand how they work.
So my questions are:
Why did Parallel work faster than Task?
Do my results mean that I should use Parallel instead of Task?
Where should I use Task and where Parallel?
What are the benefits of using Task compared to Parallel?
Is Task just a wrapper for the ThreadPool.QueueUserWorkItem method?
public Task SomeLongOperation()
{
return Task.Delay(3000);
}
static void Main(string[] args)
{
Program p = new Program();
List<Task> tasks = new List<Task>();
tasks.Add(Task.Factory.StartNew(() => p.SomeLongOperation()));
tasks.Add(Task.Factory.StartNew(() => p.SomeLongOperation()));
var arr = tasks.ToArray();
Stopwatch sw = Stopwatch.StartNew();
Task.WaitAll(arr);
Console.WriteLine("Task wait all results: " + sw.Elapsed);
sw.Stop();
sw = Stopwatch.StartNew();
Parallel.Invoke(() => p.SomeLongOperation(), () => p.SomeLongOperation());
Console.WriteLine("Parallel invoke results: " + sw.Elapsed);
sw.Stop();
Console.ReadKey();
}
Here are my processing results:
EDIT:
Changed code to look like this:
Program p = new Program();
Task[] tasks = new Task[2];
Stopwatch sw = Stopwatch.StartNew();
tasks[0] = Task.Factory.StartNew(() => p.SomeLongOperation());
tasks[1] = Task.Factory.StartNew(() => p.SomeLongOperation());
Task.WaitAll(tasks);
Console.WriteLine("Task wait all results: " + sw.Elapsed);
sw.Stop();
sw = Stopwatch.StartNew();
Parallel.Invoke(() => p.SomeLongOperation(), () => p.SomeLongOperation());
Console.WriteLine("Parallel invoke results: " + sw.Elapsed);
sw.Stop();
My new results:
EDIT 2:
When I swapped the code so that Parallel.Invoke ran first and Task.WaitAll second, the situation changed drastically: now Parallel is slower. That makes me doubt my measurements, so I changed the code to look like this:
Program p = new Program();
Task[] tasks = new Task[2];
Stopwatch sw = null;
for (int i = 0; i < 10; i++)
{
sw = Stopwatch.StartNew();
Parallel.Invoke(() => p.SomeLongOperation(), () => p.SomeLongOperation());
string res = sw.Elapsed.ToString();
Console.WriteLine("Parallel invoke results: " + res);
sw.Stop();
}
for (int i = 0; i < 10; i++)
{
sw = Stopwatch.StartNew();
tasks[0] = Task.Factory.StartNew(() => p.SomeLongOperation());
tasks[1] = Task.Factory.StartNew(() => p.SomeLongOperation());
Task.WaitAll(tasks);
string res2 = sw.Elapsed.ToString();
Console.WriteLine("Task wait all results: " + res2);
sw.Stop();
}
And here are my new results:
Now I'd say this experiment is much clearer. The results are almost the same: sometimes Parallel is faster and sometimes Task. Now my questions are:
1. Where should I use Task and where Parallel?
2. What are the benefits of using Task compared to Parallel?
3. Is Task just a wrapper for the ThreadPool.QueueUserWorkItem method?
Any helpful info that can clarify these questions is welcome.
EDIT as of this article from MSDN:
Both Parallel and Task are wrappers around the ThreadPool. Parallel.Invoke also waits until all of its actions have finished.
Related to your questions:
Whether to use Task, Parallel or ThreadPool depends on how much control you need over the execution of your parallel tasks. I've personally gotten used to Task.Factory.StartNew(), but that's a personal preference. The same goes for ThreadPool.QueueUserWorkItem().
Additional Information: The first call to Parallel.Invoke() and Task.Factory.StartNew() might be slower due to internal initialization.
If you start non-generic Tasks (i.e. "void Tasks" without a return value) and immediately Wait for them, use Parallel.Invoke instead. Your intent is immediately clear to the reader.
Use Tasks if:
you do not Wait immediately
you need return values
you need to give parameters to the methods called
you require TaskCreationOptions functionality
you require CancellationToken or TaskScheduler functionality and don't want to use ParallelOptions
basically, if you want more options or control
Yes, you can get around some of these, e.g. Parallel.Invoke(() => p.OpWithToken(CancellationToken)), but that obfuscates your intent. Parallel.Invoke is for doing a bunch of work using as much CPU power as possible. It gets done, it doesn't deadlock, and you know this in advance.
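A small side-by-side sketch may help show the difference in intent. Both methods compute the same sum; Square and the class name are placeholders for real work and are not taken from the question.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelVsTask
{
    // Placeholder for some real CPU-bound work.
    static int Square(int x) { return x * x; }

    // Fire both actions and block until both are done; results have to be
    // smuggled out through captured variables.
    public static int SumWithInvoke()
    {
        int a = 0, b = 0;
        Parallel.Invoke(() => a = Square(3), () => b = Square(4));
        return a + b;
    }

    // Tasks carry their return values and can be waited on later.
    public static int SumWithTasks()
    {
        Task<int> ta = Task.Run(() => Square(3));
        Task<int> tb = Task.Run(() => Square(4));
        // ... other work could happen here before the results are needed ...
        return ta.Result + tb.Result;
    }
}
```

The Parallel.Invoke version reads as "do these now and wait", while the Task version leaves room to defer the wait and to pass results around.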
Your testing is horrid though. The red flag would be that your long action is to wait 3000 milliseconds, yet your tests take less than a tenth of a millisecond.
Task.Factory.StartNew(() => p.SomeLongOperation());
StartNew takes a delegate and executes it in a new outer Task. The delegate () => p.SomeLongOperation() creates an inner Task and returns it, so StartNew actually hands back a Task<Task>. As soon as the inner task has been created (not completed), the call to SomeLongOperation() returns and the delegate is done. So the outer Task is completed after a fraction of a millisecond, while the two inner tasks, which nothing waits on, are still running in the background. The Parallel path also creates two inner tasks, which it doesn't track at all, and returns.
The correct way would be tasks[0] = p.SomeLongOperation();, which assigns the running task to the array. WaitAll then waits for that task to finish.
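The effect can be reproduced with a sketch like the one below, which times a wait on the outer task against a wait on the unwrapped inner task. The 300 ms delay and the method names are arbitrary choices for this illustration; Unwrap() is shown as an alternative to assigning the inner task directly.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class NestedTaskDemo
{
    static Task SomeLongOperation() { return Task.Delay(300); }

    // Waits only for the outer task, which finishes as soon as the
    // delay task has been created.
    public static long MeasureOuterOnly()
    {
        var sw = Stopwatch.StartNew();
        Task<Task> outer = Task.Factory.StartNew(() => SomeLongOperation());
        outer.Wait();
        return sw.ElapsedMilliseconds;
    }

    // Unwrap() gives a task that represents the inner delay as well,
    // so the wait covers the whole operation.
    public static long MeasureUnwrapped()
    {
        var sw = Stopwatch.StartNew();
        Task<Task> outer = Task.Factory.StartNew(() => SomeLongOperation());
        outer.Unwrap().Wait();
        return sw.ElapsedMilliseconds;
    }
}
```

On a typical machine the first measurement should come out far below the delay and the second at roughly the delay itself.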
I am currently replacing some home baked task functionality with a new implementation using the new System.Threading.Tasks functionality found in .net 4.
I have a slight issue though, and although I can think of some solutions I would like some advice on which is generally the best way to do it, and if I am missing a trick somewhere.
What I need is for an arbitrary process to be able to start a Task but then carry on and not wait for the Task to finish. Not a problem, but when I then need to do something with the result of a task, I'm not quite sure of the best way of doing it.
All the examples I have seen either use Wait() on the task until it completes or reference the Result property on the task. Both of these will block the thread which started the Task, which I don't want.
Some solutions I have thought of:
Create a new thread and start the task on that, then use Wait() or .Result to block the new thread and sync the result back to the caller somehow, possibly by polling the task's IsCompleted property.
Have a 'Notify Completed' task which I can start after completion of the task I want to run which then raises a static event or something.
Pass a delegate into the input of the task and call that to notify that the task is finished.
I can think of pros and cons to all of them, but I especially don't like the idea of having to explicitly create a new thread to start the task on when one of the aims of using the Task class in the first place is to abstract away from direct Thread usage.
Any thoughts about the best way? Am I missing something simple? Would a 'Completed' event be too much to ask for :)? (Sure there is a good reason why there isn't one!)
I suspect you're looking for Task.ContinueWith (or Task<T>.ContinueWith). These basically say, "When you've finished this task, execute this action." However, there are various options you can specify to take more control over it.
MSDN goes into a lot more detail on this in "How to: Chain Multiple Tasks With Continuations" and "Continuation Tasks".
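As a minimal sketch of what that can look like, the helper below starts some work and attaches a continuation that runs once the work has finished, whether it succeeded or failed. DescribeWhenDone is a hypothetical name; the caller gets a task back immediately and is not blocked.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // Hypothetical helper: starts 'work' and returns a task whose result
    // describes how the work ended.
    public static Task<string> DescribeWhenDone(Func<int> work)
    {
        return Task.Factory.StartNew(work).ContinueWith(t =>
            t.IsFaulted
                // Exceptions thrown by the work arrive wrapped in an AggregateException.
                ? "failed: " + t.Exception.InnerException.Message
                : "result: " + t.Result);
    }
}
```

Inside the continuation, reading t.Result does not block, because the antecedent task has already completed by the time the continuation runs.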
In modern C#, one no longer needs to call ContinueWith() explicitly. An alternative to the original accepted answer would be to simply create an async method that awaits the Task in question, and does whatever it wants when the Task completes.
For example, suppose you want to raise an event called TaskCompleted when the Task completes. You would write a method like:
async Task RaiseEventWhenTaskCompleted(Task task)
{
await task;
TaskCompleted?.Invoke(this, EventArgs.Empty);
}
To "register" the wait, just call the above method. Add exception handling as desired, either in the method above, or in some code that will eventually observe the Task returned by the above method.
Task task = Task.Run ( () => { Thread.Sleep ( 2000 ); } );
task.GetAwaiter ().OnCompleted ( () =>
{
MessageBox.Show ( "the task completed in the main thread", "");
} );
You can apply a task continuation.
Alternatively, Task implements IAsyncResult, so you can use the standard approaches for that interface (blocking, polling, or waiting on its WaitHandle).
I created a small example illustrating Jon Skeet's answer, which I'd like to share with you:
using System;
using System.Threading;
using System.Threading.Tasks;
public class Program
{
static void Main(string[] args)
{
for (int cnt = 0; cnt < NumTasks; cnt++)
{
var task = new Task<int>(DoSomething); // any other type than int is possible
task.ContinueWith(t => Console.WriteLine($"Waited for {t.Result} milliseconds."));
task.Start(); // fire and forget
}
PlayMelodyWhileTasksAreRunning();
}
static int NumTasks => Environment.ProcessorCount;
static int DoSomething()
{
int milliSeconds = random.Next(4000) + 1000;
Console.WriteLine($"Waiting for {milliSeconds} milliseconds...");
Thread.Sleep(milliSeconds);
return milliSeconds; // make available to caller as t.Result
}
static Random random = new Random();
static void PlayMelodyWhileTasksAreRunning()
{
Console.Beep(587, 200); // D
Console.Beep(622, 200); // D#
Console.Beep(659, 200); // E
Console.Beep(1047, 400); // C
Console.Beep(659, 200); // E
Console.Beep(1047, 400); // C
Console.Beep(659, 200); // E
Console.Beep(1047, 1200); // C
Console.Beep(1047, 200); // C
Console.Beep(1175, 200); // D
Console.Beep(1245, 200); // D#
Console.Beep(1319, 200); // E
Console.Beep(1047, 200); // C
Console.Beep(1175, 200); // D
Console.Beep(1319, 400); // E
Console.Beep(988, 200); // H
Console.Beep(1175, 400); // D
Console.Beep(1047, 1600); // C
}
}
You can use the ContinueWith function with your routine as a first argument, and a task scheduler as the second argument given by TaskScheduler.FromCurrentSynchronizationContext().
It goes like this:
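var task1 = new Task(() => { do_something_in_a_remote_thread(); });
// the continuation receives the antecedent task as its argument
task1.ContinueWith(t => { do_something_in_the_ui_thread(); },
TaskScheduler.FromCurrentSynchronizationContext());
task1.Start(); // a Task created with 'new' does not run until it is started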