I'm reading a book which says:
Task objects internally contain a collection of ContinueWith tasks.
So you can actually call ContinueWith several times using a single Task object. When the task
completes, all the ContinueWith tasks will be queued to the thread pool.
so I try to check the source code of Task https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs,146
I didn't find any private field that looks like a collection of ContinueWith tasks.
So my question is , does Task objects internally contain a collection of ContinueWith tasks?
And if it does, let's say we have the following code:
Task<Int32> t = Task.Run(() => Sum(10000));
Task a = t.ContinueWith(task => Console.WriteLine("The sum is: " + task.Result),
TaskContinuationOptions.OnlyOnRanToCompletion);
Task b = t.ContinueWith(task => Console.WriteLine("Sum threw: " + task.Exception.InnerException),
TaskContinuationOptions.OnlyOnFaulted);
if (a == b) {
... // false
{
Since calling ContinueWith just add an item to a collection, then a and b should point to the same Task object, but a == b return false?
I guess you could say the answer is yes and no. The Task class utilizes Interlocked in order to keep up with continuations.
You can see the method for adding a continuation named AddTaskContinuationComplex(object tc, bool addBeforeOthers) at line 4727. Within this method the continuation is added to a list and then passed to Interlocked.
// Construct a new TaskContinuation list
List<object> newList = new List<object>();
// Add in the old single value
newList.Add(oldValue);
// Now CAS in the new list
Interlocked.CompareExchange(ref m_continuationObject, newList, oldValue);
Now if you take a look at the FinishContinuations (line 3595) method you can see
// Atomically store the fact that this task is completing. From this point on, the adding of continuations will
// result in the continuations being run/launched directly rather than being added to the continuation list.
object continuationObject = Interlocked.Exchange(ref m_continuationObject, s_taskCompletionSentinel);
TplEtwProvider.Log.RunningContinuation(Id, continuationObject);
Here the current task is being marked as completed and the next task is being obtained from Interlocked.
Now if you go to the Interlocked class and examine it, you can see that while it isn't necessarily a collection itself it keeps up with and maintains continuations in a thread-safe manner.
Related
This question already has answers here:
Parallel foreach with asynchronous lambda
(10 answers)
Closed 1 year ago.
I have a simple asynchronous code below : this is a WPF with one button and one textBox.
I used some list with five integers to mimic 5 different tasks.
My intention was to achieve that when I run all five tasks in parallel and asynchronously ,
I can observe that the numbers are one by one added to the textbox.
And I achieved it. Method "DoSomething" run all five tasks in parallel and each of the task has different execution time (simulated by Task.Delay) so all results in the numbers appearing in the textbox one by one.
The only problem that I cannot figure out is: why in the textbox I have the string text "This is end text" displayed at first ?! If I await method DoSomething then it should be accomplished first and then the rest of the code should be executed.Even though in my case is a repainting of the GUI.
I guess that this might be caused by the use of Dispacher.BeginInvoke which may "cause some disturbance " to async/await mechanism. But I would appreciate small clue and how to avoid this behawior.
I know that I could use the Progress event to achieve similar effect but is there any other way that I can use Parallel loop and update results progressively in WPF avoiding such a unexpected behaviour which I described?
private async void Button_Click(object sender, RoutedEventArgs e)
{
await DoSomething();
tbResults.Text += "This is end text";
}
private async Task DoSomething()
{
List<int> numbers = new List<int>(Enumerable.Range(1, 5));
await Task.Run(()=> Parallel.ForEach(numbers,async i =>
{
await Task.Delay(i * 300);
await Dispatcher.BeginInvoke(() => tbResults.Text += i.ToString() + Environment.NewLine);
}));
}
// output is:
//This is end text 1 2 3 4 5 (all in separatÄ™ lines).
My questions:
Why the text is displayed prior the method DoSomething .
How to solve it/ avoid it , any alternative way to solve it ( except using Progress event ).
Any info will be highly appreciate.
The threads of Parallel.Foreach are "real" background threads. They are created and the application continues execution. The point is that Parallel.Foreach is not awaitable, therefore the execution continues while the threads of the Parallel.Foreach are suspended using await.
private async Task DoSomething()
{
List<int> numbers = new List<int>(Enumerable.Range(1, 5));
// Create the threads of Parallel.Foreach
await Task.Run(() =>
{
// Create the threads of Parallel.Foreach and continue
Parallel.ForEach(numbers,async i =>
{
// await suspends the thread and forces to return.
// Because Parallel.ForEach is not awaitable,
// execution leaves the scope of the Parallel.Foreach to continue.
await Task.Delay(i * 300);
await Dispatcher.BeginInvoke(() => tbResults.Text += i.ToString() + Environment.NewLine);
});
// After the threads are created the internal await of the Parallel.Foreach suspends background threads and
// forces to the execution to return from the Parallel.Foreach.
// The Task.Run thread continues.
Dispatcher.InvokeAsync(() => tbResults.Text += "Text while Parallel.Foreach threads are suspended");
// Since the background threads of the Parallel.Foreach are not attached
// to the parent Task.Run, the Task.Run completes now and returns
// i.e. Task.run does not wait for child background threads to complete.
// ==> Leave Task.Run as there is no work.
});
// Leave DoSomething() and continue to execute the remaining code in the Button_Click().
// Parallel.Foreach threads is still suspended until the await chain, in this case Button_Click(), is completed.
}
The solution is to implement the pattern suggested by Clemens' comment or an async implementation of the Producer Consumer pattern using e.g., BlockingCollection or Channel to gain more control over the fixed number of threads while distributing the "unlimited" number of jobs.
private async Task DoSomething(int number)
{
await Task.Delay(number * 300);
Dispatcher.Invoke(() => tbResults.Text += number + Environment.NewLine);
}
private async void ButtonBase_OnClick(object sender, RoutedEventArgs e)
{
List<int> numbers = new List<int>(Enumerable.Range(1, 5));
List<Task> tasks = new List<Task>();
// Alternatively use LINQ Select
foreach (int number in numbers)
{
Task task = DoSomething(number);
tasks.Add(task);
}
await Task.WhenAll(tasks);
tbResults.Text += "This is end text" + Environment.NewLine;
}
Discussing the comments
"my intention was to run tasks in parallel and "report" once they are
completed i.e. the taks which takes the shortest would "report" first
and so on."
This is exactly what is happening in the above solution. The Task with the shortest delay appends text to the TextBox first.
"But implementing your suggested await Task.WhenAll(tasks) causes
that we need to wait for all tasks to complete and then report all at
once."
To process Task objects in their order of completion, you would replace Task.WhenAll with Task.WhenAny. In case you are not only interested in the first completed Task, you would have to use Task.WhenAny in an iterative manner until all Task instances have been completed:
Process all Task objects in their order of completion
private async Task DoSomething(int number)
{
await Task.Delay(number * 300);
Dispatcher.Invoke(() => tbResults.Text += number + Environment.NewLine);
}
private async void ButtonBase_OnClick(object sender, RoutedEventArgs e)
{
List<int> numbers = new List<int>(Enumerable.Range(1, 5));
List<Task> tasks = new List<Task>();
// Alternatively use LINQ Select
foreach (int number in numbers)
{
Task task = DoSomething(number);
tasks.Add(task);
}
// Until all Tasks have completed
while (tasks.Any())
{
Task<int> nextCompletedTask = await Task.WhenAny(tasks);
// Remove the completed Task so that
// we can await the next uncompleted Task that completes first
tasks.Remove(nextCompletedTask);
// Get the result of the completed Task
int taskId = await nextCompletedTask;
tbResults.Text += $"Task {taskId} has completed." + Environment.NewLine;
}
tbResults.Text += "This is end text" + Environment.NewLine;
}
"Parallel.ForEach is not awaitable so I thought that wrapping it up in
Task.Run allows me to await it but this is because as you said "Since
the background threads of the Parallel.Foreach are not attached to the
parent Task.Run""
No that's not exactly what I have said. The key point is the third sentence of my answer: "The point is that Parallel.Foreach is not awaitable, therefore the execution continues while the threads of the Parallel.Foreach are suspended using await.".
This means: normally Parallel.Foreach executes synchronously: the calling context continues execution when all threads of Parallel.Foreach have completed. But since you called await inside those threads, you suspend them in an async/await manner.
Since Parallel.Foreach is not awaitable, it can't handle the await calls and acts like the suspended threads have completed naturally. Parallel.Foreach does not understand that the threads are just suspended by await and will continue later. In other words, the await chain is broken as Parallel.Foreach is not able to return the Task to the parent awaited Task.Run context to signal its suspension.
That's what I meant when saying that the threads of Parallel.Foreach are not attached to the Task.Run. They run in total isolation from the async/await infrastructure.
"async lambdas should be "only use with events""
No, that's not correct. When you pass an async lambda to a void delegate like Action<T> you are correct: the async lambda can't be awaited in this case. But when passing an async lambda to a Func<T> delegate where T is of type Task, your lamda can be awaited:
private void NoAsyncDelegateSupportedMethod(Action lambda)
{
// Since Action does not return a Task (return type is always void),
// the async lambda can't be awaited
lambda.Invoke();
}
private async Task AsyncDelegateSupportedMethod(Func<Task> asyncLambda)
{
// Since Func returns a Task, the async lambda can be awaited
await asyncLambda.Invoke();
}
public voi DoSoemthing()
{
// Not a good idea as NoAsyncDelegateSupportedMethod can't handle async lamdas: it defines a void delegate
NoAsyncDelegateSupportedMethod(async () => await Task.Delay(1));
// A good idea as AsyncDelegateSupportedMethod can handle async lamdas: it defines a Func<Task> delegate
AsyncDelegateSupportedMethod(async () => await Task.Delay(1));
}
As you can see your statement is not correct. You must always check the signature of the called method and its overloads. If it accepts a Func<Task> type delegate you are good to go.
That's how async support is added to Parallel.ForeachAsync: the API supports a Func<ValueTask> type delegate. For example Task.Run accepts a Func<Task> and therefore the following call is perfectly fine:
Task.Run(async () => await Task.Delay(1));
" I guess that you admit that .Net 6.0 brought the best solution :
which is Parallel.ForEachASYNC! [...] We can spawn a couple of threads
which deal with our tasks in parallel and we can await the whole loop
and we do not need to wait for all tasks to complte- they "report" as
they finish "
That's wrong. Parallel.ForeachAsync supports threads thatr use async/await, that's true. Indeed, your original example would no longer break the intended flow: because Parallel.ForeachAsync supports await in its threads, it can handle suspended threads and propagate the Task object properly from its threads to the caller context e.g., to the wrapping await Task.Run.
It now knows how to wait for and resume suspended threads.
Important: Parallel.ForeachAsync still completes AFTER ALL its threads have completed. You assumption "they "report" as they finish" is wrong. That's the most intuitive concurrent implementation of a foreach. foreach also completes after all items are enumerated.
The solution to process Task objects as they complete is still using the Task.WhenAny pattern from above.
In general, if you you don't need the extra features like partitioning etc. of the the Parallel.Foreach and Parallel.ForeachAsync, you can always use Task.WhenAll instead. Task.WhenAll and especially Parallel.ForeachAsync are equivalent, except for Parallel.ForeachAsync provides greater customization by default: it suppors techniques like throttling and partitioning without the extra code.
Good links from Clemens see in comment. Answering your questions:
In Parallel.ForEach you start/fire for each entry of numbera an async task, which you don't await. So you do only await, that Parallel.ForEach does finish and it does finish before the async tasks of it.
What you could do e.g. remove async inside of Parallel.ForEach and use Dispatcher.Invoke instead of Dispatcher.BeginInvoke, Thread.Sleep is an anti-pattern ;) , so depending on your task may be take another solution(edited: BionicCode has a nice one):
private async Task DoSomething()
{
var numbers = new List<int>(Enumerable.Range(1, 5));
await Task.Run(()=> Parallel.ForEach(numbers, i =>
{
Thread.Sleep(i * 300);
Dispatcher.Invoke(() => tbResults.Text += i.ToString() + Environment.NewLine);
}));
}
public static async void DoSomething(IEnumerable<IDbContext> dbContexts)
{
IEnumerator<IDbContext> dbContextEnumerator = dbContexts.GetEnumerator();
Task<ProjectSchema> projectSchemaTask = Task.Run(() => Core.Data.ProjectRead
.GetAll(dbContextEnumerator.Current)
.Where(a => a.PJrecid == pjRecId)
.Select(b => new ProjectSchema
{
PJtextid = b.PJtextid,
PJcustomerid = b.PJcustomerid,
PJininvoiceable = b.PJininvoiceable,
PJselfmanning = b.PJselfmanning,
PJcategory = b.PJcategory
})
.FirstOrDefault());
Task<int?> defaultActivitySchemeTask = projectSchemaTask.ContinueWith(antecedent =>
{
//This is where an exception may get thrown
return ProjectTypeRead.GetAll(dbContextEnumerator.Current)
.Where(a => a.PTid == antecedent.Result.PJcategory)
.Select(a => a.PTactivitySchemeID)
.FirstOrDefaultAsync().Result;
}, TaskContinuationOptions.OnlyOnRanToCompletion);
Task<SomeModel> customerTask = projectSchemaTask.ContinueWith((antecedent) =>
{
//This is where an exception may get thrown
return GetCustomerDataAsync(antecedent.Result.PJcustomerid,
dbContextEnumerator.Current).Result;
}, TaskContinuationOptions.OnlyOnRanToCompletion);
await Task.WhenAll(defaultActivitySchemeTask, customerTask);
}
The exception I am getting:
NotSupportedException: A second operation started on this context before a previous asynchronous operation completed. Use 'await' to ensure that any asynchronous operations have completed before calling another method on this context. Any instance members are not guaranteed to be thread safe.
The exception is only thrown about every 1/20 calls to this function. And the exception seems only to happen when I am chaining tasks with ContinueWith().
How can there be a second operation on context, when I am using a new one for each request?
This is just an example of my code. In the real code I have 3 parent tasks, and each parent has 1-5 chained tasks attached to them.
What am I doing wrong?
yeah, you basically shouldn't use ContinueWith these days; in this case, you are ending up with two continuations on the same task (for defaultActivitySchemeTask and customerTask); how they interact is now basically undefined, and will depend on exactly how the two async flows work, but you could absolutely end up with overlapping async operations here (for example, in the simplest "continuations are sequential", as soon as the first awaits because it is incomplete, the second will start). Frankly, this should be logically sequential await based code, probably not using Task.Run too, but let's keep it for now:
ProjectSchema projectSchema = await Task.Run(() => ...);
int? defaultActivityScheme = await ... first bit
SomeModel customer = await ... second bit
We can't do the two subordinate queries concurrently without risking concurrent async operations on the same context.
In your example you seem to be running two continuations in parallel, so there is a possibility that they will overlap causing a concurrency problem. DbContext is not thread safe, so you need to make sure that your asynchronous calls are sequential. Keep in mind that using async/await will simply turn your code into a state machine, so you can control which operations has completed before moving to the next operation. Using async methods alone will not ensure parallel operations but wrapping your operation in Task.Run will. So you you need to ask yourself is Task.Run is really required (i.e. is scheduling work in the ThreadPool) to make it parallel.
You mentioned that in your real code you have 3 parent tasks and each parent has 1-5 chained tasks attached to them. If the 3 parent tasks have separate DbContexts, they could run in parallel (each one of them wrapped in Task.Run), but their chained continuations need to be sequential (leveraging async/await keywords). Like that:
public async Task DoWork()
{
var parentTask1 = Task.Run(ParentTask1);
var parentTask2 = Task.Run(ParentTask2);
var parentTask3 = Task.Run(ParentTask3);
await Task.WhenAll(parentTask1 , parentTask2, parentTask3);
}
private async Task ParentTask1()
{
// chained child asynchronous continuations
await Task.Delay(100);
await Task.Delay(100);
}
private async Task ParentTask2()
{
// chained child asynchronous continuations
await Task.Delay(100);
await Task.Delay(100);
}
private async Task ParentTask3()
{
// chained child asynchronous continuations
await Task.Delay(100);
await Task.Delay(100);
}
If your parent tasks operate on the same DbContext, in order to avoid concurrency you would need to await them one by one (no need to wrap them into Task.Run):
public async Task DoWork()
{
await ParentTask1();
await ParentTask2();
await ParentTask3();
}
Updated to explain things more clearly
I've got an application that runs a number of tasks. Some are created initially and other can be added later. I need need a programming structure that will wait on all the tasks to complete. Once the all the tasks complete some other code should run that cleans things up and does some final processing of data generated by the other tasks.
I've come up with a way to do this, but wouldn't call it elegant. So I'm looking to see if there is a better way.
What I do is keep a list of the tasks in a ConcurrentBag (a thread safe collection). At the start of the process I create and add some tasks to the ConcurrentBag. As the process does its thing if a new task is created that also needs to finish before the final steps I also add it to the ConcurrentBag.
Task.Wait accepts an array of Tasks as its argument. I can convert the ConcurrentBag into an array, but that array won't include any Tasks added to the Bag after Task.Wait was called.
So I have a two step wait process in a do while loop. In the body of the loop I do a simple Task.Wait on the array generated from the Bag. When it completes it means all the original tasks are done. Then in the while test I do a quick 1 millisecond test of a new array generated from the ConcurrentBag. If no new tasks were added, or any new tasks also completed it will return true, so the not condition exits the loop.
If it returns false (because a new task was added that didn't complete) we go back and do a non-timed Task.Wait. Then rinse and repeat until all new and old tasks are done.
// defined on the class, perhaps they should be properties
CancellationTokenSource Source = new CancellationTokenSource();
CancellationToken Token = Source.Token;
ConcurrentBag<Task> ToDoList = new ConcurrentBag<Task>();
public void RunAndWait() {
// start some tasks add them to the list
for (int i = 0; i < 12; i++)
{
Task task = new Task(() => SillyExample(Token), Token);
ToDoList.Add(task);
task.Start();
}
// now wait for those task, and any other tasks added to ToDoList to complete
try
{
do
{
Task.WaitAll(ToDoList.ToArray(), Token);
} while (! Task.WaitAll(ToDoList.ToArray(), 1, Token));
}
catch (OperationCanceledException e)
{
// any special handling of cancel we might want to do
}
// code that should only run after all tasks complete
}
Is there a more elegant way to do this?
I'd recommend using a ConcurrentQueue and removing items as you wait for them. Due to the first-in-first-out nature of queues, if you get to the point where there's nothing left in the queue, you know that you've waited for all the tasks that have been added up to that point.
ConcurrentQueue<Task> ToDoQueue = new ConcurrentQueue<Task>();
...
while(ToDoQueue.Count > 0 && !Token.IsCancellationRequested)
{
Task task;
if(ToDoQueue.TryDequeue(out task))
{
task.Wait(Token);
}
}
Here's a very cool way using Microsoft's Reactive Framework (NuGet "Rx-Main").
var taskSubject = new Subject<Task>();
var query = taskSubject.Select(t => Observable.FromAsync(() => t)).Merge();
var subscription =
query.Subscribe(
u => { /* Each Task Completed */ },
() => Console.WriteLine("All Tasks Completed."));
Now, to add tasks, just do this:
taskSubject.OnNext(Task.Run(() => { }));
taskSubject.OnNext(Task.Run(() => { }));
taskSubject.OnNext(Task.Run(() => { }));
And then to signal completion:
taskSubject.OnCompleted();
It is important to note that signalling completion doesn't complete the query immediately, it will wait for all of the tasks to finish too. Signalling completion just says that you will no longer add any new tasks.
Finally, if you want to cancel, then just do this:
subscription.Dispose();
First of all I am totally new to threading in C#. I have created multiple threads as shown below.
if (flag)
{
foreach (string empNo in empList)
{
Thread thrd = new Thread(()=>ComputeSalary(empNo));
threadList.Add(thrd);
thrd.Start();
}
}
Before proceeding further I need check if at least one thread is completed its execution so that I can perform additional operations.
I also tried creating the list of type thread and by added it to list, so that I can check if at least one thread has completed its execution. I tried with thrd.IsAlive but it always gives me current thread status.
Is there any other way to check if atleast on thread has completed its execution?
You can use AutoResetEvent.
var reset = new AutoResetEvent(false); // ComputeSalary should have access to reset
.....
....
if (flag)
{
foreach (string empNo in empList)
{
Thread thrd = new Thread(()=>ComputeSalary(empNo));
threadList.Add(thrd);
thrd.Start();
}
reset.WaitOne();
}
.....
.....
void ComputeSalary(int empNo)
{
.....
reset.set()
}
Other options are callback function, event or a flag/counter(this is not advised).
Here is a solution based on the Task Parallel Library:
// Create a list of tasks for each string in empList
List<Task> empTaskList = empList.Select(emp => Task.Run(() => ComputeSalary(emp)))
.ToList();
// Give me the task that finished first.
var firstFinishedTask = await Task.WhenAny(empTaskList);
A couple of things to note:
In order to use await inside your method, you will have to declare it as async Task or or async Task<T> where T is the desired return type
Task.Run is your equivalent of new Thread().Start(). The difference is Task.Run will use the ThreadPool (unless you explicitly tell it not to), and the Thread class will construct an entirely new thread.
Notice the use of await. This tells the compiler to yield control back to the caller until Task.WhenAny returns the first task that finished.
You should read more about async-await here
I want to chain multiple Tasks, so that when one ends the next one starts. I know I can do this using ContinueWith. But what if I have a large number of tasks, so that:
t1 continues with t2
t2 continues with t3
t3 continues with t4
...
Is there a nice way to do it, other than creating this chain manually using a loop?
Well, assuming you have some sort of enumerable of Action delegates or something you want to do, you can easily use LINQ to do the following:
// Create the base task. Run synchronously.
var task = new Task(() => { });
task.RunSynchronously();
// Chain them all together.
var query =
// For each action
from action in actions
// Assign the task to the continuation and
// return that.
select (task = task.ContinueWith(action));
// Get the last task to wait on.
// Note that this cannot be changed to "Last"
// because the actions enumeration could have no
// elements, meaning that Last would throw.
// That means task can be null, so a check
// would have to be performed on it before
// waiting on it (unless you are assured that
// there are items in the action enumeration).
task = query.LastOrDefault();
The above code is really your loop, just in a fancier form. It does the same thing in that it takes the previous task (after primed with a dummy "noop" Task) and then adds a continuation in the form of ContinueWith (assigning the continuation to the current task in the process for the next iteration of the loop, which is performed when LastOrDefault is called).
You may use static extensions ContinueWhenAll here.
So you can pass multiple tasks.
Update
You can use a chaining extension such as this:
public static class MyTaskExtensions
{
public static Task BuildChain(this Task task,
IEnumerable<Action<Task>> actions)
{
if (!actions.Any())
return task;
else
{
Task continueWith = task.ContinueWith(actions.First());
return continueWith.BuildChain(actions.Skip(1));
}
}
}