I have three methods that I call to do some number crunching that are as follows
results.LeftFront.CalcAi();
results.RightFront.CalcAi();
results.RearSuspension.CalcAi(geom, vehDef.Geometry.LTa.TaStiffness, vehDef.Geometry.RTa.TaStiffness);
Each of the functions is independent of each other and can be computed in parallel with no dead locks.
What is the easiest way to compute these in parallel without the containing method finishing until all three are done?
See the TPL documentation. They list this sample:
Parallel.Invoke(() => DoSomeWork(), () => DoSomeOtherWork());
So in your case this should just work:
Parallel.Invoke(
() => results.LeftFront.CalcAi(),
() => results.RightFront.CalcAi(),
() => results.RearSuspension.CalcAi(geom,
vehDef.Geometry.LTa.TaStiffness,
vehDef.Geometry.RTa.TaStiffness));
EDIT: The call returns after all actions have finished executing. Invoke() is does not guarantee that they will indeed run in parallel, nor does it guarantee the order in which the actions execute.
You can do this with tasks too (nicer if you later need Cancellation or something like results)
var task1 = Task.Factory.StartNew(() => results.LeftFront.CalcAi());
var task2 = Task.Factory.StartNew(() => results.RightFront.CalcAi());
var task3 = Task.Factory.StartNew(() =>results.RearSuspension.CalcAi(geom,
vehDef.Geometry.LTa.TaStiffness,
vehDef.Geometry.RTa.TaStiffness));
Task.WaitAll(task1, task2, task3);
In .NET 4, Microsoft introduced the Task Parallel Library which was designed to handle this kind of problem, see Parallel Programming in the .NET Framework.
To run parallel methods which are independent of each other ThreadPool.QueueUserWorkItem can also be used. Here is the sample method-
public static void ExecuteParallel(params Action[] tasks)
{
// Initialize the reset events to keep track of completed threads
ManualResetEvent[] resetEvents = new ManualResetEvent[tasks.Length];
// Launch each method in it's own thread
for (int i = 0; i < tasks.Length; i++)
{
resetEvents[i] = new ManualResetEvent(false);
ThreadPool.QueueUserWorkItem(new WaitCallback((object index) =>
{
int taskIndex = (int)index;
// Execute the method
tasks[taskIndex]();
// Tell the calling thread that we're done
resetEvents[taskIndex].Set();
}), i);
}
// Wait for all threads to execute
WaitHandle.WaitAll(resetEvents);
}
More detail about this function can be found here:
http://newapputil.blogspot.in/2016/03/running-parallel-tasks-using.html
var task1 = SomeLongRunningTask();
var task2 = SomeOtherLongRunningTask();
await Task.WhenAll(task1, task2);
The benefit of this over Task.WaitAll is that this will release the thread and await the completion of the two tasks.
Related
I have 8 methods in an ASP.NET Console App, like Fun1(), Fun2(), Fun3() ... and so on. In my console application I have called all these methods sequentially. But now the requirement is I have do that using parallel programming concepts. I have read about task and threading concepts in Java but completely new to .NET parallel Programming.
Here is the flow of methods I needed in my console app,
As you can see the diagram, Task1 and Task2 should run in parallel, and Task3 will only occur after completion of the previous two.
The functions inside each task, for example Fun3 and Fun4 for the Task1, should run sequentially, the one after the other.
Can anyone please help me out?
One way to solve this is, by using WhenAll.
To take an example, I have created X number of methods with the name FuncX() like this:
async static Task<int> FuncX()
{
await Task.Delay(500);
var result = await Task.FromResult(1);
return result;
}
In this case, we have Func1, Func3, Func4, Func5, and Func6.
So we call methods and pass them to a list of Task.
var task1 = new List<Task<int>>();
task1.Add(Func3());
task1.Add(Func4());
var task2 = new List<Task<int>>();
task2.Add(Func1());
task2.Add(Func5());
task2.Add(Func6());
You have 2 options to get the result:
// option1
var eachFunctionIsDoneWithAwait1 = await Task.WhenAll(task1);
var eachFunctionIsDoneWithAwait2 = await Task.WhenAll(task2);
var sum1 = eachFunctionIsDoneWithAwait1.Sum() + eachFunctionIsDoneWithAwait2.Sum();
Console.WriteLine(sum1);
// option2
var task3 = new List<List<Task<int>>>();
task3.Add(task1);
task3.Add(task2);
var sum2 = 0;
task3.ForEach(async x =>
{
var r = await Task.WhenAll(x);
sum2 += r.Sum();
});
Console.WriteLine(sum2);
This is just example for inspiration, you can change it and do it the way you want.
Here is how you could create the tasks according to the diagram, using the Task.Run method:
Task task1 = Task.Run(() =>
{
Fun3();
Fun4();
});
Task task2 = Task.Run(() =>
{
Fun1();
Fun5();
Fun6();
});
Task task3 = Task.Run(async () =>
{
await Task.WhenAll(task1, task2);
Fun7();
Fun8();
});
The Task.Run invokes the delegate on the ThreadPool, not on a dedicated thread. If you have some reason to create a dedicated thread for each task, you could use the advanced Task.Factory.StartNew method with the TaskCreationOptions.LongRunning argument, as shown here.
It should be noted that the above implementation has not an optimal behavior in case of exceptions. In case the Fun3() fails immediately, the optimal behavior would be to stop the execution of the task2 as soon as possible. Instead this implementation will execute all three functions Fun1, Fun5 and Fun6 before propagating the error. You could fix this minor flaw by creating a CancellationTokenSource and invoking the Token.ThrowIfCancellationRequested after each function, but it's going to be messy.
Another issue is that in case both task1 and task2 fail, only the exception of the task1 is going to be propagated through the task3. Solving this issue is not trivial.
Update: Here is one way to solve the issue of partial exception propagation:
Task task3 = Task.WhenAll(task1, task2).ContinueWith(t =>
{
if (t.IsFaulted)
{
TaskCompletionSource tcs = new();
tcs.SetException(t.Exception.InnerExceptions);
return tcs.Task;
}
if (t.IsCanceled)
{
TaskCompletionSource tcs = new();
tcs.SetCanceled(new TaskCanceledException(t).CancellationToken);
return tcs.Task;
}
Debug.Assert(t.IsCompletedSuccessfully);
Fun7();
Fun8();
return Task.CompletedTask;
}, default, TaskContinuationOptions.DenyChildAttach, TaskScheduler.Default)
.Unwrap();
In case both task1 and task2 fail, the task3 will propagate the exceptions of both tasks.
I have an application that pulls a fair amount of data from different sources. A local database, a networked database, and a web query. Any of these can take a few seconds to complete. So, first I decided to run these in parallel:
Parallel.Invoke(
() => dataX = loadX(),
() => dataY = loadY(),
() => dataZ = loadZ()
);
As expected, all three execute in parallel, but execution on the whole block doesn't come back until the last one is done.
Next, I decided to add a spinner or "busy indicator" to the application. I don't want to block the UI thread or the spinner won't spin. So these need to be ran in async mode. But if I run all three in an async mode, then they in affect happen "synchronously", just not in the same thread as the UI. I still want them to run in parallel.
spinner.IsBusy = true;
Parallel.Invoke(
async () => dataX = await Task.Run(() => { return loadX(); }),
async () => dataY = await Task.Run(() => { return loadY(); }),
async () => dataZ = await Task.Run(() => { return loadZ(); })
);
spinner.isBusy = false;
Now, the Parallel.Invoke does not wait for the methods to finish and the spinner is instantly off. Worse, dataX/Y/Z are null and exceptions occur later.
What's the proper way here? Should I use a BackgroundWorker instead? I was hoping to make use of the .NET 4.5 features.
It sounds like you really want something like:
spinner.IsBusy = true;
try
{
Task t1 = Task.Run(() => dataX = loadX());
Task t2 = Task.Run(() => dataY = loadY());
Task t3 = Task.Run(() => dataZ = loadZ());
await Task.WhenAll(t1, t2, t3);
}
finally
{
spinner.IsBusy = false;
}
That way you're asynchronously waiting for all the tasks to complete (Task.WhenAll returns a task which completes when all the other tasks complete), without blocking the UI thread... whereas Parallel.Invoke (and Parallel.ForEach etc) are blocking calls, and shouldn't be used in the UI thread.
(The reason that Parallel.Invoke wasn't blocking with your async lambdas is that it was just waiting until each Action returned... which was basically when it hit the start of the await. Normally you'd want to assign an async lambda to Func<Task> or similar, in the same way that you don't want to write async void methods usually.)
As you stated in your question, two of your methods query a database (one via sql, the other via azure) and the third triggers a POST request to a web service. All three of those methods are doing I/O bound work.
What happeneds when you invoke Parallel.Invoke is you basically trigger three ThreadPool threads to block and wait for I/O based operations to complete, which is pretty much a waste of resources, and will scale pretty badly if you ever need to.
Instead, you could use async apis which all three of them expose:
SQL Server via Entity Framework 6 or ADO.NET
Azure has async api's
Web request via HttpClient.PostAsync
Lets assume the following methods:
LoadXAsync();
LoadYAsync();
LoadZAsync();
You can call them like this:
spinner.IsBusy = true;
try
{
Task t1 = LoadXAsync();
Task t2 = LoadYAsync();
Task t3 = LoadZAsync();
await Task.WhenAll(t1, t2, t3);
}
finally
{
spinner.IsBusy = false;
}
This will have the same desired outcome. It wont freeze your UI, and it would let you save valuable resources.
What are the differences between using Parallel.ForEach or Task.Run() to start a set of tasks asynchronously?
Version 1:
List<string> strings = new List<string> { "s1", "s2", "s3" };
Parallel.ForEach(strings, s =>
{
DoSomething(s);
});
Version 2:
List<string> strings = new List<string> { "s1", "s2", "s3" };
List<Task> Tasks = new List<Task>();
foreach (var s in strings)
{
Tasks.Add(Task.Run(() => DoSomething(s)));
}
await Task.WhenAll(Tasks);
In this case, the second method will asynchronously wait for the tasks to complete instead of blocking.
However, there is a disadvantage to use Task.Run in a loop- With Parallel.ForEach, there is a Partitioner which gets created to avoid making more tasks than necessary. Task.Run will always make a single task per item (since you're doing this), but the Parallel class batches work so you create fewer tasks than total work items. This can provide significantly better overall performance, especially if the loop body has a small amount of work per item.
If this is the case, you can combine both options by writing:
await Task.Run(() => Parallel.ForEach(strings, s =>
{
DoSomething(s);
}));
Note that this can also be written in this shorter form:
await Task.Run(() => Parallel.ForEach(strings, DoSomething));
The first version will synchronously block the calling thread (and run some of the tasks on it).
If it's a UI thread, this will freeze the UI.
The second version will run the tasks asynchronously in the thread pool and release the calling thread until they're done.
There are also differences in the scheduling algorithms used.
Note that your second example can be shortened to
await Task.WhenAll(strings.Select(s => Task.Run(() => DoSomething(s))));
I have seen Parallel.ForEach used inappropriately, and I figured an example in this question would help.
When you run the code below in a Console app, you will see how the tasks executed in Parallel.ForEach doesn't block the calling thread. This could be okay if you don't care about the result (positive or negative) but if you do need the result, you should make sure to use Task.WhenAll.
using System;
using System.Linq;
using System.Threading.Tasks;
namespace ParrellelEachExample
{
class Program
{
static void Main(string[] args)
{
var indexes = new int[] { 1, 2, 3 };
RunExample((prefix) => Parallel.ForEach(indexes, (i) => DoSomethingAsync(i, prefix)),
"Parallel.Foreach");
Console.ForegroundColor = ConsoleColor.Yellow;
Console.WriteLine("*You'll notice the tasks haven't run yet, because the main thread was not blocked*");
Console.WriteLine("Press any key to start the next example...");
Console.ReadKey();
RunExample((prefix) => Task.WhenAll(indexes.Select(i => DoSomethingAsync(i, prefix)).ToArray()).Wait(),
"Task.WhenAll");
Console.WriteLine("All tasks are done. Press any key to close...");
Console.ReadKey();
}
static void RunExample(Action<string> action, string prefix)
{
Console.ForegroundColor = ConsoleColor.White;
Console.WriteLine($"{Environment.NewLine}Starting '{prefix}'...");
action(prefix);
Console.WriteLine($"{Environment.NewLine}Finished '{prefix}'{Environment.NewLine}");
}
static async Task DoSomethingAsync(int i, string prefix)
{
await Task.Delay(i * 1000);
Console.WriteLine($"Finished: {prefix}[{i}]");
}
}
}
Here is the result:
Conclusion:
Using the Parallel.ForEach with a Task will not block the calling thread. If you care about the result, make sure to await the tasks.
I ended up doing this, as it felt easier to read:
List<Task> x = new List<Task>();
foreach(var s in myCollectionOfObject)
{
// Note there is no await here. Just collection the Tasks
x.Add(s.DoSomethingAsync());
}
await Task.WhenAll(x);
I have sample code to compare processing time for Parallel approach and Task approach. The goal of this experiment is understanding of how do they work.
So my questions are:
Why Parallel worked faster then Task?
Do my results mean that I should use Parallel instead of Task?
Where should I use Task and where Parallel?
What benefits of using Task in comparison to Parallel?
Does Task is just a wrap for ThreadPool.QueueUserWorkItem method?
public Task SomeLongOperation()
{
return Task.Delay(3000);
}
static void Main(string[] args)
{
Program p = new Program();
List<Task> tasks = new List<Task>();
tasks.Add(Task.Factory.StartNew(() => p.SomeLongOperation()));
tasks.Add(Task.Factory.StartNew(() => p.SomeLongOperation()));
var arr = tasks.ToArray();
Stopwatch sw = Stopwatch.StartNew();
Task.WaitAll(arr);
Console.WriteLine("Task wait all results: " + sw.Elapsed);
sw.Stop();
sw = Stopwatch.StartNew();
Parallel.Invoke(() => p.SomeLongOperation(), () => p.SomeLongOperation());
Console.WriteLine("Parallel invoke results: " + sw.Elapsed);
sw.Stop();
Console.ReadKey();
}
Here are my processing results:
EDIT:
Changed code to look like this:
Program p = new Program();
Task[] tasks = new Task[2];
Stopwatch sw = Stopwatch.StartNew();
tasks[0] = Task.Factory.StartNew(() => p.SomeLongOperation());
tasks[1] = Task.Factory.StartNew(() => p.SomeLongOperation());
Task.WaitAll(tasks);
Console.WriteLine("Task wait all results: " + sw.Elapsed);
sw.Stop();
sw = Stopwatch.StartNew();
Parallel.Invoke(() => p.SomeLongOperation(), () => p.SomeLongOperation());
Console.WriteLine("Parallel invoke results: " + sw.Elapsed);
sw.Stop();
My new results:
EDIT 2:
When I replaced code with Parallel.Invoke to be first and Task.WaitAll to be second the situation has been changed cardinally. Now Parallel is slower. It makes me think of incorrectness of my estimates. I changed code to look like this:
Program p = new Program();
Task[] tasks = new Task[2];
Stopwatch sw = null;
for (int i = 0; i < 10; i++)
{
sw = Stopwatch.StartNew();
Parallel.Invoke(() => p.SomeLongOperation(), () => p.SomeLongOperation());
string res = sw.Elapsed.ToString();
Console.WriteLine("Parallel invoke results: " + res);
sw.Stop();
}
for (int i = 0; i < 10; i++)
{
sw = Stopwatch.StartNew();
tasks[0] = Task.Factory.StartNew(() => p.SomeLongOperation());
tasks[1] = Task.Factory.StartNew(() => p.SomeLongOperation());
Task.WaitAll(tasks);
string res2 = sw.Elapsed.ToString();
Console.WriteLine("Task wait all results: " + res2);
sw.Stop();
}
And here are my new results:
Now I can suggest that this experiment is much more clear. The results are almost the same. Sometimes Parallel and sometimes Task is faster. Now my questions are:
1. Where should I use Task and where Parallel?
2. What benefits of using Task in comparison to Parallel?
3. Does Task is just a wrap for ThreadPool.QueueUserWorkItem method?
Any helpful info that can clarify those questions are welcome.
EDIT as of this article from MSDN:
Both Parallel and Task are wrappers for ThreadPool. Parallel invoke also awaits until all tasks will be finished.
Related to your questions:
Using Task, Parallel or ThreadPool depends on the granularity of control you need to have on the execution of your parallel tasks. I'm personally got used to Task.Factory.StartNew(), but that's a personal opinion. The same relates to ThreadPool.QueueUserWorkItem()
Additional Information: The first call to Parallel.Invoke() and Task.Factory.StartNew() might be slower due to internal initialization.
If you start nongeneric Tasks(i.e. "void Tasks without a return value") and immediately Wait for them, use Parallel.Invoke instead. Your intent is immediately clear to the reader.
Use Tasks if:
you do not Wait immediately
you need return values
you need to give parameters to the methods called
you require TaskCreationOptions functionality
you require CancellationToken or TaskScheduler functionality and don't want to use ParallelOptions
basically, if you want more options or control
Yes, you can get around some of these, e.g. Parallel.Invoke(() => p.OpWithToken(CancellationToken) but that obfuscates your intent. Parallel.Invoke is for doing a bunch of work using as much CPU power as possible. It gets done, it doesn't deadlock, and you know this in advance.
Your testing is horrid though. The red flag would be that your long action is to wait 3000 milliseconds, yet your tests take less than a tenth of a millisecond.
Task.Factory.StartNew(() => p.SomeLongOperation());
StartNew takes an Action, and executes this in a new main Task. The action () => SomeLongOperation() creates a subtask Task. After this subtask is created (not completed), the call to SomeLongOperation() returns, and the Action is done. So the main Task is already completed after a tenth millisecond, while the two subtasks you have no reference to are still running in the background. The Parallel path also creates two subtasks, which it doesn't track at all, and returns.
The correct way would be tasks[0] = p.SomeLongOperation();, which assigns a running task to the array. Then WaitAll checks for the finishing of this task.
Given the following:
BlockingCollection<MyObject> collection;
public class MyObject
{
public async Task<ReturnObject> DoWork()
{
(...)
return await SomeIOWorkAsync();
}
}
What would be the correct/most performant way to execute all DoWork() tasks asynchronously on all MyObjects in collection concurrently (while capturing the return object), ideally with a sensible thread limit though (I believe the Task Factory/ThreadPool does some management here)?
You can make use of the WhenAll extension method.
var combinedTask = await Task.WhenAll(collection.Select(x => x.DoWork());
It will start all tasks concurrently and waits for all to finish.
ThreadPool manages the number of threads running, but that won't help you much with asynchronous Tasks.
Because of that, you need something else. One way to do this is to utilize ActionBlock from TPL Dataflow:
int limit = …;
IEnumerable<MyObject> collection = …;
var block = new ActionBlock<MyObject>(
o => o.DoWork(),
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = limit });
foreach (var obj in collection)
block.Post(o);
block.Complete();
await block.Completion;
What would be the correct/most performant way to execute all DoWork() tasks asynchronously on all MyObjects in collection concurrently (while capturing the return object), ideally with a sensible thread limit
The easiest way to do that is with Task.WhenAll:
ReturnObject[] results = await Task.WhenAll(collection.Select(x => x.DoWork()));
This will invoke DoWork on all MyObjects in the collection and then wait for them all to complete. The thread pool handles all throttling sensibly.
Is there a different way if I want to capture every individual DoWork() return immediately instead of waiting for all items to complete?
Yes, you can use the method described by Jon Skeet and Stephen Toub. I have a similar solution in my AsyncEx library (available via NuGet), which you can use like this:
// "tasks" is of type "Task<ReturnObject>[]"
var tasks = collection.Select(x => x.DoWork()).OrderByCompletion();
foreach (var task in tasks)
{
var result = await task;
...
}
My comment was a bit cryptic, so I though I'd add this answer:
List<Task<ReturnObject>> workTasks =
collection.Select( o => o.DoWork() ).ToList();
List<Task> resultTasks =
workTasks.Select( o => o.ContinueWith( t =>
{
ReturnObject r = t.Result;
// do something with the result
},
// if you want to run this on the UI thread
TaskScheduler.FromCurrentSynchronizationContext()
)
)
.ToList();
await Task.WhenAll( resultTasks );