Handling Expensive process with Task.WhenAll and Task.WaitAll in MVC - c#

I am thinking a way to handle and expensive process which would require me to call and get multiple data from multiple Industrial field. So for each career I will make them into one individual task inside the Generate Report Class. I would like to ask that when using Task.WhenAll is it that task 1,task 2,task 3 will run together without waiting task1 to be completed. But if using Task.WaitAll it will wait for MiningSector to Complete in order to run ChemicalSector. Am I correct since this is what I wanted to achieve.
public static async Task<bool> GenerateReport()
{
static async Task<string> WriteText(string name)
{
var start = DateTime.Now;
Console.WriteLine("Enter {0}, {1}", name, start);
return name;
}
static async Task MiningSector()
{
var task1 = WriteText("task1");
var task2 = WriteText("task2");
var task3 = WriteText("task3");
await Task.WhenAll(task1, task2, task3); //Run All 3 task together
Console.WriteLine("MiningSectorresults: {0}", String.Join(", ", t1.Result, t2.Result, t3.Result));
}
static async Task ChemicalsSector()
{
var task4 = WriteText("task4");
var task5 = WriteText("task5");
var task6 = WriteText("task6");
await Task.WhenAll(task4 , task5 , task6 ); //Run All 3 task together
Console.WriteLine("ChemicalsSectorresults: {0}", String.Join(", ", t1.Result, t2.Result, t3.Result));
}
static void Main(string[] args)
{
Task.WaitAll(MiningSector(), ChemicalsSector()); //Wait when MiningSectoris complete then start ChemicalsSector.
}
return true;
}
The thing I wanted to achieve is, inside MiningSector have 10 or more functions need to be run and ChemicalSector is same as well. I wanted to run all the functions inside the MiningSector at parallel once it's done then the ChemicalSector will start running the function.

But if using Task.WaitAll it will wait for MiningSector to Complete in order to run ChemicalSector. Am I correct since this is what I wanted to achieve.
No. The key thing to understand here is that arguments are evaluated first, and then the method call. This is the same way all arguments and method calls work across the entire C# language.
So this code:
Task.WaitAll(MiningSector(), ChemicalsSector());
has the same semantics as this code:
var miningTask = MiningSector();
var sectorTask = ChemicalsSector();
Task.WaitAll(miningTask, sectorTask);
In other words, both methods are called (and both tasks started) before Task.WaitAll is called.
If you want to do one then the other, then don't call the second method until the first task has completed. This is most easily done using async Main, e.g.:
static async Task Main(string[] args)
{
await MiningSector();
await ChemicalsSector();
}

Related

Task.WhenAll but process results one by one

Let's say I have a list of Tasks, and I want to run them in parallel. But I don't need all of them to finish to continue, I can move on with just one. The following code waits for all the tasks to finish to move on. Is there a way for me to individually continue with the task that has completed while waiting for the other ones to finish?
List<string>[] x = await Task.WhenAll(new Task<List<string>>[] { task1, task2 })
// When task1 finishes, I want to process the result immediately
// instead of waiting on task2.
You're probably looking for Task.WhenAny.
I've used it for setting off a pile of tasks and then processing each of them as they become ready, but I suppose you could also just wait for one to finish and continue without the loop if you don't care about dealing with the rest.
while(tasks.Count() > 0)
{
var task = await Task.WhenAny(tasks);
tasks.Remove(task);
var taskresult = await task;
// process result
}
If you are using C# 8 and .NET Core you can take advantage of IAsyncEnumerable to hide this complexity from the consuming side.
Just like this:
static async Task Main(string[] args)
{
await foreach (var data in GetData())
{
Console.WriteLine(data);
}
Console.ReadLine();
}
static async IAsyncEnumerable<string> GetData()
{
List<Task<string>> tasks = new List<Task<string>> {GetData1(), GetData3(), GetData2()};
while (tasks.Any())
{
var finishedTask = await Task.WhenAny(tasks);
tasks.Remove(finishedTask);
yield return await finishedTask;
}
}
static async Task<string> GetData1()
{
await Task.Delay(5000);
return "Data1";
}
static async Task<string> GetData2()
{
await Task.Delay(3000);
return "Data2";
}
static async Task<string> GetData3()
{
await Task.Delay(2000);
return "Data3";
}
You can use Task.WhenAny instead.
Example "stolen" from Stephen Cleary's Blog:
var client = new HttpClient();
string results = await await Task.WhenAny(
client.GetStringAsync("http://example.com"),
client.GetStringAsync("http://microsoft.com"));
// results contains the HTML for whichever website responded first.
Responding to comment
You absolutely can keep track of the other tasks:
// supposing you have a list of Tasks in `myTasks`:
while( myTasks.Count > 0 )
{
var finishedTask = await Task.WhenAny(myTasks);
myTasks.Remove(finishedTask);
handleFinishedTask(finishedTask); // assuming this is a method that
// does the work on finished tasks.
}
The only thing you'd have to watch out for is :
The returned task will always end in the RanToCompletion state with its Result set to the first task to complete. This is true even if the first task to complete ended in the Canceled or Faulted state.
Remarks in WhenAny Doks(Emphasis by me)
In case you want to process the results in order of the completion of the tasks, there is the OrderByCompletion extension method that does exactly that in Stephen Cleary's Nito.AsyncEx library, with the signature below:
// Creates a new collection of tasks that complete in order.
public static List<Task<T>> OrderByCompletion<T>(this IEnumerable<Task<T>> #this);
Usage example:
Task<string>[] tasks = new[] { task1, task2, task3, task4 };
foreach (var task in tasks.OrderByCompletion())
{
string result = await task;
// Do something with result
}
If you prefer not having external dependencies, the source code is here.
Based on the answer of Peter Csala, here a extension method for IAsyncEnumerable:
public static async IAsyncEnumerable<T> OrderedByCompletion<T>(this IEnumerable<Task<T>> tasks)
{
List<Task<T>> taskList = new List<Task<T>>(tasks);
while (taskList.Count > 0)
{
var finishedTask = await Task.WhenAny(taskList);
taskList.Remove(finishedTask);
yield return await finishedTask;
}
}

Switch new Task(()=>{ }) for Func<Task>

In an answer to one of my other questions, I was told that use of new Task(() => { }) is not something that is a normal use case. I was advised to use Func<Task> instead. I have tried to make that work, but I can't seem to figure it out. (Rather than drag it out in the comments, I am asking a separate question here.)
My specific scenario is that I need the Task to not start right when it is declared and to be able to wait for it later.
Here is a LinqPad example using new Task(() => { }). NOTE: This works perfectly! (Except that it uses new Task.)
static async void Main(string[] args)
{
// Line that I need to swap to a Func<Task> somehow.
// note that this is "cold" not started task
Task startupDone = new Task(() => { });
var runTask = DoStuff(() =>
{
//+++ This is where we want to task to "start"
startupDone.Start();
});
//+++ Here we wait for the task to possibly start and finish. Or timeout.
// Note that this times out at 1000ms even if "blocking = 10000" below.
var didStartup = startupDone.Wait(1000);
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
await runTask;
Console.Read();
}
public static async Task DoStuff(Action action)
{
// Swap to 1000 to simulate starting up blocking
var blocking = 1; //1000;
await Task.Delay(500 + blocking);
action();
// Do the rest of the stuff...
await Task.Delay(1000);
}
I tried swapping the second line with:
Func<Task> startupDone = new Func<Task>(async () => { });
But then the lines below the comments with +++ in them don't work right.
I swapped the startupDone.Start() with startupDone.Invoke().
But startupDone.Wait needs the task. Which is only returned in the lambda. I am not sure how to get access to the task outside the lambda so I can Wait for it.
How can use a Func<Task> and start it in one part of my code and do a Wait for it in another part of my code? (Like I can with new Task(() => { })).
The code you posted cannot be refactored to make use of a Func<Task> instead of a cold task, because the method that needs to await the task (the Main method) is not the same method that controls the creation/starting of the task (the lambda parameter of the DoStuff method). This could make the use of the Task constructor legitimate in this case, depending on whether the design decision to delegate the starting of the task to a lambda is justified. In this particular example the startupDone is used as a synchronization primitive, to signal that a condition has been met and the program can continue. This could be achieved equally well by using a specialized synchronization primitive, like for example a SemaphoreSlim:
static async Task Main(string[] args)
{
var startupSemaphore = new SemaphoreSlim(0);
Task runTask = RunAsync(startupSemaphore);
bool startupFinished = await startupSemaphore.WaitAsync(1000);
Console.WriteLine(startupFinished ? "Startup Finished" : "Startup Timed Out");
await runTask;
}
public static async Task RunAsync(SemaphoreSlim startupSemaphore)
{
await Task.Delay(500);
startupSemaphore.Release(); // Signal that the startup is done
await Task.Delay(1000);
}
In my opinion using a SemaphoreSlim is more meaningful in this case, and makes the intent of the code clearer. It also allows to await asynchronously the signal with a timeout WaitAsync(Int32), which is not something that you get from a Task out of the box (it is doable though).
Using cold tasks may be tempting in some cases, but when you revisit your code after a month or two you'll find yourself confused, because of how rare and unexpected is to have to deal with tasks that may or may have not been started yet.
I always try my hardest to never have blocking behavior when dealing with anything async or any type that represents potential async behavior such as Task. You can slightly modify your DoStuff to facilitate waiting on your Action.
static async void Main(string[] args)
{
Func<CancellationToken,Task> startupTask = async(token)=>
{
Console.WriteLine("Waiting");
await Task.Delay(3000, token);
Console.WriteLine("Completed");
};
using var source = new CancellationTokenSource(2000);
var runTask = DoStuff(() => startupTask(source.Token), source.Token);
var didStartup = await runTask;
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
Console.Read();
}
public static async Task<bool> DoStuff(Func<Task> action, CancellationToken token)
{
var blocking = 10000;
try
{
await Task.Delay(500 + blocking, token);
await action();
}
catch(TaskCanceledException ex)
{
return false;
}
await Task.Delay(1000);
return true;
}
First, the type of your "do this later" object is going to become Func<Task>. Then, when the task is started (by invoking the function), you get back a Task that represents the operation:
static async void Main(string[] args)
{
Func<Task> startupDoneDelegate = async () => { };
Task startupDoneTask = null;
var runTask = await DoStuff(() =>
{
startupDoneTask = startupDoneDelegate();
});
var didStartup = startupDoneTask.Wait(1000);
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
}

Performance impact of using async await when its not necessary

Let's assume I have these methods:
public async Task<Something> GetSomethingAsync()
{
var somethingService = new SomethingService();
return await service.GetAsync();
}
and
public Task<Something> GetSomethingAsync()
{
var somethingService = new SomethingService();
return service.GetAsync();
}
Both options compile and work the same way. Is there any best practise as to which option is better of if one is faster then the other?
Or is it just some syntactic sugar?
In the first method compiler will generate "state machine" code around it and execution will be returned to the line return await service.GetAsync(); after task will be completed. Consider example below:
public async Task<Something> GetSomethingAsync()
{
var somethingService = new SomethingService();
// Here execution returns to the caller and returned back only when Task is completed.
Something value = await service.GetAsync();
DoSomething();
return value;
}
The line DoSomething(); will be executed only after service.GetAsync task is completed.
Second approach simply starts execution of service.GetAsync and return correspondent Task to the caller without waiting for completion.
public Task<Something> GetSomethingAsync()
{
var somethingService = new SomethingService();
Task<Something> valueTask = service.GetAsync();
DoSomething();
return valueTask;
}
So in the example above DoSomething() will be executed straight after line Task<Something> valueTask = service.GetAsync(); without waiting for completion of task.
Executing async method on the another thread depend on the method itself.
If method execute IO operation, then another thread will be only waste of the thread, which do nothing, only waiting for response. On my opinion async - await are perfect approach for IO operations.
If method GetAsync contains for example Task.Run then execution goes to the another thread fetched from thread pool.
Below is short example, not a good one, but it show the logic a tried to explain:
static async Task GetAsync()
{
for(int i = 0; i < 10; i++)
{
Console.WriteLine($"Iterate GetAsync: {i}");
await Task.Delay(500);
}
}
static Task GetSomethingAsync() => GetAsync();
static void Main(string[] args)
{
Task gettingSomethingTask = GetSomethingAsync();
Console.WriteLine("GetAsync Task returned");
Console.WriteLine("Start sleeping");
Thread.Sleep(3000);
Console.WriteLine("End sleeping");
Console.WriteLine("Before Task awaiting");
gettingSomethingTask.Wait();
Console.WriteLine("After Task awaited");
Console.ReadLine();
}
And output will be next:
Iterate GetAsync: 0
GetAsync Task returned
Start sleeping
Iterate GetAsync: 1
Iterate GetAsync: 2
Iterate GetAsync: 3
Iterate GetAsync: 4
Iterate GetAsync: 5
End sleeping
Before Task awaiting
Iterate GetAsync: 6
Iterate GetAsync: 7
Iterate GetAsync: 8
Iterate GetAsync: 9
After Task awaited
As you can see executing of GetAsync starts straight after calling it.
If GetSomethingAsync() will be changed to the:
static Task GetSomethingAsync() => new Task(async () => await GetAsync());
Where GetAsync wrapped inside another Task, then GetAsync() will not be executed at all and output will be:
GetAsync Task returned
Start sleeping
End sleeping
Before Task awaiting
After Task awaited
Of course you will need to remove line gettingSomethingTask.Wait();, because then application just wait for task which not even started.

Adding List of tasks without executing

I have a method which returns a task, which I want to call multiple times and wait for any 1 of them to be successful. The issue I am facing is as soon as I add the task to the List, it executes and since I added delay to simulate the work, it just block there.
Is there a way to add the the tasks to the list without really executing it and let whenAny execute the tasks.
The below code is from Linqpad editor.
async void Main()
{
var t = new Test();
List<Task<int>> tasks = new List<Task<int>>();
for( int i =0; i < 5; i++)
{
tasks.Add(t.Getdata());
}
var result = await Task.WhenAny(tasks);
result.Dump();
}
public class Test
{
public Task<int> Getdata()
{
"In Getdata method".Dump();
Task.Delay(90000).Wait();
return Task.FromResult(10);
}
}
Update :: Below one makes it clear, I was under the impression that if GetData makes a call to network it will get blocked during the time it actually completes.
async void Main()
{
OverNetwork t = new OverNetwork();
List<Task<string>> websitesContentTask = new List<Task<string>>();
websitesContentTask.Add(t.GetData("http://www.linqpad.net"));
websitesContentTask.Add(t.GetData("http://csharpindepth.com"));
websitesContentTask.Add(t.GetData("http://www.albahari.com/nutshell/"));
Task<string> completedTask = await Task.WhenAny(websitesContentTask);
string output = await completedTask;
Console.WriteLine(output);
}
public class OverNetwork
{
private HttpClient client = new HttpClient();
public Task<string> GetData(string uri)
{
return client.GetStringAsync(uri);
}
}
since I added delay to simulate the work, it just block there
Actually, your problem is that your code is calling Wait, which blocks the current thread until the delay is completed.
To properly use Task.Delay, you should use await:
public async Task<int> Getdata()
{
"In Getdata method".Dump();
await Task.Delay(90000);
return 10;
}
Is there a way to add the the tasks to the list without really executing it and let whenAny execute the tasks.
No. WhenAny never executes tasks. Ever.
It's possible to build a list of asynchronous delegates, i.e., a List<Func<Task>> and execute them later, but I don't think that's what you're really looking for.
There are multiple tasks in your Getdata method. First does delay, but you are returning finished task which returned 10. Try to change your code like this
return Task.Delay(90000).ContinueWith(t => 10)

How do I convert this to an async task?

Given the following code...
static void DoSomething(int id) {
Thread.Sleep(50);
Console.WriteLine(#"DidSomething({0})", id);
}
I know I can convert this to an async task as follows...
static async Task DoSomethingAsync(int id) {
await Task.Delay(50);
Console.WriteLine(#"DidSomethingAsync({0})", id);
}
And that by doing so if I am calling multiple times (Task.WhenAll) everything will be faster and more efficient than perhaps using Parallel.Foreach or even calling from within a loop.
But for a minute, lets pretend that Task.Delay() does not exist and I actually have to use Thread.Sleep(); I know in reality this is not the case, but this is concept code and where the Delay/Sleep is would normally be an IO operation where there is no async option (such as early EF).
I have tried the following...
static async Task DoSomethingAsync2(int id) {
await Task.Run(() => {
Thread.Sleep(50);
Console.WriteLine(#"DidSomethingAsync({0})", id);
});
}
But, though it runs without error, according to Lucien Wischik this is in fact bad practice as it is merely spinning up threads from the pool to complete each task (it is also slower using the following console application - if you swap between DoSomethingAsync and DoSomethingAsync2 call you can see a significant difference in the time that it takes to complete)...
static void Main(string[] args) {
MainAsync(args).Wait();
}
static async Task MainAsync(String[] args) {
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(DoSomethingAsync2(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
I then tried the following...
static async Task DoSomethingAsync3(int id) {
await new Task(() => {
Thread.Sleep(50);
Console.WriteLine(#"DidSomethingAsync({0})", id);
});
}
Transplanting this in place of the original DoSomethingAsync, the test never completes and nothing is shown on screen!
I have also tried multiple other variations that either do not compile or do not complete!
So, given the constraint that you cannot call any existing asynchronous methods and must complete both the Thread.Sleep and the Console.WriteLine in an asynchronous task, how do you do it in a manner that is as efficient as the original code?
The objective here for those of you who are interested is to give me a better understanding of how to create my own async methods where I am not calling anybody elses. Despite many searches, this seems to be the one area where examples are really lacking - whilst there are many thousands of examples of calling async methods that call other async methods in turn I cannot find any that convert an existing void method to an async task where there is no call to a further async task other than those that use the Task.Run(() => {} ) method.
There are two kinds of tasks: those that execute code (e.g., Task.Run and friends), and those that respond to some external event (e.g., TaskCompletionSource<T> and friends).
What you're looking for is TaskCompletionSource<T>. There are various "shorthand" forms for common situations so you don't always have to use TaskCompletionSource<T> directly. For example, Task.FromResult or TaskFactory.FromAsync. FromAsync is most commonly used if you have an existing *Begin/*End implementation of your I/O; otherwise, you can use TaskCompletionSource<T> directly.
For more information, see the "I/O-bound Tasks" section of Implementing the Task-based Asynchronous Pattern.
The Task constructor is (unfortunately) a holdover from Task-based parallelism, and should not be used in asynchronous code. It can only be used to create a code-based task, not an external event task.
So, given the constraint that you cannot call any existing asynchronous methods and must complete both the Thread.Sleep and the Console.WriteLine in an asynchronous task, how do you do it in a manner that is as efficient as the original code?
I would use a timer of some kind and have it complete a TaskCompletionSource<T> when the timer fires. I'm almost positive that's what the actual Task.Delay implementation does anyway.
So, given the constraint that you cannot call any existing
asynchronous methods and must complete both the Thread.Sleep and the
Console.WriteLine in an asynchronous task, how do you do it in a
manner that is as efficient as the original code?
IMO, this is a very synthetic constraint that you really need to stick with Thread.Sleep. Under this constraint, you still can slightly improve your Thread.Sleep-based code. Instead of this:
static async Task DoSomethingAsync2(int id) {
await Task.Run(() => {
Thread.Sleep(50);
Console.WriteLine(#"DidSomethingAsync({0})", id);
});
}
You could do this:
static Task DoSomethingAsync2(int id) {
return Task.Run(() => {
Thread.Sleep(50);
Console.WriteLine(#"DidSomethingAsync({0})", id);
});
}
This way, you'd avoid an overhead of the compiler-generated state machine class. There is a subtle difference between these two code fragments, in how exceptions are propagated.
Anyhow, this is not where the bottleneck of the slowdown is.
(it is also slower using the following console application - if you
swap between DoSomethingAsync and DoSomethingAsync2 call you can see a
significant difference in the time that it takes to complete)
Let's look one more time at your main loop code:
static async Task MainAsync(String[] args) {
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(DoSomethingAsync2(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
Technically, it requests 1000 tasks to be run in parallel, each supposedly to run on its own thread. In an ideal universe, you'd expect to execute Thread.Sleep(50) 1000 times in parallel and complete the whole thing in about 50ms.
However, this request is never satisfied by the TPL's default task scheduler, for a good reason: thread is a precious and expensive resource. Moreover, the actual number of concurrent operations is limited to the number of CPUs/cores. So in reality, with the default size of ThreadPool, I'm getting 21 pool threads (at peak) serving this operation in parallel. That is why DoSomethingAsync2 / Thread.Sleep takes so much longer than DoSomethingAsync / Task.Delay. DoSomethingAsync doesn't block a pool thread, it only requests one upon the completion of the time-out. Thus, more DoSomethingAsync tasks can actually run in parallel, than DoSomethingAsync2 those.
The test (a console app):
// https://stackoverflow.com/q/21800450/1768303
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace Console_21800450
{
public class Program
{
static async Task DoSomethingAsync(int id)
{
await Task.Delay(50);
UpdateMaxThreads();
Console.WriteLine(#"DidSomethingAsync({0})", id);
}
static async Task DoSomethingAsync2(int id)
{
await Task.Run(() =>
{
Thread.Sleep(50);
UpdateMaxThreads();
Console.WriteLine(#"DidSomethingAsync2({0})", id);
});
}
static async Task MainAsync(Func<int, Task> tester)
{
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(tester(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
volatile static int s_maxThreads = 0;
static void UpdateMaxThreads()
{
var threads = Process.GetCurrentProcess().Threads.Count;
// not using locks for simplicity
if (s_maxThreads < threads)
s_maxThreads = threads;
}
static void TestAsync(Func<int, Task> tester)
{
s_maxThreads = 0;
var stopwatch = new Stopwatch();
stopwatch.Start();
MainAsync(tester).Wait();
Console.WriteLine(
"time, ms: " + stopwatch.ElapsedMilliseconds +
", threads at peak: " + s_maxThreads);
}
static void Main()
{
Console.WriteLine("Press enter to test with Task.Delay ...");
Console.ReadLine();
TestAsync(DoSomethingAsync);
Console.ReadLine();
Console.WriteLine("Press enter to test with Thread.Sleep ...");
Console.ReadLine();
TestAsync(DoSomethingAsync2);
Console.ReadLine();
}
}
}
Output:
Press enter to test with Task.Delay ...
...
time, ms: 1077, threads at peak: 13
Press enter to test with Thread.Sleep ...
...
time, ms: 8684, threads at peak: 21
Is it possible to improve the timing figure for the Thread.Sleep-based DoSomethingAsync2? The only way I can think of is to use TaskCreationOptions.LongRunning with Task.Factory.StartNew:
You should think twice before doing this in any real-life application:
static async Task DoSomethingAsync2(int id)
{
await Task.Factory.StartNew(() =>
{
Thread.Sleep(50);
UpdateMaxThreads();
Console.WriteLine(#"DidSomethingAsync2({0})", id);
}, TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);
}
// ...
static void Main()
{
Console.WriteLine("Press enter to test with Task.Delay ...");
Console.ReadLine();
TestAsync(DoSomethingAsync);
Console.ReadLine();
Console.WriteLine("Press enter to test with Thread.Sleep ...");
Console.ReadLine();
TestAsync(DoSomethingAsync2);
Console.ReadLine();
}
Output:
Press enter to test with Thread.Sleep ...
...
time, ms: 3600, threads at peak: 163
The timing gets better, but the price for this is high. This code asks the task scheduler to create a new thread for each new task. Do not expect this thread to come from the pool:
Task.Factory.StartNew(() =>
{
Thread.Sleep(1000);
Console.WriteLine("Thread pool: " +
Thread.CurrentThread.IsThreadPoolThread); // false!
}, TaskCreationOptions.LongRunning).Wait();

Categories

Resources