Task.WhenAll finishing before Tasks have completed - c#

My code is continuing to execute before all tasks have been completed.
I've had a look at other people with a similar problem but can't see anything obvious!
static Task MoveAccountAsync(MoverParams moverParams)
{
return Task.Run(() =>
{
Console.WriteLine("Moving {0}", moverParams.Account.Name);
moverParams.Account.Mover.RefreshRoom();
moverParams.Account.Mover.PathfindTo(moverParams.Room);
});
}
static async void MoveAccountsAsync(List<Account> accounts, int room)
{
List<Task> theTasks = new List<Task>();
foreach (Account account in accounts)
{
// Create a new task and add it to the task list
theTasks.Add(MoveAccountAsync(new MoverParams(account, room)));
}
await Task.WhenAll(theTasks);
Console.WriteLine("Finished moving.");
}
Then simply calling it from static main:
MoveAccountsAsync(theAccounts, room);
Help much appreciated!
Cheers,
Dave

async void methods are highly discouraged and often times (e.g. here) sign of an issue.
Because you're not awaiting your method call (and you can't await it because it returns void) caller will not wait for all the work to finish before moving on to the next statement.
Change your method to return Task and await it to fix the problem. If you're calling into MoveAccountsAsync from synchronous context (e.g. Main method) use Wait to wait on the results. But be aware that in certain conditions (e.g. if run as part of ASP.NET application) that might cause deadlocks.

Related

await on task to finish, with operations after the 'await'

I know I may be downvoted but apparently I just don't understand async-await enough and the questions/answer I found, and the articles I found, didn't help me to find an answer for this question:
How can I make "2" to be printed out? Or actually, WHY doesn't 2 gets printed out, both in await t and in t.Wait() ?:
static Task t;
public static async void Main()
{
Console.WriteLine("Hello World");
t = GenerateTask();
await t;
//t.Wait();
Console.WriteLine("Finished");
}
public static Task GenerateTask()
{
var res = new Task(async () =>
{
Console.WriteLine("1");
await Task.Delay(10000);
Console.WriteLine("2");
});
res.Start();
return res;
}
Edit: I'm creating a task and returning it cause in real-life I need to await on this task later on, from a different method.
Edit2: await Task.Delay is just a placeholder for a real-life await on a different function.
Printing '2'
The 2 is actually printed, 10 seconds after 1 is printed. You can observe this if you add Console.ReadLine(); after printing 'Finished'.
The output is
Hello World
1
Finished
2
What is happening?
When you await t (which is res in GenerateTask method) you are awaiting the created Task and not the task that res created.
How to fix (fancy way)
You will need to await both the outer task and inner task. To be able to await the inner task you need to expose it. To expose it you need to change the type of the task from Task to Task<Task> and the return type from Task to Task<Task>.
It could look something like this:
public static async Task Main()
{
Console.WriteLine("Hello World");
var outerTask = GenerateTask();
var innerTask = await outerTask; // what you have
await innerTask; // extra await
Console.WriteLine("Finished");
Console.ReadLine();
}
public static Task<Task> GenerateTask() // returns Task<Task>, not Task
{
var res = new Task<Task>(async () => // creates Task<Task>, not Task
{
Console.WriteLine("1");
await Task.Delay(TimeSpan.FromSeconds(10));
Console.WriteLine("2");
});
res.Start();
return res;
}
The output now is:
Hello World
1
2
Finished
How to fix (easy way)
The outer task is not needed.
public static async Task Main()
{
Console.WriteLine("Hello World");
var t = GenerateTask();
await t;
Console.WriteLine("Finished");
Console.ReadLine();
}
public static async Task GenerateTask()
{
Console.WriteLine("1");
await Task.Delay(TimeSpan.FromSeconds(10));
Console.WriteLine("2");
}
It looks like it's because the constructor to new Task only takes some form of an Action (So the Task never gets returned even though it's async). So essentially what you're doing is an Async void with your delegate. Your await Task.Delay(10000) is returning and the action is considered 'done'.
You can see this if you change the await Task.Delay(10000) to Task.Delay(10000).Wait() and remove the async from the delegate.
On another note though, I've never personally seen or used new Task before. Task.Run() is a much more standard way to do it, and it'll allow for the await to be used. Also means you don't have to call Start() yourself.
Also you might already know this but, in this specific case you don't need a new task at all. You can just do this:
public static async Task GenerateTask()
{
Console.WriteLine("1");
await Task.Delay(10000);
Console.WriteLine("2");
}
Regarding your edits
Replacing your GenerateTask with what I wrote should do what you want. The async/await will turn your method into a Task that has started execution. This is exactly what you are trying to do so I'm not quite sure what you are asking with your edits.
The task returned from GenerateTask can be awaited whenever you want, or not awaited at all. You should almost never need to do new Task(). The only reason I can think is if you wanted to delay execution of the task until later, but there would be better ways around it rather than calling new Task().
If you use the way I showed in your real-life situation, let me know what doesn't work about it and I'll be happy to help.
You should use Task.Run() rather than creating a Task directly:
public static Task GenerateTask()
{
return Task.Run(async () =>
{
Console.WriteLine("1");
await Task.Delay(10000);
Console.WriteLine("2");
});
}
Task.Start() doesn't work because it doesn't understand async delegates, and the returned task just represents the beginning of the task.
Note that you can't fix this by using Task.Factory.StartNew() either, for the same reason.
See Stephen Cleary's blog post on this issue, from which I quote:
[Task.Factory.StartNew()] Does not understand async delegates. This is actually the same as
point 1 in the reasons why you would want to use StartNew. The problem
is that when you pass an async delegate to StartNew, it’s natural to
assume that the returned task represents that delegate. However, since
StartNew does not understand async delegates, what that task actually
represents is just the beginning of that delegate. This is one of the
first pitfalls that coders encounter when using StartNew in async
code.
These comments also apply to the Task constructor, which also doesn't understand async delegates.
However, it's important to note that if you are already awaiting in the code and you don't need to parallelise some compute-bound code, you don't need to create a new task at all - just using the code in your Task.Run() on its own will do.

Nested async methods in a Parallel.ForEach

I have a method that runs multiple async methods within it. I have to iterate over a list of devices, and pass the device to this method. I am noticing that this is taking a long time to complete so I am thinking of using Parallel.ForEach so it can run this process against multiple devices at the same time.
Let's say this is my method.
public async Task ProcessDevice(Device device) {
var dev = await _deviceService.LookupDeviceIndbAsNoTracking(device);
var result = await DoSomething(dev);
await DoSomething2(dev);
}
Then DoSomething2 also calls an async method.
public async Task DoSomething2(Device dev) {
foreach(var obj in dev.Objects) {
await DoSomething3(obj);
}
}
The list of devices continuously gets larger over time, so the more this list grows, the longer it takes the program to finish running ProcessDevice() against each device. I would like to process more than one device at a time. So I have been looking into using Parallel.ForEach.
Parallel.ForEach(devices, async device => {
try {
await ProcessDevice(device);
} catch (Exception ex) {
throw ex;
}
})
It appears that the program is finishing before the device is fully processed. I have also tried creating a list of tasks, and then foreach device, add a new task running ProcessDevice to that list and then awaiting Task.WhenAll(listOfTasks);
var listOfTasks = new List<Task>();
foreach(var device in devices) {
var task = Task.Run(async () => await ProcessDevice(device));
listOfTasks.Add(task);
}
await Task.WhenAll(listOfTasks);
But it appears that the task is marked as completed before ProcessDevice() is actually finished running.
Please excuse my ignorance on this issue as I am new to parallel processing and not sure what is going on. What is happening to cause this behavior and is there any documentation that you could provide that could help me better understand what to do?
You can't mix async with Parallel.ForEach. Since your underlying operation is asynchronous, you'd want to use asynchronous concurrency, not parallelism. Asynchronous concurrency is most easily expressed with WhenAll:
var listOfTasks = devices.Select(ProcessDevice).ToList();
await Task.WhenAll(listOfTasks);
In your last example there's a few problems:
var listOfTasks = new List<Task>();
foreach (var device in devices)
{
await Task.Run(async () => await ProcessDevice(device));
}
await Task.WhenAll(listOfTasks);
Doing await Task.Run(async () => await ProcessDevice(device)); means you are not moving to the next iteration of the foreach loop until the previous one is done. Essentially, you're still doing them one at a time.
Additionally, you aren't adding any tasks to listOfTasks so it remains empty and therefore Task.WhenAll(listOfTasks) completes instantly because there's no tasks to await.
Try this:
var listOfTasks = new List<Task>();
foreach (var device in devices)
{
var task = Task.Run(async () => await ProcessDevice(device))
listOfTasks.Add(task);
}
await Task.WhenAll(listOfTasks);
I can explain the problem with Parallel.ForEach. An important thing to understand is that when the await keyword acts on an incomplete Task, it returns. It will return its own incomplete Task if the method signature allows (if it's not void). Then it is up to the caller to use that Task object to wait for the job to finish.
But the second parameter in Parallel.ForEach is an Action<T>, which is a void method, which means no Task can be returned, which means the caller (Parallel.ForEach in this case) has no way to wait until the job has finished.
So in your case, as soon as it hits await ProcessDevice(device), it returns and nothing waits for it to finish so it starts the next iteration. By the time Parallel.ForEach is finished, all it has done is started all the tasks, but not waited for them.
So don't use Parallel.ForEach with asynchronous code.
Stephen's answer is more appropriate. You can also use WSC's answer, but that can be dangerous with larger lists. Creating hundreds or thousands of new threads all at once will not help your performance.
not very sure it this if what you are asking for, but I can give example of how we start async process
private readonly Func<Worker> _worker;
private void StartWorkers(IEnumerable<Props> props){
Parallel.ForEach(props, timestamp => { _worker.Invoke().Consume(timestamp); });
}
Would recommend reading about Parallel.ForEach as it will do some part for you.

Combining CPU-bound and IO-bound work in async method

There is such application:
static void Main(string[] args)
{
HandleRequests(10).Wait();
HandleRequests(50).Wait();
HandleRequests(100).Wait();
HandleRequests(1000).Wait();
Console.ReadKey();
}
private static async Task IoBoundWork()
{
await Task.Delay(100);
}
private static void CpuBoundWork()
{
Thread.Sleep(100);
}
private static async Task HandleRequest()
{
CpuBoundWork();
await IoBoundWork();
}
private static async Task HandleRequests(int numberOfRequests)
{
var sw = Stopwatch.StartNew();
var tasks = new List<Task>();
for (int i = 0; i < numberOfRequests; i++)
{
tasks.Add(HandleRequest());
}
await Task.WhenAll(tasks.ToArray());
sw.Stop();
Console.WriteLine(sw.Elapsed);
}
Below the output of this app:
From my perspective having CPU-bound and IO-bound parts in one method it is quite regular situation, e.g. parsing/archiving/serialization of some object and saving that to the disk, so it should probably work well. However in the implementation above it works very slow. Could you please help me to understand why?
If we wrap the body of CpuBoundWork() in Task it significantly improve performance:
private static async Task CpuBoundWork()
{
await Task.Run(() => Thread.Sleep(100));
}
private static async Task HandleRequest()
{
await CpuBoundWork();
await IoBoundWork();
}
Why it works so slow without Task.Run? Why we can see performance boost after adding Task.Run? Should we always use such approach in similar methods?
for (int i = 0; i < numberOfRequests; i++)
{
tasks.Add(HandleRequest());
}
The returned task is created at the first await in the HandleRequest(). So you are executing all CPU bound code on one thread: the for loop thread. complete serialization, no parallelism at all.
When you wrap the CPU part in a task you are actually submitting the CPU part as Tasks, so they are executed in parallel.
The way you're doing, this is what happens:
|-----------HandleRequest Timeline-----------|
|CpuBoundWork Timeline| |IoBoundWork Timeline|
Try doing it like this:
private static async Task HandleRequest()
{
await IoBoundWork();
CpuBoundWork();
}
It has the advantage of starting the IO work and while it waits, the CpuBoundWork() can do the processing. You only await at the last moment you need the response.
The timeline would look somewhat like this:
|--HandleRequest Timeline--|
|Io...
|CpuBoundWork Timeline|
...BoundWork Timeline|
On a side note, open extra threads (Task.Run) with caution in an web environment, you already have a thread per request, so multiplying them will have a negative impact on scalability.
You've indicated that your method ought to be asynchronous, by having it return a Task, but you've not actually made it (entirely) asynchronous. You're implementation of the method does a bunch of expensive, long running, work synchronously, and then returns to the caller and does some other work asynchronously.
Your callers of the method, however, assume that it's actually asynchronous (in entirety) and that it doesn't do expensive work synchronously. They assume that they can call the method multiple times, have it return immediately, and then continue on, but since your implementation doesn't return immediately, and instead does a bunch of expensive work before returning, that code doesn't work properly (specifically, it's not able to start the next operation until the previous one returns, so that synchronous work isn't being done in parallel).
Note that your "fix" isn't quite idiomatic. You're using the async over sync anti-pattern. Rather than making CpuBoundWork async and having it return a Task, despite being a CPU bound operation, it should remain as is an HandleRequest should handle indicating that the CPU bound work should be done asynchronously in another thread by calling Task.Run:
private static async Task HandleRequest()
{
await Task.Run(() => CpuBoundWork());
await IoBoundWork();
}

How do I create a Task that uses await inside the body that behaves the same as the synchronous version when Wait is called?

I have some code that creates a task that does some slow work like this:
public static Task wait1()
{
return new Task(() =>
{
Console.WriteLine("Waiting...");
Thread.Sleep(10000);
Console.WriteLine("Done!");
});
}
In the real implementation, the Thread.Sleep will actually be a web service call. I would like to change the body of the method can use await (so it does not consume a thread during the network access/sleep). My first attempt (based on shotgun-debugging the compile errors) was this:
public static Task wait2()
{
return new Task(async () =>
{
Console.WriteLine("Waiting...");
await Task.Delay(10000);
Console.WriteLine("Done!");
});
}
However; this task doesn't seem to behave the same as the first one, because when I call .Wait() on it; it returns immediately.
Below is a full sample (console app) showing the differences (the app will end immediately when the second task starts).
What do I need to do so that I can call Start and Wait on a Task which happens to have code using await inside it? The tasks are queued and executed later by an agent, so it's vital that the task is not auto-started.
class Program
{
static void Main(string[] args)
{
var w1 = wait1();
w1.Start();
w1.Wait(); // This waits 110 seconds
var w2 = wait2();
w2.Start();
w2.Wait(); // This returns immediately
}
public static Task wait1()
{
return new Task(() =>
{
Console.WriteLine("Waiting...");
Thread.Sleep(10000);
Console.WriteLine("Done!");
});
}
public static Task wait2()
{
return new Task(async () =>
{
Console.WriteLine("Waiting...");
await Task.Delay(10000);
Console.WriteLine("Done!");
});
}
}
It seems like this isn't possible! See alexm's answer here:
Tasks returned by async methods are always hot i.e. they are created in Running state.
:-(
I've worked around this by making my agent queue Func<Task>s instead, and the overload that receives a task simply queues () => task. Then; when de-queing a task, I check if it's not running, and if so, start it:
var currentTask = currentTaskFunction();
if (currentTask.Status == TaskStatus.Created)
currentTask.Start();
It seems a little clunky to have to do this (if this simple workaround works; why the original restriction on async methods always being created hot?), but it seems to work for me :-)
You could write this as:
public static async Task Wait2()
{
Console.WriteLine("Waiting...");
await Task.Delay(10000);
Console.WriteLine("Done!");
}
In general, it's rarely a good idea to ever use new Task or new Task<T>. If you must launch a task using the ThreadPool instead of using the async/await language support to compose one, you should use Task.Run to start the task. This will schedule the task to run (which is important, tasks should always be "hot" by conventions).
Note that doing this will make it so you don't have to call Task.Start, as well.
To help you understand this realize that async / await essentially does not create a new thread but rather it schedules that portion of code to be ran at an available point in time.
When you create the new Task(async () => ...) you have a task that run an async method. When that inner async method hits an await the 'new Task' is considered complete because the rest of it has been scheduled. To help you understand better place some code (a lot if wanted) in the 'new Task' before the await command. It will all execute before the application terminates and once await is reached that task will believe it has completed. It then returns and exits the application.
The best way to avoid this is to not place any task or async methods inside of your task.
Remove the async keyword and the await keyword from the method and it will work as expected.
This is the same as creating a callback if you're familiar with that.
void MethodAsync(Action callback)
{
//...some code
callback?.Invoke();
}
//using this looks like this.
MethodAsync(() => { /*code to run when complete */});
//This is the same as
Task MethodAsync()
{
//... some code here
}
//using it
await MethodAsync();
/*code to run when complete */
The thing to understand is that you're creating a new task within a task basically. So the inner 'callback' is being created at the await keyword.
You're code looks like this..
void MethodAsync(Action callback)
{
//some code to run
callback?.Invoke(); // <- this is the await keyword
//more code to run.. which happens after we run whoever is
//waiting on callback
}
There's code missing obviously. If this doesn't make sense please feel free to contact me and I'll assist. async / await (meant to make things simpler) is a beast to wrap your head around at first. Afterward you get it then it'll probably be your favorite thing in c# since linq. :P
Try this:
public async static Task wait2()
{
Console.WriteLine("Waiting...");
await Task.Delay(2000);
Console.WriteLine("Done!");
}
But we aware that the task is already started so you don't have to call start:
var w2 = wait2();
//w2.Start();
w2.Wait();
I think the problem with your wait2 function is that is creating 2 task, the one in new Task(...) and another in Task.Delay(). You are waiting for the first one, but you are not waiting for the inner one.

C# async await Task.delay in Action

I'm having some trouble getting a task to asynchronously delay. I am writing an application that needs to run at a scale of tens/hundreds of thousands of asynchronously executing scripts. I am doing this using C# Actions and sometimes, during the execution of a particular sequence, in order for the script to execute properly, it needs to wait on an external resource to reach an expected state. At first I wrote this using Thread.Sleep() but that turned out to be a torpedo in the applications performance, so I'm looking into async/await for async sleep. But I can't get it to actually wait on the pause! Can someone explain this?
static void Main(string[] args)
{
var sync = new List<Action>();
var async = new List<Action>();
var syncStopWatch = new Stopwatch();
sync.Add(syncStopWatch.Start);
sync.Add(() => Thread.Sleep(1000));
sync.Add(syncStopWatch.Stop);
sync.Add(() => Console.Write("Sync:\t" + syncStopWatch.ElapsedMilliseconds + "\n"));
var asyncStopWatch = new Stopwatch();
sync.Add(asyncStopWatch.Start);
sync.Add(async () => await Task.Delay(1000));
sync.Add(asyncStopWatch.Stop);
sync.Add(() => Console.Write("Async:\t" + asyncStopWatch.ElapsedMilliseconds + "\n"));
foreach (Action a in sync)
{
a.Invoke();
}
foreach (Action a in async)
{
a.Invoke();
}
}
The results of the execution are:
Sync: 999
Async: 2
How do I get it to wait asynchronously?
You're running into a problem with async void. When you pass an async lambda to an Action, the compiler is creating an async void method for you.
As a best practice, you should avoid async void.
One way to do this is to have your list of actions actually be a List<Func<Task>> instead of List<Action>. This allows you to queue async Task methods instead of async void methods.
This means your "execution" code would have to wait for each Task as it completes. Also, your synchronous methods would have to return Task.FromResult(0) or something like that so they match the Func<Task> signature.
If you want a bigger scope solution, I recommend you strongly consider TPL Dataflow instead of creating your own queue.

Categories

Resources