Combining CPU-bound and IO-bound work in async method

Consider the following application:
static void Main(string[] args)
{
HandleRequests(10).Wait();
HandleRequests(50).Wait();
HandleRequests(100).Wait();
HandleRequests(1000).Wait();
Console.ReadKey();
}
private static async Task IoBoundWork()
{
await Task.Delay(100);
}
private static void CpuBoundWork()
{
Thread.Sleep(100);
}
private static async Task HandleRequest()
{
CpuBoundWork();
await IoBoundWork();
}
private static async Task HandleRequests(int numberOfRequests)
{
var sw = Stopwatch.StartNew();
var tasks = new List<Task>();
for (int i = 0; i < numberOfRequests; i++)
{
tasks.Add(HandleRequest());
}
await Task.WhenAll(tasks.ToArray());
sw.Stop();
Console.WriteLine(sw.Elapsed);
}
[Output of this app: the elapsed time printed for each of the four runs; the values are not preserved.]
From my perspective, having CPU-bound and IO-bound parts in one method is quite a common situation, e.g. parsing/archiving/serializing some object and saving it to disk, so it should probably work well. However, the implementation above is very slow. Could you please help me understand why?
If we wrap the body of CpuBoundWork() in a Task, performance improves significantly:
private static async Task CpuBoundWork()
{
await Task.Run(() => Thread.Sleep(100));
}
private static async Task HandleRequest()
{
await CpuBoundWork();
await IoBoundWork();
}
Why is it so slow without Task.Run? Why do we see a performance boost after adding Task.Run? Should we always use this approach in similar methods?

for (int i = 0; i < numberOfRequests; i++)
{
tasks.Add(HandleRequest());
}
An async method runs synchronously on the caller's thread until it reaches its first await of an incomplete task; only then is the Task returned to the caller. So you are executing all the CPU-bound code on one thread, the thread running the for loop: it is completely serialized, with no parallelism at all.
When you wrap the CPU part in Task.Run, each CPU part is queued to the thread pool as its own task, so they are executed in parallel.
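To make the synchronous prefix concrete, here is a minimal sketch. The class, field, and method names are hypothetical and not part of the question; the sketch only records which thread runs the code that precedes the first await and compares it with the caller's thread.

```csharp
using System;
using System.Threading.Tasks;

public static class SyncPrefixDemo
{
    // Hypothetical helper field: the thread that ran the code before the first await.
    private static int _prefixThreadId;

    private static async Task HandleRequestAsync()
    {
        // Everything up to the first await of an incomplete task runs
        // synchronously on the caller's thread.
        _prefixThreadId = Environment.CurrentManagedThreadId;
        await Task.Delay(10).ConfigureAwait(false);
    }

    public static bool PrefixRanOnCallerThread()
    {
        int callerThreadId = Environment.CurrentManagedThreadId;
        // The call returns only after the method has reached its first await.
        Task pending = HandleRequestAsync();
        bool sameThread = _prefixThreadId == callerThreadId;
        pending.Wait();
        return sameThread;
    }
}
```

Calling SyncPrefixDemo.PrefixRanOnCallerThread() should return true, which is why a Thread.Sleep placed before the first await delays the loop that creates the tasks.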

The way you're doing it, this is what happens:
|-----------HandleRequest Timeline-----------|
|CpuBoundWork Timeline| |IoBoundWork Timeline|
Try doing it like this:
private static async Task HandleRequest()
{
await IoBoundWork();
CpuBoundWork();
}
It has the advantage of starting the IO work and while it waits, the CpuBoundWork() can do the processing. You only await at the last moment you need the response.
The timeline would look somewhat like this:
|--HandleRequest Timeline--|
|Io...
|CpuBoundWork Timeline|
...BoundWork Timeline|
On a side note, use extra threads (Task.Run) with caution in a web environment: you already have a thread per request, so multiplying them will hurt scalability.
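The snippet in this answer awaits the IO before running the CPU work. The overlap that the text describes (start the IO, do the CPU work while it is in flight, await only when the result is needed) could be sketched as follows. The stand-in methods and their return values are assumptions made for illustration, not the question's code.

```csharp
using System.Threading.Tasks;

public static class OverlapDemo
{
    // Hypothetical stand-in for real IO; completes after roughly 100 ms.
    private static async Task<string> IoBoundWorkAsync()
    {
        await Task.Delay(100).ConfigureAwait(false);
        return "io-done";
    }

    // Hypothetical stand-in for real CPU work: sums the integers 0..99.
    private static int CpuBoundWork()
    {
        int sum = 0;
        for (int i = 0; i < 100; i++) sum += i;
        return sum;
    }

    public static async Task<string> HandleRequestAsync()
    {
        Task<string> ioTask = IoBoundWorkAsync(); // start the IO, do not await yet
        int cpuResult = CpuBoundWork();           // runs while the IO is in flight
        string ioResult = await ioTask.ConfigureAwait(false); // await only when needed
        return ioResult + ":" + cpuResult;
    }
}
```

Note that the CPU part still runs synchronously on the caller's thread here, so in a loop that starts many requests it would still be serialized; this pattern overlaps CPU and IO within one request, it does not parallelize the CPU work across requests.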

You've indicated that your method ought to be asynchronous by having it return a Task, but you've not actually made it (entirely) asynchronous. Your implementation of the method does a bunch of expensive, long-running work synchronously, and then returns to the caller and does some other work asynchronously.
Your callers of the method, however, assume that it's actually asynchronous (in entirety) and that it doesn't do expensive work synchronously. They assume that they can call the method multiple times, have it return immediately, and then continue on, but since your implementation doesn't return immediately, and instead does a bunch of expensive work before returning, that code doesn't work properly (specifically, it's not able to start the next operation until the previous one returns, so that synchronous work isn't being done in parallel).
Note that your "fix" isn't quite idiomatic. You're using the async-over-sync anti-pattern. Rather than making CpuBoundWork async and having it return a Task despite being a CPU-bound operation, it should remain as is, and HandleRequest should indicate that the CPU-bound work is to be done asynchronously on another thread by calling Task.Run:
private static async Task HandleRequest()
{
await Task.Run(() => CpuBoundWork());
await IoBoundWork();
}

Related

Is it necessary to wrap the continuation that will be in non-GUI thread with Task.Run?

This question is for learning purposes; I am not developing anything in particular.
I have two long-running CPU-bound operations (JobA and JobB). Neither interacts with the GUI. Unlike Task.FromResult, which completes immediately at the await expression, my Task.Run(()=>JobA()).ConfigureAwait(false) will return control to the caller and cause the continuation to be executed on a non-GUI thread (because of ConfigureAwait(false)).
static void JobA()
{
for (int i = 0; i < int.MaxValue; i++) ;
}
static void JobB()
{
for (int i = 0; i < int.MaxValue; i++) ;
}
private static async Task Async()
{
await Task.Run(()=>JobA()).ConfigureAwait(false);
JobB();
//await Task.Run(() => JobB());
}
private async void Button_Click(object sender, RoutedEventArgs e)
{
await Async();
}
Question
In my understanding, wrapping JobB with Task.Run as in the second case below is unnecessary because the continuation is already guaranteed to run on a non-GUI thread.
private static async Task Async()
{
await Task.Run(()=>JobA()).ConfigureAwait(false);
JobB();
}
private static async Task Async()
{
await Task.Run(()=>JobA()).ConfigureAwait(false);
await Task.Run(() => JobB());
}
Exception behavior in asynchronous code is a bit tricky, so I am asking this question because I want to know whether eliding Task.Run is risky when an exception occurs. If there is no such risk, I will delete this question.

my Task.Run(()=>JobA()).ConfigureAwait(false) will return control to the caller and causes the continuation to be executed in non-GUI thread (because of ConfigureAwait(false))
Really? Are you sure?
One interesting aspect of await is that it behaves synchronously if possible. So if the task has already completed by the time await checks it, then await will continue running synchronously. In this scenario, ConfigureAwait has no effect.
Notably, this can happen when you have different computers with different CPU speeds, or memory available, or cache behavior. With a dose of Murphy's Law, you end up with a production issue that you can't reproduce, which is always fun.
So, I never rely on ConfigureAwait(false) to guarantee that any code is running on a thread pool thread. That's what Task.Run is for. For the simple case you posted, you can do one job after another within the Task.Run: await Task.Run(() => { JobA(); JobB(); });
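The point that await continues synchronously when the task is already complete can be checked with a small sketch. The class and method names are hypothetical; Task.CompletedTask stands in for any task that happens to have finished before it is awaited.

```csharp
using System;
using System.Threading.Tasks;

public static class CompletedAwaitDemo
{
    public static async Task<bool> ContinuesOnSameThreadAsync()
    {
        int before = Environment.CurrentManagedThreadId;
        // The task is already complete, so await does not suspend the method
        // and ConfigureAwait(false) has nothing to configure.
        await Task.CompletedTask.ConfigureAwait(false);
        return before == Environment.CurrentManagedThreadId;
    }
}
```

The returned task is already completed when the call returns, and its result should be true: the code after the await ran on the calling thread, which on a GUI thread means JobB would run on the GUI thread.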

Async method blocking on unawaited task

In my current project, I have a piece of code that, after simplifying it down to where I'm having issues, looks something like this:
private async Task RunAsync(CancellationToken cancel)
{
bool finished = false;
while (!cancel.IsCancellationRequested && !finished)
finished = await FakeTask();
}
private Task<bool> FakeTask()
{
return Task.FromResult(false);
}
If I use this code without awaiting, I end up blocking anyway:
// example 1
var task = RunAsync(cancel); // Code blocks here...
... // Other code that could run while RunAsync is doing its thing, but is forced to wait
await task;
// example 2
var task = RunAsync(cancelSource.Token); // Code blocks here...
cancelSource.Cancel(); // Never called
In the actual project, I'm not actually using FakeTask, and there usually will be some Task.Delay I'm awaiting in there, so the code most of the time doesn't actually block, or only for a limited amount of iterations.
In unit testing, however, I'm using a mock object that does pretty much do what FakeTask does, so when I want to see if RunAsync responds to its CancellationToken getting cancelled the way I expect it to, I'm stuck.
I have found I can fix this issue by adding for example await Task.Delay(1) at the top of RunAsync, to force it to truly run asynchronous, but this feels a bit hacky. Are there better alternatives?

You have an incorrect mental picture of what await does. The meaning of await is:
Check to see if the awaitable object is complete. If it is, fetch its result and continue executing the coroutine.
If it is not complete, sign up the remainder of the current method as the continuation of the awaitable and suspend the coroutine by returning control to the caller. (Note that this makes it a semicoroutine.)
In your program, the "fake" awaitable is always complete, so there is never a suspension of the coroutine.
Are there better alternatives?
If your control flow logic requires you to suspend the coroutine then use Task.Yield.
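A minimal sketch of that suggestion, applied to the question's RunAsync, might look like this. It assumes there is no SynchronizationContext (as in a console app), so the continuation after Task.Yield runs on a thread pool thread; CancelStopsLoop is a hypothetical helper added only to exercise the method.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class YieldDemo
{
    // Same shape as the question's FakeTask: always already complete.
    private static Task<bool> FakeTask() => Task.FromResult(false);

    public static async Task RunAsync(CancellationToken cancel)
    {
        // Force a real suspension so the caller gets control back immediately.
        await Task.Yield();
        bool finished = false;
        while (!cancel.IsCancellationRequested && !finished)
            finished = await FakeTask();
    }

    // Hypothetical helper: starts the loop, cancels it, and reports whether it stopped.
    public static bool CancelStopsLoop()
    {
        using (var cts = new CancellationTokenSource())
        {
            Task task = RunAsync(cts.Token); // returns at Task.Yield instead of blocking
            cts.Cancel();                    // now reachable
            return task.Wait(TimeSpan.FromSeconds(5));
        }
    }
}
```

With a single-threaded SynchronizationContext the continuation would be posted back to that context instead, so blocking on the task from that same thread should be avoided there.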

Task.FromResult actually runs synchronously, as would await Task.Delay(0). If you want to actually simulate asynchronous code, call Task.Yield(). That creates an awaitable task that asynchronously yields back to the current context when awaited.

As @SLaks said, your code will run synchronously. Running async code is one thing; running parallel code is another.
If you need to run your code in parallel you can use Task.Run.
class Program
{
static async Task Main(string[] args)
{
var tcs = new CancellationTokenSource();
var task = Task.Run(() => RunAsync("1", tcs.Token));
var task2 = Task.Run(() => RunAsync("2", tcs.Token));
await Task.Delay(1000);
tcs.Cancel();
Console.ReadLine();
}
private static async Task RunAsync(string source, CancellationToken cancel)
{
bool finished = false;
while (!cancel.IsCancellationRequested && !finished)
finished = await FakeTask(source);
}
private static Task<bool> FakeTask(string source)
{
Console.WriteLine(source);
return Task.FromResult(false);
}
}

C#'s async methods execute synchronously up to the point where they have to wait for a result.
In your example there is no such point where the method has to wait for a result, so the loop keeps running forever and thereby blocking the caller.
Inserting an await Task.Yield() to simulate some real async work should help.

Threading foreach that abides by async/await methods inside said foreach

I have been looking for a way to iterate through a foreach with multiple threads. For example, let's say I have a list:
public List<MyClass> All()
{
    // fill the list
}
private async Task Main()
{
    foreach (var item in All())
    {
        await method();
        await method2();
    }
}
private async Task method()
{
//do some stuff
// more stuff
await another();
}
private async Task method2()
{
//do some stuff
// more stuff
await another2();
}
private async Task another()
{
//await client to do whatever
}
private async Task another2()
{
//await client to do whatever
}
I am trying to do the following:
List item 1 = thread1
List item 2 = thread2
List item 3 = thread3
...etc depending on how many threads I have
I've been looking around without success. I found Parallel.ForEach, but that doesn't wait for awaits: once it hits the await, it considers the whole action complete and starts the next iteration. So how could I go about achieving what I want? Any help would be greatly appreciated.

The first thing to say is that async/await code isn't really multithreaded. I think it sometimes continues on another thread, but it's mainly meant to avoid blocking in code.
I tend to think mixing multithreading and async is actually a bit of a pain, because exception handling is a bit harder and less predictable. You'd be better off using Parallel.ForEach and taking async off the method definitions, or just calling .Result inside the Parallel.ForEach to make that code execute synchronously.

"Threads" are too low-level a concept to think about these days - after all, when a method is awaiting, it doesn't even use a thread. Instead, you want to think about "tasks" and "operations".
In this case, what you want is asynchronous concurrency (not parallel/threaded concurrency). First, define the (single) operation that you want done for each item:
private async Task ProcessAsync(MyClass item)
{
await method();
await method2();
}
Next, to start operations for all items concurrently, you can use Select:
private async Task Main()
{
var tasks = All().Select(ProcessAsync).ToArray();
}
and then, to (asynchronously) wait for them all to complete, use await Task.WhenAll:
private async Task Main()
{
var tasks = All().Select(ProcessAsync).ToArray();
await Task.WhenAll(tasks);
}

How to async this long running method?

I have this method which I would like to run asynchronously so that I can do other things while it runs. It does not rely on any other async method (it doesn't call out to another resource, download a file or anything). I would like to avoid using new Task(), Task.Factory.StartNew() and Task.Run(), if possible.
Is it possible to run this method asynchronously, with tidy, readable code and without using Task explicitly?
If not, what is the tidiest way of running the method asynchronously?
Note: Please don't be concerned with the silly logic in the method - I have boiled it down to be deliberately slow but not show my actual code.
public static void Main(string[] args)
{
    RunMySlowLogic();
}
// Declared static so Main can call it, and bool because it returns true.
private static bool RunMySlowLogic()
{
    while (true)
        for (int i = 0; i < 100000000; i++)
            if (i == new Random().Next(999))
                return true;
}
Currently, I believe that I would need to wrap the method in a lambda or Task and mark it async. Where would the await go?

You're confusing two different things. You can run this in the background, and this method can be asynchronous. These are 2 different things and your method can do either, or both.
If you do something asynchronous in that method, like Task.Delay or some non-blocking I/O then call that method, await the returned task and make the method itself async:
async Task RunMySlowLogicAsync()
{
while (true)
{
// ...
await Task.Delay(1000);
}
}
If you don't have such a thing then your method isn't asynchronous, it's synchronous. You can still run it in the background on a different (ThreadPool) thread while you do other things using Task.Run:
var task = Task.Run(() => RunMySlowLogic());

There are multiple ways of executing code asynchronously in the .NET environment. Have a look at the Asynchronous Programming Patterns MSDN article.
Tasks are to make your job easier. I think the only valid reason to avoid using tasks is when you are targeting an older version of .NET.
So without Tasks, you can start a thread yourself, or use the ThreadPool (Tasks do this internally).
public static void Main(string[] args)
{
    var are = new AutoResetEvent(false);
    ThreadPool.QueueUserWorkItem(RunMySlowLogicWrapped, are);
    // Do some other work here
    are.WaitOne(); // block until the background work signals completion
}
// You have to match the signature of the WaitCallback delegate; the state argument is used to communicate across threads.
private static void RunMySlowLogicWrapped(object state)
{
    var are = (AutoResetEvent)state;
    RunMySlowLogic();
    are.Set();
}
private static bool RunMySlowLogic()
{
    while (true)
        for (int i = 0; i < 100000000; i++)
            if (i == new Random().Next(999))
                return true;
}

How do I convert this to an async task?

Given the following code...
static void DoSomething(int id) {
Thread.Sleep(50);
Console.WriteLine(@"DidSomething({0})", id);
}
I know I can convert this to an async task as follows...
static async Task DoSomethingAsync(int id) {
await Task.Delay(50);
Console.WriteLine(@"DidSomethingAsync({0})", id);
}
And that by doing so, if I am calling it multiple times (Task.WhenAll), everything will be faster and more efficient than perhaps using Parallel.ForEach or even calling from within a loop.
But for a minute, let's pretend that Task.Delay() does not exist and I actually have to use Thread.Sleep(). I know in reality this is not the case, but this is concept code, and where the Delay/Sleep is would normally be an IO operation for which there is no async option (such as early EF).
I have tried the following...
static async Task DoSomethingAsync2(int id) {
await Task.Run(() => {
Thread.Sleep(50);
Console.WriteLine(@"DidSomethingAsync({0})", id);
});
}
But, though it runs without error, according to Lucian Wischik this is in fact bad practice, as it merely uses threads from the pool to complete each task. It is also slower in the following console application: if you swap between the DoSomethingAsync and DoSomethingAsync2 calls, you can see a significant difference in the time it takes to complete.
static void Main(string[] args) {
MainAsync(args).Wait();
}
static async Task MainAsync(String[] args) {
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(DoSomethingAsync2(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
I then tried the following...
static async Task DoSomethingAsync3(int id) {
await new Task(() => {
Thread.Sleep(50);
Console.WriteLine(@"DidSomethingAsync({0})", id);
});
}
Transplanting this in place of the original DoSomethingAsync, the test never completes and nothing is shown on screen!
I have also tried multiple other variations that either do not compile or do not complete!
So, given the constraint that you cannot call any existing asynchronous methods and must complete both the Thread.Sleep and the Console.WriteLine in an asynchronous task, how do you do it in a manner that is as efficient as the original code?
The objective here, for those of you who are interested, is to give me a better understanding of how to create my own async methods where I am not calling anybody else's. Despite many searches, this seems to be the one area where examples are really lacking. While there are many thousands of examples of async methods that call other async methods in turn, I cannot find any that convert an existing void method to an async task without calling a further async task, other than those that use the Task.Run(() => {} ) method.

There are two kinds of tasks: those that execute code (e.g., Task.Run and friends), and those that respond to some external event (e.g., TaskCompletionSource<T> and friends).
What you're looking for is TaskCompletionSource<T>. There are various "shorthand" forms for common situations so you don't always have to use TaskCompletionSource<T> directly. For example, Task.FromResult or TaskFactory.FromAsync. FromAsync is most commonly used if you have an existing *Begin/*End implementation of your I/O; otherwise, you can use TaskCompletionSource<T> directly.
For more information, see the "I/O-bound Tasks" section of Implementing the Task-based Asynchronous Pattern.
The Task constructor is (unfortunately) a holdover from Task-based parallelism, and should not be used in asynchronous code. It can only be used to create a code-based task, not an external event task.
So, given the constraint that you cannot call any existing asynchronous methods and must complete both the Thread.Sleep and the Console.WriteLine in an asynchronous task, how do you do it in a manner that is as efficient as the original code?
I would use a timer of some kind and have it complete a TaskCompletionSource<T> when the timer fires. I'm almost positive that's what the actual Task.Delay implementation does anyway.
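The timer idea could be sketched like this. It is only an illustration of completing a TaskCompletionSource<T> from a timer callback, not the actual Task.Delay implementation; DelayViaTimer and the class name are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimerDelayDemo
{
    // Hypothetical helper: returns a task that completes when a one-shot timer fires.
    public static Task DelayViaTimer(int milliseconds)
    {
        var tcs = new TaskCompletionSource<bool>();
        Timer timer = null;
        // Create the timer disarmed so 'timer' is assigned before the callback can run.
        timer = new Timer(_ =>
        {
            timer.Dispose();
            tcs.TrySetResult(true); // completes the task; no thread was blocked while waiting
        }, null, Timeout.Infinite, Timeout.Infinite);
        timer.Change(milliseconds, Timeout.Infinite); // arm it to fire once
        return tcs.Task;
    }

    public static async Task DoSomethingAsync(int id)
    {
        await DelayViaTimer(50).ConfigureAwait(false);
        Console.WriteLine("DidSomethingAsync({0})", id);
    }
}
```

Because the task represents an external event (the timer firing) rather than code running on a thread, a thousand of these can be pending at once without occupying pool threads.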

So, given the constraint that you cannot call any existing
asynchronous methods and must complete both the Thread.Sleep and the
Console.WriteLine in an asynchronous task, how do you do it in a
manner that is as efficient as the original code?
IMO, the constraint that you really need to stick with Thread.Sleep is a very synthetic one. Even under this constraint, you can still slightly improve your Thread.Sleep-based code. Instead of this:
static async Task DoSomethingAsync2(int id) {
await Task.Run(() => {
Thread.Sleep(50);
Console.WriteLine(@"DidSomethingAsync({0})", id);
});
}
You could do this:
static Task DoSomethingAsync2(int id) {
return Task.Run(() => {
Thread.Sleep(50);
Console.WriteLine(@"DidSomethingAsync({0})", id);
});
}
This way, you'd avoid the overhead of the compiler-generated state machine class. There is a subtle difference between these two code fragments in how exceptions are propagated.
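One way to observe that difference is with an exception thrown before Task.Run is reached. In the sketch below the argument check is a hypothetical addition (neither fragment above validates id) and the class and helper names are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionPropagationDemo
{
    // async version: the exception is captured and stored in the returned task.
    public static async Task WithAsync(int id)
    {
        if (id < 0) throw new ArgumentOutOfRangeException(nameof(id));
        await Task.Run(() => { }).ConfigureAwait(false);
    }

    // non-async version: the exception is thrown directly at the caller.
    public static Task WithoutAsync(int id)
    {
        if (id < 0) throw new ArgumentOutOfRangeException(nameof(id));
        return Task.Run(() => { });
    }

    // Returns true if invoking the call throws before any task is returned.
    public static bool ThrowsSynchronously(Func<Task> call)
    {
        try { call(); return false; }
        catch (ArgumentOutOfRangeException) { return true; }
    }
}
```

ThrowsSynchronously(() => WithAsync(-1)) should be false, with the returned task faulted instead, while ThrowsSynchronously(() => WithoutAsync(-1)) should be true. Exceptions thrown inside the Task.Run delegate end up in the task in both versions.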
Anyhow, this is not where the bottleneck of the slowdown is.
(it is also slower using the following console application - if you
swap between DoSomethingAsync and DoSomethingAsync2 call you can see a
significant difference in the time that it takes to complete)
Let's look one more time at your main loop code:
static async Task MainAsync(String[] args) {
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(DoSomethingAsync2(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
Technically, it requests 1000 tasks to be run in parallel, each supposedly to run on its own thread. In an ideal universe, you'd expect to execute Thread.Sleep(50) 1000 times in parallel and complete the whole thing in about 50ms.
However, this request is never satisfied by the TPL's default task scheduler, for a good reason: a thread is a precious and expensive resource. Moreover, the actual number of concurrent operations is limited to the number of CPUs/cores. So in reality, with the default size of the ThreadPool, I'm getting 21 pool threads (at peak) serving this operation in parallel. That is why DoSomethingAsync2 / Thread.Sleep takes so much longer than DoSomethingAsync / Task.Delay. DoSomethingAsync doesn't block a pool thread; it only requests one upon completion of the time-out. Thus, more DoSomethingAsync tasks than DoSomethingAsync2 tasks can actually run in parallel.
The test (a console app):
// https://stackoverflow.com/q/21800450/1768303
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace Console_21800450
{
public class Program
{
static async Task DoSomethingAsync(int id)
{
await Task.Delay(50);
UpdateMaxThreads();
Console.WriteLine(@"DidSomethingAsync({0})", id);
}
static async Task DoSomethingAsync2(int id)
{
await Task.Run(() =>
{
Thread.Sleep(50);
UpdateMaxThreads();
Console.WriteLine(@"DidSomethingAsync2({0})", id);
});
}
static async Task MainAsync(Func<int, Task> tester)
{
List<Task> tasks = new List<Task>();
for (int i = 1; i <= 1000; i++)
tasks.Add(tester(i)); // Can replace with any version
await Task.WhenAll(tasks);
}
volatile static int s_maxThreads = 0;
static void UpdateMaxThreads()
{
var threads = Process.GetCurrentProcess().Threads.Count;
// not using locks for simplicity
if (s_maxThreads < threads)
s_maxThreads = threads;
}
static void TestAsync(Func<int, Task> tester)
{
s_maxThreads = 0;
var stopwatch = new Stopwatch();
stopwatch.Start();
MainAsync(tester).Wait();
Console.WriteLine(
"time, ms: " + stopwatch.ElapsedMilliseconds +
", threads at peak: " + s_maxThreads);
}
static void Main()
{
Console.WriteLine("Press enter to test with Task.Delay ...");
Console.ReadLine();
TestAsync(DoSomethingAsync);
Console.ReadLine();
Console.WriteLine("Press enter to test with Thread.Sleep ...");
Console.ReadLine();
TestAsync(DoSomethingAsync2);
Console.ReadLine();
}
}
}
Output:
Press enter to test with Task.Delay ...
...
time, ms: 1077, threads at peak: 13
Press enter to test with Thread.Sleep ...
...
time, ms: 8684, threads at peak: 21
Is it possible to improve the timing figure for the Thread.Sleep-based DoSomethingAsync2? The only way I can think of is to use TaskCreationOptions.LongRunning with Task.Factory.StartNew, but you should think twice before doing this in any real-life application:
static async Task DoSomethingAsync2(int id)
{
await Task.Factory.StartNew(() =>
{
Thread.Sleep(50);
UpdateMaxThreads();
Console.WriteLine(@"DidSomethingAsync2({0})", id);
}, TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);
}
// ...
static void Main()
{
Console.WriteLine("Press enter to test with Task.Delay ...");
Console.ReadLine();
TestAsync(DoSomethingAsync);
Console.ReadLine();
Console.WriteLine("Press enter to test with Thread.Sleep ...");
Console.ReadLine();
TestAsync(DoSomethingAsync2);
Console.ReadLine();
}
Output:
Press enter to test with Thread.Sleep ...
...
time, ms: 3600, threads at peak: 163
The timing gets better, but the price for this is high. This code asks the task scheduler to create a new thread for each new task. Do not expect this thread to come from the pool:
Task.Factory.StartNew(() =>
{
Thread.Sleep(1000);
Console.WriteLine("Thread pool: " +
Thread.CurrentThread.IsThreadPoolThread); // false!
}, TaskCreationOptions.LongRunning).Wait();
