Long task after await in an async method - c#

While the following question is generally applicable to all usage of async/await in C#, it refers to Json.NET. The JsonConvert.DeserializeObjectAsync() method has been marked as obsolete by the development team because it would be difficult to maintain and of little use, since most JSON files are small (refer to this).
I have some code following this structure:
public async Task<CarObj> GetCarAsync()
{
    string json = await GetJsonStringFromRestEndpoint();
    // At this point, we should already be on a separate thread since we have awaited a long running task.
    // 1 - Running this relatively long task on this thread should be fine since we're already on a different thread from the caller.
    CarObj obj = JsonConvert.DeserializeObject<CarObj>(json);
    // 2 - Would this be better for some reason?
    CarObj obj2 = await Task.Run(() => JsonConvert.DeserializeObject<CarObj>(json));
    return obj; // or obj2, depending on which option is chosen
}
Would option 1 or 2 in the code above be the better solution here?

Arguably this is primarily opinion-based. But…
Assuming the library authors are correct, your first option is better. But not for the reason you think.
When the await GetJsonStringFromRestEndpoint() completes, then assuming the GetCarAsync() method was called from a thread with a synchronization context, the call to DeserializeObject<CarObj>(json); will happen on that same thread.
Calling the method synchronously isn't a problem, but not because you're on a different thread (you're not). Rather, as the library authors point out, the input data isn't likely to be large enough to cause any significant performance problem. You can probably parse the entire JSON data and construct your CarObj value in less time than it takes to queue up the thread pool work item, context-switch to that thread, and then context-switch back.
In other words, don't use worker threads to perform computationally inexpensive work.

// At this point, we should already be on a separate thread since we have awaited a long running task.
No. By default, the continuation is resumed on the caller's context (in a UI application, the calling thread) instead.
But if this is not your intended behavior, I'd advise adding
.ConfigureAwait(false);
to the awaited call. That would save some synchronization work, and afterwards you can reasonably expect to be on a thread pool thread.
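As a rough sketch of where the call goes, assuming a hypothetical GetJsonStringAsync() that stands in for the REST call in the question (the class and method names here are made up for illustration):

```csharp
using System.Threading.Tasks;

public static class ConfigureAwaitSketch
{
    // Hypothetical stand-in for GetJsonStringFromRestEndpoint(); Task.Delay simulates the network wait.
    private static async Task<string> GetJsonStringAsync()
    {
        await Task.Delay(10).ConfigureAwait(false);
        return "{\"Make\":\"Saab\"}";
    }

    public static async Task<int> GetJsonLengthAsync()
    {
        // ConfigureAwait(false) is applied to the awaited task, so the code after this
        // line does not have to be posted back to the captured synchronization context.
        string json = await GetJsonStringAsync().ConfigureAwait(false);

        // Cheap, synchronous work done right here, with no extra Task.Run.
        return json.Length;
    }
}
```

Note that it only affects the await it is attached to; every await in the method needs its own call if none of them should return to the original context.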

Related

Usage of ConfigureAwait in .NET

I've read about ConfigureAwait in various places (including SO questions), and here are my conclusions:
ConfigureAwait(true): Runs the rest of the code on the same thread the code before the await was run on.
ConfigureAwait(false): Runs the rest of the code on the same thread the awaited code was run on.
If the await is followed by a code that accesses the UI, the task should be appended with .ConfigureAwait(true). Otherwise, an InvalidOperationException will occur due to another thread accessing UI elements.
My questions are:
Are my conclusions correct?
When does ConfigureAwait(false) improve performance, and when doesn't it?
If writing for a GUI application, but the next lines don't access the UI elements, should I use ConfigureAwait(false) or ConfigureAwait(true)?
To answer your questions more directly:
ConfigureAwait(true): Runs the rest of the code on the same thread the code before the await was run on.
Not necessarily the same thread, but the same synchronization context. The synchronization context can decide how to run the code. In a UI application, it will be the same thread. In ASP.NET, it may not be the same thread, but you will have the HttpContext available, just like you did before.
ConfigureAwait(false): Runs the rest of the code on the same thread the awaited code was run on.
This is not correct. ConfigureAwait(false) tells it that it does not need the context, so the code can be run anywhere. It could be any thread that runs it.
If the await is followed by a code that accesses the UI, the task should be appended with .ConfigureAwait(true). Otherwise, an InvalidOperationException will occur due to another thread accessing UI elements.
It is not correct that it "should be appended with .ConfigureAwait(true)". ConfigureAwait(true) is the default. So if that's what you want, you don't need to specify it.
When does ConfigureAwait(false) improve performance, and when doesn't it?
Returning to the synchronization context might take time, because it may have to wait for something else to finish running. In reality, this rarely happens, or that waiting time is so minuscule that you'd never notice it.
If writing for a GUI application, but the next lines don't access the UI elements, should I use ConfigureAwait(false) or ConfigureAwait(true)?
You could use ConfigureAwait(false), but I suggest you don't, for a few reasons:
I doubt you would notice any performance improvement.
It can introduce parallelism that you may not expect. If you use ConfigureAwait(false), the continuation can run on any thread, so you could have problems if you're accessing non-thread-safe objects. It is not common to have these problems, but it can happen.
You (or someone else maintaining this code) may add code that interacts with the UI later and exceptions will be thrown. Hopefully the ConfigureAwait(false) is easy to spot (it could be in a different method than where the exception is thrown) and you/they know what it does.
I find it's easier to not use ConfigureAwait(false) at all (except in libraries). In the words of Stephen Toub (a Microsoft employee) in the ConfigureAwait FAQ:
When writing applications, you generally want the default behavior (which is why it is the default behavior).
Edit: I've written an article of my own on this topic: .NET: Don’t use ConfigureAwait(false)
ConfigureAwait(false) may improve performance if there are not many worker threads available and if the thread that it would need to wait for is constantly busy.
ConfigureAwait(false) is recommended everywhere that coming back to the same SynchronizationContext (which is usually linked to a thread) is not needed, especially in libraries that await something internally: https://medium.com/bynder-tech/c-why-you-should-use-configureawait-false-in-your-library-code-d7837dce3d7f.
ConfigureAwait(true) (which is the default) is needed when you require the same context, but it may also lead to a deadlock in certain situations.
Consider this code:
void Main()
{
    // creating a windows form attaches a synchronization context to the current thread
    new System.Windows.Forms.Form();
    var task = DoSth();
    Console.WriteLine(task.Result);
}
async Task<int> DoSth()
{
    await Task.Delay(1000);
    return 1;
}
In this example, because the DoSth task is not awaited, the main UI thread is blocked waiting for task.Result. At the same time, DoSth cannot complete because its continuation needs to come back to the UI thread after the delay. This leads to a deadlock, and the code never runs to the end. Adding .ConfigureAwait(false) to the awaited Task.Delay call solves the problem in this case.
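The same situation can be sketched without Windows Forms. This is only an illustration: StuckContext is a made-up SynchronizationContext that drops everything posted to it, which imitates a UI thread that is too busy blocking on .Result to run continuations.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class DeadlockSketch
{
    // Hypothetical context: posted callbacks are never executed, as if the
    // owning thread were permanently blocked.
    private sealed class StuckContext : SynchronizationContext
    {
        public override void Post(SendOrPostCallback d, object state)
        {
            // Intentionally dropped.
        }
    }

    private static async Task<int> DoSthAsync()
    {
        // Without ConfigureAwait(false) the continuation would be posted to
        // StuckContext and never run, so the caller below would wait forever.
        await Task.Delay(100).ConfigureAwait(false);
        return 1;
    }

    public static int RunBlocking()
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        SynchronizationContext.SetSynchronizationContext(new StuckContext());
        try
        {
            // Blocking wait, like task.Result in the example above.
            return DoSthAsync().Result;
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }
}
```

Awaiting the task instead of blocking on .Result avoids the problem as well, and is usually the better fix in application code.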
Using ConfigureAwait(false) in application code is normally not going to boost your application's performance in any meaningful way, because normally you don't await inside loops in application code. For example, let's consider the case where your app has a button, an async operation is started every time the user clicks the button, and the async operation includes a single await. By typing the 22 characters .ConfigureAwait(false) after this await, you have already spent an amount of your life comparable to the time you can hope to save for 10 users who click this button once every minute, 8 hours per day, for 20 years each (~35,000,000 context switches in total = a few seconds of CPU processing time).
And this is before taking into account the time you need to think about whether you can safely include this configuration (depending on whether the continuation contains UI-related code), the time you'll need to reconfirm your previous assessment every time you have to maintain/modify the code, and the time you'll lose on debugging in case your assessment was wrong.
On the other hand if your Button_Click handler contains code like this:
private async void Button_Click(object sender, EventArgs e)
{
    var client = new WebClient();
    using var stream = await client.OpenReadTaskAsync("someUrl");
    var buffer = new byte[1024];
    while ((await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
    {
        //...
    }
}
...then by all means do spend the extra time to ConfigureAwait(false) the ReadAsync task. Also do consider refactoring the code by moving the stream-reading part to a separate asynchronous method, so that you can safely access UI elements anywhere inside the Button_Click handler, without being distracted by technicalities that don't belong to this layer of the app.
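A minimal sketch of that refactoring, assuming the loop only needs to consume the stream (CountBytesAsync is a hypothetical helper name, and counting bytes stands in for whatever the real loop body does):

```csharp
using System.IO;
using System.Threading.Tasks;

public static class StreamReadingSketch
{
    // Reads the stream to the end and returns the number of bytes read.
    // No UI access happens here, so every await can skip the captured context.
    public static async Task<long> CountBytesAsync(Stream stream)
    {
        var buffer = new byte[1024];
        long total = 0;
        int read;
        while ((read = await stream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            total += read;
        }
        return total;
    }
}
```

The Button_Click handler would then await CountBytesAsync(stream) without ConfigureAwait(false), so everything after that await in the handler is still back on the UI context.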

Improving performance of Parallel.For in C# with more methods

Recently I've stumbled upon a Parallel.For loop that performs way better than a regular for loop for my purposes.
This is how I use it:
Parallel.For(0, values.Count, i =>Products.Add(GetAllProductByID(values[i])));
It made my application work a lot faster, but still not fast enough. My question to you guys is:
Does Parallel.ForEach perform faster than Parallel.For?
Is there some "hybrid" method that I can combine with my Parallel.For loop to perform even faster (i.e. use more CPU power)? If yes, how?
Can someone help me out with this?
If you want to play with parallel, I suggest using Parallel Linq (PLinq) instead of Parallel.For / Parallel.ForEach , e.g.
var Products = Enumerable
.Range(0, values.Count)
.AsParallel()
//.WithDegreeOfParallelism(10) // <- if you want, say 10 threads
.Select(i => GetAllProductByID(values[i]))
.ToList(); // <- this is thread safe now
With the help of the With... methods (e.g. WithDegreeOfParallelism) you can try tuning your implementation.
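One detail worth knowing when tuning: a PLINQ query does not promise to keep the source order unless AsOrdered() is added. A small sketch, where LoadProduct is a hypothetical stand-in for GetAllProductByID and the degree of parallelism of 4 is an arbitrary choice:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical stand-in for GetAllProductByID.
    private static string LoadProduct(int id) => "product-" + id;

    public static List<string> LoadAllInOrder(IReadOnlyList<int> values)
    {
        return Enumerable.Range(0, values.Count)
            .AsParallel()
            .AsOrdered()                 // keep results in the same order as 'values'
            .WithDegreeOfParallelism(4)  // arbitrary cap on concurrent workers
            .Select(i => LoadProduct(values[i]))
            .ToList();
    }
}
```

If the order of the results does not matter, leaving AsOrdered() out gives PLINQ more freedom and may be slightly faster.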
There are two related concepts: asynchronous programming and multithreading. Basically, to do things "in parallel" or asynchronously, you can either create new threads or work asynchronously on the same thread.
Keep in mind that either way you'll need some mechanism to prevent race conditions. From the Wikipedia article I linked to, a race condition is defined as follows:
A race condition or race hazard is the behavior of an electronic,
software or other system where the output is dependent on the sequence
or timing of other uncontrollable events. It becomes a bug when events
do not happen in the order the programmer intended.
As a few people have mentioned in the comments, you can't rely on the standard List class to be thread-safe - i.e. it might behave in unexpected ways if you're updating it from multiple threads. Microsoft now offers special "built-in" collection classes (in the System.Collections.Concurrent namespace) that'll behave in the expected way if you're updating it asynchronously or from multiple threads.
For well-documented libraries (and Microsoft's generally pretty good about this in their documentation), the documentation will often explicitly state whether the class or method in question is thread-safe. For example, in the documentation for System.Collections.Generic.List, it states the following:
Public static (Shared in Visual Basic) members of this type are thread
safe. Any instance members are not guaranteed to be thread safe.
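For the Parallel.For loop from the question, a sketch using one of those concurrent collections could look like this (LoadProduct is again a hypothetical stand-in for GetAllProductByID):

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentCollectionSketch
{
    // Hypothetical stand-in for GetAllProductByID.
    private static string LoadProduct(int id) => "product-" + id;

    public static ConcurrentBag<string> LoadAll(int[] ids)
    {
        // ConcurrentBag<T>.Add is safe to call from several threads at once,
        // unlike List<T>.Add.
        var products = new ConcurrentBag<string>();
        Parallel.For(0, ids.Length, i => products.Add(LoadProduct(ids[i])));
        return products;
    }
}
```

A ConcurrentBag is unordered, so this fits only when the order of the products does not matter.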
In terms of asynchronous programming (vs. multithreading), my standard illustration of this is as follows: suppose you go to a restaurant with 10 people. When the waiter comes by, the first person he asks for his order isn't ready; however, the other 9 people are. Thus, the waiter asks the other 9 people for their orders and then comes back to the original guy. (It's definitely not the case that they'll get a second waiter to wait for the original guy to be ready to order, and doing so probably wouldn't save much time anyway.) That's how async/await typically works (the exception being that some of the Task Parallel Library calls, like Task.Run(...), actually execute on other threads, which in our illustration means bringing in a second waiter, so make sure you check the documentation for which is which).
Basically, which you choose (asynchronously on the same thread or creating new threads) depends on whether you're trying to do something that's I/O-bound (i.e. you're just waiting for an operation to complete or for a result) or CPU-bound.
If your main purpose is to wait for a result from eBay, it would probably be better to work asynchronously on the same thread, as you may not get much of a performance benefit from using multithreading. Think back to our analogy: bringing in a second waiter just to wait for the first guy to be ready to order isn't necessarily any better than just having the waiter come back to him.
I'm not sitting in front of an IDE so forgive me if this syntax isn't perfect, but here's an approximate idea of what you can do:
public async Task GetResults(int[] productIDsToGet) {
    var tasks = new List<Task>();
    foreach (int productID in productIDsToGet) {
        Task task = GetResultFromEbay(productID);
        tasks.Add(task);
    }
    // Wait for all of the tasks to complete
    await Task.WhenAll(tasks);
}
private async Task GetResultFromEbay(int productIdToGet) {
    // Get result asynchronously from eBay
}
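If the results themselves are needed, the same idea works with Task<T>. This is a sketch only: GetResultFromEbayAsync is a hypothetical placeholder, and Task.Delay stands in for the real network call.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical stand-in for the eBay request; the delay simulates network latency.
    private static async Task<string> GetResultFromEbayAsync(int productId)
    {
        await Task.Delay(50);
        return "result-" + productId;
    }

    public static async Task<string[]> GetResultsAsync(IEnumerable<int> productIds)
    {
        // Start every request first (ToArray forces the calls to happen now),
        // then await them together instead of one after another.
        Task<string>[] tasks = productIds.Select(id => GetResultFromEbayAsync(id)).ToArray();

        // The returned array has the results in the same order as 'tasks'.
        return await Task.WhenAll(tasks);
    }
}
```

All the waits overlap, so the total time is roughly that of the slowest request rather than the sum of all of them, and no extra threads are blocked while waiting.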

Using ThreadPool and Task.Wait inside an Async/Await Method

I just encountered this code. I immediately started cringing and talking to myself (not nice things). The thing is I don't really understand why and can't reasonably articulate it. It just looks really bad to me - maybe I'm wrong.
public async Task<IHttpActionResult> ProcessAsync()
{
var userName = Username.LogonName(User.Identity.Name);
var user = await _user.GetUserAsync(userName);
ThreadPool.QueueUserWorkItem((arg) =>
{
Task.Run(() => _billing.ProcessAsync(user)).Wait();
});
return Ok();
}
This code looks to me like it's needlessly creating threads with ThreadPool.QueueUserWorkItem and Task.Run. Plus, it looks like it has the potential to deadlock or create serious resource issues when under heavy load. Am I correct?
The _billing.ProcessAsync() method is awaitable(async), so I would expect that a simple "await" keyword would be the right thing to do and not all this other baggage.
I believe Scott is correct with his guess that ThreadPool.QueueUserWorkItem should have been HostingEnvironment.QueueBackgroundWorkItem. The calls to Task.Run and Wait, however, are entirely nonsensical: they're pushing work to the thread pool and blocking a thread pool thread on it, when the code is already on the thread pool.
The _billing.ProcessAsync() method is awaitable(async), so I would expect that a simple "await" keyword would be the right thing to do and not all this other baggage.
I strongly agree.
However, this will change the behavior of the action. It will now wait until Billing.ProcessAsync is completed, whereas before it would return early. Note that returning early on ASP.NET is almost always a mistake - I would say any "billing" processing would be even more certainly a mistake. So, replacing this mess with await will make the app more correct, but it will cause the ProcessAsync action to take longer to return to the client.
It's strange, but depending on what the author is trying to achieve, it seems ok to me to queue a work item in the thread pool from inside an async method.
This is not the same as starting a thread; it just queues an action to be run on a ThreadPool thread when one is free. So the async method (ProcessAsync) can continue and doesn't need to care about the result.
The weird thing is the code inside the lambda that is enqueued in the ThreadPool. Not only the Task.Run() (which is superfluous and just causes unnecessary overhead), but calling an async method without waiting for it to finish is bad inside a method that is run by the ThreadPool, because it returns control to the caller when awaiting something.
So the ThreadPool eventually thinks this method is finished (and the thread free for the next action in the queue), while actually the method wants to be resumed later.
This may lead to undefined behaviour. This code may have been working (in certain circumstances), but I would not rely on it or use it as production code.
(The same goes for calling a not-awaited async method inside Task.Run(), as the Task "thinks" it's finished while the method actually wants to be resumed later).
As solution I'd propose to simply await that async method, too:
await _billing.ProcessAsync(user);
But of course, without any knowledge about the context of the code snippet, I can't guarantee anything. Note that this would change the behaviour: until now the code did not wait for _billing.ProcessAsync() to finish, whereas it now would. So leaving out await and just firing and forgetting
_billing.ProcessAsync(user);
may be good enough, too.

What is the use case for async/await? [duplicate]

This question already has answers here:
Brief explanation of Async/Await in .Net 4.5
(3 answers)
Closed 7 years ago.
C# offers multiple ways to perform asynchronous execution such as threads, futures, and async.
In what cases is async the best choice?
I have read many articles about the how and what of async, but so far I have not seen any article that discusses the why.
Initially I thought async was a built-in mechanism to create a future. Something like
async int foo(){ return ..complex operation..; }
var x = await foo();
do_something_else();
bar(x);
Where the call to 'await foo' would return immediately, and the use of 'x' would wait on the return value of 'foo'. async does not do this. If you want this behavior you can use the futures library: https://msdn.microsoft.com/en-us/library/Ff963556.aspx
The above example would instead be something like
int foo(){ return ..complex operation..; }
var x = Task.Factory.StartNew<int>(() => foo());
do_something_else();
bar(x.Result);
Which isn't as pretty as I would have hoped, but it works nonetheless.
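For what it's worth, the two can be combined: start the work as a task, do something else, and only await the task at the point where the value is needed. A sketch with made-up Foo and DoSomethingElse methods:

```csharp
using System.Threading.Tasks;

public static class FutureSketch
{
    // Stand-in for the "complex operation": sums 1..100.
    private static int Foo()
    {
        int sum = 0;
        for (int i = 1; i <= 100; i++) sum += i;
        return sum;
    }

    // Stand-in for unrelated work done while Foo runs.
    private static int DoSomethingElse() => 1;

    public static async Task<int> RunAsync()
    {
        Task<int> x = Task.Run(() => Foo());  // starts on a thread pool thread, returns immediately
        int other = DoSomethingElse();        // runs while Foo is (possibly) still executing
        return (await x) + other;             // waits here only if Foo has not finished yet
    }
}
```

Unlike x.Result, awaiting x does not block the calling thread while Foo is still running.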
So if you have a problem where you want to have multiple threads operate on the work then use futures or one of the parallel operations, such as Parallel.For.
async/await, then, is probably not meant for the use case of performing work in parallel to increase throughput.
async solves the problem of scaling an application for a large number of asynchronous events, such as I/O, when creating many threads is expensive.
Imagine a web server where requests are processed immediately as they come in. The processing happens on a single thread where every function call is synchronous. Fully processing a request might take a few seconds, which means that an entire thread is consumed until the processing is complete.
A naive approach to server programming is to spawn a new thread for each request. In this way it does not matter how long each thread takes to complete because no thread will block any other. The problem with this approach is that threads are not cheap. The underlying operating system can only create so many threads before running out of memory, or some other kind of resource. A web server that uses 1 thread per request will probably not be able to scale past a few hundred/thousand requests per second. The c10k challenge asks that modern servers be able to scale to 10,000 simultaneous users. http://www.kegel.com/c10k.html
A better approach is to use a thread pool where the number of threads in existence is more or less fixed (or at least, does not expand past some tolerable maximum). In that scenario only a fixed number of threads are available for processing the incoming requests. If there are more requests than there are threads available for processing then some requests must wait. If a thread is processing a request and has to wait on a long running I/O process then effectively the thread is not being utilized to its fullest extent, and the server throughput will be much less than it otherwise could be.
The question is now, how can we have a fixed number of threads but still use them efficiently? One answer is to 'cut up' the program logic so that when a thread would normally wait on an I/O process, instead it will start the I/O process but immediately become free for any other task that wants to execute. The part of the program that was going to execute after the I/O will be stored in a thing that knows how to keep executing later on.
For example, the original synchronous code might look like
void process(){
string name = get_user_name();
string address = look_up_address(name);
string tax_forms = find_tax_form(address);
render_tax_form(name, address, tax_forms);
}
Where look_up_address and find_tax_form have to talk to a database and/or make requests to other websites.
The asynchronous version might look like
void process(){
    string name = get_user_name();
    invoke_after(() => look_up_address(name), (address) => {
        invoke_after(() => find_tax_form(address), (tax_forms) => {
            render_tax_form(name, address, tax_forms);
        });
    });
}
This is continuation passing style, where next thing to do is passed as the second lambda to a function that will not block the current thread when the blocking operation (in the first lambda) is invoked. This works but it quickly becomes very ugly and hard to follow the program logic.
What the programmer has manually done in splitting up their program can be automatically done by async/await. Any time there is a call to an I/O function the program can mark that function call with await to inform the caller of the program that it can continue to do other things instead of just waiting.
async Task process(){
    string name = get_user_name();
    string address = await look_up_address(name);
    string tax_forms = await find_tax_form(address);
    render_tax_form(name, address, tax_forms);
}
The thread that executes process will break out of the function when it gets to look_up_address and continue to do other work, such as processing other requests. When look_up_address has completed and process is ready to continue, some thread (or the same thread) will pick up where the last thread left off and execute the next line, find_tax_form(address).
Since my current belief is that async is about managing threads, I don't believe that async makes a lot of sense for UI programming. Generally UIs will not have that many simultaneous events that need to be processed. The use case for async with UIs is preventing the UI thread from being blocked. Even though async can be used with a UI, I would find it dangerous because omitting an await on some long running function, due to either an accident or forgetfulness, would cause the UI to block.
async void button_callback(){
await do_something_long();
....
}
This code won't block the UI because it uses an await for the long running function that it invokes. If later on another function call is added
async void button_callback(){
do_another_thing();
await do_something_long();
...
}
Where it wasn't clear to the programmer who added the call to do_another_thing just how long it would take to execute, the UI will now be blocked. It seems safer to just always execute all processing in a background thread.
void button_callback(){
    new Thread(() => {
        do_another_thing();
        do_something_long();
        ....
    }).Start();
}
Now there is no possibility that the UI thread will be blocked, and the chances that too many threads will be created is very small.

async and await without "threads"? Can I customize what happens under-the-hood?

I have a question about how customizable the new async/await keywords and the Task class in C# 5.0 (.NET 4.5) are.
First some background for understanding my problem: I am developing on a framework with the following design:
One thread has a list of "current things to do" (usually around 100 to 200 items), which are stored in a custom data structure and held as a list. It has an Update() function that enumerates the list, checks whether some "things" need to execute, and executes them. Basically it's like a big thread scheduler. To simplify things, let's assume the "things to do" are functions that return the boolean true when they are "finished" (and should not be called on the next Update) and false when the scheduler should call them again on the next update.
All the "things" must not run concurrently and also must run in this one thread (because of thread static variables)
There are other threads which do other stuff. They are structured in the same way: a big loop that iterates over a couple of hundred things to do in a big Update() function.
Threads can send each other messages, including "remote procedure calls". For these remote calls, the RPC system is returning some kind of future object to the result value. In the other thread, a new "thing to do" is inserted.
A common "thing" to do are just sequences of RPCs chained together. At the moment, the syntax for this "chaining" is very verbose and complicated, since you manually have to check for the completion state of previous RPCs and invoke the next ones etc..
An example:
Future f1, f2;
bool SomeThingToDo() // returns true when "finished"
{
if (f1 == null)
f1 = Remote1.CallF1();
else if (f1.IsComplete && f2 == null)
f2 = Remote2.CallF2();
else if (f2 != null && f2.IsComplete)
return true;
return false;
}
Now this all sounds awfully like something async and await of C# 5.0 can help me with. I haven't 100% fully understood what it does under the hood (any good references?), but as I understand it from a few talks I've watched, it does exactly what I want with this nicely simple code:
async Task SomeThingToDo() // returning task is completed when this is finished.
{
await Remote1.CallF1();
await Remote2.CallF2();
}
But I can't find a way to write my Update() function to make something like this happen. async and await seem to want to use the Task - class which in turn seems to need real threads?
My closest "solution" so far:
The first thread (which is running SomeThingToDo) calls their functions only once and stores the returned task and tests on every Update() whether the task is completed.
Remote1.CallF1 returns a new Task with an empty Action as constructor parameter and remembers the returned task. When F1 is actually finished, it calls RunSynchronously() on the task to mark it as completed.
That seems to me like a perversion of the task system. And besides, it creates shared memory (the Task's IsCompleted boolean) between the two threads, which I would like to have replaced with our remote messaging system, if possible.
Finally, it does not solve my problem as it does not work with the await-like SomeThingToDo implementation above. It seems the auto-generated Task objects returned by an async function are completed immediately?
So finally my questions:
Can I hook into async/await to use my own implementations instead of Task<T>?
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Any good reference what exactly happens when I write async and await?
I haven't 100% fully understood what it does under the hood - any good references?
Back when we were designing the feature Mads, Stephen and I wrote some articles at a variety of different levels for MSDN magazine. The links are here:
http://blogs.msdn.com/b/ericlippert/archive/2011/10/03/async-articles.aspx
Start with my article, then Mads's, then Stephen's.
It seems the auto-generated Task objects returned by an async function are completed immediately?
No, they are completed when the code in the method body returns or throws, same as any other code.
Can I hook into async/await to use my own implementations instead of Task<T>?
A method which contains an await must return void, Task or Task<T>. However, the expression that is awaited can return any type so long as you can call GetAwaiter() on it. That need not be a Task.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Absolutely. A Task just represents work that will complete in the future. Though that work is typically done on another thread, there is no requirement.
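To illustrate the GetAwaiter() point with a small sketch: the compiler also accepts an extension method named GetAwaiter, so even a TimeSpan can be made awaitable. The extension class below is made up for illustration and simply forwards to Task.Delay; a custom future type could expose its own awaiter in the same way.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class TimeSpanAwaiterExtensions
{
    // Makes "await someTimeSpan" legal by handing back the awaiter of a delay task.
    public static TaskAwaiter GetAwaiter(this TimeSpan delay)
    {
        return Task.Delay(delay).GetAwaiter();
    }
}

public static class CustomAwaitableSketch
{
    public static async Task<string> WaitThenReportAsync()
    {
        // The awaited expression is a TimeSpan, not a Task.
        await TimeSpan.FromMilliseconds(20);
        return "done";
    }
}
```

A hand-written awaiter needs an IsCompleted property, a GetResult() method, and an OnCompleted(Action) method from INotifyCompletion; reusing TaskAwaiter here avoids writing those by hand.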
To answer your questions:
Can I hook into async/await to use my own implementations instead of Task?
Yes. You can await anything. However, I do not recommend this.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
The Task type represents a future. It does not necessarily "run" on a thread; it can represent the completion of a download, or a timer expiring, etc.
Any good reference what exactly happens when I write async and await?
If you mean as far as code transformations go, this blog post has a nice side-by-side. It's not 100% accurate in its details, but it's enough to write a simple custom awaiter.
If you really want to twist async to do your bidding, Jon Skeet's eduasync series is the best resource. However, I seriously do not recommend you do this in production.
You may find my async/await intro helpful as an introduction to the async concepts and recommended ways to use them. The official MSDN documentation is also unusually good.
I did write the AsyncContext and AsyncContextThread classes that may work for your situation; they define a single-threaded context for async/await methods. You can queue work (or send messages) to an AsyncContextThread by using its Factory property.
Can I hook into async/await to use my own implementations instead of Task?
Yes.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Yes.
Any good reference what exactly happens when I write async and await?
Yes.
I would discourage you from asking yes/no questions. You probably don't just want yes/no answers.
async and await seem to want to use the Task - class which in turn seems to need real threads?
Nope, that's not true. A Task represents something that can be completed at some point in the future, possibly with a result. It's sometimes the result of some computation in another thread, but it doesn't need to be. It can be anything that is happening at some point in the future. For example, it could be the result of an IO operation.
Remote1.CallF1 returns a new Task with an empty Action as constructor parameter and remembers the returned task. When F1 is actually finished, it calls RunSynchronously() on the task to mark it as completed.
So what you're missing here is the TaskCompletionSource class. With that missing puzzle piece a lot should fit into place. You can create the TCS object, pass the Task from its Task property around to... whomever, and then use the SetResult method to signal its completion. Doing this doesn't result in the creation of any additional threads or use of the thread pool.
Note that if you don't have a result and just want a Task instead of a Task<T> then just use a TaskCompletionSource<bool> or something along those lines and then SetResult(false) or whatever is appropriate. By casting the Task<bool> to a Task you can hide that implementation from the public API.
That should also provide the "How" variations of the first two questions that you asked instead of the "can I" versions you asked. You can use a TaskCompletionSource to generate a task that is completed whenever you say it is, using whatever asynchronous construct you want, which may or may not involve the use of additional threads.
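A minimal sketch of that idea, applied to the RPC futures from the question. RemoteCallSketch and its member names are hypothetical; the assumption is that the RPC system calls MarkFinished() when the reply message arrives.

```csharp
using System.Threading.Tasks;

// Hypothetical wrapper around one remote call.
public sealed class RemoteCallSketch
{
    private readonly TaskCompletionSource<bool> _completion = new TaskCompletionSource<bool>();

    // Exposed as a plain Task so callers don't see the bool.
    public Task Completion => _completion.Task;

    // Called by the (assumed) RPC system when the remote side has answered.
    public void MarkFinished() => _completion.SetResult(true);
}

public static class ChainSketch
{
    // The chained "thing to do": no thread is created or blocked while waiting.
    public static async Task<string> SomeThingToDoAsync(RemoteCallSketch f1, RemoteCallSketch f2)
    {
        await f1.Completion;
        await f2.Completion;
        return "finished";
    }
}
```

The Update() loop can then store the Task returned by SomeThingToDoAsync and check its IsCompleted property on each pass. Which thread runs the code after each await depends on the current SynchronizationContext, so a custom single-threaded context (such as the AsyncContext mentioned above) is probably still needed to keep everything on the one thread.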
