C# 5 async/await thread mechanics feel wrong?

Why have the calling thread walk into the async method until the inner 'await'?
Isn't it cleaner to just spawn a thread as soon as an async method is called? That way you know for sure that the async method returns immediately, and you don't have to worry about accidentally doing something expensive in the early, synchronous part of the async method.
I tend to like to know whether a method is going to execute code on 'my' thread or not, and whether it blocks or not. This model seems to open a whole spectrum of in-between possibilities.
The designers are much smarter than I am so I'm sure there is a good reason, I'd just like to get my head around it.

Isn't it cleaner to just spawn a thread as soon as an async method is called?
The whole point of "async" methods is to avoid spawning a new thread.
You are confusing asynchrony with concurrency. Asynchronous methods need not run on another thread to be asynchronous. The point of asynchronous methods is that they let you break work up into little pieces that must run in a particular order, but that can be interleaved with other work on the same thread.
Think of a thread as a worker you can hire. Think of an async method as a to-do list with pauses between the items. If your to-do list says "go to the store, buy milk and eggs, go home, make an omelette", then the benefit of async is that when someone calls your cell phone between the "buy eggs" step and the "go home" step and says "can you stop by the pharmacy on your way home and pick up my prescription?" you can take the call and schedule the work before you make the omelette. With non-async methods, your phone keeps ringing until the omelette is done, and then you take the call. The UI blocks until you're done with what you're doing.
Your concept is that in order to keep the UI thread responsive, the moment you get the to-do list you go hire some guy to run to the store for you, so that you're free to take the call about the pharmacy. That is expensive and unnecessary. Everything can stay on the same thread with async because the long-running task has built-in points where the UI gets to interrupt and schedule more work.

I like to think of async/await as syntactic sugar for continuation-passing style programming.
With that in mind it has nothing to do with threads.
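A rough sketch of that reading, using a hypothetical GetLengthAsync helper around HttpClient (this is not the actual compiler output, which is a full state machine, just the continuation-passing view of the same workflow):

using System.Net.Http;
using System.Threading.Tasks;

static class CpsSketch
{
    // The async/await form: the compiler signs up "the rest of the method"
    // as a continuation of the awaited task.
    public static async Task<int> GetLengthAsync(HttpClient client, string url)
    {
        string body = await client.GetStringAsync(url);
        return body.Length;
    }

    // Roughly the same workflow written by hand in continuation-passing style:
    // start the operation, then pass the rest of the method as a callback.
    public static Task<int> GetLengthCps(HttpClient client, string url)
    {
        return client.GetStringAsync(url)
                     .ContinueWith(t => t.Result.Length);
    }
}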

I tend to like to know whether a method is going to execute code on 'my' thread or not.
I think that is a peculiar desire, not really a good argument for/against any feature.
The main point of async/await is that the code for starting an async operation and the code for handling its results can be kept in one method.
Without it you are forced to break code that logically belongs together into two parts.

Related

Is there a neat way to force a pile of `async` C# code to run single-threadly as though it weren't actually `async`

Suppose (entirely hypothetically ;)) I have a big pile of async code.
10s of classes; 100s of async methods, of which 10s are actually doing async work (e.g. WriteToDbAsync(data), ReadFileFromInternetAsync(uri), or WhenAll(parallelTasks)).
And I want to do a bunch of diagnostic debugging on it. I want to perf profile it, and step through a bunch of it manually to see what's what.
All my tools are designed around synchronous C# code. They will sort of work with async, but it's definitely much less effective, and debugging is way harder, even when I try to directly manage the threads a bit.
If I'm only interested in a small portion of the code, then it's definitely a LOT easier to temporarily un-async that portion of the code. Read and Write synchronously, and just Task.Wait() on each of my "parallel" Tasks in sequence. But that's not viable to do if I want to poke around in a large swathe of the code.
Is there any way to ask C# to run some "async" code like that for me?
i.e. some sort of (() => MyAsyncMethod()).RunAsThoughAsyncDidntExist() which knows that any time it does real async communication with the outside world, it should just spin (within the same thread) until it gets an answer. Any time it's asked to run code in parallel ... don't; just run them in series on its single thread. etc. etc.
I'm NOT talking about just awaiting the Task to finish, or calling Task.Wait(). Those won't change how that Task executes itself.
I strongly assume that this sort of thing doesn't exist, and I just have to live with my tools not being well architected for async code.
But it would be great if someone with some expertise in the area, could confirm that.
EDIT: (Because SO told me to explain why the suggestion isn't an answer)...
Sinatr suggested this: How do I create a custom SynchronizationContext so that all continuations can be processed by my own single-threaded event loop? But (as I understand it) that only ensures that each time there's an await, the code after that await continues on the same thread. I want the actual contents of the awaited call to run on the same thread too.
Keep in mind that asynchronous != parallel.
Parallel means running two or more pieces of code at the same time, which can only be done with multithreading. It's about how code runs.
Asynchronous code frees the current thread to do other things while it is waiting for something else. It is about how code waits.
Asynchronous code with a synchronization context can run on a single thread. It starts running on one thread, then fires off an I/O request (like an HTTP request), and while it waits there is no thread. Then the continuation (because there is a synchronization context) can happen on the same thread depending on what the synchronization context requires, like in a UI application where the continuation happens on the UI thread.
When there is no synchronization context, then the continuation can be run on any ThreadPool thread (but might still happen on the same thread).
So if your goal is to make it initially run and then resume all on the same thread, then the answer you were already referred to is indeed the best way to do it, because it's that synchronization context that decides how the continuation is executed.
However, that won't help you if there are any calls to Task.Run, because the entire purpose of that method is to run work on a thread-pool thread (and give you an asynchronous way to wait for that work to finish).
It also may not help if the code uses .ConfigureAwait(false) in any of the await calls, since that explicitly means "I don't need to resume on the synchronization context", so it may still run on a ThreadPool thread. I don't know if Stephen's solution does anything for that.
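For illustration, here is a minimal sketch of that kind of single-threaded SynchronizationContext (not production-ready, and subject to the Task.Run and ConfigureAwait(false) caveats above): it pumps every posted continuation on the calling thread until the top-level task completes.

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class SingleThreadSyncContext : SynchronizationContext
{
    private readonly BlockingCollection<(SendOrPostCallback Callback, object State)> _queue
        = new BlockingCollection<(SendOrPostCallback Callback, object State)>();

    // Continuations are queued here instead of being dispatched to the thread pool.
    public override void Post(SendOrPostCallback d, object state) => _queue.Add((d, state));

    // Runs an async entry point and pumps its continuations on the calling thread.
    public static void Run(Func<Task> entryPoint)
    {
        var previous = Current;
        var context = new SingleThreadSyncContext();
        SetSynchronizationContext(context);
        try
        {
            Task task = entryPoint();
            // Stop the pump once the top-level task has finished.
            task.ContinueWith(_ => context._queue.CompleteAdding(), TaskScheduler.Default);

            foreach (var item in context._queue.GetConsumingEnumerable())
                item.Callback(item.State);

            task.GetAwaiter().GetResult();   // re-throw any exception from the async code
        }
        finally
        {
            SetSynchronizationContext(previous);
        }
    }
}

// Usage: SingleThreadSyncContext.Run(() => MyAsyncMethod());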
But if you really want it to "RunAsThoughAsyncDidntExist" and lock the current thread while it waits, then that's not possible. Take this code for example:
var myTask = DoSomethingAsync();
DoSomethingElse();
var results = await myTask;
This code starts an I/O request, then does something else while waiting for that request to finish, then finishes waiting and processes the results after. The only way to make that behave synchronously is to refactor it, since synchronous code isn't capable of doing other work while waiting. A decision would have to be made whether to do the I/O request before or after DoSomethingElse().

How to correctly block on async code?

I have tons of code written in the following manner:
public string SomeSyncOperation(int someArg)
{
    // sync code
    SomeAsyncOperation(someArg, someOtherArg).ConfigureAwait(false).GetAwaiter().GetResult();
    // sync code
}
Here we have some sync code that has to access an async API, so it blocks until the results are ready. We can't change the method signature and add async here. So, since we are waiting synchronously anyway, do we need ConfigureAwait(false) here? I'm pretty sure we don't, but I'm a bit afraid of removing it because it probably covers some use case (or why am I seeing it virtually everywhere? Is it just cargo cult?) and removing this call may lead to some unsafe results.
So does it make sense at all?
How to correctly block on async code?
You do not correctly block on async code. Blocking is wrong. Asking what the right way is to do the wrong thing is a non-starter.
Blocking on async code is wrong because of the following scenario:
I have an object in hand representing an async operation.
The async operation is itself asynchronously waiting on the completion of a second async operation.
The second async operation will be scheduled to this thread when the message loop executes code associated with a message that is at present in this thread's message queue.
And now you can figure out what goes horribly wrong when you attempt to fetch the result synchronously of the first async operation. It blocks until its child async operation is finished, which will never happen, because now we've blocked the thread that is going to service the request in the future!
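A minimal sketch of that scenario with hypothetical names, assuming a UI framework such as WinForms (this is essentially the example from Stephen's article linked further down):

using System;
using System.Threading.Tasks;
using System.Windows.Forms;

public class MainForm : Form
{
    private async Task<string> GetDataAsync()
    {
        await Task.Delay(1000);          // the continuation is queued back to the UI context...
        return "data";
    }

    private void Button_Click(object sender, EventArgs e)
    {
        // ...but the UI thread is blocked right here waiting for the result,
        // so the queued continuation can never run. Deadlock.
        string data = GetDataAsync().Result;
        MessageBox.Show(data);
    }
}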
Your choices are:
Make your entire call stack correctly asynchronous and await the result.
Don't use this API. Write an equivalent synchronous API that you know does not deadlock, from scratch, and call it correctly.
Write an incorrect program which sometimes deadlocks unpredictably.
There are two ways to write a correct program; writing a synchronous wrapper over an asynchronous function is dangerous and wrong.
Now, you might ask, didn't the ConfigureAwait solve the problem by removing the requirement that we resume on the current context? That's not the resumption point that we're worried about. If you're going to rely on ConfigureAwait to save you from deadlock then every asynchronous operation in the stack has to use it, and we don't know if the underlying asynchronous operation that is about to cause the deadlock did that!
If the above is not entirely clear to you, read Stephen's article on why this is a bad practice, and why common workarounds are just dangerous hacks.
https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html
and his updated article giving more hacks and workarounds here:
https://msdn.microsoft.com/en-us/magazine/mt238404.aspx?f=255&MSPPError=-2147217396
But again: the right thing to do is to redesign your program to embrace asynchrony and use await throughout. Don't try to work around it.
Because this method has a call stack of ~20 methods, some of them implementing interfaces. Changing it to be async requires changing declarations in ~50 files, and we'd be converting fully sync interfaces into mixed ones.
Get busy then! This sounds pretty easy.

Why are web apps going crazy with await / async nowadays?

I come from a back end / thick client background, so maybe I'm missing something... but I recently looked at the source for an open source JWT token server and the authors went crazy with await / async. Like on every method and every line.
I get what the pattern is for... to run long running tasks in a separate thread. In my thick client days, I would use it if a method might take a few seconds, so as not to block the GUI thread... but definitely not on a method that takes a few ms.
Is this excessive use of await / async something you need for web dev or for something like Angular? This was in a JWT token server, so not even seeing what it has to do with any of those. It's just a REST end point.
How is making every single line async going to improve performance? To me, it'll kill performance from spinning up all those threads, no?
I get what the pattern is for... to run long running tasks in a separate thread.
This is absolutely not what this pattern is for.
Await does not put the operation on a new thread. Make sure that is very clear to you. Await schedules the remaining work as the continuation of the high latency operation.
Await does not make a synchronous operation into an asynchronous concurrent operation. Await enables programmers who are working with a model that is already asynchronous to write their logic to resemble synchronous workflows. Await neither creates nor destroys asynchrony; it manages existing asynchrony.
Spinning up a new thread is like hiring a worker. When you await a task, you are not hiring a worker to do that task. You are asking "is this task already done? If not, call me back when it's done so I can keep doing work that depends on that task. In the meanwhile, I'm going to go work on this other thing over here..."
If you're doing your taxes and you find you need a number from your work, and the mail hasn't arrived yet, you don't hire a worker to wait by the mailbox. You make a note of where you were in your taxes, go get other stuff done, and when the mail comes, you pick up where you left off. That's await. It's asynchronously waiting for a result.
Is this excessive use of await / async something you need for web dev or for something like Angular?
It's to manage latency.
How is making every single line async going to improve performance?
In two ways. First, by ensuring that applications remain responsive in a world with high-latency operations. That kind of performance is important to users who don't want their apps to hang. Second, by providing developers with tools for expressing the data dependency relationships in asynchronous workflows. By not blocking on high-latency operations, system resources are freed up to work on unblocked operations.
To me, it'll kill performance from spinning up all those threads, no?
There are no threads. Concurrency is a mechanism for achieving asynchrony; it is not the only one.
Ok, so if I write code like: await someMethod1(); await someMethod2(); await someMethod3(); that is magically going to make the app more responsive?
More responsive compared to what? Compared to calling those methods without awaiting them? No, of course not. Compared to synchronously waiting for the tasks to complete? Absolutely, yes.
That's what I'm not getting I guess. If you awaited on all 3 at the end, then yeah, you're running the 3 methods in parallel.
No no no. Stop thinking about parallelism. There need not be any parallelism.
Think about it this way. You wish to make a fried egg sandwich. You have the following tasks:
Fry an egg
Toast some bread
Assemble a sandwich
Three tasks. The third task depends on the results of the first two, but the first two tasks do not depend on each other. So, here are some workflows:
Put an egg in the pan. While the egg is frying, stare at the egg.
Once the egg is done, put some toast in the toaster. Stare at the toaster.
Once the toast is done, put the egg on the toast.
The problem is that you could be putting the toast in the toaster while the egg is cooking. Alternative workflow:
Put an egg in the pan. Set an alarm that rings when the egg is done.
Put toast in the toaster. Set an alarm that rings when the toast is done.
Check your mail. Do your taxes. Polish the silverware. Whatever it is you need to do.
When both alarms have rung, grab the egg and the toast, put them together, and you have a sandwich.
Do you see why the asynchronous workflow is far more efficient? You get lots of stuff done while you're waiting for the high latency operation to complete. But you did not hire an egg chef and a toast chef. There are no new threads!
The workflow I proposed would be:
eggtask = FryEggAsync();
toasttask = MakeToastAsync();
egg = await eggtask;
toast = await toasttask;
return MakeSandwich(egg, toast);
Now, compare that to:
eggtask = FryEggAsync();
egg = await eggtask;
toasttask = MakeToastAsync();
toast = await toasttask;
return MakeSandwich(egg, toast);
Do you see how that workflow differs? This workflow is:
Put an egg in the pan and set an alarm.
Go do other work until the alarm goes off.
Get the egg out of the pan; put the bread in the toaster. Set an alarm...
Go do other work until the alarm goes off.
When the alarm goes off, assemble the sandwich.
This workflow is less efficient because we have failed to capture the fact that the toast and egg tasks are high latency and independent. But it is surely more efficient use of resources than doing nothing while you're waiting for the egg to cook.
The point of this whole thing is: threads are insanely expensive, so don't spin up new threads. Rather, make more efficient use of the thread you've got by putting it to work while you're doing high latency operations. Await is not about spinning up new threads; it is about getting more work done on one thread in a world with high latency computation.
Maybe that computation is being done on another thread, maybe it's blocked on disk, whatever. Doesn't matter. The point is, await is for managing that asynchrony, not creating it.
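Here is a self-contained sketch of the first workflow, with Task.Delay standing in for the frying and the toasting (the helper names are hypothetical, borrowed from the analogy):

using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class BreakfastSketch
{
    // Stand-ins: Task.Delay plays the role of the "alarm" -- no thread is
    // blocked while the food "cooks".
    static async Task<string> FryEggAsync()    { await Task.Delay(3000); return "egg"; }
    static async Task<string> MakeToastAsync() { await Task.Delay(2000); return "toast"; }

    static async Task Main()
    {
        var clock = Stopwatch.StartNew();

        var eggtask = FryEggAsync();      // both "cooking" at once...
        var toasttask = MakeToastAsync();
        var egg = await eggtask;          // ...we only wait when we need the results
        var toast = await toasttask;

        Console.WriteLine($"{egg} + {toast} ready after {clock.Elapsed.TotalSeconds:F1}s");
        // Prints roughly 3.0 s (the longer of the two), not 5.0 s, and no thread
        // sat blocked while the "cooking" was in progress.
    }
}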
I'm having a difficult time understanding how asynchronous programming can be possible without using parallelism somewhere. Like, how do you tell the program to get started on the toast while waiting for the eggs without DoEggs() running concurrently, at least internally?
Go back to the analogy. You are making an egg sandwich, the eggs and toast are cooking, and so you start reading your mail. You get halfway through the mail when the eggs are done, so you put the mail aside and take the egg off the heat. Then you go back to the mail. Then the toast is done and you make the sandwich. Then you finish reading your mail after the sandwich is made. How did you do all that without hiring staff, one person to read the mail, one person to cook the egg, one to make the toast and one to assemble the sandwich? You did it all with a single worker.
How did you do that? By breaking tasks up into small pieces, noting which pieces have to be done in which order, and then cooperatively multitasking the pieces.
Kids today with their big flat virtual memory models and multithreaded processes think that this is how it's always been, but my memory stretches back to the days of Windows 3, which had none of that. If you wanted two things to happen "in parallel" that's what you did: you split the tasks up into small parts and took turns executing them. The whole operating system was based on this concept.
Now, you might look at the analogy and say "OK, but some of the work, like actually toasting the toast, is being done by a machine", and that is the source of parallelism. Sure, I didn't have to hire a worker to toast the bread, but I achieved parallelism in hardware. And that is the right way to think of it. Hardware parallelism and thread parallelism are different. When you make an asynchronous request to the network subsystem to go find you a record from a database, there is no thread that is sitting there waiting for the result. The hardware achieves parallelism at a level far, far below that of operating system threads.
If you want a more detailed explanation of how hardware works with the operating system to achieve asynchrony, read "There is no thread" by Stephen Cleary.
So when you see "async" do not think "parallel". Think "high-latency operation split up into small pieces." If there are many such operations whose pieces do not depend on each other, then you can cooperatively interleave the execution of those pieces on one thread.
As you might imagine, it is very difficult to write control flows where you can abandon what you are doing right now, go do something else, and seamlessly pick up where you left off. That's why we make the compiler do that work! The point of "await" is that it lets you manage those asynchronous workflows by describing them as synchronous workflows. Everywhere that there is a point where you could put this task aside and come back to it later, write "await". The compiler will take care of turning your code into many tiny pieces that can each be scheduled in an asynchronous workflow.
UPDATE:
In your last example, what would be the difference between this:
eggtask = FryEggAsync();
egg = await eggtask;
toasttask = MakeToastAsync();
toast = await toasttask;
and this?
egg = await FryEggAsync();
toast = await MakeToastAsync();
I assume it calls them synchronously but executes them asynchronously? I have to admit I've never even bothered to await the task separately before.
There is no difference.
When FryEggAsync is called, it is called regardless of whether await appears before it or not. await is an operator. It operates on the thing returned from the call to FryEggAsync. It's just like any other operator.
Let me say this again: await is an operator and its operand is a task. It is a very unusual operator, to be sure, but grammatically it is an operator, and it operates on a value just like any other operator.
Let me say it again: await is not magic dust that you put on a call site and suddenly that call site is remoted to another thread. The call happens when the call happens, the call returns a value, and that value is a reference to an object that is a legal operand to the await operator.
So yes,
var x = Foo();
var y = await x;
and
var y = await Foo();
are the same thing, the same as
var x = Foo();
var y = 1 + x;
and
var y = 1 + Foo();
are the same thing.
So let's go through this one more time, because you seem to believe the myth that await causes asynchrony. It does not.
async Task M() {
var eggtask = FryEggAsync();
Suppose M() is called. FryEggAsync is called. Synchronously. There is no such thing as an asynchronous call; you see a call, control passes to the callee until the callee returns. The callee returns a task which represents an egg to be made available in the future.
How does FryEggAsync do this? I don't know and I don't care. All I know is I call it, and I get an object back that represents a future value. Maybe that value is produced on a different thread. Maybe it is produced on this thread but in the future. Maybe it is produced by special-purpose hardware, like a disk controller or a network card. I don't care. I care that I get back a task.
egg = await eggtask;
Now we take that task and the await asks it "are you done?" If the answer is yes, then egg is given the value produced by the task. If the answer is no, then M() returns a Task representing "the work of M will be completed in the future". The remainder of M() is signed up as the continuation of eggtask, so when eggtask completes, it will call M() again and pick it up not from the beginning, but from the assignment to egg. M() is a method that can be resumed at any of its await points. The compiler does the necessary magic to make that happen.
So now we've returned. The thread keeps on doing whatever it does. At some point the egg is ready, so the continuation of eggtask is invoked, which causes M() to be called again. It resumes at the point where it left off: assigning the just-produced egg to egg. And now we keep on trucking:
toasttask = MakeToastAsync();
Again, the call returns a task, and we:
toast = await toasttask;
check to see if the task is complete. If yes, we assign toast. If no, then we return from M() again, and the continuation of toasttask is the remainder of M().
And so on.
Eliminating the task variables does nothing germane. Storage for the values is allocated; it's just not given a name.
ANOTHER UPDATE:
is there a case to be made to call Task-returning methods as early as possible but awaiting them as late as possible?
The example given is something like:
var task = FooAsync();
DoSomethingElse();
var foo = await task;
...
There is some case to be made for that. But let's take a step back here. The purpose of the await operator is to construct an asynchronous workflow using the coding conventions of a synchronous workflow. So the thing to think about is what is that workflow? A workflow imposes an ordering upon a set of related tasks.
The easiest way to see the ordering required in a workflow is to examine the data dependence. You can't make the sandwich before the toast comes out of the toaster, so you're going to have to obtain the toast somewhere. Since await extracts the value from the completed task, there's got to be an await somewhere between the creation of the toaster task and the creation of the sandwich.
You can also represent dependencies on side effects. For example, the user presses the button, so you want to play the siren sound, then wait three seconds, then open the door, then wait three seconds, then close the door:
DisableButton();
PlaySiren();
await Task.Delay(3000);
OpenDoor();
await Task.Delay(3000);
CloseDoor();
EnableButton();
It would make no sense at all to say
DisableButton();
PlaySiren();
var delay1 = Task.Delay(3000);
OpenDoor();
var delay2 = Task.Delay(3000);
CloseDoor();
EnableButton();
await delay1;
await delay2;
Because this is not the desired workflow.
So, the actual answer to your question is: deferring the await until the point where the value is actually needed is a pretty good practice, because it increases the opportunities for work to be scheduled efficiently. But you can go too far; make sure that the workflow that is implemented is the workflow you want.
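One related note: when two independent tasks are both needed before the next step, Task.WhenAll expresses that dependence directly. This sketch reuses the FryEggAsync, MakeToastAsync and MakeSandwich placeholders from the example above and waits for both, in whatever order they complete:
eggtask = FryEggAsync();
toasttask = MakeToastAsync();
await Task.WhenAll(eggtask, toasttask);
return MakeSandwich(eggtask.Result, toasttask.Result);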
Generally this is because asynchronous functions play nicer with other async functions; otherwise you start losing the benefits of asynchronicity. As a result, functions calling async functions end up being async themselves, and it spreads throughout the entire application, e.g. if you make your interactions with a data store async, then the things using that functionality tend to be made async as well.
As you convert synchronous code to asynchronous code, you’ll find that it works best if asynchronous code calls and is called by other asynchronous code—all the way down (or “up,” if you prefer). Others have also noticed the spreading behavior of asynchronous programming and have called it “contagious” or compared it to a zombie virus. Whether turtles or zombies, it’s definitely true that asynchronous code tends to drive surrounding code to also be asynchronous. This behavior is inherent in all types of asynchronous programming, not just the new async/await keywords.
Source: Async/Await - Best Practices in Asynchronous Programming
It's an Actor Model World, Really...
My view is that async / await are simply a way of dressing up software systems so as to avoid having to concede that, really, a lot of systems (especially those with a lot of network comms) are better seen as Actor model (or better yet, Communicating Sequential Process) systems.
With both of these the whole point is that you wait for one of several things to become complete-able, take the necessary action when one does, and then return to waiting. Specifically you're waiting for a message to arrive from somewhere else, reading it, and acting on the content. In *nix, the waiting is generally done with a call to epoll() or select().
Using await / async is simply a way of pretending that your system is still made of the familiar, kinda-synchronous method calls, whilst making it harder to cope efficiently with things that do not complete in the same order every time.
However, once you get over the idea that you're no longer calling methods but simply passing messages to and fro it all becomes very natural. It's very much a "please do this", "sure, here's the answer" thing, with many such interactions intertwined. Wrapping it up with a big WaitForLotsOfThings() call at the top of a loop is merely an explicit acknowledgement that your program will wait until it has something to do in response to many other programs communicating with it.
How Windows Makes it Hard
Unfortunately, Windows makes it very hard to implement a reactor system ("if you read that message now, you'll get it"). Windows is proactor ("that message you asked me to read? It's now been read."). It's an important distinction.
First, I'll explain reactor and proactor.
A reactor "reacts" to events. For example, if a socket becomes ready to read, only at that point does the "reactor" model program decide what to read and what to do with it.
Whereas a proactor proactively decides what it's going to do if and when the socket becomes ready, and commits to that action.
With a reactor, a message (or indeed a timeout) that means "stop listening to that other actor" is easily dealt with - you simply exclude that other actor from the list you'll listen to next time you wait (the next call to select() or epoll()).
With a proactor, it's a lot harder. How does one honour a "stop listening to that other actor" message when the socket read() has already been started with some sort of async call, which won't complete until something is read? A completed read() is now a doubtful outcome, given the instruction that was just received.
I'm nit-picking to some extent. Reactor is very useful in systems with dynamic connectivity, with Actors dropping into the system and dropping out again. Proactor is fine if you have a fixed population of actors with comms links that'll never go away. Nevertheless, given that a proactor system is easily implemented on a reactor platform, but a reactor system cannot easily be implemented on a proactor platform (time won't go backwards), I find Windows' approach particularly irritating.
So one way or other, async / await are definitely still in proactor land.
Knock-on Impact
This has infected many other libraries.
C++'s Boost.Asio is also proactor, even on *nix, largely it seems because they wanted to have a Windows implementation.
ZeroMQ, which is a reactor framework, is limited to some extent on Windows by being based on a call to select() (which on Windows works only on sockets).
For the cygwin family of POSIX runtimes on Windows, they had to implement select(), epoll(), etc. by having a thread per file descriptor polling (yes, polling!!!!) the underlying socket / serial port / pipe for incoming data in order to recreate POSIX's routines. Yeurk! The comments on the cygwin dev's mailing lists dating back to the time when they were implementing that part make for amusing reading.
Actor Isn't Necessarily Slow
It's worth noting that the phrase "passing messages" doesn't necessarily mean passing copies around - there are plenty of formulations of the Actor Model where you're merely passing ownership of references to messages around (e.g. Dataflow, part of the Task Parallel Library in C#). This makes it fast. I've not yet got round to looking at the Dataflow library, but it doesn't really make Windows reactor all of a sudden. It doesn't give you an actor-model reactor system working on all sorts of data bearers like sockets, pipes, queues, etc.
Windows 10's Linux Runtime
So having just blasted Windows and its inferior proactor architecture, one intriguing point is that Windows 10 now runs Linux binaries under WSL1. How, I'd very much like to know, has Microsoft implemented the system call that underlies select() and epoll() in WSL1, given that it has to function on sockets, serial ports, pipes and everything else in the land of POSIX that is a file descriptor, when everything else on Windows can't? I'd give my hind teeth to know the answer to that question.

What is the difference between asynchronous programming and multithreading?

I thought that they were basically the same thing — writing programs that split tasks between processors (on machines that have 2+ processors). Then I'm reading this, which says:
Async methods are intended to be non-blocking operations. An await expression in an async method doesn’t block the current thread while the awaited task is running. Instead, the expression signs up the rest of the method as a continuation and returns control to the caller of the async method.
The async and await keywords don't cause additional threads to be created. Async methods don't require multithreading because an async method doesn't run on its own thread. The method runs on the current synchronization context and uses time on the thread only when the method is active. You can use Task.Run to move CPU-bound work to a background thread, but a background thread doesn't help with a process that's just waiting for results to become available.
and I'm wondering whether someone can translate that to English for me. It seems to draw a distinction between asynchronicity (is that a word?) and threading and imply that you can have a program that has asynchronous tasks but no multithreading.
Now I understand the idea of asynchronous tasks such as the example on pg. 467 of Jon Skeet's C# In Depth, Third Edition
async void DisplayWebsiteLength(object sender, EventArgs e)
{
    label.Text = "Fetching ...";
    using (HttpClient client = new HttpClient())
    {
        Task<string> task = client.GetStringAsync("http://csharpindepth.com");
        string text = await task;
        label.Text = text.Length.ToString();
    }
}
The async keyword means "This function, whenever it is called, will not be called in a context in which its completion is required for everything after its call to be called."
In other words, writing it in the middle of some task
int x = 5;
DisplayWebsiteLength();
double y = Math.Pow((double)x,2000.0);
, since DisplayWebsiteLength() has nothing to do with x or y, will cause DisplayWebsiteLength() to be executed "in the background", like
processor 1 | processor 2
-------------------------------------------------------------------
int x = 5; | DisplayWebsiteLength()
double y = Math.Pow((double)x,2000.0); |
Obviously that's a stupid example, but am I correct or am I totally confused or what?
(Also, I'm confused about why sender and e aren't ever used in the body of the above function.)
Your misunderstanding is extremely common. Many people are taught that multithreading and asynchrony are the same thing, but they are not.
An analogy usually helps. You are cooking in a restaurant. An order comes in for eggs and toast.
Synchronous: you cook the eggs, then you cook the toast.
Asynchronous, single threaded: you start the eggs cooking and set a timer. You start the toast cooking, and set a timer. While they are both cooking, you clean the kitchen. When the timers go off you take the eggs off the heat and the toast out of the toaster and serve them.
Asynchronous, multithreaded: you hire two more cooks, one to cook eggs and one to cook toast. Now you have the problem of coordinating the cooks so that they do not conflict with each other in the kitchen when sharing resources. And you have to pay them.
Now does it make sense that multithreading is only one kind of asynchrony? Threading is about workers; asynchrony is about tasks. In multithreaded workflows you assign tasks to workers. In asynchronous single-threaded workflows you have a graph of tasks where some tasks depend on the results of others; as each task completes it invokes the code that schedules the next task that can run, given the results of the just-completed task. But you (hopefully) only need one worker to perform all the tasks, not one worker per task.
It will help to realize that many tasks are not processor-bound. For processor-bound tasks it makes sense to hire as many workers (threads) as there are processors, assign one task to each worker, assign one processor to each worker, and have each processor do the job of nothing else but computing the result as quickly as possible. But for tasks that are not waiting on a processor, you don't need to assign a worker at all. You just wait for the message to arrive that the result is available and do something else while you're waiting. When that message arrives then you can schedule the continuation of the completed task as the next thing on your to-do list to check off.
So let's look at Jon's example in more detail. What happens?
Someone invokes DisplayWebsiteLength. Who? We don't care.
It sets a label, creates a client, and asks the client to fetch something. The client returns an object representing the task of fetching something. That task is in progress.
Is it in progress on another thread? Probably not. Read Stephen's article on why there is no thread.
Now we await the task. What happens? We check to see if the task has completed between the time we created it and we awaited it. If yes, then we fetch the result and keep running. Let's suppose it has not completed. We sign up the remainder of this method as the continuation of that task and return.
Now control has returned to the caller. What does it do? Whatever it wants.
Now suppose the task completes. How did it do that? Maybe it was running on another thread, or maybe the caller that we just returned to allowed it to run to completion on the current thread. Regardless, we now have a completed task.
The completed task asks the correct thread -- again, likely the only thread -- to run the continuation of the task.
Control passes immediately back into the method we just left at the point of the await. Now there is a result available so we can assign text and run the rest of the method.
It's just like in my analogy. Someone asks you for a document. You send away in the mail for the document, and keep on doing other work. When it arrives in the mail you are signalled, and when you feel like it, you do the rest of the workflow -- open the envelope, pay the delivery fees, whatever. You don't need to hire another worker to do all that for you.
In-browser Javascript is a great example of an asynchronous program that has no multithreading.
You don't have to worry about multiple pieces of code touching the same objects at the same time: each function will finish running before any other javascript is allowed to run on the page. (Update: Since this was written, JavaScript has added async functions and generator functions. These functions do not always run to completion before any other javascript is executed: whenever they reach a yield or await keyword, they yield execution to other javascript, and can continue execution later, similar to C#'s async methods.)
However, when doing something like an AJAX request, no code is running at all, so other javascript can respond to things like click events until that request comes back and invokes the callback associated with it. If one of these other event handlers is still running when the AJAX request gets back, its handler won't be called until they're done. There's only one JavaScript "thread" running, even though it's possible for you to effectively pause the thing you were doing until you have the information you need.
In C# applications, the same thing happens any time you're dealing with UI elements--you're only allowed to interact with UI elements when you're on the UI thread. If the user clicked a button, and you wanted to respond by reading a large file from the disk, an inexperienced programmer might make the mistake of reading the file within the click event handler itself, which would cause the application to "freeze" until the file finished loading because it's not allowed to respond to any more clicking, hovering, or any other UI-related events until that thread is freed.
One option programmers might use to avoid this problem is to create a new thread to load the file, and then tell that thread's code that when the file is loaded it needs to run the remaining code on the UI thread again so it can update UI elements based on what it found in the file. Until recently, this approach was very popular because it was what the C# libraries and language made easy, but it's fundamentally more complicated than it has to be.
If you think about what the CPU is doing when it reads a file at the level of the hardware and Operating System, it's basically issuing an instruction to read pieces of data from the disk into memory, and to hit the operating system with an "interrupt" when the read is complete. In other words, reading from disk (or any I/O really) is an inherently asynchronous operation. The concept of a thread waiting for that I/O to complete is an abstraction that the library developers created to make it easier to program against. It's not necessary.
Now, most I/O operations in .NET have a corresponding ...Async() method you can invoke, which returns a Task almost immediately. You can add callbacks to this Task to specify code that you want to have run when the asynchronous operation completes. You can also specify which thread you want that code to run on, and you can provide a token which the asynchronous operation can check from time to time to see if you decided to cancel the asynchronous task, giving it the opportunity to stop its work quickly and gracefully.
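A short sketch of that cancellation pattern (the URL is hypothetical; HttpClient.GetAsync genuinely accepts a CancellationToken and abandons the request if the token fires first):

using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

static class CancellationSketch
{
    public static async Task FetchWithTimeoutAsync()
    {
        // Cancel automatically if no response arrives within 5 seconds.
        using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
        using var client = new HttpClient();
        try
        {
            HttpResponseMessage response = await client.GetAsync("https://example.com/data", cts.Token);
            Console.WriteLine((int)response.StatusCode);
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("Cancelled: no response within 5 seconds.");
        }
    }
}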
Until the async/await keywords were added, C# was much more obvious about how callback code gets invoked, because those callbacks were in the form of delegates that you associated with the task. In order to still give you the benefit of using the ...Async() operation, while avoiding complexity in code, async/await abstracts away the creation of those delegates. But they're still there in the compiled code.
So you can have your UI event handler await an I/O operation, freeing up the UI thread to do other things, and more-or-less automatically returning to the UI thread once you've finished reading the file--without ever having to create a new thread.
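To make those last two paragraphs concrete, here is a sketch of both styles, assuming a WinForms form with a Label field named label and a hypothetical file path (File.ReadAllTextAsync requires .NET Core / .NET 5+):

using System;
using System.IO;
using System.Threading.Tasks;
using System.Windows.Forms;

public class EditorForm : Form
{
    private readonly Label label = new Label { Dock = DockStyle.Top };

    public EditorForm() { Controls.Add(label); }

    // Pre-async/await style: attach an explicit callback delegate to the task
    // and ask for it to run back on the UI thread's scheduler.
    private void OpenClicked_Callbacks(object sender, EventArgs e)
    {
        File.ReadAllTextAsync(@"C:\temp\big.txt")
            .ContinueWith(t => { label.Text = t.Result.Length.ToString(); },
                          TaskScheduler.FromCurrentSynchronizationContext());
    }

    // await form: the compiler generates that continuation for us and resumes
    // on the captured UI context -- no new thread is created for the waiting.
    private async void OpenClicked_Await(object sender, EventArgs e)
    {
        string text = await File.ReadAllTextAsync(@"C:\temp\big.txt");
        label.Text = text.Length.ToString();
    }
}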

What's so special about UI thread?

Let's say I have a method fooCPU that runs synchronously (it doesn't call pure async methods performing I/O, or use other threads to run its code via Task.Run or the like). That method performs some heavy calculations - it's CPU bound.
Now I call fooCPU in my program without delegating it to be executed by a worker thread. If one line of fooCPU will take long to run, no other lines will be executed until it finishes. So for example, calling it from the UI thread causes the UI thread to freeze (GUI will become unresponsive).
When I stated that "async/await is an imitation of multithreading: the lines of two different pieces of code are executed in turns, on a single thread, and if one of these lines takes long to run, no other lines will be executed until it finishes",
I was told that it's true for async used on the UI thread, but that it's not true for all other cases (ASP.NET, async on the thread pool, console apps, etc.).
Could anyone tell me what this might mean? How is UI thread different from the main thread of a console program?
I think nobody wants anyone here on this forum to continue the discussion of related topics, as they appear in the comments for instance, so it's better to ask a new question.
I recommend you read my async intro post; it explains how the async and await keywords work. Then, if you're interested in writing asynchronous code, continue with my async best practices article.
The relevant parts of the intro post:
The beginning of an async method is executed just like any other method. That is, it runs synchronously until it hits an “await” (or throws an exception).
So this is why the inner method in your console code example (without an await) was running synchronously.
Await examines that awaitable to see if it has already completed; if the awaitable has already completed, then the method just continues running (synchronously, just like a regular method).
So this is why the outer method in your console code example (that was awaiting the inner method which was synchronous) was running synchronously.
Later on, when the awaitable completes, it will execute the remainder of the async method. If you’re awaiting a built-in awaitable (such as a task), then the remainder of the async method will execute on a “context” that was captured before the “await” returned.
This "context" is SynchronizationContext.Current unless it is null, in which case it is TaskScheduler.Current. Or, the simpler version:
What exactly is that “context”?
Simple answer:
If you’re on a UI thread, then it’s a UI context.
If you’re responding to an ASP.NET request, then it’s an ASP.NET request context.
Otherwise, it’s usually a thread pool context.
Putting all of this together, you can visualize async/await as working like this: the method is split into several "chunks", with each await acting as a point where the method is split. The first chunk is always run synchronously, and at each split point it may continue either synchronously or asynchronously. If it continues asynchronously, then it will continue in a captured context (by default). UI threads provide a context that will execute the next chunk on the UI thread.
So, to answer this question, the special thing about UI threads is that they provide a SynchronizationContext that queues work back to that same UI thread.
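A tiny console sketch of that last point: with no SynchronizationContext installed, the chunk after the await resumes on a thread-pool thread, which may or may not be the thread that started the method; a UI context would instead queue it back to the one UI thread.

using System;
using System.Threading;
using System.Threading.Tasks;

static class ContextDemo
{
    static async Task Main()
    {
        Console.WriteLine($"before await: thread {Environment.CurrentManagedThreadId}, " +
                          $"context = {SynchronizationContext.Current?.GetType().Name ?? "none"}");
        await Task.Delay(100);
        // In a console app this typically prints a different thread id.
        Console.WriteLine($"after await:  thread {Environment.CurrentManagedThreadId}");
    }
}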
I think nobody wants anyone here on this forum to continue the discussion of related topics, as they appear in the comments for instance, so it's better to ask a new question.
Well, Stack Overflow is specifically not intended to be a forum; it's a Question & Answer site. So it's not a place to ask for exhaustive tutorials; it's a place to come when you're stuck trying to get code working or if you don't understand something after having researched everything you can about it. This is why the comments on SO are (purposefully) restricted - they have to be short, no nice code formatting, etc. Comments on this site are intended for clarification, not as a discussion or forum thread.
It is pretty simple, a thread can do only one thing at a time. So if you send your UI thread out in the woods doing something non-UI related, say a dbase query, then all UI activity stops. No more screen updates, no response to mouse clicks and key presses. It looks and acts frozen.
You'll probably say, "well, I'll just use another thread to do the UI then". That kind of works in a console app, but not in a GUI app: making code thread-safe is difficult, and UI code is not thread-safe at all because so much code is involved - not the kind you wrote, but the kind you use through a fancy class library wrapper.
The universal solution is to invert that: do the non-UI related stuff on a worker thread and leave the UI thread to only take care of the easy UI stuff. Async/await helps you do that: the work you start on a worker (for example with Task.Run) runs off the UI thread, and the code after the await comes back to it. The only way to mess that up, and it is not uncommon, is to ask the UI thread to still do too much work. Like adding a line of text to a textbox once every millisecond. That's just bad UI design, humans don't read that fast.
Given
async void Foo() {
    Bar();
    await Task.Yield();
    Baz();
}
you're right that if Foo() gets called on the UI thread, then Bar() gets called immediately, and Baz() gets called at some later time, but still on the UI thread.
However, this is not a property of the threads themselves.
What's actually going on is that this method gets split up into something similar to
Task Foo() {
    Bar();
    return Task.Yield().Continue(() => {
        Baz();
    });
}
This is not actually correct, but the ways in which it's wrong don't matter.
The argument that gets passed to my hypothetical Continue method is code that can be invoked in some way to be determined by the task. The task may decide to execute it immediately, it may decide to execute it at some later point on the same thread, or it may decide to execute it at some later point on a different thread.
Actually, the tasks themselves don't decide; they simply pass the delegate on to a SynchronizationContext. It's this synchronisation context that determines what to do with the to-be-executed code.
And that's what's different between the thread types: once you access any WinForms control from a thread, then WinForms installs a synchronisation context for that specific thread, which will schedule the to-be-executed code at some later point on the same thread.
ASP.NET, background threads, it's all different synchronisation contexts, and that's what's causing the changes in how code gets scheduled.
