Is waiting inside a callback safe in C#?

I have a BeginRead that calls a ReadCallback function upon completion. What I want to do in the callback is wait for a ManualResetEvent on a buffer to tell me whether the buffer is empty or not, so I can issue a new BeginRead if I need more data. I have implemented this and it works. But my question is: is it safe to wait inside a callback?
I'm new to C#, if these were regular threads I wouldn't have doubts, but I'm not sure how C# treats callbacks.
Thank you.

APM callbacks are called on the thread-pool in all cases that I can think of.
That reduces your question to "Can I block thread-pool threads?". The answer to that is generally yes but it has downsides.
It is "safe" to do so until you exhaust the pool (then you risk deadlocks and extreme throughput reduction like >1000x).
The other downsides are the usual downsides of blocking threads in general. They cost a lot of memory to keep around. Lots of threads can cause lots of context switches.
Can't you just use await? I imagine your code to look like this:
while (true) {
    var readResult = await ReadAsync(...);
    await WaitForSomeConditionAsync(); // Instead of ManualResetEvent.
}
No blocking. Very simple code. There is no need to do anything special to issue the next read. It just happens as part of the loop.
My working model is something similar to producer/consumer.
Sounds like a good use for TPL Dataflow. Dataflow automates the forwarding of data, the waiting and throttling. It supports async to the fullest.
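As a rough sketch of that shape (the readAsync and processAsync delegates are placeholders for your actual read and buffer-handling logic; this is an illustration, not a drop-in implementation):

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class ReadPipeline
{
    public static async Task RunAsync(Func<Task<byte[]>> readAsync, Func<byte[], Task> processAsync)
    {
        // BufferBlock holds produced chunks; BoundedCapacity throttles the reader
        // when the consumer falls behind.
        var buffer = new BufferBlock<byte[]>(new DataflowBlockOptions { BoundedCapacity = 16 });

        // ActionBlock consumes chunks asynchronously, one at a time by default.
        var consumer = new ActionBlock<byte[]>(processAsync);
        buffer.LinkTo(consumer, new DataflowLinkOptions { PropagateCompletion = true });

        // Producer loop: SendAsync waits (without blocking a thread) while the buffer is full.
        while (true)
        {
            var chunk = await readAsync();
            if (chunk == null || chunk.Length == 0) break; // end of stream
            await buffer.SendAsync(chunk);
        }

        buffer.Complete();
        await consumer.Completion;
    }
}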

Related

How to continue TaskCompletionSource<> in another thread?

I'm using TaskCompletionSource<> quite often. I have a network protocol design, where I receive many streams in one tcp/ip connection. I demultiplex those streams and then inform the corresponding "SubConnections" of new content.
Those "SubConnections" (which are waiting via await) then should continue in a new thread.
Usually I solve such issues by putting the TaskCompletionSource<>.SetResult call in an anonymous ThreadPool.QueueUserWorkItem delegate, like this:
ThreadPool.QueueUserWorkItem(delegate { tcs.SetResult(null); });
If I don't do this the corresponding await tcs.Task call will continue in the thread which called tcs.SetResult.
However, I'm aware that this isn't the right way to do things. It's also possible to write my own SynchronizationContext (or something similar) which will instruct the await call to continue on another thread.
My primary question here is: How would I do this in the "best practice" way?
My hope here is also to avoid the ThreadPool overhead, because it's quite high on Linux compared to just blocking a thread and waiting for a ManualResetEvent - even though the SynchronizationContext (or whatever) may also utilize the ThreadPool.
Please refrain from telling me that it's generally a bad idea to multiplex something in one tcp/ip connection or that I should just use System.IO.Pipelines, REST or whatever. This is my scenario. Thank you.
You can create the TaskCompletionSource using TaskCreationOptions.RunContinuationsAsynchronously (in .NET 4.6+):
var tcs = new TaskCompletionSource<Result>(TaskCreationOptions.RunContinuationsAsynchronously);
...
tcs.SetResult(...);
See e.g. this thread for more details.
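To make the difference visible, here is a small illustrative sketch (the thread-ID printing is mine, not part of the linked thread): with the flag, the continuation of await tcs.Task is queued to the thread pool rather than running inline on the thread that calls SetResult.

using System;
using System.Threading;
using System.Threading.Tasks;

class TcsDemo
{
    static async Task Main()
    {
        // With RunContinuationsAsynchronously the awaiter below resumes on a
        // thread-pool thread; without it, it would typically run inline on the
        // thread that calls SetResult.
        var tcs = new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);

        var waiter = Task.Run(async () =>
        {
            await tcs.Task;
            Console.WriteLine($"Continuation thread: {Thread.CurrentThread.ManagedThreadId}");
        });

        await Task.Delay(100); // give the waiter time to attach its continuation
        Console.WriteLine($"Completing thread:   {Thread.CurrentThread.ManagedThreadId}");
        tcs.SetResult(null);
        await waiter;
    }
}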

Why does a blocking thread consume more than async/await?

See this question and answer:
Why use async controllers, when IIS already handles the request concurrency?
OK, a thread consumes more resources than the async/await construction, but why? What is the core difference? You still need to remember all the state etc., don't you?
Why would a thread pool be limited, while you can have tons more idle async/await constructions?
Is it because async/await knows more about your application?
Well, let's imagine a web server. Most of the time, all it does is wait. It usually isn't CPU-bound, but rather I/O-bound: it waits for network I/O, disk I/O, etc. After each wait it has something (usually very short) to do, and then all it does is wait again. Now, the interesting part is what happens while it waits. In the most "trivial" case (which of course is nowhere near production quality), you would create a thread to deal with every socket you have.
Now, each of those threads has its own cost: some handles, 1 MB of stack space... And of course, not all of those threads can run at the same time, so the OS scheduler needs to deal with that and choose the right thread to run each time (which means a lot of context switching). It will work for 1 client. It will work for 10 clients. But let's imagine 10,000 clients at the same time: 10,000 threads means about 10 GB of memory just for stacks. That's more than the average web server in the world has.
All of these resources are spent because you dedicated a thread to each user. But most of these threads do nothing! They just wait for something to happen. The OS has APIs for async I/O that let you just queue an operation to be completed once the I/O finishes, without having a dedicated thread waiting for it.
If you use async/await, you can write an application that uses far fewer threads, and each thread is utilized much more, with less "doing nothing" time.
async/await is not the only way of doing that; you could have done this before async/await was introduced. But async/await allows you to write code that's very readable and very easy to write, and that looks almost as if it runs on a single thread (without lots of callbacks and delegates being passed around like before).
By combining the easy syntax of async/await with OS features like async I/O (via I/O completion ports), you can write much more scalable code without losing readability.
Another famous example is WPF/WinForms. You have the UI thread, whose whole job is to process events, and which usually has nothing special to do. But you can't block it, or the GUI will hang and the user won't like it. By using async/await and splitting each "hard" piece of work into short operations, you can achieve a responsive UI and readable code. If you have to access the DB to execute a query, you start the async operation from the UI thread and then await it until it finishes and you have results you can process on the UI thread (because you need to show them to the user, for example). You could do this before, but async/await makes it much more readable, as in the sketch below.
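A minimal sketch of that pattern, assuming a WPF window with a button named LoadButton, a list named ResultsList, and a hypothetical LoadCustomersAsync data-access method:

// Inside the window's code-behind. LoadCustomersAsync is a placeholder for your
// actual database call (e.g. an EF Core or Dapper query returning a Task).
private async void LoadButton_Click(object sender, RoutedEventArgs e)
{
    LoadButton.IsEnabled = false;
    try
    {
        // The UI thread is released here while the query runs;
        // the continuation resumes on the UI thread with the results.
        var customers = await LoadCustomersAsync();
        ResultsList.ItemsSource = customers;
    }
    finally
    {
        LoadButton.IsEnabled = true;
    }
}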
Hope it helps.
Creating a new thread allocates a separate memory area exclusive to that thread, holding its resources, mainly its call stack, which on Windows reserves 1 MB of memory by default.
So if you have 1,000 idle threads you are using up at least 1 GB of memory doing nothing.
The state for async operations takes memory as well, but it's just the actual size needed for that operation plus the state machine generated by the compiler, and it's kept on the heap.
Moreover, using many threads and blocking them has another cost (which IMO is bigger). When a thread is blocked it is taken off the CPU and switched with another (i.e. a context switch). That means your threads aren't using their time-slices optimally when they get blocked. A higher rate of context switching means your machine spends more time on context-switching overhead and the individual threads get less actual work done.
Using async-await appropriately enables using all the given time-slice since the thread, instead of blocking, goes back to the thread pool and takes another task to execute while the asynchronous operation continues concurrently.
So, in conclusion, the resources async await frees up are CPU and memory, which allows your server to handle more requests concurrently with the same amount of resources or the same amount of requests with less resources.
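A hedged, illustrative way to see both costs side by side (don't do this in production; it exists only to show the difference): 1,000 blocked threads each reserve a stack and compete for the scheduler, while 1,000 pending awaits are just small heap objects.

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class BlockingVsAsync
{
    static async Task Main()
    {
        // 1,000 dedicated threads, each blocked in Sleep: every one of them reserves
        // its own call stack (about 1 MB by default on Windows) while doing nothing.
        var threads = Enumerable.Range(0, 1000)
            .Select(_ => new Thread(() => Thread.Sleep(TimeSpan.FromSeconds(10))))
            .ToList();
        threads.ForEach(t => t.Start());

        // 1,000 pending awaits: each is just a compiler-generated state machine on the
        // heap plus a timer entry; no thread is held while the delay is pending.
        var tasks = Enumerable.Range(0, 1000)
            .Select(_ => Task.Delay(TimeSpan.FromSeconds(10)))
            .ToArray();

        await Task.WhenAll(tasks);
        threads.ForEach(t => t.Join());
    }
}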
The important thing to realize here is that a blocked thread is not usable to do any other work until it becomes unblocked. A thread that encounters an await is free to return to the threadpool and pick up other work until the value being awaited becomes available.
When you call a synchronous I/O method, the thread executing your code is blocked waiting for the I/O to complete. To handle 1000 concurrent requests, you will need 1000 threads.
When you call an asynchronous I/O method, the thread is not blocked. It initializes the I/O operation and can work on something else. It can be the rest of your method (if you don't await), or it can be some other request if you await the I/O method. The thread pool doesn't need to create new threads for new requests, as all the threads can be used optimally and keep the CPUs busy.
Async I/O operations are actually implemented asynchronously at the OS level.
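To make the contrast concrete, here is a sketch of the same file read done both ways (useAsync: true requests overlapped I/O on Windows; this is illustrative and error handling is omitted):

using System.IO;
using System.Threading.Tasks;

static class FileReads
{
    // Synchronous read: the calling thread is blocked until the OS hands back the data.
    public static byte[] ReadBlocking(string path)
    {
        return File.ReadAllBytes(path);
    }

    // Asynchronous read: the I/O is started and the thread is released; the method
    // resumes on a thread-pool thread when the OS signals completion.
    public static async Task<byte[]> ReadNonBlockingAsync(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, bufferSize: 4096, useAsync: true))
        {
            var buffer = new byte[stream.Length];
            int offset = 0;
            while (offset < buffer.Length)
            {
                int read = await stream.ReadAsync(buffer, offset, buffer.Length - offset);
                if (read == 0) break; // unexpected end of file
                offset += read;
            }
            return buffer;
        }
    }
}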

Asynchronous operation and thread in C#

Asynchronous programming is a technique that calls a long-running method in the background so that the UI thread remains responsive. It should be used when calling a web service, a database query, or any I/O-bound operation. When the asynchronous method completes, it returns the result to the main thread. In this way, the program's main thread does not have to wait for the result of an I/O-bound operation and continues to execute without blocking/freezing the UI. This is OK.
As far as I know, the asynchronous method executes on a background worker thread. The runtime makes a thread available either from the thread pool, or it may create a brand new thread for its execution.
But I have read in many posts that an asynchronous operation may execute on a separate thread or without using any thread. Now I am very confused.
1) Could you please help clarifying in what situation an asynchronous operation will not use a thread?
2) What is the role of processor core in asynchronous operation?
3) How is it different from multithreading? I know one thing: multithreading is useful for compute-bound operations.
Please help.
IO (let's say a database-operation over the network) is a good example for all three:
you basically just register a callback that the OS will eventually call (maybe on a newly created thread) when the I/O operation finishes. There is no thread sitting around and waiting; the resumption is triggered by hardware events (or at least by an OS mechanism that usually lives outside user space)
it might have no role at all (see 1)
in multithreading you use more than one thread (your background thread), and that thread might sit there idle, doing nothing (but still using up system resources). This is of course different if you actually have something to compute (so the thread isn't idly waiting for external results); there it makes sense to use a background worker thread
Asynchronous operations don't actually imply much of anything about how they are processed, only that they would like the option to get back to you later with your results. By way of example:
They may (as you've mentioned) split off a compute-bound task onto an independent thread, but this is not the only use case.
They may sometimes complete synchronously within the call that launches them, in which case no additional thread is used. This may happen with an I/O request if there is already enough buffer content (input) or free buffer space (output) to service the request; a small sketch of this case follows the list.
They may simply drop off a long-running I/O request to the system; in this case the callback is likely to occur on a background thread after receiving notification from an I/O completion port.
On completion, a callback may be delivered later on the same thread; this is especially common with events within a UI framework, such as navigation in a WebBrowser.
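As a small sketch of the synchronous-completion case mentioned above (illustrative; the "usually True" comment is an expectation, not a guarantee):

using System;
using System.IO;
using System.Threading.Tasks;

class SynchronousCompletion
{
    static async Task Main()
    {
        // A MemoryStream already has its data in memory, so ReadAsync typically
        // completes synchronously: the returned task is already finished and the
        // await continues without any callback being scheduled.
        var stream = new MemoryStream(new byte[] { 1, 2, 3, 4 });
        var buffer = new byte[4];

        Task<int> pending = stream.ReadAsync(buffer, 0, buffer.Length);
        Console.WriteLine($"Completed before await: {pending.IsCompleted}"); // usually True

        int read = await pending;
        Console.WriteLine($"Bytes read: {read}");
    }
}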
Asynchrony doesn't say anything about threads. It's about having some kind of callbacks which will be handled inside a "state machine" (not strictly correct, but you can think of it like events). Asynchrony does not spin up threads, nor does it allocate significant system resources. You can run as many asynchronous methods as you want.
Threads do have a real impact on your system, and there is a huge but limited number you can have at once.
I/O operations are mostly handled by other controllers (HDD, NIC, ...). What happens if you create a thread is that a thread of your application, which has nothing to do, waits for the controllers to finish. With async, as Carsten and Jeffrey already mentioned, you just get some kind of callback mechanism, so your thread continues to do other work, run other methods and so on.
Also keep in mind that each thread costs resources (RAM, performance, handles; garbage collection gets worse, ...) and may even end up in exceptions (OutOfMemoryException, ...).
So when should you use threads? Only if you really need to. If there is an async API, use it unless you have really important reasons not to.
In the past the async APIs were really painful to use; that's why many people used threads whenever they just needed asynchrony.
For example, node.js refuses to use multiple threads at all!
This is especially important if you handle multiple requests, for example in services/websites where there is always work to do. There is also a short webcast with Jeffrey Richter about this which helped me to understand it.
Also have a look at this MSDN article.
PS: As a side effect, source code with async and await tends to be more readable.

How to achieve "true" asynchrony

In his answer to this question, Stephen Cleary refers to "fake" asynchrony and "true" asynchrony.
there's a much easier way to schedule work to the thread pool: Task.Run.
True asynchrony isn't possible, because you have a blocking method that you must use. So, all you can do is a workaround - fake asynchrony, a.k.a. blocking a thread pool thread.
How then is it possible to achieve true asynchrony, like the various methods in System.Threading.Tasks.Task? Aren't all "truly asynchronous" methods just blocking operations on some other thread if you dig deep enough?
Aren't all "truly asynchronous" methods just blocking operations on some other thread if you dig deep enough?
No. Truly asynchronous operations don't need a thread throughout the entire operation and using one limits scalability and hurts performance.
While most truly asynchronous operations are I/O operations, those can get overly complicated to understand. (For a deep dive, read There Is No Thread by Stephen Cleary.)
Let's say for example that you want to await a user's button click. Since there's a Button.Click event we can utilize TaskCompletionSource to asynchronously wait for the event to be raised:
var tcs = new TaskCompletionSource<bool>();
_button.Click += (sender, e) => tcs.SetResult(false);
await tcs.Task;
There's no non-generic TaskCompletionSource, so I use a bool one with a dummy value. This creates a Task which isn't connected to a thread; it's just a synchronization construct and will only complete when the user clicks that button (through SetResult). You can await that Task for ages without blocking any threads whatsoever.
Task.Delay for that matter is implemented very similarly with a System.Threading.Timer that completes the awaited task in its callback.
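A simplified sketch of that idea (not the actual BCL implementation, which also handles cancellation, zero/infinite delays and timer cleanup far more carefully):

using System;
using System.Threading;
using System.Threading.Tasks;

static class DelaySketch
{
    public static Task Delay(TimeSpan dueTime)
    {
        var tcs = new TaskCompletionSource<bool>();

        // The timer fires once and completes the task from its callback;
        // no thread waits in the meantime.
        var timer = new Timer(_ => tcs.TrySetResult(true), null, dueTime, Timeout.InfiniteTimeSpan);

        // Dispose the timer once the task has completed.
        tcs.Task.ContinueWith(_ => timer.Dispose(), TaskScheduler.Default);
        return tcs.Task;
    }
}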
Aren't all "truly asynchronous" methods just blocking operations on some other thread if you dig deep enough?
On the contrary. Truly asynchronous methods are async all the way down to the OS level. These types of methods by default don't block even at the device-driver level, using IRPs (I/O request packets) and DPCs.
See How does running several tasks asynchronously on UI thread using async/await work? where I went into detail about how overlapped I/O works all the way down.
How then is it possible to achieve true asynchrony, like the various methods in System.Threading.Tasks.Task?
A Task represents a unit of work which will complete in the future. By itself this has nothing to do with async I/O. The reason you might assume it does is that async-await works nicely with awaitables: most of the BCL libraries expose async I/O operations via Task (and its awaiter), and that's why you assume Task is the thing achieving the asynchrony.
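A hedged illustration of that distinction: both methods below hand you a Task, but only the second is truly asynchronous; the first just parks the blocking work on a thread-pool thread.

using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

static class TaskVersusAsyncIo
{
    // A Task, but not async I/O: a thread-pool thread sits blocked inside
    // ReadAllText for the whole duration of the read.
    public static Task<string> ReadOnPoolThread(string path) =>
        Task.Run(() => File.ReadAllText(path));

    // Truly asynchronous: the request is handed to the OS/network stack and no
    // thread is held while the response is in flight.
    public static Task<string> DownloadAsync(HttpClient client, string url) =>
        client.GetStringAsync(url);
}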

await Console.ReadLine()

I am currently building an asynchronous console application in which I have created classes to handle separate areas of the application.
I have created an InputHandler class which I envisioned would await Console.ReadLine() input. However, you cannot await such a function (since it is not async), so my current solution is simply:
private async Task<string> GetInputAsync() {
    return await Task.Run(() => Console.ReadLine());
}
which runs perfectly fine. However, my (limited) understanding is that calling Task.Run will fire off a new (parallel?) thread. This defeats the purpose of async methods, since that new thread is now being blocked until ReadLine() returns, right?
I know that threads are an expensive resource so I feel really wasteful and hacky doing this. I also tried Console.In.ReadLineAsync() but it is apparently buggy? (It seems to hang).
I know that threads are an expensive resource so I feel really wasteful and hacky doing this. I also tried Console.In.ReadLineAsync() but it is apparently buggy? (It seems to hang).
The console streams unfortunately do have surprising behavior. The underlying reason is that they block to ensure thread safety for the console streams. Personally I think that blocking in an asynchronous method is a poor design choice, but Microsoft decided to do this (only for console streams) and have stuck by their decision.
So, this API design forces you to use background threads (e.g., Task.Run) if you do want to read truly asynchronously. This is not a pattern you should normally use, but in this case (console streams) it is an acceptable hack to work around their API.
However, my (limited) understanding is that calling Task.Run will fire off a new (parallel?) thread.
Not quite. Task.Run will queue some work to the thread pool, which will have one of its threads execute the code. The thread pool manages the creation of threads as necessary, and you usually don't have to worry about it. So, Task.Run is not as wasteful as actually creating a new thread every time.
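Putting that together with the original wrapper, a usage sketch might look like this (the command handling is a placeholder):

using System;
using System.Threading.Tasks;

class Program
{
    // Acceptable workaround for console streams: hand the blocking ReadLine to
    // the thread pool so the rest of the application can keep awaiting it.
    static Task<string> GetInputAsync() => Task.Run(() => Console.ReadLine());

    static async Task Main()
    {
        while (true)
        {
            var line = await GetInputAsync();
            if (line == null || line == "quit") break;

            // Placeholder: dispatch the command to the rest of the application.
            Console.WriteLine($"You typed: {line}");
        }
    }
}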
