Multiplexing C# 5.0's async over a thread pool -- thread safe?

This may seem a little crazy, but it's an approach I'm considering as part of a larger library, if I can be reasonably certain that it's not going to cause weird behavior.
The approach:
Run async user code with a SynchronizationContext that dispatches to a thread pool. The user code would look something like:
async void DoSomething()
{
    int someState = 2;
    await DoSomethingAsync();
    someState = 4;
    await DoSomethingElseAsync();
    // someState guaranteed to be 4?
}
I'm not certain whether access to someState would be threadsafe. While the code would run in one "thread" such that the operations are, in fact, totally ordered, it could still be split across multiple threads beneath the hood. If my understanding is correct, ordering ought to be safe on x86, and since the variable isn't shared I won't need to worry about compiler optimizations and so on.
More importantly though, I'm concerned as to whether this will be guaranteed thread-safe under the ECMA or CLR memory models.
I'm fairly certain I'll need to insert a memory barrier before executing a queued piece of work, but I'm not totally confident in my reasoning here (and the approach might be unworkable for entirely separate reasons).

This is answered in the comments section of the async / await FAQ:
TPL includes the appropriate barriers when tasks are queued and at the beginning/end of task execution so that values are appropriately made visible.
So no explicit barriers are necessary.
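As a rough illustration of that claim, here is a minimal sketch (not the asker's library) in which a local variable is written before an await and read after a later one, with continuations free to resume on any thread-pool thread. The class name LocalStateDemo and the use of Task.Run/Task.Delay as stand-ins for the question's DoSomethingAsync and DoSomethingElseAsync are assumptions made for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LocalStateDemo
{
    // Hypothetical stand-in for the question's DoSomething(); it returns the
    // final value so a caller can inspect it.
    public static async Task<int> DoSomethingAsync()
    {
        int someState = 2;
        int startThread = Environment.CurrentManagedThreadId;

        // The continuation may resume on a different thread-pool thread.
        await Task.Run(() => Thread.Sleep(10)).ConfigureAwait(false);
        someState = 4;

        await Task.Delay(10).ConfigureAwait(false);
        Console.WriteLine(
            $"started on thread {startThread}, now on thread {Environment.CurrentManagedThreadId}");

        // The local lives in the compiler-generated state machine, and the
        // task infrastructure publishes it when the continuation is queued.
        return someState;
    }

    public static async Task Main()
    {
        Console.WriteLine(await DoSomethingAsync());
    }
}
```

The thread ids printed will vary from run to run; the returned value is the last one assigned, with no explicit barrier in user code.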

Related

Can you use ConfigureAwait(false) without being thread-safe?

I see people all over the place recommend using ConfigureAwait(false) where you can, and it is a must for library authors, and so on.
But since the continuation of ConfigureAwait(false) can run on any thread from thread pool, then how can you safely protect against multiple threads accessing the same state in your library?
Say you have the following API for your library:
async Task FooAsync()
{
    // Do something
    // barAsync and saveToFileAsync are private methods.
    await barAsync().ConfigureAwait(false);
    // counter is a private field
    counter++;
    await saveToFileAsync().ConfigureAwait(false);
    // Do other things
}
If a UI thread keeps calling this FooAsync (e.g. because of user pressing button), wouldn't this code corrupt the value of counter and the file saved? Since multiple threads might be executing?
I find it hard to fathom using ConfigureAwait(false) without being thread-safe, except for the simplest cases that do not modify state.
Update
I might not have been clear, but in our team, we decided we are going single-threaded. And so, from the answers below, it seems we can't use ConfigureAwait(false) then, since it introduces the possibility of parallelism, which needs to be controlled using locks and so on.

But since the continuation of ConfigureAwait(false) can run on any thread from thread pool, then how can you safely protect against multiple threads accessing the same state in your library?
await does introduce the possibility of reentrancy, but having it actually cause a problem is rare. Asynchronous code by its nature encourages a more functional kind of structure (inputs to a method are its parameters, and outputs are its return values). It's possible to have asynchronous methods have side effects and depend on state, but it's not terribly common.
Note that it is the await that causes accidental reentrancy. ConfigureAwait(false) resumes on the thread pool, but that doesn't cause the issue here.
If a UI thread keeps calling this FooAsync (e.g. because of user pressing button), wouldn't this code corrupt the value of counter and the file saved? Since multiple threads might be executing?
Yes and sort of. Yes, the counter may get an unexpected value, but it's not necessarily because of multiple threads. Consider the same code without ConfigureAwait(false): you still have multiple invocations of that function running, just on a single thread. They're still fighting over the counter and any other shared state. In that case, because of the single thread, counter++ is atomic, but because it's shared, a single invocation of that function may see the value unexpectedly change when resuming from an await.
With ConfigureAwait(false), you do have the additional concern of accidental parallelism (with await you have accidental reentrancy), so if you have non-threadsafe shared state, things can get worse. Reentrancy can cause unexpected states, but parallelism can cause invalid states.

ConfigureAwait is not about thread-safety. It's about avoiding capturing the context.
If you want your code to be thread-safe, then you should implement it to be. This usually involves using some kind of synchronization construct(s), such as for example a lock.
As already pointed out, your FooAsync() is not thread-safe even if you remove the calls to ConfigureAwait(false). Two or more threads can still call it simultaneously, even in a UI application where there is a SynchronizationContext available.
how can you safely protect against multiple threads accessing the same state in your library?
By synchronizing the access to any shared resource. Assuming counter is the only critical section in your code, you could make the method thread-safe using the Interlocked.Increment API:
async Task FooAsync()
{
    ...
    Interlocked.Increment(ref counter);
    ...
}
This will increment counter and store the new result as an atomic operation.
There are a bunch of other synchronization constructs as well. Which one to use basically depends on what you are doing. Avoiding ConfigureAwait(false) is not a way to make code thread-safe, though.
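When the critical section spans an await (for example, the counter update plus the file save), Interlocked alone does not cover it. One commonly used option is SemaphoreSlim, whose WaitAsync can be awaited and which has no thread affinity. The following is a minimal sketch under assumptions: the class name Library, the field _gate, and the Task.Delay standing in for barAsync/saveToFileAsync are all hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Library
{
    // One caller at a time may be inside the gate (hypothetical name).
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private int counter;

    public int Counter => counter;

    public async Task FooAsync()
    {
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            int snapshot = counter;
            // Stand-in for barAsync()/saveToFileAsync(); the continuation
            // may resume on any thread-pool thread.
            await Task.Delay(1).ConfigureAwait(false);
            // Safe: no other invocation can be between WaitAsync and Release.
            counter = snapshot + 1;
        }
        finally
        {
            _gate.Release();
        }
    }

    public static async Task Main()
    {
        var lib = new Library();
        await Task.WhenAll(Enumerable.Range(0, 100).Select(_ => lib.FooAsync()));
        Console.WriteLine(lib.Counter);
    }
}
```

Without the gate, the read-await-write sequence above would lose updates whenever two invocations overlap, whether through reentrancy on one thread or parallelism on several. The trade-off is that overlapping calls are serialized.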

Is there a neat way to force a pile of `async` C# code to run single-threadly as though it weren't actually `async`

Suppose (entirely hypothetically ;)) I have a big pile of async code.
10s of classes; 100s of async methods, of which 10s are actually doing async work (e.g. where we WriteToDbAsync(data) or we ReadFileFromInternetAsync(uri), or when we WhenAll(parallelTasks)).
And I want to do a bunch of diagnostic debugging on it. I want to perf profile it, and step through a bunch of it manually to see what's what.
All my tools are designed around synchronous C# code. They will sort of work with async, but it's definitely much less effective, and debugging is way harder, even when I try to directly manage the threads a bit.
If I'm only interested in a small portion of the code, then it's definitely a LOT easier to temporarily un-async that portion of the code: Read and Write synchronously, and just Task.Wait() on each of my "parallel" Tasks in sequence. But that's not viable if I want to poke around in a large swathe of the code.
Is there any way to ask C# to run some "async" code like that for me?
i.e. some sort of (() => MyAsyncMethod()).RunAsThoughAsyncDidntExist() which knows that any time it does real async communication with the outside world, it should just spin (within the same thread) until it gets an answer. Any time it's asked to run code in parallel ... don't; just run them in series on its single thread. etc. etc.
I'm NOT talking about just awaiting the Task, or calling Task.Wait(). Those won't change how that Task executes itself.
I strongly assume that this sort of thing doesn't exist, and I just have to live with my tools not being well architected for async code.
But it would be great if someone with some expertise in the area, could confirm that.
EDIT: (Because SO told me to explain why the suggestion isn't an answer)...
Sinatr suggested this: How do I create a custom SynchronizationContext so that all continuations can be processed by my own single-threaded event loop? but (as I understand it) that is going to ensure that each time there's an await command then the code after that await continues on the same thread. But I want the actual contents of the await to be on the same thread.

Keep in mind that asynchronous != parallel.
Parallel means running two or more pieces of code at the same time, which can only be done with multithreading. It's about how code runs.
Asynchronous code frees the current thread to do other things while it is waiting for something else. It is about how code waits.
Asynchronous code with a synchronization context can run on a single thread. It starts running on one thread, then fires off an I/O request (like an HTTP request), and while it waits there is no thread. Then the continuation (because there is a synchronization context) can happen on the same thread depending on what the synchronization context requires, like in a UI application where the continuation happens on the UI thread.
When there is no synchronization context, then the continuation can be run on any ThreadPool thread (but might still happen on the same thread).
So if your goal is to make it initially run and then resume all on the same thread, then the answer you were already referred to is indeed the best way to do it, because it's that synchronization context that decides how the continuation is executed.
However, that won't help you if there are any calls to Task.Run, because the entire purpose of that method is to run the work on a ThreadPool thread (and give you an asynchronous way to wait for that work to finish).
It also may not help if the code uses .ConfigureAwait(false) in any of the await calls, since that explicitly means "I don't need to resume on the synchronization context", so it may still run on a ThreadPool thread. I don't know if Stephen's solution does anything for that.
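To make the role of the synchronization context concrete, here is a minimal sketch of a single-threaded pump. It is an illustration under assumptions, not the implementation from the linked answer: the names SingleThreadContext, Run, and PumpDemo are hypothetical, and Send is deliberately left unsupported. As noted above, it only affects awaits that capture the context; Task.Run and ConfigureAwait(false) still leave the pump thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class SingleThreadContext : SynchronizationContext
{
    private readonly BlockingCollection<KeyValuePair<SendOrPostCallback, object>> _queue =
        new BlockingCollection<KeyValuePair<SendOrPostCallback, object>>();

    // Continuations captured by await arrive here and are queued.
    public override void Post(SendOrPostCallback d, object state) =>
        _queue.Add(new KeyValuePair<SendOrPostCallback, object>(d, state));

    public override void Send(SendOrPostCallback d, object state) =>
        throw new NotSupportedException("Send is not supported by this sketch.");

    // Runs func on the calling thread and pumps its continuations there
    // until the returned task completes.
    public static void Run(Func<Task> func)
    {
        SynchronizationContext previous = Current;
        var context = new SingleThreadContext();
        SetSynchronizationContext(context);
        try
        {
            Task task = func();
            // Stop the pump once the async method has finished.
            task.ContinueWith(_ => context._queue.CompleteAdding(), TaskScheduler.Default);
            foreach (var work in context._queue.GetConsumingEnumerable())
                work.Key(work.Value);
            task.GetAwaiter().GetResult(); // rethrow any exception
        }
        finally
        {
            SetSynchronizationContext(previous);
        }
    }
}

public static class PumpDemo
{
    public static bool StayedOnOneThread()
    {
        bool same = true;
        SingleThreadContext.Run(async () =>
        {
            int id = Environment.CurrentManagedThreadId;
            for (int i = 0; i < 5; i++)
            {
                await Task.Delay(5); // no ConfigureAwait(false): resume via the context
                same &= id == Environment.CurrentManagedThreadId;
            }
        });
        return same;
    }

    public static void Main() => Console.WriteLine(StayedOnOneThread());
}
```

The pump thread blocks in GetConsumingEnumerable while nothing is queued, so a debugger or profiler sees all of the method's own code on one thread, even though the timer behind Task.Delay still fires elsewhere.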
But if you really want it to "RunAsThoughAsyncDidntExist" and lock the current thread while it waits, then that's not possible. Take this code for example:
var myTask = DoSomethingAsync();
DoSomethingElse();
var results = await myTask;
This code starts an I/O request, then does something else while waiting for that request to finish, then finishes waiting and processes the results after. The only way to make that behave synchronously is to refactor it, since synchronous code isn't capable of doing other work while waiting. A decision would have to be made whether to do the I/O request before or after DoSomethingElse().

ReaderWriterLockSlim questions

In his answer here, https://stackoverflow.com/a/19664437/4919475
Stephen Cleary mentioned
ReaderWriterLockSlim is a thread-affine lock type, so it usually
cannot be used with async and await.
What did he mean by "usually"? When can ReaderWriterLockSlim be used?
Also, I've read here http://joeduffyblog.com/2007/02/07/introducing-the-new-readerwriterlockslim-in-orcas/ that ReaderWriterLockSlim has different quirks, but this article is from 2007. Did it change since then?

I guess you've posted a question that only Cleary can answer, because you want to know what he means.
In the meantime, the obvious inference from his statement is that you can get away with using ReaderWriterLockSlim with async/await in any situation where you are able to guarantee the same thread that acquired the lock will also be able to release it.
For example, you could imagine code like this:
private readonly ReaderWriterLockSlim _rwls = new ReaderWriterLockSlim();

async void button1_Click(object sender, EventArgs e)
{
    _rwls.EnterWriteLock();
    await ...;
    _rwls.ExitWriteLock();
}
In the above, because the Click event will be raised in a thread where await will return to, you can acquire the lock, execute the await, and still get away with releasing the lock in the continuation, because you know it'll be the same thread.
In many other uses of async/await, the continuation is not guaranteed to be in the thread in which the method yielded, and so it wouldn't be allowed to release the lock having acquired it previous to the await. In some cases, this is explicitly intentional (i.e. ConfigureAwait(false)), in other cases it's just a natural outcome of the context of the await. Either way, those scenarios aren't compatible with ReaderWriterLockSlim the way the Click example would be.
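The thread affinity itself is easy to observe. The sketch below (the class name AffinityDemo is hypothetical) acquires the write lock on one thread and tries to release it from another, which is what would happen if a continuation resumed on a thread-pool thread. A dedicated Thread is used instead of a task so the second call is guaranteed to run on a different thread.

```csharp
using System;
using System.Threading;

public static class AffinityDemo
{
    public static bool ExitFromOtherThreadThrows()
    {
        var rwls = new ReaderWriterLockSlim();
        bool threw = false;

        rwls.EnterWriteLock();
        try
        {
            // Simulates a continuation that resumed on some other thread.
            var other = new Thread(() =>
            {
                try
                {
                    rwls.ExitWriteLock();
                }
                catch (SynchronizationLockException)
                {
                    // This thread never entered the lock, so it may not exit it.
                    threw = true;
                }
            });
            other.Start();
            other.Join();
        }
        finally
        {
            rwls.ExitWriteLock(); // released by the thread that acquired it
        }
        return threw;
    }

    public static void Main() => Console.WriteLine(ExitFromOtherThreadThrows());
}
```

In the Click example the release works only because the UI synchronization context brings the continuation back to the acquiring thread.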
(I am intentionally ignoring the larger question of whether it's a good idea to acquire a lock and then hold it for the duration of a potentially long-running asynchronous operation. That is, as they say, "a whole 'nother ball o' wax" .)
Addendum:
A "short" comment, which is too long to be an actual comment, regarding the "larger question" I am ignoring…
The "larger question" is fairly broad and highly context-dependent. It's why I didn't address it. The short version is in two parts:
In general, locks should be held for brief periods of time, but asynchronous operations are known to be potentially long in duration, so the two are mutually disagreeable. Locks are a necessary evil when doing concurrent operations, but they will always to some extent negate the benefit of doing things concurrently, because they have the effect of serializing otherwise-concurrent operations. The longer you hold a lock, the greater the likelihood of one or more threads getting blocked waiting for something, serializing whatever work they have. They are all waiting on the same lock, so even once the long-running lock is released, they still will all have to work in order, not concurrently. It's a bit like a traffic jam where a long queue of cars is waiting for a construction truck to finish blocking the road…even once the truck is out of the way, it will take some significant time to clear the jam. I would not say it is inherently bad to hold a lock during an asynchronous operation (that is, I can imagine carefully thought-out scenarios where it would be okay), but it very often will undermine other implementation goals, and can in some cases completely undo a design meant to be heavily concurrent, especially when done without great care.
Semantically it's easy to make a mistake, i.e. with await you know the lock remains held for the duration, but "fire-and-forget" is not uncommon, and would lead to the code appearing to hold the lock while an asynchronous operation is occurring when in reality it does not (see the Stack Overflow question What happens to a lock during an Invoke/BeginInvoke? (event dispatching) for an example of someone who did exactly this, and didn't even realize it). One methodology for avoiding buggy code is to simply avoid patterns of coding known to potentially lead to bugs. Again, if one is careful enough, one can avoid the bug. But it is generally better to simply change the implementation to use a less tricky approach, and to be in the habit of doing so.

I noticed over on this question that you had asked:
Can you explain what you mean by "arbitrary code"?
I believe this note highlights an important aspect to "the larger question" which I will try--briefly, as I am also pressed for time--to address here. One of the main concerns here is that an await statement cannot guarantee the Task it awaits will run within the same context (particularly, in the case of thread-affine locks, on the same thread) as the calling code; this, in fact, would defeat much of the purpose of the Task promise.
Let's say the Task you await, somewhere down the line, awaits a Task created using Task.Run, is otherwise on another thread, or has yielded the current thread to await some background resource (like disk or network I/O). Under these conditions there are at least two unexpected behaviors which would be easy to accidentally come across:
If the code executing in the other thread attempts to obtain the same lock as the calling code that is awaiting it; the calling thread owns the lock and since the sub-task is executing on a different thread it cannot obtain the lock until the calling thread releases it, which it will not do because it is awaiting the sub-task that has not completed. If the second attempt to lock was on the same thread as the first, the lock would recognize that this thread has already acquired the lock and would allow the second lock attempt to proceed. Since they are not on the same thread this becomes a self-dependent deadlock and will either halt both the calling thread and the sub-task or will timeout, depending on the locking methods used. Most other deadlocks require using 2 or more locks in differing order across multiple code paths where each path holds a lock the other is waiting on.
If the calling thread is the UI thread (or some other context with a message pump which can continue processing requests while a previous request is awaiting asynchronous behavior), assuming it awaits a Task executing in another thread which takes long enough to process that the message pump begins processing another message (like another click to the same button, or any other "arbitrary code" which might want the same lock), that new message is executing on the same thread which owns the lock and is therefore allowed to proceed even though the previous Task has not completed, thus allowing arbitrary access to resources that are supposed to be synchronized.
While the former could cause your application or some component of it to lock up, the latter of these issues can yield very unexpected results and be especially tricky to troubleshoot. Similar conditions exist for all thread-affine locking mechanisms (like Monitor which is the underlying implementation of the lock keyword). Hope that helps.
If you're interested in more about parallelism patterns in C#, I might recommend the free Threading in C# e-book (which is actually an excerpt from the otherwise excellent book "C# in a Nutshell")

ConfigureAwait(false) on Top Level Requests

I'm trying to figure out if ConfigureAwait(false) should be used on top level requests. Reading this post from something of an authority on the subject:
http://blog.stephencleary.com/2012/07/dont-block-on-async-code.html
...he recommends something like this:
public async Task<JsonResult> MyControllerAction(...)
{
    try
    {
        var report = await _adapter.GetReportAsync();
        return Json(report, JsonRequestBehavior.AllowGet);
    }
    catch (Exception ex)
    {
        return Json("myerror", JsonRequestBehavior.AllowGet); // really slow without configure await
    }
}

public async Task<TodaysActivityRawSummary> GetReportAsync()
{
    var data = await GetData().ConfigureAwait(false);
    return data;
}
...it says to use ConfigureAwait(false) on every await except the top-level call. However, when doing this my exception takes several seconds to return to the caller, versus using it at the top level and having it come back right away.
What is the best practice for MVC controller actions that call async methods? Should I use ConfigureAwait in the controller itself or just in the service calls that use awaits to request data, etc.? If I don't use it on the top level call, waiting several seconds for the exception seems problematic. I don't need the HttpContext and I've seen other posts that said always use ConfigureAwait(false) if you don't need the context.
Update:
I was missing ConfigureAwait(false) somewhere in my chain of calls which was causing the exception to not be returned right away. However the question still remains as posted as to whether or not ConfigureAwait(false) should be used at the top level.

Is it a high traffic website? One possible explanation might be that you're experiencing ThreadPool starvation when you are not using ConfigureAwait(false). Without ConfigureAwait(false), the await continuation is queued via AspNetSynchronizationContext.Post, whose implementation boils down to this:
Task newTask = _lastScheduledTask.ContinueWith(_ => SafeWrapCallback(action));
_lastScheduledTask = newTask; // the newly-created task is now the last one
Here, ContinueWith is used without TaskContinuationOptions.ExecuteSynchronously (I'd speculate, to make continuations truly asynchronous and reduce a chance for low stack conditions). Thus, it acquires a vacant thread from ThreadPool to execute the continuation on. In theory, it might happen to be the same thread where the antecedent task for await has finished, but most likely it'd be a different thread.
At this point, if the ASP.NET thread pool is starving (or has to grow to accommodate a new thread request), you might be experiencing a delay. It's worth mentioning that the thread pool consists of two sub-pools: IOCP threads and worker threads (check this and this for some extra details). Your GetReportAsync operation is likely to complete on the IOCP thread sub-pool, which doesn't seem to be starving. OTOH, the ContinueWith continuation runs on the worker thread sub-pool, which appears to be starving in your case.
This is not going to happen if ConfigureAwait(false) is used all the way through. In that case, all await continuations will run synchronously on the same threads where the corresponding antecedent tasks have ended, be it IOCP or worker threads.
You can compare the thread usage for both scenarios, with and without ConfigureAwait(false). I'd expect this number to be larger when ConfigureAwait(false) isn't used:
catch (Exception ex)
{
    Log("Total number of threads in use={0}",
        Process.GetCurrentProcess().Threads.Count);
    return Json("myerror", JsonRequestBehavior.AllowGet); // really slow without configure await
}
You can also try increasing the size of the ASP.NET thread pool (for diagnostics purpose, rather than an ultimate solution), to see if the described scenario is indeed the case here:
<configuration>
  <system.web>
    <applicationPool
        maxConcurrentRequestsPerCPU="6000"
        maxConcurrentThreadsPerCPU="0"
        requestQueueLimit="6000" />
  </system.web>
</configuration>
Updated to address the comments:
I realized I was missing a ContinueAwait somewhere in my chain. Now it
works fine when throwing an exception even when the top level doesn't
use ConfigureAwait(false).
This suggests that your code or a 3rd party library in use might be using blocking constructs (Task.Result, Task.Wait, WaitHandle.WaitOne, perhaps with some added timeout logic). Have you looked for those? Try the Task.Run suggestion from the bottom of this update. Besides, I'd still do the thread count diagnostics to rule out thread pool starvation/stuttering.
So are you saying that if I DO use ContinueAwait even at the top level
I lose the whole benefit of the async?
No, I'm not saying that. The whole point of async is to avoid blocking threads while waiting for something, and that goal is achieved regardless of the added value of ConfigureAwait(false).
What I'm saying is that not using ConfigureAwait(false) might introduce redundant context switching (which usually means thread switching), which might be a problem in ASP.NET if the thread pool is working at its capacity. Nevertheless, a redundant thread switch is still better than a blocked thread, in terms of server scalability.
In all fairness, using ConfigureAwait(false) might also cause redundant context switching, especially if it's used inconsistently across the chain of calls.
That said, ConfigureAwait(false) is also often misused as a remedy against deadlocks caused by blocking on asynchronous code. That's why I suggested above to look for those blocking constructs across the whole code base.
However the question still remains as posted as to whether or not
ConfigureAwait(false) should be used at the top level.
I hope Stephen Cleary could elaborate better on this, but here are my thoughts.
There's always some "super-top level" code that invokes your top-level code. E.g., in case of a UI app, it's the framework code which invokes an async void event handler. In case of ASP.NET, it's the asynchronous controller's BeginExecute. It is the responsibility of that super-top level code to make sure that, once your async task has completed, the continuations (if any) run on the correct synchronization context. It is not the responsibility of the code of your task. E.g., there might be no continuations at all, like with a fire-and-forget async void event handler; why would you care to restore the context inside such handler?
Thus, inside your top-level methods, if you don't care about the context for await continuations, do use ConfigureAwait(false) as soon as you can.
Moreover, if you're using a 3rd party library which is known to be context agnostic but still might be using ConfigureAwait(false) inconsistently, you may want to wrap the call with Task.Run or something like WithNoContext. You'd do that to get the chain of the async calls off the context, in advance:
var report = await Task.Run(() =>
    _adapter.GetReportAsync()).ConfigureAwait(false);
return Json(report, JsonRequestBehavior.AllowGet);
This would introduce one extra thread switch, but might save you a lot more of those if ConfigureAwait(false) is used inconsistently inside GetReportAsync or any of its child calls. It'd also serve as a workaround for potential deadlocks caused by those blocking constructs inside the call chain (if any).
Note however, in ASP.NET HttpContext.Current is not the only static property which is flowed with AspNetSynchronizationContext. E.g., there's also Thread.CurrentThread.CurrentCulture. Make sure you really don't care about losing the context.
Updated to address the comment:
For brownie points, maybe you can explain the effects of
ConfigureAwait(false)... What context isn't preserved.. Is it just the
HttpContext or the local variables of the class object, etc.?
All local variables of an async method are preserved across await, as well as the implicit this reference - by design. They actually get captured into a compiler-generated async state machine structure, so technically they don't reside on the current thread's stack. In a way, it's similar to how a C# delegate captures local variables. In fact, an await continuation callback is itself a delegate passed to ICriticalNotifyCompletion.UnsafeOnCompleted (implemented by the awaiter of the object being awaited; for Task, it's TaskAwaiter; with ConfigureAwait, it's ConfiguredTaskAwaitable.ConfiguredTaskAwaiter).
OTOH, most of the global state (static/TLS variables, static class properties) is not automatically flowed across awaits. What does get flowed depends on a particular synchronization context. In the absence of one (or when ConfigureAwait(false) is used), the only global state preserved is what gets flowed by ExecutionContext. Microsoft's Stephen Toub has a great post on that: "ExecutionContext vs SynchronizationContext". He mentions SecurityContext and Thread.CurrentPrincipal, which is crucial for security. Other than that, I'm not aware of any officially documented and complete list of global state properties flowed by ExecutionContext.
You could peek into ExecutionContext.Capture source to learn more about what exactly gets flowed, but you shouldn't depend on this specific implementation. Instead, you can always create your own global state flow logic, using something like Stephen Cleary's AsyncLocal (or .NET 4.6 AsyncLocal<T>).
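As a small illustration of the .NET 4.6 AsyncLocal<T> option, the sketch below sets an ambient value, awaits with ConfigureAwait(false), and reads the value back in the continuation. The names FlowDemo and RequestId and the value "req-42" are made up for the example; Task.Delay stands in for real asynchronous work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FlowDemo
{
    // Ambient value that travels with ExecutionContext (hypothetical name).
    private static readonly AsyncLocal<string> RequestId = new AsyncLocal<string>();

    public static async Task<string> ReadAfterAwaitAsync()
    {
        RequestId.Value = "req-42";

        // No synchronization context is captured here, so the continuation
        // may run on any thread-pool thread.
        await Task.Delay(10).ConfigureAwait(false);

        // ExecutionContext is still flowed, so the value is available.
        return RequestId.Value;
    }

    public static async Task Main()
    {
        Console.WriteLine(await ReadAfterAwaitAsync());
    }
}
```

A [ThreadStatic] field used the same way would not be reliable, because it is tied to whichever thread happens to run the continuation.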
Or, to take it to the extreme, you could also ditch ConfigureAwait altogether and create a custom awaiter, e.g. like this ContinueOnScope. That would allow you to have precise control over what thread/context to continue on and what state to flow.

However the question still remains as posted as to whether or not ConfigureAwait(false) should be used at the top level.
The rule of thumb for ConfigureAwait(false) is to use it whenever the rest of your method doesn't need the context.
In ASP.NET, the "context" is not actually well-defined anywhere. It does include things like HttpContext.Current, user principal, and user culture.
So, the question really comes down to: "Does Controller.Json require the ASP.NET context?" It's certainly possible that Json doesn't care about the context (since it can write the current response from its own controller members), but OTOH it does do "formatting", which may require the user culture to be resumed.
I don't know whether Json requires the context, but it's not documented one way or the other, and in general I assume that any calls into ASP.NET code may depend on the context. So I would not use ConfigureAwait(false) at the top-level in my controller code, just to be on the safe side.

Clarification on tasks in .net

I'm trying to understand tasks in .NET. From what I understand, they are better than threads because they represent work that needs to get done, and when there is an idle thread the work just gets picked up and worked on, allowing the full CPU to be utilized.
I see the Task<ActionResult> all over a new mvc 5 project and I would like to know why this is happening?
Does it make sense to always do this, or just when there can be blocking work in the function?
I'm guessing that since this does act like a thread, there are still sync objects that may be needed. Is this correct?

MVC 5 uses Task<ActionResult> to allow it to be fully asynchronous. By using Task<T>, the methods can be implemented using the new async and await language features, which allows you to compose asynchronous IO functions with MVC in a simple manner.
When working with MVC, in general, the Task<T> will hopefully not be using threads - they'll be composing asynchronous operations (typically IO bound work). Using threads on a server, in general, will reduce your overall scalability.

A Task does not represent a thread, even logically. It's not just an alternate implementation of threads. It's a higher level concept. A Task is the representation of an asynchronous operation that will complete at some point (usually in the future).
That task could represent code being run on another thread, it could represent some asynchronous IO operation that relies on OS interrupts to (indirectly, through a few other layers of indirection) cause the task to be marked completed, it could be the result of two other tasks being completed, or the continuation of some other task being completed, it could be an indication of when an event next fires, or some custom TaskCompletionSource that has who knows what as its implementation.
But you don't need to worry about all of those options. That's the point. In other models you need to treat all of those different types of asynchronous operations differently, complicating your asynchronous programs. The use of Task allows you to write code that can easily be composed with any and every type of asynchronous operation.
I'm guessing that since this does act like a thread, there are still sync objects that may be needed. Is this correct?
Technically, yes. There are times where you may need to use these, but largely, no. Ideally, if you're using idiomatic practices, you can avoid this, at least in most cases. Generally, when one task depends on code running in other tasks, it should be a continuation of those tasks, and information is passed between tasks through the tasks' Result property. The use of Result doesn't require any synchronization mechanisms, so usually you can avoid them entirely.
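A minimal sketch of that idea: several tasks compute values on thread-pool threads, and the dependent code consumes their results by awaiting them rather than through shared state guarded by locks. The names ResultDemo, SquareAsync, and SumOfSquaresAsync are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ResultDemo
{
    // Each task owns its own data and hands back a result; nothing is shared.
    private static Task<int> SquareAsync(int x) => Task.Run(() => x * x);

    public static async Task<int> SumOfSquaresAsync()
    {
        // The continuation after this await receives all results at once.
        int[] parts = await Task.WhenAll(SquareAsync(1), SquareAsync(2), SquareAsync(3));
        return parts.Sum();
    }

    public static async Task Main()
    {
        Console.WriteLine(await SumOfSquaresAsync());
    }
}
```

Because each value is handed over through task completion, no lock or Interlocked call is needed in this example.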
I see the Task<ActionResult> all over a new mvc 5 project and I would like to know why this is happening?
When you're going to make something asynchronous it generally makes sense to make everything asynchronous (or nothing). Mixing and matching just...doesn't work. Asynchronous code relies on having every method take very little time to execute so that the message pump can get back to processing its queue of pending tasks/continuations. Mixing asynchronous code and synchronous code makes it very likely to deadlock your application, and also defeats most of the purposes of using asynchrony to begin with (which is to avoid blocking threads).
