How does HttpContext.Current work in a multi-threaded environment? - c#

So I'm left wondering how exactly asp.net is able to scope a static property, when (to my knowledge) asp.net is multi-threaded.
One theory goes that the ASP.NET guys maintain a different appdomain for every request ... but that doesn't seem feasible.
Another theory goes that the .Current method looks at the current Thread, and then uses that to look up the http context in some hashtable (or other static storage mechanism).
Either way, it's a technique that seems really useful ... I'd like to utilize it, but definitely don't want to be debugging shared state bugs :-/

It isn't an AppDomain per-request. If you want to use thread-specific state, try:
[ThreadStatic]
private static int foo;
public static int Foo {get {return foo;} set {foo = value;}}
Each thread now gets its own value of Foo (or rather: 'foo').
This is not to be used lightly - it does have costs, but is a valid way of sharing state on a per-thread basis. I've used this once, maybe twice - and I've written a lot of C#. Don't over-use it...
In particular, watch out for initialization issues (i.e. forgetting to do it), and remember to clean up after yourself etc. And be very careful if you use any async code, as any callbacks/workers/etc will have different state.
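To make the per-thread behaviour concrete, here is a minimal sketch. The class and method names (ThreadStaticDemo, ReadOnNewThread) are made up for illustration; only [ThreadStatic] and Thread come from the framework. It sets the field on one thread and reads it from a fresh thread, which sees the default value rather than the one set elsewhere.

```csharp
using System;
using System.Threading;

public static class ThreadStaticDemo
{
    // Each thread gets its own copy of this field.
    [ThreadStatic]
    private static int foo;

    // Reads foo from a brand-new thread, which has never assigned it.
    public static int ReadOnNewThread()
    {
        int seen = -1;
        var t = new Thread(() => { seen = foo; });
        t.Start();
        t.Join();
        return seen;
    }

    public static void Main()
    {
        foo = 42;                             // visible on the main thread only
        Console.WriteLine(foo);               // prints 42
        Console.WriteLine(ReadOnNewThread()); // prints 0 (the new thread's own copy)
    }
}
```

This is also the initialization pitfall mentioned above: a field initializer on a [ThreadStatic] field only runs for the first thread that touches the type, so every other thread starts from the default value.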

What Marc describes is the easiest option and most likely what you are after. However, ASP.NET is actually somewhat more complicated than what [ThreadStatic] does, because a single request can be processed by multiple threads. What I believe happens with ASP.NET is that the executing thread is explicitly told to switch context: the hosting environment schedules the threads and knows which HttpContext needs executing, so it finds a thread, tells that thread which context it should run in, and then sends it off on its way.
So the solution really isn't all that pretty, sadly, whereas [ThreadStatic] is much simpler and probably suits your needs 95% of the time.

Related

Safety of AsyncLocal in ASP.NET Core

For .NET Core, AsyncLocal is the replacement for CallContext. However, it is unclear how "safe" it is to use in ASP.NET Core.
In ASP.NET 4 (MVC 5) and earlier, the thread-agility model of ASP.NET made CallContext unstable. Thus in ASP.NET the only safe way to achieve the behavior of a per-request logical context, was to use HttpContext.Current.Items. Under the covers, HttpContext.Current.Items is implemented with CallContext, but it is done in a way that is safe for ASP.NET.
In contrast, in the context of OWIN/Katana Web API, the thread-agility model was not an issue. I was able to use CallContext safely, after careful considerations of how correctly to dispose it.
But now I'm dealing with ASP.NET Core. I would like to use the following middleware:
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public class MultiTenancyMiddleware
{
    private readonly RequestDelegate next;
    static int random;
    private static AsyncLocal<string> tenant = new AsyncLocal<string>();
    //This is the new form of "CallContext".
    public static AsyncLocal<string> Tenant
    {
        get { return tenant; }
        private set { tenant = value; }
    }
    //This is the new version of [ThreadStatic].
    public static ThreadLocal<string> LocalTenant;
    public MultiTenancyMiddleware(RequestDelegate next)
    {
        this.next = next;
    }
    public async Task Invoke(HttpContext context)
    {
        //Just some garbage test value...
        Tenant.Value = context.Request.Path + random++;
        await next.Invoke(context);
        //using (LocalTenant = new ThreadLocal<string>()) {
        //    Tenant.Value = context.Request.Path + random++;
        //    await next.Invoke(context);
        //}
    }
}
So far, the above code seems to be working just fine. But there is at least one red flag. In the past, it was critical to ensure that CallContext was treated like a resource that must be freed after each invocation.
Now I see there is no self-evident way to "clean up" AsyncLocal.
I included code, commented out, showing how ThreadLocal<T> works. It is IDisposable, and so it has an obvious clean-up mechanism. In contrast, the AsyncLocal is not IDisposable. This is unnerving.
Is this because AsyncLocal is not yet in release-candidate condition? Or is this because it is truly no longer necessary to perform cleanup?
And even if AsyncLocal is being used properly in my above example, are there any kinds of old-school "thread agility" issues in ASP.NET Core that are going to make this middleware unworkable?
Special Note
For those unfamiliar with the issues CallContext has within ASP.NET apps, in this SO post, Jon Skeet references an in-depth discussion about the problem (which in turn references commentary from Scott Hanselman). This "problem" is not a bug - it is just a circumstance that must be carefully accounted for.
Furthermore, I can personally attest to this unfortunate behavior. When I build ASP.NET applications, I normally include load tests as part of my automation test infrastructure. It is during load tests that I can witness CallContext become unstable (where perhaps 2% to 4% of requests show CallContext being corrupted). I have also seen cases where a Web API GET has stable CallContext behavior, but the POST operations are all unstable. The only way to achieve total stability is to rely on HttpContext.Current.Items.
However, in the case of ASP.NET Core, I cannot rely on HttpContext.Items...there is no such static access point. I'm also not yet able to create load tests for the .NET Core apps I'm tinkering with, which is partly why I've not answered this question for myself. :)
Again: Please understand that the "instability" and "problem" I'm discussing is not a bug at all. CallContext is not somehow flawed. The issue is simply a consequence of the thread dispatch model employed by ASP.NET. The solution is simply to know the issue exists, and to code accordingly (e.g. use HttpContext.Current.Items instead of CallContext, when inside an ASP.NET app).
My goal with this question is to understand how this dynamic applies (or does not) in ASP.NET Core, so that I don't accidentally build unstable code when using the new AsyncLocal construct.
I'm just looking into the source code of the ExecutionContext class for CoreClr:
https://github.com/dotnet/coreclr/blob/775003a4c72f0acc37eab84628fcef541533ba4e/src/mscorlib/src/System/Threading/ExecutionContext.cs
Based on my understanding of the code, the async local values are fields/variables of each ExecutionContext instance. They are not based on ThreadLocal or any thread-specific persisted data store.
To verify this, I tested with thread pool threads: an instance left in an async local value is not accessible when the same thread pool thread is reused, and the finalizer of the "left" instance was called on the next GC cycle, meaning the instance is collected as expected.
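A small sketch of that behaviour, under stated assumptions: the class AsyncLocalDemo and its methods are invented for illustration, and each HandleAsync call merely stands in for one request. Two simulated requests each set the AsyncLocal, await, and read the value back; each sees only its own value, and the code that started them never sees either.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLocalDemo
{
    private static readonly AsyncLocal<string> Tenant = new AsyncLocal<string>();

    // Stand-in for one request: set a value, yield to the thread pool, read it back.
    public static async Task<string> HandleAsync(string name)
    {
        Tenant.Value = name;
        await Task.Delay(10);   // the continuation may run on a different pool thread
        return Tenant.Value;    // still the value set by this logical flow
    }

    // Reads the value from the caller's execution context.
    public static string ReadFromCaller()
    {
        return Tenant.Value;
    }

    public static void Main()
    {
        string[] results = Task.WhenAll(HandleAsync("a"), HandleAsync("b"))
                               .GetAwaiter().GetResult();
        Console.WriteLine(string.Join(", ", results)); // prints a, b
        Console.WriteLine(ReadFromCaller() == null);   // prints True
    }
}
```

The value set inside an async method does not flow back to the caller, which is why nothing has to be disposed when the request ends: the value is dropped along with the execution context that carried it.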
Adding my two cents if someone lands on this page (like I did) after googling if AsyncLocal is "safe" in ASP.NET classic (non Core) application (some commenters have been asking this, and also I see a deleted answer asking about the same).
I wrote a small test that simulates ASP.NET's ThreadPool behavior. These are the findings:
AsyncLocal is always cleared between requests, even if the thread pool re-uses an existing thread. So it is "safe" in that regard: no data will be leaked to another request.
However, AsyncLocal can be cleared even within the same request (for example, between code that runs in global.asax and the code that runs in a controller), because MVC methods sometimes run on a separate thread from non-MVC code; see this question for example: asp.net mvc 4, thread changed by model binding?
Using ThreadLocal is not safe because it preserves the value after a thread from the thread pool is re-used. Never use ThreadLocal in web applications. I know the question is not about ThreadLocal; I'm just adding this warning for whoever is considering using it, sorry.
Tested under ASP.NET MVC 5 .NET 4.7.2.
Overall, AsyncLocal seems like a perfect alternative to short-time caching stuff in HttpContext.Current in cases where you can't access the latter directly. You might end up re-calculating the cached value a bit more often though, but that's not a big problem.

Threadpool management of shared variables in .NET

Let's say I have a timer (e.g. a System.Timers.Timer), and we know each Elapsed event will get put onto the thread pool. If events come rapidly enough, how does the thread pool manage access to shared variables (e.g. a global int counter)? Does the manager use semaphores/locks under the hood?
Or does it not do anything, and just simply make a copy of the shared variables at the start of the thread pool work item, so that the last thread to finish sets the final variable value?
Unfortunately I can't really test this, because the order in which the events fire is not guaranteed (e.g. using a counter variable is not reliable) between elapsed events, as they may be fired out of order.
Thanks
You have to manage multi-threaded access to shared variables yourself.
There are many answers on StackOverflow and Google explaining how to do this, search for "thread safety C#".
I've worked on huge projects with many potential threading issues, and the code I write just works. I'm damn good at writing thread safe code these days, as I've already made all of the possible mistakes.
If you are just learning to write thread safe code, then its easy to get overwhelmed by the huge amount of information out there. You might find some pages that cover the 8 different types of synchronization primitives. You will find huge discussions on the topic, and only half of them will be helpful.
If you are following the learning curve for the first time, I would recommend that you ignore said noise for now, and instead focus on mastering these two rules first:
Rule 1
If any two threads write to some shared primitive (like a long or a Dictionary or a List), put a lock around the access to this shared primitive. Aim for a situation so that when the lock is finished, the data structure is completely updated. This is the heart of writing thread safe code: all other rules for threading can be derived from this one.
Example:
// This lock object should be initialized once on program startup, and should be global.
static readonly object _dictLock = new object();
// This data structure can be accessed by multiple threads.
public static Dictionary<string, int> dict = new Dictionary<string, int>();

lock (_dictLock)
{
    if (dict.ContainsKey("Hello") == false)
    {
        dict.Add("Hello", 42);
    }
} // Lock exits: data structure is now completely 100% updated. Google "atomic access C#".
Rule 2
Try not to have locks within locks. This can create deadlocks if the locks are entered in the wrong order. If you only lock around the primitives (e.g. dictionary, long, string, etc), then this shouldn't be an issue.
Guideline 1
If you are just learning, use nothing but lock; see how to use lock. It's difficult to go wrong if you use just this, as the lock is automatically released when the lock block exits. You can graduate to other types of locks, like reader-writer locks, later on. Don't bother with ConcurrentDictionary or Interlocked.Increment yet - focus on getting the basics correct.
Guideline 2
Try to spend as little time in locks as possible. Don't put a lock around a huge block of code; put locks around the smallest possible portions of the code, usually a dictionary or a long. A lock is blindingly fast unless it's contended, so this technique seems to work well to create thread safe code that is fast.
Cause of 95% of meaningful threading issues?
In my experience, the single biggest cause of thread-unsafe code is Dictionary. Even ConcurrentDictionary is not immune to this - it needs manual locking to be correct if the access is spread over multiple lines. If you get this right, you will eliminate 95% of meaningful threading issues in your code.
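A hedged sketch of the multi-line access problem: HitCounter, Record and Run are hypothetical names. Reading a value and writing it back are two separate calls, so even with ConcurrentDictionary another thread can slip in between them; the lock makes the pair behave as one step.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class HitCounter
{
    private static readonly object Gate = new object();
    private static readonly ConcurrentDictionary<string, int> Hits =
        new ConcurrentDictionary<string, int>();

    // Read-modify-write spread over several lines: needs the lock to be atomic.
    public static void Record(string key)
    {
        lock (Gate)
        {
            int current;
            Hits.TryGetValue(key, out current); // current stays 0 if the key is missing
            Hits[key] = current + 1;
        }
    }

    // Runs the given number of workers, each recording the same key repeatedly.
    public static int Run(int workers, int perWorker)
    {
        Hits.Clear();
        var tasks = new Task[workers];
        for (int i = 0; i < workers; i++)
        {
            tasks[i] = Task.Run(() =>
            {
                for (int j = 0; j < perWorker; j++) Record("home");
            });
        }
        Task.WaitAll(tasks);
        return Hits["home"];
    }

    public static void Main()
    {
        Console.WriteLine(Run(4, 1000)); // prints 4000
    }
}
```

Without the lock, two workers can read the same current value and one increment is lost. ConcurrentDictionary.AddOrUpdate is the built-in way to do this particular update in a single call.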
The thread pool can't magically make your shared mutable variables thread-safe. It has no control over them and it does not even know they exist.
Be aware of the fact that timer ticks can happen concurrently (even at low frequencies) and after the timer has been disposed. You need to perform any synchronization necessary.
The thread pool itself is thread-safe in the sense that it can successfully process concurrent work items (which is kind of the point).
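For the counter from the question, a minimal sketch of doing the synchronization yourself might look like this. CounterDemo and Run are invented names, and Task.Run work items stand in for the timer callbacks, since both end up on the thread pool.

```csharp
using System;
using System.Threading.Tasks;

public static class CounterDemo
{
    private static readonly object CounterLock = new object();
    private static int counter;

    // Starts several thread pool work items that all increment the shared counter.
    public static int Run(int workers, int incrementsPerWorker)
    {
        counter = 0;
        var tasks = new Task[workers];
        for (int i = 0; i < workers; i++)
        {
            tasks[i] = Task.Run(() =>
            {
                for (int j = 0; j < incrementsPerWorker; j++)
                {
                    // The pool does nothing for us here; the lock is what
                    // keeps concurrent increments from being lost.
                    lock (CounterLock) { counter++; }
                }
            });
        }
        Task.WaitAll(tasks);
        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(Run(8, 10000)); // prints 80000
    }
}
```

Removing the lock makes the final count unpredictable, which is the behaviour the question was trying to observe.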

Thread safety of HttpContext

After a fair bit of Googling I have not found any authoritative, conclusive information regarding the thread safety of HttpContext.
I'm looking at a scenario such as:
public class AsyncHandler : IHttpAsyncHandler
{
    void BeginProcessRequest(...)
    {
        // Spawn some tasks in parallel, provide them with current HttpContext as argument.
    }
    void EndProcessRequest(...) {...}
}
My (io bound) parallel tasks would want to access HttpContext, potentially at the same time.
Looking around at various posts it seems that this is safe, however, I'd like some actual evidence. Of course MSDN gives the usual "statics are threadsafe etc.", but this doesn't help other than I'd have to assume it's not threadsafe.
I've seen various posts on StackOverflow (such as here, or here, or here) but no real answer to this question.
With all the async work in .NET 4.5 it would seem a little strange if HttpContext is not threadsafe, however, if indeed it isn't so, are there any ways to make it so? I could think of:
Clone it (but this isn't so easy, although it doesn't seem impossible at first sight).
Wrap HttpContextBase and make this thread safe (hey, I'd call it HttpContextWrapperWrapper).
but this all feels a bit crappy and too much work.
EDIT: After looking into this in more detail, and some reflectoring later, I believe it does in fact not matter that HttpContext is not thread safe. I detailed this in this blog post. The gist is that using the correct SynchronizationContext in ASP.NET ensures that no more than one thread at a time will access the context.
The HttpContext class is not thread-safe.
For example, the HttpContext.Items property is just a reference to an unsynchronized Hashtable - so this is clearly not thread-safe.
It's not clear from your question exactly what you want to share between the parallel tasks, but I would suggest you use an instance of your own thread-safe class to share state between the tasks rather than attempting to wrap an existing class.
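One way to read that suggestion, sketched under assumptions: the names ParallelRequestWork and Run are hypothetical, and the two string parameters stand in for values copied out of HttpContext on the request thread before any task starts. The tasks then share only a ConcurrentDictionary and never touch the context itself.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ParallelRequestWork
{
    // path and userAgent are plain copies taken from the request up front,
    // so the parallel tasks have no reason to touch HttpContext.
    public static ConcurrentDictionary<string, int> Run(string path, string userAgent)
    {
        var results = new ConcurrentDictionary<string, int>();

        Task a = Task.Run(() => { results["pathLength"] = path.Length; });
        Task b = Task.Run(() => { results["agentLength"] = userAgent.Length; });

        Task.WaitAll(a, b);
        return results;
    }

    public static void Main()
    {
        var results = Run("/home", "demo-agent");
        Console.WriteLine(results["pathLength"]);  // prints 5
        Console.WriteLine(results["agentLength"]); // prints 10
    }
}
```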

Lock before reading a global string?

I have a class that spins off a BackgroundWorker to do some processor-intensive stuff. The background worker reads a few strings that are declared globally for the whole class... do I need to lock around those strings? The BackgroundWorker never writes the strings; they simply represent some directory locations that are set in the constructor of the class and are hardly ever written to by the class after the constructor (and never written to by the BackgroundWorker). So it's possible the background worker could read a string as it is also being written to by the main class object, though that is pretty unlikely. But wouldn't both those operations (the read by the background worker and the write by the main class) be atomic for a string anyway?
Thanks,
-Robert
Edit: I don't care about the string being out of date or anything (that wouldn't be a big problem in my app), I'm more worried about getting the "object is in use elsewhere" exception.
Strings in .NET are immutable; they can't change. What happens is that the reference will point to a totally different string but the strings themselves won't be changed.
So if you don't particularly mind that the background workers might not all use the same string if you change it, then you should be fine. Example: Worker A reads string, something else changes it, Worker B reads new string—maybe this doesn't cause problems, maybe it does. But accessing the strings itself is definitely safe.
To quote from the documentation:
This type is thread safe.
ETA: A very good point by Martinho Fernandes in the comments below: thread-safe objects do not automagically mean that everything you do with them is thread-safe as well. He even wrote a blog post on that, which spares me the work of saying again everything he did :-)
If you don't use a lock, the worst case would be that one of your background workers reads an outdated copy of the string from the perspective of your main class and thread. You will never (under any circumstance) encounter an "object is in use elsewhere" exception when working with strings (from the question).
As stated correctly in another answer, strings are immutable and cannot be changed after created. Any changes to an existing string will transparently result in a new string being created in memory on the heap, without any impact to the previous string object.
Using a lock (at the cost of a possibly measurable performance impact), will ensure that your background workers read the latest copy of the string.
Yes, reads and writes will be atomic to string variables. That is because only the variable reference is ever changed. Strings are immutable so any operation that modifies the contents of the string will also create a new instance of the string. It is the reference to that new instance that is swapped out via the variable. But, that is not the main issue.
The main issue has to do with the staleness of the string variable itself. Without the appropriate synchronization mechanisms the writes in one thread may not be seen in another thread.
Bottom line...if there is even a remote chance that another thread will modify the string variable then you will need to synchronize access to it from the worker thread and most likely your main thread as well.
Edit: Since staleness is of no concern to you then you will probably be okay without using locks. However, the assumption is that you have initialized the string variable to something before the worker thread starts.
If you are not doing any writing or modifying any shared variables then you don't need to use lock.
There are several strings A B & C? Does it matter if the background worker is working on (say) v3 of A and B and v2 of C? If so then you need a lock around the update of the whole set.
A second subtle problem is that the compiler or runtime might choose to cache values in registers or otherwise reorder your code, so that the threads don't see the same view of "reality". See this discussion and the answers to this SO question.
My recommendation: write the conspicuously correct code using synchronisation. In this scenario the performance impact is surely trivial. That way the code maintainer doesn't even need to worry. If benchmarking reveals this to be a performance problem, then be very, very careful as you study writing clever thread-safe code.
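A sketch of what "conspicuously correct" could look like for a set of directory strings that must stay consistent with each other. All names here (DirectoryPaths, PathSettings, Update, Snapshot) are illustrative and not taken from the question. The strings are bundled into one immutable object, and the reference to it is read and replaced under a lock, so a reader never sees a new input path paired with an old output path.

```csharp
using System;

// Immutable bundle: once built, a set of paths never changes.
public sealed class DirectoryPaths
{
    public readonly string Input;
    public readonly string Output;

    public DirectoryPaths(string input, string output)
    {
        Input = input;
        Output = output;
    }
}

public sealed class PathSettings
{
    private readonly object gate = new object();
    private DirectoryPaths current;

    public PathSettings(string input, string output)
    {
        current = new DirectoryPaths(input, output);
    }

    // Writer side: swap in a complete new set in one step.
    public void Update(string input, string output)
    {
        var next = new DirectoryPaths(input, output);
        lock (gate) { current = next; }
    }

    // Reader side: the lock also guarantees the latest published set is seen.
    public DirectoryPaths Snapshot()
    {
        lock (gate) { return current; }
    }

    public static void Main()
    {
        var settings = new PathSettings(@"C:\in", @"C:\out");
        settings.Update(@"D:\in", @"D:\out");
        DirectoryPaths paths = settings.Snapshot();
        Console.WriteLine(paths.Input + " " + paths.Output); // prints D:\in D:\out
    }
}
```

The worker would call Snapshot() once per unit of work and use only that object, rather than reading the individual strings at different times.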
Yes, I would recommend locking; if it is such a small task as only reading, it should be fairly trivial. Just be aware of deadlock, where a thread has locked a string and is waiting for another thread that holds some other lock.
You don't need the lock since such a race would only occur if two threads attempt to assign a value concurrently. Since .NET strings are immutable, the result you get is never corrupt - in the worst case, it's outdated.

only one of multiple threads to execute a particular code path

I have multiple threads starting at roughly the same time, all executing the same code path. Each thread needs to write records to a table in a database. If the table doesn't exist, it should be created. Obviously two or more threads could see the table as missing and try to create it.
What is the preferred approach to ensure that this particular block of code is executed only once, by only one thread?
While I'm writing in C# on .NET 2.0, I assume that the approach would be framework/language neutral.
Something like this should work...
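// Static, so that every thread contends on the same lock object
// even if the threads work with different instances of the class.
private static readonly object lockObject = new object();
private void CreateTableIfNotPresent()
{
    lock (lockObject)
    {
        // check for table presence and create it if necessary,
        // all inside this block
    }
}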
Have your threads call the CreateTableIfNotPresent function. The lock block ensures that only one thread at a time can execute the code inside it, so no thread can see the table as not present while another is creating it.
This is a classical application for either a Mutex or a Semaphore
A mutex ensures that a specific piece of code (or several pieces of code) can only be run by a single thread at a time. You could be clever and use a different mutex for each table, or simply constrain the whole initialisation block to one thread at a time.
A semaphore (or set of semaphores) could perform exactly the same function.
Most lock implementations will use a mutex internally, so look at what lock code is already available in the language or libraries you are using.
@ebpower has it right that in certain applications, you would actually be more efficient to catch an exception caused by an attempt to create the same table multiple times, though this may not be the case in your example.
However there are many other ways of proceeding. For example, you could use a single-threaded ExecutorService (sorry, I could only find a Java reference) that has responsibility for creating any tables that your worker threads discover are missing. If it gets two requests for the same table, it simply ignores the later ones.
A variant on a Memoizer (remembering table references, creating them first if necessary) would also work under the circumstances. The book Java Concurrency In Practice walks through the implementation of a nice Memoizer class, but this would be pretty simple to port to any other language with effective concurrency building blocks.
This is what Semaphores are for.
You may not even need to bother with locks since your database shouldn't let you create multiple tables with the same name. Why not just catch the appropriate exceptions and if two threads try to create the same table, one wins and continues on, while the other recovers and continues on.
I'd use a thread sync object such as ManualResetEvent, though it sounds to me like you're willing a race condition, which may mean you have a design problem.
Some posts have suggested mutexes - this is overkill unless your threads are running in different processes.
Others have suggested using locks - this is fine but locking can lead to over-pessimistic locks on data which can negate the benefit of using threads in the first place.
A more fundamental question is why are you doing it this way at all? What benefit does threading bring to the problem domain? Does concurrency solve your problem?
You may want to try static constructors to get a reference of the table.
According to the MSDN (.net 2.0), A static constructor is used to initialize any static data, or to perform a particular action that needs performed once only.
Also, CLR automatically guarantees that a static constructor executes only once per AppDomain and is thread-safe.
For more info, check Chapter 8 of CLR via C# by Jeffrey Richter.
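A minimal sketch of the static constructor approach, with assumptions stated: TableGuard, EnsureCreated and CreateCalls are hypothetical names, and the counter merely stands in for the real "create the table if it is missing" call. Several threads touch the type at about the same time, yet the constructor body runs once.

```csharp
using System;
using System.Threading;

public static class TableGuard
{
    private static int createCalls;

    // The CLR runs this at most once per AppDomain, before the first
    // use of any member of the type, and makes other threads wait for it.
    static TableGuard()
    {
        // Stand-in for the real table creation work.
        Interlocked.Increment(ref createCalls);
    }

    // Calling any static member is enough to trigger the constructor.
    public static void EnsureCreated() { }

    public static int CreateCalls { get { return createCalls; } }
}

public static class StaticCtorDemo
{
    public static int Run(int threadCount)
    {
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(TableGuard.EnsureCreated);
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        return TableGuard.CreateCalls;
    }

    public static void Main()
    {
        Console.WriteLine(Run(8)); // prints 1
    }
}
```

One caveat worth knowing: if a static constructor throws, the type stays unusable for the rest of the AppDomain's life (every later access throws TypeInitializationException), so a failure while creating the table cannot simply be retried this way.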
