Logging in Multithreading Async Code

Logging in Multithreading Async Code - c#

I have a multithreading application with async processing (which by itself uses many threads). Without async, it would be easy to log and then trace the execution flow as one simply puts current thread identifier to the log and thus you can see which log line was executed by which thread.
How to achive a similar thing in async environment? Often when await is called, the following code is assigned to another thread (and I am OK with that, I trust the thread pool manager to do these assignments for me effectively). The problem is that suddenly I do not have this fixed thread ID for the execution flow and it is hard to put the two parts together.
Is there any identifier of the task that would be kept among the whole code? I mean let's say there are 5x await calls in a method. With thread ID, I can see in the logs up to 6 different IDs. I would want one thing and I would prefer if it is there already (I know I can create an object and pass it to my log function, over and over again, but if there was something already, it would be better).
Is Task.Id or Task.CurrentId suitable for this purpose, or is it something else?

What you're referring to is a "correlation id", or what log4net calls a "nested diagnostic context" (NDC). Whatever logging framework you have should have one already, most likely one that already works with async.
If you need to build your own, I'd recommend putting an id (or an immutable stack of ids) into an AsyncLocal<T> (which is essentially LogicalSetData but with an easier-to-use and more portable API). Note that the async contextual data should be immutable. See my blog for more info.

I think the answer to "is [there] a ready to go identifier that can be used" is probably no, but you can use CallContext, and specifically LogicalSetData, not just SetData(), and for instance add a GUID that will flow across all async-await forks.

Related

Get ALL stacktraces in async/await application

I want to get info about all call stacks (or get all stacktraces) in my asynchronous C# application. I know, how to get stacktraces of all existing threads.
But how to get info about all call stacks released by await, which do not have a running thread on it?
CONTEXT EXAMPLE
Suppose the following code:
private static async Task Main()
{
async Task DeadlockMethod(SemaphoreSlim lock1, SemaphoreSlim lock2)
{
await lock1.WaitAsync();
await Task.Delay(500);
await lock2.WaitAsync(); // this line causes the deadlock
}
SemaphoreSlim lockA = new SemaphoreSlim(1);
SemaphoreSlim lockB = new SemaphoreSlim(1);
Task call1 = Task.Run(() => DeadlockMethod(lockA, lockB));
Task call2 = Task.Run(() => DeadlockMethod(lockB, lockA));
Task waitTask = Task.Delay(1000);
await Task.WhenAny(call1, call2, waitTask);
if (!call1.IsCompleted
&& !call2.IsCompleted)
{
// DUMP STACKTRACES to find the deadlock
}
}
I would like to dump all stacktraces, even those not having its thread currently, so that I can find the deadlock.
If line await lock2.WaitAsync(); is changed to lock2.Wait();, then it would be possible by already mentioned get stacktraces of all threads. But how to list all stacktraces without a running thread?
PREVENTION OF MISUNDERSTANDING:
The example is very simplified, it just ilustrates one of potential complications. The original problem is a complex multithreaded application, which runs on a server and many hard-to-investigate parallel-related issues may happen.
We would use the list of async/await stacktraces not only to find deadlocks, but also for other purposes. Therefore please do not advice me how to avoid deadlocks or how to write a multithreaded application - that is not the point of the question.
You can answer this generally, but also solution working at least on .Net Core 3.1 is enough.

I know, how to get stacktraces of all existing threads.
Just gonna give a bit of background here.
In Windows, threads are an OS concept. They're the unit of scheduling. So there's a definite list of threads somewhere, since that's what the OS scheduler uses.
Furthermore, each thread has a call stack. This dates back to the early days of computer programming. However, the purpose of the call stack is often misunderstood. The call stack is used as a sequence of return locations. When a method returns, it pops its call stack arguments off the stack and also the return location, and then jumps to the return location.
This is important to remember because the call stack does not represent how code got into a situation; it represents where code is going it returns from the current method. The call stack is where the code is going to, not where it came from. That is the reason the call stack exists: to direct the future code, not to assist diagnostics. Now, it does turn out that the call stack does have useful information on it for diagnostics since it gives an indication of where the code came from as well as where it's going, so that's why call stacks are on exceptions and are commonly used for diagnostics. But that's not the actual reason why the call stack exists; it's just a happy circumstance.
Now, enter asynchronous code.
In asynchronous code, the call stack still represents where the code is returning to (just like all call stacks). But in asynchronous code, the call stack no longer represents where the code came from. In the synchronous world, these two things were the same, and the call stack (which is necessary) can also be used to answer the question of "how did this code get here?". In the asynchronous world, the call stack is still necessary but only answers the question "where is this code going?" and cannot answer the question "how did this code get here?". To answer the "how did this code get here?" question you need a causality chain.
Furthermore, call stacks are necessary for correct operation (in both the synchronous and asynchronous worlds), and so the compiler/runtime ensures they exist. Causality chains are not necessary, and they are not provided out of the box. In the synchronous world, the call stack just happens to be a causality chain, which is nice, but that happy circumstance doesn't carry over to the asynchronous world.
When a thread is released by await, the stacktrace and all objects along the call stack are stored somewhere.
No; this is not the case. This would be true if async used fibers, but it doesn't. There is no call stack saved anywhere.
Because otherwise the continuation thread would lose context.
When an await resumes, it only needs sufficient context to continue executing its own method, and potentially completing the method. So, there is an async state machine structure that is boxed and placed on the heap; this structure contains references to local variables (including this and method arguments). But that is all that is necessary for program correctness; a call stack is not necessary and so it is not stored.
You can easily see this yourself by setting a breakpoint after an await and observing the call stack. You'll see that the call stack is gone after the first await yields. Or - more properly - the call stack represents the code that is continuing the async method, not the code that originally started the async method.
At the implementation level, async/await is more like callbacks than anything else. When a method hits an await, it sticks its state machine structure on the heap (if it hasn't already) and wires up a callback. That callback is triggered (invoked directly) when the task completes, and that continues executing the async method. When that async method completes, it completes its tasks, and anything awaiting those tasks are then invoked to continue executing. So, if a whole sequence of tasks complete, you actually end up with a call stack that is an inversion of the causality stack.
I would like to dump all stacktraces, even those not having its thread currently, so that I can find the deadlock.
So, there's a couple of problems here. First, there is no global list of all Task objects (or more generally, tasklike objects). And that would be a difficult thing to get.
Second, for each asynchronous method/task, there's no causality chain anyway. The compiler doesn't generate one because it's not necessary for correct operation.
That's not to say either of these problems are insurmountable - just difficult. I've done some work on the causality chain problem with my AsyncDiagnostics library. It's rather old at this point but should upgrade pretty easily to .NET Core. It uses PostSharp to modify the compiler-generated code for each method and manually track causality chains.
However, the goal of AsyncDiagnotics is to get causality chains onto exceptions. Getting a list of all tasklikes and associating causality chains with each one is another problem, likely requiring the use of an attached profiler. I'm aware of other companies who have wanted this solution, but none of them have dedicated the time necessary to create one; all of them have found it more efficient to implement code reviews, auditing, and developer training.

I marked Stephen Cleary's answer as the correct answer. He gave hints and explained deeply why it is so difficult.
I posted this alternative answer to explain, how we finally solved it and what we decided to do.
WORKAROUND SOLVING THE PROBLEM
Assumption: stacktraces including own code are enough.
Based on the assumption we can to this:
Encapsulate all called external Async methods (track their enter and leave)
Implement style check, which will warn about using any Async method out of your project namespaces
Ad 1.: Encapsulation
Suppose an external method Task ExternalObject.ExternalAsync(). We will create encapsulating extension method:
public static async Task MyExternalAsync(this ExternalObject obj)
{
using var disposable = AsyncStacktraces.MethodStarted();
await obj.ExternalAsync();
}
During the AsyncStacktraces.MethodStarted(); static call the current stacktrace will be recorded from Environment.StackTrace property into some static dictionary together with the disposable object. There will be no performance issues, since the async method itself is most probably much more expensive than stacktrace retrieval.
The disposable object will implement IDisposable interface. The .Dispose() method will remove the current stacktrace from the static dictionary at the end of the MyExternalAsync() method.
Usually only few tens of external Async methods are actually called in the solution, therefore the effort is quite low.
Ad 2.: Style check
Custom style check extension will warn when anybody uses external Async method directly. CI can be set-up so that it will not pass when this warning exists. On few places, where we will need a direct external Async method, we will use #pragma warning disable.

Is there any drawback of using ThreadPool.QueueUserWorkItem in web app.

I did some research on this topic, but I am unable to find the expected answer for this. In my application I have used
ThreadPool.QueueUserWorkItem
Like following way.
ThreadPool.QueueUserWorkItem(o => CaseBll.SendEmailNotificationForCaseUpdate(currentCase, caseUpdate));
My app in asp.net mvc & I have handled all background task which is not required to execute on user operation for faster execution & quick user response.
Now I wants to know that, Is there any bad side of using ThreadPool.QueueUserWorkItem When we have larger audience for application.

No, you should never use it. I can write a lot of reasons why but instead you should read this article from Scott Hansleman who is a genius IMHO
http://www.hanselman.com/blog/ChecklistWhatNOTToDoInASPNET.aspx
Under "Reliability and Performance":
Fire-and-Forget Work - Avoid using ThreadPool.QueueUserWorkItem as your app pool could disappear at any time. Move this work outside or use WebBackgrounder if you must.
So, as recommended, don't use ThreadPool.QueueUserWorkItem. An excellent alternative for this is at:
https://www.asp.net/aspnet/overview/web-development-best-practices/what-not-to-do-in-aspnet-and-what-to-do-instead#fire
Edit: As mentioned by #Scott-Chamberlain, here's a better link:
http://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx

It really depends on what you are going to be doing but generally speaking, your likely concerns will be:
Persistence. Threads in the managed pool are background threads. They will die when the application recycles which is generally undesirable. In your case you want to send e-mails. Imagine if your process dies for some reason before the thread executes. Your e-mail will never be sent.
Local storage is shared, which means you need to make sure there are no leftovers from the last thread if using it. Applies to fields marked with ThreadStaticAttribute as well.
I would instead recommend that you implement a job scheme where you schedule the job somewhere and have some other component actually read from this list (e.g. database) and perform the job, then mark it as complete. That way it persists across application unload, there is no memory reuse and you can throttle the performance. You could implement the processing component inside your application or even as a Windows service if you prefer.

ConfigureAwait(false) on Top Level Requests

I'm trying to figure out if ConfigureAwait(false) should be used on top level requests. Reading this post from a somewhat authority of the subject:
http://blog.stephencleary.com/2012/07/dont-block-on-async-code.html
...he recommends something like this:
public async Task<JsonResult> MyControllerAction(...)
{
try
{
var report = await _adapter.GetReportAsync();
return Json(report, JsonRequestBehavior.AllowGet);
}
catch (Exception ex)
{
return Json("myerror", JsonRequestBehavior.AllowGet); // really slow without configure await
}
}
public async Task<TodaysActivityRawSummary> GetReportAsync()
{
var data = await GetData().ConfigureAwait(false);
return data
}
...it says to using ConfigureAwait(false) on every await except the top level call. However when doing this my exception takes several seconds to return to the caller vs. using it and it and having it come back right away.
What is the best practice for MVC controller actions that call async methods? Should I use ConfigureAwait in the controller itself or just in the service calls that use awaits to request data, etc.? If I don't use it on the top level call, waiting several seconds for the exception seems problematic. I don't need the HttpContext and I've seen other posts that said always use ConfigureAwait(false) if you don't need the context.
Update:
I was missing ConfigureAwait(false) somewhere in my chain of calls which was causing the exception to not be returned right away. However the question still remains as posted as to whether or not ConfigureAwait(false) should be used at the top level.

Is it a high traffic website? One possible explanation might be that you're experiencing ThreadPoolstarvation when you are not using ConfigureAwait(false). Without ConfigureAwait(false), the await continuation is queued via AspNetSynchronizationContext.Post, which implementation boils down to this:
Task newTask = _lastScheduledTask.ContinueWith(_ => SafeWrapCallback(action));
_lastScheduledTask = newTask; // the newly-created task is now the last one
Here, ContinueWith is used without TaskContinuationOptions.ExecuteSynchronously (I'd speculate, to make continuations truly asynchronous and reduce a chance for low stack conditions). Thus, it acquires a vacant thread from ThreadPool to execute the continuation on. In theory, it might happen to be the same thread where the antecedent task for await has finished, but most likely it'd be a different thread.
At this point, if ASP.NET thread pool is starving (or has to grow to accommodate a new thread request), you might be experiencing a delay. It's worth mentioned that the thread pool consists of two sub-pools: IOCP threads and worker threads (check this and this for some extra details). Your GetReportAsync operations is likely to complete on an IOCP thread sub-pool, which doesn't seem to be starving. OTOH, the ContinueWith continuation runs on a worker thread sub-pool, which appears to be starving in your case.
This is not going to happen in case ConfigureAwait(false) is used all the way through. In that case, all await continuations will run synchronously on the same threads the corresponding antecedent tasks have ended, be it either IOCP or worker threads.
You can compare the thread usage for both scenarios, with and without ConfigureAwait(false). I'd expect this number to be larger when ConfigureAwait(false) isn't used:
catch (Exception ex)
{
Log("Total number of threads in use={0}",
Process.GetCurrentProcess().Threads.Count);
return Json("myerror", JsonRequestBehavior.AllowGet); // really slow without configure await
}
You can also try increasing the size of the ASP.NET thread pool (for diagnostics purpose, rather than an ultimate solution), to see if the described scenario is indeed the case here:
<configuration>
<system.web>
<applicationPool
maxConcurrentRequestsPerCPU="6000"
maxConcurrentThreadsPerCPU="0"
requestQueueLimit="6000" />
</system.web>
</configuration>
Updated to address the comments:
I realized I was missing a ContinueAwait somewhere in my chain. Now it
works fine when throwing an exception even when the top level doesn't
use ConfigureAwait(false).
This suggests that your code or a 3rd party library in use might be using blocking constructs (Task.Result, Task.Wait, WaitHandle.WaitOne, perhaps with some added timeout logic). Have you looked for those? Try the Task.Run suggestion from the bottom of this update. Besides, I'd still do the thread count diagnostics to rule out thread pool starvation/stuttering.
So are you saying that if I DO use ContinueAwait even at the top level
I lose the whole benefit of the async?
No, I'm not saying that. The whole point of async is to avoid blocking threads while waiting for something, and that goal is achieved regardless of the added value of ContinueAwait(false).
What I'm saying is that not using ConfigureAwait(false) might introduce redundant context switching (what usually means thread switching), which might be a problem in ASP.NET if thread pool is working at its capacity. Nevertheless, a redundant thread switch is still better than a blocked thread, in terms of the server scalability.
In all fairness, using ContinueAwait(false) might also cause redundant context switching, especially if it's used inconsistently across the chain of calls.
That said, ContinueAwait(false) is also often misused as a remedy against deadlocks caused by blocking on asynchronous code. That's why I suggested above to look for those blocking construct across all code base.
However the question still remains as posted as to whether or not
ConfigureAwait(false) should be used at the top level.
I hope Stephen Cleary could elaborate better on this, by here's my thoughts.
There's always some "super-top level" code that invokes your top-level code. E.g., in case of a UI app, it's the framework code which invokes an async void event handler. In case of ASP.NET, it's the asynchronous controller's BeginExecute. It is the responsibility of that super-top level code to make sure that, once your async task has completed, the continuations (if any) run on the correct synchronization context. It is not the responsibility of the code of your task. E.g., there might be no continuations at all, like with a fire-and-forget async void event handler; why would you care to restore the context inside such handler?
Thus, inside your top-level methods, if you don't care about the context for await continuations, do use ConfigureAwait(false) as soon as you can.
Moreover, if you're using a 3rd party library which is known to be context agnostic but still might be using ConfigureAwait(false) inconsistently, you may want to wrap the call with Task.Run or something like WithNoContext. You'd do that to get the chain of the async calls off the context, in advance:
var report = await Task.Run(() =>
_adapter.GetReportAsync()).ConfigureAwait(false);
return Json(report, JsonRequestBehavior.AllowGet);
This would introduce one extra thread switch, but might save you a lot more of those if ConfigureAwait(false) is used inconsistently inside GetReportAsync or any of its child calls. It'd also serve as a workaround for potential deadlocks caused by those blocking constructs inside the call chain (if any).
Note however, in ASP.NET HttpContext.Current is not the only static property which is flowed with AspNetSynchronizationContext. E.g., there's also Thread.CurrentThread.CurrentCulture. Make sure you really don't care about loosing the context.
Updated to address the comment:
For brownie points, maybe you can explain the effects of
ConfigureAwait(false)... What context isn't preserved.. Is it just the
HttpContext or the local variables of the class object, etc.?
All local variables of an async method are preserved across await, as well as the implicit this reference - by design. They actually gets captured into a compiler-generated async state machine structure, so technically they don't reside on the current thread's stack. In a way, it's similar to how a C# delegate captures local variables. In fact, an await continuation callback is itself a delegate passed to ICriticalNotifyCompletion.UnsafeOnCompleted (implemented by the object being awaited; for Task, it's TaskAwaiter; with ConfigureAwait, it's ConfiguredTaskAwaitable).
OTOH, most of the global state (static/TLS variables, static class properties) is not automatically flowed across awaits. What does get flowed depends on a particular synchronization context. In the absence of one (or when ConfigureAwait(false) is used), the only global state preserved with is what gets flowed by ExecutionContext. Microsoft's Stephen Toub has a great post on that: "ExecutionContext vs SynchronizationContext". He mentions SecurityContext and Thread.CurrentPrincipal, which is crucial for security. Other than that, I'm not aware of any officially documented and complete list of global state properties flowed by ExecutionContext.
You could peek into ExecutionContext.Capture source to learn more about what exactly gets flowed, but you shouldn't depend on this specific implementation. Instead, you can always create your own global state flow logic, using something like Stephen Cleary's AsyncLocal (or .NET 4.6 AsyncLocal<T>).
Or, to take it to the extreme, you could also ditch ContinueAwait altogether and create a custom awaiter, e.g. like this ContinueOnScope. That would allow to have precise control over what thread/context to continue on and what state to flow.

However the question still remains as posted as to whether or not ConfigureAwait(false) should be used at the top level.
The rule of thumb for ConfigureAwait(false) is to use it whenever the rest of your method doesn't need the context.
In ASP.NET, the "context" is not actually well-defined anywhere. It does include things like HttpContext.Current, user principal, and user culture.
So, the question really comes down to: "Does Controller.Json require the ASP.NET context?" It's certainly possible that Json doesn't care about the context (since it can write the current response from its own controller members), but OTOH it does do "formatting", which may require the user culture to be resumed.
I don't know whether Json requires the context, but it's not documented one way or the other, and in general I assume that any calls into ASP.NET code may depend on the context. So I would not use ConfigureAwait(false) at the top-level in my controller code, just to be on the safe side.

Using the StateMachineCompiler(SMC) in own code

Hello i want to use the State Machine Compiler (SMC) with C#
http://smc.sourceforge.net/
i have created the sm-File to describe the state machine and generated c# code from it.
Then i created my own class MyClass,add the generated class which was generated with smc and implement the methods.
My Problem is how can i run this statemachine? With a While-loop, a async call or the Task Library ? What is an elegant way?
The Statemachine is a behaivior for sending data throught the serialport. So that the user can call MyClass.Send(Data) and the StateMachine should work behind the curtains.
Can someone give me an example how to use the statemachine in own code?
Regards
rubiktubik

I've used SMC in many application and I was very satisfied with it. I hit the same problem as you. SMC generates code for C# which is synchronous and linear. This means if you issue a transaction by calling fsm.YourTransaction() and by chance somewhere in the middle of that transaction you issue another transaction, it would be directly called. It is very dangerous, because it breaks the base principle of a state machine - that transactions are atomic and the system is guaranteed to be in single state, or single transition all the time.
When I realized this hidden problem I implemented an asynchronous layer above the state machine code generated by SMC. Unfortunately I cannot provide you with the code, because is licensed, but I can describe the principle here.
I replaced direct method calls with asynchronous event processing: there is a queue awaiting transactions. Transactions are represented by strings, which must be the same as transaction methods in fsm. They are removed from the queue one by one in an independent thread. Transaction string is transformed to a fsm method call.
This concept proved to work very well in many critical applications. It is rather simple to implement and easy to extend with new features.
Final form of this asynchronous layer had these additional features:
Perfect logging: all transactions and their arguments, time of arrival, time of processing ...etc.
Possibility to replace the independent thread with an external thread - sometimes it is better to process things in thread provided from outside (Windows GUI is very sensitive to external thread calls...)
Preprocessing of transactions - sometimes the system was expected to change state only if a sequence of transaction occured. It is rather clumsy to achieve it directly with SMC. I implemented so called "transaction transformations" - it was possible to declare how a set of transactions is transformed into a single transaction.

Use of lock keyword and the new async functionality of C# 5.0

Is it still necessary to use the lock keyword on resources like SQL Compact database in methods called with async (AsyncCtpLibrary.dll)? As i understand from the talk given by Anders, the async processing all happens within the same thread, so they shouldn't be a need for it, or am I wrong? I cannot find any info on this anywhere on the internet at the moment.
Thanks

AFAIK async is based on the TPL and Tasks - and so no they won't be running on the same thread every time (or continue on the same thread) and yes you have to design with concurency in mind still. Async only helps you to put the pieces together in a much nicer way.
To be clear: everything inside your methods (if started only once) will run in a thread at a time but if you share resources you will have to think on locking or other synchronization methods just as you used to do all the time.
If you can go for immutable data - this way you can strip all this to a mere minimum, but you allway have to remember that your processes will run on many threads (dispatch for UI comes to mind).

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.