How to capture information on why my application is "hung" - c#

I have a WPF monitoring application that uses a separate (internally developed) C# test infrastructure to execute tests and monitor and log the results. It also uses a commercial package (InGear) to communicate with a PLC. As a result, the application has LOTS of threads (most of which are created by the tools I am using).
Because of the nature of the environment, it would be very difficult to use a debugger in the target environment, so we are both using log4net to log diagnostics.
I use try/catch blocks around my external calls and have also set up unhandled exception handlers at both the WPF and AppDomain levels.
During our first long run the application appeared to become non-responsive and I got the standard "not responding" dialog. Looking at the log, it seems like everything just stopped. For example, I can see from the log that a DispatcherTimer was set to fire on the main thread in 1 sec, but it never did.
So.... My questions are:
How can I detect the hang, or hook into Windows' detection that I am hung? Note that I am assuming that it could be a higher-priority thread that is blocking my UI thread, so I probably can't respond to a Windows message.
Once I do tap in, how do I find out which thread is the culprit? Being able to log its call stack would be a big plus.

Maybe simplistic, but what about attaching the debugger to the process, doing a 'Break All' and then inspect the stack trace of the various threads?

While I was unable to determine a way to detect the 'hang' before Windows does, I was able to catch the Windows timeout exception and ultimately traced the problem to unmanaged code in the Oracle .NET component.

Related

In c#/Xamarin/Android How do I get the stack trace for all managed threads?

I'm trying to diagnose why my app is freezing up and Android is displaying a message that the app is not responding. For an unknown reason, messages that should show in the logcat output do not appear when this happens.
I'm looking at using https://github.com/nwestfall/Xamarin.ANRWatchDog to find out what is going on in my app if it becomes non-responsive. The problem is, I don't see anything in the call stacks from a C#/managed thread perspective.
All the information I've been able to find either doesn't work with C#/Xamarin/Android or targets desktop/server development. Attempting to get a list of threads gives me a list, but all entries in the list are null. Even if I can get a list of the actual threads, how do I get the call stack for each thread?
var threads = Process.GetCurrentProcess().Threads;
Is there a way to get the current stack trace for all managed threads in a Xamarin/Android app?
You can use new System.Diagnostics.StackTrace() for the current thread; however, you probably don't care about that.
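For reference, a minimal sketch of capturing the current thread's managed stack so it can be written to a log. The class and method names (StackDump, CaptureCurrentThread) are made up for illustration; only StackTrace and Thread come from the framework.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class StackDump
{
    // Returns a text dump of the calling thread's managed stack.
    public static string CaptureCurrentThread()
    {
        // 'true' requests file/line info, which is only filled in when debug symbols are available.
        var trace = new StackTrace(true);
        var thread = Thread.CurrentThread;
        return "Thread " + thread.ManagedThreadId + " (" + (thread.Name ?? "unnamed") + "):"
               + Environment.NewLine + trace;
    }
}
```

Calling StackDump.CaptureCurrentThread() from a catch block or a timer callback and passing the result to your logger is usually enough for the current thread; it does not help with other threads.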
Here's a way to get other threads and get their stack traces:
How to get non-current thread's stacktrace?
If that doesn't work, you can certainly use a method I've used successfully in winforms, a long time ago. Here's the strategy:
ensure all your worker threads have a top-level try/catch with logging you can access
Abort the worker threads and look for the abort exceptions in the logs!
Sounds simple, but it takes you straight to a crash. You still have to get hold of the threads in order to abort them; see the first few lines of the threadsampler.Start method in the S/O link above.
If your UI is responsive you can hide this function in an easter-egg-activated devtools menu or have a debug menu activated in your configs (you'll want that when your UI apps get non-trivial anyway!)
Unfortunately, since it's locking up, you'll just have to start a background thread to run this functionality. You could have it ping the UI every second or 20, and abort threads if the UI locks up. The strategy would be to invoke a 'dummy function' on the UI thread:
https://learn.microsoft.com/en-us/dotnet/maui/platform-integration/appmodel/main-thread
and wait for it to come back, e.g., InvokeOnMainThreadAsync, it should do that in 5sec or less otherwise you take action. Apply a timeout waiting for it:
Asynchronously wait for Task<T> to complete with timeout.
Make sure to turn this off in prod!!! Or optionally, activate via switch you can direct users to turn on.
This is the pattern we used in the field for our app, years and years ago.
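A rough sketch of that ping-with-timeout idea, under some assumptions: UiWatchdog, PingAsync, RunAsync and the postToUi delegate are hypothetical names, and postToUi stands for whatever your platform uses to queue work on the UI thread (for example a => MainThread.BeginInvokeOnMainThread(a)). It only detects the hang; what you do in onHang (log, dump stacks, abort threads) is up to you.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class UiWatchdog
{
    // Queues a no-op on the UI thread via 'postToUi' and reports whether it ran within 'timeout'.
    public static async Task<bool> PingAsync(Action<Action> postToUi, TimeSpan timeout)
    {
        var pong = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        postToUi(() => pong.TrySetResult(true));

        // Whichever finishes first wins: the UI answering, or the timeout elapsing.
        var finished = await Task.WhenAny(pong.Task, Task.Delay(timeout)).ConfigureAwait(false);
        return finished == pong.Task;
    }

    // Runs on a thread-pool thread: pings every 'interval' and calls 'onHang' when a ping times out.
    public static Task RunAsync(Action<Action> postToUi, TimeSpan interval, TimeSpan timeout,
                                Action onHang, CancellationToken token)
    {
        return Task.Run(async () =>
        {
            while (!token.IsCancellationRequested)
            {
                if (!await PingAsync(postToUi, timeout).ConfigureAwait(false))
                    onHang();

                try { await Task.Delay(interval, token).ConfigureAwait(false); }
                catch (OperationCanceledException) { break; }
            }
        });
    }
}
```

Because the watchdog never touches the UI thread except through postToUi, it keeps running while the UI is blocked, which is the whole point.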
Full saga:
How do I get the GUI thread of winform?
A newer version of our app does all the work on a background thread and the user gets a chance to cancel if it takes more than 10 sec. However that requires the use of an abstracted UI and the team created a variation on MVVM. Useful as it allows us to deploy a thick client and web app with all the same code - including UI code. Probably not worth the effort for any other apps, and if I had to do that over I'd use javascript for a web app and deploy it to desktop in electron and mobile via cordova, react etc. Oh well.
You're probably beyond this (apologies in advance) but there is the Debug\Windows\Threads tab and if I hang a named thread deliberately in my Xamarin Android app:
new Thread(hangMe) { Name = "HangMe" }.Start();

void hangMe()
{
    while (true)
    {
        Thread.Sleep(500);
    }
}
If execution is paused by going to the VS main menu Debug\Break All then there may be some useful call stack info there in the Threads tab.
I understand that things are rarely this simple...

Possible to run a function before "rudely" interrupting a program? [duplicate]

We have a .NET console app that has many foreground threads.
If we kill the process using Task Manager, or by issuing killjob or kill from the command line in Windows, is there a way we can gracefully shut down the application (by adding managed code within the .NET console app)? Something like having a function called, say, TodoBeforeShutdown(), that disposes objects, closes any open connections, etc.
P.S. - I read the other threads and they all suggested different ways to kill the process, rather than my specific question, what is the best way we can handle a terminate process, within the .NET managed code.
Thanks in advance.
Unfortunately, there is no event raised that you can handle whenever a process is killed. You can think of killing a process like cutting off the power to the computer: no matter what code you have designed to run on system shutdown, if the computer doesn't shut down gracefully or properly, that code is not going to run.
When you kill a process using Task Manager, it calls the Win32 TerminateProcess function, which unconditionally forces the process (including all of its owned threads) to exit. The execution of all threads/processes is halted, and all pending I/O requests are canceled. Your program is effectively dead. The TerminateProcess function does not invoke the shutdown sequence provided by the CLR, so your managed app would not even have any idea that it was being shut down.
You suggest that you're concerned about disposing objects whenever your application's process is terminated, but there are a couple of things worth pointing out here:
Always strive to minimize the amount of damage that could be done. Dispose of your objects as early as possible, whenever you are finished with them. Don't wait until later. At any given time, when your program's process is terminated, you should only be keeping the bare minimum number of objects around, which will leave fewer possibilities for leaks.
The operating system will generally clean up and free most of these resources (i.e., handles, etc.) upon termination.
Finally, it should go without saying that process termination in this way is truly an exceptional condition—even if some resources leak, that's to be expected. You're not supposed to shut an app down this way any more than you're supposed to kill necessary Windows system processes (even though you can when running as an Administrator).
If this is your regular plan to shut down your console application, you need to find another plan.
In short: you can't! Killing a process is exactly the opposite of a graceful exit. If you are running foreground threads, sending a wm_exit won't shut down your app. Since you have a console app, you could simply redirect the console input to send an "exit" to your process. Further, I think you could change the app to a service (instead of a console application); this would offer you exactly what you are looking for: an interface for graceful exit based on Windows built-in tools/commands.
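A minimal sketch of the "send an exit command on the console input" idea, assuming your worker threads observe a CancellationToken. GracefulShutdown and WatchForExitCommand are hypothetical names; TodoBeforeShutdown is the placeholder from the question. None of this runs if the process is killed via TerminateProcess.

```csharp
using System;
using System.IO;
using System.Threading;

public static class GracefulShutdown
{
    // Reads lines from 'input' until "exit" (or end of stream) arrives, then cancels 'cts'.
    public static void WatchForExitCommand(TextReader input, CancellationTokenSource cts)
    {
        string line;
        while ((line = input.ReadLine()) != null)
        {
            if (string.Equals(line.Trim(), "exit", StringComparison.OrdinalIgnoreCase))
                break;
        }
        cts.Cancel(); // workers that observe cts.Token can now wind down
    }

    // Placeholder cleanup routine, named after the one in the question.
    public static void TodoBeforeShutdown()
    {
        Console.WriteLine("Cleaning up before shutdown.");
    }
}
```

In Main you would start a thread running WatchForExitCommand(Console.In, cts), let the foreground threads stop when cts.Token is cancelled, join them, and then call TodoBeforeShutdown(). The controlling process only has to write "exit" to the redirected standard input.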

Handling executing processes

A couple of days ago I began to get an error with a c# winform application I've been creating stating that
The CLR has been unable to transition from COM context 0x278f58 to COM context 0x2790c8 for 60 seconds. The thread that owns the destination context/apartment
is most likely either doing a non pumping wait or processing a very long running operation without pumping Windows messages.
This is occurring when I am using a separate thread to run exe processes to avoid freezing up the UI. In a release version, this program runs fine and as expected, but the error pretty much makes it impossible to consistently debug my program (sometimes it works fine, other times not so fine).
I've tried implementing this process by forcing a BackgroundWorker to be synchronous using multiple googled answers, which solves the issue of this error but makes my program work in unexpected ways (textboxes populated before the exe has finished, resulting in erroneous data).
I have read that this error will only occur while debugging and not in a release, so my question is: should I just try to live with this annoyance, or is there a non-BackgroundWorker solution? If any code example is needed I can provide one, but I don't believe it is necessary.
The Managed Debugging Assistant (MDA) is telling you that a single-threaded apartment (STA) COM thread hasn't responded to a message in 60 seconds. STA COM is done through message passing. This exception occurs if MDA is switched on, which it is by default when running under a debugger. The MDA works to detect deadlock with a pre-defined timeout, and it's only effective when you're running the program under the VS debugger.
Since many COM components are STA and the main thread in Windows Forms is also STA, this is a warning that you’re blocking. This is probably occurring because you are stalling the message loop by spending time stepping through code.
To switch this off for a single project, add the following content to your application configuration file:
<mdaConfig>
  <assistants>
    <contextSwitchDeadlock enable="false" />
  </assistants>
</mdaConfig>
To switch this off globally:
Click on the Debug menu in Visual Studio.
Choose the Exceptions option (Debug -> Exceptions).
The Exceptions window will open.
Expand the "Managed Debugging Assistants" node.
Uncheck the ContextSwitchDeadlock option under the Thrown column.
Click OK and close the Exceptions window.
The implication of disabling this MDA is that you lose a useful tool for discovering bugs before you release the application. Of course if you see this deadlock when not running under the debugger then you need to do a normal deadlock analysis.

AppDomains vs. a robust server

After doing some research, it seems that AppDomains are not really a tool for building a hosting server. From my understanding, the hosting server will still crash if there is an unhandled exception in a created AppDomain (if the exception is thrown from a thread in the created AppDomain). So in that case, if the hosting server hosts a service which leaks exceptions, this will bring down the default AppDomain as well.
So I guess from a server architecture point-of-view there is nothing better than creating child processes and monitoring them.
Is that correct or am I missing something with AppDomains?
thanks,
Christoph
If you can control the threads created in the other AppDomain, you can also handle exceptions by using catch-all blocks in the thread main method.
Other than that, as long as you use the default host, I believe that your assumption is correct. However, if you host the runtime yourself, you can also handle unhandled exceptions.
From a forum post on the topic:
Well, it is possible. You'd have to create your own CLR host. That starts with ICorBindToRuntimeEx(). You get to have full control of AppDomains that throw exceptions. And it's being used by MSFT software like ASP.NET and SQL Server 2005. When you write a service, you are working with the default CLR host implementation and it terminates the process when any unhandled exception is raised, regardless of what AppDomain caused the exception.
Problem is, hosts like ASP.NET and SQL server have a very well defined code execution path. In a web server, managed code runs because of a page request. In a dbase server, it runs because of a query. When something bad happens, they have the luxury of simply aborting everything that the request started (killing the AppDomain) and returning a "sorry, couldn't do it" status back to the client. You might have seen it, crashing the forums server on the old web site was pretty trivial but didn't stop it from serving other requests. Not actually 100% sure about that.
Your service implementation is probably not nearly as clean. I can't tell, you didn't say anything about it. In general, there's a problem with aborting a thread. You always have to abort a thread when there's an unhandled exception. A service typically has one thread, started by the OnStart() method. Aborting it kills the server until somebody stops and starts it again.
You can definitely make it more resilient than that, you could start a "master" thread that launches child threads in response to external events that makes your service do its job. Having a child thread terminated because of an unhandled exception is something you could possibly recover from. But then, if you make that next step, why not have the child thread catch an exception and pass it back to the master thread so it can make an intelligent decision about what to do next.
The cold hard fact of the default CLR host is: if you are not willing to deal with failure, it is not going to do the job for you. And it shouldn't, the .NET 1.x behavior to threads that died with exceptions was a major mistake that got corrected in .NET 2.0.
You know what to do: handle failure. Or write your own host. Or accept that things could be beyond your control and log a good error message so you can tell your customer what to do. I'd strongly recommend the latter.
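To illustrate the "catch-all in the thread main method, then pass the exception back to the master thread" suggestion, here is a small sketch. SupervisedWorker and its Start method are hypothetical names, and the queue is just one way of handing failures to a supervising thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class SupervisedWorker
{
    // Starts 'work' on a new thread. Any exception is caught at the top of the thread
    // and queued in 'failures' so the master thread can decide what to do next.
    public static Thread Start(Action work, BlockingCollection<Exception> failures)
    {
        var thread = new Thread(() =>
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                // Catch-all: without it the exception would be unhandled and end the process.
                failures.Add(ex);
            }
        });
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }
}
```

The master thread can block on failures.Take() (or poll with TryTake) and then log the error, restart the worker, or shut the service down cleanly.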

how to track if a given process throws exception, using windows service in C#

My process sometimes throws an exception such as DllNotFoundException after it starts. I have a monitor service responsible for maintaining the consistent state of the process.
How can I keep track of the state of my process using a Windows service?
Is there an open-source implementation of a Windows service that maintains/tracks the state of a process in Windows?
That's not possible. Exceptions are local to a thread first, and local to a process second if they are unhandled. An unhandled exception will terminate the process. The only shrapnel you could pick up from such a dead process is the process exit code, which should be set to 0xe0434f4d, the exception code for a managed exception. No other relevant info is available, unless there's an unhandled exception handler in the process that logs state. That state is very unreliable; the process suffered a major heart attack.
Keeping multiple processes in sync and running properly when they may die from exceptions is extraordinarily difficult. Only death can be detected reliably; avoid doing more.
Edit: So the actual problem wasn't that the process was dying, but that the process was stuck in an exception handler dialog waiting for the user to hit debug or cancel. The solution to the problem was to disable the .net JIT debug dialog, instructions here
http://weblogs.asp.net/fmarguerie/archive/2004/08/27/how-to-turn-off-disable-the-net-jit-debugging-dialog.aspx
My original proposed solution is below
Not a window service, but this is a pretty easy .NET program to write.
Use System.Diagnostics.Process to get a Process object for the process you want to check. You can use GetProcessesByName if you want to open an existing process. If you create the process from C#, then you will already have the Process object.
Then you can just call WaitForExit on the Process object, with or without a timeout, or test the HasExited property, or register an Exited callback. Once the process has exited, you can check the ExitCode property to find out whether the process returned an error value.
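A short sketch of the WaitForExit/ExitCode approach, assuming the monitor starts the process itself. ProcessMonitor and RunAndWait are hypothetical names; the Process members used are the ones mentioned above.

```csharp
using System;
using System.Diagnostics;

public static class ProcessMonitor
{
    // Starts the process and waits up to 'timeoutMs' milliseconds.
    // Returns the exit code, or null if the process is still running after the timeout.
    public static int? RunAndWait(string fileName, string arguments, int timeoutMs)
    {
        var info = new ProcessStartInfo(fileName, arguments) { UseShellExecute = false };
        using (var process = Process.Start(info))
        {
            if (process == null)
                throw new InvalidOperationException("Process did not start.");

            if (!process.WaitForExit(timeoutMs))
                return null;          // still running (or hung); the caller decides what to do

            return process.ExitCode;  // non-zero conventionally signals failure
        }
    }
}
```

A monitor would treat null as "still alive", zero as a normal exit, and anything else as a failure worth logging or restarting on.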
Have your process write events and exceptions to the system's application log and have your monitor check for entries periodically to find events relating to your process, and you can check the system events for service start and stop events.
If the process itself is a Windows service, you can check its status using System.ServiceProcess.ServiceController.
this worked for now
http://weblogs.asp.net/fmarguerie/archive/2004/08/27/how-to-turn-off-disable-the-net-jit-debugging-dialog.aspx
In the case of DllNotFoundException and other things that happen at startup, you can have the application indicate when it's finished starting up. Have it write a timestamp to a file, for instance. Your monitor can compare the time the application started with the time in the file.
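Something along these lines would do for the timestamp file; this is only a sketch, and StartupMarker, WriteReady and StartedCleanly are invented names. The monitor is assumed to know when it launched the process (for example from Process.StartTime converted to UTC).

```csharp
using System;
using System.Globalization;
using System.IO;

public static class StartupMarker
{
    // Called by the monitored application once its startup has completed.
    public static void WriteReady(string path)
    {
        File.WriteAllText(path, DateTime.UtcNow.ToString("o", CultureInfo.InvariantCulture));
    }

    // Called by the monitor: true if the marker was written at or after 'processStartUtc'.
    public static bool StartedCleanly(string path, DateTime processStartUtc)
    {
        if (!File.Exists(path))
            return false;

        DateTime ready;
        string text = File.ReadAllText(path);
        return DateTime.TryParse(text, CultureInfo.InvariantCulture,
                                 DateTimeStyles.RoundtripKind, out ready)
               && ready >= processStartUtc;
    }
}
```

If the marker is missing or older than the process start time after a reasonable grace period, the monitor can assume startup failed.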
One thing you could do is to monitor the CPU usage of the process. I am assuming your process goes away when the exception is thrown, so its CPU usage should be 0 since it is no longer available. If the CPU usage stays at zero for a certain period of time, you can assume that the process has raised the exception. This method is not foolproof, since you are basing your decision on CPU usage and a legitimate process may have zero CPU usage for a given period of time. You can incorporate this check inside your monitoring service, or you could write a simple VB script to check process CPU usage externally.
