KiUserExceptionDispatch calls into handler in unloaded NativeAot dll - c#

I have a program that loads a NativeAOT-compiled DLL into a process, and I was able to unload the module with a somewhat hacky approach. However, I recently discovered a problem: the Windows exception dispatcher still calls into a handler in the NativeAOT module after it has been unloaded, no matter where the exception is thrown, causing an access violation.
Pseudo code:
HMODULE module = LoadLibraryA("Aot.dll");
// Code that terminates .NET runtime thread and unload dll
.......
// Throw and catch an exception
try {
    throw exception("argh"); // Access violation executing location 0x00007FF97C8E69B0.
}
catch (const exception& ex) {
    cout << "Caught" << endl; // Handler never called
}
Stack trace from Visual Studio:
00007ff97a3a69b0() -> This is a function in the unmapped module
ntdll.dll!RtlpCallVectoredHandlers()
ntdll.dll!RtlDispatchException()
ntdll.dll!KiUserExceptionDispatch()
main()
Any idea on why this is happening and what solution/hack I can use will be appreciated!

As Raymond Chen pointed out in the comments (much appreciated), .NET registers a vectored exception handler with AddVectoredExceptionHandler.
I hooked this API, and during unload I check which module each handler belongs to with GetModuleHandleEx, then unregister the handlers that live in the NativeAOT module. Now the problem is gone.
There's a link to a demo in the answer of the question linked above.
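The bookkeeping behind that fix can be sketched roughly as follows. This is a minimal, platform-neutral sketch, not the code from the linked demo: VehTracker and its two callbacks are hypothetical names. On Windows the owner lookup would presumably wrap GetModuleHandleEx with GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS, and the unregister callback would wrap RemoveVectoredExceptionHandler. The hook on AddVectoredExceptionHandler itself is not shown.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical bookkeeping for handlers registered through a hooked
// AddVectoredExceptionHandler. Nothing in this class is a Windows API.
class VehTracker {
public:
    // Maps a code address to an opaque token identifying its owning module.
    using OwnerOf = std::function<const void*(const void*)>;
    // Unregisters one handler, given the handle returned at registration.
    using Unregister = std::function<void(void*)>;

    VehTracker(OwnerOf ownerOf, Unregister unregister)
        : ownerOf_(std::move(ownerOf)), unregister_(std::move(unregister)) {}

    // Call from the hook with the handle the real API returned and the
    // address of the handler that was registered.
    void OnRegistered(void* handle, const void* handlerAddress) {
        if (handle != nullptr) entries_.push_back({handle, handlerAddress});
    }

    // Call just before unloading `module`: unregisters every tracked handler
    // whose code lives in that module and returns how many were removed.
    std::size_t RemoveOwnedBy(const void* module) {
        std::size_t removed = 0;
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (ownerOf_(it->handlerAddress) == module) {
                unregister_(it->handle);
                it = entries_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    std::size_t Size() const { return entries_.size(); }

private:
    struct Entry {
        void* handle;
        const void* handlerAddress;
    };
    std::vector<Entry> entries_;
    OwnerOf ownerOf_;
    Unregister unregister_;
};
```

The owner is resolved at unload time rather than at registration time, which matches the description above of checking the module "during unload".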

Related

Catching fatal exceptions thrown from unmanaged code

Currently, there is no way (at least I did not find one) to catch fatal exceptions (such as stack overflow, segfault, ...) with a try-catch block.
I have already opened an issue in the .NET Core repository, so you can read more details there (https://github.com/dotnet/core/issues/4228).
What I'm trying to do is keep the application from crashing when there is a segfault, stack overflow, or any other fatal exception in loaded unmanaged code. What happens now is that the .NET CLR kills my application if any fatal error occurs.
Example:
Managed C# code loads an external C++ DLL via the kernel LoadLibrary function.
Assume the DLL is intentionally created for robustness testing, so when a specific function is called it triggers a segfault (e.g. by trying to read data outside of array bounds).
When this error happens, it is caught by the .NET CLR, which immediately kills the calling managed C# application.
What I would like is simply to report that this happened instead of dying silently.
I did some research and found the reasoning behind this behavior, which is described in the issue above.

Isolate exceptions thrown in an AppDomain to not Crash the Application

TL;DR: How do you isolate add-in exceptions from killing the main process?
I want to have a very stable .Net application that runs less stable code in an AppDomain. This would appear to be one of the prime purposes of the AppDomain in the first place (well, that and security sandboxing) but it doesn't appear to work.
For instance in AddIn.exe:
public static class Program
{
    public static void Main(string[] args)
    {
        throw new Exception("test");
    }
}
Called in my 'stable' code with:
var domain = AppDomain.CreateDomain("sandbox");
domain.UnhandledException += (sender, e) => {
Console.WriteLine("\r\n ## Unhandled: " + ((Exception) e.ExceptionObject).Message);
};
domain.ExecuteAssemblyByName("AddIn.exe", "arg A", "arg B");
The exception thrown in the AppDomain gets passed straight to the application that created the domain. I can log these with domain.UnhandledException and catch them in the wrapper application.
However, there are more problematic exceptions thrown, for instance:
public static class Program
{
public static void Main(string[] args)
{
Stackoverflow(1);
}
static int Stackoverflow(int x)
{
return Stackoverflow(++x);
}
}
This will throw a stackoverflow exception that kills the entire application every time. It doesn't even fire domain.UnhandledException - it just goes straight to killing the entire application.
In addition, calling things like Environment.Exit() from inside the AppDomain also kills the parent application: do not pass GO, do not collect £200, and don't run any ~Finaliser or Dispose().
It seems from this that AppDomain fundamentally doesn't do what it claims (or at least what it appears to claim) to do, as it just passes all exceptions straight to the parent domain, making it useless for isolation and pretty weak for any kind of security (if I can take out the parent process I can probably compromise the machine). That would be a pretty fundamental failure in .Net, so I must be missing something in my code.
Am I missing something? Is there some way to make AppDomain actually isolate the code that it's running and unload when something bad happens? Am I using the wrong thing and is there some other .Net feature that does provide exception isolation?
I'll throw in some random thoughts, but what @Will has said is correct regarding permissions, CAS, security transparency, and sandboxing. AppDomains are not quite Superman. Regarding exceptions, though, an AppDomain is capable of handling most unhandled exceptions. The ones it cannot handle are called asynchronous exceptions. Finding documentation on them is a little more difficult now that we have async/await, but it exists, and they come in three common forms:
StackOverflowException
OutOfMemoryException
ThreadAbortException
These exceptions are said to be asynchronous because they can be thrown anywhere, even between CIL opcodes. The first two are about the whole environment dying. The CLR lacks the powers of a phoenix: it cannot handle these exceptions because the means of doing so are already dead. Note that these rules only apply when the CLR throws them. If you just new up an instance and throw it yourself, it behaves like a normal exception.
Sidenote: If you ever peek at a memory dump of a process that is hosting the CLR, you will see there are always OutOfMemoryException, ThreadAbortException, and StackOverflowException on the heap, but they have no roots you can see, and they never get GCed. What gives? The reason they are there is because the CLR preallocates them - it wouldn't be able to allocate them at the time they are needed. It wouldn't be able to allocate an OutOfMemoryException when we're out of memory.
There is a piece of software that is able to handle all of these exceptions. Starting in 2005, SQL Server has had the ability to run .NET assemblies with a feature called SQLCLR. SQL Server is a rather important process, and having a .NET assembly throw an OutOfMemoryException and bring down the entire SQL Server process seemed tremendously undesirable, so the SQL team doesn't let that happen.
They do this using a .NET 2.0 feature called constrained execution and critical regions. This is where things like ExecuteCodeWithGuaranteedCleanup come into play. If you are able to host the CLR yourself, start with native code and spin up the CLR yourself, you are then able to change the escalation policy: from native code you are able to handle those managed exceptions. This is how SQL CLR handles those situations.
You can't do anything about Environment.Exit(), just like you can't prevent a user from killing your process in Task Manager. Static analysis for this could be circumvented, as well. I wouldn't worry too much about that. There are things you can do, and things you really can't.
The AppDomain does do what it claims to do. However, what it actually claims to do and what you believe it claims to do are two different things.
Unhandled exceptions anywhere will take down your application. AppDomains don't protect against these. But you can prevent unhandled exceptions from crossing AppDomain boundaries with the following steps (sorry, no code):
1. Create your AppDomain.
2. Load and unwrap your plugin controller in this AppDomain.
3. Control plugins through this controller, which isolates calls to 3rd party plugins by wrapping them in try/catch blocks.
Really, the only thing an AppDomain gives you is the ability to load, isolate and unload assemblies that you do not fully trust during runtime. You cannot do this within the executing AppDomain. All loaded assemblies stay until execution halts, and they enjoy the same permission set as all other code in the AppDomain.
To be a touch clearer, here's some pseudocode that looks like c# that prevents 3rd-party code from throwing exceptions across the AppDomain boundary.
// Derives from MarshalByRefObject so the application AppDomain receives a proxy, not a copy.
public class PluginHost : MarshalByRefObject, IPluginHost, IPlugin
{
    private IPlugin _wrapped;

    void IPluginHost.Load(string filename, string typename)
    {
        // load the assembly (filename) into the AppDomain.
        // Activator.CreateInstance the typename to create 3rd party plugin
        // _wrapped = the plugin instance
    }

    void IPlugin.DoWork()
    {
        try
        {
            _wrapped.DoWork();
        }
        catch (Exception ex)
        {
            // log
            // unload plugin whatevs
        }
    }
}
This type would be created in your Plugin AppDomain, and its proxy unwrapped in the application AppDomain. You use it to puppet the plugin within the Plugin AppDomain. It prevents exceptions from crossing AppDomain boundaries, performs loading tasks, and so on. Pulling a proxy of the plugin type into the application AppDomain is very risky, as any object types that are NOT MarshalByRefObject that the proxy can somehow get into your hands (e.g., throw new MyCustomException()) will result in the plugin assembly being loaded in the application AppDomain, thus rendering your isolation efforts null and void.
(this is a bit oversimplified)

Catching native code exceptions in .NET

I'm currently working with ffmpeg with a proprietary wrapper and I need to catch native code exceptions that sometimes occur during transcoding procedures etc.
I already read those questions/answers:
https://stackoverflow.com/questions/10517199/cant-catch-native-exception-in-managed-code
https://stackoverflow.com/questions/150544/can-you-catch-a-native-exception-in-c-sharp-code
However, they were not really helpful as the exceptions occur not during a function I call, but in a completely different thread that runs side-by-side in the ffmpeg library and were thrown by additional components such as DirectX. This is somehow a real issue as the exceptions tear down my whole application!
Any help is really appreciated.
You can catch exceptions thrown from other threads using these 2 events:
System.Windows.Forms.Application.ThreadException += Application_ThreadException;
AppDomain.CurrentDomain.UnhandledException += CurrentDomain_UnhandledException;
// Set the unhandled exception mode to force all Windows Forms errors to go through our handler.
Application.SetUnhandledExceptionMode(UnhandledExceptionMode.CatchException);
You might also need to set the following in your App.config file:
<configuration xmlns="http://schemas.microsoft.com/.NetConfiguration/v2.0">
<runtime>
<legacyUnhandledExceptionPolicy enabled="1" />
</runtime>
...
If you can't get it to work with managed code, the code below will catch everything in native (c++) code (not sure if you can PInvoke it):
//Sets the handler for a pure virtual function call.
_set_purecall_handler( &MyPureVirtualCallHandler );
// Sets the handler function to be called when invalid parameters are passed to C runtime functions
_set_invalid_parameter_handler( &MyInvalidParameterHandler );
//Register an unhandled exception filter
SetUnhandledExceptionFilter(&UnhandExceptionFilter);

application level global exception handler didn't get hit

My .NET application has a global exception handler that subscribes to the AppDomain.CurrentDomain.UnhandledException event. On a few occasions I have seen my application crash without this global exception handler ever being hit. Not sure if it helps, but the application is doing some COM interop.
My understanding is that as long as I don't have any local catch blocks swallowing the exception, this global exception handler should always be hit. Any ideas on what I might be missing that causes this handler never to be invoked?
Is this the cause of your problem?
AppDomain.CurrentDomain.UnhandledException not firing without debugging
The CLR is not all-powerful to catch every exception that unmanaged code can cause. Typically an AccessViolationException btw. It can only catch them when the unmanaged code is called from managed code. The scenario that's not supported is the unmanaged code spinning up its own thread and this thread causing a crash. Not terribly unlikely when you work with a COM component.
Since .NET 4.0, a Fatal Execution Engine exception no longer causes the UnhandledException event to fire. The exception was deemed too nasty to allow any more managed code to run. It is. And traditionally, a StackOverflowException causes an immediate abort.
You can diagnose this somewhat from the ExitCode of the process. It contains the exception code of the exception that terminated the process. 0x8013yyyy is an exception caused by managed code. 0xc0000005 is an access violation. Etcetera. You can use adplus, available from the Debugging Tools For Windows download to capture a minidump of the process. Since this is likely to be caused by the COM component, working with the vendor is likely to be important to get this resolved.
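As a rough illustration of reading the exit code this way, here is a minimal sketch. The function name is made up for this example. The 0x8013yyyy and 0xC0000005 cases come from the answer above. The 0xC00000FD case (STATUS_STACK_OVERFLOW) and the generic NTSTATUS error check (top two bits set) are additions, not something the answer mentions.

```cpp
#include <cstdint>
#include <string>

// Rough classification of a Windows process exit code, following the rules
// of thumb above. DescribeExitCode is an illustrative name, not an API.
std::string DescribeExitCode(std::uint32_t code) {
    if (code == 0xC0000005u) return "access violation";
    if (code == 0xC00000FDu) return "stack overflow";  // addition, see lead-in
    if ((code & 0xFFFF0000u) == 0x80130000u) return "managed (CLR) error";
    // NTSTATUS values with both top bits set have "error" severity.
    if ((code & 0xC0000000u) == 0xC0000000u) return "other NTSTATUS error";
    return "unclassified";
}
```

Anything that falls through to "unclassified" may simply be an ordinary exit code chosen by the application, so this is only a first hint before looking at a dump.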
Since you are doing COM interop, I strongly suspect that some unmanaged code running in another thread caused an unhandled exception. This leads to application exit without a call to your unhandled exception handler.
Besides this, .NET 4.0 made the policy stricter about when the application is shut down without further notice.
Under the following conditions your application is shut down without further notice (Environment.FailFast):
Pre .NET 4:
StackOverflowException
.NET 4:
StackOverflowException
AccessViolationException
You can override the behaviour in .NET 4 by decorating a method with the HandleProcessCorruptedStateExceptionsAttribute or you can add the legacyCorruptedStateExceptionsPolicy tag to your App.config.
If your problem is an uncaught exception in unmanaged code, you can either run your application under a debugger or let it crash and collect a memory dump for post-mortem debugging. Debugging crash dumps is usually done with WinDbg.
After you have downloaded WinDbg you have adplus (a VBS script located under Program Files\Debugging Tools for Windows), which you can attach to your running process to trigger a crash dump when the process terminates due to an exception.
adplus -crash -p yourprocessid
Then you have a much better chance of finding out what was going on when your process terminated. Windows can also be configured to take a crash dump for you, via Dr. Watson on older Windows versions or Windows Error Reporting on newer ones.
Crash Dump Generation
Hard-core programmers will insist on creating their own dump generation tool, which basically uses the AeDebug registry key. When this key has a value that points to an existing executable, that executable is called when an application crashes; it can, for example, show the Visual Studio debugger chooser dialog or trigger dump generation for your process.
Suspend Threads
An often overlooked point: when you create a crash dump with an external tool (it is better to rely on external tools, since you do not know how badly your process is corrupted, and if it is out of memory you are already in a bad situation), you should suspend all threads of the crashed process before you take the dump.
When you take a big full memory dump it can take several minutes depending on the allocated memory of the faulted process. During this time the application threads can continue to wreak havoc on your application state leaving you with a dump which contains an inconsistent process state which did change during dump generation.
This would happen if your handler throws an exception.
It would also happen if you call Environment.FailFast or if you Abort the UI thread.

Catching a StackOverflowException

How do I catch a StackOverflowException?
I have a program that allows the user to write scripts, and when running arbitrary user-code I may get a StackOverflowException. The piece running user code is obviously surrounded with a try-catch, but stack overflows are uncatchable under normal circumstances.
I've looked around and this is the most informative answer I could find, but still led me to a dead end; from an article in the BCL team's blog I found that I should use RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup to call the code and the delegate that would get called even after a stack overflow, but when trying, the process gets terminated with the stack overflow message without the delegate ever getting called. I've tried adding PrePrepareMethodAttribute on the handler method but that didn't change anything.
I've also tried using an AppDomain and handling both the UnhandledException and the DomainUnload event - but the entire process gets killed on stack overflows. The same happens even if I throw new StackOverflowException(); manually and not get an actual stack overflow.
To handle an exception that is not handled by your code, you can subscribe to the AppDomain's UnhandledException event, which is what the operating system handles when it displays the dialog that says the program exited unexpectedly.
In the Main method of your program use
var currentDomain = AppDomain.CurrentDomain;
and then add a handler to the event
currentDomain.UnhandledException += handler;
In the handler you can do anything you want, such as log, display an error, or even reinitialize the program if desired.
Program your script engine to trace the level of recursion in the script. If the recursion goes above some arbitrarily large number then kill the script before it kills your program. Alternatively you could program the script engine to operate in a stackless manner and store all of the script's stack data in a System.Collections.Generic.Stack<T>. Even if you do use a separate stack you will still want to limit the level of recursion that a script can have, but stack collection will give you a few hundred times more stack space.
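A minimal sketch of the recursion-tracking idea, assuming the script engine funnels every script-level call through one routine. All names here (DepthGuard, RecursionLimitExceeded, CallScriptFunction) are hypothetical and not taken from any particular engine. The guard increments a counter on entry, decrements it on exit, and throws an ordinary, catchable exception once the limit is reached, long before the real stack runs out.

```cpp
#include <cstddef>
#include <stdexcept>

// Thrown instead of letting the host process overflow its real stack.
class RecursionLimitExceeded : public std::runtime_error {
public:
    RecursionLimitExceeded()
        : std::runtime_error("script recursion limit exceeded") {}
};

// RAII guard: counts one level of script recursion for its lifetime.
class DepthGuard {
public:
    DepthGuard(std::size_t& depth, std::size_t limit) : depth_(depth) {
        if (depth_ >= limit) throw RecursionLimitExceeded();
        ++depth_;
    }
    ~DepthGuard() { --depth_; }
    DepthGuard(const DepthGuard&) = delete;
    DepthGuard& operator=(const DepthGuard&) = delete;

private:
    std::size_t& depth_;
};

// Stand-in for the interpreter's "call a script function" routine. It
// recurses `remaining` more times and returns the deepest level reached.
std::size_t CallScriptFunction(std::size_t& depth, std::size_t limit,
                               std::size_t remaining) {
    DepthGuard guard(depth, limit);
    if (remaining == 0) return depth;
    return CallScriptFunction(depth, limit, remaining - 1);
}
```

Because the guard's constructor throws before incrementing, the counter is back at zero once the exception has unwound, so the host can report the error and run the next script with a clean state.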
You need to run the code in a separate process.
You must load the user script, or any external 3rd party plugin, in a different app domain, so that you can safely unload the domain should an unrecoverable error occur.
You must create a different AppDomain since you cannot unload an assembly from a loaded domain, and you don't want to shut down your main application domain.
You create a new application domain like this:
var scriptDomain = AppDomain.CreateDomain("User Scripts");
You can then load any type from an assembly that you need to create. You have to be sure that the object that you will load inherits from MarshalByRefObject.
I assume that your user script is wrapped inside an object defined like this:
public abstract class UserScriptBase : MarshalByRefObject
{
public abstract void Execute();
}
You can therefore load any user script like this:
var script = (UserScriptBase)scriptDomain.CreateInstanceFromAndUnwrap(type.Assembly.Location, type.FullName);
After all that, you can subscribe to the scriptDomain.UnhandledException and monitor any unrecoverable error.
Using a different application domain is not easy and you will most likely encounter some loading/unloading problems (for example, a DLL that is referenced by both domains).
I recommend that you follow one of the tutorials you can find online.
