I have a native (unmanaged) .dll written in C++ that is to be called from a managed process (a C# program). Debugging the dll shows that when I create an object in the dll with the new keyword I get a System.AccessViolationException. This only shows up when calling the dll from a managed process, not when I am calling it from another native program.
The code is something similar to this:
// Native.dll file
MyClass* myInstance; // global variable (and does need to be so)
__declspec(dllexport) uint8_t _stdcall NativeFunction(){
myInstance = new MyClass(); // <-- this causes Access Violation Exception
return 0;
}
and the C# code:
using System.Runtime.InteropServices;
class TestClass{
// Loading the dll
[DllImport("Native.dll", CallingConvention = CallingConvention.StdCall)]
private extern static byte NativeFunction();
byte returnVal = NativeFunction(); //<-- exception in managed context
}
I know this has something to do with the native process trying to allocate memory outside allowed memory-space. It only happens when memory is allocated with new (at least in this project) which I unfortunately do need to use. My question is: Does anyone know why this is causing the exception and how to avoid it?
new MyClass is most likely going to call ::operator new, the global operator, unless you have provided MyClass::operator new. And if you haven't provided ::operator new yourself, you should be getting the ::operator new from your compiler (likely Visual Studio).
This ::operator new implementation will probably forward to HeapAlloc. And guess what? That's the same Win32 function which .Net will call as well. There's not much magic involved here; that's how Windows assigns pages of memory to your virtual address space. And when you use those pages, Windows will assign RAM.
Now the thing here is that you don't need to do anything special for this. In fact, doing anything special is how you would break operator new. And since you broke it, you're going to have to figure that out. There is not much magic code going on here. Use a debug build, so you will have a clear stack dump (no inlining). Can you backtrack to HeapAlloc?
Check also the content of the Access Violation Exception. The error code will be C0000005. But what type of exception is it? Read or write? On what type of address? Code or data?
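For comparison, here is a minimal sketch of what the question's code seems to intend, assuming MyClass is an ordinary class with a default constructor (its real definition is not shown). The global is a pointer rather than an object, and the function returns a value. EXPORT_SKETCH and CALLCONV_SKETCH are hypothetical macros so the sketch also compiles outside Windows. A plain new like this goes through the normal heap and should need nothing special under a managed host.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical macros: the question's __declspec(dllexport) and __stdcall on
// Windows, nothing elsewhere, so the sketch compiles on any platform.
#if defined(_WIN32)
#define EXPORT_SKETCH extern "C" __declspec(dllexport)
#define CALLCONV_SKETCH __stdcall
#else
#define EXPORT_SKETCH extern "C"
#define CALLCONV_SKETCH
#endif

// Stand-in for the question's MyClass; its real definition is not shown.
class MyClass {
public:
    MyClass() : value_(42) {}
    int value() const { return value_; }
private:
    int value_;
};

// Global, as the question requires, but a (smart) pointer rather than an object.
static std::unique_ptr<MyClass> myInstance;

EXPORT_SKETCH std::uint8_t CALLCONV_SKETCH NativeFunction() {
    myInstance.reset(new MyClass());  // ordinary ::operator new
    return myInstance ? 1 : 0;        // always return a value to the caller
}
```

If a version this small works from C# and the real DLL does not, the difference between the two is where to look.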
I tried to find out how pinned pointers defined with fixed keyword work. My idea was that internally GCHandle.Alloc(object, GCHandleType.Pinned) was used for that. But when I looked into the IL generated for the following C# code:
unsafe static void f1()
{
var arr = new MyObject[10];
fixed(MyObject * aptr = &arr[0])
{
Console.WriteLine(*aptr);
}
}
I couldn't find any traces of GCHandle.
The only hint I saw that the pinned pointer was used in the method was the following IL declaration:
.locals init ([0] valuetype TestPointerPinning.MyObject[] arr,
[1] valuetype TestPointerPinning.MyObject& pinned aptr)
So the pointer was declared as pinned, and pinning it did not require any additional method calls.
My questions are
Is there any difference between using pinned pointers in the declaration and pinning the pointer by using GCHandle class?
Is there any way to declare a pinned pointer in C# without using fixed keyword? I need this to pin a bunch of pointers within a loop and there's no way I can do this using fixed keyword.
Well, sure there's a difference, you saw it. The CLR supports more than one way to pin an object. Only the GCHandleType.Pinned method is directly exposed to user code. But there are others, like "async pinned handles", a feature that keeps I/O buffers pinned while a driver performs an overlapped I/O operation. And the one that the fixed keyword uses, it doesn't use an explicit handle or method call at all. These extra ways were added to make unpinning the objects again as quick and reliable as possible, very important to GC health.
Fixed buffer pins are implemented by the jitter. It performs two important jobs when it translates MSIL to machine code. The highly visible one is the machine code itself, which you can easily see with the debugger. But it also generates a data structure used by the garbage collector, completely invisible in the debugger. The GC requires it to reliably find object references that are stored in the stack frame or a CPU register. More about that data structure in this answer.
The jitter uses the [pinned] attribute on the variable declaration in the metadata to set a bit in that data structure, indicating that the object that's referenced by the variable is temporarily pinned. The GC sees this and knows to not move the object. Very efficient because it doesn't require an explicit method call to allocate the handle and doesn't require any storage.
But no, these tricks are not available otherwise to C# code, you really do need to use the fixed keyword in your code. Or GCHandle.Alloc(). If you are finding yourself getting lost in the pins then high odds that you ought to be considering pinvoke or C++/CLI so you can easily call native code. The temporary pins that the pinvoke marshaller uses to keep objects stable while the native code is running are another example of automatic pinning that doesn't require explicit code.
I live in the belief that it is not possible to produce / generate an Access Violation Exception in "pure" managed code in .Net, if one looks at .Net as flawless and does not use any external (unmanaged) libraries through, for example, interop.
Am I living in a fantasy?
throw new AccessViolationException();
This is pure managed code and it produces AccessViolationException :P
You can also use the following code (it only throws AccessViolationException because of malformed input though):
IntPtr ptr = new IntPtr(123);
Marshal.StructureToPtr(123, ptr, true);
You can e.g. use WPF, which calls into your graphics card driver. Before .NET 4.5 you can easily get AccessViolationExceptions with buggy graphics card drivers, which are not at all uncommon.
In a strange sense you are right. With .NET 4.5 and above you will never get AccessViolationExceptions in managed code anymore, because the .NET runtime no longer converts an access violation coming from unmanaged code into an AccessViolationException but terminates your process immediately. I guess MS support was tired of searching for .NET Framework bugs only to find that it was a buggy graphics card driver.
You almost never see the CPU actually throw one asynchronously (in the middle of something) because the .NET just-in-time compiler usually provokes an exception if 'this' is null in a method call. It puts cmp [rcx],rcx at the call site to provoke an exception before it potentially uses 0 as the address. It is possible to have large enough field offsets to read readable memory with a null pointer, so this guards against that.
See http://blogs.msdn.com/b/oldnewthing/archive/2007/08/16/4407029.aspx
There is no magic, C# becomes instructions just like any other compiled language. There is no reason to feel all cozy about how AV's will never happen.
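The point about large field offsets can be made concrete with a little arithmetic. The sketch below is only an illustration: BigObject is a hypothetical layout, and the 64 KiB size of the inaccessible region at the bottom of the address space is an assumption (it is the usual figure on Windows). It shows that a field access through a null pointer touches address 0 plus the field offset, which can land beyond that region, so the fault is not guaranteed without an explicit probe.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: a large inline buffer followed by a small field.
struct BigObject {
    char padding[1 << 20];  // 1 MiB of data ahead of the field
    int tail;
};

// Address a field access would touch if the object pointer were null:
// base address 0 plus the field's offset.
constexpr std::uintptr_t AddressTouchedThroughNull(std::size_t fieldOffset) {
    return static_cast<std::uintptr_t>(0) + fieldOffset;
}

// Assumed size of the no-access region at the bottom of the address space.
// 64 KiB is the usual figure on Windows; treat it as an assumption here.
constexpr std::uintptr_t kAssumedGuardRegion = 64 * 1024;

// True when a null-based access at this offset still lands in the guard
// region and therefore faults on its own, without an explicit null probe.
constexpr bool FaultsWithoutProbe(std::size_t fieldOffset) {
    return AddressTouchedThroughNull(fieldOffset) < kAssumedGuardRegion;
}
```

A small offset such as 8 faults by itself, while offsetof(BigObject, tail) does not, which is the case the probe at the call site guards against.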
I have a Windows Forms application running under .NET 4.0. This application imports a DLL which is available for:
32 bit
64 bit
Here is my code snippet:
[DllImport("my64Bit.dll", EntryPoint="GetLastErrorText")]
private static extern string GetLastErrorText();
// Do some stuff...
string message = GetLastErrorText();
When calling this function (compiled for x64) the application just crashes. I can't even see any debug message in Visual Studio 2012. The identical code with the 32-bit-DLL (compiled for x86) works fine. The prototype is:
LPCSTR APIENTRY GetLastErrorText()
Unfortunately I don't have any further information about the DLL as it is a third-party product.
The function signature is quite troublesome. Whether your code will crash depends on what operating system you run. Nothing happens on XP, an AccessViolation exception is thrown on Vista and later.
At issue is that C functions returning strings typically need to do so by returning a pointer to a buffer that stores a string. That buffer needs to be allocated from the heap and the caller needs to release that buffer after using the string. The pinvoke marshaller implements that contract, it calls CoTaskMemFree() on the returned string pointer after converting it to a System.String.
That invariably turns out poorly, a C function almost never uses CoTaskMemAlloc() to allocate the buffer. The XP heap manager is very forgiving, it simply ignores bad pointers. Not the later Windows versions, they intentionally generate an exception. A strong enabler for the "Vista sucks" label btw, it took a while for programmers to get their pointer bugs fixed. If you have unmanaged debugging enabled then you'll get a diagnostic from the heap manager which warns that the pointer is invalid. Very nice feature but unmanaged debugging is invariably disabled when you debug managed code.
You can stop the pinvoke marshaller from trying to release the string by declaring the return value as IntPtr. You then have to marshal the string yourself with Marshal.PtrToStringAnsi() or one of its friends.
You still have the problem of having to release the string buffer. There is no way to do this reliably, you cannot call the proper deallocator. The only hope you have is that the C function actually returns a pointer to a string literal, one that's stored in the data segment and should not be released. That might work for a function that returns an error string, provided it doesn't implement anything fancy like localization. The const char* return type is encouraging.
You will need to test this to make sure there is no memory leak from not releasing the string buffer. Easy to do, call this function a billion times in a loop. If you don't get IntPtr.Zero as a return value and the program doesn't otherwise fall over with an out-of-memory exception then you're good. For 64-bit pinvoke you'll need to keep an eye on the test program's memory consumption.
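To illustrate the hopeful case described above, here is a minimal native-side sketch, assuming the function returns a pointer into static storage. GetLastErrorTextSketch and the message text are hypothetical stand-ins, since nothing is known about the real third-party DLL. The caller copies the bytes and never frees the pointer, which is also what Marshal.PtrToStringAnsi does on the managed side.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the third-party function: it returns a pointer to
// static storage, so the caller must never free it.
static const char kLastErrorText[] = "device not ready";

const char* GetLastErrorTextSketch() {
    return kLastErrorText;
}

// Caller side: copy the bytes, never release the pointer.
std::string CopyLastErrorText() {
    const char* p = GetLastErrorTextSketch();
    return p ? std::string(p) : std::string();
}

// Crude probe: if every call returns the same address, the function is not
// handing out a fresh heap buffer per call.
bool ReturnsStableAddress(std::size_t calls) {
    const char* first = GetLastErrorTextSketch();
    for (std::size_t i = 0; i < calls; ++i) {
        if (GetLastErrorTextSketch() != first) return false;
    }
    return true;
}
```

A stable address is consistent with static storage but does not prove it, so the memory-consumption test is still worth running against the real function.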
Found it. The native function returns LPCSTR, i.e. the C# function cannot return a string. Instead an IntPtr must be returned like this:
[DllImport("my64Bit.dll", EntryPoint="GetLastErrorText")]
private static extern IntPtr GetLastErrorText();
// Do some stuff...
IntPtr ptr = GetLastErrorText();
string s = Marshal.PtrToStringAnsi(ptr);
Assume someClass is a class defined in C# with some method int doSomething(void), and for simplicity, providing a constructor taking no arguments. Then, in C#, instances have to be created on the gc heap:
someClass c; // legit, but only a null reference in C#
// c.doSomething() // would not even compile.
c = new someClass(); // now it refers to an instance of someClass.
int i = c.doSomething();
Now, if someClass is compiled into some .Net library, you can also use it in C++/CLI:
someClass^ cpp_gcpointer = gcnew someClass();
int i = cpp_gcpointer->doSomething();
That easy! Nifty! This is of course assuming a reference to the .Net library has been added to the project and a corresponding using declaration has been made.
It is my understanding that this is the precise C++/CLI equivalent of the previous C# example (condensed to a single line, this is not the point I'm interested in). Correct? (Sorry, I'm new to the topic)
In C++, however, also
someClass cpp_cauto; // in C++ declaration implies instantiation
int i = cpp_cauto.doSomething();
is valid syntax. Out of curiosity, I tried this today. A colleague, looking over my shoulder, was willing to bet it would not even compile. He would have lost the bet. (This is still the class from the C# assembly). It actually also produces the same result i as the code from the previous examples.
Nifty, too, but -- uhmm -- what exactly is it, what is created here? My first wild guess was that behind my back, .Net dynamically creates an instance on the gc heap and cpp_cauto is some kind of wrapper for this object, behaving syntactically like an instance of class someClass. But then I found this page
http://msdn.microsoft.com/en-us/library/ms379617%28v=vs.80%29.aspx#vs05cplus_topic2
This page seems to tell me that (at least, if someClass were a C++ class) cpp_cauto is actually created on the stack, which, to my knowledge, would be the same behaviour you get in classical C++. And something you cannot do in C# (you can't, can you?). What I'd like to know: is the instance from the C# assembly also created on the stack? Can you produce .Net binaries in C++ with class instances on the stack which you cannot create in C#? And might this possibly even give you a performance gain :-) ?
Kind regards,
Thomas
The link you referenced explains this in detail:
C++/CLI allows you to employ stack semantics with reference types. What this means is that you can introduce a reference type using the syntax reserved for allocating objects on the stack. The compiler will take care of providing you the semantics that you would expect from C++, and under the covers meet the requirements of the CLR by actually allocating the object on the managed heap.
Basically, it's still making a handle to the reference type on the managed heap, but automatically calls Dispose() on IDisposable implementations when it goes out of scope for you.
The object instance, however, is still effectively allocated via gcnew (placed on the managed heap) and collected by the garbage collector. This, too, is explained in detail:
When d goes out of scope, its Dispose method will be called to allow its resources to be released. Again, since the object is actually allocated from the managed heap, the garbage collector will take care of freeing it in its own time.
Basically, this is all handled by the compiler to make the code look and work like standard C++ stack allocated classes, but it's really just a compiler trick. The resulting IL code is still doing managed heap allocations.
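As a rough analogy in standard C++ (not what the C++/CLI compiler actually emits), the trick can be pictured as a local wrapper whose payload lives on the heap and whose destructor calls Dispose(). Resource, StackSemantics and UseAndDispose are hypothetical names made up for this sketch. One difference from the CLR is labeled in the comments: here the memory is released at scope exit as well, whereas a managed object waits for the garbage collector.

```cpp
#include <memory>
#include <utility>

// Hypothetical resource type standing in for an IDisposable .NET class.
struct Resource {
    bool* disposedFlag;
    explicit Resource(bool* flag) : disposedFlag(flag) {}
    int doSomething() const { return 7; }
    void Dispose() { *disposedFlag = true; }
};

// Looks like a local object, but the payload lives on the heap.
// Leaving scope runs Dispose(); here the memory is freed at once too, whereas
// under the CLR the garbage collector would reclaim it later.
template <typename T>
class StackSemantics {
public:
    template <typename... Args>
    explicit StackSemantics(Args&&... args)
        : ptr_(new T(std::forward<Args>(args)...)) {}
    ~StackSemantics() { ptr_->Dispose(); }
    StackSemantics(const StackSemantics&) = delete;
    StackSemantics& operator=(const StackSemantics&) = delete;
    T* operator->() { return ptr_.get(); }
private:
    std::unique_ptr<T> ptr_;
};

int UseAndDispose(bool* disposed) {
    StackSemantics<Resource> r(disposed);  // heap allocation behind the scenes
    return r->doSomething();               // Dispose() runs when r leaves scope
}
```

C++/CLI hides the indirection completely, so the member is reached with a dot, as in cpp_cauto.doSomething(); the arrow here only exists because plain C++ cannot overload the dot operator.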
Will using GetComInterfaceForObject and passing the returned IntPtr to unmanaged code keep the managed object from being moved in memory? Or does the CLR somehow maintain that pointer? Note that the unmanaged code will use this for the lifetime of the program, and I need to make sure the managed object is not being moved by the GC. (At least I think that's right?)
EDIT - Alright I found some info and I am thinking that this may be the answer. It deals with delegates, but I would have to believe calling GetComInterfaceForObject does something along the same lines.
Source of the Following text
"Managed Delegates can be marshaled to unmanaged code,
where they are exposed as unmanaged function pointers. Calls on those
pointers will perform an unmanaged to managed transition; a change in
calling convention; entry into the correct AppDomain; and any necessary
argument marshaling. Clearly the unmanaged function pointer must refer to a
fixed address. It would be a disaster if the GC were relocating that! This
leads many applications to create a pinning handle for the delegate. This
is completely unnecessary. The unmanaged function pointer actually refers
to a native code stub that we dynamically generate to perform the transition
& marshaling. This stub exists in fixed memory outside of the GC heap.
However, the application is responsible for somehow extending the lifetime
of the delegate until no more calls will occur from unmanaged code. The
lifetime of the native code stub is directly related to the lifetime of the
delegate. Once the delegate is collected, subsequent calls via the
unmanaged function pointer will crash or otherwise corrupt the process. In
our recent release, we added a Customer Debug Probe which allows you to
cleanly detect this all too common bug in your code. If you haven't
started using Customer Debug Probes during development, please take a look!"
As your edit states (about delegates), your managed object doesn't need to be pinned, since GetComInterfaceForObject returns a "pinned" pointer that calls through to the correct managed object. However, you will need to make sure that the managed object lives for as long as the COM clients are using the unmanaged pointer to it.