AccessViolationException in P/Invoke call - c#

I'm writing a small zlib wrapper via P/Invoke calls. It runs perfectly on a 64-bit target (64-bit C# build, 64-bit DLL), but throws an AccessViolationException on a 32-bit target (32-bit C# build, 32-bit DLL).
Here's the C# signature and code which throws the exception:
[DllImport(Program.UnmanagedDll, CallingConvention = CallingConvention.Cdecl)]
private static extern ZLibResult ZLibDecompress(byte[] inStream, uint inLength, byte[] outStream, ref uint outLength);
internal enum ZLibResult : byte {
Success = 0,
Failure = 1,
InvalidLevel = 2,
InputTooShort = 3
}
internal static ZLibResult Decompress(byte[] compressed, out byte[] data, uint dataLength) {
var len = (uint) compressed.Length;
fixed (byte* c = compressed) {
var buffer = new byte[dataLength];
ZLibResult result;
fixed (byte* b = buffer) {
result = ZLibDecompress(c, len, b, &dataLength);
}
if(result == ZLibResult.Success) {
data = buffer;
return result;
}
data = null;
return result;
}
}
And here's the C code (compiled with MinGW-w64):
#include <stdint.h>
#include "zlib.h"
#define ZLibCompressSuccess 0
#define ZLibCompressFailure 1
__cdecl __declspec(dllexport) uint8_t ZLibDecompress(uint8_t* inStream, uint32_t inLength,
uint8_t* outStream, uint32_t* outLength)
{
uLongf oL = (uLongf)*outLength;
int result = uncompress(outStream, &oL, inStream, inLength);
*outLength = (uint32_t)oL;
if(result == Z_OK)
return ZLibCompressSuccess;
return ZLibCompressFailure;
}
I've looked over everything and can't figure out why an access violation would be happening on a 32-bit build and not on a 64-bit build. ZLibDecompress works fine decompressing the same stream when called from a C app, but throws an access violation when called from my C# app.
Does anyone know why this could be happening?
EDIT:
Updated my code, still getting an access violation on 32-bit builds, but not 64-bit.
C# Code:
[DllImport(Program.UnmanagedDll, CallingConvention = CallingConvention.Cdecl)]
private static extern ZLibResult ZLibDecompress(
[MarshalAs(UnmanagedType.LPArray)]byte[] inStream, uint inLength,
[MarshalAs(UnmanagedType.LPArray)]byte[] outStream, ref uint outLength);
internal static ZLibResult Decompress(byte[] compressed, out byte[] data, uint dataLength) {
var buffer = new byte[dataLength];
var result = ZLibDecompress(compressed, (uint)compressed.Length, buffer, ref dataLength);
if(result == ZLibResult.Success) {
data = buffer;
return result;
}
data = null;
return result;
}
C Code:
__declspec(dllexport) uint8_t __cdecl ZLibDecompress(uint8_t* inStream, uint32_t inLength,
uint8_t* outStream, uint32_t* outLength) {
uLongf oL = (uLongf)*outLength;
int result = uncompress(outStream, &oL, inStream, inLength);
*outLength = (uint32_t)oL;
if(result == Z_OK)
return ZLibCompressSuccess;
return ZLibCompressFailure;
}

fixed (byte* b = buffer) {
result = ZLibDecompress(c, len, b, &dataLength);
}
No, that can't work. The fixed keyword provides a highly optimized way to ensure that the garbage collector moving objects doesn't cause trouble. It doesn't do it by pinning the object (like the documentation says), it does it by exposing the b variable to the garbage collector. Which then sees it referencing the buffer and updates the value of b when it moves buffer.
That however can't work in this case, a copy of the b value was passed to ZlibDecompress(). The garbage collector cannot update that copy. The outcome will be poor when a GC occurs while ZLibDecompress() is running, the native code will destroy the integrity of the garbage collected heap and that will eventually cause an AV.
You cannot use fixed, you must use GCHandle.Alloc() to pin the buffer.
But don't do that either, you are helping too much. The pinvoke marshaller is already very good at pinning objects when necessary. Declare the instream and outstream arguments as byte[] instead of byte*. And pass the arrays directly without doing anything special. Also, the outlength argument should be declared ref int.

In 64-bit there's only one ABI for Windows (no cdecl/stdcall), so the problem for 32-bit seems to be with the calling conventions. Your parameter pointers are going into wrong registers and the native function accesses the wrong memory region.
To resolve the issue:
Try commenting out the lines in the native function (see if it crashes - if yes, it's not the calling convention)
Try playing with the calling conventions "cdecl/stdcall"
To check everything, try dumping the pointer values and see if they coincide in native/managed functions.
EDIT:
Then it is a problem with pointers. You are allocating the arrays in C# (thus they reside in a managed heap). You have to marshal them using the "[MarshalAs(UnmanagedType.LPArray)]" attribute.
[DllImport(Program.UnmanagedDll, CallingConvention = CallingConvention.Cdecl)]
private static extern ZLibResult ZLibDecompress(
[MarshalAs(UnmanagedType.LPArray)] byte[] inStream,
uint inLength,
[MarshalAs(UnmanagedType.LPArray)] byte[] outStream,
ref UInt32 outLength);
The [In,Out] modifier might be of help also.
And yes, as Hans says, pin the pointers and do not allow them to be garbage-collected.
byte[] theStream = new byte[whateveyouneed];
// Pin down the byte array
GCHandle handle = GCHandle.Alloc(theStream, GCHandleType.Pinned);
IntPtr address = handle.AddrOfPinnedObject();
and then pass it as IntPtr.

The actual issue was caused by MinGW-w64 generating a buggy DLL. I had been passing -ftree-vectorize to gcc when building zlib, which was generating code that the 32-bit CLR didn't like. The code ran fine after using less aggressive optimization options.

Related

Marshaling struct with IntPtr to buffer from C# to C dll

I've been banging my head all day and hope someone can help. I need to marshal a managed data structure to an unmanaged C dll. When I look at all the memory it appears that what I'm doing is working, but the C dll (a black box to me) is returning an error indicating the data is corrupt. Can anyone point out my error?
C declarations
typedef struct _TAG_Data
{
void *data; // Binary data
uint32_t size; // Data size bytes
} Data;
// Parse binary data, extract int value
ParseData(Data *result, uint64_t parameter, int32_t *value)
Managed object:
public class MyData
{
public byte[] data;
public UInt32 size;
}
Packed equivalent for moving to unmanaged memory:
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct MyData_Packed
{
public IntPtr data;
public UInt32 size;
}
At this point I have a managed "MyData" struct called MyResult with valid data that needs to go into the dll. Here's what I'm doing:
[DllImport("Some.dll", EntryPoint = "ParseData", SetLastError = true, CharSet = CharSet.Ansi)]
private static extern Int32 ParseData_Native(IntPtr result, UInt64 parameter, ref Int32 value);
IntPtr MyResultPackedPtr = new IntPtr();
MyData_Packed MyResultPacked = new MyData_Packed();
// Copy "MyResult" into un-managed memory so it can be passed to the C library.
// Allocate un-managed memory for the data buffer
MyResultPacked.data = Marshal.AllocHGlobal((int)MyResult.size);
// Copy data from managed "MyResult.data" into the unmanaged "MyResultPacked.data"
Marshal.Copy(MyResult.data, 0, MyResultPacked.data, (int)MyResult.size);
MyResultPacked.size = MyResult.size;
// Allocate unmanaged memory for the structure itself
MyResultPackedPtr = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(MyData_Packed)));
// Copy the packed struct into the unmanaged memory and get our pointer
Marshal.StructureToPtr(MyResultPacked, MyResultPackedPtr, false);
// Pass our pointer to the unmanaged struct, which points to the unmanaged data buffer, to the C dll
Int32 tmp = 0;
ErrorCode = ParseData_Native(MyResultPackedPtr, parameter, ref tmp);
When I look at the unmanaged memory pointed to by MyResultPacked.data, the data is correct so the copy was good. And when I look at MyResultPackedPtr in memory, the first 8 bytes are the address of the same unmanaged memory pointed to by MyResultPacked.data (64-bit machine), and the next 4 bytes are the proper size of the data. So it appears that MyResultPackedPtr points to a valid copy of MyResultPacked. But the return value from ParseData() indicates my data must be corrupt, so I must be doing something wrong.
To take it a step further, I wrote the same code 100% in C and it works. And the data in the binary buffer in C matched the data in the binary buffer in C#, going by the memory watch feature in Visual Studio, so it appears my data handling is correct. Which makes me think something is wrong with the way I'm passing MyResultPackedPtr to the dll. Unfortunately I don't have the source for the dll and cannot step into it. Can anyone offer a suggestion on what to try next?
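A quick way to double-check the layout described above from the native side is to compare field offsets with and without 1-byte packing. The sketch below is only an illustration in C: DataPacked, field_offsets_match and tail_padding are made-up names, and it assumes a compiler that honors #pragma pack (MSVC, GCC and Clang do).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The struct as declared in the question, with the compiler's default packing. */
typedef struct _TAG_Data {
    void *data;     /* Binary data */
    uint32_t size;  /* Data size in bytes */
} Data;

/* The same fields with 1-byte packing, mirroring Pack = 1 on the C# side.
   DataPacked is a hypothetical name used only for this comparison. */
#pragma pack(push, 1)
typedef struct {
    void *data;
    uint32_t size;
} DataPacked;
#pragma pack(pop)

/* Returns 1 when both layouts place each field at the same offset. */
int field_offsets_match(void)
{
    return offsetof(Data, data) == offsetof(DataPacked, data)
        && offsetof(Data, size) == offsetof(DataPacked, size);
}

/* Number of trailing padding bytes that exist only in the default layout. */
size_t tail_padding(void)
{
    assert(sizeof(Data) >= sizeof(DataPacked));
    return sizeof(Data) - sizeof(DataPacked);
}
```

On a typical 64-bit build the default-packed struct is 16 bytes and the packed one is 12, but both place data at offset 0 and size at offset 8, which matches what the memory window showed. If that also holds for the compiler that built the DLL, Pack = 1 would not by itself explain the error when a single struct is passed by pointer.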
I don't see why you need any of this custom marshalling code in the first place. You should be able to pass the struct with the byte[] array directly, and the marshaller will sort out the copying.
You also need to set the calling convention correctly.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct MyData_Packed
{
public byte[] data;
public UInt32 size;
}
[DllImport("Some.dll", EntryPoint = "ParseData", SetLastError = true, CallingConvention = CallingConvention.Cdecl)]
private static extern int ParseData_Native(ref MyData_Packed result, ulong parameter, out int value);
var MyResultPacked = new MyData_Packed
{
data = MyResult.data,
size = MyResult.size,
};
ErrorCode = ParseData_Native(ref MyResultPacked, parameter, out var tmp);

Pinvoke cdecl convention with char**

In summary:
I'm trying to use a C++ DLL that uses the cdecl calling convention. Everything ran fine until I got to this method signature:
int SaveToBuffer( char **buf, int *buf_size );
From what I have read, I should declare it like this:
[DllImport("entry.dll",
CallingConvention = CallingConvention.Cdecl,
EntryPoint = "SaveToBuffer")]
private static extern int SaveToBuffer( ref sbyte[] buf, ref int buf_size );
This does not work: when the function is called from C#, the program crashes.
I suppose this is related to the cdecl calling convention and that I should use Marshal.AllocHGlobal(value), but I can't work out how to do it correctly.
I also tried this:
[DllImport("entry.dll",
CallingConvention = CallingConvention.Cdecl,
EntryPoint = "SaveToBuffer")]
private static extern int SaveToBuffer( IntPtr buf, ref int buf_size );
And then allocated enough memory:
IntPtr data=Marshal.AllocHGlobal(128000);
int bufSize=128000;
var sCode=SaveToBuffer(data, ref bufSize); /* value of sCode indicates success */
Called this way, the return value from SaveToBuffer indicates that the function succeeded, but bufSize comes back as 0. And how should I read my data from the IntPtr?
I'm completely stuck on this.
This is not an issue with the calling convention. The problem is in the buffer handling.
There's really only one sensible way to interpret the C++ argument types and the apparent intent to return an array of bytes. That is that the buffer is allocated and populated by the callee, and its address returned in buf. The buffer length is returned in buf_size.
With these semantics the function arguments cannot be marshalled automatically and you'll have to do it manually:
[DllImport("entry.dll", CallingConvention = CallingConvention.Cdecl)]
private static extern int SaveToBuffer(out IntPtr buf, out int buf_size);
Call it like this:
IntPtr buf;
int buf_size;
int retval = SaveToBuffer(out buf, out buf_size);
// check retval
Then copy to byte array like this:
byte[] buffer = new byte[buf_size];
Marshal.Copy(buf, buffer, 0, buf_size);
The DLL will also need to export a function to deallocate the unmanaged buffer.
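Since the signature suggests that the callee allocates the buffer, the native side would pair SaveToBuffer with a matching release function. A minimal sketch in C, assuming malloc is the allocator; FreeBuffer, the payload and the return codes are hypothetical and not taken from entry.dll:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the DLL's SaveToBuffer: the callee allocates the
   buffer and reports its address and length through the out-parameters. */
int SaveToBuffer(char **buf, int *buf_size)
{
    static const char payload[] = "example payload"; /* placeholder data */
    assert(buf != NULL && buf_size != NULL);

    *buf = malloc(sizeof payload);
    if (*buf == NULL) {
        *buf_size = 0;
        return -1; /* assumed failure code */
    }
    memcpy(*buf, payload, sizeof payload);
    *buf_size = (int)sizeof payload;
    return 0; /* assumed success code */
}

/* Hypothetical companion export: releases the buffer with the same C runtime
   that allocated it, so the managed caller never has to guess the allocator. */
void FreeBuffer(char *buf)
{
    free(buf);
}
```

From C#, FreeBuffer would be declared with DllImport taking an IntPtr and called once Marshal.Copy has finished.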

How do I call this c function in c# (unmarshalling return struct)?

I want to use c# interop to call a function from a dll written in c. I have the header files.
Take a look at this:
enum CTMBeginTransactionError {
CTM_BEGIN_TRX_SUCCESS = 0,
CTM_BEGIN_TRX_ERROR_ALREADY_IN_PROGRESS,
CTM_BEGIN_TRX_ERROR_NOT_CONNECTED
};
#pragma pack(push)
#pragma pack(1)
struct CTMBeginTransactionResult {
char * szTransactionID;
enum CTMBeginTransactionError error;
};
struct CTMBeginTransactionResult ctm_begin_customer_transaction(const char * szTransactionID);
How do I call ctm_begin_customer_transaction from C#? The const char * maps well to string, but despite various attempts (looking at Stack Overflow and other sites), I fail to marshal the return structure. If I define the function to return IntPtr it works ok...
Edit
I changed the return type to IntPtr and use:
CTMBeginTransactionResult structure = (CTMBeginTransactionResult)Marshal.PtrToStructure(ptr, typeof(CTMBeginTransactionResult));
but it throws AccessViolationException
I also tried:
IntPtr ptr = Transactions.ctm_begin_customer_transaction("");
int size = 50;
byte[] byteArray = new byte[size];
Marshal.Copy(ptr, byteArray, 0, size);
string stringData = Encoding.ASCII.GetString(byteArray);
stringData == "70e3589b-2de0-4d1e-978d-55e22225be95\0\"\0\0\a\0\0\b\b?" at this point. The "70e3589b-2de0-4d1e-978d-55e22225be95" is the szTransactionID from the struct. Where is the Enum? Is it the next byte?
There's a memory management problem hidden in this struct. Who owns the C string pointer? The pinvoke marshaller will always assume that the caller owns it so it will try to release the string. And passes the pointer to CoTaskMemFree(), same function as the one called by Marshal.FreeCoTaskMem(). These functions use the COM memory allocator, the universal interop memory manager in Windows.
This rarely comes to a good end, C code does not typically use that allocator unless the programmer designed his code with interop in mind. In which case he'd never have used a struct as a return value, interop always works much less trouble-free when the caller supplies buffers.
So you cannot afford to let the marshaller do its normal duty. You must declare the return value type as IntPtr so it doesn't try to release the string. And you must marshal it yourself with Marshal.PtrToStructure().
That however still leaves the question unanswered: who owns the string? There is nothing you can do to release the string buffer, because you don't have access to the allocator used in the C code. The only hope you have is that the string wasn't actually allocated on the heap. That's possible, the C program might be using string literals. You need to verify that guess. Call the function a billion times in a test program. If that doesn't explode the program then you're good. If it does, then only C++/CLI can solve your problem. Given the nature of the string, a "transaction ID" ought to change a lot, I'd say you do have a problem.
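To illustrate the remark about caller-supplied buffers: if the C side can be changed or wrapped, the ownership question disappears when the caller passes in the storage for the ID. This is a hypothetical sketch in C, not part of libctmclient; begin_transaction_into and its parameters are invented for the example, and the body only echoes the input as a placeholder for the real work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum CTMBeginTransactionError {
    CTM_BEGIN_TRX_SUCCESS = 0,
    CTM_BEGIN_TRX_ERROR_ALREADY_IN_PROGRESS,
    CTM_BEGIN_TRX_ERROR_NOT_CONNECTED
};

/* Hypothetical wrapper: the caller owns idBuffer, so no memory allocated by
   the C side ever crosses the interop boundary. Returns the error code; the
   ID is truncated if the buffer is too small. */
enum CTMBeginTransactionError begin_transaction_into(const char *szTransactionID,
                                                     char *idBuffer,
                                                     size_t idBufferSize)
{
    size_t n;
    assert(szTransactionID != NULL && idBuffer != NULL && idBufferSize > 0);

    /* Placeholder for the real work: echo the requested ID back. */
    n = strlen(szTransactionID);
    if (n > idBufferSize - 1)
        n = idBufferSize - 1;      /* leave room for the terminator */
    memcpy(idBuffer, szTransactionID, n);
    idBuffer[n] = '\0';
    return CTM_BEGIN_TRX_SUCCESS;
}
```

On the C# side such a function could take a byte[] or StringBuilder for idBuffer, which the marshaller pins or copies without ever trying to free anything.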
I hate to answer my own question, but I found the solution to marshal the resulting struct. The struct is 8 bytes long (4 bytes for the char * and 4 bytes for enum). Marshalling the string does not work automatically, but the following works:
// Native (unmanaged)
public enum CTMBeginTransactionError
{
CTM_BEGIN_TRX_SUCCESS = 0,
CTM_BEGIN_TRX_ERROR_ALREADY_IN_PROGRESS,
CTM_BEGIN_TRX_ERROR_NOT_CONNECTED
};
// Native (unmanaged)
[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Ansi)]
internal struct CTMBeginTransactionResult
{
public IntPtr szTransactionID;
public CTMBeginTransactionError error;
};
// Managed wrapper around native struct
public class BeginTransactionResult
{
public string TransactionID;
public CTMBeginTransactionError Error;
internal BeginTransactionResult(CTMBeginTransactionResult nativeStruct)
{
// Manually marshal the string
if (nativeStruct.szTransactionID == IntPtr.Zero) this.TransactionID = "";
else this.TransactionID = Marshal.PtrToStringAnsi(nativeStruct.szTransactionID);
this.Error = nativeStruct.error;
}
}
[DllImport("libctmclient-0.dll")]
internal static extern CTMBeginTransactionResult ctm_begin_customer_transaction(string ptr);
public static BeginTransactionResult BeginCustomerTransaction(string transactionId)
{
CTMBeginTransactionResult nativeResult = Transactions.ctm_begin_customer_transaction(transactionId);
return new BeginTransactionResult(nativeResult);
}
The code works, but I still need to investigate, if calling the unmanaged code results in memory leaks.

Correct way to marshall uchar[] from native dll to byte[] in c#

I'm trying to marshal some data that my native dll allocated via CoTaskMemAlloc into my C# application, and I'm wondering if the way I'm doing it is just plain wrong or if I'm missing some subtle decoration of the method on the C# side.
Currently I have c++ side.
extern "C" __declspec(dllexport) bool __stdcall CompressData( unsigned char* pInputData, unsigned int inSize, unsigned char*& pOutputBuffer, unsigned int& uOutputSize)
{ ...
pOutputBuffer = static_cast<unsigned char*>(CoTaskMemAlloc(60000));
uOutputSize = 60000;
And on the C# side.
private const string dllName = "TestDll.dll";
[System.Security.SuppressUnmanagedCodeSecurity]
[DllImport(dllName)]
public static extern bool CompressData(byte[] inputData, uint inputSize, out byte[] outputData, out uint outputSize );
...
byte[] outputData;
uint outputSize;
bool ret = CompressData(packEntry.uncompressedData, (uint)packEntry.uncompressedData.Length, out outputData, out outputSize);
Here outputSize is 60000 as expected, but outputData has a size of 1, and when I memset the buffer on the C++ side, only 1 byte seems to be copied across. So is this just wrong and I need to marshal the data outside the call using an IntPtr + outputSize, or is there something subtle I'm missing to get what I already have working?
Thanks.
There are two things.
First, the P/Invoke layer does not handle reference parameters in C++, it can only work with pointers. The last two parameters (pOutputBuffer and uOutputSize) in particular are not guaranteed to marshal correctly.
I suggest you change your C++ method declaration to (or create a wrapper of the form):
extern "C" __declspec(dllexport) bool __stdcall CompressData(
unsigned char* pInputData, unsigned int inSize,
unsigned char** pOutputBuffer, unsigned int* uOutputSize)
That said, the second problem comes from the fact that the P/Invoke layer also doesn't know how to marshal back "raw" arrays (as opposed to say, a SAFEARRAY in COM that knows about its size) that are allocated in unmanaged code.
This means that on the .NET side, you have to marshal the pointer that is created back, and then marshal the elements in the array manually (as well as dispose of it, if that's your responsibility, which it looks like it is).
Your .NET declaration would look like this:
[System.Security.SuppressUnmanagedCodeSecurity]
[DllImport(dllName)]
public static extern bool CompressData(byte[] inputData, uint inputSize,
ref IntPtr outputData, ref uint outputSize);
Once you have the outputData as an IntPtr (this will point to the unmanaged memory), you can convert into a byte array by calling the Copy method on the Marshal class like so:
var bytes = new byte[(int) outputSize];
// Copy.
Marshal.Copy(outputData, bytes, 0, (int) outputSize);
Note that if the responsibility is yours to free the memory, you can call the FreeCoTaskMem method, like so:
Marshal.FreeCoTaskMem(outputData);
Of course, you can wrap this up into something nicer, like so:
static byte[] CompressData(byte[] input, int size)
{
// The output buffer.
IntPtr output = IntPtr.Zero;
// Wrap in a try/finally, to make sure unmanaged array
// is cleaned up.
try
{
// Length.
uint length = 0;
// Make the call.
CompressData(input, (uint) size, ref output, ref length);
// Allocate the bytes.
var bytes = new byte[(int) length];
// Copy.
Marshal.Copy(output, bytes, 0, bytes.Length);
// Return the byte array.
return bytes;
}
finally
{
// If the pointer is not zero, free.
if (output != IntPtr.Zero) Marshal.FreeCoTaskMem(output);
}
}
The pinvoke marshaller cannot guess how large the returned byte[] might be. Raw pointers to memory in C++ do not have a discoverable size of the pointed-to memory block. Which is why you added the uOutputSize argument. Good for the client program but not quite good enough for the pinvoke marshaller. You have to help and apply the [MarshalAs] attribute to pOutputBuffer, specifying the SizeParamIndex property.
Do note that the array is getting copied by the marshaller. That's not so desirable, you can avoid it by allowing the client code to pass an array. The marshaller will pin it and pass the pointer to the managed array. The only trouble is that the client code will have no decent way to guess how large to make the array. The typical solution is to allow the client to call it twice, first with uOutputSize = 0, the function returns the required array size. Which would make the C++ function look like this:
extern "C" __declspec(dllexport)
int __stdcall CompressData(
const unsigned char* pInputData, unsigned int inSize,
/* out */ unsigned char* pOutputBuffer, unsigned int uOutputSize)
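The two-call pattern described above could look roughly like this on the native side. This is a sketch, not the original DLL: the run-length encoding is a placeholder for the real compressor, the return type is changed to unsigned int, and __stdcall and __declspec(dllexport) are left out so that it builds with any C compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-call variant: always returns the number of bytes the
   complete output needs. Pairs are written only while they fit, so a return
   value larger than uOutputSize means the output is incomplete. */
unsigned int CompressData(const unsigned char *pInputData, unsigned int inSize,
                          unsigned char *pOutputBuffer, unsigned int uOutputSize)
{
    /* Placeholder "compression": run-length encoding as (count, value) pairs. */
    unsigned int needed = 0;
    unsigned int i = 0;
    assert(pInputData != NULL || inSize == 0);

    while (i < inSize) {
        unsigned char value = pInputData[i];
        unsigned int run = 1;
        while (i + run < inSize && pInputData[i + run] == value && run < 255)
            run++;
        if (pOutputBuffer != NULL && needed + 2 <= uOutputSize) {
            pOutputBuffer[needed] = (unsigned char)run;
            pOutputBuffer[needed + 1] = value;
        }
        needed += 2;
        i += run;
    }
    return needed;
}
```

A caller would first pass NULL and 0 to learn the required size, allocate that many bytes, then call again with the real buffer.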

pinvoke: How to free a malloc'd string?

In a C dll, I have a function like this:
char* GetSomeText(char* szInputText)
{
char* ptrReturnValue = (char*) malloc(strlen(szInputText) * 1000); // Actually done after parsemarkup with the proper length
init_parser(); // Allocates an internal processing buffer for ParseMarkup result, which I need to copy
sprintf(ptrReturnValue, "%s", ParseMarkup(szInputText) );
terminate_parser(); // Frees the internal processing buffer
return ptrReturnValue;
}
I would like to call it from C# using P/invoke.
[DllImport("MyDll.dll")]
private static extern string GetSomeText(string strInput);
How do I properly release the allocated memory?
I am writing cross-platform code targeting both Windows and Linux.
Edit:
Like this
[DllImport("MyDll.dll")]
private static extern System.IntPtr GetSomeText(string strInput);
[DllImport("MyDll.dll")]
private static extern void FreePointer(System.IntPtr ptrInput);
IntPtr ptr = GetSomeText("SomeText");
string result = Marshal.PtrToStringAnsi(ptr);
FreePointer(ptr);
You should marshal returned strings as IntPtr otherwise the CLR may free the memory using the wrong allocator, potentially causing heap corruption and all sorts of problems.
See this almost (but not quite) duplicate question PInvoke for C function that returns char *.
Ideally your C dll should also expose a FreeText function for you to use when you wish to free the string. This ensures that the string is deallocated in the correct way (even if the C dll changes).
Add another function ReturnSomeText that calls free or whatever is needed to release the memory again.
If you return to .net memory allocated with your native malloc, then you also have to export the deallocator. I don't regard that to be a desirable action and instead prefer to export the text as a BSTR. This can be freed by the C# runtime because it knows that the BSTR was allocated by the COM allocator. The C# coding becomes a lot simpler.
The only wrinkle is that a BSTR uses Unicode characters and your C++ code uses ANSI. I would work around that like so:
C++
#include <comutil.h>
BSTR ANSItoBSTR(const char* input)
{
BSTR result = NULL;
int lenA = lstrlenA(input);
int lenW = ::MultiByteToWideChar(CP_ACP, 0, input, lenA, NULL, 0);
if (lenW > 0)
{
result = ::SysAllocStringLen(0, lenW);
::MultiByteToWideChar(CP_ACP, 0, input, lenA, result, lenW);
}
return result;
}
BSTR GetSomeText(char* szInputText)
{
return ANSItoBSTR(szInputText);
}
C#
[DllImport("MyDll.dll", CallingConvention=CallingConvention.Cdecl)]
[return: MarshalAs(UnmanagedType.BStr)]
private static extern string GetSomeText(string strInput);
