I am currently writing a thin C# binding for OpenGL. I've just recently implemented the OpenGL GenVertexArrays function, which has the following signature:
OpenGL Documentation on glGenVertexArrays.
Essentially, you pass it an array in which to store generated object values for the vertex arrays created by OpenGL.
In order to create the binding, I use delegates as glGenVertexArrays is an OpenGL extension function, so I have to load it dynamically using wglGetProcAddress. The delegate signature I have defined in C# looks like this:
[SuppressUnmanagedCodeSecurity]
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
private delegate void glGenVertexArrays(uint amount, uint[] array);
The function pointer is retrieved and converted to this delegate using Marshal.GetDelegateForFunctionPointer, like this:
IntPtr proc = wglGetProcAddress(name);
del = Marshal.GetDelegateForFunctionPointer(proc, delegateType);
Anyways, here's what bothers me:
In any official documentation I can find on default marshalling behaviour for reference types (which includes arrays), is this:
By default, reference types (classes, arrays, strings, and interfaces)
passed by value are marshaled as In parameters for performance
reasons. You do not see changes to these types unless you apply
InAttribute and OutAttribute (or just OutAttribute) to the method
parameter.
This is taken from this MSDN page: MSDN page on directional attributes
However, as can be seen from my delegate signatures, the [In] and [Out] directional attributes have not been used on the array of unsigned integers, meaning when I call this function I should actually not be able to see the generated object values which OpenGL should have stored in them. Except, I am. Using this signature, I can the following result when running the debugger:
As can be seen, the call absolutely did affect the array, even though I did not explicitly use the [Out] attribute. This is not, from what I understand, a result I should expect.
Does anyone know the reason behind this? I know it might seem as a minor deal, but I am very curious to know why this seems to break the default marshalling behaviour described by Microsoft. Is there some behind-the-scenes stuff going on when invoking delegates compared to pure platform invoke prototypes? Or am I misinterpreting the documentation?
[EDIT]
For anyone curious, the public method that invokes the delegate is defined on a static "GL" class, and is as followed:
public static void GenVertexArrays(uint amount, uint[] array)
{
InvokeExtensionFunction<glGenVertexArrays>()(amount, array);
}
It is not mentioned on the documentation page you linked, but there is another topic dedicated to the marshaling of arrays, where it says:
With pinning optimization, a blittable array can appear to operate as an In/Out parameter when interacting with objects in the same apartment.
Both conditions are met in your case: array of uint is blittable, and there is no machine-to-machine marshaling. It is still a good idea to declare it [Out], so your intention is documented within the code.
The documentation is correct in the general case. But uint is a bit special, it is a blittable type. An expensive word that means that the pinvoke marshaller does not have to do anything special to convert the array element values. An uint in C# is exactly the same type as an unsigned int in C. Not a coincidence at all, it is the kind of type that a processor can handle natively.
So the marshaller can simply pin the array and pass a pointer to the first array element as the second argument. Very fast, always what you want. And the function scribbles directly into the managed array, so copying the values back is not necessary. A bit dangerous too, you never ever want to lie about the amount argument, GC heap corruption is an excessively ugly bug to diagnose.
Most simple value types and structs of simple values types are blittable. bool is a notable exception. You'll otherwise never have to be sorry for using [Out] even if it is not necessary. The marshaller simply ignores it here.
Related
Is there a difference between using [In, Out] and just using ref when passing parameters from C# to C++?
I've found a couple different SO posts, and some stuff from MSDN as well that comes close to my question but doesn't quite answer it. My guess is that I can safely use ref just like I would use [In, Out], and that the marshaller won't act any differently. My concern is that it will act differently, and that C++ won't be happy with my C# struct being passed. I've seen both things done in the code base I'm working in...
Here are the posts I've found and have been reading through:
Are P/Invoke [In, Out] attributes optional for marshaling arrays?
Makes me think I should use [In, Out].
MSDN: InAttribute
MSDN: OutAttribute
MSDN: Directional Attributes
These three posts make me think that I should use [In, Out], but that I can use ref instead and it will have the same machine code. That makes me think I'm wrong -- hence asking here.
The usage of ref or out is not arbitrary. If the native code requires pass-by-reference (a pointer) then you must use those keywords if the parameter type is a value type. So that the jitter knows to generate a pointer to the value. And you must omit them if the parameter type is a reference type (class), objects are already pointers under the hood.
The [In] and [Out] attributes are then necessary to resolve the ambiguity about pointers, they don't specify the data flow. [In] is always implied by the pinvoke marshaller so doesn't have to be stated explicitly. But you must use [Out] if you expect to see any changes made by the native code to a struct or class member back in your code. The pinvoke marshaller avoids copying back automatically to avoid the expense.
A further quirk then is that [Out] is not often necessary. Happens when the value is blittable, an expensive word that means that the managed value or object layout is identical to the native layout. The pinvoke marshaller can then take a shortcut, pinning the object and passing a pointer to managed object storage. You'll inevitably see changes then since the native code is directly modifying the managed object.
Something you in general strongly want to pursue, it is very efficient. You help by giving the type the [StructLayout(LayoutKind.Sequential)] attribute, it suppresses an optimization that the CLR uses to rearrange the fields to get the smallest object. And by using only fields of simple value types or fixed size buffers, albeit that you don't often have that choice. Never use a bool, use byte instead. There is no easy way to find out if a type is blittable, other than it not working correctly or by using the debugger and compare pointer values.
Just be explicit and always use [Out] when you need it. It doesn't cost anything if it turned out not to be necessary. And it is self-documenting. And you can feel good that it still will work if the architecture of the native code changes.
I've been going round and round in circles on Google on this, and I can find all kinds of discussion, lots of suggestions, but nothing seems to work. I have an ActiveX component which takes an image as a byte array. When I do a TLB import, it comes in with this signature:
int HandleImage([MarshalAs(UnmanagedType.Struct)] ref object Bitmap);
How do I pass a byte[] to that?
There's another function which can return the data with a similar signature, and it works because I can pass "null" in. The type that comes back is a byte[1..size] (non-zero bounded byte[]). But even if I try to pass in what came back, it still gets a type mismatch exception.
More details:
I've been editing the method in the IDispatch interface signature (using ILSpy to extract the interface from the auto-generated interop assembly). I've tried just about every combination of the following, it always gets Type mismatch exception:
Adding and removing the "ref"
Changing the parameter datatype to "byte[]" or "Array"
Marshalling as [MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_UI1)]. After playing around with MarshalAs quite a bit, I'm becoming convinced that IDispatch does not use those attributes.
Also tried using the "ref object" interface as is, and passing it different types: byte[], Array.CreateInstance(typeof(byte) (which I think are both identical, but I found someone suggesting it, so it couldn't hurt to try).
Here's an example of Delphi code that creates a proper array to pass in:
var
image: OLEVariant;
buf: Pointer;
image := VarArrayCreate([0, Stream.Size], VarByte);
Buf := VarArrayLock(image);
Stream.ReadBuffer(Buf^, Stream.Size);
VarArrayUnlock(image);
Here's the C code to do the same thing. I guess if I can't get it to work from C#, I can invoke it through managed C++, although I'd rather have everything in one project:
long HandleImage(unsigned char* Bitmap, int Length)
{
VARIANT vBitmap;
VariantInit (&vBitmap);
VariantClear(&vBitmap);
SAFEARRAYBOUND bounds[1];
bounds[0].cElements = Length;
bounds[0].lLbound = 1;
SAFEARRAY* arr = SafeArrayCreate(VT_UI1, 1, bounds);
SafeArrayLock(arr);
memcpy(arr->pvData, Bitmap, Length);
SafeArrayUnlock(arr);
vBitmap.parray = arr;
vBitmap.vt = VT_ARRAY | VT_UI1;
long result;
static BYTE parms[] = VTS_PVARIANT;
InvokeHelper(0x5e, DISPATCH_METHOD, VT_I4, (void*)&result, parms,
&vBitmap);
SafeArrayDestroy(arr);
VariantClear(&vBitmap);
return result;
}
I finally figured out how to do it in 100% C# code. Apparently Microsoft never considered the idea that someone might use a method with this signature to pass data in, since it marshals correctly going the other direction (it properly comes back as a byte[]).
Also, ICustomMarshaler doesn't get called on IDispatch calls, it never hit the breakpoints in the custom marshaler (except the static method to get an instance of it).
The answer by Hans Passant in this question got me on the right track: Calling a member of IDispatch COM interface from C#
The copy of IDispatch there doesn't contain the "Invoke" method on IUnknown, but it can be added to the interface, using types in System.Runtime.InteropServices.ComTypes as appropriate: http://msdn.microsoft.com/en-us/library/windows/desktop/ms221479%28v=vs.85%29.aspx
That means you get 100% control over marshaling arguments. Microsoft doesn't expose an implementation of the VARIANT structure, so you have to define your own: http://limbioliong.wordpress.com/2011/09/19/defining-a-variant-structure-in-managed-code-part-2/
The input parameters of Invoke are a variant array, so you have to marshal those to an unmanaged array, and there's a variant output parameter.
So now that we have a variant, what should it contain? This is where automatic marshaling falls down. Instead of directly embedding a pointer to the SAFEARRAY, it needs a pointer to another variant, and that variant should point to the SAFEARRAY.
You can build SAFEARRAYs via P/Invoking these methods: http://msdn.microsoft.com/en-us/library/windows/desktop/ms221145%28v=vs.85%29.aspx
So main variant should be VT_VARIANT | VT_BYREF, it should point to another variant VT_UI8 | VT_ARRAY, and that should point to a SAFEARRAY generated via SafeArrayCreate(). That outermost variant should then be copied into a block of memory, and its IntPtr sent to the Invoke method.
I use a dll compiled from C++ code (LPSolve, see http://lpsolve.sourceforge.net/5.5/), from my C# code. I use the API to build a linear programming model, and subsequently solve it. I call functions such as:
[DllImport("lpsolve55.dll", SetLastError = true)]
public static extern bool add_columnex
(int lp, int count, double[] column, int[] rowno);
I am wondering what happens, memorywise, when I call such a function and the ints and arrays that I created in managed code leave scope (in the c# code). Will they be eligible for garbage collection? What does this mean for the c++ code? Or are the ineligible, and in that case: why?
Because the function prototype is using plain old datatypes and arrays, the memory for these values is pinned in place and then the native code acts directly on the data. Then when the function returns, the memory is unpinned and can be garbage collected.
In other words, they never leave the scope.
In terms of the C++ code, if it needs to store any of the data then it will need to take a copy of the data passed into it and then manage that memory itself.
I think Nick has covered the basic part, this is just to add more information. Array of int/double are considered as blittable types (types that have same layout in managed/unmanaged worlds) - these are typically get pinned when Marshalled. So you don't have to about GC. Also, what you have done indicates passing array by value - in such case, marshaller treats this as In parameter - in case, your unmanaged dll is going to update values in array then I would suggest you to mark it as In/Out parameter (e.g. [In, Out]double[] column). For more info:
Blittable and Non-Blittable Types:
http://msdn.microsoft.com/en-us/library/75dwhxf7.aspx
Copying and Pinning: http://msdn.microsoft.com/en-us/library/23acw07k.aspx
Marshalling Arrays: http://learning.infocollections.com/ebook%202/Computer/Programming/General/Programming.With.Microsoft.Dot.NET/LiB0948.htm
If your app doesn't crash with an AccessViolationException after using it for a while (past a garbage collection) then it is pretty safe to assume that the unmanaged code made a copy of the array elements you passed it. This is the normal thing to do, the library would otherwise be very hard to use from native code as well. There also ought to be an API function that lets you clear or re-initialize the model, that one should be releasing the memory.
The C or C++ function has a prototype like this:
bool add_columnex(int lp, int count, double[] column, int[] rowno);
The parameters lp and count are passed by value. The parameters column and rowno are also passed by value but the actual data is passed by reference, the function add_columnex will have to dereference column and rowno. This dereferentiation is only allowed during the function call. When the function returns, these parameters are out of scope. The kind of derenferentation must be in the contract of the interface.
When the function returns all arguments get out of scope and there is no mean that the function could do anything after call. If the function stores copies of the arguments, i.e. the address of the double or rowno array, this must be allowed by the contract. In that case you would get in trouble. A better contract would be that the function must copy the dereferenced data.
In another question I asked, a comment arose indicating that the .NET framework's Array.Copy method uses unmanaged code. I went digging with Reflector and found the signature one of the Array.Copy method overloads is defined as so:
[MethodImpl(MethodImplOptions.InternalCall), ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
internal static extern void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length, bool reliable);
After looking at this, I'm slightly confused. The source of my confusion is the extern modifier which means (MSDN link):
The extern modifier is used to declare
a method that is implemented
externally.
However, the method declaration is also decorated with a MethodImplOptions.InternalCall attribute, which indicates (MSDN link):
Specifies an internal call. An
internal call is a call to a method
that is implemented within the common
language runtime itself.
Can anyone explain this seemingly apparent contradiction?
I would have just commented on leppie's post, but it was getting a bit long.
I'm currently working on an experimental CLI implementation. There are many cases where a publicly exposed method (or property) can't be implemented without knowledge of how the virtual machine is implemented internally. One example is OffsetToStringData, which requires knowledge of how the memory manager allocates strings.
For cases like this, where there is no C# code to express the method, you can treat each call to the method in a special way internal to the JIT process. As an example here, replacing the call byte code with a ldc.i4 (load constant integer) before passing it to the native code generator. The InternalCall flag means "The body of this method is treated in a special way by the runtime itself." There may or may not be an actual implementation - in several cases in my code the call is treated as an intrinsic by the JIT.
There are other cases where the JIT may have special information available that allows heavy optimization of a method. One example is the Math methods, where even though these can be implemented in C#, specifying InternalCall to make them effectively intrinsics has significant performance benefits.
In C#, a method has to have a body unless it is abstract or extern. The extern means a general "You can call this method from C# code, but the body of it is actually defined elsewhere.". When the JIT reaches a call to an extern method, it looks up where to find the body and behaves in different ways per the result.
The DllImport attribute instructs the JIT to make a P/Invoke stub to call a native code implementation.
The InternalCall flag instructs the JIT to treat the call in a self-defined way.
(There are some others, but I don't have examples off the top of my head for their use.)
InternalCall means provided by the framework.
extern says you are not providing code.
extern can be used in 2 general situations, like above, or with p/invoke.
With p/invoke, you simply tell the method where to get the implementation.
In my C# application, I have a large struct (176 bytes) that is passed potentially a hundred thousand times per second to a function. This function then simply takes a pointer to the struct and passes the pointer to unmanaged code. Neither the function nor the unmanaged code will make any modifications to the struct.
My question is, should I pass the struct to the function by value or by reference? In this particular case, my guess is that passing by reference would be much faster than pushing 176 bytes onto the call stack, unless the JIT happens to recognize that the struct is never modified (my guess is it can't recognize this since the struct's address is passed to unmanaged code) and optimizes the code.
Since we're at it, let's also answer the more general case where the function does not pass the struct's pointer to unmanaged code, but instead performs some read-only operation on the contents of the struct. Would it be faster to pass the struct by reference? Would in this case the JIT recognize that the struct is never modified and thus optimize? Presumably it is not more efficient to pass a 1-byte struct by reference, but at what struct size does it become better to pass a struct by reference, if ever?
Thanks.
EDIT:
As pointed out below, it's also possible to create an "equivalent" class for regular use, and then use a struct when passing to unmanaged code. I see two options here:
1) Create a "wrapper" class that simply contains the struct, and then pin and pass a pointer to the struct to the unmanaged code when necessary. A potential issue I see is that pinning has its own performance consequences.
2) Create an equivalent class whose fields are copied to the struct when the struct is needed. But copying would take a lot of time and seems to me to defeat the point of passing by reference in the first place.
EDIT:
As mentioned a couple times below, I could certainly just measure the performance of each of these methods. I will do this and post the results. However, I am still interested in seeing people's answers and reasonings from an intellectual perspective.
I did some very informal profiling, and the results indicate that, for my particular application, there is a modest performance gain for passing by reference. For by-value I got about 10,050,000 calls per second, whereas for by-reference I got about 11,200,000 calls per second.
Your mileage may vary.
Before you ask whether or not you should pass the struct by reference, you should ask yourself why you've got such an enormous struct in the first place. Does it really need to be a struct? If you need to use a struct at some point for P/Invoke, would it be worth having a struct just for that, and then the equivalent class elsewhere?
A struct that big is very, very unusual...
See the Design Guidelines for Developing Class Libraries section on Choosing Between Classes and Structures for more guidance on this.
The only way you can get an answer to this question is to code up both and measure the performance.
You mention unmanaged/managed interop. My experience is that it takes a surprisingly long time to to the interop. You could try changing your code from:
void ManagedMethod(MyStruct[] items) {
foreach (var item in items) {
unmanagedHandle.ProcessOne(item);
}
}
To:
void ManagedMethod(MyStruct[] items) {
unmanagedHandle.ProcessMany(items, items.Count);
}
This technique helped me in a similar case, but only measurements will tell you if it works for your case.
Why not just use a class instead, and pass your class to your P/Invoke function?
Using class will pass nicely around in your managed code, and will work the same as passing a struct by reference to a P/Invoke function.
e.g.
// What you have
public struct X
{
public int data;
}
[DllImport("mylib.dll")]
static extern void Foo( ref X arg);
// What you could do
[StructLayout(LayoutKind.Sequential)]
public class Y
{
public int data;
}
[DllImport("mylib.dll")]
static extern void Bar( Y arg );