In C, there is a function that accepts a pointer to a function to perform comparison:
[DllImport("mylibrary.dll", CallingConvention = CallingConvention.Cdecl)]
private static extern int set_compare(IntPtr id, [MarshalAs(UnmanagedType.FunctionPtr)] CompareFunction cmp);
In C#, a delegate is passed to the C function:
[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
delegate int CompareFunction(ref IntPtr left, ref IntPtr right);
Currently, I accept a Func<T,T,int> comparer in the constructor of a generic class and convert it to the delegate. "mylibrary.dll" owns the data; the managed C# library knows how to convert the pointers to T and then compare the Ts.
// in the constructor
CompareFunction cmpFunc = (ref IntPtr left, ref IntPtr right) =>
{
    var l = GenericFromPointer<T>(left);
    var r = GenericFromPointer<T>(right);
    return comparer(l, r);
};
I also have the option to write a CompareFunction in C for the most important data types, which cover 90%+ of cases, but I hope to avoid modifying the native library.
The question is: when the compare function is set with P/Invoke, does every subsequent call to that function from C code incur marshaling overhead, or is the delegate called from C as if it had originally been written in C?
I imagine that, when compiled, the delegate is a sequence of machine instructions in memory, but do not understand if/why C code would need to ask .NET to make the actual comparison, instead of just executing these instructions in place?
I am mostly interested in better understanding how interop works. However, this delegate is used for binary search on big data sets, and if every subsequent call carries overhead comparable to a single P/Invoke, rewriting the comparers in native C could be a good option.
I imagine that, when compiled, the delegate is a sequence of machine instructions in memory, but do not understand if/why C code would need to ask .NET to make the actual comparison, instead of just executing these instructions in place?
I guess you're a bit confused about how .NET works. C doesn't ask .NET to execute code.
First, your lambda is turned into a method on a compiler-generated class instance (because you're closing over the comparer variable), and a delegate to a method of this class is what gets used. It's an instance method, since your lambda is a closure.
A delegate is similar to a function pointer. So, like you say, it points to executable code. Whether this code is generated from a C source or a .NET source is irrelevant at this point.
It's in the interop case that this starts to matter. P/Invoke won't pass your delegate as-is as a function pointer to C code. It will pass a function pointer to a thunk which calls the delegate. Visual Studio will display this as a [Native to Managed Transition] stack frame. This is needed for various reasons, such as marshaling arguments or passing additional parameters (like the instance of the compiler-generated class backing your lambda).
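You can see this explicitly by asking the marshaller for the native-callable pointer yourself - a minimal sketch; Marshal.GetFunctionPointerForDelegate is what the P/Invoke layer uses implicitly, and the Compare body here is just a stand-in:
using System;
using System.Runtime.InteropServices;

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
delegate int CompareFunction(ref IntPtr left, ref IntPtr right);

class ThunkDemo
{
    static int Compare(ref IntPtr left, ref IntPtr right) => 0; // stand-in body

    static void Main()
    {
        CompareFunction cmp = Compare;

        // The pointer handed to native code is not the JIT-compiled body of
        // Compare; it is a CLR-generated thunk that transitions into it.
        IntPtr thunk = Marshal.GetFunctionPointerForDelegate(cmp);
        Console.WriteLine($"native-callable thunk: 0x{thunk.ToInt64():x}");

        GC.KeepAlive(cmp); // the thunk dies with the delegate
    }
}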
As to the performance considerations of this, here's what MSDN says, quite obviously:
Thunking. Regardless of the interoperability technique used, special transition sequences, which are known as thunks, are required each time a managed function calls a native function, and vice versa. Because thunking contributes to the overall time that it takes to interoperate between managed code and native code, the accumulation of these transitions can negatively affect performance.
So, if your code requires a lot of transitions between managed and native code, you should get better performance by doing your comparisons on the C side if possible, so that you avoid the transitions.
Related
To my surprise I have discovered a powerful feature today. Since it looks too good to be true I want to make sure that it is not just working due to some weird coincidence.
I have always thought that when my P/Invoke (to a C/C++ library) call expects a (callback) function pointer, I would have to pass a delegate to a static C# function. For example, for the following delegate type I would always reference a static function of that signature.
[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
public delegate int KINSysFn(IntPtr uu, IntPtr fval, IntPtr user_data );
and call my P/Invoke with this delegate argument:
[DllImport("some.dll", EntryPoint = "KINInit", ExactSpelling = true, CallingConvention = CallingConvention.Cdecl)]
public static extern int KINInit(IntPtr kinmem, KINSysFn func, IntPtr tmpl);
But now I just tried passing a delegate to an instance method, and it worked as well! For example:
public class MySystemFunctor
{
    double c = 3.0;
    public int SystemFunction(IntPtr u, IntPtr v, IntPtr userData) { /* uses c */ return 0; }
}
// ...
var myFunctor = new MySystemFunctor();
KINInit(kinmem, myFunctor.SystemFunction, IntPtr.Zero);
Of course, I understand that inside managed code there is no technical problem at all with packaging a "this" object together with an instance method to form the respective delegate.
But what surprises me about it is the fact that the "this" object of MySystemFunctor.SystemFunction also finds its way to the native DLL, which only accepts a static function and does not incorporate any facility for a "this" object or for packaging it together with the function.
Does this mean that any such delegate is translated (marshalled?) individually into a static function where the reference to the respective "this" object is somehow hard-coded inside the function definition? How else could the different delegate instances be distinguished, for example if I have
var myFunctor01 = new MySystemFunctor();
// ...
var myFunctor99 = new MySystemFunctor();
KINInit(kinmem, myFunctor01.SystemFunction, IntPtr.Zero);
// ...
KINInit(kinmem, myFunctor99.SystemFunction, IntPtr.Zero);
These can't all point to the same function. And what if I create an indefinite number of MySystemFunctor objects dynamically? Is every such delegate "unrolled"/compiled to its own static function definition at runtime?
Does this mean that any such delegate is translated (marshalled?) individually to a static function...
Yes, you guessed at this correctly. Not exactly a "static function", there is a mountain of code inside the CLR that performs this magic. It auto-generates machine code for a thunk that adapts the call from native code to managed code. The native code gets a function pointer to that thunk. The argument values may have to be converted, a standard pinvoke marshaller duty. And always shuffled around to match the call to the managed method. Digging up the stored delegate's Target property to provide this is part of that. And it jiggers the stack frame, tying a link to the previous managed frame, so the GC can see that it again needs to look for object roots.
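You can see the captured object that the thunk has to dig up yourself - a tiny sketch, reusing MySystemFunctor and KINSysFn from the question:
var myFunctor = new MySystemFunctor();
KINSysFn d = myFunctor.SystemFunction;

// A delegate stores the method AND the instance it is bound to; the thunk
// retrieves d.Target to turn the flat native call into an instance call.
Console.WriteLine(ReferenceEquals(d.Target, myFunctor)); // True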
There is however one nasty little detail that gets just about everybody in trouble. These thunks are automatically cleaned up again when the callback is not necessary anymore. The CLR gets no help from the native code to determine this; it happens when the delegate object gets garbage-collected. Maybe you smell the rat: what determines in your program when that happens?
var myFunctor = new MySystemFunctor();
That is a local variable of a method. It is not going to survive for very long; the next collection will destroy it. Bad news if the native code keeps making callbacks through the thunk: it won't be around anymore, and that's a hard crash. Not so easy to see when you are experimenting with code, since it takes a while.
You have to ensure this can't happen. Storing the delegate objects in your class might work, but then you have to make sure your class object survives long enough. Whatever it takes, no guess from the snippet. It tends to solve itself when you also ensure that you unregister these callbacks again since that requires storing the object reference for use later. You can also store them in a static variable or use GCHandle.Alloc(), but that of course loses the benefit of having an instance callback in a hurry. Feel good about having this done correctly by testing it, call GC.Collect() in the caller.
Worth noting is that you did it right by new-ing the delegate explicitly. C# syntax sugar does not require that, which makes it harder to get this right. If the callbacks only occur while you make the pinvoke call into the native code, which is not uncommon (like EnumWindows), then you don't have to worry about it, since the pinvoke marshaller ensures the delegate object stays referenced for the duration of the call.
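For completeness, a minimal sketch of the GCHandle.Alloc option mentioned above, reusing the declarations from the question (kinmem as before; GCHandle lives in System.Runtime.InteropServices):
var myFunctor = new MySystemFunctor();
var ksf = new KINSysFn(myFunctor.SystemFunction);

// An ordinary (non-pinned) handle suffices: it only has to keep the delegate
// object reachable so the CLR does not clean up its thunk.
GCHandle handle = GCHandle.Alloc(ksf);
try
{
    KINInit(kinmem, ksf, IntPtr.Zero);
    // ... native code may call back through the thunk here, even across a GC
}
finally
{
    handle.Free(); // only once native code can no longer call back
}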
For the record: I have walked right into the trap Hans Passant mentioned. A forced garbage collection led to a NullReferenceException because the delegate was transient:
KINInit(kinmem, myFunctor.SystemFunction, IntPtr.Zero);
// BTW: same with:
// KINInit(kinmem, new KINSysFn(myFunctor.SystemFunction), IntPtr.Zero);
GC.Collect();
GC.WaitForPendingFinalizers();
KINSol(/*...*/); // BAAM! NullReferenceException
Luckily I had already wrapped the two critical P/Invokes, KINInit (which sets the callback delegate) and KINSol (which actually uses the callback), in a dedicated managed class. The solution was, as already discussed, to keep the delegate referenced by a class member:
// ksf is a class member of delegate type KINSysFn that keeps ref to delegate instance
ksf = new KINSysFn(myFunctor.SystemFunction);
KINInit(kinmem, ksf, IntPtr.Zero);
GC.Collect();
GC.WaitForPendingFinalizers();
KINSol(/*...*/);
Thanks again, Hans, I'd never have noticed this flaw because it works as long as no GC happens!
I am currently writing a thin C# binding for OpenGL. I've just recently implemented the OpenGL GenVertexArrays function (see the OpenGL documentation on glGenVertexArrays), which has the following signature:
void glGenVertexArrays(GLsizei n, GLuint *arrays);
Essentially, you pass it an array in which to store generated object values for the vertex arrays created by OpenGL.
In order to create the binding, I use delegates as glGenVertexArrays is an OpenGL extension function, so I have to load it dynamically using wglGetProcAddress. The delegate signature I have defined in C# looks like this:
[SuppressUnmanagedCodeSecurity]
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
private delegate void glGenVertexArrays(uint amount, uint[] array);
The function pointer is retrieved and converted to this delegate using Marshal.GetDelegateForFunctionPointer, like this:
IntPtr proc = wglGetProcAddress(name);
del = Marshal.GetDelegateForFunctionPointer(proc, delegateType);
Anyways, here's what bothers me:
All the official documentation I can find on the default marshalling behaviour for reference types (which include arrays) says this:
By default, reference types (classes, arrays, strings, and interfaces) passed by value are marshaled as In parameters for performance reasons. You do not see changes to these types unless you apply InAttribute and OutAttribute (or just OutAttribute) to the method parameter.
This is taken from this MSDN page: MSDN page on directional attributes
However, as can be seen from my delegate signature, the [In] and [Out] directional attributes have not been applied to the array of unsigned integers, meaning that when I call this function I should not be able to see the generated object values which OpenGL stored in it. Except, I am: running under the debugger shows the generated values sitting in the array after the call.
So the call absolutely did affect the array, even though I did not explicitly use the [Out] attribute. From what I understand, this is not a result I should expect.
Does anyone know the reason behind this? I know it might seem like a minor thing, but I am very curious to know why this seems to break the default marshalling behaviour described by Microsoft. Is there some behind-the-scenes stuff going on when invoking delegates compared to pure platform invoke prototypes? Or am I misinterpreting the documentation?
[EDIT]
For anyone curious, the public method that invokes the delegate is defined on a static "GL" class, and is as follows:
public static void GenVertexArrays(uint amount, uint[] array)
{
    InvokeExtensionFunction<glGenVertexArrays>()(amount, array);
}
It is not mentioned on the documentation page you linked, but there is another topic dedicated to the marshaling of arrays, where it says:
With pinning optimization, a blittable array can appear to operate as an In/Out parameter when interacting with objects in the same apartment.
Both conditions are met in your case: an array of uint is blittable, and the call stays within one apartment. It is still a good idea to declare it [Out], so your intention is documented within the code.
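For instance, the delegate from the question with the intent documented - no behavioral change for this blittable array; the attribute lives in System.Runtime.InteropServices:
[SuppressUnmanagedCodeSecurity]
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
private delegate void glGenVertexArrays(uint amount, [Out] uint[] array);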
The documentation is correct in the general case. But uint is a bit special: it is a blittable type. An expensive word that means the pinvoke marshaller does not have to do anything special to convert the array element values. A uint in C# is exactly the same type as an unsigned int in C. Not a coincidence at all; it is the kind of type that a processor can handle natively.
So the marshaller can simply pin the array and pass a pointer to the first array element as the second argument. Very fast, always what you want. And the function scribbles directly into the managed array, so copying the values back is not necessary. A bit dangerous too, you never ever want to lie about the amount argument, GC heap corruption is an excessively ugly bug to diagnose.
Most simple value types and structs of simple value types are blittable. bool is a notable exception. You'll otherwise never have to be sorry for using [Out] even if it is not necessary. The marshaller simply ignores it here.
I use a DLL compiled from C++ code (LPSolve, see http://lpsolve.sourceforge.net/5.5/) from my C# code. I use the API to build a linear programming model and subsequently solve it. I call functions such as:
[DllImport("lpsolve55.dll", SetLastError = true)]
public static extern bool add_columnex
(int lp, int count, double[] column, int[] rowno);
I am wondering what happens, memory-wise, when I call such a function and the ints and arrays that I created in managed code leave scope (in the C# code). Will they be eligible for garbage collection? What does this mean for the C++ code? Or are they ineligible, and in that case: why?
Because the function prototype is using plain old datatypes and arrays, the memory for these values is pinned in place and then the native code acts directly on the data. Then when the function returns, the memory is unpinned and can be garbage collected.
In other words, they never leave the scope.
In terms of the C++ code, if it needs to store any of the data then it will need to take a copy of the data passed into it and then manage that memory itself.
I think Nick has covered the basic part; this is just to add more information. Arrays of int/double are considered blittable types (types that have the same layout in the managed and unmanaged worlds) - these typically get pinned when marshaled, so you don't have to worry about the GC. Also, what you have done passes the array by value - in that case the marshaller treats it as an In parameter. If your unmanaged DLL is going to update the values in the array, I would suggest marking it as an In/Out parameter (e.g. [In, Out] double[] column); see the sketch after the links. For more info:
Blittable and Non-Blittable Types:
http://msdn.microsoft.com/en-us/library/75dwhxf7.aspx
Copying and Pinning: http://msdn.microsoft.com/en-us/library/23acw07k.aspx
Marshalling Arrays: http://learning.infocollections.com/ebook%202/Computer/Programming/General/Programming.With.Microsoft.Dot.NET/LiB0948.htm
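Here is the import from the question with the directional attributes made explicit - a sketch only; for these blittable arrays the pinning behavior is the same, the attributes just document the contract:
using System.Runtime.InteropServices;

[DllImport("lpsolve55.dll", SetLastError = true)]
public static extern bool add_columnex(
    int lp, int count, [In, Out] double[] column, [In, Out] int[] rowno);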
If your app doesn't crash with an AccessViolationException after using it for a while (past a garbage collection) then it is pretty safe to assume that the unmanaged code made a copy of the array elements you passed it. This is the normal thing to do, the library would otherwise be very hard to use from native code as well. There also ought to be an API function that lets you clear or re-initialize the model, that one should be releasing the memory.
The C or C++ function has a prototype like this:
bool add_columnex(int lp, int count, double *column, int *rowno);
The parameters lp and count are passed by value. The parameters column and rowno are also passed by value, but the actual data is passed by reference; the function add_columnex has to dereference column and rowno. This dereferencing is only allowed during the function call; when the function returns, these pointers are out of scope. The kind of dereferencing allowed must be part of the contract of the interface.
When the function returns, all arguments go out of scope and there is no way for the function to do anything with them after the call. If the function stores copies of the arguments, i.e. the addresses of the column or rowno arrays, this must be allowed by the contract; in that case you would get into trouble. A better contract is that the function must copy the dereferenced data.
I have been programming for a few years now and have used function pointers in certain cases. What I would like to know is when is it appropriate or not to use them for performance reasons and I mean in the context of games, not business software.
Function pointers are fast - John Carmack used them to the point of abuse in the Quake and Doom source code, and he is a genius :)
I would like to use function pointers more but I want to use them where they are most appropriate.
These days what are the best and most practical uses of function pointers in modern c-style languages such as C, C++, C# and Java, etc?
There is nothing especially "fast" about function pointers. They allow you to call a function which is specified at runtime. But you have exactly the same overhead as you'd get from any other function call (plus the additional pointer indirection). Further, since the function to call is determined at runtime, the compiler can typically not inline the function call as it could anywhere else. As such, function pointers may in some cases add up to be significantly slower than a regular function call.
Function pointers have nothing to do with performance, and should never be used to gain performance.
Instead, they are a very slight nod to the functional programming paradigm, in that they allow you to pass a function around as parameter or return value in another function.
A simple example is a generic sorting function. It has to have some way to compare two elements in order to determine how they should be sorted. This could be a function pointer passed to the sort function, and in fact C++'s std::sort() can be used exactly like that. If you ask it to sort sequences of a type that does not define the less-than operator, you have to pass in a function pointer it can call to perform the comparison.
And this leads us nicely to a superior alternative. In C++, you're not limited to function pointers. You often use functors instead - that is, classes that overload operator(), so that they can be "called" as if they were functions. Functors have a couple of big advantages over function pointers:
They offer more flexibility: they're full-fledged classes, with constructor, destructor and member variables. They can maintain state, and they may expose other member functions that the surrounding code can call.
They are faster: unlike function pointers, whose type only encodes the signature of the function (a variable of type void (*)(int) may point to any function which takes an int and returns void; we can't know which one), a functor's type encodes the precise function that should be called (since a functor is a class, call it C, we know that the function to call is, and will always be, C::operator()). And this means the compiler can inline the function call. That's the magic that makes the generic std::sort just as fast as your hand-coded sorting function designed specifically for your datatype. The compiler can eliminate all the overhead of calling a user-defined function.
They are safer: There's very little type safety in a function pointer. You have no guarantee that it points to a valid function. It could be NULL. And most of the problems with pointers apply to function pointers as well. They're dangerous and error-prone.
Function pointers (in C) or functors (in C++) or delegates (in C#) all solve the same problem, with different levels of elegance and flexibility: They allow you to treat functions as first-class values, passing them around as you would any other variable. You can pass a function to another function, and it will call your function at specified times (when a timer expires, when the window needs redrawing, or when it needs to compare two elements in your array)
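In C#, for example, the sorting callback looks like this - a minimal sketch using the built-in Comparison<T> delegate with Array.Sort:
using System;

class SortDemo
{
    static void Main()
    {
        string[] words = { "pear", "fig", "banana" };

        // Array.Sort calls the passed-in function whenever it needs to
        // compare two elements - a function treated as a first-class value.
        Array.Sort(words, (a, b) => a.Length.CompareTo(b.Length));

        Console.WriteLine(string.Join(", ", words)); // fig, pear, banana
    }
}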
As far as I know (and I could be wrong, because I haven't worked with Java for ages), Java doesn't have a direct equivalent. Instead, you have to create a class, which implements an interface, and defines a function (call it Execute(), for example). And then instead of calling the user-supplied function (in the shape of a function pointer, functor or delegate), you call foo.Execute(). Similar to the C++ implementation in principle, but without the generality of C++ templates, and without the function syntax that allows you to treat function pointers and functors the same way.
So that is where you use function pointers: When more sophisticated alternatives are not available (i.e. you are stuck in C), and you need to pass one function to another. The most common scenario is a callback. You define a function F that you want the system to call when X happens. So you create a function pointer pointing to F, and pass that to the system in question.
So really, forget about John Carmack and don't assume that anything you see in his code will magically make your code better if you copy it. He used function pointers because the games you mention were written in C, where superior alternatives are not available, and not because they are some magical ingredient whose mere existence makes code run faster.
They can be useful if you do not know the functionality supported by your target platform until run-time (e.g. CPU functionality, available memory). The obvious solution is to write functions like this:
int MyFunc()
{
    if (SomeFunctionalityCheck())
    {
        ...
    }
    else
    {
        ...
    }
}
If this function is called deep inside important loops then it's probably better to use a function pointer for MyFunc:
int MyFunc_SomeFunctionality()
{
    // the if (SomeFunctionalityCheck()) branch
    ...
}

int MyFunc_Default()
{
    // the else branch
    ...
}

int (*MyFunc)() = MyFunc_Default;

void MyFuncInit()
{
    if (SomeFunctionalityCheck()) MyFunc = MyFunc_SomeFunctionality;
}
There are other uses of course, like callback functions, executing byte code from memory or for creating an interpreted language.
To execute Intel-compatible byte code on Windows, which might be useful for an interpreter. For example, here is a stdcall function returning 42 (0x2A), stored in a byte array and then executed:
code = static_cast<unsigned char*>(VirtualAlloc(0, 6, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE));
// mov eax, 42
code[0] = 0xb8;
code[1] = 0x2a;
code[2] = 0x00;
code[3] = 0x00;
code[4] = 0x00;
// ret
code[5] = 0xc3;
// this line executes the code in the byte array
reinterpret_cast<unsigned int (__stdcall *)()>(code)();
...
// MEM_RELEASE requires a size of zero
VirtualFree(code, 0, MEM_RELEASE);
In my personal experience they can help you save a significant number of lines of code.
Consider the condition:
switch (sample_var)
{
    case 0:
        func1(<parameters>);
        break;
    case 1:
        func2(<parameters>);
        break;
    // ... up to case n
    case n:
        funcn(<parameters>);
        break;
}
where func1() ... funcn() are functions with same prototype.
What we could do is:
Declare an array of function pointers arrFuncPoint containing the addresses of the functions func1() to funcn().
Then the whole switch case would be replaced by
arrFuncPoint[sample_var](<parameters>);
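The same idea carries over to C# with an array of delegates - a sketch, where Action<int> stands in for the shared prototype of func1() to funcn():
using System;

class DispatchDemo
{
    static void Func1(int x) => Console.WriteLine($"func1({x})");
    static void Func2(int x) => Console.WriteLine($"func2({x})");
    static void FuncN(int x) => Console.WriteLine($"funcN({x})");

    // The whole switch collapses into a table indexed by sample_var.
    static readonly Action<int>[] arrFuncPoint = { Func1, Func2, FuncN };

    static void Main()
    {
        int sampleVar = 1;
        arrFuncPoint[sampleVar](42); // prints "func2(42)"
    }
}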
Any time you use an event handler or delegate in C#, you are effectively using a function pointer.
And no, they are not about speed. Function pointers are about convenience.
Function pointers are used as callbacks in many cases. One use is as a comparison function in sorting algorithms. So if you are trying to compare customized objects, you can provide a function pointer to the comparison function that knows how to handle that data.
That said, I'll provide a quote I got from a former professor of mine:
Treat a new C++ feature like you would treat a loaded automatic weapon in a crowded room: never use it just because it looks nifty. Wait until you understand the consequences, don't get cute, write what you know, and know what you write.
These days what are the best and most practical uses of integers in modern c-style languages?
In the dim, dark ages before C++, there was a common pattern I used in my code which was to define a struct with a set of function pointers that (typically) operated on that struct in some way and provided particular behaviors for it. In C++ terms, I was just building a vtable. The difference was that I could side-effect the struct at runtime to change behaviors of individual objects on the fly as needed. This offers a much richer model of inheritance at the cost of stability and ease of debugging. The greatest cost, however, was that there was exactly one person who could write this code effectively: me.
I used this heavily in a UI framework that let me change the way objects got painted, who was the target of commands, and so on, on the fly - something that very few UIs offered.
Having this process formalized in OO languages is better in every meaningful way.
Just speaking of C#: function pointers are used all over the language. Delegates and events (and lambdas, etc.) are all function pointers under the hood, so nearly any C# project is going to be riddled with function pointers. Basically every event handler and nearly every LINQ query will be using function pointers.
There are occasions when using function pointers can speed up processing. Simple dispatch tables can be used instead of long switch statements or if-then-else sequences.
Function pointers are a poor man's attempt to be functional. You could even make an argument that having function pointers makes a language functional, since you can write higher order functions with them.
Without closures and easy syntax, they're sorta gross. So you tend to use them far less than desirable. Mainly for "callback" functions.
Sometimes, OO design works around the lack of function values by creating a whole interface type just to pass in the one function needed.
C# has closures, so function pointers (which actually store an object so it's not just a raw function, but typed state too) are vastly more usable there.
Edit
One of the comments said there should be a demonstration of higher order functions with function pointers. Any function taking a callback function is a higher order function. Like, say, EnumWindows:
BOOL EnumWindows(
WNDENUMPROC lpEnumFunc,
LPARAM lParam
);
The first parameter is the function to pass in, easy enough. But since there are no closures in C, we get this lovely second parameter: "Specifies an application-defined value to be passed to the callback function." That app-defined value allows you to manually pass around untyped state to compensate for the lack of closures.
The .NET framework is also filled with similar designs. For instance, IAsyncResult.AsyncState: "Gets a user-defined object that qualifies or contains information about an asynchronous operation." Since the IAR is all you get on your callback, without closures, you need a way to shove some data into the async op so you can cast it out later.
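For contrast, calling EnumWindows from C#, where a closure makes the app-defined value unnecessary - a sketch using the usual user32.dll import:
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

class EnumDemo
{
    delegate bool EnumWindowsProc(IntPtr hWnd, IntPtr lParam);

    [DllImport("user32.dll")]
    static extern bool EnumWindows(EnumWindowsProc lpEnumFunc, IntPtr lParam);

    static void Main()
    {
        var handles = new List<IntPtr>();

        // The lambda closes over 'handles'; the state C would smuggle
        // through lParam rides along in the closure instead.
        EnumWindows((hWnd, lParam) => { handles.Add(hWnd); return true; },
                    IntPtr.Zero);

        Console.WriteLine($"{handles.Count} top-level windows");
    }
}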
Function pointers are fast
In what context? Compared to what?
It sounds like you just want to use function pointers for the sake of using them. That would be bad.
A pointer to a function is normally used as a callback or event handler.
In another question I asked, a comment indicated that the .NET framework's Array.Copy method uses unmanaged code. I went digging with Reflector and found that the signature of one of the Array.Copy method overloads is defined like so:
[MethodImpl(MethodImplOptions.InternalCall), ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
internal static extern void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length, bool reliable);
After looking at this, I'm slightly confused. The source of my confusion is the extern modifier which means (MSDN link):
The extern modifier is used to declare a method that is implemented externally.
However, the method declaration is also decorated with a MethodImplOptions.InternalCall attribute, which indicates (MSDN link):
Specifies an internal call. An internal call is a call to a method that is implemented within the common language runtime itself.
Can anyone explain this seemingly apparent contradiction?
I would have just commented on leppie's post, but it was getting a bit long.
I'm currently working on an experimental CLI implementation. There are many cases where a publicly exposed method (or property) can't be implemented without knowledge of how the virtual machine is implemented internally. One example is OffsetToStringData, which requires knowledge of how the memory manager allocates strings.
For cases like this, where there is no C# code that can express the method, you can treat each call to the method in a special way internal to the JIT process - for example, replacing the call byte code with an ldc.i4 (load constant integer) before passing it to the native code generator. The InternalCall flag means "the body of this method is treated in a special way by the runtime itself." There may or may not be an actual implementation - in several cases in my code the call is treated as an intrinsic by the JIT.
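For instance, RuntimeHelpers.OffsetToStringData can be read like any ordinary static property even though no C# body exists for it; on most runtimes the JIT substitutes the constant directly, much as described above (a sketch):
using System;
using System.Runtime.CompilerServices;

class IntrinsicDemo
{
    static void Main()
    {
        // Looks like a normal property read, but the getter is an
        // InternalCall/intrinsic: the JIT injects the constant (an ldc.i4).
        Console.WriteLine(RuntimeHelpers.OffsetToStringData);
    }
}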
There are other cases where the JIT may have special information available that allows heavy optimization of a method. One example is the Math methods, where even though these can be implemented in C#, specifying InternalCall to make them effectively intrinsics has significant performance benefits.
In C#, a method has to have a body unless it is abstract or extern. extern means, in general, "you can call this method from C# code, but its body is actually defined elsewhere." When the JIT reaches a call to an extern method, it looks up where to find the body and behaves differently depending on the result.
The DllImport attribute instructs the JIT to make a P/Invoke stub to call a native code implementation.
The InternalCall flag instructs the JIT to treat the call in a self-defined way.
(There are some others, but I don't have examples off the top of my head for their use.)
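To make the two flavors concrete, here they are side by side - a sketch; GetTickCount is a real kernel32 export, while the InternalCall line is hypothetical, since only the runtime itself can bind such a method:
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

static class ExternFlavors
{
    // extern + DllImport: the body lives in a native DLL; the JIT emits a
    // P/Invoke stub that transitions to the export.
    [DllImport("kernel32.dll")]
    public static extern uint GetTickCount();

    // extern + InternalCall: no body anywhere in metadata; the runtime
    // supplies (or inlines) the implementation. Hypothetical here - calling
    // it in a normal program would fail, as only the CLR can provide the body.
    [MethodImpl(MethodImplOptions.InternalCall)]
    public static extern void RuntimeProvided();
}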
InternalCall means the implementation is provided by the runtime itself.
extern says you are not providing the code.
extern can be used in two general situations: like the above, or with P/Invoke.
With P/Invoke, you simply tell the method where to get its implementation.