When we import legacy dlls in C# we use something like the following notation:
[DllImport("user32.dll")] // Why am I enclosed in "["s
static extern int MessageBoxA(int hWnd, string strMsg, string strCaption, int iType);
OR also:
[MarshalAs(UnmanagedType.LPStr)] // <-- What in the world is it?
string arg1,
as mentioned here
However, this notation is not used exclusively for interop services, as here:
[Conditional("DOT")] // <--- this guy right here!
static void MethodB()
{
    Console.WriteLine(false);
}
But it is not listed as a preprocessor directive on MSDN.
What is this notation called? Where can I find literature or documentation for it?
These are attributes. They're not "preprocessor" parts of the language, unlike things like #if and #pragma (which are still not really handled by a preprocessor, but are meant to be thought of that way).
Basically, attributes allow you to express compile-time constant metadata about types, fields, methods, parameters, and return values. That metadata can then be retrieved at execution time via reflection.
One important thing to know in terms of finding documentation: the C# compiler will attempt to resolve an attribute like this:
[Foo]
as both Foo and FooAttribute. So your [MarshalAs] example actually refers to MarshalAsAttribute. Conventionally, all attributes end with an Attribute suffix.
I wanted to answer this question and thought I would look inside of the Array source code to see how it is implemented. Therefore, I looked in .NET source code for the CreateInstance method and found that it calls an external method whose body is a semi-colon and implemented elsewhere. Here is what it looks like:
private unsafe static extern Array
InternalCreate(void* elementType,int rank,int *pLengths,int *pLowerBounds);
Question:
How do I find where the implementation for the above external method is?
To find the source code for any extern methods, do the following:
Find the name of the extern method. In my case it is InternalCreate.
Go here and find the mapping of the method to the external method. In my case I needed to find InternalCreate and here is what the mapping looks like. The name of the class is ArrayNative and the method is CreateInstance:
FCFuncElement("InternalCreate", ArrayNative::CreateInstance)
Find the mapped class here. In my case I needed arraynative and I needed the method CreateInstance. The implementation is right there and I am copying it here but removing the body for brevity:
FCIMPL4(Object*, ArrayNative::CreateInstance,
void* elementTypeHandle, INT32 rank, INT32* pLengths, INT32* pLowerBounds)
{
//...
}
There you will find the implementation and can study the code.
I am currently writing a thin C# binding for OpenGL. I've just recently implemented the OpenGL GenVertexArrays function, which has the following signature:
OpenGL Documentation on glGenVertexArrays.
Essentially, you pass it an array in which to store generated object values for the vertex arrays created by OpenGL.
In order to create the binding, I use delegates as glGenVertexArrays is an OpenGL extension function, so I have to load it dynamically using wglGetProcAddress. The delegate signature I have defined in C# looks like this:
[SuppressUnmanagedCodeSecurity]
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
private delegate void glGenVertexArrays(uint amount, uint[] array);
The function pointer is retrieved and converted to this delegate using Marshal.GetDelegateForFunctionPointer, like this:
IntPtr proc = wglGetProcAddress(name);
del = Marshal.GetDelegateForFunctionPointer(proc, delegateType);
Anyways, here's what bothers me:
All the official documentation I can find on the default marshalling behaviour for reference types (which includes arrays) says this:
By default, reference types (classes, arrays, strings, and interfaces)
passed by value are marshaled as In parameters for performance
reasons. You do not see changes to these types unless you apply
InAttribute and OutAttribute (or just OutAttribute) to the method
parameter.
This is taken from this MSDN page: MSDN page on directional attributes
However, as can be seen from my delegate signature, the [In] and [Out] directional attributes have not been used on the array of unsigned integers, meaning that when I call this function I should not be able to see the generated object values which OpenGL stored in it. Except I can. Using this signature, I get the following result when running the debugger:
As can be seen, the call absolutely did affect the array, even though I did not explicitly use the [Out] attribute. This is not, from what I understand, a result I should expect.
Does anyone know the reason behind this? I know it might seem like a minor matter, but I am very curious to know why this seems to break the default marshalling behaviour described by Microsoft. Is there some behind-the-scenes stuff going on when invoking delegates compared to pure platform invoke prototypes? Or am I misinterpreting the documentation?
[EDIT]
For anyone curious, the public method that invokes the delegate is defined on a static "GL" class, and is as follows:
public static void GenVertexArrays(uint amount, uint[] array)
{
    InvokeExtensionFunction<glGenVertexArrays>()(amount, array);
}
It is not mentioned on the documentation page you linked, but there is another topic dedicated to the marshaling of arrays, where it says:
With pinning optimization, a blittable array can appear to operate as an In/Out parameter when interacting with objects in the same apartment.
Both conditions are met in your case: array of uint is blittable, and there is no machine-to-machine marshaling. It is still a good idea to declare it [Out], so your intention is documented within the code.
The documentation is correct in the general case. But uint is a bit special: it is a blittable type. An expensive word that means that the pinvoke marshaller does not have to do anything special to convert the array element values. A uint in C# is exactly the same type as an unsigned int in C. Not a coincidence at all, it is the kind of type that a processor can handle natively.
So the marshaller can simply pin the array and pass a pointer to the first array element as the second argument. Very fast, always what you want. And the function scribbles directly into the managed array, so copying the values back is not necessary. A bit dangerous too, you never ever want to lie about the amount argument, GC heap corruption is an excessively ugly bug to diagnose.
Most simple value types and structs of simple value types are blittable. bool is a notable exception. You'll otherwise never have to be sorry for using [Out] even if it is not necessary. The marshaller simply ignores it here.
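To make the pinning point concrete, here is a minimal C++ sketch of what the native side of such a call does. Note that gen_ids is a hypothetical stand-in for a function like glGenVertexArrays, and call_with_buffer merely plays the role of the marshaller handing over a pointer to the first element; neither is part of OpenGL or .NET.

```cpp
#include <vector>

// Hypothetical stand-in for a native function such as glGenVertexArrays:
// it writes `amount` values through the pointer it was given.
extern "C" void gen_ids(unsigned int amount, unsigned int* out)
{
    for (unsigned int i = 0; i < amount; ++i)
        out[i] = i + 1;   // each write lands directly in the caller's memory
}

// Plays the role of the marshaller: no copy is made, the callee receives
// the address of the first element of the caller's own buffer.
std::vector<unsigned int> call_with_buffer(unsigned int amount)
{
    std::vector<unsigned int> buffer(amount, 0);
    gen_ids(amount, buffer.data());
    return buffer;   // already holds the generated values, nothing to copy back
}
```

Because the callee writes through the raw pointer, passing an amount larger than the buffer would overwrite whatever sits next to it in memory, which is the heap corruption risk mentioned above.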
I have arrived at a point in my self-taught studies where I am not fully grasping what a delegate in C# is useful for. Additionally, on a whim, I decided to take a look at an intro to C++ site and I came across function templates.
Maybe I'm comparing apples and oranges here, but I understood a delegate to be a sort of template for a function that would be defined at run-time. Is this true? If so, how does that differ from a function template in C++?
Is it possible to see (realistic) examples of each in use?
A delegate is a way of taking a member function of some object, and creating a...thing that can be called independently.
In other words, if you have some object A, with some member function F, that you'd normally call as something like: A.F(1);, a delegate is a single entity that you can (for example) pass as a parameter, that acts as a proxy for that object/member function, so when the delegate is invoked, it's equivalent to invoking that member function of that object.
It's a way of taking existing code, and...packaging it to make it easier to use in a fairly specific way (though I feel obliged to add, that 'way' is quite versatile so delegates can be extremely useful).
A C++ function template is a way of generating functions. It specifies some set of actions to take, but does not specify the specific type of object on which those actions will happen. The specification is at a syntactic level, so (for example) I can specify adding two things together to get a third item that's their sum. If I apply that to numbers, it sums like you'd expect. If I do the same with strings, it'll typically concatenate the strings. This is because (syntactically) the template just specifies something like a+b, but + is defined to do addition of numbers, and concatenation of strings.
Looked at slightly differently, a function template just specifies the skeleton for some code. The rest of that code's body is "filled in" based on the type, when you instantiate the template over some specific type.
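As a rough illustration of the a+b case described above, here is a minimal sketch. The names add and add_examples are made up for this example.

```cpp
#include <cassert>
#include <string>

// One template: the compiler generates a separate function for each type
// it is instantiated with.
template <typename T>
T add(const T& a, const T& b)
{
    return a + b;   // what + means depends entirely on T
}

// Exercises two instantiations of the same template.
inline void add_examples()
{
    // add<int>: + is arithmetic addition
    assert(add(2, 3) == 5);
    // add<std::string>: + is concatenation
    assert(add(std::string("foo"), std::string("bar")) == "foobar");
}
```

The template body is written once, but add<int> and add<std::string> are two distinct functions produced at compile time, which is quite different from a delegate referring to one existing function at run time.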
In C++ terms, a C# delegate combines an object pointer and a member function pointer into one callable entity, which calls the member function on the pointed-to object.
You can do about the same with std::bind and std::function.
Before C++11 there was a short flurry of articles on how to do very efficient delegates in C++. std::function is a very reasonable compromise and may even attain those levels of efficiency.
Example:
#include <iostream>
#include <functional>
using namespace std;
// Here `function<void()>` serves as a "delegate" type.
void callback_on( function<void()> const f )
{
    f();
}
struct S
{
    int x;
    void foo() const { cout << x << endl; }
};
int main()
{
    S o = {42};
    callback_on( bind( &S::foo, &o ) );
}
Given the following C++ class in foo.dll
class a{
private:
    int _answer;
public:
    a(int answer) { _answer = answer; }
    __declspec(dllexport) int GetAnswer() { return _answer; }
};
I would like to pInvoke GetAnswer from C#. To do that, I use the following method:
[DllImport("foo.dll", CallingConvention = CallingConvention.ThisCall, EntryPoint= "something")]
public static extern int GetAnswer(IntPtr thisA);
And I pass in an IntPtr that points to an a (that I got from somewhere else, it's not important). CallingConvention = CallingConvention.ThisCall makes sure it's handled correctly.
What's cool about this question is that I know I'm right so far because it's already working great! Using Depends.exe, I can see that "GetAnswer" is exported as ?GetAnswer@a@@UAEHXZ (or something close, the point being that it's been name mangled). When I plug the mangled name into the "something" for the EntryPoint, everything works great! It took me about a day before it dawned on me to use Depends.exe, so I'm going to leave this here as a help to anybody who has a similar issue.
My REAL Question is: Is there any way to disable C++ name mangling on GetAnswer so that I don't need to put the mangled name in as my entry point. Having the mangled name in there seems like it could break, because my understanding of name mangling is that it can change if the compiler changes. Also it's a pain in the butt to use Depends.exe for every instance method that I want to pInvoke.
Edit: Forgot to add what I've tried:
I don't seem to be able to put extern "C" on the function declaration, although I can stick it on the definition. This doesn't seem to help though (which is obvious when you think about it)
The only other solution I can think of is a c-style function that wraps the instance method and takes an instance of an a as a parameter. Then, disable name mangling on that wrapper and pInvoke that. I'd rather stick with the solution that I already have, though. I already told my co-workers that pInvoke is great. I'm going to look like an idiot if I have to put special functions in our c++ library just to make pInvoke work.
You cannot disable mangling for a C++ class method, but you may well be able to export the function under a name of your choice using /EXPORT or a .def file.
However, your entire approach is brittle because you rely on an implementation detail, namely that this is passed as an implicit parameter. And what's more, exporting individual methods of a class is a recipe for pain.
The most sensible strategies for exposing a C++ class to .net languages are:
Create flat C wrapper functions and p/invoke those.
Create a C++/CLI mixed mode layer that publishes a managed class that wraps the native class.
Option 2 is preferable in my opinion.
You may be able to use the comment/linker #pragma to pass the /EXPORT switch to the linker which should allow you to rename the exported symbol:
#pragma comment(linker, "/EXPORT:GetAnswer=?GetAnswer@a@@UAEHXZ")
Unfortunately, this does not resolve your need to look up the mangled name using depends or some other tool.
You do not have to disable name mangling. The mangled name actually contains a lot of information about how the function is declared; it essentially encodes the function's whole signature. I understand you have already found a workaround and that the other answer has been accepted. What I describe below is how to make it work the way you wanted.
[DllImport("foo.dll", CallingConvention = CallingConvention.ThisCall, EntryPoint = "#OrdinalNumber")]
public static extern int GetAnswer(IntPtr thisA);
If you replace "#OrdinalNumber" with the real ordinal number of GetAnswer, such as "#1", it will work as you wanted.
You can think of the EntryPoint property as equivalent to the name passed to GetProcAddress, which accepts either the function name or the function's ordinal number.
Your approach to calling non-static member functions of a C++ class is indeed correct, and this is exactly the situation the thiscall calling convention exists for in C# P/Invoke. The issue with this approach is that you have to look into the DLL's PE export information and find the ordinal number of each function you want to call. If you have a large number of C++ functions to call, you may want to automate that process.
From the Question Author: The solution I actually went with
I ended up going with a c-style function that wraps the instance method and takes an instance of an a as a parameter. That way, if the class ever does get inherited from, the right virtual method will get called.
I deliberately chose not to go with C++/CLI because it's just one more project to manage. If I needed to use all of the methods on a class, I would consider it, but I really only need this one method that serializes the class data.
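For readers wondering what such a wrapper looks like, here is a minimal sketch and not my exact code. The names a_GetAnswer and FOO_API are hypothetical, and the class is simplified from the one in the question.

```cpp
#include <cassert>

class a {
private:
    int _answer;
public:
    explicit a(int answer) : _answer(answer) {}
    int GetAnswer() const { return _answer; }
};

// Hypothetical export macro: __declspec(dllexport) on MSVC, nothing elsewhere.
#if defined(_MSC_VER)
#define FOO_API __declspec(dllexport)
#else
#define FOO_API
#endif

// extern "C" gives the function C linkage, so it is exported without
// C++ name mangling. The instance is passed explicitly instead of
// relying on an implicit this parameter.
extern "C" FOO_API int a_GetAnswer(const a* self)
{
    assert(self != nullptr);   // a null instance pointer is a caller bug
    return self->GetAnswer();
}
```

On the C# side, the import would then presumably name a_GetAnswer directly, with CallingConvention.Cdecl and an IntPtr parameter, so no mangled EntryPoint or ordinal lookup is needed.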
In another question I asked, a comment arose indicating that the .NET framework's Array.Copy method uses unmanaged code. I went digging with Reflector and found that one of the Array.Copy method overloads is declared like so:
[MethodImpl(MethodImplOptions.InternalCall), ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
internal static extern void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length, bool reliable);
After looking at this, I'm slightly confused. The source of my confusion is the extern modifier which means (MSDN link):
The extern modifier is used to declare
a method that is implemented
externally.
However, the method declaration is also decorated with a MethodImplOptions.InternalCall attribute, which indicates (MSDN link):
Specifies an internal call. An
internal call is a call to a method
that is implemented within the common
language runtime itself.
Can anyone explain this apparent contradiction?
I would have just commented on leppie's post, but it was getting a bit long.
I'm currently working on an experimental CLI implementation. There are many cases where a publicly exposed method (or property) can't be implemented without knowledge of how the virtual machine is implemented internally. One example is OffsetToStringData, which requires knowledge of how the memory manager allocates strings.
For cases like this, where there is no C# code to express the method, you can treat each call to the method in a special way internal to the JIT process. For example, the call bytecode can be replaced with an ldc.i4 (load integer constant) instruction before it is passed to the native code generator. The InternalCall flag means "The body of this method is treated in a special way by the runtime itself." There may or may not be an actual implementation; in several cases in my code the call is treated as an intrinsic by the JIT.
There are other cases where the JIT may have special information available that allows heavy optimization of a method. One example is the Math methods, where even though these can be implemented in C#, specifying InternalCall to make them effectively intrinsics has significant performance benefits.
In C#, a method has to have a body unless it is abstract or extern. The extern modifier means, in general, "You can call this method from C# code, but the body of it is actually defined elsewhere." When the JIT reaches a call to an extern method, it looks up where to find the body and behaves differently depending on the result.
The DllImport attribute instructs the JIT to make a P/Invoke stub to call a native code implementation.
The InternalCall flag instructs the JIT to treat the call in a self-defined way.
(There are some others, but I don't have examples off the top of my head for their use.)
InternalCall means provided by the framework.
extern says you are not providing code.
extern can be used in 2 general situations, like above, or with p/invoke.
With p/invoke, you simply tell the method where to get the implementation.