I am currently browsing decompiled C# IL (with ILSpy) to get an impression on how some of the methods from System.Runtime.InteropServices are (could be) implemented.
When I wanted to check how Marshal.Copy() is implemented, I found out that it only calls CopyToNative(), which is defined as follows:
// System.Runtime.InteropServices.Marshal
[MethodImpl(MethodImplOptions.InternalCall)]
private static extern void CopyToNative(object source, int startIndex, IntPtr destination, int length);
Where is it implemented?
Is there any chance to look at its (decompiled) source code?
If not, does anyone have a clue on how it could be implemented?
The MethodImplAttribute and extern keyword indicate that this function is internal to the .NET runtime itself. This function is almost certainly implemented in C or C++ in the runtime's source code.
Without access to the runtime's source code your only real option is to disassemble this particular function and examine the assembly. (Please note that doing this may violate the EULA of your runtime.)
You might consider looking at Mono's source code to get a look at one possible implementation.
Marshal.Copy() implementations
copy_to_unmanaged() runtime-internal implementation
Related
In NetCore, MS added some new functions to copy memory, I want to know which is the best in performance.
Note: There is a similar answer for .NET Framework but NetCore has some drastical changes!
Cannot see any source code
Unsafe.CopyBlock - memcpy ?
Unsafe.CopyBlockUnaligned - will omit compiler check for aligned memory ?
These looks to be using same methods internally.
Array.Copy - user Buffer class.
Buffer.BlockCopy - .NET Framework used memcpy, using memmove in Core.
Buffer.MemoryCopy - docs suggest its using unsafe pointers to copy 64bit word per op, internally memmove.
I have used https://source.dot.net/ to inspect NetCore source code.
Unsafe.CopyBlock - memcpy ?
Unsafe.CopyBlockUnaligned - will omit compiler check for aligned memory ?
This is probably no longer useful to OP, but I thought I'd point out that it is the opposite.
CopyBlock - omits the alignment check (meaning it will throw exception if CPU does not support unaligned pointers).
CopyBlockUnaligned - includes the safety check (it adds additional code to properly handle unaligned pointers). It is not a compiler check since pointers can't be validated at compile time, but an additional IL instruction.
I have found out that following methods are heavily optimized based on the data length you are trying to copy.
If source and destination memory overlaps they will use memmove (aligning and moving memory), otherwise bulkcopy (copying data in chunks)
[MethodImpl(MethodImplOptions.InternalCall)]
private static extern void __BulkMoveWithWriteBarrier(ref byte destination, ref byte source, nuint byteCount);
[DllImport(RuntimeHelpers.QCall, CharSet = CharSet.Unicode)]
private static extern unsafe void __Memmove(byte* dest, byte* src, nuint len);
Array.Copy, Buffer.BlockCopy, Buffer.MemoryCopy, Span.CopyTo, Span.TryCopyTo are calling a generic Buffer.Memmove that will decide which one of the two to use.
System.Buffer.MemoryCopy seems to use a custom implementation
instead of calling the native memcpy function or using the cpblk
instruction (and it seems to dynamically switch between a (custom)
memcpy and a QCall memmove).
Some answers that helped:
How is Array.Copy implemented in C#?
Where is the implementation of "_Memmove" in the core library souce code
I wanted to answer this question and thought I would look inside of the Array source code to see how it is implemented. Therefore, I looked in .NET source code for the CreateInstance method and found that it calls an external method whose body is a semi-colon and implemented elsewhere. Here is what it looks like:
private unsafe static extern Array
InternalCreate(void* elementType,int rank,int *pLengths,int *pLowerBounds);
Question:
How do I find where the implementation for the above external method is?
To find the source code for any extern methods, do the following:
Find the name of the extern method. In my case it is InternalCreate.
Go here and find the mapping of the method to the external method. In my case I needed to find InternalCreate and here is what the mapping looks like. The name of the class is ArrayNative and the method is CreateInstance:
FCFuncElement("InternalCreate", ArrayNative::CreateInstance)
Find the mapped class here. In my case I needed arraynative and I needed the method CreateInstance. The implementation is right there and I am copying it here but removing the body for brevity:
FCIMPL4(Object*, ArrayNative::CreateInstance,
void* elementTypeHandle, INT32 rank, INT32* pLengths, INT32* pLowerBounds)
{
//...
}
There you will find the implementation and study the code.
Given the following c++ class in foo.dll
class a{
private:
int _answer;
public:
a(int answer) { _answer = answer; }
__declspec(dllexport) int GetAnswer() { return _answer; }
}
I would like the pInvoke GetAnswer from C#. To do that, I use the following method:
[DllImport("foo.dll", CallingConvention = CallingConvention.ThisCall, EntryPoint= "something")]
public static extern int GetAnswer(IntPtr thisA);
And I pass in an IntPtr that points to an a (that I got from somewhere else, it's not important). CallingConvention = CallingConvention.ThisCall makes sure it's handled correctly
What's cool about this question is that I know I'm right so far because it's already working great! Using Depends.exe, I can see that "GetAnswer" is exported as ?GetAnswer#a##UAEHXZ (Or something close - the point being that it's been name mangled). When I plug the mangled name into the "something" for the EntryPoint everything works great! It took me about a day before it dawned on me to use Depends.exe, so I'm going to leave this here as a help to anybody who has a similar issue.
My REAL Question is: Is there any way to disable C++ name mangling on GetAnswer so that I don't need to put the mangled name in as my entry point. Having the mangled name in there seems like it could break, because my understanding of name mangling is that it can change if the compiler changes. Also it's a pain in the butt to use Depends.exe for every instance method that I want to pInvoke.
Edit: Forgot to add what I've tried:
I don't seem to be able to put extern "C" on the function declaration, although I can stick it on the definition. This doesn't seem to help though (which is obvious when you think about it)
The only other solution I can think of is a c-style function that wraps the instance method and takes an instance of an a as a parameter. Then, disable name mangling on that wrapper and pInvoke that. I'd rather stick with the solution that I already have, though. I already told my co-workers that pInvoke is great. I'm going to look like an idiot if I have to put special functions in our c++ library just to make pInvoke work.
You cannot disable mangling for a C++ class method, but you may well be able to export the function under a name of your choice using /EXPORT or a .def file.
However, your entire approach is brittle because you rely on an implementation detail, namely that this is passed as an implicit parameter. And what's more, exporting individual methods of a class is a recipe for pain.
The most sensible strategies for exposing a C++ class to .net languages are:
Create flat C wrapper functions and p/invoke those.
Create a C++/CLI mixed mode layer that publishes a managed class that wraps the native class.
Option 2 is preferable in my opinion.
You may be able to use the comment/linker #pragma to pass the /EXPORT switch to the linker which should allow you to rename the exported symbol:
#pragma comment(linker, "/EXPORT:GetAnswer=?GetAnswer#a##UAEHXZ")
Unfortunately, this does not resolve your need to look up the mangled name using depends or some other tool.
You do not have to disable the mangled name which actually contains lots of information of how the function itself is declared, it basically represents the whole signature of the function after the function name gets de-mangled. I understand you already found a word-around and the other answer has been marked as a correct answer. What I am writing below is how we can make it work as you desired.
[DllImport("foo.dll", CallingConvention = CallingConvention.ThisCall, EntryPoint = "#OrdinalNumber")]
public static extern int GetAnswer(IntPtr thisA);
If you replace "#OrdinalNumber" with the real ordinal number of GetAnsweer, such as "#1", it will work as you desired.
You may just consider the EntryPoint property is the same as the function name we pass to GetProcAddress where you can either pass the function name or the ordinal number of the function.
Your approach to calling non-static function members of a C++ class is indeed correct and thiscall is used correctly and that is exactly thiscall calling convention comes in play in C# P/Invoke. The issue with this approach is that you will have to look into the DLL's PE information, Export Function Information and find out the ordinal number for each function you would like to call, if you have a big number of C++ functions to call, you may want to automate such a process.
From the Question Author: The solution I actually went with
I ended up going with a c-style function that wraps the instance method and takes an instance of an a as a parameter. That way, if the class ever does get inherited from, the right virtual method will get called.
I deliberately chose not to go with C++/CLI because it's just one more project to manage. If I needed to use all of the methods on a class, I would consider it, but I really only need this one method that serializes the class data.
I am developing a C# application. Since I have some algorithms for least-squares fit in C/C++ that would be too cumbersome too translate, I have made the C++ code into a dll and then created a wrapper in C#.
In the C# code, I have defined a struct that is passed to the unmanaged C++ code as a pointer. The struct contains the initial guesstimates for the fitting functions, and it is also used to return the results of the fit.
It appears to me that you must define the struct in both the managed and the unmanaged code. However, someone using my source code in in the future might decide to change the fields of the struct in the C# application, without understanding that they also have to change the struct in the native code. This will result in a run-time error at best (and produce erroneous results at worse), but there will be no error message telling the developer / end user what is wrong.
From my understanding, it impossible to create a test in the unmanaged C++ DLL that checks if the struct contains the correct fields, but is it posible to have the DLL returning a struct of the correct format to the C# wrapper?
Otherwise, what ways are there to reduce the risk that some careless programmer in the future causes run-time errors that are hard to detect?
Sample code:
//C++
struct InputOutputStruct {
double a,b,c;
}
extern "C" __declspec(dllexport) void DoSomethingToStruct(InputOutputStruct* s)
{
// ... algorithm
}
//C#
using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Sequential)]
public struct InputOutputStruct {
public double a,b,c;
}
[DllImport("CpluplusDll.dll")]
public static unsafe extern bool DoSomethingToStruct(InputOutputStruct* s);
class CSharpWrapper {
static void Main(string[] args)
{
InputOutputStruct s = new InputOutputStruct();
unsafe {
InputOutpustruct* sPtr = &s;
DoSomethingToStruct(sPtr);
s = *sPtr;
}
}
}
It appears to me that you must define the struct in both the managed
and the unmanaged code.
Not true. This is what C++/CLI was invented for- facilitate much easier interoperation with C++ and .NET.
but is it posible to have the DLL returning a struct of the correct format to the C# wrapper?
I don't think so, because you always need to define the structs on the C# side.
Here are a solution which may work ( never tested ):
Give each struct which is shared a unique identifier on both sides ( GUID, Macros )
Create a reflection for C++ which contains informations about the the types which are used on C# and C++ side. This could be done by using macros.
Compare the C++ structs and the C# structs on startup by using the GUIDs, reflection and macros. You can also use sizeof to compare sizes first.
That could be a lot of work
When working on C++ side you still can make a lot of things wrong when you not know about the macros
In another question I asked, a comment arose indicating that the .NET framework's Array.Copy method uses unmanaged code. I went digging with Reflector and found the signature one of the Array.Copy method overloads is defined as so:
[MethodImpl(MethodImplOptions.InternalCall), ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
internal static extern void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length, bool reliable);
After looking at this, I'm slightly confused. The source of my confusion is the extern modifier which means (MSDN link):
The extern modifier is used to declare
a method that is implemented
externally.
However, the method declaration is also decorated with a MethodImplOptions.InternalCall attribute, which indicates (MSDN link):
Specifies an internal call. An
internal call is a call to a method
that is implemented within the common
language runtime itself.
Can anyone explain this seemingly apparent contradiction?
I would have just commented on leppie's post, but it was getting a bit long.
I'm currently working on an experimental CLI implementation. There are many cases where a publicly exposed method (or property) can't be implemented without knowledge of how the virtual machine is implemented internally. One example is OffsetToStringData, which requires knowledge of how the memory manager allocates strings.
For cases like this, where there is no C# code to express the method, you can treat each call to the method in a special way internal to the JIT process. As an example here, replacing the call byte code with a ldc.i4 (load constant integer) before passing it to the native code generator. The InternalCall flag means "The body of this method is treated in a special way by the runtime itself." There may or may not be an actual implementation - in several cases in my code the call is treated as an intrinsic by the JIT.
There are other cases where the JIT may have special information available that allows heavy optimization of a method. One example is the Math methods, where even though these can be implemented in C#, specifying InternalCall to make them effectively intrinsics has significant performance benefits.
In C#, a method has to have a body unless it is abstract or extern. The extern means a general "You can call this method from C# code, but the body of it is actually defined elsewhere.". When the JIT reaches a call to an extern method, it looks up where to find the body and behaves in different ways per the result.
The DllImport attribute instructs the JIT to make a P/Invoke stub to call a native code implementation.
The InternalCall flag instructs the JIT to treat the call in a self-defined way.
(There are some others, but I don't have examples off the top of my head for their use.)
InternalCall means provided by the framework.
extern says you are not providing code.
extern can be used in 2 general situations, like above, or with p/invoke.
With p/invoke, you simply tell the method where to get the implementation.