Performance of Calling Unmanaged .dll from C#

How long is the typical overhead added by calling a .dll written in C++ from a C# application using the following syntax?
[DllImport("abc.dll", EntryPoint = "xcFoo", CallingConvention = CallingConvention.Cdecl)]
public extern static Result Foo(out IntPtr session,
    [MarshalAs(UnmanagedType.FunctionPtr)] ObjectCallback callback,
    UInt64 turnKey,
    string serverAddress,
    string userId,
    string password);
Is there a more efficient way to do it?

Check out this article on how to improve interop performance; it covers what to do and what is best avoided.
http://msdn.microsoft.com/en-us/library/ms998551.aspx

Are you talking about the overhead of invoking the native method? If so, I don't think it is significant at all, as there are a lot of such calls in the .NET Framework class libraries.
Now, whether the overhead is significant for your scenario can only be answered by doing performance measurements, and comparing them against what you expect.
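If you want a rough number for your own machine, a minimal sketch along these lines gives an order-of-magnitude figure. It times a trivial kernel32 export so that only the managed-to-native transition is measured, not your actual Foo call; the iteration count is arbitrary:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class PInvokeOverheadCheck
{
    // A trivial Win32 API with no parameters and no marshaling,
    // used here only to time the managed-to-native transition itself.
    [DllImport("kernel32.dll")]
    static extern ulong GetTickCount64();

    static void Main()
    {
        const int iterations = 10_000_000;
        GetTickCount64(); // warm-up: forces the P/Invoke stub to be generated

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            GetTickCount64();
        sw.Stop();

        double nsPerCall = sw.Elapsed.TotalMilliseconds * 1_000_000.0 / iterations;
        Console.WriteLine($"~{nsPerCall:F1} ns per P/Invoke call");
    }
}

If the per-call figure is small compared to the work Foo itself does, the P/Invoke overhead is not your problem.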

The marshalling into the native method will cost three memory allocations from the NT heap, which is not so bad. It's the callback delegate going the other way that gets worrisome.

A good way to check this sort of thing is to put a breakpoint where you make the calls. You don't know when the library is loaded, so you may want to check the breakpoint only on the second call (unless loading cost is your main concern). Then open the disassembly window in Visual Studio and see how many instructions are executed before your DLL function is actually invoked.

I know this question is old, but I've managed to call native functions blazingly fast, not only by using the calli CIL instruction but also with a special trick. Of course, you then need to handle pinning and/or marshalling the arguments yourself if you deal with complex types, including strings.
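For what it's worth, on modern runtimes you can get a calli-based call without emitting IL by hand: C# 9 unmanaged function pointers compile down to the calli instruction. A minimal sketch, assuming .NET 5 or later, unsafe code enabled, and a purely hypothetical cdecl export int add(int, int) in abc.dll:

using System;
using System.Runtime.InteropServices;

static unsafe class CalliSketch
{
    static void Main()
    {
        IntPtr lib = NativeLibrary.Load("abc.dll");        // DLL name from the question above
        IntPtr addr = NativeLibrary.GetExport(lib, "add"); // hypothetical export

        // Invoking an unmanaged function pointer is compiled to the calli IL instruction.
        var add = (delegate* unmanaged[Cdecl]<int, int, int>)addr.ToPointer();
        Console.WriteLine(add(2, 3));

        NativeLibrary.Free(lib);
    }
}

As with the trick mentioned above, there is no marshalling layer here, so pinning and converting complex arguments is entirely your responsibility.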

Related

Differences in methods to call unmanaged C++ dll from C#

I am calling some unmanaged C functions (in an external DLL) from C#. I have two different methods to do so, and I am not sure of the differences between the two (other than the amount of code).
Method #1
[DllImport("PComm32.dll",CallingConvention=CallingConvention.StdCall, EntryPoint ="PmacSelect")]
public static extern int PmacSelect(IntPtr intPtr);
int device = PmacSelect(IntPtr.Zero);
Method #2
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int PmacSelect(IntPtr intptr);
[DllImport("kernel32.dll")]
public static extern IntPtr LoadLibrary(string dllToLoad);
[DllImport("kernel32.dll")]
public static extern IntPtr GetProcAddress(IntPtr hModule, string procedureName);
[DllImport("kernel32.dll")]
public static extern bool FreeLibrary(IntPtr hModule);
public PmacSelect PmacSelectFunction;
private IntPtr pDll = LoadLibrary("PComm32");
IntPtr pAddressOfFunctionToCall = GetProcAddress(pDll, "PmacSelect"); //find the function in the loaded pcomm32 dll
PmacSelectFunction = (PmacSelect)Marshal.GetDelegateForFunctionPointer(pAddressOfFunctionToCall, typeof(PmacSelect));
int device = PmacSelectFunction(IntPtr.Zero);
Both methods work, and call the PmacSelect Function located in the PComm32.dll file.
My question is what are the functional differences between the 2 methods?
Method #1 must rely on Windows managing the DLL in the background for me as needed? Could Windows load and unload the DLL without my knowledge? I don't really care if it does, as long as it automatically loads it when I call a function in the DLL.
Method #2 The DLL is loaded explicitly when I call LoadLibrary. Does the library remain in memory until I free it?
I'll give you answers, but you seem to already understand what is going on.
My question is what are the functional differences between the 2 methods?
There is no functional difference between the two methods. In the first method (which you should use if at all possible), the .NET Framework is handling everything for you. Under the hood, it is doing exactly what you're doing manually: calling LoadLibrary, GetProcAddress, and at some point, FreeLibrary. Those are the steps to calling a function in a DLL.
Method #1 must rely on windows managing the DLL in the background for me as needed? Could windows load and unload the dll without my knowledge?
Yes, that's exactly right, although I wouldn't say it is without your knowledge. You are telling it to do that when you write [DllImport("PComm32.dll"...)].
Method #2 The DLL is loaded explicitly when I call LoadLibrary. Does the library remain in memory until I free it?
Again, yes, you understand what is happening.
Since you seem to have answered your own questions and I've merely confirmed your answers, let me give you the reasons why you should (almost) always use #1:
It works just as well
It is easier to use and easier to read / maintain
There is no value in doing it the hard way (#2)
I can only think of one reason that you would ever want to bother with the second method: if you needed to be able to replace the DLL, on the fly, with a newer version or something, without quitting your application, then you would want the fine-grained control over when the DLL is unloaded (so that the file can be replaced).
That is unlikely to be a requirement except in very special cases.
Bottom Line: If you're using C#, you accept the fact that you are giving control to the framework in exchange for being able to focus on the work, and not worry about things like DLL or memory management. The .NET Framework is your friend: let it do the heavy lifting for you, and focus your efforts on the rest of the code.
When you P/Invoke with DllImport, LoadLibrary ends up getting called for you, and the exported function is (hopefully) found in the export address table (EAT). Basically, in your second example you're doing the same thing a typedef plus GetModuleHandle and GetProcAddress would do in C/C++.
The only reason I could think of to use your second method would be if the DllMain of your unmanaged module executes code upon attachment to the process, and, depending on the scenario, you want specific control over when that module gets loaded into your process.
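If you do find yourself needing that kind of control, here is a tidied-up sketch of Method #2 (same PComm32 names as in the question, error checking omitted) that makes the load and unload points explicit:

using System;
using System.Runtime.InteropServices;

static class ExplicitLoadSketch
{
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    delegate int PmacSelectDelegate(IntPtr device);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr LoadLibrary(string dllToLoad);

    [DllImport("kernel32.dll")]
    static extern IntPtr GetProcAddress(IntPtr hModule, string procedureName);

    [DllImport("kernel32.dll")]
    static extern bool FreeLibrary(IntPtr hModule);

    static void Main()
    {
        // You decide exactly when DllMain runs by deciding when LoadLibrary is called...
        IntPtr pDll = LoadLibrary("PComm32.dll");
        try
        {
            IntPtr pFunc = GetProcAddress(pDll, "PmacSelect");
            var pmacSelect = Marshal.GetDelegateForFunctionPointer<PmacSelectDelegate>(pFunc);
            int device = pmacSelect(IntPtr.Zero);
            Console.WriteLine(device);
        }
        finally
        {
            FreeLibrary(pDll); // ...and exactly when the module is unloaded.
        }
    }
}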

C# pinvoke free native c++ memory

If i have a c++ function which returns a char* like this:
char *cctalk_app_event_name(DG_CCTALK_APP_EVT_CODE);
And the corresponding C# signature:
[System.Runtime.InteropServices.DllImportAttribute("cctalk.dll", EntryPoint = "cctalk_app_event_name", CallingConvention = CallingConvention.Cdecl)]
public static extern System.IntPtr cctalk_app_event_name(DG_CCTALK_APP_EVT_CODE param0);
If the native code returns a char* allocated with the new keyword, am I guaranteed to have a memory leak each time I call this function from C#? Is there a way I can free that memory?
My hunch from looking at the function name is that this may well be returning a pointer to a string constant within the DLL, in which case you don't need to worry about freeing the pointer anyway.
If the manual for the SDK (link here) wasn't any use, then I'd disassemble the DLL and look at what that function did, but don't worry about how to delete it before you've established that you need to.
If you don't have any documentation or source then you cannot know whether or not the memory was allocated with new. If it was allocated with new then it needs to be deallocated with delete. That can only be done from the native code. If that is needed then the DLL will need to export a deallocator for you.
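For illustration, the consuming side would look something like this, assuming (purely hypothetically) that the DLL exported a matching deallocator named cctalk_free_string. If the string turns out to be a static constant, drop the free call and just copy the characters out with Marshal.PtrToStringAnsi:

using System;
using System.Runtime.InteropServices;

static class CctalkInterop
{
    // int stands in here for the DG_CCTALK_APP_EVT_CODE enum from the question.
    [DllImport("cctalk.dll", EntryPoint = "cctalk_app_event_name",
               CallingConvention = CallingConvention.Cdecl)]
    static extern IntPtr cctalk_app_event_name(int eventCode);

    // Hypothetical export: the real library would have to provide something like this.
    [DllImport("cctalk.dll", CallingConvention = CallingConvention.Cdecl)]
    static extern void cctalk_free_string(IntPtr str);

    public static string GetEventName(int eventCode)
    {
        IntPtr p = cctalk_app_event_name(eventCode);
        string name = Marshal.PtrToStringAnsi(p); // copies the chars into a managed string
        cctalk_free_string(p);                    // only valid if the DLL really allocates per call
        return name;
    }
}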
Other possibilities include allocation from a shared heap, e.g. the COM heap. Not very likely. Or perhaps the string is statically allocated and does not need deallocation. This final option is usually the case when a function returns a C string as a return value. If you had to guess, that's the most likely option. In any case, if there's no way for you to deallocate the string, what else can you do?
The only way you can be sure is to have documentation, or source code, or support from the author. I appreciate that you want to know the solution, but your only hope are the options listed in the first sentence of this paragraph.
I find it hard to believe that a library this complex has no documentation. How did you come by this library? Are you really sure there are no docs?

What exactly happens during a "managed-to-native transition"?

I understand that the CLR needs to do marshaling in some cases, but let's say I have:
using System.Runtime.InteropServices;
using System.Security;
[SuppressUnmanagedCodeSecurity]
static class Program
{
    [DllImport("kernel32.dll", SetLastError = false)]
    static extern int GetVersion();

    static void Main()
    {
        for (; ; )
            GetVersion();
    }
}
When I break into this program with a debugger, I always see a [Managed to Native Transition] frame in the call stack.
Given that there is no marshaling that needs to be done (right?), could someone please explain what's actually happening in this "managed-to-native transition", and why it is necessary?
First the call stack needs to be set up so that a STDCALL can happen. This is the calling convention for Win32.
Next, the runtime will push a so-called execution frame. There are many different types of frames: security asserts, GC protected regions, native code calls, ...
The runtime uses such a frame to track that currently native code is running. This has implications for a potentially concurrent garbage collection and probably other stuff. It also helps the debugger.
So not a lot is happening here actually. It is a pretty slim code path.
Besides the marshaling layer, which is responsible for converting parameters for you and figuring out calling conventions, the runtime needs to do a few other things to keep internal state consistent.
The security context needs to be checked, to make sure the calling code is allowed to access native methods. The current managed stack frame needs to be saved, so that the runtime can do a stack walk back for things like debugging and exception handling (not to mention native code that calls into a managed callback). Internal bits of state need to be set to indicate that we're currently running native code.
Additionally, registers may need to be saved, depending on what needs to be tracked and which are guaranteed to be restored by the calling convention. GC roots that are in registers (locals) might need to be marked in some way so that they don't get garbage collected during the native method.
So mainly it's stack handling and type marshaling, with some security stuff thrown in. Though it's not a huge amount of stuff, it will represent a significant barrier against calling smaller native methods. For example, trying to P/Invoke into an optimized math library rarely results in a performance win, since the overhead is enough to negate any of the potential benefits. Some performance profiling results are discussed here.
I realise that this has been answered, but I'm surprised that no one has suggested that you show the external code in the debug window. If you right-click on the [Native to Managed Transition] line and tick the Show External Code option, you will see exactly which .NET methods are being called in the transition. This may give you a better idea.
I can't really see much that'd be necessary to do. I suspect that it is mainly informative, to indicate to you that part of your call stack shows native functions, and also to indicate that the IDE and debugger may behave differently across that transition (since managed code is handled very differently in the debugger, and some features you expect may not work).
But I guess you should be able to find out simply by inspecting the disassembly around the transition. See if it does anything unusual.
Since you are calling a DLL, the call needs to go out of the managed environment. It is going into the Windows core. You are crossing the .NET boundary into Windows code that doesn't run the same way as .NET code.

Transfer of control and data between C and C#

A C# main program needs to call a C program, GA.c. This C code executes many functions, and one function, initialize(), calls an objective() function. But this objective() function needs to be written in C#. The call is in a loop in the C code, and the C code needs to continue execution after the return from objective() until its main is over and control returns to the C# main program.
C# main()
{
    //code
    call to GA in C;
    //remaining code;
}

GA in C:
Ga Main()
{
    //code
    call to initialize function();
    //remaining code
}

initialize function() in GA
{
    for(some condition)
    {
        //code
        call to objective(parameter) function in C#;
        //code
    }
}
How do we do this?
Your unmanaged C code needs to be in a library, not an executable. When a program "calls another program", that means it executes another executable, and any communication between the two processes is either in the form of command-line arguments to the callee coupled with an integer return value to the caller, or via some sort of IPC*. Neither of which allows the passing of a callback function (although equivalent functionality can be built with IPC, it's a lot of trouble).
From this C library, you'll need to export the function(s) you wish to be entry points from the C# code. You can then call this/these exported function(s) with platform invoke in C#.
C library (example for MSVC):
#include <windows.h>

BOOL APIENTRY DllMain(HMODULE hModule, DWORD ul_reason_for_call, LPVOID lpReserved){
    switch(ul_reason_for_call){
        case DLL_PROCESS_ATTACH:
        case DLL_THREAD_ATTACH:
        case DLL_THREAD_DETACH:
        case DLL_PROCESS_DETACH:
            break;
    }
    return TRUE;
}

#ifdef __cplusplus
extern "C"
#endif
__declspec(dllexport)
void WINAPI Foo(int start, int end, void (CALLBACK *callback)(int i)){
    for(int i = start; i <= end; i++)
        callback(i);
}
C# program:
using System;
using System.Runtime.InteropServices;

static class Program{
    delegate void FooCallback(int i);

    [DllImport(@"C:\Path\To\Unmanaged\C.dll")]
    static extern void Foo(int start, int end, FooCallback callback);

    static void Main(){
        FooCallback callback = i => Console.WriteLine(i);
        Foo(0, 10, callback);
        GC.KeepAlive(callback); // keep the GC from collecting the delegate until the native call is done
    }
}
This is working example code. Expand it to your needs.
A note about P/Invoke
Not that you asked, but there are two typical cases where platform invoke is used:
To leverage "legacy" code. A couple of good uses here:
To make use of existing code from your own code base. For instance, your company might want a brand new GUI for their accounting software, but choose to P/Invoke to the old business layer so as to avoid the time and expense of rewriting and testing a new implementation.
To interface with third-party C code. For instance, a lot of .NET applications use P/Invoke to access native Windows API functionality not exposed through the BCL.
To optimize performance-critical sections of code. Finding a bottleneck in a certain routine, a developer might decide to drop down to native code for this routine in an attempt to get more speed.
It is in this second case that there is usually a misjudgment. A number of considerations usually prove this to be a bad idea:
There is rarely a significant speed benefit to be obtained by using unmanaged code. This is a hard one for a lot of developers to swallow, but well-written managed code usually (though not always) performs nearly as fast as well-written unmanaged code. In a few cases, it can perform faster. There are some good discussions on this topic here on SO and elsewhere on the Net, if you're interested in searching for them.
Some of the techniques that can make unmanaged code more performant can also be used in C#. Primarily, I'm referring here to unsafe code blocks in C#, which allow one to use pointers, bypassing array bounds checking (there is a short sketch of this after this list). In addition, straight C code is usually written in a procedural fashion, eliminating the slight overhead that comes from object-oriented code. C# can also be written procedurally, using static methods and static fields. While unsafe code and gratuitous use of static members are generally best avoided, I'd say that they are preferable to mixing managed and unmanaged code.
Managed code is garbage-collected, while unmanaged code is usually not. While this is mostly a speed benefit while coding, it is sometimes a speed benefit at runtime, too. When one has to manage one's own memory, there is often a bit of overhead involved, such as passing an additional parameter to functions denoting the size of a block of memory. There is also eager destruction and deallocation, a necessity in most unmanaged code, whereas managed code can offload these tasks to the lazy collector, where they can be performed later, perhaps when the CPU isn't so busy doing real work. From what I've read, garbage collection also means that allocations can be faster than in unmanaged code. Lastly, some amount of manual memory management is possible in C#, using Marshal.AllocHGlobal and unsafe pointers, and this might allow one to make fewer, larger allocations instead of many smaller ones. Another technique is to convert types used in large arrays to value types instead of reference types, so that the memory for the entire array is allocated in one block.
Often overlooked is the cost within the platform invoke layer. This can outweigh small native code performance gains, especially when many transitions from managed to unmanaged (or vice versa, such as with your callback function) must occur. And this cost can increase considerably when marshaling must take place.
There's a maintenance hassle when splitting your code between managed and unmanaged components. It means maintaining logic in two different projects, possibly using two different development environments, and possibly even requiring two different developers with different skill sets. The typical C# developer is not a good C developer, and vice versa. At minimum, having the code split this way will be a mental stumbling block for any new maintainers of the project.
Oftentimes, a better performance gain can be had by just rethinking the existing implementation, and rewriting it with a new approach. In fact, I'd say that most real performance gains that are achieved when bottleneck code is rewritten for a "faster" platform are probably directly due to the developer being forced to rethink the problem.
Sometimes, the code that is chosen to be dropped down into unmanaged code is not the real bottleneck. People too often make assumptions about what is slowing them down without doing actual profiling to verify. Profiling can often reveal inefficiencies that can be corrected fairly easily without dropping down to a lower-level platform.
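As a concrete illustration of the unsafe-code point above, here is a minimal sketch of walking an array through a pinned pointer instead of indexed access; it requires compiling with unsafe code enabled:

static class UnsafeSumSketch
{
    static unsafe long SumUnsafe(int[] data)
    {
        long total = 0;
        fixed (int* p = data) // pin the array so the GC cannot move it
        {
            int* cur = p;
            int* end = p + data.Length;
            while (cur < end)
                total += *cur++; // pointer walk: no per-element bounds check
        }
        return total;
    }

    static void Main() =>
        System.Console.WriteLine(SumUnsafe(new[] { 1, 2, 3, 4, 5 })); // prints 15
}

Whether this actually beats a plain loop depends on the workload and the JIT; measure before committing to it.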
If you find yourself faced with a temptation to mix platforms to increase performance, keep these pitfalls in mind.
* There is one more thing, sort of. The parent process can redirect the stdin and stdout streams of the child process, implementing character-based message passing via stdio. This is really just an IPC mechanism; it's just one that's been around longer than the term "IPC" (AFAIK).
This is known as a callback. When you create an instance of GA, pass it your C# objective() method as a delegate (a delegate is a reference to a method). Look for the MSDN help topic on delegates in C#.
I don't know the proper syntax for the C side of this. And there are sure to be some special considerations for calling out to unmanaged code. Someone else is bound to provide the whole answer. :)

Will there be any performance issue using C++ inside C#?

I have a C++ program which does text processing on 40k records. We developed this program in C++ because we thought it would be faster. Then I executed this C++ part from my C# program by spawning a separate process, but the problem is that we feel we have lost control of the execution flow: we are not able to debug the C++ part. I want to integrate the C++ much more closely into my C# program. I googled and found that I have to build the C++ code as a DLL, and then I can use it inside my C# program.
Question:
Will this slow down the execution of the C++ part?
Is there any better alternative for integrating the C++ part into my C# program?
You have a few options here:
Write the processing in .NET and measure the performance. If it is unacceptable, try to optimize it. If it is still too slow, revert to unmanaged code. But assuming that unmanaged code will be faster, and writing unmanaged code for that reason without measuring, is IMHO the wrong approach.
As you have already written the unmanaged code, you can expose it as a dynamic link library by exporting a function that will do the processing:
extern "C" __declspec(dllexport) int DoProcessing(int);
Next you import the function in managed code:
class Program
{
    [DllImport("mylibrary.dll")]
    static extern int DoProcessing(int input);

    static void Main()
    {
        int result = DoProcessing(123);
    }
}
This works if the input and output of your processing is not very complex and can be easily marshaled. It will have very little overhead.
Compile the unmanaged code with C++/CLI as a managed assembly and reference it directly.
Wrapping the C++ code inside a DLL will not slow it down in any way.
Yes, there is a (slight) performance penalty for calling functions in a DLL as opposed to calling them within the same executable; for instance, the compiler cannot inline the calls. But this is often completely negligible overhead (3-5 CPU instructions).
This is probably the simplest way.
You can't tell if this will be fast enough to meet your goals without measuring. Do it the simplest way possible (wrap the existing C++ code inside a DLL) and see if it meets your performance goals. I'm guessing it probably will.
Calling native code from managed code does have some overhead per method call; if your program is heavily compute bound and will be calling the native methods many times per record, you may see a slow-down due to the interop. If your code calls the native code once to process all 40k records in bulk, the cost of doing interop will be greatly dwarfed by the actual time spent processing the records. If the records are coming from a slower storage medium, such as over the network, your processing time will probably be negligible compared to the I/O time.
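To make that distinction concrete, here is what the two call shapes might look like; the textproc.dll name and both exports are hypothetical:

using System.Runtime.InteropServices;

static class TextProcInterop
{
    // Per-record: roughly 40k managed-to-native transitions for 40k records.
    [DllImport("textproc.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern int ProcessRecord(byte[] record, int length);

    // Bulk: one transition; the native side loops over all records internally.
    [DllImport("textproc.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern int ProcessRecords(byte[] records, int[] offsets, int recordCount);
}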
Try to implement it in C#.
40k records seems like a VERY low number. It may be (depending on how much processing you need to do on each record) that processing the 40k records in C# is actually faster than even spawning the process like you currently do.
Other than that, compile your C++ code to a DLL and load that in-process. That will still have some overhead, but it will be way smaller than spawning an additional process.
I agree with AdamRalph: I do not think you gained anything but integration pains by writing this code in C++.
By the way, is the C++ code managed? If it is, why don't you just link it into your C# code and avoid all the interop overhead?
