My application is written in native C++. I want to use some C# code that deals with DB stuff.
I've been reading into the /clr flag and have some questions related to this.
I am aware that the compiler will add a thunk for each call to a managed function; however, I only need a single call. To minimize the effect of the /clr flag, I only want to compile a single .cpp file with this flag. Would this affect the rest of the application?
What would be the effect of compiling my entire application (which is fully unmanaged) with the /clr flag? As long as the code is fully unmanaged (and calls no managed code), I would assume the compiler hardly adds any code. Is that true?
I guess I don't need /clr:pure, as I don't want to switch to C++/CLI; I only want access to some C# code in a very isolated part of the application. Does that make sense?
I have read this and this, and was wondering: if I use, in C#, functions from an unmanaged C++ library via a C# wrapper of that library, is there going to be any difference in performance compared with the same program written fully in unmanaged C++ against the C++ library directly? I am asking about a significant performance difference, say bigger than a factor of 1.5. Note that I am asking about the performance of the C++ library's functions only (in the two scenarios, with and without the C# wrapper), isolating the other code!
Edit:
I was just wondering: if I use an unmanaged C++ dynamic library (.dll) from C# through a wrapper, which part gets compiled to intermediate CIL code and which does not? I guess only the wrapper is compiled to CIL, and when C# wants to call a C++ function from the library it just marshals and passes the arguments through the wrapper, so there may be some delay, but nothing like writing the whole library in C#. Please correct me if I am mistaken.
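For illustration, a minimal P/Invoke wrapper of this kind might look like the sketch below; the DLL name and exported function are placeholders. Only this C# class is compiled to CIL, while the exported function itself still runs as ordinary native machine code.

    using System.Runtime.InteropServices;

    // Hypothetical wrapper: only this class is compiled to CIL. The exported
    // native function itself runs as ordinary unmanaged machine code.
    static class NativeLib
    {
        // "mylib.dll" and "Compute" are placeholder names for the unmanaged
        // C++ library and one of its exported (extern "C") functions.
        [DllImport("mylib.dll", CallingConvention = CallingConvention.Cdecl)]
        public static extern double Compute(double input, int iterations);
    }

    class Demo
    {
        static void Main()
        {
            // Each call pays the small managed-to-native transition cost,
            // but the heavy lifting inside Compute runs at native speed.
            double result = NativeLib.Compute(3.14, 1000);
            System.Console.WriteLine(result);
        }
    }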
Of course, there is overhead involved in switching from managed to unmanaged code execution. It is very modest; it takes about 12 CPU cycles. All that needs to be done is write a "cookie" on the stack so that the garbage collector can recognize that subsequent stack frames belong to unmanaged code and therefore should not be inspected for valid object references.
These cookies are strung together like a linked list, supporting the scenario where C# code calls native code which in turn calls back into managed code; the list is traversed by the GC when it collects. That is not as uncommon as it sounds; it happens in any GUI app, for example. The Click event is a good case, triggered when the UI thread pinvokes GetMessage().
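For a self-contained illustration of that round trip (managed code pinvoking a native function that calls back into managed code through a delegate), EnumWindows is a convenient real API to try:

    using System;
    using System.Runtime.InteropServices;

    class Callbacks
    {
        // Native callback signature: BOOL CALLBACK EnumWindowsProc(HWND, LPARAM)
        delegate bool EnumWindowsProc(IntPtr hWnd, IntPtr lParam);

        [DllImport("user32.dll")]
        static extern bool EnumWindows(EnumWindowsProc lpEnumFunc, IntPtr lParam);

        static void Main()
        {
            int count = 0;
            // Managed -> native: EnumWindows is unmanaged. Native -> managed:
            // user32 invokes our delegate for every top-level window, so the GC
            // sees interleaved managed and unmanaged frames on this stack.
            EnumWindows((hWnd, lParam) => { count++; return true; }, IntPtr.Zero);
            Console.WriteLine("Top-level windows: " + count);
        }
    }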
That is not the only thing that needs to happen, however; in any practical scenario you also pass arguments to the native function. They can require a lot more work to get marshaled into a format that the native code can understand. Arrays in particular: they only need to get pinned if the array elements are blittable, which is still pretty cheap. It gets expensive when the entire array needs to be converted because the element type is not blittable. That is not always easy to recognize; a profiler is forever the proper tool to detect inefficient code.
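A sketch of what that difference looks like in a [DllImport] declaration; the exports shown are hypothetical, only the parameter types matter:

    using System.Runtime.InteropServices;

    static class Marshaling
    {
        // Hypothetical exports of a native "mylib.dll", for illustration only.

        // int[] is blittable: the marshaller just pins the managed array and
        // passes a pointer to it. Cheap, no copying.
        [DllImport("mylib.dll")]
        public static extern int SumInts(int[] values, int count);

        // string[] is not blittable: every element has to be converted to a
        // native string, which means an allocation and a copy on each call.
        [DllImport("mylib.dll", CharSet = CharSet.Ansi)]
        public static extern int CountChars(string[] values, int count);
    }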
I just found out that I can unload a DLL that was linked implicitly by calling FreeLibrary() from C#. I remember I couldn't do this in C++, but it works well in my simple test project. I wonder if this would be okay in my real projects too. Is it safe to use this method?
That is fairly vague; I'll have to assume you are talking about DLLs that got loaded through pinvoke. Yes, there is no protection against calling FreeLibrary() twice. This works in C++ as well, by the way, for a DLL that's loaded explicitly. Not for implicitly loaded DLLs; they get a reference count of "infinity".
The pinvoke marshaller uses LoadLibrary() under the hood; that happens when the very first [DllImport] function gets executed. The OS loader simply keeps a reference count: every LoadLibrary() call increments it and FreeLibrary() decrements it. When it reaches 0, the DLL gets unloaded. So if you pinvoke LoadLibrary() yourself and call FreeLibrary() twice, the DLL does get unloaded. The virtual address space formerly used by the memory-mapped file that maps the code in the DLL is released and can be used again by subsequent allocations.
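A minimal sketch of that scenario, pinvoking LoadLibrary()/GetProcAddress()/FreeLibrary() yourself; the DLL name and export are placeholders:

    using System;
    using System.Runtime.InteropServices;

    static class ExplicitLoad
    {
        [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        static extern IntPtr LoadLibrary(string fileName);

        [DllImport("kernel32.dll", CharSet = CharSet.Ansi, SetLastError = true)]
        static extern IntPtr GetProcAddress(IntPtr module, string procName);

        [DllImport("kernel32.dll", SetLastError = true)]
        static extern bool FreeLibrary(IntPtr module);

        // Signature of the hypothetical exported function we want to call.
        [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
        delegate int DoWorkDelegate(int arg);

        static void Main()
        {
            IntPtr module = LoadLibrary("mylib.dll");   // reference count +1
            IntPtr proc = GetProcAddress(module, "DoWork");
            var doWork = (DoWorkDelegate)Marshal.GetDelegateForFunctionPointer(
                proc, typeof(DoWorkDelegate));

            Console.WriteLine(doWork(42));

            FreeLibrary(module);   // reference count -1; once it hits 0 the DLL is
                                   // unmapped and 'doWork' must never be used again
        }
    }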
Safe? No, that's not a word that jumps to mind. When you accidentally call an entrypoint in the DLL afterwards, your program is going to behave very poorly. The pinvoke marshaller cannot do anything about it; the stub for the native method was already generated. The odds of an AccessViolationException are decent but not guaranteed. Arbitrary code execution is technically possible.
The only truly safe way to do this is to ensure that the AppDomain that contains the pinvoke code is unloaded. You get no help with this, just a rule you have to implement yourself.
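One possible shape for that rule, assuming classic .NET Framework AppDomains; the worker type, assembly layout and DLL name are hypothetical:

    using System;
    using System.Runtime.InteropServices;

    // Hypothetical sketch: the pinvoke declarations and the worker live in their
    // own assembly so the whole thing can be loaded into a separate AppDomain.
    static class NativeMethods
    {
        [DllImport("mylib.dll")]                 // placeholder DLL name
        public static extern int DoWork(int arg);
    }

    // MarshalByRefObject lets the default domain call it across the boundary.
    public class NativeWorker : MarshalByRefObject
    {
        public int DoWork(int arg) { return NativeMethods.DoWork(arg); }
    }

    static class Isolation
    {
        public static void Run()
        {
            AppDomain domain = AppDomain.CreateDomain("NativeSandbox");
            var worker = (NativeWorker)domain.CreateInstanceAndUnwrap(
                typeof(NativeWorker).Assembly.FullName,
                typeof(NativeWorker).FullName);

            worker.DoWork(42);

            // Unloading the domain discards the managed code (and its pinvoke
            // stubs) that could still call into the DLL. Only after this point
            // is it reasonable to drop the DLL itself with FreeLibrary().
            AppDomain.Unload(domain);
        }
    }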
Okay, I messed something up. I've written a DLL in C++ which I call from managed code (C# .NET). The library works like diamonds and is blazingly fast.
My DLL maintains internal state, i.e. it allocates heaps of memory and uses a myriad of variables which are not cleared between the calls from .NET. Instead they stay there, and the C# code is aware of that (there is preprocessing and building of data structures); in fact this is required for performance.
So what is the problem?
I want to add multi-threading, effectively by allowing each .NET thread to access its own copy of the DLL. If no data were stored between the calls, this would be easily achievable with just one DLL.
But in my case, do I have to copy the *.dll as many times as there are threads and write a P/Invoke wrapper for each file separately?? :O I mean [DllImport(...)] for each of the roughly 40 functions?
No way, there must be something more clever. Help?
Simply put, you need to stop sharing variables between threads.
Your global variables are the problem. Instead you need each different thread to have its own copy of the state that persists between calls. Typically you would put this state into a structure of some sort, perhaps a struct. Then an initial call to the DLL would return a new instance of this structure. You then pass that structure back into the DLL every time you call a function that requires access to the persistent state. When you are done, call back into the DLL to deallocate the structure. You don't need to declare the structure in the managed code. You can just treat it as an opaque pointer. Use IntPtr.
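A sketch of what the managed side of that pattern could look like; the export names are hypothetical and the corresponding functions would have to be added to the DLL:

    using System;
    using System.Runtime.InteropServices;

    // Hypothetical exports added to the existing DLL: create/use/destroy a
    // per-caller context instead of relying on global variables.
    static class NativeContext
    {
        [DllImport("mylib.dll", CallingConvention = CallingConvention.Cdecl)]
        public static extern IntPtr CreateContext();                 // returns opaque state

        [DllImport("mylib.dll", CallingConvention = CallingConvention.Cdecl)]
        public static extern int Process(IntPtr context, int input); // uses that state

        [DllImport("mylib.dll", CallingConvention = CallingConvention.Cdecl)]
        public static extern void DestroyContext(IntPtr context);    // frees it
    }

    class Worker
    {
        public void Run()
        {
            // Each thread creates its own context, so nothing is shared.
            IntPtr ctx = NativeContext.CreateContext();
            try
            {
                for (int i = 0; i < 1000; i++)
                    NativeContext.Process(ctx, i);
            }
            finally
            {
                NativeContext.DestroyContext(ctx);
            }
        }
    }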
Of course, perhaps you'd just be better off with a C++/CLI assembly.
I am using CodeDom to dynamically compile an assembly in memory (using CompilerParameters.GenerateInMemory = True) and would like to know if there is any way (using additional VB.NET code in my assembly) to prevent someone from being able to save a copy of the assembly to their desktop while the assembly is still running in memory?
Or is it even possible for the assembly to detect when someone is using some hacker-type program to save a copy of my assembly while it's running in memory? Experts, let me know if this is possible and how to accomplish it.
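For reference, the in-memory compilation being described looks roughly like the sketch below (shown in C#; the question uses the VB.NET provider, and the source string is supplied by the caller):

    using System;
    using System.CodeDom.Compiler;
    using System.Reflection;

    class InMemoryCompile
    {
        static Assembly Compile(string source)
        {
            // "VisualBasic" selects the VB.NET compiler, as in the question.
            using (CodeDomProvider provider = CodeDomProvider.CreateProvider("VisualBasic"))
            {
                var options = new CompilerParameters
                {
                    GenerateInMemory = true,     // load the result into this process
                    GenerateExecutable = false   // instead of keeping an output file
                };
                options.ReferencedAssemblies.Add("System.dll");

                CompilerResults results = provider.CompileAssemblyFromSource(options, source);
                if (results.Errors.HasErrors)
                    throw new InvalidOperationException(results.Errors[0].ToString());

                // The compiled IL and metadata still live in this process's
                // memory, which is what the answer below is about.
                return results.CompiledAssembly;
            }
        }
    }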
A short answer would be "no". The critical problem with any security-through-obscurity measure is that at some point, the code has to run. In the case of a managed library, this applies to the metadata as well (unless you write your own IL-to-native compiler), because it has to be compiled by the JIT compiler. You can't really stop the "hacker types", because even at the lowest level they can analyze the native code and observe the memory directly. True, there are more high-level hackers (and script kiddies) than low-level hackers, but the point stands.
In the case of dynamic assemblies, they're definitely in memory as well, as you've pointed out yourself. In fact, I believe they live in their own region of virtual memory, so it isn't even that hard to find them in memory :)
Are you trying to implement some copy-protection scheme or something? That is pretty much impossible even with native code; managed code only makes it that much easier to remove the protection :)
In the midst of asking about manually managing CLR memory, I realized I know very little.
I'm aware that the CLR will place a 'cookie' on the stack when you exit a managed context, so that the Garbage Collector won't trample your memory space; however, in everything I've read the assumption is that you are calling some library written in C.
I want to implement an entire write layer of my application in C#, outside of the managed context, to manage data at a low level. Then, I want to access this layer from a managed layer.
In this case, will my Unmanaged C# code compile to IL and be run on the CLR? How does this work?
I assume this is related to the same C# database project you mentioned in the question.
It is technically possible to implement an entire write layer in C/C++ or any other language, and it is technically possible to have everything else in C#. I am currently working on an application that uses unmanaged code for some high-performance, low-level stuff and C# for business logic and upper-level management.
However, the complexity of the task should not be underestimated. The typical way to do this is to design a contract that both parties can understand. The contract is exposed to the managed language, and the managed language triggers calls to the native application. If you have ever tried calling a C++ method from C#, you will get the idea... Plus, every call to unmanaged code has quite a significant performance overhead, which may kill the whole idea of low-level performance.
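If you want to gauge that per-call overhead for your own workload, a crude micro-benchmark is enough; GetTickCount64 below is just a stand-in for a trivial call into your own library:

    using System;
    using System.Diagnostics;
    using System.Runtime.InteropServices;

    class TransitionCost
    {
        // A trivial, real Win32 export; any cheap native function would do here.
        [DllImport("kernel32.dll")]
        static extern ulong GetTickCount64();

        static ulong ManagedNoOp() { return 0; }

        static void Main()
        {
            const int N = 10000000;
            ulong sink = 0;

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < N; i++) sink += GetTickCount64();   // managed -> native every iteration
            Console.WriteLine("pinvoke: " + sw.ElapsedMilliseconds + " ms");

            sw.Restart();
            for (int i = 0; i < N; i++) sink += ManagedNoOp();      // stays inside the CLR
            Console.WriteLine("managed: " + sw.ElapsedMilliseconds + " ms (" + sink + ")");
        }
    }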
If you are really interested in high-performance relational databases, then use a single low-level language.
If you want a naive but fully working implementation of a database, just use C#. Don't mix these two languages unless you fully understand the complexity. See Raven DB, a document-based NoSQL database that is built entirely in C#.
Will my Unmanaged C# code compile to IL and be run on the CLR?
No, there is no such thing as unmanaged C#. C# code is always compiled into IL and executed by the CLR. What you describe is managed code calling unmanaged code. The unmanaged code can be implemented in several languages (C/C++/assembly, etc.), but the CLR will have no idea of what is happening inside that code.
Update from a comment: there is a tool (ngen.exe) that can compile a managed assembly ahead of time into architecture-specific native code. This tool is designed to improve the performance of a managed application by removing the JIT-compilation stage and putting native code directly into the executable image or library. This code, however, is still "managed" by the CLR host: memory allocation and collection, managed threading, application domains, exception handling, security and all other aspects are still controlled by the CLR. So even though C# can technically be compiled into native code, this code does not run as a standalone native image.
How does this work?
Managed code can interoperate with unmanaged code. There are a couple of ways to do this:
Directly in code via .NET interop. This is relatively fast but looks a bit ugly in code (plus it is hard to maintain/test) (good article with C#/C/Assembly samples)
A much, much slower approach, but more open to other languages: web services (SOAP, WS, REST and company), queueing (such as MSMQ, NServiceBus and others), and (possibly) interprocess communication. Here the unmanaged process sits on one end and the managed application sits on the other.
I know this is a C# question, but if you are comfortable with C++, C++/CLI might be an option worth considering.
It allows you to selectively compile portions of your C++ code to either a managed or an unmanaged context. However, be aware that code that interacts with CLR types MUST run in a managed context.
I'm not aware of the exact runtime cost of transitioning from a managed context to an unmanaged context and vice versa from within the C++ code, but I assume it must be similar to the cost of calling a native method via .NET interop from C#, which, as @oleksii already pointed out, is expensive. In my experience this has really paid off when you need to interact frequently with native C or C++ libraries: IMHO it is much easier to call them from within a C++/CLI project than to write the required .NET interop interfaces in C#.
See this question for a little bit of information on how it is done.