Using C# Pointers - c#

How does C# make use of pointers? If C# is a managed language and the garbage collector does a good job of preventing memory leaks and freeing memory properly, then what is the effect of using pointers in C#, and how "unsafe" are they?

To use pointers you have to allow unsafe code, and mark the methods using pointers as unsafe. You then have to fix any pointers in memory to make sure the garbage collector doesn't move them:
byte[] buffer = new byte[256];
// fixed ensures the buffer won't be moved, which would make your pointers invalid
fixed (byte* ptrBuf = buffer) {
    // ...
}
It is unsafe because, theoretically, you could take a pointer, walk the entire address space, and corrupt or change the internal CLR data structures to, say, change a method implementation. You can't do that in managed code.
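To make the snippet above concrete, here is a minimal sketch of a complete method that pins an array and reads it through a pointer. The names FixedDemo and Sum are made up for illustration, and the code assumes the project is compiled with unsafe code enabled (for example the AllowUnsafeBlocks project setting).

```csharp
using System;

static class FixedDemo
{
    // Sums the bytes of a managed array through a raw pointer.
    // Hypothetical helper; compiles only with unsafe code enabled.
    public static unsafe int Sum(byte[] buffer)
    {
        if (buffer == null || buffer.Length == 0) return 0;

        int total = 0;
        // The array is pinned only while execution is inside this block.
        fixed (byte* p = buffer)
        {
            for (int i = 0; i < buffer.Length; i++)
            {
                // No bounds check happens here: p[buffer.Length] would
                // silently read past the end of the array.
                total += p[i];
            }
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new byte[] { 1, 2, 3, 4 }));
    }
}
```

The loop bound is the only thing keeping the pointer inside the array, which is exactly the protection that managed indexing normally provides for you.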

When using pointers in C# (inside unsafe code blocks), the memory is not managed by the Framework. You are responsible for managing your own memory and cleaning up after yourself.
...therefore, I would consider it fairly "unsafe".

C# supports pointers in a limited way. In C#, a pointer can only be declared to hold the memory address of value types and arrays. Unlike reference types, pointer types are not tracked by the default garbage collection mechanism. Pointers are also not allowed to point to a reference type, or even to a structure type that contains a reference type. So, in pure C#, they have rather limited uses. If used in 'unsafe' code, they would be considered pretty unsafe (of course!).

Related

Memory allocation/deallocation when working with C# and C++ unmanaged

I am working with some C# and C++ unmanaged code and there are two things I don't understand when dealing with memory. If someone can help me understand:
If a variable is dynamically allocated under C# (using new) and then passed to the C++ unmanaged code, does that variable's memory need to be deallocated manually in the C++ unmanaged code by the user?
If a variable is dynamically allocated under C++ unmanaged (using new) and then passed to C#, is it safe to say the Garbage Collector will deallocate that memory?
No. Since the object is allocated on the managed heap, the GC will handle deallocation as usual. The problem is that you must tell it not to deallocate or move the object while it is used from unmanaged code, because the GC can't know how long you are going to use the object there. This can be done by pinning the object.
See answer to this question.
No, since the object is allocated on C++ unmanaged heap GC won't touch it. You have to deallocate it yourself using delete.
Edit:
If you need to allocate an object in managed code and deallocate it in unmanaged code, or vice versa, it's good to know that there is an OS heap for this purpose, which you can use via the Marshal.AllocHGlobal and Marshal.FreeHGlobal calls from C#. There will be similar calls in C++.
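As a rough sketch of the Marshal.AllocHGlobal / Marshal.FreeHGlobal pairing mentioned above: the class and method names below are invented for the example, and the block only round-trips a value within C#. In real interop the pointer would be handed to native code between the allocation and the free.

```csharp
using System;
using System.Runtime.InteropServices;

static class HGlobalDemo
{
    // Writes a value into an unmanaged block and reads it back.
    // Hypothetical helper used only to show the allocate/free pairing.
    public static int RoundTrip(int value)
    {
        // The GC does not track this block; we own it until FreeHGlobal.
        IntPtr block = Marshal.AllocHGlobal(sizeof(int));
        try
        {
            Marshal.WriteInt32(block, value);
            return Marshal.ReadInt32(block);
        }
        finally
        {
            // Runs even if an exception is thrown, so the block never leaks.
            Marshal.FreeHGlobal(block);
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip(42));
    }
}
```

The try/finally is the important part: nothing else will ever free this memory.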
It's really simple!
Depends
Depends
Eh, Sorry about that.
Under typical conditions, C# will keep track of the memory and get rid of it any time after it's no longer used on the C# side. It has no way of tracking references on the C++ side, so one common mistake in interop is that the memory is deallocated before the unmanaged side is done with it (resulting in loads of FUN). This only applies to cases where the memory is directly referenced, not when it's copied (the typical case being a byte[] that's pinned for the duration of the unmanaged call). Don't use automatic marshalling when the lifetime of the object/pointer being passed to unmanaged code is supposed to be longer than the run of the invoked method.
Under typical conditions, C# has no way of tracking memory allocations in the C++ code, so you can't rely on automatic memory management. There are exceptions (e.g. some COM scenarios), but you'll almost always need to manage the memory manually. This usually means sending the pointer back to the C++ code to do the deallocation, unless it used a global allocator of some kind (e.g. CoTaskMemAlloc). Remember that in the unmanaged world, there is no one memory manager that you can safely invoke to dispose of memory; you don't really have the necessary information anyway.
This only applies to pointers, of course. Passing integers is perfectly fine, and using automatic marshalling usually means the marshaller takes care of most of the subtleties (though still only in the simplest case, so be careful). Unmanaged code is unmanaged - you need to understand perfectly how the memory is allocated, and how, when and who is responsible for cleaning up the memory.
As a rule of thumb, whichever component/object allocates memory should deallocate memory. For every new a delete by the one which did new.
That is the ideal. If it can't be followed, for example because your C++ program may terminate and no longer exist when the allocated memory's lifecycle comes to an end, your C# should clean up, and vice versa.

`fixed` vs GCHandle.Alloc(obj, GCHandleType.Pinned)

I tried to find out how pinned pointers defined with fixed keyword work. My idea was that internally GCHandle.Alloc(object, GCHandleType.Pinned) was used for that. But when I looked into the IL generated for the following C# code:
unsafe static void f1()
{
    var arr = new MyObject[10];
    fixed (MyObject* aptr = &arr[0])
    {
        Console.WriteLine(*aptr);
    }
}
I couldn't find any traces of GCHandle.
The only hint I saw that the pinned pointer was used in the method was the following IL declaration:
.locals init ([0] valuetype TestPointerPinning.MyObject[] arr,
[1] valuetype TestPointerPinning.MyObject& pinned aptr)
So the pointer was declared as pinned, and pinning it did not require any additional method calls.
My questions are
Is there any difference between using pinned pointers in the declaration and pinning the pointer by using GCHandle class?
Is there any way to declare a pinned pointer in C# without using fixed keyword? I need this to pin a bunch of pointers within a loop and there's no way I can do this using fixed keyword.
Well, sure there's a difference, you saw it. The CLR supports more than one way to pin an object. Only the GCHandleType.Pinned method is directly exposed to user code. But there are others, like "async pinned handles", a feature that keeps I/O buffers pinned while a driver performs an overlapped I/O operation. And the one that the fixed keyword uses, it doesn't use an explicit handle or method call at all. These extra ways were added to make unpinning the objects again as quick and reliable as possible, very important to GC health.
Fixed buffer pins are implemented by the jitter. Which performs two important jobs when it translates MSIL to machine code, the highly visible one is the machine code itself, you can easily see it with the debugger. But it also generates a data structure used by the garbage collector, completely invisible in the debugger. Required by the GC to reliably find object references back that are stored in the stack frame or a CPU register. More about that data structure in this answer.
The jitter uses the [pinned] attribute on the variable declaration in the metadata to set a bit in that data structure, indicating that the object that's referenced by the variable is temporarily pinned. The GC sees this and knows to not move the object. Very efficient because it doesn't require an explicit method call to allocate the handle and doesn't require any storage.
But no, these tricks are not available otherwise to C# code, you really do need to use the fixed keyword in your code. Or GCHandle.Alloc(). If you are finding yourself getting lost in the pins then high odds that you ought to be considering pinvoke or C++/CLI so you can easily call native code. The temporary pins that the pinvoke marshaller uses to keep objects stable while the native code is running are another example of automatic pinning that doesn't require explicit code.
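For the "pin a bunch of pointers within a loop" case from the question, the GCHandle.Alloc() route might look roughly like this. This is a sketch, not the asker's code: PinLoopDemo and FirstBytes are made-up names, and the example assumes every buffer has at least one element.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

static class PinLoopDemo
{
    // Pins each buffer, reads its first byte through the pinned address,
    // and unpins everything at the end. Assumes non-empty buffers.
    public static byte[] FirstBytes(byte[][] buffers)
    {
        var handles = new List<GCHandle>();
        var result = new byte[buffers.Length];
        try
        {
            for (int i = 0; i < buffers.Length; i++)
            {
                // Each pinned handle keeps one array at a fixed address.
                GCHandle handle = GCHandle.Alloc(buffers[i], GCHandleType.Pinned);
                handles.Add(handle);

                IntPtr address = handle.AddrOfPinnedObject();
                result[i] = Marshal.ReadByte(address);
            }
        }
        finally
        {
            // Unpin promptly: long-lived pins fragment the managed heap.
            foreach (GCHandle handle in handles)
            {
                handle.Free();
            }
        }
        return result;
    }

    static void Main()
    {
        byte[] firsts = FirstBytes(new[] { new byte[] { 7, 8 }, new byte[] { 9 } });
        Console.WriteLine(string.Join(",", firsts));
    }
}
```

Unlike fixed, each of these pins costs a handle allocation and stays in place until Free() is called, so the finally block matters.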

Marshalling between C# and C++, and the Juggling of Responsibilities

What if I had a native C++ function in which, depending on the result of the function, the responsibility for deleting a certain pointer (delete[]) differs between the caller and the function? I would of course check the return value and act accordingly in C++.
Question is, what if the function was marshalled between C++ and C#, will setting the pointer to null in C# be enough?
No. C# can't do what delete[] in C++ does. You'd have to use a shared memory allocation API, or write a C++ wrapper that handles the cleanup.
No, simply setting a pointer allocated in native code to null will not free the memory. The CLR can only garbage collect memory that it knows about (aka managed memory). It has no idea about native memory and hence can't collect it. Any native memory which has ownership in a managed type must be explicitly freed.
The most common way this is done is via the Alloc and Free functions on the Marshal class
http://msdn.microsoft.com/en-us/library/atxe881w.aspx

How does memory management in Java and C# differ?

I was reading through 2010 CWE/SANS Top 25 Most Dangerous Programming Errors and one of the entries is for Buffer Copy without Checking Size of Input. It suggests using a language with features to prevent or mitigate this problem, and says:
For example, many languages that perform their own memory management, such as Java and Perl, are not subject to buffer overflows. Other languages, such as Ada and C#, typically provide overflow protection, but the protection can be disabled by the programmer.
I was not aware that Java and C# differed in any meaningful way with regard to memory management. How is it that Java is not subject to buffer overflows, while C# only protects against overflows? And how is it possible to disable this protection in C#?
Java does not support raw pointers (strictly speaking, it does not support pointer arithmetic).
In C#, you can use unsafe code and pointers, and unmanaged memory, which makes buffer overruns possible. See unsafe keyword.
To maintain type safety and security, C# does not support pointer arithmetic, by default. However, by using the unsafe keyword, you can define an unsafe context in which pointers can be used. For more information about pointers, see the topic Pointer types.
Good answers. I would add that Java depends on the usage of stack or heap memory locations. C# does as well. The idea of using raw pointers is an addition to C# that comes from its C background. Although C# and C/C++ are not the same language, they do share some common semantics. The idea of using "unsafe" code allows you to avoid keeping large objects on the heap, where memory is limited to around 2GB per runtime instance (for C# per CLR, for Java per JVM instance), without incurring dramatic performance degradation due to garbage collection. In some cases you can use C#'s ability to leverage unsafe or manually managed memory pointers to get around the fact that there are not nearly as many third-party tools for problems like caching outside of the heap.
I would caution that if you do use unsafe code, be sure to get familiar with "Disposable Types" and "Finalizers". This can be a rather advanced practice, and the ramifications of not disposing of your objects properly are the same as with C code ... the dreaded MEMORY LEAK. The repercussion is that your app runs out of memory and falls over (not good). That is why C# does not allow it by default and why you need to mark any usage of manually controlled pointers with the "unsafe" keyword. This ensures that any manually handled memory is intentional. Put on your C-code hat when dealing with the "unsafe" keyword.
A great reference to this was in the chapter "Understanding Object Lifetime" in "Pro C# 2010 and the .Net Platform" by Andrew Troelsen. If you prefer online references see the MSDN Website Implementing Finalize and Dispose to Clean Up Unmanaged Resources
One final note - Unmanaged memory is released in the finalizer portion of your object (~ObjectName(){...}). These patterns do add overhead to performance so if you are dealing with lower latency scenarios you may be best served by keeping objects light. If you are dealing with human response then you should be fine to consider this where absolutely necessary.
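A minimal sketch of the disposable-type-plus-finalizer idea described above, assuming the unmanaged memory comes from Marshal.AllocHGlobal. NativeBuffer and NativeBufferDemo are hypothetical names, not types from the book or from MSDN.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper that owns one block of unmanaged memory.
sealed class NativeBuffer : IDisposable
{
    private IntPtr _ptr;

    public int Size { get; }
    public bool IsDisposed => _ptr == IntPtr.Zero;

    public NativeBuffer(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        _ptr = Marshal.AllocHGlobal(size);
        Size = size;
    }

    public void WriteByte(int offset, byte value)
    {
        CheckAccess(offset);
        Marshal.WriteByte(_ptr, offset, value);
    }

    public byte ReadByte(int offset)
    {
        CheckAccess(offset);
        return Marshal.ReadByte(_ptr, offset);
    }

    // Deterministic cleanup: the caller releases the memory right away.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this); // the finalizer has nothing left to do
    }

    // Safety net: runs at some later point if Dispose was never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_ptr != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_ptr);
            _ptr = IntPtr.Zero; // makes a second Release a no-op
        }
    }

    private void CheckAccess(int offset)
    {
        if (IsDisposed) throw new ObjectDisposedException(nameof(NativeBuffer));
        if (offset < 0 || offset >= Size) throw new ArgumentOutOfRangeException(nameof(offset));
    }
}

static class NativeBufferDemo
{
    static void Main()
    {
        using (var buffer = new NativeBuffer(16))
        {
            buffer.WriteByte(0, 0x2A);
            Console.WriteLine(buffer.ReadByte(0));
        }
    }
}
```

Wrapping the object in a using block is what keeps the finalizer from ever being needed in the normal case.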

When would I need to use the stackalloc keyword in C#?

What functionality does the stackalloc keyword provide? When and Why would I want to use it?
From MSDN:
Used in an unsafe code context to allocate a block of memory on the stack.
One of the main features of C# is that you do not normally need to access memory directly, as you would do in C/C++ using malloc or new. However, if you really want to explicitly allocate some memory you can, but C# considers this "unsafe", so you can only do it if you compile with the unsafe setting. stackalloc allows you to allocate such memory.
You almost certainly don't need to use it for writing managed code. It is feasible that in some cases you could write faster code if you access memory directly - it basically allows you to use pointer manipulation which suits some problems. Unless you have a specific problem and unsafe code is the only solution then you will probably never need this.
Stackalloc will allocate data on the stack, which can be used to avoid the garbage that would be generated by repeatedly creating and destroying arrays of value types within a method.
public unsafe void DoSomeStuff()
{
    byte* unmanaged = stackalloc byte[100];
    byte[] managed = new byte[100];
    //Do stuff with the arrays
    //When this method exits, the unmanaged array gets immediately destroyed.
    //The managed array no longer has any handles to it, so it will get
    //cleaned up the next time the garbage collector runs.
    //In the mean-time, it is still consuming memory and adding to the list of crap
    //the garbage collector needs to keep track of. If you're doing XNA dev on the
    //Xbox 360, this can be especially bad.
}
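For what it's worth, newer compilers (C# 7.2 and later, with Span<T> available in the target framework) also let you use stackalloc without an unsafe context by assigning the result to a Span<T>. A small sketch under that assumption; StackallocDemo and SumOfSquares are names invented for the example.

```csharp
using System;

static class StackallocDemo
{
    // Uses a stack-allocated scratch buffer instead of a heap array.
    // Assumes C# 7.2+ and a framework that provides Span<T>.
    public static int SumOfSquares(int count)
    {
        // Keep stack allocations small and bounded; the stack is limited.
        if (count < 0 || count > 256) throw new ArgumentOutOfRangeException(nameof(count));

        Span<int> scratch = stackalloc int[count];
        for (int i = 0; i < scratch.Length; i++)
        {
            scratch[i] = i * i; // bounds-checked, unlike a raw pointer
        }

        int total = 0;
        foreach (int value in scratch)
        {
            total += value;
        }
        return total; // scratch is released when the method returns
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(4));
    }
}
```

The memory still lives on the stack and creates no garbage, but indexing goes through the span, so out-of-range access throws instead of corrupting memory.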
Paul,
As everyone here has said, that keyword directs the runtime to allocate on the stack rather than the heap. If you're interested in exactly what this means, check out this article.
http://msdn.microsoft.com/en-us/library/cx9s2sy4.aspx
This keyword is used for unsafe memory manipulation. By using it, you get the ability to use pointers (a powerful and painful feature of C/C++).
stackalloc directs the .net runtime to allocate memory on the stack.
Most other answers are focused on the "what functionality" part of OP's question.
I believe this answers the when and why:
When do you need this?
For the best worst-case performance with cache locality of multiple small arrays.
Now in an average app you won't need this, but for realtime sensitive scenarios it gives more deterministic performance: No GC is involved and you are all but guaranteed a cache hit.
(Because worst-case performance is more important than average performance.)
Keep in mind that the default stack size in .net is small though!
(I think it's 1 MB for normal apps and 256 KB for ASP.NET?)
Practical use could for example include realtime sound processing.
It is like Steve pointed out, only used in unsafe code context (e.g, when you want to use pointers).
If you don't use unsafe code in your C# application, then you will never need this.
