Assume someClass is a class defined in C# with some method int doSomething(void), and for simplicity, providing a constructor taking no arguments. Then, in C#, instances have to be created on the gc heap:
someClass c; // legit, but only a null reference in C#
// c.doSomething() // would not even compile.
c = new someClass(); // now it refers to an instance of someClass.
int i = c.doSomething();
Now, if someClass is compiled into some .Net library, you can also use it in C++/CLI:
someClass^ cpp_gcpointer = gcnew someClass();
int i = cpp_gcpointer->doSomething();
That easy! Nifty! This is of course assuming a reference to the .Net library has been added to the project and a corresponding using declaration has been made.
It is my understanding that this is the precise C++/CLI equivalent of the previous C# example (condensed to a single line, this is not the point I'm interested in). Correct? (Sorry, I'm new to the topic)
In C++, however, also
someClass cpp_cauto; // in C++ declaration implies instantiation
int i = cpp_cauto.doSomething();
is valid syntax. Out of curiosity, I tried this today. A colleague, looking over my shoulder, was willing to bet it would not even compile. He would have lost the bet. (This is still the class from the C# assembly.) It actually also produces the same result i as the code from the previous examples.
Nifty, too, but -- uhmm -- what exactly is it, what is created here? My first wild guess was that behind my back, .Net dynamically creates an instance on the gc heap and cpp_cauto is some kind of wrapper for this object, behaving syntactically like an instance of class someClass. But then I found this page
http://msdn.microsoft.com/en-us/library/ms379617%28v=vs.80%29.aspx#vs05cplus_topic2
This page seems to tell me that (at least, if someClass were a C++ class) cpp_cauto is actually created on the stack, which, to my knowledge, would be the same behaviour you get in classical C++. And something you cannot do in C# (you can't, can you?). What I'd like to know: is the instance from the C# assembly also created on the stack? Can you produce .Net binaries in C++ with class instances on the stack which you cannot create in C#? And might this even give you a performance gain :-) ?
Kind regards,
Thomas
The link you referenced explains this in detail:
C++/CLI allows you to employ stack semantics with reference types. What this means is that you can introduce a reference type using the syntax reserved for allocating objects on the stack. The compiler will take care of providing you the semantics that you would expect from C++, and under the covers meet the requirements of the CLR by actually allocating the object on the managed heap.
Basically, it's still making a handle to the reference type on the managed heap, but automatically calls Dispose() on IDisposable implementations when it goes out of scope for you.
The object instance, however, is still effectively allocated via gcnew (placed on the managed heap) and collected by the garbage collector. This, too, is explained in detail:
When d goes out of scope, its Dispose method will be called to allow its resources to be released. Again, since the object is actually allocated from the managed heap, the garbage collector will take care of freeing it in its own time.
Basically, this is all handled by the compiler to make the code look and work like standard C++ stack-allocated classes, but it's really just a compiler trick. The resulting IL code is still doing managed heap allocations.
I tried to find out how pinned pointers defined with the fixed keyword work. My idea was that internally GCHandle.Alloc(object, GCHandleType.Pinned) was used for that. But when I looked into the IL generated for the following C# code:
unsafe static void f1()
{
    var arr = new MyObject[10];
    fixed (MyObject* aptr = &arr[0])
    {
        Console.WriteLine(*aptr);
    }
}
I couldn't find any traces of GCHandle.
The only hint I saw that the pinned pointer was used in the method was the following IL declaration:
.locals init ([0] valuetype TestPointerPinning.MyObject[] arr,
[1] valuetype TestPointerPinning.MyObject& pinned aptr)
So the pointer was declared as pinned, and pinning it did not require any additional method calls.
My questions are
Is there any difference between using pinned pointers in the declaration and pinning the pointer by using GCHandle class?
Is there any way to declare a pinned pointer in C# without using fixed keyword? I need this to pin a bunch of pointers within a loop and there's no way I can do this using fixed keyword.
Well, sure there's a difference, you saw it. The CLR supports more than one way to pin an object. Only the GCHandleType.Pinned method is directly exposed to user code. But there are others, like "async pinned handles", a feature that keeps I/O buffers pinned while a driver performs an overlapped I/O operation. And the one that the fixed keyword uses, it doesn't use an explicit handle or method call at all. These extra ways were added to make unpinning the objects again as quick and reliable as possible, very important to GC health.
Fixed buffer pins are implemented by the jitter, which performs two important jobs when it translates MSIL to machine code. The highly visible one is the machine code itself; you can easily see it with the debugger. But it also generates a data structure used by the garbage collector, completely invisible in the debugger. The GC requires it to reliably find object references that are stored in the stack frame or a CPU register. More about that data structure in this answer.
The jitter uses the [pinned] attribute on the variable declaration in the metadata to set a bit in that data structure, indicating that the object that's referenced by the variable is temporarily pinned. The GC sees this and knows to not move the object. Very efficient because it doesn't require an explicit method call to allocate the handle and doesn't require any storage.
But no, these tricks are not available otherwise to C# code, you really do need to use the fixed keyword in your code. Or GCHandle.Alloc(). If you are finding yourself getting lost in the pins then high odds that you ought to be considering pinvoke or C++/CLI so you can easily call native code. The temporary pins that the pinvoke marshaller uses to keep objects stable while the native code is running are another example of automatic pinning that doesn't require explicit code.
I'm basically a C++ guy trying to venture into C#. From the basic tutorial of C#, I happen to find that all objects are created and stored dynamically (also true for Java) and are accessed by references and hence there's no need for copy constructors. There is also no need of bitwise copy when passing objects to a function or returning objects from a function. This makes C# much simpler than C++.
However, I read somewhere that operating on objects exclusively through references imposes limitations on the type of operations that one can perform, thus depriving the programmer of complete control. One limitation is that the programmer cannot precisely specify when an object is destroyed.
Can someone please elaborate on other limitations? (with a sample code if required)
Most of the "limitations" are by design rather than considered a deficiency (you may not agree of course)
You cannot determine/you don't have to worry about
when an object is destroyed
where the object is in memory
how big it is (unless you are tuning the application)
using pointer arithmetic
accessing outside an object
accessing an object with the wrong type
sharing objects between threads is simpler
whether the object is on the stack or the heap. (The stack is being used more and more in Java)
fragmentation of memory (This is not true of all collectors)
Because Java is garbage collected, we cannot predict when an object will be destroyed, but the collector does the work of a destructor.
If you want to free up some resources at a known point, you can use a finally block.
try {
} finally{
// dispose resources.
}
Having made a similar transition, the more you look into it, the more you do have to think about C#'s GC behaviour in all but the most straightforward cases. This is especially true when trying to handle unmanaged resources from managed code.
This article highlights a lot of the issues you may be interested in.
Personally I miss a reference counted alternative to IDisposable (more like shared_ptr), but that's probably a hangover from a C++ background.
The more I have to write my own plumbing to support C++ like programming, the more likely it is there is another C# mechanism I've overlooked, or I end up getting frustrated with C#. For example, swap and move are not common idioms in C# as far as I've seen and I miss them: other programmers with a C# background may well disagree about how useful those idioms are.
Sorry for being confused. In C++ I know that returning a reference or pointer to a local variable can cause a bad_reference exception. I am not sure how this works in C#.
e.g
List<StringBuilder> logs = new List<StringBuilder>();
void function(string log)
{
StringBuilder sb = new StringBuilder();
logs.Add(sb);
}
In this function a local object is created and stored in a list. Is that bad, or must it be done in another way? I am really sorry for asking this, but I am confused after coding C++ for 2 months.
Thanks.
Your C# code doesn't return an object reference so it doesn't match your concern. It is however a problem that doesn't exist in C#. The CLR doesn't let you create reference-type objects on the stack, only on the heap. And the garbage collector makes sure that object references stay valid.
In C#, the garbage collector manages all the (managed) objects you create. It will not delete one unless there are no longer any references to it.
So that code is perfectly valid. logs keeps a reference to the StringBuilder. The garbage collector knows this, so it will not clean it up even after the context in which it was originally created goes out of scope.
In C# object lifecycle is managed for you by the CLR; compared to C++ where you have to match each new with a delete.
However in C# you can't do
void fun()
{
SomeObject sb(10);
logs.Add(sb);
}
i.e. allocate the object on the stack; you have to use new. In this respect C# and C++ work similarly, except when it comes to releasing / freeing the object.
It is still possible to leak memory in C# - but it's harder than in C++.
There's nothing wrong with the code you've written. This is mostly because C#, like any .NET language, is a "managed" language that does a lot of memory management for you. To get the same effect in C++ you would need to explicitly use some third-party library.
To clear up some of the basics for you:
In C#, you rarely deal with "pointers" or "references" directly. You can deal with pointers, if you need to, but that is "unsafe" code and you should really avoid that kind of thing unless you know what you're doing. In the few cases where you do deal with references (e.g. ref or out parameters) the language hides all the details from you and lets you treat them as normal variables.
Instead, objects in C# are defined as instances of reference types; whenever you use an instance of a reference type, it is similar to using a pointer except that you don't have to worry about the details. You create new instances of reference types in C# in the same way that you create new instances of objects in C++, using the new operator, which allocates memory, runs constructors, etc. In your code sample, both StringBuilder and List<StringBuilder> are reference types.
The key aspect of managed languages that is important here is the automatic garbage collection. At runtime, the .NET Framework "knows" which objects you have created, because you're always creating them from its own internally-managed heap (no direct malloc or anything like that in C#). It also "knows" when an object has gone completely out of scope -- when there are no more references to it anywhere in your program. Once that happens, the runtime is able to free the memory whenever it wants to, typically when it starts to run low on free memory, and you never have to do it. In fact, there is no way in C# to explicitly destroy a managed object (though you do have to clean up unmanaged resources if you use them).
In your example, the runtime knows that you've created a StringBuilder and put it into a List<>; it will keep track of that object, and as long as it's in the List<> it will stick around. Once you either remove it from the List<>, or the List<> itself goes away, the runtime will automatically clean up the StringBuilder for you.
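For readers coming from C++, a rough counterpart of the question's code can be sketched with std::shared_ptr. This is only an illustration under stated assumptions: the names logs, add_log and last_log are hypothetical, std::ostringstream stands in for StringBuilder, and reference counting is not the same mechanism as the CLR's garbage collector. It shows the same point, though: the object created inside the function stays alive because the container still refers to it.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical C++ counterpart of the C# `logs` list: the vector shares
// ownership of every stream that is added to it.
std::vector<std::shared_ptr<std::ostringstream>> logs;

void add_log(const std::string& text) {
    // Created inside the function, like the local StringBuilder in the question.
    auto sb = std::make_shared<std::ostringstream>();
    *sb << text;
    logs.push_back(sb);  // the vector now holds a second owner
}   // `sb` goes out of scope here; the object survives because `logs` owns it

// Returns the text of the most recently added entry.
std::string last_log() {
    assert(!logs.empty());
    return logs.back()->str();
}
```

Once an entry is erased from the vector (or the vector itself is destroyed), the last owner disappears and the stream is destroyed at that moment, rather than at some later collection.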
Could someone explain to a C++ programmer the most important differences between Java (and C# as well) references and shared_ptr (from Boost or from C++0x)?
I am more or less aware of how shared_ptr is implemented. I am curious about differences in the following areas:
1) Performance.
2) Cycling. shared_ptr can be cycled (A and B hold pointers to each other). Is cycling possible in Java?
3) Anything else?
Thank you.
Performance: shared_ptr performs pretty well, but in my experience is slightly less efficient than explicit memory management, mostly because it is reference counted and the reference count has to be allocated as well. How well it performs depends on a lot of factors, and how well it compares to Java/C# garbage collectors can only be determined on a per use case basis (depends on language implementation among other factors).
Cycles can be built with shared_ptr, but two objects that hold shared_ptrs to each other are never freed; you need a weak_ptr to break the cycle. Java allows cycles without further ado; its garbage collector collects cycles that are no longer reachable. My guess is that C# does the same.
Anything else: the object pointed to by a shared_ptr is destroyed as soon as the last reference to it goes out of scope. The destructor is called immediately. In Java, the finalizer may not be called immediately. I don't know how C# behaves on this point.
The key difference is that when the shared pointer's use count goes to zero, the object it points to is destroyed (destructor is called and object is deallocated), immediately. In Java and C# the deallocation of the object is postponed until the Garbage Collector chooses to deallocate the object (i.e., it is non-deterministic).
With regard to cycles, I am not sure I understand what you mean. It is quite common in Java and C# to have two objects that contain member fields that refer to each other, thus creating a cycle. For example a car and an engine - the car refers to the engine via an engine field and the engine can refer to its car via a car field.
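The deterministic destruction described in this answer can be sketched in a few lines of C++. This is a minimal illustration, not taken from any of the posts: the type Tracked and the function destroyed_on_last_release are hypothetical names, and the flag exists only to make the moment of destruction observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that records its destruction in a caller-supplied flag.
struct Tracked {
    explicit Tracked(bool& flag) : destroyed(flag) { destroyed = false; }
    ~Tracked() { destroyed = true; }
    bool& destroyed;
};

// Returns true if the destructor ran exactly when the last owner let go.
bool destroyed_on_last_release() {
    bool destroyed = false;
    auto first = std::make_shared<Tracked>(destroyed);
    auto second = first;            // use count is now 2
    assert(first.use_count() == 2);

    first.reset();                  // one owner left, object still alive
    bool alive_with_one_owner = !destroyed;

    second.reset();                 // count reaches 0, destructor runs here
    return alive_with_one_owner && destroyed;
}
```

In Java or C# there is no equivalent point in the code where one can rely on the finalizer having run.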
Nobody pointed out that the memory manager can move objects in managed memory. So in C# there are no simple references/pointers; they work like IDs describing the object, which the manager resolves.
In C++ you can't achieve this with shared_ptr, because the object stays in the same location after it has been created.
First of all, Java/C# have only pointers, not references, though they call them that way. Reference is a unique C++ feature. Garbage collection in Java/C# basically means infinite life-time. shared_ptr on the other hand provides sharing and deterministic destruction, when the count goes to zero. Therefore, shared_ptr can be used to automatically manage any resources, not just memory allocation. In a sense (just like any RAII design) it turns pointer semantics into more powerful value semantics.
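As a sketch of the claim that shared_ptr can manage resources other than memory, the following uses a custom deleter. Everything here is an assumption made for illustration: Handle, open_handle, close_handle and the open_handles counter are hypothetical stand-ins for a real resource API such as a file or socket handle.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an external resource such as a file or socket handle.
struct Handle { int id; };

int open_handles = 0;   // counts handles that are currently open

Handle* open_handle(int id) { ++open_handles; return new Handle{id}; }
void close_handle(Handle* h) { --open_handles; delete h; }

// Wraps the handle so that close_handle runs when the last owner goes away.
std::shared_ptr<Handle> make_shared_handle(int id) {
    return std::shared_ptr<Handle>(open_handle(id), close_handle);
}

// Returns the number of open handles seen while the resource was shared.
int use_handle_in_scope() {
    int seen = 0;
    {
        auto h = make_shared_handle(7);
        auto copy = h;              // shared by two owners
        assert(copy->id == 7);
        seen = open_handles;        // 1 while either owner exists
    }                               // both owners gone, close_handle runs
    return seen;
}
```

The closest C# mechanism is IDisposable with a using block, which ties the release to a scope but does not count owners.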
Cyclical references with C++ reference-counted pointers will not be disposed. You can use weak pointers to work around this. Cyclical references in Java or C# may be disposed, when the garbage collector feels like it.
When the count in a C++ reference-counted pointer drops to zero, the destructor is called. When a Java object is no longer reachable, its finalizer may not be called promptly or ever. Therefore, for objects which require explicit disposal of external resources, some form of explicit call is required.
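The cycle problem and the weak pointer workaround mentioned in this answer can be sketched as follows. The types StrongNode and WeakNode, the counter live_nodes and the helper survivors_after_cycle are hypothetical names chosen for this illustration. Note that the StrongNode case leaks two objects on purpose.

```cpp
#include <cassert>
#include <memory>

int live_nodes = 0;   // counts objects that have not been destroyed yet

// Two nodes owning each other through shared_ptr form a cycle.
struct StrongNode {
    StrongNode() { ++live_nodes; }
    ~StrongNode() { --live_nodes; }
    std::shared_ptr<StrongNode> other;
};

// Same shape, but the links are non-owning weak_ptrs.
struct WeakNode {
    WeakNode() { ++live_nodes; }
    ~WeakNode() { --live_nodes; }
    std::weak_ptr<WeakNode> other;
};

// Builds a two-node cycle of T and returns how many nodes outlive the scope.
template <typename T>
int survivors_after_cycle() {
    int before = live_nodes;
    {
        auto a = std::make_shared<T>();
        auto b = std::make_shared<T>();
        a->other = b;
        b->other = a;
        assert(live_nodes == before + 2);
    }   // local owners are gone; only the links between the nodes remain
    return live_nodes - before;
}
```

With WeakNode both objects are destroyed at the end of the scope; with StrongNode each node keeps the other's count at one, so neither destructor ever runs. A tracing garbage collector, as in Java or C#, would reclaim both nodes once nothing else can reach them.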
Will using GetComInterfaceForObject and passing the returned IntPtr to unmanaged code keep the managed object from being moved in memory? Or does the CLR somehow maintain that pointer? Note that the unmanaged code will use this for the lifetime of the program, and I need to make sure the managed object is not being moved by the GC. (At least I think that's right?)
EDIT - Alright I found some info and I am thinking that this may be the answer. It deals with delegates, but I would have to believe calling GetComInterfaceForObject does something along the same lines.
Source of the Following text
"Managed Delegates can be marshaled to unmanaged code, where they are exposed as unmanaged function pointers. Calls on those pointers will perform an unmanaged to managed transition; a change in calling convention; entry into the correct AppDomain; and any necessary argument marshaling. Clearly the unmanaged function pointer must refer to a fixed address. It would be a disaster if the GC were relocating that! This leads many applications to create a pinning handle for the delegate. This is completely unnecessary. The unmanaged function pointer actually refers to a native code stub that we dynamically generate to perform the transition & marshaling. This stub exists in fixed memory outside of the GC heap. However, the application is responsible for somehow extending the lifetime of the delegate until no more calls will occur from unmanaged code. The lifetime of the native code stub is directly related to the lifetime of the delegate. Once the delegate is collected, subsequent calls via the unmanaged function pointer will crash or otherwise corrupt the process. In our recent release, we added a Customer Debug Probe which allows you to cleanly detect this all too common bug in your code. If you haven't started using Customer Debug Probes during development, please take a look!"
As your edit states (about delegates), your managed object doesn't need to be pinned, since GetComInterfaceForObject returns a "pinned" pointer that calls through to the correct managed object. However, you will need to make sure that the managed object lives for as long as the COM clients are using the unmanaged pointer to it.