Objects memory re-location? - c#

Is there any way to tell .net runtime , not to re-locate object in memory ?
IMHO - Object can be re-locate by GC when :
Moving from one generation to another
Being moved from finilization-queue to the f-reachable queue.
else ( maybe optimization mechanism ?).
Also,I thought immutable (strings)are automatically recreated each time , so they must be created in a new location.
(just a theoratical question )

As an implementation detail, the .Net framework can move an object in the memory in the final stage of garbage collection. But this doesn't necessarily mean moving between generations: when performing generation 2 GC, objects in gen 2 will be moved, even though they don't change generation (because there is nowhere to go beyond gen 2).
The finalization queue and the f-reachable queue have nothing to do with this, they contain only references to objects, not the objects themselves.
I have no idea what does this have to do with immutable objects. The runtime doesn't give any special treatment to them (except for strings).
Telling the runtime not to relocate an object (also known as “pinning” the object) is an unusual requirement and should have a really good reason, because it can negatively affect the performance of the GC. To temporarily pin an object in unsafe code, you can use the fixed statement. To do it permanently or from safe code, you can use GCHandle.Alloc(), specifying GCHandleType.Pinned.

Pinned Objects tells the gc to not to move it to create large chuck of free space. They are created using Fixed keyword.
Useful scenario
lets think of a scenario where we have an int of array needed to be passed to some unmanaged function and unmanaged function reads the value of array and does some changes. If the array is not pinned, changed values would not be able to be written back as pointer to array had been moved by GC.

Not sure if this is useful in questions context, but in managed scenario you could use Marshall class to allocate memory, move a structure to allocated memory and get a pointer back. This structure wouldn't be moved around by gc. Later on you could retrieve the structure from allocated memory using the pointer from before.

Related

Manual GC Gen2 data allocation

I'm prototyping some managed directx game engine before moving to c++ syntax horror.
So let's say I've got some data (f.e. an array or a hashset of references) that I'm sure it'll stay alive throughout whole application's life. Since performance is crucial here and I'm trying to avoid any lag spikes on generation promotion, I'd like to ask if there's any way to initialize an object (allocate its memory) straight ahead in GC's generation 2? I couldn't find an answer for that, but I'm pretty sure I've seen someone doing that before.
Alternatively since there would be no real need to "manage" that piece of memory, would it be possible to allocate it with unmanaged code, but to expose it to the rest of the code as a .NET type?
You can't allocate directly in Gen 2. All allocations happen in either Gen 0 or on the large object heap (if they are 85000 bytes or larger). However, pushing something to Gen 2 is easy: Just allocate everything you want to go to Gen 2 and force GCs at that point. You can call GC.GetGeneration to inspect the generation of a given object.
Another thing to do is keep a pool of objects. I.e. instead of releasing objects and thus making them eligible for GC, you return them to a pool. This reduces allocations and thus the number of GCs as well.

How does the GC update references after compaction occurs

The .NET Garbage Collector collects objects (reclaims their memory) and also performs memory compaction (to keep memory fragmentation to minimum).
I am wondering, since an application may have many references to objects, how does the GC (or the CLR) manage these references to objects, when the object's address changes due to compaction being made by the GC.
The concept is simple enough, the garbage collector simply updates any object references and re-points them to the moved object.
Implementation is a bit trickier, there is no real difference between native and managed code, they are both machine code. And there's nothing special about an object reference, it is just a pointer at runtime. What's needed is a reliable way for the collector to find these pointers back and recognize them as the kind that reference a managed object. Not just to update them when the pointed-to object gets moved while compacting, also to recognize live references that ensure that an object does not get collected too soon.
That's simple for any object references that are stored in class objects that are stored on the GC heap, the CLR knows the layout of the object and which fields store a pointer. It is not so simple for object references stored on the stack or in a cpu register. Like local variables and method arguments.
The key property of executing managed code which makes it distinct from native code is that the CLR can reliably iterate the stack frames owned by managed code. Done by restricting the kind of code used to setup a stack frame. This is not typically possible in native code, the "frame pointer omission" optimization option is particularly nasty.
Stack frame walking first of all lets it finds object references stored on the stack. And lets it know that the thread is currently executing managed code so that the cpu registers should be checked for references as well. A transition from managed code to native code involves writing a special "cookie" on the stack that the collector recognizes. So it knows that any subsequent stack frames should not be checked because they'll contain random pointer values that don't ever reference a managed object.
You can see this back in the debugger when you enable unmanaged code debugging. Look at the Call Stack window and note the [Native to Managed Transition] and [Managed to Native Transition] annotations. That's the debugger recognizing those cookies. Important for it as well since it needs to know whether or not the Locals window can display anything meaningful. The stack walk is also exposed in the framework, note the StackTrace and StackFrame classes. And it is very important for sandboxing, Code Access Security (CAS) performs stack walks.
For simplicity, I'll assume a stop-the-world GC in which no objects are pinned, every object gets scanned and relocated on every GC cycle, and none of the destinations overlap any of the sources. In actuality, the .NET GC is a bit more complicated, but this should give a good feel for how things work.
Each time a reference is examined, there are three possibilities:
It's null. In that case, no action is required.
It identifies an object whose header says it's something other than a relocation marker (a special kind of object described below). In that case, move the object to a new location and replace the original object with a three-word relocation marker containing the new location, the old location of the object which contains the just-observed reference to the present object, and the offset within that object. Then start scanning the new object (the system can forget about the object that was being scanned for the moment, since it just recorded its address).
It identifies an object whose header says it's a relocation marker. In that case, update the reference being scanned to reflect the new address.
Once the system finishes scanning the present object, it can look at its old location to find out what it was doing before it started scanning the present object.
Once an object has been relocated, the former contents of its first three words will be available at its new location and will no longer be needed at the old one. Because the offset into an object will always be a multiple of four, and individual objects are limited to 2GB each, only a fraction of all possible 32-bit values would be needed to hold all possible offsets. Provided that at least one word in an object's header has at least 2^29 values it can never hold for anything other than an object-relocation marker, and provided every object is allocated at least twelve bytes, it's possible for object scanning to handle any depth of tree without requiring any depth-dependent storage outside the space occupied by old copies of objects whose content is no longer needed.
Garbage collection
Every application has a set of roots. Roots identify storage locations, which refer to objects on the managed heap or to objects that are set to null. For example, all the global and static object pointers in an application are considered part of the application's roots. In addition, any local variable/parameter object pointers on a thread's stack are considered part of the application's roots. Finally, any CPU registers containing pointers to objects in the managed heap are also considered part of the application's roots. The list of active roots is maintained by the just-in-time (JIT) compiler and common language runtime, and is made accessible to the garbage collector's algorithm.
When the garbage collector starts running, it makes the assumption that all objects in the heap are garbage. In other words, it assumes that none of the application's roots refer to any objects in the heap. Now, the garbage collector starts walking the roots and building a graph of all objects reachable from the roots. For example, the garbage collector may locate a global variable that points to an object in the heap.
Once this part of the graph is complete, the garbage collector checks the next root and walks the objects again. As the garbage collector walks from object to object, if it attempts to add an object to the graph that it previously added, then the garbage collector can stop walking down that path. This serves two purposes. First, it helps performance significantly since it doesn't walk through a set of objects more than once. Second, it prevents infinite loops should you have any circular linked lists of objects.
Once all the roots have been checked, the garbage collector's graph contains the set of all objects that are somehow reachable from the application's roots; any objects that are not in the graph are not accessible by the application, and are therefore considered garbage. The garbage collector now walks through the heap linearly, looking for contiguous blocks of garbage objects (now considered free space). The garbage collector then shifts the non-garbage objects down in memory (using the standard memcpy function that you've known for years), removing all of the gaps in the heap. Of course, moving the objects in memory invalidates all pointers to the objects. So the garbage collector must modify the application's roots so that the pointers point to the objects' new locations. In addition, if any object contains a pointer to another object, the garbage collector is responsible for correcting these pointers as well.
C# fixed statement
The fixed statement sets a pointer to a managed variable and "pins" that variable during the execution of statement. Without fixed, pointers to movable managed variables would be of little use since garbage collection could relocate the variables unpredictably. The C# compiler only lets you assign a pointer to a managed variable in a fixed statement.
Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework
fixed Statement (C# Reference)

setnewhandler in C#

for creating my own memory management in C# I need to have a possibility to intercept the new command before it returns a null or fires an exception. When using the new command I want to call the original handler first. If this handler fails to return a block of memory, I want to inform all my mappable objects to be written to disk and to free memory.
In C++ there has been a possibility to intercept the new command by assigned a different new handler. In C# I couldn't find anything which shows the same behaviour.
Has anyone seen a possibility to do this.
Thanks
Martin
You can't do what you're after in C#, or in any managed language. Nor should you try. The .NET runtime manages allocations and garbage collection. It's impossible for you to instruct your objects to free memory, as you have no guarantee when (or, technically, even if) a particular object will be collected once it's no longer rooted. Even eliminating all references and manually calling GC.Invoke() is not an absolute guarantee. If you're looking for granular memory management, you need to be using a lower-level environment.
As an important point, it is not possible for the new operator to return a null reference. It can only return either a reference to the specified type or throw an exception.
If you want to do your own management of how and when objects are allocated, you'll have to use something along the lines of a factory pattern.
I think you're approaching this from the wrong angle; the whole point of using a runtime with managed memory is so that you don't have to worry about memory. The tradeoff is that you can't do this type of low-level trickery.
As an aside, you can 'override new' for a limited class of objects (those descending from ContextBoundObject) by creating a custom ProxyAttribute, though this likely does not address what you're intending.
I believe that you are not understanding the side-effects of what you're asking for. Even in C++, you can't really do what you think you can do. The reason is simple, if you have run out of memory, you can't even make your objects serialize to disk because you have no memory to accomplish that. By the time memory is exhausted, the only real thing you can do is either discard memory (without saving or doing anything else first) or abend the program.
Now, what you're talking about will still work 95% of the time because your memory allocation will likely be sufficiently large that when it fails, you have a little room to play with, but you can't guarantee that this will be the case.
Example: If you have only 2MB of memory left, and you try to allocate 10MB, then it will fail, and you still have 2MB to play with to try and free up some memory, which will allow you to allocate small chunks of memory needed to serialize objects to disk.
But, if you only have 10 bytes of memory left, then you don't even have enough memory to create a new exception object (unless it comes from a reserved pool). So, in essence, you're creating a very poor situation that will likely crash at some point.
Even in C++ low memory conditions are almost impossible to get right, and it's almost impossible to recover from every case unless you have very carefully planned, and pre-allocated memory for your recovery routines.
Now, when you're talking about a garbage collected OS, you have no control over how memory is allocated or freed. At best, all you can do is give hints. There is very little you can reliably do here by the nature of garbage collection. It's non-deterministic.

Comparing shared_ptr to managed language references

Could someone explain to a C++ programmer most important differences between Java (and C# as well) references and shared_ptr (from Boost or from C++0x).
I more or less aware how shared_ptr is implemented. I am curious about differences in the following ares:
1) Performance.
2) Cycling. shared_ptr can be cycled (A and B hold pointers to each other). Is cycling possible in Java?
3) Anything else?
Thank you.
Performance: shared_ptr performs pretty well, but in my experience is slightly less efficient than explicit memory management, mostly because it is reference counted and the reference count has to allocated as well. How well it performs depends on a lot of factors and how well it compares to Java/C# garbage collectors can only be determined on a per use case basis (depends on language implementation among other factors).
Cycling is only possible with weak_ptr, not with two shared_ptrs. Java allows cycling without further ado; its garbage collector will break the cycles. My guess is that C# does the same.
Anything else: the object pointed to by a shared_ptr is destroyed as soon as the last reference to it goes out of scope. The destructor is called immediately. In Java, the finalizer may not be called immediately. I don't know how C# behaves on this point.
The key difference is that when the shared pointer's use count goes to zero, the object it points to is destroyed (destructor is called and object is deallocated), immediately. In Java and C# the deallocation of the object is postponed until the Garbage Collector chooses to deallocate the object (i.e., it is non-deterministic).
With regard to cycles, I am not sure I understand what you mean. It is quite common in Java and C# to have two objects that contain member fields that refer to each other, thus creating a cycle. For example a car and an engine - the car refers to the engine via an engine field and the engine can refer to its car via a car field.
Nobody pointed the possibility of moving the object by the memory manager in managed memory. So in C# there are no simple references/pointers, they work like IDs describing object which is returned by the manager.
In C++ you can't achieve this with shared_ptr, because the object stays in the same location after it has been created.
First of all, Java/C# have only pointers, not references, though they call them that way. Reference is a unique C++ feature. Garbage collection in Java/C# basically means infinite life-time. shared_ptr on the other hand provides sharing and deterministic destruction, when the count goes to zero. Therefore, shared_ptr can be used to automatically manage any resources, not just memory allocation. In a sense (just like any RAII design) it turns pointer semantics into more powerful value semantics.
Cyclical references with C++ reference-counted pointers will not be disposed. You can use weak pointers to work around this. Cyclical references in Java or C# may be disposed, when the garbage collector feels like it.
When the count in a C++ reference-counted pointer drops to zero, the destructor is called. When a Java object is no longer reachable, its finalizer may not be called promptly or ever. Therefore, for objects which require explicit disposal of external resources, some form of explicit call is required.

What is the difference between C++ memory management and .NET memory management?

What is the difference between C++ memory management and .NET memory management ?
In C++, you can either allocate objects with static storage so they are around for the whole program, allocate them on the stack when they are local to a function (in which case they are destroyed when the containing block exits), or allocate them on the heap (in which case they are only destroyed when you say so by explicitly calling the appropriate de-allocation function). Heap memory is allocated as raw memory with malloc and released with free, or allocated and constructed into an object with new, and then the object destroyed and the memory released with delete.
C# provides an illusion of infinite memory --- you cannot free memory explicitly, only allocate memory and construct an object with new. Instead the GC reclaims the memory for objects you can no longer access so that it can be reused for new objects.
In C++, class destructors are run when an object is destroyed. This gives each object a chance to release any associated resources whether they are more objects, or external resources such as file handles or database handles.
In C#, you must explicitly manage the release of non-memory resources by calling a release function. The using facility allows you to get the compiler to call Dispose() automatically for you, but this is still separate from the object lifetime --- the memory for the object is reclaimed when the GC system decides to (which may be never).
In C++, facilities like std::shared_ptr (or boost::shared_ptr with older compilers) allow you to pass the responsibility of destroying heap objects over to the C++ runtime by reference-counting the objects. When the last instance of shared_ptr that references a given object is destroyed, then the referenced object is destroyed too and its memory reclaimed. This avoids many of the pitfalls associated with manual memory management.
In .NET, memory is treated different than all other resources: While you have to take care of releasing all resources you need, you don't have to worry about memory.
In C++, you have to take care to release all resources you use, including dynamically allocated memory. However, C++ employs a number of tools and techniques (namely automatic scope-based allocation/deallocation and RAII) to help you with this. In a decade of writing C++ code, I have rarely ever (read: on average less than once per year) manually freed memory and if so, it was in a RAII handle class.
In C#, there is a whole lot less to worry about.
When you want to work with an object in C#, you can simply create it; and once you're done with it, you don't have to do anything else. A background worker (the Garbage Collector) will clean it up behind the scenes when it realises you're not using it any more.
In vanilla C++, there aren't any background processes running to clean up memory. This means that, any time you manually allocate memory (which is a lot of the time), you are responsible for deleting it once you're finished using it. Care must also be taken to ensure you don't delete the same thing twice.
A note on the C# side of things: This doesn't mean you can completely ignore how memory works. It's very helpful to know what happens behind the scenes. In general, though, you won't have to worry about it much.
Edit: As GMan notes below (if I'm understanding him correctly), you can deal with memory management in C++ by letting every allocation be scoped, and thus the C++ environment will look after allocation and deletion for you. In that sense, you again have to understand how C++ allocates and deletes in order to not run into troubles.
some c++ open source choosed to create their own memory gabrage collector like V8 javascript engine to avoid all problems of memory leaks.

Categories

Resources