What are the different heaps in .NET? - C#

I was profiling the memory usage of a Windows Forms application in dotMemory and I noticed that my application had heaps numbered 0-4, all of varying sizes, as well as the large object heap.
I was just wondering if anyone had a good explanation of what each heap is for and what is typically stored in each heap?

The other answers seem to be missing the fact that there is a difference between heaps and generations. I don't see why a commercial profiler would confuse the two concepts, so I strongly suspect it's heaps and not generations after all.
When the CLR GC is using the server flavor, it creates a separate heap for each logical processor in the process's affinity mask. The reason for this breakdown is mostly to improve the scalability of allocations and to perform GC in parallel. These are separate memory regions, but you can of course have object references between the heaps, and you can consider them a single logical heap.
So, assuming that you have four logical processors (e.g. an i5 CPU with HyperThreading enabled), you'll have four heaps under server GC.
The Large Object Heap has an unfortunate, confusing name. It's not a heap in the same sense as the per-processor heaps. It's a logical abstraction on top of multiple memory regions that contain large objects.
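If you want to check which GC flavor your process is actually running under, here is a minimal sketch using the standard GCSettings and Environment APIs (the class name is illustrative):

using System;
using System.Runtime;

class GcModeCheck
{
    static void Main()
    {
        // True when the server GC flavor is in effect for this process.
        Console.WriteLine("Server GC: " + GCSettings.IsServerGC);

        // Under server GC, expect roughly one heap per logical processor,
        // which is the heap count a profiler like dotMemory would show.
        Console.WriteLine("Logical processors: " + Environment.ProcessorCount);
    }
}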

You have different heaps because of how the .NET garbage collector works. It uses a generational GC, which separates data based on how recently it was allocated. The use of different heaps allows the garbage collector to clean up memory more efficiently.
According to MSDN:
The heap is organized into generations so it can handle long-lived and short-lived objects. Garbage collection primarily occurs with the reclamation of short-lived objects that typically occupy only a small part of the heap.
Generation 0. This is the youngest generation and contains short-lived objects. An example of a short-lived object is a temporary variable. Garbage collection occurs most frequently in this generation.
Newly allocated objects form a new generation of objects and are implicitly generation 0 collections, unless they are large objects, in which case they go on the large object heap in a generation 2 collection.
Most objects are reclaimed for garbage collection in generation 0 and do not survive to the next generation.
Generation 1. This generation contains short-lived objects and serves as a buffer between short-lived objects and long-lived objects.
Generation 2. This generation contains long-lived objects. An example of a long-lived object is an object in a server application that contains static data that is live for the duration of the process.
Objects that are not reclaimed in a garbage collection are known as survivors, and are promoted to the next generation.
Important data quickly gets put on the garbage collector's back burner (higher generations) and is checked for deletion less often. This lowers the amount of time wasted checking memory that truly needs to persist, which lets you see performance gains from an efficient garbage collector.
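To see promotion between generations in action, here is a minimal sketch using GC.GetGeneration and GC.Collect (exact output can vary by runtime and build configuration, which is why the reference is kept explicitly alive):

using System;

class PromotionDemo
{
    static void Main()
    {
        var obj = new object();
        Console.WriteLine(GC.GetGeneration(obj)); // 0: freshly allocated

        GC.Collect(); // obj survives the collection and is promoted
        Console.WriteLine(GC.GetGeneration(obj)); // 1

        GC.Collect();
        Console.WriteLine(GC.GetGeneration(obj)); // 2: now considered long-lived

        GC.KeepAlive(obj); // keep the reference live through the collections
    }
}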

When it comes to managed objects, there is the Small Object Heap (SOH), which is divided into three generations, and the Large Object Heap (LOH).
Large Object Heap (LOH)
Objects larger than 85,000 bytes go straight to the LOH. There are some risks if you have too many large objects; that's a different discussion. For more details, have a look at The Dangers of the Large Object Heap.
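A quick way to observe the threshold (the exact limit is 85,000 bytes) is that GC.GetGeneration reports LOH objects as generation 2 from the moment they are allocated; a small sketch:

using System;

class LohDemo
{
    static void Main()
    {
        var small = new byte[80000]; // below the 85,000-byte threshold
        var large = new byte[90000]; // above the threshold

        Console.WriteLine(GC.GetGeneration(small)); // 0: normal SOH allocation
        Console.WriteLine(GC.GetGeneration(large)); // 2: LOH is logically part of Gen2
    }
}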
Small Object Heap (SOH): Gen0, Gen1, Gen2
The garbage collector uses a clever algorithm to run collections only when required. A full garbage collection is an expensive operation that shouldn't happen too often, so the SOH is broken into three parts, and as you have noticed, each generation has a budgeted amount of memory.
Every small object (under 85,000 bytes) initially goes to Gen0. When Gen0 is full, a garbage collection runs for Gen0 only. It checks every instance in Gen0, releases the memory used by objects that are no longer needed (unreferenced or out-of-scope objects), and then promotes all still-referenced instances to Gen1.
The above process occurs even when you execute the following (you are not required to call it manually):
// Perform a collection of generation 0 only.
GC.Collect(0);
In this way, the garbage collector first clears the memory allocated to short-lived instances (immutable strings, variables in methods or smaller scopes).
As the GC keeps doing this, at some point Gen1 fills up. The GC then performs the same operation on Gen1: it clears all the unnecessary memory in Gen1 and promotes the objects still in use to Gen2.
The above process occurs when you execute the following manually (again, not required):
// Perform a collection of all generations up to and including 1.
GC.Collect(1);
As the GC keeps doing this, at some point Gen2 fills up, and the GC then tries to clean Gen2 in the same way.
The above process occurs even when you execute the following manually (again, not required):
// Perform a collection of all generations up to and including 2.
GC.Collect(2);
If the amount of memory that needs to be promoted from Gen1 to Gen2 is greater than the memory available in Gen2 and the heap cannot grow, the GC throws an OutOfMemoryException.
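You can watch how much more often the lower generations are collected with GC.CollectionCount; a rough sketch (the exact counts will vary from run to run):

using System;

class CollectionCounts
{
    static void Main()
    {
        // Churn through a lot of short-lived garbage.
        for (int i = 0; i < 1000000; i++)
        {
            var temp = new byte[100];
        }

        // Gen0 collections should far outnumber Gen1, which outnumber Gen2.
        Console.WriteLine("Gen0: " + GC.CollectionCount(0));
        Console.WriteLine("Gen1: " + GC.CollectionCount(1));
        Console.WriteLine("Gen2: " + GC.CollectionCount(2));
    }
}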

Related

Why are gen1/gen2 collections slower than gen0?

From my anecdotal knowledge, short-lived object creation isn't too troublesome in terms of GC, implying that gen0 collections are extremely fast. Gen1/gen2 collections, however, appear to be a little more "dreaded", i.e. they are said to usually be a whole lot slower than gen0.
Why is that? What makes, say, a gen2 collection on average significantly slower than gen0?
I'm not aware of any structural differences between the collection approaches themselves (i.e., things done in the mark/sweep/compaction phases); am I missing something? Or is it just that, e.g., gen2 tends to be larger than gen0, hence more objects to check?
To amplify on canton7's answer, it's worthwhile to note a couple of additional things, one of which increases the cost of all collections (but especially gen1 and gen2) but reduces the cost of allocations between them, and one of which reduces the cost of gen0 and gen1 collections:
Many garbage collectors behave in a fashion somewhat analogous to cleaning out a building by moving everything of value to another building, dynamiting the original, and rebuilding the empty shell. A gen0 collection, which moves things from the gen0 building to the gen1 building, will be fairly fast because the gen0 "building" won't have much stuff in it. A gen2 collection would have to move everything that was in the much larger gen2 building. Garbage collection systems may use a separate building for smaller gen2 objects and larger ones, and manage the larger buildings by tracking individual regions of free space, but moving smaller objects and reclaiming storage wholesale is less work than trying to manage all the individual regions of storage that would become eligible for reuse. A key point to observe about generations here, however, is that even when it's necessary to scan a gen1 or gen2 object, it won't be necessary to move it since the "building" it's in isn't targeted for immediate demolition.
Many systems use a "card table" which records whether each 4K chunk of memory has been written to, or contains a reference that was used to modify an object, since the last gen0 or gen1 collection. This significantly slows down the first write to any such region of storage, but during gen0 and gen1 collections it makes it possible to skip the examination of a lot of objects. The details of how the card table is used vary, but the basic concept is that if code has a large array of references, most of which falls within 4K blocks that aren't tagged, the GC can know without even looking in those blocks that any newer objects which would be accessible through them will also be accessible in other ways, and thus it will be possible to find all gen0 objects without bothering to look in those blocks at all.
Note that even simplistic garbage-collection systems without card tables can be simply and easily made to benefit from the principle of generational GC. For example, on Commodore 64 BASIC, whose garbage collector is horrendously slow, a program that has created lots of long-lived strings can avoid lengthy garbage-collection cycles by using a couple of PEEK and POKE statements to adjust the top-of-string-heap pointer to just below the bottom of the long-lived strings so they won't be considered for relocation/reclamation. If a program uses hundreds of strings that will last throughout program execution (e.g. a table of two-digit hex strings from 00 to FF) and just a handful of other strings, this may slash garbage-collection times by more than an order of magnitude.
A couple of reasons which come to mind:
They're bigger. Collecting gen1 means also collecting gen0, and doing a gen2 collection means collecting all three. The lower generations are sized smaller as well, as gen0 is collected most frequently and so needs to be cheap.
The main cost of a collection is a function of the number of objects which survive, not the number which die. Generational garbage collectors are built around the generational hypothesis, which says that objects tend to live for a short time, or a long time, but not often in the middle. Gen0 collections by their very definition consist mainly of objects which die in that generation, and so collections are cheap; gen1 and gen2 collections have a higher proportion of objects which survive (gen2 should ideally consist only of objects which survive), and so are more expensive.
If an object is in gen0, then it can only be referenced by other gen0 objects, or by objects in higher generations which were updated to refer to it. Therefore to see whether an object in gen0 is referenced, the GC needs to check other gen0 objects, as well as only those objects in higher generations which have been updated to point to lower-generation objects (which the GC tracks, see "card tables"). To see whether a gen1 object is referenced it needs to check all of gen0 and gen1, and updated objects in gen2.
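A crude way to feel the difference is to time a gen0-only collection against a full collection after building up a large gen2 population. This is a micro-benchmark sketch, not a rigorous measurement, and the absolute numbers will vary by machine and runtime:

using System;
using System.Diagnostics;

class CollectionTiming
{
    static object[] longLived = new object[1000000];

    static void Main()
    {
        // Build a large gen2 population so a full collection has real work to do.
        for (int i = 0; i < longLived.Length; i++)
            longLived[i] = new byte[64];
        GC.Collect();
        GC.Collect(); // two collections promote the survivors to gen2

        var sw = Stopwatch.StartNew();
        GC.Collect(0, GCCollectionMode.Forced); // gen0 only
        Console.WriteLine("Gen0 collection: " + sw.Elapsed.TotalMilliseconds + " ms");

        sw.Restart();
        GC.Collect(2, GCCollectionMode.Forced); // full collection
        Console.WriteLine("Full collection: " + sw.Elapsed.TotalMilliseconds + " ms");
    }
}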

PerfView GC Trigger Reasons

I have been investigating some garbage collection issues in a C# server app. I'm currently using PerfView to do this. After collecting some data and getting a load of GC stats, I'm a little confused about one of the columns, 'Trigger Reason'. I'm getting two values, 'AllocLarge' and 'AllocSmall'. I have searched through the help and Google and can't find what exactly these two terms mean.
The .NET GC treats objects larger than 85K (large objects) very differently from other objects (small objects). In particular, large objects are only collected in generation 2 (the most expensive kind of GC). 'AllocLarge' means a GC was triggered while allocating a large object (and thus must have provoked a Gen 2 GC). 'AllocSmall' means a GC happened in response to an allocation of an 'ordinary' object.
Note that in general it is bad to have short-lived large objects (since these force expensive GCs). You can see everywhere you allocated a large object by looking at the 'GC Alloc Stats' view and looking for the pseudo-frame 'LargeObject'. Double-click on that (which brings you to the 'callers' view), and you will see where you are allocating large objects.
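For illustration, here is the kind of allocation pattern that tends to show up as 'AllocLarge' triggers, plus one common mitigation using ArrayPool (built into .NET Core, or available via the System.Buffers package on .NET Framework). The Process method is a hypothetical placeholder for your own work:

using System;
using System.Buffers;

class LargeAllocPattern
{
    static void Process(byte[] buffer) { /* work on the buffer */ }

    static void Main()
    {
        // Short-lived >85K buffers: each one lands on the LOH and dies young,
        // the pattern that produces 'AllocLarge' GC triggers in PerfView.
        for (int i = 0; i < 100; i++)
        {
            var buffer = new byte[100000];
            Process(buffer);
        }

        // Mitigation: rent and return buffers instead of allocating new ones.
        var pool = ArrayPool<byte>.Shared;
        for (int i = 0; i < 100; i++)
        {
            byte[] buffer = pool.Rent(100000);
            try { Process(buffer); }
            finally { pool.Return(buffer); }
        }
    }
}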

What is meant by Generations of Garbage Collector?

What is meant by generations of the garbage collector in C#? Is it a distinct concept, or is 'generation' only a term used to represent a time period?
A GC generation relates to how many garbage collections an object survives.
All objects start in generation 0. When a garbage collection occurs, and a generation N object cannot be collected, it is moved to generation N+1.
The generations are used to optimize garbage-collection performance. It is generally true that generation 0:
Is a small fraction of the entire heap in size
Has a lot of short-lived objects.
Therefore, when garbage collection occurs, the garbage collector starts by collecting generation 0, which will be quick. If enough memory can be released, there is no need to look at the older generations, and the collection finishes quickly.
Books could be written about the subject, but to start with, there are some great details in this article, or in the reference here.

Per-thread memory management in C#

Continuing the discussion from Understanding VS2010 C# parallel profiling results but more to the point:
I have many threads that work in parallel (using Parallel.For/Each), which use many memory allocations for small classes.
This creates contention on the global memory allocator.
Is there a way to instruct .NET to preallocate a memory pool for each thread and do all allocations from this pool?
Currently my solution is my own implementation of memory pools (globally allocated arrays of objects of type T, which are recycled among the threads), which helps a lot but is not efficient because:
I can't instruct .NET to allocate from a specific memory slice.
I still need to call new many times to allocate the memory for the pools.
Thanks,
Haggai
I searched for two days trying to find an answer to the same issue you had. The answer is that you need to set the garbage collection mode to Server mode; by default, the garbage collection mode is set to Workstation mode.
Setting garbage collection to Server mode causes the managed heap to split into separately managed sections, one per CPU.
To do this, you need to add a config setting to your app.config file.
<runtime>
  <gcServer enabled="true"/>
</runtime>
The speed difference on my 12-core Opteron 6172 was dramatic!
The garbage collector does not allocate memory.
It sounds more like you're allocating lots of small temporary objects and a few long-lived objects, and the garbage collector is spending a lot of time garbage-collecting the temporary objects so your app doesn't have to request more memory from the OS. From .NET Framework 4 Advanced Development - Garbage Collection:
As long as address space is available in the managed heap, the runtime continues to allocate space for new objects. However, memory is not infinite. Eventually the garbage collector must perform a collection in order to free some memory.
The solution: Don't allocate lots of small temporary objects. The page on Garbage Collection and Performance might also be helpful.
You could pre-allocate a bunch of objects, and keep them in groups intended for separate threads. However, it's likely that you won't get any better performance from this.
The garbage collector is specially designed to handle small short-lived objects efficiently. If you keep the objects in a pool, they are long-lived and will survive garbage collections, which in turn means that they will be copied to the second-generation heap. This copying will be more expensive than just allocating new objects.
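For reference, a minimal sketch of the kind of pool the question describes (the names are illustrative, and note the trade-off the answers above warn about: pooled objects are long-lived, so they get promoted to gen2 instead of dying cheaply in gen0):

using System.Collections.Concurrent;

// Minimal thread-safe pool. Pooled objects survive collections and are
// promoted to gen2, which is exactly the cost discussed above.
class SimplePool<T> where T : new()
{
    private readonly ConcurrentBag<T> _items = new ConcurrentBag<T>();

    public T Rent()
    {
        T item;
        return _items.TryTake(out item) ? item : new T();
    }

    public void Return(T item)
    {
        _items.Add(item);
    }
}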

Short-lived objects

What is the overhead of generating a lot of temporary objects (i.e. for interim results) that "die young" (never promoted to the next generation during a garbage collection interval)? I'm assuming that the "new" operation is very cheap, as it is really just a pointer increment. However, what are the hidden costs of dealing with this temporary "litter"?
Not a lot - the garbage collector is very fast for gen0. It also tunes itself, adjusting the size of gen0 depending on how much it manages to collect each time it goes. (If it's managed to collect a lot, it will reduce the size of gen0 to collect earlier next time, and vice versa.)
The ultimate test is how your application performs though. Perfmon is very handy here, showing how much time has been spent in GC, how many collections there have been of each generation etc.
As you say, the allocation itself is very inexpensive. The cost of generating lots of short-lived objects is more frequent garbage collections, as they are triggered when generation 0's budget is exhausted. However, a generation 0 collection is fairly cheap, so as long as your objects really are short-lived, the overhead is most likely not significant.
On the other hand the common example of concatenating lots of strings in a loop pushes the garbage collector significantly, so it all depends on the number of objects you create. It doesn't hurt to think about allocation.
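To make the string example concrete, a small sketch of the churn versus the StringBuilder alternative:

using System.Text;

// Each += allocates a brand-new string, leaving the previous
// intermediate result behind as gen0 garbage.
string s = "";
for (int i = 0; i < 10000; i++)
    s += i;

// StringBuilder appends into a single growable buffer instead.
var sb = new StringBuilder();
for (int i = 0; i < 10000; i++)
    sb.Append(i);
string result = sb.ToString();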
The cost of garbage collection is that managed threads are suspended during compaction.
In general, this isn't something you should probably be worrying about; it sounds like it falls very close to "micro-optimization". The GC was designed with the assumption that a "well tuned application" will have all of its allocations in Gen0, meaning that they all "die young". Any time you allocate a new object it is always in Gen0. A collection won't occur until the Gen0 threshold is passed and there isn't enough available space in Gen0 to hold the next allocation.
The "new" operation is actually a bunch of things:
allocating memory
running the type's constructor
returning a pointer to the memory
incrementing the next object pointer
Although the new operation is designed and written efficiently, it is not free and does take time to allocate new memory. The memory allocation library needs to track which chunks are available for allocation, and newly allocated memory is zeroed.
Creating a lot of objects that die young will also trigger garbage collection more often, and that operation can be expensive, especially with "stop the world" garbage collectors.
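A rough micro-benchmark sketch of the allocation cost itself (the numbers are purely illustrative, and a JIT may optimize away dead allocations on some runtimes):

using System;
using System.Diagnostics;

class AllocationCost
{
    static void Main()
    {
        const int count = 10000000;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            var o = new object(); // pointer bump, header init, empty constructor
        }
        sw.Stop();
        Console.WriteLine(count + " allocations in " + sw.ElapsedMilliseconds +
                          " ms, with " + GC.CollectionCount(0) + " gen0 collections");
    }
}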
Here's an article from the MSDN on how it works:
http://msdn.microsoft.com/en-us/magazine/bb985011.aspx
Note that it describes how garbage collection is expensive because the GC needs to build the object graph before it can start collecting.
If these objects are never promoted out of Generation 0 then you will see pretty good performance. The only hidden cost I can see is that if you exceed your Generation 0 budget you will force the GC to compact the heap but the GC will self-tune so this isn't much of a concern.
Garbage collection is generational in .NET. Short-lived objects are collected first and frequently. A Gen 0 collection is cheap, but depending on the number of objects you're creating, it could add up. I'd run a profiler to find out if it is affecting performance. If it is, consider switching them to structs; these do not need to be collected.
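To illustrate the struct suggestion, a sketch (whether this actually helps depends on how the objects are used, since large or boxed structs can cost more than classes):

struct PointStruct { public int X; public int Y; }
class PointClass { public int X; public int Y; }

class StructVsClass
{
    static void Main()
    {
        for (int i = 0; i < 1000000; i++)
        {
            var a = new PointStruct { X = i, Y = i }; // lives inline, no heap allocation
            var b = new PointClass { X = i, Y = i };  // one gen0 heap allocation
        }
    }
}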
