'Implicit' Garbage Collection doesn't reduce memory footprint. GC.Collect() does - c#

I have an application that processes a large amount of data, and I'm monitoring the .NET memory performance counters for it.
Based on the perf counters, #Bytes in All Heaps is slowly growing (about 20MB per 12 hours).
All 3 generations are also being collected (gen0 a few times per second, gen1 approximately once per second, gen2 approximately once per minute) - but that doesn't prevent #Bytes in All Heaps from slowly growing.
However if I explicitly run:
    GC.Collect();
    GC.WaitForPendingFinalizers();
it collects all the extra consumed memory (e.g. if run after 12 hours, the heap footprint drops by 20MB).
I also tried inspecting a dump (taken before running GC.Collect) with sos and sosex - and the majority of the objects hanging around are unrooted.
Why don't the implicit garbage collection runs (shown by the performance counters) collect the memory that an explicit GC.Collect() call does?
EDIT:
I forgot to mention that the objects that remain hanging around unrooted do NOT implement IDisposable - so they should be reclaimed during the first GC run on that particular generation (in other words, a potential problem with a wrong Dispose() method or a deadlocked finalizer is out of the question here. But points to Stephen and Roy for pointing out this possibility).

The garbage collector is actually pretty intelligent. It does not collect memory if it doesn't have to, and that provides it some optimization flexibility.
It keeps open the option of resurrecting the objects that require finalization but that are not yet finalized. As long as the memory is not required, the garbage collector thus does not force the finalization, just in case it can resurrect those objects.
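To see the behaviour described in the question from inside the process, the collection counts and managed heap size can be read before and after the explicit collection. This is only a minimal sketch: the class name `GcSnapshot` and the allocation loop are made up for illustration, and the numbers it prints will differ from run to run.

```csharp
using System;

public static class GcSnapshot
{
    // Prints how many collections each generation has seen so far, plus the
    // current managed heap size, and returns that size in bytes.
    public static long Report(string label)
    {
        long bytes = GC.GetTotalMemory(forceFullCollection: false);
        Console.WriteLine(
            $"{label}: gen0={GC.CollectionCount(0)} gen1={GC.CollectionCount(1)} " +
            $"gen2={GC.CollectionCount(2)} heap={bytes / 1024} KB");
        return bytes;
    }

    public static void Main()
    {
        // Hypothetical workload: short-lived arrays that nothing references afterwards.
        for (int i = 0; i < 10_000; i++)
        {
            var garbage = new byte[10_000];
            garbage[0] = 1;
        }

        Report("before explicit collect");

        // The sequence from the question: a full blocking collection,
        // then wait for the finalizer thread to drain its queue.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Report("after explicit collect");
    }
}
```

The gen2 counter goes up by at least one across the explicit call, because GC.Collect() with no arguments requests a collection of all generations.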

Related

Inconsistent(?) behavior of garbage collector and (almost) out of memory issues

Some background:
We are running some pipelines on a build server and they consume way too much memory. The pipeline does some DB imports, and over time it builds up memory x times greater than the total size of an exported DB. Entity Framework (Core) is used for the import (in order to be able to reuse entity definitions used in other parts of the application).
Situation:
We are looking into where memory consumption can be reduced. Hence I was using the memory profiler.
I've noticed that sometimes the garbage collector does seem to free up memory after process X was done, and before process Y was started.
This is as expected. The 4GB memory build up is OK(ish), as long as it is released. The code that caused this consumption is running in its own Scope (speaking about dependency injection) and the DbContexts (and other things) used are registered as Scoped. Hence we have these ScopeWorkers.
    await _scopeWorker.DoWork<MyProcessX>(_ => _.Import(cancellationToken));
    // In some tests, memory got freed up in between, but in other tests, memory never seemed to drop
    await _scopeWorker.DoWork<MyProcessY>(_ => _.Import(cancellationToken));
But in some other test, this drop in memory was never seen.
The red arrow indicates approximately the same moment in time, after MyProcessX.Import, and a significant drop (of 4GBs) was never seen.
Of course I do not know whether the GC spread out the cleaning of this memory over a couple dozen collection moments, instead of 3, as seen in the first screenshot.
Questions
Is it possible to wait for the garbage collector to have collected basically all memory used by MyProcessX.Import before continuing with MyProcessY.Import?
Should the garbage collector behave consistently? In other words, should I see the same memory consumption graph over time when the process is repeated and is doing the exact same operations (so the same data, as the data comes from a static source)?
If the garbage collector is inconsistent in its behavior, how to make good use of the memory profiling feature in Visual Studio to spot opportunities of lowering memory?
EDIT
Yes, the memory pressure on the system changes everything, as Evk pointed out. After reserving almost all physical memory on the system (31GB/32GB) and continuing the process whose memory usage I was trying to optimize, I could see a definite drop in memory used. I could repeat this; as shown in the image, there are actually 2 drops in memory.
The garbage collector uses the following conditions to decide whether it should start a collection:
The system has low physical memory. The memory size is detected by either the low memory notification from the operating system or low memory as indicated by the host.
The memory that's used by allocated objects on the managed heap surpasses an acceptable threshold. This threshold is continuously adjusted as the process runs.
The GC.Collect method is called. In almost all cases, you don't have to call this method because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
The first point means it depends on all processes running on the current machine, not only on your process. For the same reason you don't know when the GC will start, so you can't wait for that to happen.
For that same reason it cannot behave consistently in the way you describe, in relation to your process. Your process may do the same thing, but the OS as a whole is unlikely to ever do the same things during your process run. In one test run there was enough free memory across the whole system, and in another there was not.
What you can do is force the GC to run via GC.Collect (and its overloads). However, that's rarely a good idea.
The main thing you should ask yourself is: does high memory consumption cause any problems? By itself it's not a problem (assuming no memory leaks) - you have RAM to be used, not to just stay "free". If there is currently enough memory, the GC might rightfully decide not to waste time on garbage collection and do it later when necessary.
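If a forced collection between two phases really is wanted (for example to get comparable profiler snapshots), one way to do it is sketched below. `PhaseBoundary`, `CollectNow` and `RunPhase` are hypothetical names, and `RunPhase` is only a stand-in for the real MyProcessX.Import / MyProcessY.Import calls. Note that this shrinks the managed heap; whether the process hands the memory back to the OS straight away is a separate matter and is not guaranteed here.

```csharp
using System;
using System.Runtime;

public static class PhaseBoundary
{
    // Forces a full, blocking collection and returns the managed heap size afterwards.
    public static long CollectNow()
    {
        // Ask the next blocking gen 2 collection to also compact the large object heap.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;

        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true);
        GC.WaitForPendingFinalizers();
        // Second pass reclaims objects that only became free after their finalizers ran.
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true);

        return GC.GetTotalMemory(forceFullCollection: false);
    }

    // Hypothetical stand-in for one import phase: allocates ~64 MB that becomes garbage on return.
    private static void RunPhase(string name)
    {
        var buffers = new byte[64][];
        for (int i = 0; i < buffers.Length; i++)
        {
            buffers[i] = new byte[1_000_000];
        }
        Console.WriteLine($"{name} done, heap = {GC.GetTotalMemory(false) / (1024 * 1024)} MB");
    }

    public static void Main()
    {
        RunPhase("phase X");
        long afterCollect = CollectNow();
        Console.WriteLine($"after forced collect, heap = {afterCollect / (1024 * 1024)} MB");
        RunPhase("phase Y");
    }
}
```

This is useful for measurements, but as the answer says, leaving it in production code is rarely a good idea.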

How to release the memory which is occupied by a large object as soon as possible?

I have a method as such
    public void MainMethod()
    {
        while (true)
        {
            var largeObject = GetLargeObject();
            // ...
            // some work with this largeObject
            // ...
            // how to release the memory from largeObject here?
        }
    }
Obviously, I will use break in the loop.
Used this way, my memory will fill up with garbage pretty fast.
Is there a way to free the memory used by some object (largeObject, for example) without running the garbage collector via GC.Collect()? Not just mark this object as one that can be collected by the GC, but actually free the memory? Or does the GC collect largeObject as soon as the iteration is over, because it's not in use anymore?
As far as I understand, implementing IDisposable and calling Dispose() just marks the object for the GC but does not free the memory instantly (so the memory will be released when the GC runs).
P.S. Don't tell me "the GC will collect everything for you", I know that.
First of all, large objects, i.e. those of roughly 85 KB (85,000 bytes) or more, are allocated on the large object heap (LOH). The large object heap is only collected in gen 2 collections, i.e. collections are expensive. Because of this it is generally recommended to avoid frequent LOH allocations. If at all possible, reuse the same object, or use a memory pool to avoid frequent allocations.
There is no way to explicitly free managed memory without letting the GC do its thing. While the standard recommendation is to leave the GC alone, there are cases where it may make sense to run it manually, for example if you have recently released a large amount of memory and the performance impact of a collection is acceptable at that point in time.
If the object is small enough to fit in the small object heap then you should not need to worry about it. Gen 1 collections are fairly cheap, and while collections will need to run more frequently if you do many allocations, the time to run a collection is not proportional to the amount of freed memory. That said, reusing memory might still be a good idea.
In the end, if you suspect it is a problem, do some profiling. Some profilers give you the time spent in garbage collection, and also the allocation rate. Do not try to fix an imaginary problem before confirming that it actually is a problem and that you can verify the fix actually is an improvement.
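As an illustration of the "memory pool" suggestion, here is a rough sketch using ArrayPool<byte> from System.Buffers. It assumes the large object is (or can be backed by) a byte array, which the question does not say; `PooledLoop` and `Run` are made-up names, and ArrayPool is only available out of the box on newer runtimes (on older .NET Framework versions it comes from the System.Buffers package).

```csharp
using System;
using System.Buffers;

public static class PooledLoop
{
    // Runs `iterations` rounds of work, renting one large buffer per round
    // from the shared pool instead of allocating a fresh array each time.
    public static long Run(int iterations, int bufferSize)
    {
        long checksum = 0;
        for (int i = 0; i < iterations; i++)
        {
            // Rent may return an array longer than requested and with stale
            // contents; only rely on what you write yourself.
            byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
            try
            {
                buffer[0] = (byte)i;      // stand-in for "some work with this largeObject"
                checksum += buffer[0];
            }
            finally
            {
                // Hand the array back so a later iteration can reuse it.
                ArrayPool<byte>.Shared.Return(buffer);
            }
        }
        return checksum;
    }

    public static void Main()
    {
        // 1 MB is well above the large object threshold, so every fresh
        // allocation of this size would land on the LOH.
        Console.WriteLine(Run(iterations: 100, bufferSize: 1024 * 1024));
    }
}
```

With this shape the loop typically touches the LOH once rather than once per iteration, so there is far less for a gen 2 collection to clean up.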

Determine what GC.Collect() is Collecting

I have a console program written in VB.NET (.NET 4.5.2) that acts as a service. A continuous loop runs, and then waits for a message on an MSMQ queue, and processes the message. Somehow this program has a substantial memory leak. I have gone through all the code and done everything I could to use Using statements, but yet the problem persists. The more times through the loop, the higher the memory used by the program, and this memory is never reclaimed by the garbage collector.
I ended up putting a GC.Collect() at the bottom of my loop, and was able to free up most of the memory. However, I realize that this is bad practice and could cause issues. Just wondering if there is a way to inspect what variables the GC.Collect() is getting rid of, so I can find the root of the problem?
    Do While (True)
        ' Code to wait for message on a queue
        ' Code to process message (includes calls to class library)
        GC.Collect()
    Loop
If the garbage collector is freeing memory when you manually invoke it, you are not leaking memory. A leak in a managed memory environment is memory that the GC can't free because it is referenced somewhere on the object graph.
It could be that the way your code uses object instances causes them to be promoted to gen1, gen2 or the large object heap. Instances in these generations are collected less frequently than those in gen0. Windows Performance Monitor (perfmon) includes a number of performance counters that can be used to profile the behavior of the managed heap. I would guess that you might have objects being promoted into gen2. Tracking the "Gen 1 Promoted Bytes/Sec" counter would give you insight as to whether this is what's happening.
In a managed memory environment the GC runs when there is memory pressure, not when object instances are no longer needed, so the mere presence of increased memory use is not necessarily the sign of a leak.
If you take the Collect out, does memory usage always increase (say over a number of minutes or messages being processed from the queue), or does it go up and down somewhat like a sine wave? If it's the latter, just let the GC do its thing; you don't have a leak.
Visual Studio has a number of memory analysis features: https://msdn.microsoft.com/en-us/library/dn342825.aspx
The SOS managed debugger extension is a very powerful tool for sifting through the managed heap, though it is not for the faint of heart. https://learn.microsoft.com/en-us/dotnet/framework/tools/sos-dll-sos-debugging-extension
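The distinction between "garbage that has not been collected yet" and "a leak" can be demonstrated with a WeakReference, which tracks an object without keeping it alive. The sketch below is in C# rather than VB.NET, and `LeakCheck`, `Cache` and `Allocate` are hypothetical names used only for illustration; it assumes the usual CLR behaviour where a full blocking collection reclaims unreachable objects.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class LeakCheck
{
    // A static list is a GC root: anything added here stays reachable.
    private static readonly List<byte[]> Cache = new List<byte[]>();

    // Allocates in a separate, non-inlined method so that no local variable
    // in the caller's frame keeps the array alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Allocate(bool keepReference)
    {
        var data = new byte[1024];
        if (keepReference)
        {
            Cache.Add(data);   // simulates a forgotten reference, i.e. a real leak
        }
        return new WeakReference(data);
    }

    // Returns true if the tracked object is still alive after a full collection.
    public static bool SurvivesFullCollection(WeakReference weak)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return weak.IsAlive;
    }

    public static void Main()
    {
        WeakReference unrooted = Allocate(keepReference: false);
        WeakReference rooted = Allocate(keepReference: true);

        Console.WriteLine($"unrooted survives: {SurvivesFullCollection(unrooted)}");
        Console.WriteLine($"rooted survives:   {SurvivesFullCollection(rooted)}");
    }
}
```

An object that disappears after a forced collection was only waiting for the GC; one that survives is still referenced from somewhere, and that reference is what a heap tool such as SOS helps you find.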

.NET: What is typical garbage collector overhead?

5% of execution time spent on GC? 10%? 25%?
Thanks.
This blog post has an interesting investigation into this area.
The poster's conclusion? That the overhead was negligible for his example.
So the GC heap is so fast that in a real program, even in tight loops, you can use closures and delegates without even giving it a second’s thought (or even a few nanosecond’s thought). As always, work on a clean, safe design, then profile to find out where the overhead is.
It depends entirely on the application. The garbage collection is done as required, so the more often you allocate large amounts of memory which later becomes garbage, the more often it must run.
It could even go as low as 0% if you allocate everything up front and then never allocate any new objects.
In typical applications I would think the answer is very close to 0% of the time is spent in the garbage collector.
The overhead varies widely. It's not really practical to reduce the problem domain into "typical scenarios" because the overhead of GC (and related functions, like finalization) depends on several factors:
The GC flavor your application uses (impacts how your threads may be blocked during a GC).
Your allocation profile, including how often you allocate (GC triggers automatically when an allocation request needs more memory) and the lifetime profile of objects (gen 0 collections are fastest, gen 2 collections are slower, if you induce a lot of gen 2 collections your overhead will increase).
The lifetime profile of finalizable objects, because they must have their finalizers complete before they will be eligible for collection.
The impact of various points on each of those axes of relevancy can be analyzed (and there are probably more relevant areas I'm not recalling off the top of my head) -- so the problem is really "how can you reduce those axes of relevancy to a 'common scenario?'"
Basically, as others said, it depends. Or, "low enough that you shouldn't worry about it until it shows up on a profiler report."
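The first factor above, the GC flavor, can be checked at runtime through System.Runtime.GCSettings. A small sketch (the class name `GcFlavor` is made up; the values printed depend on the application's configuration):

```csharp
using System;
using System.Runtime;

public static class GcFlavor
{
    // Summarises which GC flavor and latency mode the current process is running with.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "server" : "workstation";
        return $"{flavor} GC, latency mode {GCSettings.LatencyMode}, " +
               $"max generation {GC.MaxGeneration}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Logging this once at startup makes it easier to compare GC timings between machines, since a server GC process and a workstation GC process can behave quite differently under the same load.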
In native C/C++ there is sometimes a large cost of allocating memory due to finding a block of free memory of the right size. There is also a non-zero cost of freeing memory, due to having to link the freed memory into the correct list of blocks and combine small blocks into large blocks.
In .NET it is very quick to allocate a new object, but you pay the cost when the garbage collector runs. However, the cost of garbage collecting short-lived objects is as close to free as you can get.
I have always found that if the cost of garbage collection is a problem for you, then you are likely to have other, bigger problems with the design of your software. Paging can be a big issue with any GC if you don’t have enough physical RAM, so you may not be able to just put all your data in RAM and depend on the OS to provide virtual memory as needed.
It really can vary. Look at this demonstration short-but-complete program that I wrote:
http://nomorehacks.wordpress.com/2008/11/27/forcing-the-garbage-collector/
that shows the effect of large gen2 garbage collections.
Yes, the garbage collector will spend some X% of time collecting when averaged over all applications everywhere. But that doesn't necessarily mean that time is overhead. For overhead, you can really only count the time that is left after subtracting what it would take to release an equivalent amount of memory on an unmanaged platform.
With that in mind, the actual overhead can be negative, because the garbage collector saves time by releasing several chunks of memory in batches. That means fewer context switches and an overall improvement in efficiency.
Additionally, starting with .NET 4 the garbage collector does a lot of its work on a different thread that doesn't interrupt your currently running code as much. As we work more and more with multi-core machines where a core might even be sitting idle now and then, this is a big deal.

Short-lived objects

What is the overhead of generating a lot of temporary objects (i.e. for interim results) that "die young" (never promoted to the next generation during a garbage collection interval)? I'm assuming that the "new" operation is very cheap, as it is really just a pointer increment. However, what are the hidden costs of dealing with this temporary "litter"?
Not a lot - the garbage collector is very fast for gen0. It also tunes itself, adjusting the size of gen0 depending on how much it manages to collect each time it goes. (If it's managed to collect a lot, it will reduce the size of gen0 to collect earlier next time, and vice versa.)
The ultimate test is how your application performs though. Perfmon is very handy here, showing how much time has been spent in GC, how many collections there have been of each generation etc.
As you say, the allocation itself is very inexpensive. The cost of generating lots of short-lived objects is more frequent garbage collections, as they are triggered when generation 0's budget is exhausted. However, a generation 0 collection is fairly cheap, so as long as your objects really are short-lived the overhead is most likely not significant.
On the other hand the common example of concatenating lots of strings in a loop pushes the garbage collector significantly, so it all depends on the number of objects you create. It doesn't hurt to think about allocation.
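The string concatenation case can be tried out with a rough sketch like the following, which counts gen 0 collections around each variant. `ConcatDemo` and its methods are hypothetical names, and the collection counts are not deterministic, so treat the output as an indication only.

```csharp
using System;
using System.Text;

public static class ConcatDemo
{
    // Builds the string with +=, which allocates a new, longer string on every iteration.
    public static string WithConcat(int n)
    {
        string s = "";
        for (int i = 0; i < n; i++)
        {
            s += "x";
        }
        return s;
    }

    // Builds the same string in a single pre-sized buffer.
    public static string WithBuilder(int n)
    {
        var sb = new StringBuilder(n);
        for (int i = 0; i < n; i++)
        {
            sb.Append('x');
        }
        return sb.ToString();
    }

    // Runs `work` and returns how many gen 0 collections happened while it ran.
    public static int Gen0During(Action work)
    {
        int before = GC.CollectionCount(0);
        work();
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        const int n = 20_000;
        Console.WriteLine($"concat:  {Gen0During(() => WithConcat(n))} gen0 collections");
        Console.WriteLine($"builder: {Gen0During(() => WithBuilder(n))} gen0 collections");
    }
}
```

Both variants produce the same string; the difference is only in how much temporary garbage is created along the way.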
The cost of garbage collection is that managed threads are suspended during compaction.
In general, this isn't something you should probably be worrying about, and it sounds like it starts to fall very close to "micro-optimization". The GC was designed with the assumption that a "well tuned application" will have all of its allocations in Gen0 - meaning that they all "die young". Any time you allocate a new object it is always in Gen0. A collection won't occur until the Gen0 threshold is passed and there isn't enough available space in Gen0 to hold the next allocation.
The "new" operation is actually a bunch of things:
allocating memory
running the type's constructor
returning a pointer to the memory
incrementing the next object pointer
Although the new operation is designed and written efficiently it is not free and does take time to allocate new memory. The memory allocation library needs to track what chunks are available for allocation and the newly allocated memory is zeroed.
Creating a lot of objects that die young will also trigger garbage collection more often and that operation can be expensive. Especially with "stop the world" garbage collectors.
Here's an article from the MSDN on how it works:
http://msdn.microsoft.com/en-us/magazine/bb985011.aspx
Note that it describes how calling garbage collection is expensive because it needs to build the object graph before it can start garbage collection.
If these objects are never promoted out of Generation 0 then you will see pretty good performance. The only hidden cost I can see is that if you exceed your Generation 0 budget you will force the GC to compact the heap but the GC will self-tune so this isn't much of a concern.
Garbage collection is generational in .Net. Short lived objects will collect first and frequently. Gen 0 collection is cheap, but depending on the scale of the number of objects you're creating, it could be quite costly. I'd run a profiler to find out if it is affecting performance. If it is, consider switching them to structs. These do not need to be collected.
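The struct suggestion can be checked with a sketch like the one below, which compares the bytes allocated on the current thread for a struct-based and a class-based loop. The types `PointStruct` and `PointClass` are invented for the example, and GC.GetAllocatedBytesForCurrentThread is assumed to be available (it exists on newer runtimes only). Depending on the JIT, the class version may also be optimised, so the measured difference is not guaranteed.

```csharp
using System;

public struct PointStruct { public int X, Y; }
public sealed class PointClass { public int X, Y; }

public static class StructVsClass
{
    // Temporary values are structs: they live in locals, not on the GC heap.
    public static long SumStructs(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++)
        {
            var p = new PointStruct { X = i, Y = 1 };
            sum += p.X + p.Y;
        }
        return sum;
    }

    // Temporary values are class instances: typically one heap object per iteration.
    public static long SumClasses(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++)
        {
            var p = new PointClass { X = i, Y = 1 };
            sum += p.X + p.Y;
        }
        return sum;
    }

    // Returns the number of bytes allocated on this thread while `work` ran.
    public static long AllocatedDuring(Func<long> work)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        work();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        const int n = 1_000_000;
        Console.WriteLine($"structs: {AllocatedDuring(() => SumStructs(n))} bytes allocated");
        Console.WriteLine($"classes: {AllocatedDuring(() => SumClasses(n))} bytes allocated");
    }
}
```

As the answers above stress, this kind of change is only worth making once a profiler shows that the allocations actually matter.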
