Determine what GC.Collect() is Collecting - c#

I have a console program written in VB.NET (.NET 4.5.2) that acts as a service. It runs a continuous loop that waits for a message on an MSMQ queue and then processes it. Somehow this program has a substantial memory leak. I have gone through all the code and wrapped everything I could in Using statements, yet the problem persists. The more times the loop runs, the more memory the program uses, and this memory is never reclaimed by the garbage collector.
I ended up putting a GC.Collect() at the bottom of my loop, which freed most of the memory. However, I realize that this is bad practice and could cause issues. Is there a way to inspect which objects GC.Collect() is getting rid of, so I can find the root of the problem?
Do While (True)
' Code to wait for message on a queue
' Code to process message (includes calls to class library)
GC.Collect()
Loop

If the garbage collector is freeing memory when you manually invoke it, you are not leaking memory. A leak in a managed memory environment is memory that the GC can't free because it is referenced somewhere on the object graph.
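The distinction between garbage and a leak can be sketched with two weak references. This is a minimal illustration, not code from the question: the `Rooted` list is a hypothetical stand-in for whatever static field, event subscription, or cache might be holding objects in a real program.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class LeakVersusGarbage
{
    // Hypothetical static cache: anything added here stays reachable,
    // so the GC is not allowed to free it.
    private static readonly List<byte[]> Rooted = new List<byte[]>();

    // Allocate in a separate, non-inlined method so that no local variable
    // in the caller keeps the array alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Allocate(bool keepRooted)
    {
        var data = new byte[1024];
        if (keepRooted) Rooted.Add(data);
        return new WeakReference(data); // a weak reference does not keep data alive
    }

    public static void Main()
    {
        WeakReference garbage = Allocate(false);
        WeakReference leaked = Allocate(true);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine("Unreferenced array still alive: " + garbage.IsAlive);
        Console.WriteLine("Rooted array still alive:       " + leaked.IsAlive);
    }
}
```

The unreferenced array is typically gone after the forced collection, while the rooted one survives no matter how often GC.Collect() is called. Memory that disappears on a forced collection is in the first category.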
It could be that the way your code uses object instances causes them to be promoted to gen 1 or gen 2, or allocated on the large object heap. Instances in these generations are collected less frequently than those in gen 0. Windows Performance Monitor exposes a number of .NET CLR Memory performance counters that can be used to profile the behavior of the managed heap. I would guess that you have objects being promoted into gen 2. Tracking the "Gen 1 Promoted Bytes/Sec" counter would give you insight into whether this is what's happening.
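Promotion can also be observed from inside the process with GC.GetGeneration. The following is a small sketch; the class and method names are made up for illustration, and the array sizes are arbitrary values chosen to land on either side of the large object threshold (roughly 85,000 bytes).

```csharp
using System;

public static class GenerationDemo
{
    // Reports the generation of obj before and after two forced collections.
    public static int[] TrackPromotion(object obj)
    {
        var generations = new int[3];
        generations[0] = GC.GetGeneration(obj);
        GC.Collect();
        generations[1] = GC.GetGeneration(obj);
        GC.Collect();
        generations[2] = GC.GetGeneration(obj);
        GC.KeepAlive(obj); // make sure obj is considered live up to this point
        return generations;
    }

    public static void Main()
    {
        int[] small = TrackPromotion(new byte[100]);     // small object heap
        int[] large = TrackPromotion(new byte[100000]);  // large object heap
        Console.WriteLine("small: " + small[0] + " -> " + small[1] + " -> " + small[2]);
        Console.WriteLine("large: " + large[0] + " -> " + large[1] + " -> " + large[2]);
    }
}
```

A surviving small object usually moves up one generation per collection, whereas a large array is reported as belonging to the oldest generation from the start.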
In a managed memory environment the GC runs when there is memory pressure, not when object instances are no longer needed, so the mere presence of increased memory use is not necessarily the sign of a leak.
If you take the Collect out, does memory usage always increase (say, over a number of minutes or messages processed from the queue), or does it go up and down somewhat like a sine wave? If it's the latter, just let the GC do its thing; you don't have a leak.
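One low-effort way to answer that question is to log the managed heap size and the collection counts once per loop iteration. This is only a sketch: `GcSampler` and `Snapshot` are invented names, and the allocation in `Main` is a stand-in for processing one message.

```csharp
using System;

public static class GcSampler
{
    // Formats the managed heap size and the number of collections per generation.
    public static string Snapshot()
    {
        long heapBytes = GC.GetTotalMemory(false); // false: do not force a collection first
        return string.Format(
            "heap={0:N0} bytes, gen0={1}, gen1={2}, gen2={3}",
            heapBytes,
            GC.CollectionCount(0),
            GC.CollectionCount(1),
            GC.CollectionCount(2));
    }

    public static void Main()
    {
        for (int i = 0; i < 5; i++)
        {
            var work = new byte[1000000]; // stand-in for processing one message
            work[0] = 1;
            Console.WriteLine("iteration " + i + ": " + Snapshot());
        }
    }
}
```

If the heap figure drops every time one of the counters increases, the collector is keeping up. If it keeps climbing even across gen 2 collections, something is holding references.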
Visual Studio has a number of memory analysis features: https://msdn.microsoft.com/en-us/library/dn342825.aspx
The SOS managed debugger extension is a very powerful tool for sifting through the managed heap, though it is not for the faint of heart. https://learn.microsoft.com/en-us/dotnet/framework/tools/sos-dll-sos-debugging-extension

Related

Inconsistent(?) behavior of garbage collector and (almost) out of memory issues

Some background:
We are running some pipelines on a build server and they consume way too much memory. The pipeline does some DB imports, and over time it builds up memory several times greater than the total size of an exported DB. Entity Framework (Core) is used for the import (in order to reuse entity definitions from other parts of the application).
Situation:
We are looking into where memory consumption can be reduced. Hence I was using the memory profiler.
I've noticed that sometimes the garbage collector does seem to free up memory after process X was done, and before process Y was started.
This is as expected. The 4GB memory build up is OK(ish), as long as it is released. The code that caused this consumption is running in its own Scope (speaking about dependency injection) and the DbContexts (and other things) used are registered as Scoped. Hence we have these ScopeWorkers.
await _scopeWorker.DoWork<MyProcessX>(_ => _.Import(cancellationToken));
// In some test, memory got freed up in between, but in some other test, memory never seemed to have dropped
await _scopeWorker.DoWork<MyProcessY>(_ => _.Import(cancellationToken));
But in some other test, this drop in memory was never seen.
The red arrow indicates approximately the same moment in time, after MyProcessX.Import, and a significant drop (of 4GBs) was never seen.
Of course I do not know whether the GC spread out the cleaning of this memory over a couple dozen collection moments, instead of 3, as seen in the first screenshot.
Questions
Is it possible to wait for the garbage collector to have collected basically all memory used by MyProcessX.Import before continuing with MyProcessY.Import?
Should the garbage collector behave consistently? In other words, should I see the same memory consumption graph over time when the process is repeated and does the exact same operations (same data, as the data comes from a static source)?
If the garbage collector is inconsistent in its behavior, how to make good use of the memory profiling feature in Visual Studio to spot opportunities of lowering memory?
EDIT
Yes, the memory pressure on the system changes everything, as Evk pointed out. After reserving almost all physical memory on the system (31GB/32GB) and then continuing the process whose memory usage I was trying to optimize, I could see a definite drop in memory used. I could repeat this; as shown in the image, there are actually 2 drops in memory.
The garbage collector uses the following conditions to decide whether it should start a collection:
The system has low physical memory. The memory size is detected by either the low memory notification from the operating system or low memory as indicated by the host.
The memory that's used by allocated objects on the managed heap surpasses an acceptable threshold. This threshold is continuously adjusted as the process runs.
The GC.Collect method is called. In almost all cases, you don't have to call this method because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
The first point means it depends on all processes running on the current machine, not only on your process. For the same reason you don't know when the GC will start, so you can't wait for that to happen.
For that same reason it cannot behave consistently in the way you describe, in relation to your process. Your process may do the same thing, but the OS as a whole is unlikely to ever do the same things during your process run. In one test run there was enough free memory across the whole system, and in another there was not.
What you can do is force the GC to run via GC.Collect (and its overloads). However, that's rarely a good idea.
The main thing to ask yourself is: does high memory consumption cause any problems? By itself it is not a problem (assuming no memory leaks): RAM is there to be used, not to just stay "free". If there is currently enough memory, the GC may rightfully decide not to waste time on a collection and do it later when necessary.
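For measurement purposes only, forcing a full collection between two phases might look like the sketch below. The `PhaseBoundary` class and its `_phaseData` field are hypothetical stand-ins for the memory held by the first import; the pattern itself (collect, wait for finalizers, collect again) uses only standard GC methods.

```csharp
using System;

public static class PhaseBoundary
{
    // Stand-in for whatever the first phase keeps alive while it runs.
    private static byte[] _phaseData;

    // Forces a full blocking collection, lets pending finalizers run, then
    // collects whatever those finalizers released. Returns the heap size afterwards.
    public static long CollectEverything()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return GC.GetTotalMemory(false);
    }

    // Returns { heap bytes while the data is held, heap bytes after release and collection }.
    public static long[] Measure()
    {
        _phaseData = new byte[50 * 1024 * 1024];
        long before = GC.GetTotalMemory(false);
        _phaseData = null; // the phase is over, nothing references the data any more
        long after = CollectEverything();
        return new[] { before, after };
    }

    public static void Main()
    {
        long[] result = Measure();
        Console.WriteLine("held: {0:N0} bytes, after collection: {1:N0} bytes", result[0], result[1]);
    }
}
```

This makes profiler graphs comparable between runs, but it does not lower peak usage inside a phase, and it only reclaims objects that are no longer referenced when the call is made.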

When GC.Collect() is called and frees up more than 3GB of space, is this necessarily a good thing?

I've read many questions on this network in regards to when one should use GC.Collect(), but none address this concern.
If calling it frees a large amount of memory, is that necessarily a good thing, compared to simply relying on the garbage collector to manage the memory?
Regardless of how long my ASP.NET application stays idle, how much time passes, or how many operations are done, the memory that gets freed when explicitly calling GC.Collect() is never reclaimed automatically (only memory allocated on top of it is, as if it were some kind of cache).
One theory I have in mind, is that perhaps the reason those 3GB+ are not freed (but rather the application frees memory above that threshold) might be due to reusing parts of it, or maybe some other helpful optimization, at the cost of, of course, using that memory throughout the run-time of the application.
Could anyone share some insight on these concerns? I'm following best practices and disposing objects whenever given the chance. No unmanaged resource is left to be finalized - it's always disposed by me.
EDIT: The Garbage Collector used is the workstation one, not the server.
The garbage collector's job is to simulate a machine with infinite memory. If your system has enough unallocated memory to continue functioning, why should the collector spend CPU cycles performing the hard work of identifying unreferenced objects? That's a performance cost with zero benefit.
If left alone, the collector will run and free those 3GB when the system starts to run low on memory. Until that point it is dormant.

Memory not freed by Application until GC is explicitly called

I am launching many threads simultaneously, each one writing data into and reading data from a queue. Data is dequeued progressively while it is processed and stored to the DB. For some reason the memory is not freed, although the queues are empty and I made sure all event subscriptions between the data reader and the data processor are removed at the end of the threads. The amount of squatted RAM is exactly the amount of data that is read by the binary readers and put into the queues.
In order to isolate the problem, I have bypassed the Processing and the DB Storing step.
Why would the RAM still be squatted long after all threads have finished, until I explicitly call GC.Collect() or terminate the program? I manually set the queue to null after it is emptied, and did the same for the binary reader that read the data. I thought that would be enough for the GC to wake up and do its housekeeping, at least after a few minutes.
EDIT :
The (reformulated after deletion) QUESTION :
In short, I was always told that the default GC behaviour manages memory properly and that I should almost never call the GC explicitly, but let the framework do the job.
I would like to know why, in this case, memory usage drops only when an explicit call is made to GC.Collect.
EDIT : Without GC Collect
With GC Collect (Called on a regular basis)
There are three conditions when a Garbage collection might occur (see MSDN):
1.) The system has low physical memory.
2.) The memory that is used by allocated objects on the managed heap surpasses an acceptable threshold. This means that a threshold of acceptable memory usage has been exceeded on the managed heap. This threshold is continuously adjusted as the process runs.
3.) The GC.Collect method is called. In almost all cases, you do not have to call this method, because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
From your description it sounds like the framework decided that neither 1.) nor 2.) is the case, hence it will only collect when you call GC.Collect().
Based on the screenshot, it appears that your memory usage never goes over 50% during the entire run of the application. It's much faster for the .NET framework to leave that memory alone than to stop your application (or at least take up a lot of CPU time in checking for liveness) in order to collect the garbage. If you cut your machine's RAM to 2 GB or so, I'm sure that the garbage collector would step up and keep memory usage within the hardware limits.
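The allocation threshold condition can be watched in isolation. In this sketch (the class, the `_sink` field, and the 1 KB allocation size are arbitrary choices for illustration) the program never calls GC.Collect, yet the gen 0 collection counter moves once enough has been allocated.

```csharp
using System;

public static class AllocationTrigger
{
    // Writing to a static field keeps the JIT from optimizing the allocation away.
    private static byte[] _sink;

    // Allocates short-lived arrays until the runtime performs a gen 0 collection
    // on its own. Returns how many allocations that took, or -1 if none was seen.
    public static long AllocateUntilCollected(long maxAllocations)
    {
        int start = GC.CollectionCount(0);
        for (long i = 1; i <= maxAllocations; i++)
        {
            _sink = new byte[1024];
            if (GC.CollectionCount(0) > start) return i;
        }
        return -1;
    }

    public static void Main()
    {
        long count = AllocateUntilCollected(10000000);
        Console.WriteLine("Gen 0 collection observed after " + count + " allocations");
    }
}
```

The number printed varies between machines and runs because the threshold is tuned dynamically. An application that has stopped allocating never crosses it, which matches memory staying flat until GC.Collect is called.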
It seems you are leaking objects. You should use WinDbg to create a dump, then use sos.dll to track the roots of your objects.
You can follow this explanation on how to track "roots" for your objects and see what is causing these leaks.

WCF Serialization and Caching

I have a WCF service hosted in a console application, and I have a ChannelFactory to call the WCF operation contracts.
Problem: whenever I call an operation that returns values, it seems that the value returned is cached somewhere by the service when it is serialized.
I am checking the service memory usage through the task manager under windows 7. When I call an operation that returns nothing, the memory is not increased, but when I call an operation that returns data, memory is increased and stays this way even after the data is returned to the client.
My guess is that this is a serialization caching issue?!?
It sounds more like the garbage collector hasn't run yet, and because of that the memory wasn't released. Moreover, when hosting a WCF service in a console application, the GC runs in workstation mode, which can be less effective in such a case.
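Which mode a host is actually running in can be checked at startup through GCSettings. A minimal sketch follows; the `GcModeInfo` class is an invented name, and the configuration element in the comment is the standard .NET Framework switch for server GC.

```csharp
using System;
using System.Runtime;

public static class GcModeInfo
{
    // Describes the GC flavor the current process is running with.
    public static string Describe()
    {
        return string.Format(
            "server GC: {0}, latency mode: {1}",
            GCSettings.IsServerGC,
            GCSettings.LatencyMode);
    }

    public static void Main()
    {
        // On .NET Framework, server GC is opted into in app.config:
        //   <configuration>
        //     <runtime>
        //       <gcServer enabled="true" />
        //     </runtime>
        //   </configuration>
        Console.WriteLine(Describe());
    }
}
```

Whether server GC actually helps depends on the workload and the number of cores, so treat the setting as something to measure rather than a guaranteed improvement.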
Memory will probably stay increased until the GC runs, which won't correspond to when data is returned to the client.
Have you tried adding a breakpoint or logging of some sort to your service method to make sure that the method is being called on each request? I don't think WCF does any caching on its own; at least, I've never had it do any in my applications that use it.
Edit:
Memory will remain in use until the garbage collector runs. If your process still has plenty of free space in its heap, then there really isn't a reason for the GC to run.
According to MSDN: http://msdn.microsoft.com/en-us/library/ee787088.aspx#conditions_for_a_garbage_collection
Garbage collection occurs when one of the following conditions is true:
The system has low physical memory.
The memory that is used by allocated objects on the managed heap surpasses an acceptable threshold. This means that a threshold of acceptable memory usage has been exceeded on the managed heap. This threshold is continuously adjusted as the process runs.
The GC.Collect method is called. In almost all cases, you do not have to call this method, because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
It is likely that you are allocating objects in the heap, but they are not being GC'd because the GC sees no reason to run (there is still open space in the heap generation, and no reason to spend time clearing it out).
However, if you can repeat your WCF call over and over and eventually get an Out of Memory exception, then that would indicate that you do indeed have a problem with references being held somewhere. In that case, I would use a memory profiler to determine what is being held onto, and by what.
Edit #2:
See also this thread: C# Thread not releasing memory
Use a tool like this to see what objects are being created/collected:
http://memprofiler.com/
I wouldn't worry too much if GC isn't being run after each call - this wouldn't be very efficient anyway - the garbage collector will run at a point when the .NET framework determines that objects are old enough to be collected. If memory starts to run low then this will happen more frequently.

Why does running out of memory depend on intermediate calls to GC.GetTotalMemory?

A memory intensive program that I wrote ran out of memory: threw an OutOfMemory exception. During attempts to reduce memory usage, I started calling GC.GetTotalMemory(true) (to write the total memory usage to debug file), which triggers a garbage collect.
For some reason, when calling this function I don't get an out of memory exception anymore. If I remove the calls again (keeping everything else the same), the exception gets thrown again. In my understanding, calls are automatically made to collect garbage when memory pressure increases, so I don't understand this behavior.
Can anyone explain why the out of memory exception is only thrown when there are no calls to GC.Collect?
Update:
I'm using VS 2010, but I'm downtargeting the application to framework 3.5. I believe that fragmentation is indeed causing my problems.
I did some tests: when the exception is thrown, a call to GC.GetTotalMemory tells me I am using ~800 * 10^6 bytes. However, Task Manager tells me the application is using 1700 MB. A rather large discrepancy. I'm now planning to allocate memory only once, and to never deallocate any large arrays but reuse them instead. Luckily, my program allows me to accomplish this without too much fuss.
I solved the problem by doing some smarter memory management. In particular by using a CustomList according to the suggestions on http://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/
Is your app running at full CPU? I'm pretty sure automatic garbage collection only occurs when the application is idle. Otherwise, you have to run a manual cycle.
I'm fairly sure that running out of memory does not force a garbage collection. That probably sounds incredibly unintuitive to you but I think this was done for a good reason. It prevents the program from entering a death-spiral where it constantly tries to find more space and getting all objects firmly lodged into gen #2. From which it is very hard to recover again.
The true argument you pass to GetTotalMemory() forces a full garbage collection. I would guess that this happens to free up enough space in the Large Object Heap to satisfy the memory allocation. This will of course work only once. If your program just keeps running, gobbling up memory beyond the 1.5 gigabytes or so that it has already consumed then OOM is just around the corner again. This time without any way to recover. Surviving an OOM requires drastic measures.
You'll need a good memory profiler to find out what's really going on. Unmanaged C++ in your project is always a fertile source of memory leaks. The unmanaged kind, always hard to trouble-shoot.
GC.Collect is merely a "suggestion" to free unused memory - it does not guarantee its release.
[Edit]
It appears that, while once true when I was learning the JVM years ago, this may not be the case in .NET anymore. The MSDN Library says that GC.Collect "Forces an immediate garbage collection of all generations." Good stuff (for me, anyway) about this here.
If you have unmanaged resources that occupy a lot of memory, the garbage collector won't really recognize that memory pressure. If you clean up those resources in finalizers, then forcing a collection will result in those unmanaged resources being freed, while if you don't force a collection, the garbage collector might not realize that it needs to be collecting.
If you are performing large unmanaged allocations, you can use GC.AddMemoryPressure to tell the GC that, so it can take it into account when deciding whether to run a collection or not.
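A sketch of how that pairing might look is given below. `NativeBuffer` is a hypothetical wrapper, not a type from the question; the point is only that every GC.AddMemoryPressure call is matched by a GC.RemoveMemoryPressure call with the same size when the native memory is released.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _pointer;
    private readonly long _size;

    public NativeBuffer(int size)
    {
        _size = size;
        _pointer = Marshal.AllocHGlobal(size);
        // Tell the GC that this small managed object represents _size unmanaged bytes.
        GC.AddMemoryPressure(_size);
    }

    public bool IsDisposed
    {
        get { return _pointer == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose is never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_pointer == IntPtr.Zero) return; // already released
        Marshal.FreeHGlobal(_pointer);
        _pointer = IntPtr.Zero;
        GC.RemoveMemoryPressure(_size);
    }
}
```

Typical use would be `using (var buffer = new NativeBuffer(64 * 1024 * 1024)) { ... }`, so the memory is released deterministically and the finalizer only acts as a fallback.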
I was browsing around the Microsoft Connect site and I am seeing bug reports where people are making the same claim you are: that an OutOfMemoryException occurs which can be resolved by periodically calling GC.Collect. I saw one report where the lead engineer from the garbage collector team responded and said a bug was fixed in .NET 4.0 that should resolve a fragmentation issue with the large object heap. That is why I asked what version you were using.
It is certainly possible that you have stumbled upon a bug in the garbage collector. As with all GC related issues this could be very version dependent.
My advice would be to:
make sure you have the latest patches and service pack
refactor the code so that it is not as memory intensive
reuse LOH objects as much as possible instead of creating new ones
continue using GC.Collect at strategic points if necessary as a workaround
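The advice about reusing LOH objects can be sketched as a buffer that is only reallocated when it is too small. `ReusableBuffer` is an invented helper for illustration, not the CustomList from the linked article.

```csharp
using System;

// Hypothetical helper that hands out the same large array again and again.
public sealed class ReusableBuffer
{
    private double[] _buffer = new double[0];

    // Number of times a new array actually had to be allocated.
    public int AllocationCount { get; private set; }

    // Returns an array with at least minLength elements. The array may be longer
    // than requested and may contain stale data from an earlier use.
    public double[] Rent(int minLength)
    {
        if (_buffer.Length < minLength)
        {
            _buffer = new double[minLength];
            AllocationCount++;
        }
        return _buffer;
    }
}
```

Callers have to track how many elements they actually use, since the returned array can be longer than requested. On .NET Framework 4.5.1 and later, GCSettings.LargeObjectHeapCompactionMode can additionally request a one-time compaction of the large object heap, which was not available on the 3.5 runtime discussed here.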
