I'm writing a data structure and if I set <gcServer enabled="true" /> in my app.config file, the program adds 500,000 items in 200 milliseconds. If I set <gcServer enabled="false" /> it takes 300 milliseconds. That is, setting this flag to false makes it take 50% longer consistently, as measured by a Stopwatch.
I'm wondering why this is because I am not doing any garbage collecting. I know it is done automatically sometimes but after profiling with CLRProfiler, I can confirm 0 collections are occurring:
Does anyone know why this is happening? If the garbage collector isn't even running, then why is a server garbage collector so much faster? Here is the code where I am checking the speed differences:
Stopwatch sw = Stopwatch.StartNew();
foreach (string s in items)
{
dataStructure.Add(s, s + "a");
}
sw.Stop();
There are five things here I need to cover.
First is understanding this flag does not turn the garbage collector on or off, but rather merely determines which mode it uses. Both examples have garbage collection enabled; only the first (slower) example used server garbage collection.
Second is understanding just because no collections were performed, it does not mean the GC never stopped to check whether it needs to run a collection.
Third is this excerpt from the gcServer docs:
For single-processor computers, the default workstation garbage collection should be the fastest option.
Fourth is understanding "single processor" in the above context really does refer to processors, and not to cores.
Put all this together, and the behavior from the question exactly matches the documentation. Move along; nothing to see here.
Fifth, and finally, is understanding the values at play and how server loads differ from desktop loads. The primary concern in a server system is stability. This is true even over performance; performance doesn't matter if the system crashes all the time. Additionally, servers are more likely to run workloads where a process has a long lifetime... even months or years at a time without restarting. So a server admin may sacrifice some performance to get a GC that runs a little more often and runs a more complete check, which may include memory compaction to reclaim address space, and thus avoid issues with OutOfMemoryExceptions that can otherwise cause problems in long-lived .NET applications.
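To check the first two points on a given machine, a small probe along these lines may help. It is only a sketch: GcProbe and Measure are hypothetical names, not part of the question's code. It reports which GC mode the process actually ended up with and how many collections of each generation happened while the workload ran.

```csharp
using System;
using System.Diagnostics;
using System.Runtime;

static class GcProbe
{
    // Hypothetical helper: runs `work` and returns a one-line report of the GC mode,
    // the elapsed time and the per-generation collection counts seen during the run.
    public static string Measure(Action work)
    {
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);

        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();

        return string.Format(
            "serverGC={0} latencyMode={1} elapsedMs={2} gen0={3} gen1={4} gen2={5}",
            GCSettings.IsServerGC,
            GCSettings.LatencyMode,
            sw.ElapsedMilliseconds,
            GC.CollectionCount(0) - gen0,
            GC.CollectionCount(1) - gen1,
            GC.CollectionCount(2) - gen2);
    }
}
```

Wrapping the foreach loop from the question in GcProbe.Measure(() => { ... }) and printing the result once per gcServer setting would show whether the flag was really picked up and whether the collection counts stay at zero in both runs.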
Some background:
We are running some pipelines on a build server and they consume way too much memory. The pipeline does some DB imports and, over time, builds up memory several times greater than the total size of an exported DB. For the import, Entity Framework (Core) is used (in order to be able to reuse entity definitions used in other parts of the application).
Situation:
We are looking into where memory consumption can be reduced. Hence I was using the memory profiler.
I've noticed that sometimes the garbage collector does seem to free up memory after process X was done, and before process Y was started.
This is as expected. The 4GB memory build up is OK(ish), as long as it is released. The code that caused this consumption is running in its own Scope (speaking about dependency injection) and the DbContexts (and other things) used are registered as Scoped. Hence we have these ScopeWorkers.
await _scopeWorker.DoWork<MyProcessX>(_ => _.Import(cancellationToken));
// In some test, memory got freed up in between, but in some other test, memory never seemed to have dropped
await _scopeWorker.DoWork<MyProcessY>(_ => _.Import(cancellationToken));
But in some other test, this drop in memory was never seen.
The red arrow indicates approximately the same moment in time, after MyProcessX.Import, and a significant drop (of 4GBs) was never seen.
Of course I do not know whether the GC spread out the cleaning of this memory over a couple dozen collection moments, instead of 3, as seen in the first screenshot.
Questions
Is it possible to wait for the garbage collector to have collected basically all memory used by MyProcessX.Import before continuing with MyProcessY.Import?
Should the garbage collector behave consistently? In other words, should I see the same memory consumption graph over time when the process is repeated and is doing the exact same operations (so same data, as the data comes from a static source)?
If the garbage collector is inconsistent in its behavior, how can I make good use of the memory profiling feature in Visual Studio to spot opportunities for lowering memory usage?
EDIT
Yes the memory pressure on the system changes everything, as Evk pointed out. After reserving almost all physical memory on the system (31GB/32GB) and continuing the process which I was attempting to optimize memory usage I could see a definite drop in memory used. I could repeat this, as shown in the image there are actually 2 drops in memory.
The garbage collector starts a collection when one of the following conditions is true:
The system has low physical memory. The memory size is detected by
either the low memory notification from the operating system or low
memory as indicated by the host.
The memory that's used by allocated objects on the managed heap
surpasses an acceptable threshold. This threshold is continuously
adjusted as the process runs.
The GC.Collect method is called. In almost all cases, you don't have
to call this method because the garbage collector runs continuously.
This method is primarily used for unique situations and testing.
The first point means it depends on all processes running on the current machine, not only on your process. For the same reason you don't know when the GC will start, so you can't wait for that to happen.
For that same reason it cannot behave consistently in the way you describe, in relation to your process. Your process may do the same thing, but the OS as a whole is unlikely to ever do the same things during your process run. In one test run there was enough free memory across the whole system, and in another there was not.
What you can do is force GC to run via GC.Collect (and overloads). However that's rarely a good idea.
Main thing you should ask yourself is - does high memory consumption bring any problems? Because by itself it's not a problem (assuming no memory leaks) - you have RAM to be used, not to just stay "free". If there is enough memory currently - GC might rightfully decide to not waste time on garbage collection and do that later when necessary.
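If you do decide to force a collection between the two imports, a minimal sketch could look like the following. PhaseBoundary and CollectAndReport are hypothetical names, and the number returned covers only the managed heap as the GC reports it, so it will not necessarily match the process memory graph in the profiler.

```csharp
using System;

static class PhaseBoundary
{
    // Hypothetical helper: forces a full blocking collection and returns how many
    // bytes the managed heap shrank by (the value can be zero or even negative).
    public static long CollectAndReport()
    {
        long before = GC.GetTotalMemory(false);

        GC.Collect();                  // collect everything that is unreachable now
        GC.WaitForPendingFinalizers(); // let finalizers release what they hold
        GC.Collect();                  // collect objects freed by those finalizers

        long after = GC.GetTotalMemory(false);
        return before - after;
    }
}
```

Calling PhaseBoundary.CollectAndReport() between the two DoWork calls would at least tell you how much of the 4GB was actually garbage at that point. If the number stays small, something is still referencing the data and no amount of waiting for the GC would have helped.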
I have a high performance application that is handling a very large amount of data. It is receiving, analysing and discarding enormous amounts of information over very short periods of time. This causes a fair amount of object churn that I am currently trying to optimize, but it also causes a secondary problem. When Garbage Collection kicks in it can cause some long delays as it cleans things up (by long I mean 10s to 100s of milliseconds). 99% of the time this is acceptable, but for brief windows of time about 1-2 minutes long I need to be absolutely sure that Garbage Collection does not cause a delay. I know when these periods of time will occur beforehand and I just need a way to make sure that Garbage collection doesn't happen during this period. The application is written in C# using .NET 4.0 Framework and uses both managed and unmanaged code if that matters.
My questions are;
Is it possible to briefly pause Garbage Collection for the entire program?
Is it possible to use System.GC.Collect() to force garbage collection before the window I need free of Garbage Collection and if I do how long will I be Garbage Collection free?
What advice do people have on minimizing the need for Garbage Collection overall?
Note - this system is fairly complex with lots of different components. I am hoping to avoid going to a approach where I have to implement a custom IDisposable interface on every class of the program.
.NET 4.6 added two new methods: GC.TryStartNoGCRegion and GC.EndNoGCRegion just for this.
GCLatencyMode oldMode = GCSettings.LatencyMode;
// Make sure we can always reach the finally block,
// so we can set the latency mode back to `oldMode`
RuntimeHelpers.PrepareConstrainedRegions();
try
{
    GCSettings.LatencyMode = GCLatencyMode.LowLatency;
    // Generation 2 garbage collection is now
    // deferred, except in extremely low-memory situations
}
finally
{
    // ALWAYS set the latency mode back
    GCSettings.LatencyMode = oldMode;
}
That will allow you to disable the GC as much as you can. It won't do any large collections of objects until:
You call GC.Collect()
You set GCSettings.LatencyMode to something other than LowLatency
The OS sends a low-memory signal to the CLR
Please be careful when doing this, because memory usage can climb extremely fast while you're in that try block. If the GC is collecting, it's doing it for a reason, and you should only seriously consider this if you have a large amount of memory on your system.
In reference to question three, perhaps you can try reusing objects like byte arrays if you're receiving information through filesystem I/O or a network? If you're parsing that information into custom classes, try reusing those too, but I can't give too much good advice without knowing more about what exactly you're doing.
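As a rough illustration of the buffer reuse idea, here is a minimal pool sketch. BufferPool, Rent and Return are hypothetical names, and the fixed buffer size is an assumption; I don't know what your receive path looks like. On newer runtimes, System.Buffers.ArrayPool<T> offers a ready-made pool of this kind.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical fixed-size buffer pool: hands out byte arrays and takes them back,
// so steady-state traffic allocates (almost) nothing new for the GC to track.
sealed class BufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    public BufferPool(int bufferSize)
    {
        _bufferSize = bufferSize;
    }

    // Returns a pooled buffer if one is available, otherwise allocates a new one.
    public byte[] Rent()
    {
        byte[] buffer;
        return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    // Puts a buffer back for reuse; buffers of the wrong size are simply dropped.
    public void Return(byte[] buffer)
    {
        if (buffer != null && buffer.Length == _bufferSize)
        {
            _buffers.Add(buffer);
        }
    }
}
```

The caller has to stop using a buffer once it has been returned, and the contents are not cleared between uses, so this only fits code paths where the lifetime of each buffer is easy to see.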
Here are some MSDN articles that can help too:
Latency Modes
Constrained Execution Regions (this is why we call PrepareConstrainedRegions())
Note: GCSettings.LatencyMode = GCLatencyMode.LowLatency can only be set if GCSettings.IsServerGC == false. IsServerGC can be changed in App.config:
<runtime>
<gcServer enabled="false" />
</runtime>
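For the GC.TryStartNoGCRegion / GC.EndNoGCRegion pair mentioned at the top (.NET 4.6 and later, so not available on 4.0), a hedged sketch of how the calls might be wrapped follows. NoGcWindow and TryRun are hypothetical names, and the allocation budget is something you would have to estimate for your own window.

```csharp
using System;
using System.Runtime;

static class NoGcWindow
{
    // Hypothetical helper: runs `action` while asking the runtime not to collect,
    // as long as the action allocates less than `allocationBudgetBytes`.
    // Returns true if the no-GC region could be started, false if the action
    // ran without that protection.
    public static bool TryRun(long allocationBudgetBytes, Action action)
    {
        if (!GC.TryStartNoGCRegion(allocationBudgetBytes))
        {
            // The runtime could not reserve the requested memory up front.
            action();
            return false;
        }

        try
        {
            action();
            return true;
        }
        finally
        {
            // If the budget was exceeded, the runtime has already left the region
            // and EndNoGCRegion would throw, so check the mode first.
            if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
            {
                GC.EndNoGCRegion();
            }
        }
    }
}
```

The budget is limited by the size of an ephemeral segment, so this suits short, bounded bursts rather than a window of a minute or two with heavy allocation. For the latter, the latency mode approach above is probably the more realistic option.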
In the section "Forcing a Garbage Collection" of the book "C# 2010 and the .NET 4 Platform", Andrew Troelsen writes:
"Again, the whole purpose of the .NET garbage collector is to manage memory on our behalf. However, in some very rare circumstances, it may be beneficial to programmatically force a garbage collection using GC.Collect(). Specifically:
• Your application is about to enter into a block of code that you don’t want interrupted by a possible garbage collection.
...
"
But stop! Is there really a case where garbage collection is undesirable? I have never seen or read anything like that (because of my limited development experience, of course). If you have done something like this in practice, please share. I find this point very interesting.
Thank you!
Yes, there's absolutely a case when garbage collection is undesirable: when a user is waiting for something to happen, and they have to wait longer because the code can't proceed until garbage collection has completed.
That's Troelsen's point: if you have a specific point where you know a GC isn't problematic and is likely to be able to collect significant amounts of garbage then it may be a good idea to provoke it then, to avoid it triggering at a less opportune moment.
I run a recipe related website, and I store a massive graph of recipes and their ingredient usage in memory. Due to the way I pivot this information for quick access, I have to load several gigs of data into memory when the application loads before I can organize the data into a very optimized graph. I create a huge amount of tiny objects on the heap that, once the graph is built, become unreachable.
This is all done when the web application loads, and probably takes 4-5 seconds to do. After I do so, I call GC.Collect(); because I'd rather re-claim all that memory now rather than potentially block all threads during an incoming HTTP request while the garbage collector is freaking out cleaning up all these short lived objects. I also figure it's better to clean up now since the heap is probably less fragmented at this time, since my app hasn't really done anything else so far. Delaying this might result in many more objects being created, and the heap needing to be compressed more when GC runs automatically.
Other than that, in my 12 years of .NET programming, I've never come across a situation where I wanted to force the garbage collector to run.
The recommendation is that you should not explicitly call Collect in your code. Can you find circumstances where it's useful?
Others have detailed some, and there are no doubt more. The first thing to understand, though, is: don't do it. It's a last resort. Investigate other options, learn how the GC works, look at how your code is impacted, and follow best practices for your designs.
Calling Collect at the wrong point will make your performance worse. Worse still, relying on it makes your code very fragile. The rare conditions required to make a call to Collect beneficial, or at least not harmful, can be utterly undone by a simple change to the code, which will result in unexpected OOMs, sluggish performance and such.
I call it before performance measurements so that the GC doesn't falsify the results.
Another situation is unit tests that check for memory leaks:
object doesItLeak = /*...*/; // The object you want to have tested
WeakReference reference = new WeakReference(doesItLeak);
doesItLeak = null; // Drop the strong reference so only the weak one remains
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
Assert.That(!reference.IsAlive);
Besides those, I did not encounter a situation in which it would actually be helpful.
Especially in production code, GC.Collect should never be found IMHO.
It would be very rare, but GC can be a moderately expensive process, so if there's a particular section that's timing sensitive, you don't want that section interrupted by GC.
Your application is about to enter into a block of code that you don’t
want interrupted by a possible garbage collection. ...
A very suspect argument (that is nevertheless used a lot).
Windows is not a Real Time OS. Your code (Thread/Process) can always be pre-empted by the OS scheduler. You do not have a guaranteed access to the CPU.
So it boils down to: how does the time for a GC-run compare to a time-slot (~ 20 ms) ?
There is very little hard data available about that, I searched a few times.
From my own observation (very informal), a gen-0 collection is < 40 ms, usually a lot less. A full gen-2 can run into ~100 ms, probably more.
So the 'risk' of being interrupted by the GC is of the same order of magnitude as being swapped out for another process. And you can't control the latter.
I know there's tons of threads about this. And I read a few of them.
I'm wondering if in my case it is correct to GC.Collect();
I have a server for an MMORPG; in production it is online day and night. The server is restarted every other day to deploy changes to the production codebase. Every twenty minutes the server pauses all other threads and serializes the current game state. This usually takes 0.5 to 4 seconds.
Would it be a good idea to GC.Collect(); after serialization?
The server is, obviously, constantly creating and destroying game items.
Would I see a noticeable gain in performance or memory usage?
Should I not manually collect?
I've read about how collecting can be bad if used in the wrong moments or too frequently, but I'm thinking these saves are both a good moment to collect, and not that frequent.
The server is in framework 4.0
Update in answer to a comment:
We are randomly experiencing server freezes. Sometimes, unexpectedly, the server's memory usage rises steadily until it reaches a point where the server takes way too long to handle any network operation. Thus, I'm considering a lot of different approaches to solve the issue, and this is one of them.
The garbage collector knows best when to run, and you shouldn't force it.
It will not improve performance or memory usage. The CLR can tell the GC to collect objects that are no longer used whenever there is a need to do that.
Answer to an updated part:
Forcing the collection is not a good solution to the problem. You should rather look a bit deeper into your code to find out what is wrong. If memory usage grows unexpectedly, you might have an issue with unmanaged resources that are not properly handled, or even "leaky code" within managed code.
One more thing: I would be surprised if calling GC.Collect fixed the problem.
Every twenty minutes the server pauses
all other threads, and serializes the
current game state. This usually takes
0.5 to 4 seconds
If all your threads are suspended already anyway you might as well call the garbage collection, since it should be fairly fast at this point. I suspect doing this will only mask your real problem though, not actually solve it.
We are randomly experiencing server
freezes, sometimes, unexpectedly, the
server memory usage will raise
increasingly until it reaches a point
when the server takes way too long to
handle any network operation. Thus,
I'm considering a lot of different
approaches to solve the issue, this is
one of them.
This sounds more like you actually are still referencing all these objects that use the memory - if you weren't the GC would run due to the memory pressure and try to release those objects. You might be looking at an actual bug in your production code (i.e. objects that are still subscribed to events or otherwise are being referenced when they shouldn't be) rather than something you can fix by manually taking out the garbage.
If possible in this scenario you should run a performance analysis to see where your bottlenecks are and what part of your code is causing the brunt of the memory allocations.
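To make the event subscription case concrete, here is a small self-contained sketch. Publisher, Subscriber and EventLeakDemo are hypothetical types, not taken from your server. It checks with a WeakReference whether an object survives a forced collection while it is still subscribed to a long-lived publisher's event.

```csharp
using System;
using System.Runtime.CompilerServices;

class Publisher
{
    public event EventHandler Tick;
}

class Subscriber
{
    public void OnTick(object sender, EventArgs e) { }
}

static class EventLeakDemo
{
    // Creates the subscriber in a separate, non-inlined method so that no local
    // variable in the caller keeps it alive; only the event subscription can.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Create(Publisher publisher, bool unsubscribe)
    {
        var subscriber = new Subscriber();
        publisher.Tick += subscriber.OnTick;
        if (unsubscribe)
        {
            publisher.Tick -= subscriber.OnTick;
        }
        return new WeakReference(subscriber);
    }

    // Returns true if the subscriber is still reachable after a full collection.
    public static bool IsStillAlive(Publisher publisher, bool unsubscribe)
    {
        WeakReference reference = Create(publisher, unsubscribe);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        bool alive = reference.IsAlive;
        GC.KeepAlive(publisher); // the publisher itself must outlive the check
        return alive;
    }
}
```

With unsubscribe set to false, the publisher's delegate list still references the subscriber, so it stays alive for as long as the publisher does, however often the GC runs. With unsubscribe set to true, the subscriber would normally become collectable. In a game server the publisher is often something that lives for the whole process, which is how such objects pile up.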
Could the memory increase be an "attack" by a player with a fake/modified game-client? Is a lot of memory allocated by the server when it accepts a new client connection? Does the server handle bogus incoming data well?
I have a C# windows service acting as a server, the service holds some large (>8Gb) data structures in memory and exposes search methods to clients via remoting.
The avg search operation is executed in <200ms and the service handles up to 20 request/sec.
I'm noticing some serious performance degradation (>6000ms) on a regular basis, lasting a few seconds each time.
My best guess is that the server threads are stopped by a gen2 garbage collection from time to time.
I'm considering switching from server GC to workstation GC and wrapping my search method in this to prevent GC during requests.
static protected void DoLowLatencyAction(Action action)
{
GCLatencyMode oldMode = GCSettings.LatencyMode;
try
{
GCSettings.LatencyMode = GCLatencyMode.LowLatency;
// perform time-sensitive actions here
action();
}
finally
{
GCSettings.LatencyMode = oldMode;
}
}
Is this a good idea?
Under what conditions will a GC be performed anyway inside the low latency block?
Note: I'm running on an x64 server with 8 cores
Thanks
I've not used GCLatencyMode before so cannot comment on whether using it is a good idea or not.
However, are you sure you are using server GC? By default, Windows services use workstation GC.
I've had a similar problem before in a Windows service, and setting server GC mode using:
<configuration>
<runtime>
<gcServer enabled="true" />
</runtime>
</configuration>
in the service's app.config file solved it.
Have a read of this post from Tess Ferrandez's blog for more details.
I am surprised that your search method even triggers a GC. If the 8GB data structure is static (not modified a lot by adding to or removing from it), and all you do is search it, then you should try to avoid allocating temporary objects within your search method (if you can). Creating objects is what triggers a GC, and if you have a data structure that is rarely modified, then it makes sense to avoid the GC altogether (or delay it as much as possible).
What GCLatencyMode.LowLatency does is give the GC a hint to do eager collections so that Gen2 collections can be avoided while the low latency mode is set. As a side effect, the GC becomes eager to collect outside the region where the low latency mode is set, which could result in lower performance (in that case you can try server GC mode).
This doesn't sound like a great idea. At some point the GC will ignore your hint and do the collection anyway.
I believe you will have much better success by actually profiling and optimizing your service. Run PerfView (free, awesome, Microsoft tool) on your server while it is running. See who owns troublesome objects and how long specific long running GC events are taking.