C# GC.Collect() and Memory

I am receiving a very large list as a method argument and would like to remove it from memory after it is used. Normally I would let the GC do its thing, but I need to be very careful of memory usage in this app.
Will this code accomplish my goal? I've read a lot of differing opinions and am confused.
public void Save(IList<Employee> employees)
{
    // I've mapped the passed-in list
    var data = Mapper<Employee, EmployeeDTO>.MapList(employees);

    // ?????????????
    employees = null;
    GC.Collect();

    // Continues to process very long-running methods...
    // I don't want this large list to stay in memory
}
Maybe I should use another technique that I'm not aware of?

If the list is not used anymore the GC will automatically collect it when available memory is an issue.
However, if the caller uses the list after passing it to your function, the GC won't collect it even if you set it to null (all you have is your own reference to the list; you can't do anything about other objects that hold references to it as well).
Unless you have a measurable problem, don't try to outsmart the GC.
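To make that first point concrete, here is a hypothetical caller (the names are illustrative): setting the parameter to null inside Save only changes the method's own copy of the reference.

var allEmployees = LoadEmployees();   // hypothetical helper returning IList<Employee>
repository.Save(allEmployees);        // Save sets its own parameter to null internally
// allEmployees still references the list here, so the GC cannot collect it.
// It becomes eligible only once the caller stops referencing it, e.g.:
allEmployees = null;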

This is not a direct answer to the question, but some ideas on how to handle the low-memory situations the poster (@Big Daddy) referred to.
If you're running into low-memory situations on an x64 platform with 8 GB of memory, first determine whether your application is responsible. If it is, run a memory profiler (CLR Profiler, another tool, or even a full user dump analyzed with WinDbg) to see what is allocating the memory. It's possible that you have objects that are no longer used but are still referenced somewhere; this is not a true memory leak, but it is memory that never gets freed in your application. Most decent profilers will identify big objects (or objects with a lot of instances) along with their types.
I find it hard to believe that the list passed to this Save function would stress a server with 8 GB of memory, but we don't know how much free memory is available to the process or what kind of process it is (IIS, desktop, etc.).
If Save is called on several threads with huge inputs, it could potentially lead to memory stress, but even then it's not very likely, and I'd check various counters and profile data to see when and why memory stress happens.

Related

Does the Mono/.NET GC release freed memory back to the OS after collection? If not, why?

I have heard many times that once a C# managed program requests more memory from the OS, it doesn't give it back unless the system is out of memory. E.g. when an object is collected, it gets deleted and the memory the object occupied is free for reuse by another managed object, but the memory itself is not returned to the operating system (for example, Mono on Unix wouldn't call brk/sbrk to decrease the amount of virtual memory available to the process back to what it was before the allocation).
I don't know if this really happens or not, but I can see that my C# applications, running on Linux, use a small amount of memory at the beginning, then allocate more when I do something memory-expensive, but later, when all the objects have been collected (I can verify that by putting debug messages in destructors), the memory is not freed. On the other hand, no more memory is allocated when I run that memory-expensive operation again. The program just keeps using the same amount of memory until it is terminated.
Maybe this is just my misunderstanding of how the GC in .NET works, but if it really does work like this, why? What is the benefit of keeping the allocated memory for later instead of returning it to the system? How can the runtime even know whether the system needs it back or not? What about other applications that crash or can't start because of an OOM caused by this effect?
I know that people will probably answer something like "the GC manages memory better than you ever could, just don't worry about it" or "the GC knows what it's doing" or "it doesn't matter at all, it's just virtual memory", but it does matter: on my 2 GB laptop I run out of memory (and the kernel OOM killer kicks in) very often when running C# applications for a while, precisely because of this irresponsible memory management.
Note: I was testing all of this with Mono on Linux, because I have a hard time understanding how Windows manages memory, so debugging on Linux is much easier for me. Also, Linux memory management is open source code, while the memory management of the Windows kernel / .NET is rather a mystery to me.
The memory manager works this way because there is no benefit to having a lot of unused system memory when you don't need it.
If the memory manager always tried to keep as little memory allocated as possible, it would do a lot of work for no reason. It would only slow the application down, and the only benefit would be more free memory that no application is using.
Whenever the system needs more memory, it will tell the running applications to return as much as possible. The same signal is also sent to an application when you minimise it.
If this doesn't work the same with Mono in Linux, then that is a problem with that specific implementation.
Generally, if an app needs memory once, it will need it again. Releasing memory back to the OS only to request it again is overhead, and if nothing else wants the memory, why bother? The runtime is optimizing for the very likely scenario of wanting that memory again. Additionally, releasing it back requires entire/contiguous blocks that can be handed back, which has a very specific impact on things like compaction: it isn't as simple as "hey, I'm not using most of this: have it back". The runtime needs to figure out which blocks can be released, presumably after a full collect-and-compact (relocating objects, etc.) cycle.
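If you do need to nudge the runtime into handing segments back, a minimal sketch on the .NET side looks like the following (this assumes .NET Framework 4.5.1 or later for the LOH compaction setting; Mono's collector has its own heuristics and may behave differently):

using System;
using System.Runtime;

static class MemoryTrim
{
    public static void TryGiveMemoryBack()
    {
        // Ask the next blocking collection to also compact the large object heap.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
        // Even after this, the runtime decides if and when freed segments are
        // actually returned to the operating system.
    }
}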

Simple algorithm to determine when to free some memory in .NET

Our system keeps hold of lots of large objects for performance. However, when running low on memory, we want to drop some of them. The objects are prioritized, so I know which ones to drop. Is there a simple way of determining when to free memory? Also, dropping one object may not be enough, so I guess I need a loop: drop, check, drop again if necessary, etc. But in C#, I won't necessarily see the effect of dropping an object immediately, so how do I avoid kicking too much stuff out?
I guess it's just a simple function of used vs total physical & virtual memory. But what function?
Edit: Some clarifications
"Large objects" was misleading. I meant logical "package" of objects (the objects should be small enough individually to avoid the LOB - that's the intention certainly) that together are large (~ 100MB?)
A request can come in which requires the use of one such package. If it is in memory, the response is rapid. If not, it needs to be reconstructed, which is very slow. So I want to keep stuff in memory as long as possible, but can ditch the least requested ones when necessary.
We have no sensible way to serialize these packages. We should probably do that, but it's a lot of work and there's a lot of resistance to doing so.
Our original simple approach is to periodically compare the following to a configurable threshold.
// ComputerInfo lives in Microsoft.VisualBasic.Devices (reference Microsoft.VisualBasic.dll)
var c = new ComputerInfo();
// cast to double; otherwise the ulong division always yields 0
return (double)c.AvailablePhysicalMemory / c.TotalPhysicalMemory;
There are a lot of different topics in this question, and I think it's best to clarify them before actually answering.
First off, you say your app holds a lot of "large objects". Define large object: anything larger than about 85 KB goes into the LOH, which only gets collected as part of a generation 2 collection (the most expensive of them all); anything smaller than that, even if you think of it as a "big" object, is not, and is treated like any other kind of object.
Secondly, there are two problems in terms of "managing memory".
One is managing the amount of space you're using inside your virtual address space. That is, on 32-bit systems, making sure you can address all the memory you're asking for, which on 32-bit Windows is usually around 1.5 GB.
The other is managing when that memory gets released, which is part of the garbage collector's job, so it triggers when there's a shortage of memory (although that doesn't mean you can't get an OutOfMemoryException if you don't give the GC enough time to do its job).
With that said, I think you should forget about trying to take the GC's place: just let it do its job and, if you're worried, find the critical paths that may fail (on a memory request) and protect yourself against OutOfMemoryExceptions there.
There are a lot of different patterns for handling the case you're posting, and most of them really depend on your business scenario. One example is having a state machine that can go into an "OutOfMemory" state, in which case the system switches to freeing memory before doing anything else (that includes disposing of old objects and invoking the GC to clean everything up, while you patiently wait for it to happen).
Other techniques involve saving the data to disk and then manually swapping objects in and out based on some algorithm when you reach certain levels. That means stopping all your threads (or some of them, depending on the business) and moving the data back and forth.
If the creation of your large objects is centralized, you can also declare a facade over it, so that the facade can check whether it needs to free objects based on the amount of (virtual) memory your process is using. BTW, use the PerformanceInfo API call quoted in the other answer, as it includes the amount of memory used by unmanaged code, which is nonetheless located inside your process's virtual memory space.
Don't worry too much about "real" memory, as the operating system will make sure the most appropriate pages are located in memory.
Then there are hundreds of other optimizations that completely depend on your business scenario. For example, databases "know" how to bring data into memory depending on the query, predicting the data you're going to use in advance so it is ready, and they do remove objects that are not used... but that's another topic.
Edit: Based on your edits to the question.
Checking memory in the facade will not add significant overhead in terms of performance.
If you start getting low on memory, you should decide how many objects / how much space you are going to free. Don't do it one object at a time; take a bunch of them and free enough memory so that you don't have to collect again right away.
If you go with the previous approach, you can service the request after you've freed enough space and continue cleaning in the background.
One of the fastest ways of handling memory / disk swapping is by using memory-mapped files.
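A rough sketch of that idea, using System.IO.MemoryMappedFiles (packagePath, packageBytes and the map name are placeholders; this is one possible shape, not a full implementation):

using System.IO;
using System.IO.MemoryMappedFiles;

// Spill a "package" to a memory-mapped file so the managed copy can be dropped,
// then read it back on demand instead of reconstructing it.
static void SwapOutAndBack(string packagePath, byte[] packageBytes)
{
    using (var mmf = MemoryMappedFile.CreateFromFile(packagePath, FileMode.OpenOrCreate,
                                                     "package1", packageBytes.Length))
    using (var accessor = mmf.CreateViewAccessor())
    {
        accessor.WriteArray(0, packageBytes, 0, packageBytes.Length);   // swap out

        // ...later, on a cache miss, read it back instead of rebuilding it:
        var restored = new byte[packageBytes.Length];
        accessor.ReadArray(0, restored, 0, restored.Length);
    }
}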
Use GC.GetTotalMemory, and if it exceeds your expectation, you can null out the objects that you want to release and call GC.Collect.
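A minimal sketch of that suggestion; the threshold and the cachedPackages collection are placeholders you would adapt to your app:

// Hypothetical threshold; make it configurable in practice.
const long ThresholdBytes = 500L * 1024 * 1024;

if (GC.GetTotalMemory(forceFullCollection: false) > ThresholdBytes)
{
    cachedPackages.Clear();   // hypothetical collection holding the releasable objects
    cachedPackages = null;
    GC.Collect();
}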
Have a look at the accepted answer to this question. It uses the GetPerformanceInfo Windows API to determine memory consumption of all sorts. Task Manager is using the same information. This should help you writing a class that observes memory consumption periodically.
Once memory runs low you can fill a FIFO queue with soon-to-be deleted tasks.
The observer will delete the first object in the queue and maybe call GC.Collect manually; I'm not too sure about this.
Give the collection some time before you recheck the memory consumption of your application. If there is still not enough free memory, delete the next object from the queue, and so on...
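A sketch of that observer loop; enoughFreeMemory and evictionQueue are placeholders (the check could use GetPerformanceInfo, GC.GetTotalMemory or ComputerInfo, and the queue holds the lowest-priority packages):

using System;
using System.Collections.Generic;
using System.Threading;

static void TrimUntilHealthy(Queue<IDisposable> evictionQueue, Func<bool> enoughFreeMemory)
{
    while (!enoughFreeMemory() && evictionQueue.Count > 0)
    {
        evictionQueue.Dequeue().Dispose();       // drop the next candidate
        GC.Collect();                            // optional: force the space to be reclaimed now
        Thread.Sleep(TimeSpan.FromSeconds(1));   // give the collection time before rechecking
    }
}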

What is the best way to reduce Memory Leak?

At runtime I find that my application uses a very large amount of memory.
But it seems that I only use 3~4 MemoryStreams, one of which sometimes holds 81 MB.
The others are mainly 20 MB, 3 MB and 1 MB containers.
But there is still 525.xx MB of memory used by the application.
I tried using(...) statements as well, but without any luck.
So I am asking here for the most efficient way to cut down these memory leaks.
In managed .NET apps, you don't generally have memory "leaks" in the original sense of the word, unless you are allocating unmanaged resource handles and not disposing of them correctly. But that doesn't sound like what you're doing.
More likely is that you are holding on to references to objects that you no longer need, and this is keeping the memory "alive" longer than you are expecting.
For example, if you put 5 MB of data into a memory stream and then assign that memory stream to a static field, the 5 MB will never go away for the lifetime of the application. You need to assign null to the static field when you no longer need what it points to, so that the garbage collector can release and reclaim that 5 MB of memory.
Similarly, local variables are not released until the method exits. If you allocate a lot of memory, assign it to a local variable, and then call another method that runs for hours, the local variable is kept alive the whole time. If you don't need that memory anymore, assign null to the local variable.
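A small sketch of both cases described above (names are illustrative; assumes using System.IO):

static MemoryStream cachedStream;            // static root: lives for the whole app

void Process()
{
    var big = new MemoryStream(new byte[5 * 1024 * 1024]);
    cachedStream = big;                      // 5 MB now reachable for the app's lifetime

    var local = new byte[50 * 1024 * 1024];
    UseOnce(local);                          // hypothetical call
    local = null;                            // let the GC reclaim it before the long-running work
    RunForHours();                           // hypothetical long-running call

    cachedStream = null;                     // release the static root when no longer needed
}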
How are you determining that your app has a memory leak? If you are looking at the process virtual memory allocation shown by Task Manager, that is not very accurate. An application's memory manager may allocate large chunks of memory from the OS and internally free them for other uses within the application without releasing them back to the OS.
Use common sense practices. Call dispose or close as appropriate and assign null to variables as soon as you no longer need their contents.
Just because the garbage collected environment will let you be lazy doesn't mean you shouldn't pay attention to memory allocation and deallocation patterns in your code.
Your definition of memory leak seems to be unusual... The following code will produce exactly the effect you are observing, but it is rarely called a memory leak:
var data = new byte[512*1024*1024];
data = null;
But you may actually have legitimate leaks. Memory profilers will show them easily, and huge ones can also be tracked down by code review. If you already know that you have a small number of memory streams, check whether you are keeping them alive by storing them in some list or simply in a member variable. Also check whether your large arrays are staying alive for similar reasons.

C# .NET Garbage Collection not functioning?

I am working on a relatively large solution in Visual Studio 2010. It has various projects, one of them being an XNA Game-project, and another one being an ASP.NET MVC 2-project.
With both projects I am facing the same issue: after starting them in debug mode, memory usage keeps rising. They start at 40 MB and 100 MB respectively, but both climb to 1.5 GB relatively quickly (in about 10 and 30 minutes respectively). After that, memory sometimes drops back down close to the initial usage, and other times it just throws OutOfMemoryExceptions.
Of course this would indicate severe memory leaks, so that is where I initially tried to spot the problem. After searching for leaks unsuccessfully, I tried calling GC.Collect() regularly (about once every 10 seconds). After introducing this "hack", memory usage stayed at 45 MB and 120 MB respectively for 24 hours (until I stopped testing).
.NET's garbage collection is supposed to be "very good", but I can't help suspecting that it just isn't doing its job. I have used CLR Profiler in an attempt to troubleshoot the issue, and it showed that the XNA project was holding a lot of byte arrays I was indeed using, but to which the references should already have been dropped and which should therefore have been collected by the garbage collector.
Again, when I call GC.Collect() regularly, the memory usage issues seem to go away. Does anyone know what could be causing this high memory usage? Is it possibly related to running in Debug mode?
After searching for leaks unsuccessfully
Try harder =)
Memory leaks in a managed language can be tricky to track down. I have had good experiences with the Redgate ANTS Memory Profiler. It's not free, but they offer a 14-day, full-featured trial. It has a nice UI and shows you where your memory is allocated and why these objects are being kept in memory.
As Alex says, event handlers are a very common source of memory leaks in a .NET app. Consider this:
public static class SomeStaticClass
{
    public static event EventHandler SomeEvent;
}

private class Foo
{
    public Foo()
    {
        SomeStaticClass.SomeEvent += MyHandler;
    }

    private void MyHandler(object sender, EventArgs e) { /* whatever */ }
}
I used a static class to make the problem as obvious as possible here. Let's say that, during the life of your application, many Foo objects are created. Each Foo subscribes to the SomeEvent event of the static class.
The Foo objects may fall out of scope at one time or another, but the static class maintains a reference to each one via the event handler delegate. Thus, they are kept alive indefinitely. In this case, the event handler simply needs to be "unhooked".
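One common fix, sketched against the class above, is to unhook the handler when a Foo instance is no longer needed, for example by making Foo disposable (this is just one way to do it):

private class Foo : IDisposable
{
    public Foo()
    {
        SomeStaticClass.SomeEvent += MyHandler;
    }

    public void Dispose()
    {
        // Unhook so the static class no longer roots this instance.
        SomeStaticClass.SomeEvent -= MyHandler;
    }

    private void MyHandler(object sender, EventArgs e) { /* whatever */ }
}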
...the XNA project seemed to have saved a lot of byte arrays I was indeed using...
You may be running into fragmentation in the LOH. If you are allocating large objects very frequently they may be causing the problem. The total size of these objects may be much smaller than the total memory allocated to the runtime, but due to fragmentation there is a lot of unused memory allocated to your application.
The profiler I linked to above will tell you if this is a problem. If it is, you will likely be able to track it down to an object leak somewhere. I just fixed a problem in my app showing the same behavior and it was due to a MemoryStream not releasing its internal byte[] even after calling Dispose() on it. Wrapping the stream in a dummy stream and nulling it out fixed the problem.
Also, stating the obvious, make sure to Dispose() of your objects that implement IDisposable. There may be native resources lying around. Again, a good profiler will catch this.
My suggestion; it's not the GC, the problem is in your app. Use a profiler, get your app in a high memory consumption state, take a memory snapshot and start analyzing.
First and foremost, the GC works, and works well. There's no bug in it that you have just discovered.
Now that we've gotten that out of the way, some thoughts:
Are you using too many threads?
Remember that GC is non-deterministic; it'll run whenever it thinks it needs to run (even if you call GC.Collect()).
Are you sure all your references are going out of scope?
What are you loading into memory in the first place? Large images? Large text files?
Your profiler should tell you what's using so much memory. Start cutting at the biggest culprits as much as you can.
Also, calling GC.Collect() every X seconds is a bad idea and is unlikely to solve your real problem.
Analyzing memory issues in .NET is not a trivial task, and you should definitely read several good articles and try different tools to get a result. I ended up with the following article after my investigations: http://www.alexatnet.com/content/net-memory-management-and-garbage-collector You can also try reading some of Jeffrey Richter's articles, like this one: http://msdn.microsoft.com/en-us/magazine/bb985010.aspx
From my experience, the two most common causes of out-of-memory issues are:
Event handlers - they may keep an object alive even when no other objects reference it. So ideally you need to unsubscribe event handlers to let the object be destroyed automatically.
The finalizer thread is blocked by some other thread in STA mode. For example, when an STA thread does a lot of work, the other threads are stopped and the objects in the finalization queue cannot be destroyed.
Edit: Added link to Large Object Heap fragmentation.
Edit: Since it looks like it is a problem with allocating and throwing away the Textures, can you use Texture2D.SetData to reuse the large byte[]s?
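A hedged sketch of that idea, assuming XNA's Texture2D.SetData<T> overload: reuse one preallocated buffer instead of allocating a fresh byte[] per texture update. The buffer length must match the texture's size and surface format, and FillPixelData is a hypothetical helper.

byte[] sharedBuffer = new byte[textureSizeInBytes];   // allocate once, reuse

void UpdateTexture(Texture2D texture)
{
    FillPixelData(sharedBuffer);        // hypothetical: write the new pixel data in place
    texture.SetData(sharedBuffer);      // no new large array lands on the LOH
}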
First, you need to figure out whether it is managed or unmanaged memory that is leaking.
Use perfmon to watch your process's '.NET CLR Memory\# Bytes in all Heaps' and 'Process\Private Bytes' counters. Compare the numbers as memory rises. If the rise in Private Bytes outpaces the rise in heap memory, then it's unmanaged memory growth.
Unmanaged memory growth would point to objects that are not being disposed (but are eventually collected when their finalizer executes).
If it's managed memory growth, then we'll need to see which generation/LOH (there are also performance counters for each generation of heap bytes).
If it's Large Object Heap bytes, you'll want to reconsider the allocating and throwing away of large byte arrays. Perhaps the byte arrays can be re-used instead of discarded. Also, consider allocating large byte arrays in power-of-two sizes. That way, when one is discarded, it leaves a large "hole" in the large object heap that can be filled by another object of the same size.
A final concern is pinned memory, but I don't have any advice for you on this because I haven't ever messed with it.
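For reference, a sketch of reading the same counters from code (these are the Windows / .NET Framework counter names; the instance name must match your process):

using System;
using System.Diagnostics;

static void CompareManagedVsPrivateBytes()
{
    string instance = Process.GetCurrentProcess().ProcessName;
    using (var privateBytes = new PerformanceCounter("Process", "Private Bytes", instance))
    using (var managedHeaps = new PerformanceCounter(".NET CLR Memory", "# Bytes in all Heaps", instance))
    {
        float priv = privateBytes.NextValue();
        float managed = managedHeaps.NextValue();
        // If Private Bytes grows much faster than the managed heaps over time,
        // suspect unmanaged memory growth.
        Console.WriteLine("Private Bytes: {0:N0}  Bytes in all Heaps: {1:N0}", priv, managed);
    }
}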
I would also add that if you are doing any file access, make sure that you are closing and/or disposing of any Readers or Writers. You should have a matching 1-1 between opening any file and closing it.
Also, I usually use a using statement for resources, like a SqlConnection:
using (var connection = new SqlConnection())
{
    // Do SQL connection work in here.
}
Are you implementing IDisposable on any objects and possibly doing something custom that is causing any issues? I would double check all of your IDisposable code.
The GC doesn't take the unmanaged heap into account. If you are creating lots of objects that are merely C# wrappers around larger unmanaged memory, your memory is being devoured, but the GC can't make rational decisions based on this, as it only sees the managed heap.
You end up in a situation where the GC doesn't think you are short of memory, because most of the things on your gen 1 heap are 8-byte references, when in actual fact they are like icebergs at sea. Most of the memory is below!
You can make use of these GC calls:
GC.AddMemoryPressure(sizeOfField);
GC.RemoveMemoryPressure(sizeOfField);
These methods allow the garbage collector to see the unmanaged memory (if you provide it the right figures)
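A minimal sketch of the usual pattern: pair AddMemoryPressure/RemoveMemoryPressure with the lifetime of the wrapper that owns the unmanaged allocation (the class and its fields are illustrative).

using System;

sealed class UnmanagedBufferWrapper : IDisposable
{
    private readonly long size;

    public UnmanagedBufferWrapper(long unmanagedBytes)
    {
        size = unmanagedBytes;
        // Tell the GC this small managed object really "weighs" this much.
        GC.AddMemoryPressure(size);
        // ...allocate the unmanaged buffer here...
    }

    public void Dispose()
    {
        // ...free the unmanaged buffer here...
        GC.RemoveMemoryPressure(size);
    }
}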

Deallocate memory from large data structures in C#

I have a few SortedList<> and SortedDictionary<> structures in my simulation code and I add millions of items to them over time. The problem is that the garbage collector does not deallocate memory quickly enough, so there is a huge hit on the application's performance. My last option was to use the GC.Collect() method so that I can reclaim that memory back. Has anyone got a different idea? I am aware of the Flyweight pattern, which is another option, but I would appreciate other suggestions that would not require huge refactoring of my code.
You are fighting the "there's no free lunch" principle. You cannot assume that stuffing millions of items into a list isn't going to affect perf. Only the SortedList<> should be a problem: it is going to start allocating memory in the Large Object Heap. That allocation isn't going to be freed soon; it takes a gen #2 collection to chuck stuff out of the LOH again. This delay should not otherwise affect the perf of your program.
One thing you can do is avoid the multiple copies of the internal array that SortedList<> will jam into the LOH as it keeps growing. Try to guess a good value for Capacity so it pre-allocates the large array up front.
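A one-line sketch of that (assumes using System.Collections.Generic; the count and element types are placeholders for your own data):

// Pre-size the list so it allocates its backing arrays once instead of
// repeatedly copying ever larger arrays into the LOH.
var series = new SortedList<long, double>(5000000);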
Next, use Perfmon.exe or TaskMgr.exe and look at the page fault delta of your program. It should be quite busy while you're allocating. If you see low values (100 or less), then you might have a problem with the paging file being fragmented, a common scourge on older machines that run XP. Defragging the disk and using SysInternals' PageDefrag utility can do extraordinary wonders.
I think the SortedList uses an array as its backing field, which means that a large SortedList gets allocated on the Large Object Heap. The Large Object Heap can get fragmented, which can cause an out-of-memory exception even though, in principle, there is still enough memory available.
See this link.
This might be your problem, as intermediate calls to GC.Collect prevent the LOH from getting badly fragmented in some scenarios, which explains why calling it helps you reduce the problem.
The problem can be mitigated by splitting large objects into smaller fragments.
I'd start by doing some memory profiling of your application to make sure that the items you remove from those lists (which I assume is happening, from the way your post is written) are actually released properly and not hanging around somewhere.
What sort of performance hit are we talking about, and on what operating system? If I recall correctly, the GC runs when it's needed, not immediately or even "soon". So Task Manager showing a lot of memory allocated to your application is not necessarily a problem. What happens if you put the machine under higher load (e.g. run several copies of your application)? Does memory get reclaimed faster in that scenario, or do you start running out of memory?
I hope the answers to these questions will help point you in the right direction.
Well, if you keep all of the items in those structures, the GC will never collect them, because they are still referenced.
If you need the items in the structures to be collected, you must remove them from the data structure.
To clear the entire data structure, try using Clear() and setting the data structure reference to null. If the memory is still not getting collected fast enough, call GC.Collect().
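A minimal sketch of that last suggestion (the field name is a placeholder):

simulationResults.Clear();   // drop the items so nothing references them
simulationResults = null;    // drop the structure itself
GC.Collect();                // only if waiting for the next collection is measurably too slow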
