C# Too Much Memory Usage

I have a process with multiple steps defined (let's say a generic implementation of the strategy pattern), where all steps pass a common ProcessParameter object around and read/write to it.
This ProcessParameter is an object having many arrays and collections. Example:
class ProcessParameter {
public List<int> NumbersAllStepsNeed {get; set;}
public List<int> OhterNumbersAllStepsNeed {get; set;}
public List<double> SomeOtherData {get; set;}
public List<string> NeedThisToo {get; set;}
...
}
Once the steps have finished, I'd like to make sure the memory is freed and doesn't hang around, because this can have a big memory footprint and other processes need to run too.
Do I do that by running:
pParams.NumbersAllStepsNeed = null;
pParams.OhterNumbersAllStepsNeed = null;
pParams.SomeOtherData = null;
...
or should ProcessParameter implement IDisposable, with the Dispose method doing that, so I just need to call pParams.Dispose() (or wrap it in a using block)?
What is the best and most elegant way to clean up the memory footprint of the data used by one run of the process?
Does having arrays instead of lists change anything? Or a mix of both?
The actual parameter types I need are collections/arrays of custom objects.
Am I looking in the right direction?
UPDATE
Great questions! Thanks for the comments!
I used to have this process running as a single run, and I could see memory usage go very high and then gradually come back down to "normal".
The problem came when I started chaining these processes on top of each other with different starting parameters. That is when memory went unreasonably high, so I want to include a cleaning step between two processes and am looking for the best way to do that.
There is a DB; this params object is a sort of "cache" to speed things up.
Good point on IDisposable; I do not keep unmanaged resources in the params object.

Whilst using the Dispose pattern is a good idea, I don't think it will give you any extra benefit in terms of freeing up memory.
Two things that might:
Call GC.Collect()
However, I really wouldn't bother (unless perhaps you are getting out of memory exceptions). Calling GC.Collect() explicitly may hurt performance, and the garbage collector really does do a good job on its own. (But see LOH - below.)
Be aware of the Large Object Heap (LOH)
You mentioned that it uses a "big memory footprint". Be aware that any single memory allocation for 85,000 bytes or above comes from the large object heap (LOH). The LOH doesn't get compacted like the small object heap. This can lead to the LOH becoming fragmented and can result in out of memory errors even when you have plenty of available memory.
When might you stray into the LOH? Any memory allocation of 85,000 bytes or more, so on a 64-bit system that would be any array of 8-byte elements (or a list or dictionary backed by one) with 10,625 elements or more, image manipulation, large strings, etc.
Three strategies to help minimise fragmentation of the LOH:
i. Redesign to avoid it. Not always practical, but a list of lists or a dictionary of dictionaries might avoid the limit. This can make the implementation more complex, so I wouldn't do it unless you really need to, but on the plus side it can be very effective.
ii. Use fixed sizes. If all or most of your memory allocations in the LOH are the same size, this will help minimise fragmentation. For example, for dictionaries and lists, set the capacity (which sets the size of the internal array) to the largest size you are likely to use (see the sketch after these strategies). Not so practical if you are doing image manipulation.
iii. Force the garbage collector to compact the LOH:
System.Runtime.GCSettings.LargeObjectHeapCompactionMode = System.Runtime.GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
You do need to be using .NET Framework 4.5.1 or later to use that.
This is probably the simplest approach. In my own applications I have a couple of instances where I know I will be straying into the LOH and where fragmentation can be an issue, so I set
System.Runtime.GCSettings.LargeObjectHeapCompactionMode = System.Runtime.GCLargeObjectHeapCompactionMode.CompactOnce;
as standard in the destructor - but only call GC.Collect() explicitly if I get an out of memory exception when allocating.
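To illustrate strategy ii, here is a minimal sketch of pre-sizing a collection to a fixed capacity; the class name and the 100,000-element bound are illustrative assumptions, not part of the original code:
using System.Collections.Generic;

static class LohFriendlyBuffers
{
    // Assumed upper bound for this sketch - pick the largest size you realistically expect.
    private const int MaxExpectedItems = 100000;

    public static List<int> CreateNumberBuffer()
    {
        // The internal array is allocated once at full size (100,000 * 4 bytes lands on the LOH),
        // and every call allocates a block of exactly the same size, which minimises fragmentation.
        return new List<int>(MaxExpectedItems);
    }
}
Calling Clear() on such a list empties it without shrinking the internal array, so the same LOH block can be reused between runs rather than reallocated.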
Hope this helps.

Related

Avoiding memory leaks using ArrayPool within a class

I have a class which I am currently using to build byte arrays (for network packets). It starts with some initial buffer and resizes as more data is written to it. I want to reduce the amount of GC pressure caused by this utility by using an ArrayPool, but am having a hard time figuring out a good way to prevent memory leaks without too much overhead.
My main idea was to make my class IDisposable and return the array back to the pool in the Dispose() call.
using ByteBuilder builder = new ByteBuilder();
builder.AddData(); // Add a bunch of data
// write data to network stream
...
// builder is disposed and memory is returned to the ArrayPool at the end of the method call
The problem here is that if it happens to be used without the using declaration, the memory would never be returned. Is there some way I can guarantee how someone uses it?
Another idea I had was to use a finalizer to return the memory to the array pool, but it seems that this has significant overhead from what I have read. I am allocating a lot of arrays, and many of them could be small, so having a finalizer seems like it may not be worth the trade off.
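For what it's worth, here is a rough sketch of the Dispose-based idea described above, assuming an ArrayPool<byte>-backed buffer; the AddData signature and all the internals are guesses - only the ByteBuilder name and the using pattern come from the question:
using System;
using System.Buffers;

public sealed class ByteBuilder : IDisposable
{
    private byte[] _buffer = ArrayPool<byte>.Shared.Rent(256); // initial rented buffer
    private int _length;

    public void AddData(ReadOnlySpan<byte> data)
    {
        if (_length + data.Length > _buffer.Length)
        {
            // Rent a bigger array, copy the existing data across, return the old one.
            byte[] bigger = ArrayPool<byte>.Shared.Rent(Math.Max(_buffer.Length * 2, _length + data.Length));
            Array.Copy(_buffer, bigger, _length);
            ArrayPool<byte>.Shared.Return(_buffer);
            _buffer = bigger;
        }
        data.CopyTo(_buffer.AsSpan(_length));
        _length += data.Length;
    }

    public ReadOnlyMemory<byte> WrittenData => _buffer.AsMemory(0, _length);

    public void Dispose()
    {
        // Hand the array back to the pool; the builder must not be used after this.
        if (_buffer.Length > 0)
        {
            ArrayPool<byte>.Shared.Return(_buffer);
            _buffer = Array.Empty<byte>();
            _length = 0;
        }
    }
}
Note that nothing forces callers to dispose this; if they forget, the rented array is simply never returned and the pool hands out fresh arrays on demand, so the cost is extra allocations rather than a classic leak.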

Is it correct to use GC.Collect(); GC.WaitForPendingFinalizers();?

I've started to review some code in a project and found something like this:
GC.Collect();
GC.WaitForPendingFinalizers();
Those lines usually appear in methods that are conceived to destroy the object under the rationale of increasing efficiency. I've made these remarks:
Calling garbage collection explicitly on the destruction of every object decreases performance, because doing so does not take into account whether it is actually necessary for CLR performance.
Calling those instructions in that order causes every object to be destroyed only if other objects are being finalized. Therefore, an object that could be destroyed independently has to wait for another object's destruction without a real necessity.
It can generate a deadlock (see: this question)
Are 1, 2 and 3 true? Can you give some reference supporting your answers?
Although I'm almost sure about my remarks, I need to be clear in my arguments in order to explain to my team why this is a problem. That's the reason I'm asking for confirmation and references.
The short answer is: take it out. That code will almost never improve performance, or long-term memory use.
All your points are true. (It can generate a deadlock; that does not mean it always will.) Calling GC.Collect() will collect the memory of all GC generations. This does two things.
It collects across all generations every time - instead of what the GC will do by default, which is to only collect a generation when it is full. Typical use will see Gen0 collecting (roughly) ten times as often as Gen1, which in turn collects (roughly) ten times as often as Gen2. This code will collect all generations every time. Gen0 collection is typically sub-100ms; Gen2 can be much longer.
It promotes non-collectable objects to the next generation. That is, every time you force a collection and you still have a reference to some object, that object will be promoted to the subsequent generation. Typically this will happen relatively rarely, but code such as the below will force this far more often:
void SomeMethod()
{
object o1 = new Object();
object o2 = new Object();
o1.ToString();
GC.Collect(); // this forces o2 into Gen1, because it's still referenced
o2.ToString();
}
Without a GC.Collect(), both of these items will be collected at the next opportunity. With the collection as written, o2 will end up in Gen1 - which means an automated Gen0 collection won't release that memory.
It's also worth noting an even bigger horror: in DEBUG mode, the GC functions differently and won't reclaim any variable that is still in scope (even if it's not used later in the current method). So in DEBUG mode, the code above wouldn't even collect o1 when calling GC.Collect, and so both o1 and o2 will be promoted. This could lead to some very erratic and unexpected memory usage when debugging code. (Articles such as this highlight this behaviour.)
EDIT: Having just tested this behaviour, some real irony: if you have a method something like this:
void CleanUp(Thing someObject)
{
someObject.TidyUp();
someObject = null;
GC.Collect();
GC.WaitForPendingFinalizers();
}
... then it will explicitly NOT release the memory of someObject, even in RELEASE mode: it'll promote it into the next GC generation.
There is a point one can make that is very easy to understand: Having GC run automatically cleans up many objects per run (say, 10000). Calling it after every destruction cleans up about one object per run.
Because GC has high overhead (needs to stop and start threads, needs to scan all objects alive) batching calls is highly preferable.
Also, what good could come out of cleaning up after every object? How could this be more efficient than batching?
Your point number 3 is technically correct, but can only happen if someone locks during a finaliser.
Even without this sort of call, locking inside a finaliser is even worse than what you have here.
There are a handful of times when calling GC.Collect() really does help performance.
So far I've done so 2, maybe 3 times in my career. (Or maybe about 5 or 6 times if you include those where I did it, measured the results, and then took it out again - and this is something you should always measure after doing).
In cases where you're churning through hundreds or thousands of megs of memory in a short period of time, and then switching over to much less intensive use of memory for a long period of time, it can be a massive or even vital improvement to explicitly collect. Is that what's happening here?
Anywhere else, they're at best going to make it slower and use more memory.
See my other answer here:
To GC.Collect or not?
Two things can happen when you call GC.Collect() yourself: you end up spending more time doing collections (because the normal background collections will still happen in addition to your manual GC.Collect()), and you'll hang on to the memory longer (because you forced some things into a higher-order generation that didn't need to go there). In other words, using GC.Collect() yourself is almost always a bad idea.
About the only time you ever want to call GC.Collect() yourself is when you have specific information about your program that is hard for the Garbage Collector to know. The canonical example is a long-running program with distinct busy and light load cycles. You may want to force a collection near the end of a period of light load, ahead of a busy cycle, to make sure resources are as free as possible for the busy cycle. But even here, you might find you do better by re-thinking how your app is built (ie, would a scheduled task work better?).
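If you do find yourself in that situation, a minimal sketch of such a scheduled collection might look like the following; the method name and the decision to compact the LOH are assumptions for illustration, not a recommendation to copy:
using System;
using System.Runtime;

static class IdleCleanup
{
    // Call this once, at the end of a known light-load period, just before a busy cycle.
    public static void CollectBeforeBusyCycle()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce; // .NET 4.5.1+
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true);
        GC.WaitForPendingFinalizers();
    }
}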
We have run into similar problems to @Grzenio; however, we are working with much larger 2-dimensional arrays, on the order of 1000x1000 to 3000x3000, in a web service.
Adding more memory isn't always the right answer; you have to understand your code and the use case. With explicit GC collection we require 16-32 GB of memory (depending on customer size). Without it we would require 32-64 GB of memory, and even then there are no guarantees the system won't suffer. The .NET garbage collector is not perfect.
Our web service has an in-memory cache on the order of 5-50 million strings (~80-140 characters per key/value pair depending on configuration). In addition, with each client request we would construct two matrices, one of double and one of boolean, which were then passed to another service to do the work. For a 1000x1000 "matrix" (2-dimensional array) this is ~25 MB per request. The boolean matrix says which elements we need (based on our cache). Each cache entry represents one "cell" in the "matrix".
The cache performance dramatically degrades when the server has > 80% memory utilization due to paging.
What we found is that unless we explicitly called GC, the .NET garbage collector would never 'clean up' the transitory variables until we were in the 90-95% range, by which point the cache performance had drastically degraded.
Since the downstream process often took a long duration (3-900 seconds), the performance hit of a GC collection was negligible (3-10 seconds per collect). We initiated this collect after we had already returned the response to the client.
Ultimately we made the GC parameters configurable; also, with .NET 4.6 there are further options. Here is the .NET 4.5 code we used.
if (sinceLastGC.Minutes > Service.g_GCMinutes)
{
Service.g_LastGCTime = DateTime.Now;
var sw = Stopwatch.StartNew();
long memBefore = System.GC.GetTotalMemory(false);
context.Response.Flush();
context.ApplicationInstance.CompleteRequest();
System.GC.Collect( Service.g_GCGeneration, Service.g_GCForced ? System.GCCollectionMode.Forced : System.GCCollectionMode.Optimized);
System.GC.WaitForPendingFinalizers();
long memAfter = System.GC.GetTotalMemory(true);
var elapsed = sw.ElapsedMilliseconds;
Log.Info(string.Format("GC starts with {0} bytes, ends with {1} bytes, GC time {2} (ms)", memBefore, memAfter, elapsed));
}
After rewriting for use with .NET 4.6 we split the garbage collection into 2 steps - a simple collect and a compacting collect.
public static void RunGC(GCParameters param = null)
{
lock (GCLock)
{
var theParams = param ?? GCParams;
var sw = Stopwatch.StartNew();
var timestamp = DateTime.Now;
long memBefore = GC.GetTotalMemory(false);
GC.Collect(theParams.Generation, theParams.Mode, theParams.Blocking, theParams.Compacting);
GC.WaitForPendingFinalizers();
//GC.Collect(); // may need to collect dead objects created by the finalizers
var elapsed = sw.ElapsedMilliseconds;
long memAfter = GC.GetTotalMemory(true);
Log.Info($"GC starts with {memBefore} bytes, ends with {memAfter} bytes, GC time {elapsed} (ms)");
}
}
// https://msdn.microsoft.com/en-us/library/system.runtime.gcsettings.largeobjectheapcompactionmode.aspx
public static void RunCompactingGC()
{
lock (CompactingGCLock)
{
var sw = Stopwatch.StartNew();
var timestamp = DateTime.Now;
long memBefore = GC.GetTotalMemory(false);
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
var elapsed = sw.ElapsedMilliseconds;
long memAfter = GC.GetTotalMemory(true);
Log.Info($"Compacting GC starts with {memBefore} bytes, ends with {memAfter} bytes, GC time {elapsed} (ms)");
}
}
Hope this helps someone else as we spent a lot of time researching this.
[Edit] Following up on this, we have found some additional problems with the large matrices. We have started encountering heavy memory pressure and the application suddenly being unable to allocate the arrays, even if the process/server has plenty of memory (24 GB free). Upon deeper investigation we discovered that the process had standby memory that was almost 100% of the "in use" memory (24 GB in use, 24 GB standby, 1 GB free). When the "free" memory hit 0 the application would pause for 10+ seconds while standby was reallocated as free, and then it could start responding to requests.
Based on our research this appears to be due to fragmentation of the large object heap.
To address this concern we are taking 2 approaches:
We are going to change to jagged arrays instead of multi-dimensional arrays. This will reduce the amount of contiguous memory required, and ideally keep more of these arrays out of the Large Object Heap (see the sketch below).
We are going to implement the arrays using the ArrayPool class.
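A rough sketch of what those two changes could look like together; the sizes and helper names are illustrative assumptions (note that ArrayPool<T>.Shared.Rent may hand back rows longer than requested):
using System.Buffers;

static class MatrixBuffers
{
    // Jagged layout: each row of a 1000-column double matrix is ~8 KB,
    // well under the 85,000-byte LOH threshold, unlike a single 1000x1000 block (~8 MB).
    public static double[][] RentMatrix(int rows, int cols)
    {
        var matrix = new double[rows][];
        for (int i = 0; i < rows; i++)
            matrix[i] = ArrayPool<double>.Shared.Rent(cols); // pooled rows get reused across requests
        return matrix;
    }

    public static void ReturnMatrix(double[][] matrix)
    {
        foreach (var row in matrix)
            ArrayPool<double>.Shared.Return(row);
    }
}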
I've used this just once: to clean up server-side cache of Crystal Report documents. See my response in Crystal Reports Exception: The maximum report processing jobs limit configured by your system administrator has been reached
The WaitForPendingFinalizers was particularly helpful for me, as sometimes the objects were not being cleaned up properly. Considering the relatively slow performance of the report in a web page - any minor GC delay was negligible, and the improvement in memory management gave an overall happier server for me.

How do I get .NET to garbage collect aggressively?

I have an application that is used in image processing, and I find myself typically allocating arrays of 4000x4000 ushort, as well as the occasional float and the like. Currently, the .NET framework tends to crash in this app apparently randomly, almost always with an out of memory error. 32 MB is not a huge allocation, but if .NET is fragmenting memory, then it's very possible that such large contiguous allocations aren't behaving as expected.
Is there a way to tell the garbage collector to be more aggressive, or to defragment memory (if that's the problem)? I realize that there are the GC.Collect and GC.WaitForPendingFinalizers calls, and I've sprinkled them pretty liberally through my code, but I'm still getting the errors. It may be because I'm calling DLL routines that use native code a lot, but I'm not sure. I've gone over that C++ code and made sure that any memory I allocate I delete, but I still get these C# crashes, so I'm pretty sure it's not there. I wonder if the C++ calls could be interfering with the GC, making it leave behind memory because it once interacted with a native call - is that possible? If so, can I turn that functionality off?
EDIT: Here is some very specific code that will cause the crash. According to this SO question, I do not need to be disposing of the BitmapSource objects here. Here is the naive version, with no GC.Collect calls in it. It generally crashes on iteration 4 to 10 of the undo procedure. This code replaces the constructor in a blank WPF project, since I'm using WPF. I do the wackiness with the BitmapSource because of the limitations I explained in my answer to @dthorpe below, as well as the requirements listed in this SO question.
public partial class Window1 : Window {
public Window1() {
InitializeComponent();
//Attempts to create an OOM crash
//to do so, mimic minute croppings of an 'image' (ushort array), and then undoing the crops
int theRows = 4000, currRows;
int theColumns = 4000, currCols;
int theMaxChange = 30;
int i;
List<ushort[]> theList = new List<ushort[]>();//the list of images in the undo/redo stack
byte[] displayBuffer = null;//the buffer used as a bitmap source
BitmapSource theSource = null;
for (i = 0; i < theMaxChange; i++) {
currRows = theRows - i;
currCols = theColumns - i;
theList.Add(new ushort[(theRows - i) * (theColumns - i)]);
displayBuffer = new byte[theList[i].Length];
theSource = BitmapSource.Create(currCols, currRows,
96, 96, PixelFormats.Gray8, null, displayBuffer,
(currCols * PixelFormats.Gray8.BitsPerPixel + 7) / 8);
System.Console.WriteLine("Got to change " + i.ToString());
System.Threading.Thread.Sleep(100);
}
//should get here. If not, then theMaxChange is too large.
//Now, go back up the undo stack.
for (i = theMaxChange - 1; i >= 0; i--) {
displayBuffer = new byte[theList[i].Length];
theSource = BitmapSource.Create((theColumns - i), (theRows - i),
96, 96, PixelFormats.Gray8, null, displayBuffer,
((theColumns - i) * PixelFormats.Gray8.BitsPerPixel + 7) / 8);
System.Console.WriteLine("Got to undo change " + i.ToString());
System.Threading.Thread.Sleep(100);
}
}
}
Now, if I'm explicit in calling the garbage collector, I have to wrap the entire code in an outer loop to cause the OOM crash. For me, this tends to happen around x = 50 or so:
public partial class Window1 : Window {
public Window1() {
InitializeComponent();
//Attempts to create an OOM crash
//to do so, mimic minute croppings of an 'image' (ushort array), and then undoing the crops
for (int x = 0; x < 1000; x++){
int theRows = 4000, currRows;
int theColumns = 4000, currCols;
int theMaxChange = 30;
int i;
List<ushort[]> theList = new List<ushort[]>();//the list of images in the undo/redo stack
byte[] displayBuffer = null;//the buffer used as a bitmap source
BitmapSource theSource = null;
for (i = 0; i < theMaxChange; i++) {
currRows = theRows - i;
currCols = theColumns - i;
theList.Add(new ushort[(theRows - i) * (theColumns - i)]);
displayBuffer = new byte[theList[i].Length];
theSource = BitmapSource.Create(currCols, currRows,
96, 96, PixelFormats.Gray8, null, displayBuffer,
(currCols * PixelFormats.Gray8.BitsPerPixel + 7) / 8);
}
//should get here. If not, then theMaxChange is too large.
//Now, go back up the undo stack.
for (i = theMaxChange - 1; i >= 0; i--) {
displayBuffer = new byte[theList[i].Length];
theSource = BitmapSource.Create((theColumns - i), (theRows - i),
96, 96, PixelFormats.Gray8, null, displayBuffer,
((theColumns - i) * PixelFormats.Gray8.BitsPerPixel + 7) / 8);
GC.WaitForPendingFinalizers();//force gc to collect, because we're in scenario 2, lots of large random changes
GC.Collect();
}
System.Console.WriteLine("Got to changelist " + x.ToString());
System.Threading.Thread.Sleep(100);
}
}
}
If I'm mishandling memory in either scenario, if there's something I should spot with a profiler, let me know. That's a pretty simple routine there.
Unfortunately, it looks like @Kevin's answer is right - this is a bug in .NET and in how .NET handles objects larger than 85k. This situation strikes me as exceedingly strange; could PowerPoint be rewritten in .NET with this kind of limitation, or any of the other Office suite applications? 85k does not seem to me to be a whole lot of space, and I'd also think that any program that uses so-called 'large' allocations frequently would become unstable within a matter of days to weeks when using .NET.
EDIT: It looks like Kevin is right, this is a limitation of .NET's GC. For those who don't want to follow the entire thread, .NET has four GC heaps: gen0, gen1, gen2, and LOH (Large Object Heap). Everything that's 85k or smaller goes on one of the first three heaps, depending on creation time (moved from gen0 to gen1 to gen2, etc). Objects larger than 85k get placed on the LOH. The LOH is never compacted, so eventually, allocations of the type I'm doing will eventually cause an OOM error as objects get scattered about that memory space. We've found that moving to .NET 4.0 does help the problem somewhat, delaying the exception, but not preventing it. To be honest, this feels a bit like the 640k barrier-- 85k ought to be enough for any user application (to paraphrase this video of a discussion of the GC in .NET). For the record, Java does not exhibit this behavior with its GC.
Here are some articles detailing problems with the Large Object Heap. It sounds like what you might be running into.
http://connect.microsoft.com/VisualStudio/feedback/details/521147/large-object-heap-fragmentation-causes-outofmemoryexception
Dangers of the large object heap:
http://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/
Here is a link on how to collect data on the Large Object Heap (LOH):
http://msdn.microsoft.com/en-us/magazine/cc534993.aspx
According to this, it seems there is no way to compact the LOH. I can't find anything newer that explicitly says how to do it, and so it seems that it hasn't changed in the 2.0 runtime:
http://blogs.msdn.com/maoni/archive/2006/04/18/large-object-heap.aspx
The simple way of handling the issue is to make small objects if at all possible. Your other option is to create only a few large objects and reuse them over and over. Not an ideal situation, but it might be better than rewriting the object structure. Since you did say that the created objects (arrays) are of different sizes, it might be difficult, but it could keep the application from crashing.
Start by narrowing down where the problem lies. If you have a native memory leak, poking the GC is not going to do anything for you.
Run up perfmon and look at the .NET heap size and Private Bytes counters. If the heap size remains fairly constant but private bytes is growing then you've got a native code issue and you'll need to break out the C++ tools to debug it.
Assuming the problem is with the .NET heap, you should run a profiler against the code, like Red Gate's ANTS profiler or JetBrains' dotTrace. This will tell you which objects are taking up the space and not being collected quickly. You can also use WinDbg with SOS for this, but it's a fiddly interface (powerful though).
Once you've found the offending items it should be more obvious how to deal with them. Some of the sort of things that cause problems are static fields referencing objects, event handlers not being unregistered, objects living long enough to get into Gen2 but then dying shortly after, etc etc. Without a profile of the memory heap you won't be able to pinpoint the answer.
Whatever you do though, "liberally sprinkling" GC.Collect calls is almost always the wrong way to try and solve the problem.
There is an outside chance that switching to the server version of the GC would improve things (it's just a property in the config file) - the default workstation version is geared towards keeping a UI responsive, so it will effectively give up with large, long-running collections.
Use Process Explorer (from Sysinternals) to see what the Large Object Heap for your application is. Your best bet is going to be making your arrays smaller but having more of them. If you can avoid allocating your objects on the LOH then you won't get the OutOfMemoryExceptions and you won't have to call GC.Collect manually either.
The LOH doesn't get compacted and only allocates new objects at the end of it, meaning that you can run out of space quite quickly.
If you're allocating a large amount of memory in an unmanaged library (i.e. memory that the GC isn't aware of), then you can make the GC aware of it with the GC.AddMemoryPressure method.
Of course this depends somewhat on what the unmanaged code is doing. You haven't specifically stated that it's allocating memory, but I get the impression that it is. If so, then this is exactly what that method was designed for. Then again, if the unmanaged library is allocating a lot of memory then it's also possible that it's fragmenting the memory, which is completely beyond the GC's control even with AddMemoryPressure. Hopefully that's not the case; if it is, you'll probably have to refactor the library or change the way in which it's used.
P.S. Don't forget to call GC.RemoveMemoryPressure when you finally free the unmanaged memory.
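A minimal sketch of that pairing; the Marshal.AllocHGlobal call is just a stand-in for whatever unmanaged allocation your library actually performs:
using System;
using System.Runtime.InteropServices;

class UnmanagedBuffer : IDisposable
{
    private readonly IntPtr _ptr;
    private readonly long _size;

    public UnmanagedBuffer(long size)
    {
        _size = size;
        _ptr = Marshal.AllocHGlobal((IntPtr)size);   // unmanaged allocation the GC can't see
        GC.AddMemoryPressure(size);                  // tell the GC how much it really costs
    }

    public void Dispose()
    {
        Marshal.FreeHGlobal(_ptr);
        GC.RemoveMemoryPressure(_size);              // the matching call when the memory is freed
    }
}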
(P.P.S. Some of the other answers are probably right - this is a lot more likely to simply be a memory leak in your code; especially if it's image processing, I'd wager that you're not correctly disposing of your IDisposable instances. But just in case those answers don't lead you anywhere, this is another route you could take.)
Just an aside: The .NET garbage collector performs a "quick" GC when a function returns to its caller. This will dispose the local vars declared in the function.
If you structure your code such that you have one large function that allocates large blocks over and over in a loop, assigning each new block to the same local var, the GC may not kick in to reclaim the unreferenced blocks for some time.
If on the other hand, you structure your code such that you have an outer function with a loop that calls an inner function, and the memory is allocated and assigned to a local var in that inner function, the GC should kick in immediately when the inner function returns to the caller and reclaim the large memory block that was just allocated, because it's a local var in a function that is returning.
Avoid the temptation to mess with GC.Collect explicitly.
Apart from handling the allocations in a more GC-friendly way (e.g. reusing arrays etc.), there's a new option now: you can manually cause compaction of the LOH.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
This will cause a LOH compaction the next time a gen-2 collection happens (either on its own, or by your explicit call of GC.Collect).
Do note that not compacting the LOH is usually a good idea - it's just that your scenario is a decent enough case for allowing for manual compaction. The LOH is usually used for huge, long-living objects - like pre-allocated buffers that are reused over time etc.
If your .NET version doesn't support this yet, you can also try to allocate in sizes of powers of two, rather than allocating precisely the amount of memory you need. This is what a lot of native allocators do to ensure memory fragmentation doesn't get impossibly stupid (it basically puts an upper limit on the maximum heap fragmentation). It's annoying, but if you can limit the code that handles this to a small portion of your code, it's a decent workaround.
Do note that you still have to make sure it's actually possible to compact the heap - any pinned memory will prevent compaction in the heap it lives in.
Another useful option is to use paging - never allocating more than, say, 64 kiB of contiguous space on the heap; this means you'll avoid using the LOH entirely. It's not too hard to manage this in a simple "array-wrapper" in your case. The key is to maintain a good balance between performance requirements and reasonable abstraction.
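A sketch of that kind of paged wrapper, using roughly 64 KiB pages of ushort purely as an example of the idea (the class name and eager page allocation are assumptions):
// Hypothetical "paged" array: no single allocation exceeds ~64 KiB (32,768 ushorts),
// so nothing ever lands on the Large Object Heap.
class PagedUShortArray
{
    private const int PageSize = 32 * 1024;       // 32,768 elements * 2 bytes = 64 KiB per page
    private readonly ushort[][] _pages;

    public PagedUShortArray(long length)
    {
        Length = length;
        _pages = new ushort[(length + PageSize - 1) / PageSize][];
        for (int i = 0; i < _pages.Length; i++)
            _pages[i] = new ushort[PageSize];
    }

    public long Length { get; }

    public ushort this[long index]
    {
        get { return _pages[index / PageSize][index % PageSize]; }
        set { _pages[index / PageSize][index % PageSize] = value; }
    }
}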
And of course, as a last resort, you can always use unsafe code. This gives you a lot of flexibility in handling memory allocations (though it's a bit more painful than using e.g. C++) - including allowing you to explicitly allocate unmanaged memory, do your work with that and release the memory manually. Again, this only makes sense if you can isolate this code to a small portion of your total codebase - and make sure you've got a safe managed wrapper for the memory, including the appropriate finalizer (to maintain some decent level of memory safety). It's not too hard in C#, though if you find yourself doing this too often, it might be a good idea to use C++/CLI for those parts of the code, and call them from your C# code.
Have you tested for memory leaks? I've been using .NET Memory Profiler with quite a bit of success on a project that had a number of very subtle and annoyingly persistent (pun intended) memory leaks.
Just as a sanity check, ensure that you're calling Dispose on any objects that implement IDisposable.
You could implement your own array class which breaks the memory into non-contiguous blocks. Say, have a 64 by 64 array of [64,64] ushort arrays which are allocated and deallocated separately. Then just map to the right one: location 66,66 would be at location [2,2] in the array at [1,1].
Then, you should be able to dodge the Large Object Heap.
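Something along these lines, as a sketch only (the class name and eager block allocation are assumptions; the 64x64 geometry and the 66,66 -> [1,1]/[2,2] mapping come from the description above):
// Each block is a 64x64 ushort array (8 KB), well below the LOH threshold,
// and the outer grid just routes an (x, y) coordinate to the right block.
class BlockedUShortArray2D
{
    private const int BlockSize = 64;
    private readonly ushort[,][,] _blocks;

    public BlockedUShortArray2D(int width, int height)
    {
        int blocksX = (width  + BlockSize - 1) / BlockSize;
        int blocksY = (height + BlockSize - 1) / BlockSize;
        _blocks = new ushort[blocksX, blocksY][,];
        for (int bx = 0; bx < blocksX; bx++)
            for (int by = 0; by < blocksY; by++)
                _blocks[bx, by] = new ushort[BlockSize, BlockSize]; // 64 * 64 * 2 bytes = 8 KB each
    }

    // Location (66, 66) maps to block [1, 1], offset [2, 2] - exactly as described above.
    public ushort this[int x, int y]
    {
        get { return _blocks[x / BlockSize, y / BlockSize][x % BlockSize, y % BlockSize]; }
        set { _blocks[x / BlockSize, y / BlockSize][x % BlockSize, y % BlockSize] = value; }
    }
}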
The problem is most likely due to the number of these large objects you have in memory. Fragmentation would be a more likely issue if they are variable sizes (while it could still be an issue.) You stated in the comments that you are storing an undo stack in memory for the image files. If you move this to Disk you would save yourself tons of application memory space.
Also, moving the undo to disk should not cause too much of a negative impact on performance, because it's not something you will be using all of the time. (If it does become a bottleneck you can always create a hybrid disk/memory cache system.)
Extended...
If you are truly concerned about the possible impact of performance caused by storing undo data on the file system, you may consider that the virtual memory system has a good chance of paging this data to your virtual page file anyway. If you create your own page file/swap space for these undo files, you will have the advantage of being able to control when and where the disk I/O is called. Don't forget, even though we all wish our computers had infinite resources they are very limited.
1.5GB (useable application memory space) / 32MB (large memory request size) ~= 46
You can use this method:
public static void FlushMemory()
{
Process prs = Process.GetCurrentProcess();
prs.MinWorkingSet = (IntPtr)(300000);
}
There are three ways to use this method:
1 - after disposing of managed objects such as a class, etc.
2 - on a timer with an interval of around 2000 ms.
3 - from a thread created to call this method.
I suggest you use this method from a thread or a timer.
The best way to do it is as this article shows. It is in Spanish, but you will surely understand the code.
http://www.nerdcoder.com/c-net-forzar-liberacion-de-memoria-de-nuestras-aplicaciones/
Here is the code in case the link gets broken:
using System.Runtime.InteropServices;
....
public class anyname
{
....
[DllImport("kernel32.dll", EntryPoint = "SetProcessWorkingSetSize", ExactSpelling = true, CharSet = CharSet.Ansi, SetLastError = true)]
private static extern int SetProcessWorkingSetSize(IntPtr process, int minimumWorkingSetSize, int maximumWorkingSetSize);
public static void alzheimer()
{
GC.Collect();
GC.WaitForPendingFinalizers();
SetProcessWorkingSetSize(System.Diagnostics.Process.GetCurrentProcess().Handle, -1, -1);
}
....
You call alzheimer() to clean/release memory.
The GC doesn't take the unmanaged heap into account. If you are creating lots of objects that are merely wrappers in C# around larger unmanaged memory, then your memory is being devoured, but the GC can't make rational decisions based on this as it only sees the managed heap.
You end up in a situation where the GC doesn't think you are short of memory because most of the things on your gen 1 heap are 8 byte references where in actual fact they are like icebergs at sea. Most of the memory is below!
You can make use of these GC calls:
GC.AddMemoryPressure(sizeOfField);
GC.RemoveMemoryPressure(sizeOfField);
These methods allow the GC to see the unmanaged memory (if you provide it the right figures).

Tips on Dealing with Large Strings With Regards to Memory Usage

How can I force a shrink of a DataTable and/or List so that I can free memory efficiently? I am currently removing the processed row from the DataSet every iteration of the loop, but I'm not sure if the memory is being released.
for (int i = m_TotalNumberOfLocalRows - 1; i >= 0; i--)
{
dr = dt.Rows[i];
// Do stuff
dt.Rows.Remove(dr);
}
If this doesn't shrink the DataTable's footprint in memory, I can use a List. I'm only using the DataTable as a store for the DataRows; I could use whatever collection or storage mechanism would be low on memory and be able to release memory every x iterations.
Thank you.
Edit:
After doing some memory profiling after reading What Are Some Good .NET Profilers?, I've discovered that the main consumers of memory are strings.
We are doing lots of user output during this process, and the memory consumption is cycling between approximately 170MB-230MB, peaking at around 300MB. I'm using a StringBuilder with an initial size of 20971520 to hold the output/log of what is happening and after one percent of the total number of records have been processed I'm setting a DevExpress MemoEdit control's Text property to the StringBuilder.ToString(). I've found that this method is quicker than appending the StringBuilder.ToString() to the MemoEdit.Text (obviously the logic w.r.t. the StringBuilder is different between appending and setting the MemoEdit.Text)
I've also found that instead of recreating the StringBuilder(20971520), it's easier on memory and quicker to just call StringBuilder.Remove(0, StringBuilder.Length) and reuse the same instance.
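A small sketch of that reuse pattern, with the DevExpress control left out so it stands alone (the class and method names are made up for illustration):
using System.Text;

static class LogBuffer
{
    // One large builder allocated once up front and reused, instead of re-created per batch.
    private static readonly StringBuilder _log = new StringBuilder(20971520);

    public static void Append(string line)
    {
        _log.AppendLine(line);
    }

    public static string FlushBatch()
    {
        string snapshot = _log.ToString();   // this is what gets assigned to MemoEdit.Text above
        _log.Remove(0, _log.Length);         // same effect as _log.Clear() on .NET 4 and later
        return snapshot;
    }
}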
Are there any tips that you could share to improve performance when working with large strings (the log file that is written out is approximately 12.2 MB for about 30,000 records)?
Note: I have changed the title of the question and the tags.
Old title:How can I Force a Shrink of a DataTable and/or a List to Release Memory?
Old tags: list datatable c# .net memory
Unless you are having a problem with memory, don't try to manually free it by calling the garbage collector. The runtime will handle it for you, and 99% of the time it will be more efficient than you trying to guess when the optimal time is.
What you have to remember is that when you call GC.Collect(), it runs against all the levels of the garbage collector and "tidies up" all objects that need to be freed. You will most likely be spending processor time handling something that doesn't need to be done at that point in time.
If you absolutely have to, the command is GC.Collect():
http://msdn.microsoft.com/en-us/library/xe0c2357.aspx
http://www.developer.com/net/csharp/article.php/3343191/C-Tip-Forcing-Garbage-Collection-in-NET.htm
Try forcing a garbage collector pass:
GC.Collect();
If you want to make sure that all objects are finalized before your code execution continues, call
GC.WaitForPendingFinalizers();
right after GC.Collect()
EDIT: As people mentioned in the comments below, it is widely considered a bad practice to call the garbage collector directly. Nevertheless, this should achieve the goal of freeing the unused memory of your deleted rows.
Stick to the DataTable and remove unnecessary rows as in your example.
By doing this you can't control memory usage: this is done by the CLR Garbage Collector.
Do you have an explicit need to manage this directly? The garbage collector manages this for you.

Force garbage collection of arrays, C#

I have a problem where a couple of 3-dimensional arrays allocate a huge amount of memory, and the program sometimes needs to replace them with bigger/smaller ones and throws an OutOfMemoryException.
Example: there are 5 allocated 96MB arrays (200x200x200, 12 bytes of data in each entry) and the program needs to replace them with 210x210x210 (111MB). It does it in a manner similar to this:
array1 = new Vector3[210,210,210];
Where array1 through array5 are the same fields used previously. This should make the old arrays candidates for garbage collection, but seemingly the GC does not act quickly enough and leaves the old arrays allocated before allocating the new ones - which causes the OOM - whereas if they were freed before the new allocations, the space should be enough.
What I'm looking for is a way to do something like this:
GC.Collect(array1) // this would set the reference to null and free the memory
array1 = new Vector3[210,210,210];
I'm not sure if a full garbage collection would be a good idea, since that code may (in some situations) need to be executed fairly often.
Is there a proper way of doing this?
This is not an exact answer to the original question, "how to force GC", yet I think it will help you to re-examine your issue.
After seeing your comment,
Putting in the GC.Collect(); does seem to help, although it still does not solve the problem completely - for some reason the program still crashes when about 1.3GB are allocated (I'm using System.GC.GetTotalMemory(false); to find the real amount allocated).
I suspect you may have memory fragmentation. If the object is large (85,000 bytes under the .NET 2.0 CLR if I remember correctly; I do not know whether it has changed or not), the object will be allocated in a special heap, the Large Object Heap (LOH). The GC does reclaim the memory used by unreachable objects in the LOH, yet it does not perform compaction on the LOH as it does on the other heaps (gen0, gen1, and gen2), due to performance.
If you frequently allocate and deallocate large objects, the LOH will become fragmented, and even though you have more free memory in total than what you need, you may not have a contiguous memory space anymore and hence will get an OutOfMemory exception.
I can think of two workarounds at this moment.
Move to a 64-bit machine/OS and take advantage of it :) (Easiest, but possibly hardest as well, depending on your resource constraints.)
If you cannot do #1, then try to allocate a huge chunk of memory first and reuse it (it may require writing a helper class that manipulates a smaller array which in fact resides in a larger array) to avoid fragmentation. This may help a little bit, yet it may not completely solve the issue and you may have to deal with the complexity.
Seems you've run into LOH (Large object heap) fragmentation issue.
Large Object Heap
CLR Inside Out Large Object Heap Uncovered
You can check to see if you're having LOH fragmentation issues using SOS.
Check this question for an example of how to use SOS to inspect the LOH.
Forcing a Garbage Collection is not always a good idea (it can actually promote the lifetimes of objects in some situations). If you have to, you would use:
array1 = null;
GC.Collect();
array1 = new Vector3[210,210,210];
Isn't this just large object heap fragmentation? Objects > 85,000 bytes are allocated on the large object heap. The GC frees up space in this heap but never compacts the remaining objects. This can result in insufficient contiguous memory to successfully allocate a large object.
Alan.
If I had to speculate, your problem is not really that you are going from Vector3[200,200,200] to Vector3[210,210,210], but that most likely you have had similar previous steps before this one:
i.e.
// first you have
array1 = new Vector3[10,10,10];
// then
array1 = new Vector3[20,20,20];
// then maybe
array1 = new Vector3[30,30,30];
// .. and so on ..
// ...
// then
array1 = new Vector3[200,200,200];
// and eventually you try
array1 = new Vector3[210,210,210]; // and you get an OutOfMemoryException
If that is true, I would suggest a better allocation strategy. Try over-allocating - maybe doubling the size every time, as opposed to always allocating just the space that you need. Especially if these arrays are ever used by objects that need to pin the buffers (i.e. if they have ties to native code).
So, instead of the above, have something like this:
// first start with an arbitrary size
array1 = new Vector3[64,64,64];
// then double that
array1 = new Vector3[128,128,128];
// and then... so in three steps you get to where otherwise
// it would have taken you 20.
array1 = new Vector3[256,256,256];
They might not be getting collected because they're being referenced somewhere you're not expecting.
As a test, try changing your references to WeakReferences instead and see if that resolves your OOM problem. If it doesn't then you're referencing them somewhere else.
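A quick sketch of that diagnostic (the array size just mirrors the question; the class and method names are hypothetical): if the weak reference is dead after a collection, nothing else was holding the array.
using System;

static class RootednessProbe
{
    public static void Test()
    {
        // Hold the big array only through a WeakReference instead of a normal field.
        // In the real test you would wrap the arrays from your undo/redo list, not a fresh one.
        var probe = new WeakReference(new ushort[4000 * 4000]);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // If IsAlive is still true here, some other reference is keeping the array rooted.
        Console.WriteLine("Still referenced elsewhere: " + probe.IsAlive);
    }
}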
I understand what you're trying to do and pushing for immediate garbage collection is probably not the right approach (since the GC is subtle in its ways and quick to anger).
That said, if you want that functionality, why not create it?
public static void Collect(ref object o)
{
o = null;
GC.Collect();
}
An OutOfMemory exception internally triggers a GC cycle automatically once and attempts the allocation again before actually throwing the exception to your code. The only way you could be having OutOfMemory exceptions is if you're holding references to too much memory. Clear the references as soon as you can by assigning them null.
Part of the problem may be that you're allocating a multidimensional array, which is represented as a single contiguous block of memory on the large object heap (more details here). This can block other allocations as there isn't a free contiguous block to use, even if there is still some free space somewhere, hence the OOM.
Try allocating it as a jagged array - Vector3[210][210][210] - which spreads the arrays around memory rather than as a single block, and see if that improves matters
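For reference, a jagged "3D array" has to be built up level by level; a minimal sketch (assuming a 12-byte Vector3 struct, as in the question):
// Each innermost row is 210 * 12 bytes = ~2.5 KB, and the two outer levels hold only
// object references, so none of the individual allocations reach the 85,000-byte LOH threshold.
var array1 = new Vector3[210][][];
for (int x = 0; x < 210; x++)
{
    array1[x] = new Vector3[210][];
    for (int y = 0; y < 210; y++)
    {
        array1[x][y] = new Vector3[210];
    }
}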
John, creating objects > 85,000 bytes will make the object end up in the large object heap. The large object heap is never compacted; instead the free space is reused.
This means that if you are allocating larger arrays every time, you can end up in situations where LOH is fragmented, hence the OOM.
You can verify this is the case by breaking with the debugger at the point of the OOM and getting a dump; submitting this dump to MS through a Connect bug (http://connect.microsoft.com) would be a great start.
What I can assure you is that the GC will do the right thing in trying to satisfy your allocation request; this includes kicking off a GC to clean the old garbage to satisfy the new allocation requests.
I don't know what the policy is for sharing memory dumps on Stack Overflow, but I would be happy to take a look to understand your problem better.
