C# large byte array and memory leak if not nulled quickly

I have a class that has a byte array holding from 1,048,576 bytes up to 134,217,728 bytes.
In the Dispose method I set the array to null, and the method that calls Dispose calls GC.Collect right after.
If I dispose right away I get my memory back, but if I wait, say, 10 hours and then dispose, memory usage doesn't change.

Memory usage is based on OS memory allocation. It may be freed immediately, it may not be. This depends on OS utilization, the application, etc. You free it in the runtime, but that doesn't mean the OS always gets it back. My guess is that there's a heuristic based on memory usage patterns (i.e. time, in your case) that determines when memory is actually returned to the OS. This was addressed here:
Explicitly freeing memory in c#

Note: if the memory isn't released, that doesn't automatically mean it's being used. It's up to the CLR whether it releases chunks of memory or not, but this memory isn't wasted.
That said, if you want a technical explanation, you'll want to read the literature about the Large Object Heap: http://msdn.microsoft.com/en-us/magazine/cc534993.aspx
Basically, it's a zone of memory where very large objects (more than 85kB) are allocated. It differs from the other zones of memory in that it's never compacted, and thus can become fragmented. I think what happens in your case is:
Case 1: you allocate the object and immediately call GC.Collect. The object is allocated at the end of the heap, then freed. The CLR sees a free segment at the end of the heap and releases it to the OS.
Case 2: you allocate the object and wait for a while. In the meantime, another object is allocated in the LOH. Now, your object isn't the last one anymore. Then, when you call GC.Collect, your object is freed, but there are still the other object(s) at the end of the memory segment. So the CLR cannot release the memory to the OS.
Just a guess based on my knowledge of memory management in .NET. I may be completely wrong.
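Side note, beyond what the article above covers: on newer runtimes (.NET Framework 4.5.1 and later, if I remember correctly) you can request a one-off compaction of the LOH yourself. A minimal sketch, assuming such a runtime:

using System;
using System.Runtime;

class LohCompactionSketch
{
    static void ReleaseLargeBuffers()
    {
        // Ask the next blocking full collection to also compact the large
        // object heap, then trigger that collection. Older runtimes never
        // compact the LOH, which is what causes the fragmentation described above.
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}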

Your findings are not unusual, but it doesn't mean that anything is wrong either. In order to be collected, something must prompt the GC to collect (often an attempted allocation). As a result, you can build an app that consumes a bunch of memory, releases it, and then goes idle. If there is no memory pressure on the machine, and if your app doesn't try to do anything after that, the GC won't fire (because it doesn't need to). Once you get busy, the GC will kick in and do its job. This behavior is very commonly mistaken for a leak.
BTW:
Are you using that very large array more than once? If so, you might be better off keeping it around and reusing it. Reason: any object larger than 85,000 bytes is allocated on the Large Object Heap. That heap only gets GC'd on Generation 2 collections. So if you are allocating and reallocating arrays very often, you will be causing a lot of Gen 2 (expensive) collections.
(Note: that doesn't mean there's a hard and fast rule to always reuse large arrays, but if you are doing a lot of allocation/deallocation/reallocation of the array, you should measure how much reusing it helps.)
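For illustration, the reuse pattern might look something like this (the class and sizes here are made up, not taken from the question):

using System.IO;

class FrameProcessor
{
    // Allocated once and kept for the lifetime of the processor. The buffer
    // still lives on the LOH, but it is never churned, so it doesn't cause
    // repeated Gen 2 collections.
    private readonly byte[] _buffer = new byte[134217728]; // 128 MB

    public void Process(Stream source)
    {
        int read = source.Read(_buffer, 0, _buffer.Length);
        // ... work with the first 'read' bytes of _buffer ...
    }
}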

Related

How to release the memory which is occupied by a large object as soon as possible?

I have a method as such
public void MainMethod()
{
    while (true)
    {
        var largeObject = GetLargeObject();

        // ... some work with this largeObject ...

        // how to release the memory from largeObject here?
    }
}
Obviously, I will use break somewhere in the loop.
With this code as it is, my memory fills up with garbage pretty fast.
Is there a way to free the memory used by some object (largeObject, for example) without running the garbage collector via GC.Collect()? Not just mark the object as one that can be collected by the GC, but actually free the memory? Or does the GC collect largeObject as soon as the iteration is over, since it's not in use anymore?
As far as I understand, implementing IDisposable and calling Dispose() just marks the object for the GC but doesn't free the memory instantly (so the memory is released when the GC runs).
P.S. Please don't tell me "the GC will collect everything for you", I know that.
First of all: large objects, i.e. larger than 85 KB, will be allocated on the large object heap (LOH). The large object heap is only collected in gen 2 collections, i.e. collections are expensive. Because of this it is generally recommended to avoid frequent LOH allocations. If at all possible, reuse the same object, or use a memory pool to avoid frequent allocations.
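If pooling fits your case, here is a rough sketch using ArrayPool<T> from System.Buffers (available on newer frameworks; the requested size is just an example):

using System;
using System.Buffers;

class PooledWork
{
    public void DoWork()
    {
        // Rent may hand back a larger array than requested; only use the
        // portion you asked for.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(1024 * 1024);
        try
        {
            // ... fill and process the buffer ...
        }
        finally
        {
            // Return it so later rents reuse this array instead of
            // allocating a fresh one on the LOH.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}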
There is no way to explicitly free managed memory without letting the GC do its thing. While the standard recommendation is to leave the GC alone, there are cases where it may make sense to run it manually, for example if you have just released a large amount of memory and the performance impact of a collection is acceptable at that point in time.
If the object is small enough to fit in the small object heap then you should not need to worry about it. Gen 0/1 collections are fairly cheap, and while collections will need to run more frequently if you do many allocations, the time to run a collection is not proportional to the amount of freed memory. That said, reusing memory might still be a good idea.
In the end, if you suspect it is a problem, do some profiling. Some profilers show you the time spent in garbage collection as well as the allocation rate. Do not try to fix an imaginary problem before confirming that it actually is a problem, and make sure you can verify that the fix actually is an improvement.

Memory leak in a small operation

There is a possible memory leak in this statement for a very large string (tempText can grow as big as ~10 MB).
string strXML = new string(tempText.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray());
The memory allocated for strXML doesn't get released even after exiting the function, and I have to call this function multiple times. Is there any possible solution, without making this string a class variable?
I'm not very familiar with C# memory management, can someone shed some light on this issue?
The garbage collector doesn't collect objects the instant their lifetime ends. It executes periodically, based on its perceived need to free up memory. The string will eventually be collected at some indeterminate point in time after it is no longer referenced by any rooted object.
When you build up a large object, it will remain in memory for much longer than other small objects.
Read up on the large object heap and generation 2 garbage collection... it gets technical but those two terms should suffice to point out what's going on here.
That's why you are not seeing the garbage collector reclaim that memory as quickly as you'd like.
In order to overcome this, either allocate work buffers once and reuse them, or work on the data in smaller chunks.
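For example, one way to reuse a work buffer across calls instead of materializing a new char[] on every call (a sketch based on the statement in the question; the names are illustrative):

using System.Text;
using System.Xml;

class XmlSanitizer
{
    // Reused across calls: it grows once to the size of the largest input
    // and is not reallocated afterwards.
    private readonly StringBuilder _sb = new StringBuilder();

    public string Sanitize(string tempText)
    {
        _sb.Clear();
        foreach (char ch in tempText)
        {
            if (XmlConvert.IsXmlChar(ch))
                _sb.Append(ch);
        }
        // This still allocates the final string; only the intermediate
        // array produced by ToArray() is avoided.
        return _sb.ToString();
    }
}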

What is the best way to reduce Memory Leak?

At runtime I see very high memory usage by my application.
But it seems that I only use 3-4 MemoryStreams, one of which sometimes holds as much as 81 MB.
The others are mainly 20 MB, 3 MB and 1 MB containers.
But still there is 525.xx MB of memory usage by the application.
I also tried using(...) statements, but without any luck.
So I am asking here for the most effective way to cut down on these memory leaks.
In managed .NET apps, you don't generally have memory "leaks" in the original sense of the word, unless you are allocating unmanaged resource handles and not disposing of them correctly. But that doesn't sound like what you're doing.
More likely is that you are holding on to references to objects that you no longer need, and this is keeping the memory "alive" longer than you are expecting.
For example, if you put 5MB of data into a memory stream, and then assign that memory stream to a static field, then the 5MB will never go away for the lifetime of the application. You need to assign null to the static field that refers to the memory stream when you no longer need what it points to so that the garbage collector will release and reclaim that 5MB of memory.
Similarly, local variables are not released until the function exits. If you allocate a lot of memory and assign it to a local variable, and then call another function that runs for hours, the local variable will be kept alive the whole time. If you don't need that memory anymore, assign null to the local variable.
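To make the static-field case concrete, here is a contrived sketch (not the asker's code):

using System.IO;

static class Cache
{
    // While this field holds the stream, everything inside it stays
    // reachable, so the GC cannot reclaim those 5 MB.
    public static MemoryStream Current;
}

class Consumer
{
    public void LoadAndRelease(byte[] fiveMegabytes)
    {
        Cache.Current = new MemoryStream(fiveMegabytes);

        // ... use Cache.Current ...

        // Once the data is no longer needed, drop the root so the GC is
        // free to reclaim the memory on its next collection.
        Cache.Current.Dispose();
        Cache.Current = null;
    }
}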
How are you determining that your app has a memory leak? If you are looking at the process virtual memory allocation shown by Task Manager, that is not very accurate. An application's memory manager may allocate large chunks of memory from the OS and internally free them for other uses within the application without releasing them back to the OS.
Use common sense practices. Call dispose or close as appropriate and assign null to variables as soon as you no longer need their contents.
Just because the garbage collected environment will let you be lazy doesn't mean you shouldn't pay attention to memory allocation and deallocation patterns in your code.
Your definition of a memory leak seems to be unusual... The following code will produce exactly the effect you are observing, but it's rarely called a memory leak:
var data = new byte[512*1024*1024];
data = null;
But you may actually have legitimate leaks. Memory profilers will show them easily, and huge ones can be tracked down by code review. Since you already know that you have a small number of memory streams, check that you aren't keeping them alive by storing them in some list or simply in a member variable. Also check that your large arrays aren't staying alive for similar reasons.

Memory not freed by Application until GC is explicitly called

I am launching many threads simultaneously, each one writing/reading data into/from a queue; data is dequeued progressively while it is processed and stored to the DB. For some reason the memory is not freed although the queues are empty and I made sure all event subscriptions between the data reader and the data processor are removed at the end of the threads. The amount of RAM held is exactly the amount of data that is read by the binary readers and put into the queues.
In order to isolate the problem, I have bypassed the Processing and the DB Storing step.
Why would the RAM still be held long after all threads are finished, until I explicitly call GC.Collect() or terminate the program? I manually set the queue to null after it is emptied, and also set the BinaryReader that read the data to null. I thought that would be enough for the GC to wake up and do its housekeeping, at least after a few minutes.
EDIT :
The (reformulated after deletion) QUESTION :
In short, I was always told that the default GC behaviour manages memory properly, that I should almost never call the GC explicitly, and that I should let the framework do the job.
I would like to know why, in this case, memory usage drops only when an explicit call is made to GC.Collect().
EDIT: [screenshot] memory usage without GC.Collect
[screenshot] memory usage with GC.Collect (called on a regular basis)
There are three conditions under which a garbage collection might occur (see MSDN):
1. The system has low physical memory.
2. The memory that is used by allocated objects on the managed heap surpasses an acceptable threshold. This threshold is continuously adjusted as the process runs.
3. The GC.Collect method is called. In almost all cases, you do not have to call this method, because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
From your description it sounds like the framework decided that neither 1. nor 2. is the case, hence it will only collect when you call GC.Collect().
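One quick way to confirm that is to measure the managed heap before and after letting the GC run a full collection, for example:

using System;

class GcDiagnostics
{
    public static void DumpManagedHeapSize()
    {
        long before = GC.GetTotalMemory(false);

        // Passing true lets the runtime perform a collection and wait for it
        // before reporting, so the difference between the two numbers is
        // roughly what was sitting around waiting to be collected.
        long after = GC.GetTotalMemory(true);

        Console.WriteLine("Managed heap: {0:N0} -> {1:N0} bytes", before, after);
    }
}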
Based on the screenshot, it appears that your memory usage never goes over 50% during the entire run of the application. It's much faster for the .NET framework to leave that memory alone than to stop your application (or at least take up a lot of CPU time in checking for liveness) in order to collect the garbage. If you cut your machine's RAM to 2 GB or so, I'm sure that the garbage collector would step up and keep memory usage within the hardware limits.
It seems you are leaking objects. You should use WinDbg to create a dump, then use sos.dll to track the roots of your objects.
You can follow this explanation on how to track "roots" for your objects and see what is causing these leaks.

Are memory leaks possible in managed environments like .NET?

In C++ it is easily possible to have a permanent memory leak - just allocate memory and don't release it:
new char; //permanent memory leak guaranteed
and that memory stays allocated for the lifetime of the heap (usually the same as program runtime duration).
Is the same (a case that will lead to a specific unreferenced object never being released while the memory management mechanisms are working properly) possible in a C# program?
I've carefully read this question and answers to it and it mentions some cases which lead to getting higher memory consumption than expected or IMO rather extreme cases like deadlocking the finalizer thread, but can a permanent leak be formed in a C# program with normally functioning memory management?
It depends on how you define a memory leak. In an unmanaged language, we typically think of a memory leak as a situation where memory has been allocated, and no references to it exist, so we are unable to free it.
That kind of leak is pretty much impossible to create in .NET (unless you call out into unmanaged code, or unless there's a bug in the runtime).
However, you can get another "weaker" form of leaks: when a reference to the memory does exist (so it is still possible to find and reset the reference, allowing the GC to free the memory normally), but you thought it didn't, so you assumed the object being referenced would get GC'ed. That can easily lead to unbounded growth in memory consumption, as you're piling up references to objects that are no longer used, but which can't be garbage collected because they're still referenced somewhere in your app.
So what is typically considered a memory leak in .NET is simply a situation where you forgot that you have a reference to an object (for example because you failed to unsubscribe from an event). But the reference exists, and if you remember about it, you can clear it and the leak will go away.
You can write unmanaged code in .NET if you wish by enclosing your block of code with the unsafe keyword. If you are writing unsafe code, aren't you back to the problem of managing memory yourself, and couldn't that lead to a memory leak?
It's not exactly a memory leak, but if you're communicating with hardware drivers directly (i.e. not through a properly-written .net extension of a set of drivers) then it's fairly possible to put the hardware into a state where, although there may or may not be an actual memory leak in your code, you can no longer access the hardware without rebooting it or the PC...
Not sure if this is a useful answer to your question, but I felt it was worth mentioning.
GCs usually delay the collection of unreachable memory to some later time, when an analysis of the references shows that the memory is unreachable. (In some restricted cases, the compiler may help the GC and tell it that a memory zone is unreachable as soon as it becomes so.)
Depending on the GC algorithm, unreachable memory is detected as soon as a collection cycle runs, or it may stay undetected for a certain number of collection cycles (generational GCs show this behavior, for instance). Some techniques even have blind spots which are never collected (the use of reference-counted pointers, for instance); some deny them the name of GC algorithm, and they are probably unsuitable in a general-purpose context.
Proving that a specific zone will be reclaimed depends on the algorithm and on the memory allocation pattern. For a simple algorithm like mark and sweep, it is easy to give a bound (say, by the next collection cycle); for more complex algorithms the matter is more involved (under a scheme which uses a dynamic number of generations, the conditions in which a full collection is done are not meaningful to someone not familiar with the details of the algorithm and the precise heuristics used).
A simple answer is that classic memory leaks are impossible in GC environments: classically, memory is leaked because, being an unreferenced block, there's no way for the software to find it in order to clean it up.
On the other hand, a memory leak is any situation where the memory usage of a program has unbounded growth. This definition is useful when analyzing how software might fail when run as a service (where services are expected to run, perhaps for months at a time).
As such, any growable data structure that continues to hold references to unneeded objects could cause service software to effectively fail through address space exhaustion.
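A typical shape of that failure, sketched with a hypothetical cache:

using System.Collections.Generic;

static class RequestCache
{
    // Grows for the lifetime of the service: entries are added per request
    // but never evicted, so memory usage rises without bound even though
    // every individual object is still technically referenced.
    private static readonly Dictionary<string, byte[]> Entries =
        new Dictionary<string, byte[]>();

    public static void Remember(string key, byte[] payload)
    {
        Entries[key] = payload; // nothing ever removes old keys
    }
}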
Easiest memory leak:
public static class StaticStuff
{
    public static event Action SomeStaticEvent;
}

public class Listener
{
    public Listener()
    {
        StaticStuff.SomeStaticEvent += DoSomething;
    }

    void DoSomething() { }
}
instances of Listener will never be collected.
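The usual fix is to unsubscribe when the listener is done with the event, e.g. via IDisposable (a sketch of the same class, corrected):

using System;

public class Listener : IDisposable
{
    public Listener()
    {
        StaticStuff.SomeStaticEvent += DoSomething;
    }

    public void Dispose()
    {
        // Removing the handler drops the reference held by the static event,
        // so this instance becomes collectable again.
        StaticStuff.SomeStaticEvent -= DoSomething;
    }

    void DoSomething() { }
}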
If we define a memory leak as a condition where memory that could be used for creating objects cannot be used, or where memory that could be released is not released, then memory leaks can happen in:
Events in WPF, where weak events need to be used. This can especially happen with attached properties.
Large objects
Large Object Heap fragmentation
http://msdn.microsoft.com/en-us/magazine/cc534993.aspx
