There is a possible memory leak in this statement for a very large string (tempText can grow as big as ~10 MB):
string strXML = new string(tempText.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray());
The memory allocated for strXML doesn't get released even after the function exits, and I have to call this function multiple times. Is there any possible solution that doesn't involve making this string a class variable?
I'm not very familiar with C# memory management; can someone shed some light on this issue?
The garbage collector doesn't collect objects the instant their lifetime ends. It runs periodically, based on its perceived need, to free up memory. The string will eventually be collected at some indeterminate point after it is no longer referenced by any rooted object.
When you build up a large object, it will remain in memory much longer than small objects do.
Read up on the large object heap and generation 2 garbage collection... it gets technical, but those two terms should suffice to explain what's going on here.
That's why you are not seeing the garbage collector reclaim that memory as quickly as you'd like.
To overcome this, either allocate work buffers once and reuse them, or work on the data in smaller chunks.
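As a rough sketch of the reuse idea (XmlSanitizer, StripInvalidXmlChars, and _buffer are illustrative names, not from the question): keep one StringBuilder alive across calls and refill it, so each call avoids pushing a fresh multi-megabyte char array onto the large object heap.

using System.Text;
using System.Xml;

class XmlSanitizer
{
    // Reused across calls; its capacity grows once to fit the largest input seen.
    private readonly StringBuilder _buffer = new StringBuilder();

    public string StripInvalidXmlChars(string tempText)
    {
        _buffer.Clear();                    // reset for reuse
        foreach (char ch in tempText)
        {
            if (XmlConvert.IsXmlChar(ch))
                _buffer.Append(ch);
        }
        return _buffer.ToString();          // one allocation for the final result
    }
}

The returned string is still one large allocation, but the intermediate char[] produced by Where(...).ToArray() is gone, which roughly halves the per-call LOH churn.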
Related
I have a method like this:
public void MainMethod()
{
    while (true)
    {
        var largeObject = GetLargeObject();

        // ... some work with this largeObject ...

        // how to release the memory from largeObject here?
    }
}
Obviously, I will use break in the loop.
With this usage of the code, my memory will fill up with garbage pretty fast.
Is there a way to free the memory used by some object (largeObject, for example) without running the garbage collector via GC.Collect()? Not just mark the object as one that can be collected by the GC, but actually free the memory? Or does the GC collect largeObject as soon as the iteration is over, since it's not in use anymore?
As far as I understand, implementing IDisposable and calling Dispose() just marks the object for the GC but doesn't free the memory instantly (so the memory will be released when the GC runs).
P.S. Don't tell me "the GC will collect everything for you"; I know that.
First of all: large objects, i.e. those larger than 85 KB, are allocated on the large object heap (LOH). The large object heap is only collected in gen 2 collections, i.e. collections are expensive. Because of this it is generally recommended to avoid frequent LOH allocations. If at all possible, reuse the same object, or use a memory pool to avoid frequent allocations.
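As one hedged example of the pooling idea, System.Buffers provides a shared ArrayPool in modern .NET (on older frameworks it is available as a NuGet package); Worker and Process are illustrative names:

using System;
using System.Buffers;

class Worker
{
    public void Process(int size)
    {
        // Rent from the shared pool instead of allocating a fresh large array
        // (anything over ~85,000 bytes would otherwise land on the LOH).
        byte[] buffer = ArrayPool<byte>.Shared.Rent(size);
        try
        {
            // ... fill and use buffer[0 .. size) ...
            // Note: Rent may hand back an array longer than requested.
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer); // let the next caller reuse it
        }
    }
}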
There is no way to explicitly free managed memory without letting the GC do its thing. While the standard recommendation is to leave the GC alone, there are cases where it may make sense to run it manually, e.g. if you have just released a large amount of memory and the performance impact of a collection is acceptable at that point in time.
If the object is small enough to fit in the small object heap, then you should not need to worry about it. Gen 1 collections are fairly cheap, and while collections will need to run more frequently if you do many allocations, the time to run a collection is not proportional to the amount of freed memory. That said, reusing memory might still be a good idea.
In the end, if you suspect there is a problem, do some profiling. Some profilers show you the time spent in garbage collection, as well as the allocation rate. Do not try to fix an imaginary problem before confirming that it actually is a problem, and verify that the fix actually is an improvement.
At runtime I see very high memory usage by my application.
But it seems that I only use 3–4 MemoryStreams, one of which sometimes holds as much as 81 MB.
The others are mainly 20 MB, 3 MB, and 1 MB containers.
Yet there is still 525.xx MB of memory usage by the application...
I tried using(...) statements as well, but without any luck.
So I am asking here for the most effective way to cut down these memory leaks.
In managed .NET apps, you don't generally have memory "leaks" in the original sense of the word, unless you are allocating unmanaged resource handles and not disposing of them correctly. But that doesn't sound like what you're doing.
More likely is that you are holding on to references to objects that you no longer need, and this is keeping the memory "alive" longer than you are expecting.
For example, if you put 5MB of data into a memory stream, and then assign that memory stream to a static field, then the 5MB will never go away for the lifetime of the application. You need to assign null to the static field that refers to the memory stream when you no longer need what it points to so that the garbage collector will release and reclaim that 5MB of memory.
Similarly, local variables are not released until the function exits. If you allocate a lot of memory and assign it to a local variable, and then call another function that runs for hours, the local variable will be kept alive the whole time. If you don't need that memory anymore, assign null to the local variable.
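A small sketch of both points (the names and sizes here are illustrative; also note that in optimized builds the JIT often ends a local's lifetime at its last use on its own):

using System.IO;

class Holder
{
    static MemoryStream cache;   // a rooted static field

    void Run()
    {
        cache = new MemoryStream(new byte[5 * 1024 * 1024]);
        // ... use cache ...
        cache = null;            // the 5 MB is now eligible for collection

        var big = new byte[100 * 1024 * 1024];
        // ... use big ...
        big = null;              // drop the reference before the long call
        LongRunningCall();       // big is not kept alive while this runs
    }

    void LongRunningCall() { /* runs for hours */ }
}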
How are you determining that your app has a memory leak? If you are looking at the process virtual memory allocation shown by Task Manager, that is not very accurate. An application's memory manager may allocate large chunks of memory from the OS and internally free them for other uses within the application without releasing them back to the OS.
Use common-sense practices. Call Dispose or Close as appropriate, and assign null to variables as soon as you no longer need their contents.
Just because the garbage collected environment will let you be lazy doesn't mean you shouldn't pay attention to memory allocation and deallocation patterns in your code.
Your definition of a memory leak seems to be unusual... The following code will produce exactly the effect you are observing, but it is rarely called a memory leak:
var data = new byte[512 * 1024 * 1024];   // 512 MB, allocated on the LOH
data = null;                              // eligible for collection, but the OS may not get it back right away
But you may actually have legitimate leaks. Memory profilers will show them easily, and huge ones can even be tracked down by code review. Since you already know that you have a small number of memory streams, check that you don't keep them alive by storing them in some list or simply in a member variable. Also check whether your large arrays are staying alive for similar reasons.
I have a class that has a byte array holding from 1,048,576 bytes up to 134,217,728 bytes.
In the Dispose method I set the array to null, and the method calling Dispose calls GC.Collect after that.
If I dispose right away I get my memory back, but if I wait, say, 10 hours and then dispose, memory usage doesn't change.
Memory usage is based on OS memory allocation. It may be freed immediately, or it may not be; this depends on OS utilization, the application, etc. You free it in the runtime, but that doesn't mean the OS always gets it back. I suspect there is a heuristic based on memory usage patterns (i.e. time, in this case) that affects the decision of when to return memory to the OS. This was addressed here:
Explicitly freeing memory in c#
Note: if the memory isn't released, that doesn't automatically mean it is being used. It's up to the CLR whether it releases chunks of memory or not, but this memory isn't wasted.
That said, if you want a technical explanation, you'll want to read the literature about the Large Object Heap: http://msdn.microsoft.com/en-us/magazine/cc534993.aspx
Basically, it's a zone of memory where very large objects (more than 85 KB) are allocated. It differs from the other zones of memory in that it's never compacted, and thus can become fragmented. I think what happens in your case is:
Case 1: you allocate the object and immediately call GC.Collect. The object is allocated at the end of the heap, then freed. The CLR sees a free segment at the end of the heap and releases it to the OS.
Case 2: you allocate the object and wait for a while. In the meantime, another object is allocated in the LOH. Now your object isn't the last one anymore. Then, when you call GC.Collect, your object is erased, but the other object(s) are still at the end of the memory segment, so the CLR cannot release the memory to the OS.
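If LOH fragmentation really is the cause, note that .NET Framework 4.5.1 and later (and .NET Core) can compact the LOH on demand. This API postdates much of the discussion above, so treat it as an option to measure rather than a default:

using System;
using System.Runtime;

// Request a one-shot LOH compaction on the next blocking gen 2 collection.
GCSettings.LargeObjectHeapCompactionMode =
    GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();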
Just a guess based on my knowledge of memory management in .NET. I may be completely wrong.
Your findings are not unusual, but it doesn't mean that anything is wrong either. In order to be collected, something must prompt the GC to collect (often an attempted allocation). As a result, you can build an app that consumes a bunch of memory, releases it, and then goes idle. If there is no memory pressure on the machine, and if your app doesn't try to do anything after that, the GC won't fire (because it doesn't need to). Once you get busy, the GC will kick in and do its job. This behavior is very commonly mistaken for a leak.
BTW:
Are you using that very large array more than once? If so, you might be better off keeping it around and reusing it. The reason: any object larger than 85,000 bytes is allocated on the Large Object Heap, and that heap only gets GC'd on generation 2 collections. So if you are allocating and re-allocating arrays very often, you will cause a lot of expensive gen 2 collections.
(Note: that doesn't mean there's a hard-and-fast rule to always reuse large arrays, but if you are doing a lot of allocate/release/allocate cycles with the array, you should measure how much re-use helps. A sketch of the idea follows.)
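A minimal sketch of the re-use pattern, assuming the buffer size is known up front (FrameProcessor and Process are illustrative names; 134217728 is the upper bound from the question):

using System.IO;

class FrameProcessor
{
    // Allocated once for the object's lifetime instead of per call;
    // at 128 MB it lives on the LOH, so re-allocating it would be costly.
    private readonly byte[] _buffer = new byte[134217728];

    public void Process(Stream input)
    {
        int read = input.Read(_buffer, 0, _buffer.Length);
        // ... work with _buffer[0 .. read) ...
    }
}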
I have a few SortedList<> and SortedDictionary<> structures in my simulation code, and I add millions of items to them over time. The problem is that the garbage collector does not deallocate memory quickly enough, so there is a huge hit on the application's performance. My last option was to engage the GC.Collect() method so that I can reclaim that memory back. Has anyone got a different idea? I am aware of the Flyweight pattern, which is another option, but I would appreciate other suggestions that would not require huge refactoring of my code.
You are fighting the "there's no free lunch" principle. You cannot assume that stuffing millions of items into a list isn't going to affect perf. Only the SortedList<> should be a problem; it is going to start allocating memory in the Large Object Heap. That allocation isn't going to be freed soon, because it takes a gen #2 collection to chuck stuff out of the LOH again. This delay should not otherwise affect the perf of your program.
One thing you can do is avoid the multiple copies of the internal array that SortedList<> will jam into the LOH as it keeps growing. Try to guess a good value for Capacity so it pre-allocates the large array up front, for example:
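(The element types and the count here are illustrative guesses, not from the question.)

using System.Collections.Generic;

// Pre-sizing skips the repeated grow-and-copy cycles on the LOH.
var prices = new SortedList<long, double>(capacity: 2000000);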
Next, use Perfmon.exe or TaskMgr.exe and look at the page fault delta of your program. It should be quite busy while you're allocating. If you see low values (100 or less), then you might have a problem with the paging file being fragmented, a common scourge on older machines running XP. Defragging the disk and using SysInternals' PageDefrag utility can do extraordinary wonders.
I think SortedList uses an array as its backing field, which means that a large SortedList gets allocated on the large object heap. The large object heap can become fragmented, which can cause an out-of-memory exception even while, in principle, enough memory is still available.
See this link.
This might be your problem, as intermediate calls to GC.Collect prevent the LOH from becoming badly fragmented in some scenarios, which explains why calling it helps you reduce the problem.
The problem can be mitigated by splitting large objects into smaller fragments.
I'd start by doing some memory profiling on your application to make sure that the items you remove from those lists (which I assume is happening, from the way your post is written) are actually released and not hanging around somewhere.
What sort of performance hit are we talking about, and on what operating system? If I recall correctly, the GC runs when it's needed, not immediately or even "soon". So Task Manager showing high memory allocated to your application is not necessarily a problem. What happens if you put the machine under higher load (e.g. run several copies of your application)? Does memory get reclaimed faster in that scenario, or do you start to run out of memory?
I hope answers to these questions will help point you in the right direction.
Well, if you keep all of the items in those structures, the GC will never collect them, because the structures still hold references to them.
If you need the items in the structures to be collected, you must remove them from the data structure.
To clear the entire data structure, try calling Clear() and setting the data structure reference to null. If the memory is still not reclaimed fast enough, call GC.Collect(), for example:
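(A minimal sketch; the element types are illustrative, and the GC.Collect call should only stay if measurement shows it helps.)

using System;
using System.Collections.Generic;

var results = new SortedDictionary<int, double>();
// ... add millions of items during the simulation ...

results.Clear();   // drop the references the structure holds
results = null;    // drop the structure itself
GC.Collect();      // force a collection only if you've measured a benefit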
I have a large 2GB file with 1.5 million listings to process. I am running a console app that performs some string manipulation then uploads each listing to the database.
I create a LINQ object and reset it by assigning a new LinqObject() to it for each listing (each loop iteration).
When the object is complete, I add it to a list.
When the list reaches 100 objects, I call SubmitAll on the entire list, clear the list, then repeat.
My memory usage continues to grow as the program runs. Is there anything I should be doing to keep memory usage down? I tried GC.Collect(). I think I want to use Dispose...
Thanks in advance for looking.
It's normal for the memory usage of a program to increase while it's working. You should not try to force the garbage collector to reduce memory usage in an attempt to save resources; that will most likely waste resources instead.
Contrary to one's first reaction, high memory usage is not a performance problem as long as there is any free memory left at all. Having a lot of unused memory doesn't increase performance a bit. If you try to reduce memory usage only to keep it down, you are just wasting CPU time on cleanup that isn't needed.
If you are running out of free memory, or if some other application needs it, the garbage collector will do the appropriate cleanup. In almost every situation the garbage collector will know much more about the current memory situation than you can possibly anticipate when writing the code.
If you are using objects that implement the IDisposable interface, you should call the Dispose method to free unmanaged resources, but all other objects are handled by the garbage collector. Managed objects normally don't leak memory at all.
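For the disposable objects, the using statement gives deterministic cleanup; a small sketch (the file name is illustrative):

using System.IO;

// Dispose runs automatically at the end of the block, releasing the file
// handle; the managed memory itself is still reclaimed later by the GC.
using (var reader = new StreamReader("listings.txt"))
{
    string firstLine = reader.ReadLine();
    // ... process ...
}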
Do you need your memory usage to stay low? Absent an actual functional problem, high memory usage in and of itself is not an issue.
How large is the memory usage growing? It may be that .NET is just "settling" effectively.
It's not really clear exactly how you're doing this, but the general principle sounds okay. I suggest you take the database work out of the equation: just comment out whichever line would actually submit to the database, and see how much memory that uses. Other than the StreamReader (or whatever) you shouldn't have anything else that needs disposing if you're not touching the database; you're just building batches of transformed objects and throwing them away. A sketch of that isolation test:
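(Listing, Transform, and SubmitAll below are illustrative stand-ins for the question's actual types and methods; this is a sketch of the experiment, not the original code.)

using System.Collections.Generic;
using System.IO;

var batch = new List<Listing>(100);
using (var reader = new StreamReader("listings.dat"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        batch.Add(Transform(line));   // the per-listing string manipulation
        if (batch.Count == 100)
        {
            // SubmitAll(batch);      // commented out to isolate the DB work
            batch.Clear();            // the batch's objects become collectible
        }
    }
}
// if (batch.Count > 0) SubmitAll(batch);   // flush the final partial batch

static Listing Transform(string line) => new Listing { Data = line.Trim() };

class Listing { public string Data; }

If memory stays flat with the submit line commented out, the growth is coming from whatever the submit path holds onto (e.g. a data context tracking every submitted object), not from the batching itself.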