I have a problem with memory leaks in my application while it's running.
The application's memory usage looks like this:
Minimum: 6%
Maximum: 35%
Peak memory: 90 MB
I have used ANTS Memory Profiler to analyse memory leaks in the application.
But I don't know how to reduce the application's memory usage while it is running.
Can anyone give me a solution as soon as possible?
Thanks and Regards
Ramesh N
How do you know you have memory leaks? Bear in mind that the GC may not run if there's no memory pressure on the system, so it may look like memory is being allocated and not released - the GC will deal with it if necessary.
Why do you think your application is leaking? If it stays at a consistent 90MB usage then this isn't a leak - it's just showing more memory use than you think. If it's a genuine memory leak then over time it would creep higher through usage. If you can't get it to 100MB then it's not really leaking...
.NET applications often show higher memory usage (especially in certain views of task manager) than you'd expect. Is this actually a problem for you, or are you perceiving it as a problem because it's higher than you think?
Do you experience any problems from the memory use? Otherwise it doesn't seem like there is a problem at all.
Unless there is an actual memory leak (and I suppose there isn't, as you have profiled the code), an application using several megabytes of memory, or even growing steadily up to a certain point, is not a problem.
It's a common misconception that a computer should have as much free memory as possible, but there is no performance benefit from that. Having unused memory doesn't make the applications run any faster in any way.
It's normal for a .NET application to allocate more memory as it runs. As long as there is free memory, this is far more efficient than running thorough garbage collections just to free up memory. The application will clean up the memory when needed.
The system can send a signal to an application to make it free up as much memory as possible. If you minimise an application, this signal is sent to it, so you can use that to find out approximately how much memory your application uses more than the absolute minimum.
First, put in some TEMPORARY code that calls GC.GetTotalMemory(true) at regular intervals and logs it.
Run the application for some time.
THEN TAKE OUT THE TEMPORARY CODE. Forcing a full collection on every call really does hurt performance, but it gives you reliable figures in return. Remember, this is purely an investigatory step, not something to use in 99% of production code.
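A minimal sketch of what that temporary logging might look like (the 30-second interval and console output are arbitrary choices for illustration):

    using System;
    using System.Threading;

    static class MemoryLogger
    {
        private static Timer _timer;

        // Call Start() once at startup, and remove the call before shipping.
        public static void Start()
        {
            _timer = new Timer(_ =>
            {
                // Passing true forces a full collection first, so the figure is the memory
                // that is genuinely still reachable - expensive, and for diagnostics only.
                long bytes = GC.GetTotalMemory(true);
                Console.WriteLine("{0:u}  managed heap: {1:N0} bytes", DateTime.UtcNow, bytes);
            }, null, TimeSpan.Zero, TimeSpan.FromSeconds(30));
        }

        public static void Stop()
        {
            if (_timer != null)
                _timer.Dispose();
        }
    }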
Now, see if the figures it's returning are steadily climbing. If they're not (and that includes climbing a bit and then dropping again), you've no problem. End of solution.
If you do, then you need to look at direct or indirect use of unmanaged resources, which either are unmanaged memory or use it. These split into two cases.
The first case is where you yourself are using unmanaged resources. Make sure you wrap them in a safe-handle-based wrapper that has a finaliser, and that they are disposed after each use. Don't mix direct use of managed and unmanaged resources in the same class; you can then avoid the Dispose(bool) pattern, which is really part of the anti-pattern of mixing the two.
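As a rough sketch of the shape being described, assuming a hypothetical native allocation exposed through placeholder NativeAlloc/NativeFree P/Invoke functions (not a real library):

    using System;
    using System.Runtime.InteropServices;
    using Microsoft.Win32.SafeHandles;

    // Hypothetical handle type: owns one unmanaged allocation and knows how to free it.
    sealed class NativeBufferHandle : SafeHandleZeroOrMinusOneIsInvalid
    {
        private NativeBufferHandle() : base(true) { }

        [DllImport("native.dll")]   // placeholder library and entry points
        private static extern NativeBufferHandle NativeAlloc(int size);

        [DllImport("native.dll")]
        private static extern void NativeFree(IntPtr handle);

        public static NativeBufferHandle Allocate(int size)
        {
            return NativeAlloc(size);
        }

        protected override bool ReleaseHandle()
        {
            // Runs from Dispose or, as a last resort, from the finalizer.
            NativeFree(handle);
            return true;
        }
    }

A class that then uses NativeBufferHandle only needs to dispose the handle; it has no reason to hold raw IntPtr fields or implement the Dispose(bool) pattern itself.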
The second case is where you use something that in turn uses unmanaged resources. A class implementing IDisposable is a strong hint of this. Make sure these are always disposed.
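For example, with StreamReader standing in for any IDisposable type:

    using System.IO;

    static string ReadFirstLine(string path)
    {
        // The using statement guarantees Dispose is called even if an exception is thrown.
        using (var reader = new StreamReader(path))
        {
            return reader.ReadLine();
        }
    }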
Make sure you are not interning strings needlessly. Interning strings is a useful memory-saving technique, but only if you know that the string value in question will be used regularly throughout the lifetime of the process (or, at the very least, that you will add few which won't be used again throughout that lifetime). If you intern strings that aren't regularly used, you've hit on one of the best ways to push memory into a tight spot with managed code (GC can happen on the intern pool now, but it often doesn't).
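A quick illustration of that trade-off (NormaliseStatus and BadIdea are invented names; string.Intern itself is the real API):

    using System;

    static class InternExamples
    {
        // Worth interning: a small, recurring set of values (e.g. status codes) that will be
        // stored and compared millions of times over the life of the process.
        public static string NormaliseStatus(string rawStatus)
        {
            return string.Intern(rawStatus);
        }

        // Not worth interning: effectively unique values. Each call grows the intern pool,
        // and that memory is very unlikely to ever be reclaimed.
        public static string BadIdea()
        {
            return string.Intern(Guid.NewGuid().ToString());
        }
    }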
There are also techniques to reduce memory use rather than avoid leaks, but since you're only using a very small amount of memory (90MB) these aren't worth considering here.
Incidentally, what size paging file do you have? 90MB being 35% means a total memory of 256MB. Unless you've got 64MB of physical RAM, that's a bit low. Current advice puts page files at about 100% or less of physical RAM, but that's based on the tendency toward larger RAM sizes these days. If you've got 128MB in that thing, I'd at least double up that page file to give a total memory of around 390MB.
Related
I need an application that will run smoothly. I have many serial chunks of computation I need to perform consecutively, each in a short period of time, so I don't mind the GC doing its job, and I can even accept more frequent collections, but what I need is to minimize the length of each GC collection.
I would like (if possible) a maximum pause of 1 millisecond of thread activity due to the GC each time.
What is the best way to achieve this in .NET? (I know that .NET is not the technology for such demands, but if it can meet them when optimized, the savings in development hours and the flexibility for future specs are a good incentive to try it out.)
Right from the MSDN page:
https://msdn.microsoft.com/en-us/library/ms973837.aspx
The .NET garbage collector provides a high-speed allocation service with good use of memory and no long-term fragmentation problems, however it is possible to do things that will give you much less than optimal performance. To get the best out of the allocator you should consider practices such as the following:
Allocate all of the memory (or as much as possible) to be used with a given data structure at the same time. Remove temporary allocations that can be avoided with little penalty in complexity.
Minimize the number of times object pointers get written, especially those writes made to older objects.
Reduce the density of pointers in your data structures.
Make limited use of finalizers, and then only on "leaf" objects, as much as possible. Break objects if necessary to help with this.
A regular practice of reviewing your key data structures and conducting memory usage profiles with tools like Allocation Profiler will go a long way to keeping your memory usage effective and having the garbage collector working its best for you.
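As a small illustration of the first two practices in C# (the per-item size estimate is made up):

    using System.Collections.Generic;
    using System.Text;

    static class ReportBuilder
    {
        public static string Build(IList<int> readings)
        {
            // Size the builder up front so it allocates its buffer once, instead of
            // repeatedly growing and abandoning temporary buffers as the string lengthens.
            var sb = new StringBuilder(readings.Count * 8);

            foreach (int reading in readings)
            {
                sb.Append(reading).Append(',');
            }
            return sb.ToString();
        }
    }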
As Ron mentioned in his comment, you have to be extra smart with .NET if you want a lot of control over the GC.
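For what little control .NET does give you, GCSettings.LatencyMode (available since .NET 3.5 SP1) lets you bias the collector away from blocking collections around a latency-sensitive chunk of work; it is a hint, not a guarantee of any particular pause time. A sketch:

    using System;
    using System.Runtime;

    static class LowLatencyRunner
    {
        public static void RunChunk(Action chunk)
        {
            // Ask the GC to avoid blocking collections where it can while the chunk runs,
            // and restore the previous mode afterwards.
            GCLatencyMode oldMode = GCSettings.LatencyMode;
            try
            {
                GCSettings.LatencyMode = GCLatencyMode.LowLatency;
                chunk();
            }
            finally
            {
                GCSettings.LatencyMode = oldMode;
            }
        }
    }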
I have heard many times that once a C# managed program requests more memory from the OS, it doesn't give it back unless the system is running out of memory. E.g. when an object is collected, it gets deleted, and the memory that was occupied by the object is free to be reused by another managed object, but the memory itself is not returned to the operating system (for example, Mono on Unix wouldn't call brk/sbrk to decrease the amount of virtual memory available to the process back to what it was before the allocation).
I don't know if this really happens or not, but I can see that my C# applications, running on Linux, use a small amount of memory at the beginning, then when I do something memory-expensive they allocate more of it, but later on, when all the objects get deleted (I can verify that by putting debug messages in the destructors), the memory is not freed. On the other hand, no more memory is allocated when I run that memory-expensive operation again. The program just keeps using the same amount of memory until it is terminated.
Maybe it is just my misunderstanding of how the GC in .NET works, but if it really does work like this, why? What is the benefit of keeping the allocated memory for later, instead of returning it to the system? How can it even know whether the system needs it back or not? What about other applications that would crash or couldn't start because of the OOM caused by this effect?
I know that people will probably answer something like "the GC manages memory better than you ever could, just don't worry about it" or "the GC knows what it does best" or "it doesn't matter at all, it's just virtual memory", but it does matter: on my 2 GB laptop I hit OOM (and the kernel OOM killer kicks in because of it) very often when I have been running C# applications for some time, precisely because of this irresponsible memory management.
Note: I was testing all this with Mono on Linux, because I really have a hard time understanding how Windows manages memory, so debugging on Linux is much easier for me; also, Linux memory management is open source code, while the memory management of the Windows kernel / .NET is rather a mystery to me.
The memory manager works this way because there is no benefit to having a lot of unused system memory when you don't need it.
If the memory manager always tried to keep as little memory allocated as possible, it would do a lot of work for no reason. It would only slow the application down, and the only benefit would be more free memory that no application is using.
Whenever the system needs more memory, it will tell the running applications to return as much as possible. The same signal is also sent to an application when you minimise it.
If this doesn't work the same with Mono in Linux, then that is a problem with that specific implementation.
Generally, if an app needs memory once, it will need it again. Releasing memory back to the OS only to request it back again is overhead, and if nothing else wants the memory, why bother? It is optimizing for the very likely scenario of wanting it again. Additionally, releasing memory back requires whole, contiguous blocks that can be handed back, which has a very specific impact on things like compaction: it isn't quite as simple as "hey, I'm not using most of this: have it back" - the runtime needs to figure out which blocks can be released, presumably after a full collect-and-compact (relocate objects, etc.) cycle.
I have a business app that I have written that effectively recurses through a directory structure looking for specific Excel files and stores their addresses. It then loops through these files and parses them by creating a DocumentParser object for each file; this is done one at a time, not asynchronously. The software seems to be very stable, so much so that the business would like to run it over a massive directory containing upwards of 10,000 relevant Excel files.
My question is: as I am creating a new DocumentParser object each time, will the GC be effective enough to discard each of the objects when they go out of scope, i.e. when that Excel sheet has been parsed, or is there a way I can monitor this and, where necessary, trigger a GC manually? I've never had to deal with such large amounts of data before; generally I've only tested it on a maximum of 40-50 Excel files at a time.
Thanks.
The GC is a very complex piece of software, and it is the one component that actually knows when garbage collection is necessary. So my advice is to leave the GC on its own.
Additionally: the GC will handle these masses of objects. Perhaps you will notice a decrease in performance; if that turns out to be a problem you can try to optimize your code, but not prematurely.
I would leave the GC to its business. 10,000 objects is not really much work for the GC. And it's likely the cost of the GC work will be much lower than the cost of the Excel work. So it's not worth complicating your design to tweak things for the GC. If you end up with so many files to process that your application can't finish in time, it's most likely going to be the speed of the Excel processing holding you up.
However one note which may be relevant: if the DocumentParser is using unmanaged memory in its work with the Excel file, you can use GC.Add/RemoveMemoryPressure to indicate to the GC the real added cost when opening the file. If you didn't write the DocumentParser yourself, the author may already be doing this.
The issue here is that you may have a managed object that costs something in the order of 100 bytes, which allocates a large amount of unmanaged memory when it does Excel work. The GC has no way of knowing this, so these methods help notify the GC that there is more memory pressure than it was aware of. This may change its behaviour in how and when it decides to collect, which may lead to the application maintaining a lower memory footprint. If the application's memory usage balloons out over time, then you may start seeing some slowdowns from lengthy garbage collections and possibly paging on the machine (depending on how much memory you have). You'll want to keep an eye on its memory usage to make sure it's not leaking memory as it processes - a memory profiler may be helpful there.
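A sketch of how that pairing might look in a wrapper around the Excel work (the 50 MB figure is an invented estimate and OpenWorkbook/CloseWorkbook are placeholders, but GC.AddMemoryPressure and GC.RemoveMemoryPressure are the real methods):

    using System;

    sealed class WorkbookSession : IDisposable
    {
        // Invented estimate of the unmanaged memory the Excel work ties up per file.
        private const long EstimatedUnmanagedBytes = 50L * 1024 * 1024;
        private bool _disposed;

        public WorkbookSession(string path)
        {
            OpenWorkbook(path);                                // placeholder for the real work
            GC.AddMemoryPressure(EstimatedUnmanagedBytes);     // tell the GC about the hidden cost
        }

        public void Dispose()
        {
            if (_disposed) return;
            CloseWorkbook();                                   // placeholder
            GC.RemoveMemoryPressure(EstimatedUnmanagedBytes);  // must mirror the Add exactly
            _disposed = true;
        }

        private void OpenWorkbook(string path) { /* ... */ }
        private void CloseWorkbook() { /* ... */ }
    }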
You don't need to manually call the GC unless you are holding some very large resource which is not the case in your situation. The GC will tweak itself with every call and if you call it manually you will just disrupt its internal profiling data.
BTW GC can collect stuff not only when it goes out of scope but also after its last usage (i.e. while it is still in scope but the variable is not used anymore).
Yes and no - The GC is effective enough to release when it needs to, but you can't generally be sure when that is.
There is a way to force a GC collection, but it's generally considered bad practice in production code, because forcing a collection (and the stack walk that goes with it) when it's not required is worse than using a bit of extra memory until the GC decides it needs to free resources to allocate more objects.
I found this article here:
Quantifying the Performance of Garbage Collection vs. Explicit Memory Management
http://www.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf
In the conclusion section, it reads:
Comparing runtime, space consumption, and virtual memory footprints over a range of benchmarks, we show that the runtime performance of the best-performing garbage collector is competitive with explicit memory management when given enough memory. In particular, when garbage collection has five times as much memory as required, its runtime performance matches or slightly exceeds that of explicit memory management. However, garbage collection's performance degrades substantially when it must use smaller heaps. With three times as much memory, it runs 17% slower on average, and with twice as much memory, it runs 70% slower. Garbage collection also is more susceptible to paging when physical memory is scarce. In such conditions, all of the garbage collectors we examine here suffer order-of-magnitude performance penalties relative to explicit memory management.
So, if my understanding is correct: if I have an app written in native C++ requiring 100 MB of memory, to achieve the same performance with a "managed" (i.e. garbage collector based) language (e.g. Java, C#), the app should require 5*100 MB = 500 MB?
(And with 2*100 MB = 200 MB, the managed app would run 70% slower than the native app?)
Do you know if current (i.e. latest Java VM's and .NET 4.0's) garbage collectors suffer the same problems described in the aforementioned article? Has the performance of modern garbage collectors improved?
Thanks.
if I have an app written in native C++ requiring 100 MB of memory, to achieve the same performance with a "managed" (i.e. garbage collector based) language (e.g. Java, C#), the app should require 5*100 MB = 500 MB? (And with 2*100 MB = 200 MB, the managed app would run 70% slower than the native app?)
Only if the app is bottlenecked on allocating and deallocating memory. Note that the paper talks exclusively about the performance of the garbage collector itself.
You seem to be asking two things:
have GCs improved since that research was performed, and
can I use the conclusions of the paper as a formula to predict required memory.
The answer to the first is that there have been no major breakthroughs in GC algorithms that would invalidate the general conclusions:
GC'ed memory management still requires significantly more virtual memory.
If you try to constrain the heap size the GC performance drops significantly.
If real memory is restricted, the GC'ed memory management approach results in substantially worse performance due to paging overheads.
However, the conclusions cannot really be used as a formula:
The original study was done with JikesRVM rather than a Sun JVM.
The Sun JVM's garbage collectors have improved in the ~5 years since the study.
The study does not seem to take into account that Java data structures take more space than equivalent C++ data structures for reasons that are not GC related.
On the last point, I have seen a presentation by someone that talks about Java memory overheads. For instance, it found that the minimum representation size of a Java String is something like 48 bytes. (A String consists of two primitive objects; one an Object with 4 word-sized fields and the other an array with a minimum of 1 word of content. Each primitive object also has 3 or 4 words of overhead.) Java collection data structures similarly use far more memory than people realize.
These overheads are not GC-related per se. Rather they are direct and indirect consequences of design decisions in the Java language, JVM and class libraries. For example:
Each Java primitive object header1 reserves one word for the object's "identity hashcode" value, and one or more words for representing the object lock.
The representation of a String has to use a separate "array of characters" because of JVM limitations. Two of the three other fields are an attempt to make the substring operation less memory intensive.
The Java collection types use a lot of memory because collection elements cannot be directly chained. So for example, the overheads of a (hypothetical) singly linked list collection class in Java would be 6 words per list element. By contrast an optimal C/C++ linked list (i.e. with each element having a "next" pointer) has an overhead of one word per list element.
1 - In fact, the overheads are less than this on average. The JVM only "inflates" a lock following use & contention, and similar tricks are used for the identity hashcode. The fixed overhead is only a few bits. However, these bits add up to a measurably larger object header ... which is the real point here.
Michael Borgwardt is kind of right that it only matters if the application is bottlenecked on allocating memory; that follows from Amdahl's law.
However, I have used C++, Java, and VB.NET. In C++ there are powerful techniques available that allocate memory on the stack instead of the heap. Stack allocation is easily hundreds of times faster than heap allocation. I would say that use of these techniques could remove maybe one allocation in eight, and use of writable strings one allocation in four.
It's no joke when people claim highly optimized C++ code can trounce the best possible Java code. It's the flat out truth.
Microsoft claims the overhead in using any of the .NET family of languages over C++ is about two to one. I believe that number is just about right for most things.
HOWEVER, managed environments carry a particular benefit: when dealing with inferior programmers, you don't have to worry about one module trashing another module's memory, the resulting crash being blamed on the wrong developer, and the bug being difficult to find.
At least as I read it, your real question is whether there have been significant developments in garbage collection or manual memory management since that paper was published that would invalidate its results. The answer to that is somewhat mixed. On one hand, the vendors who provide garbage collectors do tune them, so their performance tends to improve over time. On the other hand, there hasn't been anything like a major breakthrough, such as fundamentally new garbage collection algorithms.
Manual heap managers generally improve over time as well. I doubt most are tuned with quite the regularity of garbage collectors, but in the course of 5 years, probably most have had at least a bit of work done.
In short, both have undoubtedly improved at least a little, but in neither case have there been major new algorithms that change the fundamental landscape. It's doubtful that current implementations will give a difference of exactly 17% as quoted in the article, but there's a pretty good chance that if you repeated the tests today, you'd still get a difference somewhere around 15-20% or so. The differences between then and now are probably smaller than the differences between some of the different algorithms they tested at that time.
I am not sure how relevant your question still is today. A performance-critical application shouldn't spend a significant portion of its time doing object creation (as the micro-benchmark is very likely to do), and performance on modern systems is more likely to be determined by how well the application fits into the CPU's cache, rather than how much main memory it uses.
BTW: there are lots of tricks you can do in C++ to support this which are not available in Java.
If you are worried about the cost of GC or object creation, you can take steps to minimise how many objects you create. This is generally a good idea where performance is critical in any language.
The cost of main memory isn't as much of an issue as it used to be. A machine with 48 GB is relatively cheap these days. An 8-core server with 48 GB of main memory can be leased for £9/day. Try hiring a developer for £9/day. ;) However, what is still relatively expensive is CPU cache memory. It is fairly hard to find a system with more than 16 MB of CPU cache, compared with 48,000 MB of main memory. A system performs much better when an application is using its CPU cache, and this is the amount of memory to consider if performance is critical.
First, note that it's now 2019 and a lot of things have improved.
As long as you don't trigger a garbage collection, allocation is as simple as incrementing a pointer. In C++ it takes much more work unless you implement your own mechanism for allocating in chunks.
And if you use shared smart pointers, each change to a reference count requires a locked increment (the xaddl instruction), which is slow in itself and requires the processors to communicate in order to invalidate and resynchronise their cache lines.
What is more, with a GC you get better locality in at least three ways. First, when it allocates a new segment, it zeroes the memory and warms the cache lines. Second, it compacts the heap, which causes data to stay closer together. Lastly, each thread allocates from its own heap segment.
In conclusion, although it's hard to test and compare every scenario and GC implementation, I've read somewhere on SO that it has been shown that a GC can perform better than manual memory management.
I have a large 2 GB file with 1.5 million listings to process. I am running a console app that performs some string manipulation and then uploads each listing to the database.
I create a LINQ object and clear it by assigning it to a new LinqObject() for each listing (each loop iteration).
When the object is complete, I add it to a list.
When the list reaches 100 objects, I call SubmitAll on the entire list, clear the list, then repeat.
My memory usage continues to grow as the program runs. Is there anything I should be doing to keep memory usage down? I tried GC.Collect(). I think I want to use Dispose...
Thanks in advance for looking.
It's normal for the memory usage of a program to increase when it's working. You should not try to force the garbage collector to reduce the memory usage to try to save resources, this will most likely waste resources instead.
Contrary to one's first reaction, high memory usage is not a performance problem as long as there is any free memory left at all. Having a lot of unused memory doesn't increase performance a bit. If you try to reduce the memory usage only to keep it down, you are just wasting CPU time on cleanup that is not needed.
If you are running out of free memory, or if some other application needs it, the garbage collector will do the appropriate cleanup. In almost every situation the garbage collector will know much more about the current memory situation than you can possibly anticipate when writing the code.
If you are using objects that implement the IDisposable interface, you should call the Dispose method to free unmanaged resources, but all other objects are handled by the garbage collector. Managed objects normally don't leak memory at all.
Do you need your memory usage to stay low? Absent an actual functional problem, high memory usage in and of itself is not an issue.
How large is the memory usage growing? It may be that .NET is just "settling" effectively.
It's not really clear exactly how you're doing this, but the general principle sounds okay. I suggest you take the database work out of the equation - just comment out whichever line would actually submit to the database. See how much memory that uses. Other than the StreamReader (or whatever) you shouldn't have anything else that needs disposing if you're not touching the database - just building batches of transformed objects and throwing them away.
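A rough sketch of a loop shaped so that the database line is trivial to comment out for that experiment (Listing, ParseListing and SubmitBatch are placeholders for your own types and LINQ submit code):

    using System.Collections.Generic;
    using System.IO;

    static class ListingImporter
    {
        public static void ProcessListings(string path)
        {
            var batch = new List<Listing>(100);

            using (var reader = new StreamReader(path))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    batch.Add(ParseListing(line));   // placeholder: your string manipulation

                    if (batch.Count == 100)
                    {
                        SubmitBatch(batch);          // comment this out to measure memory without the DB
                        batch.Clear();               // drop references so the GC can reclaim the batch
                    }
                }
            }

            if (batch.Count > 0)
                SubmitBatch(batch);                  // flush the final partial batch
        }

        // Placeholders standing in for the real parsing and LINQ submission code.
        private static Listing ParseListing(string line) { return new Listing(); }
        private static void SubmitBatch(List<Listing> batch) { /* SubmitAll equivalent */ }
    }

    class Listing { }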