In my C# program I do various memory-consuming operations. Depending on the amount of memory currently available and various non-constant circumstances, the program fails with OutOfMemoryException at different stages.
I would like to stop processing at some point when it is more or less obvious that the program will fail with an OOM in the near future.
However, there is no fixed threshold for this; other users may have more (or less) memory, a different OS with its own memory specifics, and so on.
I do not want to simply check that the software consumes more than, e.g., 500 MB, as that may be too high or too low a restriction.
Is there any reliable way to predict an upcoming OOM in .NET?
MemoryFailPoint should do what you want.
http://msdn.microsoft.com/en-us/library/system.runtime.memoryfailpoint.aspx
The short answer is that this is probably not the best approach - instead of anticipating when you are going to run out of memory and trying to do something about it, you should probably either try to reduce the memory consumption of your application, or just wait until the OutOfMemoryException is thrown and then clean up.
An OutOfMemoryException is thrown whenever the .NET runtime fails to allocate the memory that was just requested - this could be because the machine has run out of physical memory and the swap file is disabled (or the machine has run out of disk space), or because the virtual address space of the process is too fragmented to allocate the desired block of memory (which can happen when working with large objects). Predicting when this is going to happen is fairly difficult.
You can use the MemoryFailPoint class to check whether a certain amount of memory is likely to be available before starting an operation that uses a large amount of memory; however, this class does not guarantee that the memory will remain available for the duration of the operation, so your application could still fail with an OOM exception anyway. Although this class can be useful in some scenarios to reduce OOM exceptions (and in turn to avoid corrupting application state), it's probably not going to be a "magic bullet" solution to your problem.
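To illustrate, a minimal sketch of how the class is typically used (the 64 MB figure and the DoMemoryIntensiveWork method are just placeholders for your own numbers and code):

using System;
using System.Runtime;

class MemoryGateExample
{
    static void Main()
    {
        try
        {
            // Ask the runtime whether roughly 64 MB is likely to be available.
            // The constructor throws InsufficientMemoryException if it thinks not.
            using (new MemoryFailPoint(64))
            {
                DoMemoryIntensiveWork();
            }
        }
        catch (InsufficientMemoryException ex)
        {
            // Back off gracefully instead of risking an OOM part-way through.
            Console.WriteLine("Not enough memory to start the operation: " + ex.Message);
        }
    }

    static void DoMemoryIntensiveWork()
    {
        // Placeholder for the real memory-consuming operation.
    }
}

Passing the gate only means the allocation is likely to succeed at that moment; as noted above, it does not guarantee the memory will still be there when you actually need it.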
Many of the OOM exceptions I have observed so far were caused by improperly written applications with memory leaks, or by third-party components. Consider using a data structure optimized for your operation.
Predicting an OOM is quite difficult because the exceptions are thrown somewhat unpredictably. Maybe you can use MemoryFailPoint and performance counters (available memory) to ensure there is enough memory available.
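As a rough sketch of the performance-counter idea (the 500 MB threshold is just an arbitrary example to tune for your own workload):

using System;
using System.Diagnostics;

class AvailableMemoryCheck
{
    static void Main()
    {
        // "Memory" / "Available MBytes" is a standard Windows counter for
        // physical memory currently available system-wide.
        using (var available = new PerformanceCounter("Memory", "Available MBytes"))
        {
            float availableMb = available.NextValue();
            Console.WriteLine("Available physical memory: {0} MB", availableMb);

            if (availableMb < 500) // example threshold only
            {
                Console.WriteLine("Low on memory - consider postponing the heavy operation.");
            }
        }
    }
}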
Rewording my question in an attempt to make it On-Topic:
We have a client (only one client out of many) that is consistently getting an Out of Memory exception with our software. I feel like we've eliminated the usual suspects, and I'm looking for ideas about what other, less standard causes might produce an OOM. Specifically, since this seems to be specific to a single customer, could it be caused by something wrong in the hardware, the OS, or the .NET install?
Here are the things I am aware of that cause an OOM and why I believe we've eliminated them as suspects:
1 - OOM caused by system running out of memory.
Why Not? Because the system has several GB available when these exceptions occur.
2 - OOM caused by process running out of memory due to over allocation or memory leaks.
Why Not? Because the process is using only about 100MB of memory at the time of the exceptions. We have monitored the memory usage for days (on the system in question) and have not noticed any significant increase in memory usage.
3 - OOM caused by running out of other system resources such as file handles, etc.
Why Not? The exceptions are happening exclusively during run-of-the-mill memory allocations, not while opening a file or connecting to a socket.
4 - OOM caused by attempting to allocate a large array with excessive memory fragmentation.
Why Not? The memory blocks that we are allocating are fairly small (640x480x2, for the most part). With so much memory available, I have trouble believing that it could be so fragmented that something like that would fail.
So, just to be clear, I am not asking "Why doesn't my code run?" My code does run, on all machines but one. I'm not asking anyone to debug my code. My question is: "What other possible causes, besides those we've eliminated, could be resulting in an Out of Memory exception?" Or, "Am I missing something that could have caused me to eliminate one of the known causes prematurely?"
As an FYI for anyone struggling with similar issues: I think we've finally hunted down the cause of this bug. It turns out the OpenGL drivers on certain cheaper on-board Intel graphics cards had a problem with the way we were writing bitmap data to the same texture ID over and over. I changed the code to delete the texture and allocate a new ID each time, and the problem seems to have gone away.
I have heard many times that once a managed C# program requests more memory from the OS, it doesn't give it back unless the system is out of memory. E.g. when an object is collected, it gets deleted and the memory it occupied is free to be reused by another managed object, but the memory itself is not returned to the operating system (for example, Mono on Unix wouldn't call brk/sbrk to shrink the process's virtual memory back to what it was before the allocation).
I don't know if this really happens or not, but I can see that my C# applications, running on Linux, use a small amount of memory at the beginning; then, when I do something memory-expensive, they allocate more, but later on, when all the objects have been deleted (I can verify that by putting a debug message in the destructors), the memory is not freed. On the other hand, no more memory is allocated when I run that memory-expensive operation again. The program just keeps occupying the same amount of memory until it is terminated.
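For illustration, the kind of quick test I am describing looks roughly like this (simplified; the sizes are made up, and I am just comparing the managed heap size with what the OS reports for the process):

using System;
using System.Diagnostics;

class MemoryReturnTest
{
    static void Main()
    {
        PrintUsage("before allocation");

        AllocateLots(); // memory-expensive work; everything becomes garbage afterwards

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // In my runs the managed heap figure drops back down here,
        // while the process memory reported by the OS stays high.
        PrintUsage("after collection");
    }

    static void AllocateLots()
    {
        var junk = new byte[200][];
        for (int i = 0; i < junk.Length; i++)
            junk[i] = new byte[1024 * 1024]; // ~200 MB total, all discarded on return
    }

    static void PrintUsage(string label)
    {
        using (var p = Process.GetCurrentProcess())
        {
            Console.WriteLine("{0}: managed heap {1:N0} bytes, working set {2:N0} bytes",
                label, GC.GetTotalMemory(false), p.WorkingSet64);
        }
    }
}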
Maybe it is just my misunderstanding of how the GC in .NET works, but if it really does work like this, why? What is the benefit of keeping the allocated memory for later instead of returning it to the system? How can it even know whether the system needs it back or not? What about other applications that crash or can't start because of an OOM caused by this effect?
I know people will probably answer something like "the GC manages memory better than you ever could, just don't worry about it" or "the GC knows best" or "it doesn't matter at all, it's just virtual memory", but it does matter: on my 2 GB laptop I run out of memory (and the kernel OOM killer kicks in because of that) very often after running C# applications for a while, precisely because of this irresponsible memory management.
Note: I was testing all of this with Mono on Linux, because I have a really hard time understanding how Windows manages memory, so debugging on Linux is much easier for me. Also, Linux memory management is open source, whereas the memory management of the Windows kernel / .NET is rather a mystery to me.
The memory manager works this way because there is no benefit in eagerly handing memory back when nothing else needs it.
If the memory manager always tried to keep as little memory allocated as possible, it would do a lot of work for no reason. It would only slow the application down, and the only benefit would be more free memory that no application is using.
Whenever the system needs more memory, it will tell the running applications to return as much as possible. The same signal is also sent to an application when you minimise it.
If this doesn't work the same with Mono in Linux, then that is a problem with that specific implementation.
Generally, if an app needs memory once, it will need it again. Releasing memory back to the OS only to request it again is overhead, and if nothing else wants the memory, why bother? The runtime is optimizing for the very likely scenario of wanting it again. Additionally, releasing it requires entire/contiguous blocks that can be handed back, which has a very specific impact on things like compaction: it isn't quite as simple as "hey, I'm not using most of this: have it back" - the runtime needs to figure out which blocks can be released, presumably after a full collect-and-compact (relocate objects, etc.) cycle.
I have been reading about out-of-memory conditions for some time now, and I have figured out that in most cases the out of memory exception (at least in .NET) isn't really caused by the system actually running out of memory, but rather by the system being unable to allocate a contiguous block of the requested size due to fragmentation.
What I don't really understand is that I've been in a situation where I still get an out of memory exception even when I try to allocate a large chunk of contiguous memory on application startup (e.g. loading 100 images). Since the application has just started up, it is reasonable to assume that not many allocations/deallocations have been done before that point, so there should be many free contiguous blocks available. In that case, why would the application still be hit by a memory fragmentation issue?
Note that I'm also fairly certain the issue was not caused by the system actually running out of the memory quota allocated to my application, because loading 100 images in my specific case only takes ~200 MB or so.
In my experience, Out of Memory mostly means poor object management. It's symptomatic of creating too many objects too fast, with the GC having a hard time keeping up. Setting aside the few products that take memory and never give it back (like SQL Server), out of memory can usually be prevented with caching and a well-defined object life cycle.
Here's a problem with what should be a continuously-running, unattended console app: I'm seeing too-frequent app exits from System.OutOfMemoryException being thrown from a wide variety of methods deep in the call stack -- often System.String.ToCharArray(), or System.String.CtorCharArrayStartLength(), or System.Xml.XmlTextReaderImpl.InitTextReaderInput(), but sometimes down in a System.String.Concat() call in a MongoCollection.Save() call's stack, and other unlikely places.
For what it's worth, we're using parallel tasks, but this is essentially the only app running on the server, and the app's total thread count never gets over 60. In some of these cases I know of a reason for some other exception to be thrown, but OutOfMemoryException makes no sense in these contexts, and it creates problems:
According to Task Manager and Perfmon logs, the system has had a minimum of 65% of its 8 GB of memory free when this has happened, and
While exception handlers sometimes fire & log the exception, they do not prevent an app crash, and
There's no continuing from this exception without user interaction (unless you suppress Windows Error Reporting, which isn't what we want system-wide, or run the app as a service, which is possible but sub-optimal for our use case)
So I'm aware of the workarounds mentioned above, but what I'd really like is some explanation -- and ideally a code-based handler -- for the unexpected OOM exceptions, so that we can engage appropriate continuation logic. Any ideas?
Getting that exception when using under 3GB of memory suggests that you are running a 32-bit app. Build it as a 64-bit app and it will be able to use as much memory as is available (close to 8GB).
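If you want to verify at run time what the process actually ended up as (for example when building as "Any CPU"), a quick check along these lines works on .NET 4 and later:

using System;

class BitnessCheck
{
    static void Main()
    {
        // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process;
        // Environment.Is64BitProcess reports the same thing directly.
        Console.WriteLine("64-bit OS:      " + Environment.Is64BitOperatingSystem);
        Console.WriteLine("64-bit process: " + Environment.Is64BitProcess);
        Console.WriteLine("IntPtr.Size:    " + IntPtr.Size);
    }
}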
As to why it's failing in the first place... how large is the data you are working with? If it's not very large, have you looked for references that keep data around much longer than necessary (i.e. a memory leak), thus preventing proper GC?
You need to profile the application, but the most common reason for these exceptions is excessive string creation. Excessive serialization and excessive XSLT transformations can also cause this.
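To illustrate the string-creation point, repeated concatenation in a loop allocates a new string on every iteration, whereas StringBuilder appends into a reusable buffer (a trivial, made-up example):

using System;
using System.Text;

class StringAllocationExample
{
    static void Main()
    {
        // Each += creates a brand-new string, so n iterations allocate
        // on the order of n intermediate strings.
        string slow = "";
        for (int i = 0; i < 10000; i++)
            slow += i + ",";

        // StringBuilder grows one internal buffer instead,
        // producing far fewer temporary allocations.
        var sb = new StringBuilder();
        for (int i = 0; i < 10000; i++)
            sb.Append(i).Append(',');
        string fast = sb.ToString();

        Console.WriteLine(slow.Length == fast.Length); // True
    }
}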
Do you have a lot of objects of 85,000 bytes or larger? Every such object will go to the Large Object Heap, which is not compacted. That is, unlike with the Small Object Heap, the GC will not move objects around to fill the memory holes, which can lead to fragmentation - a potential problem for long-lived applications.
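A small sketch of the threshold (sizes chosen just to sit on either side of the ~85,000 byte cutoff; in the CLR versions I have used, LOH objects are reported as generation 2 straight away):

using System;

class LargeObjectHeapExample
{
    static void Main()
    {
        byte[] small = new byte[80 * 1000];   // below the ~85,000 byte threshold
        byte[] large = new byte[100 * 1000];  // above it, allocated on the Large Object Heap

        // Freshly allocated small objects start in generation 0,
        // while LOH objects show up as generation 2 immediately.
        Console.WriteLine("small: gen " + GC.GetGeneration(small)); // typically 0
        Console.WriteLine("large: gen " + GC.GetGeneration(large)); // typically 2
    }
}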
As of .NET 4, this is still the case, but it seems they made some improvements in .NET 4.5.
A quick-and-dirty workaround is to make sure the application can use all the available memory by building it as "x64" or "Any CPU", but the real solution would be to minimize repeated allocation/deallocation cycles of large objects (i.e. use object pooling or avoid large objects altogether, if possible).
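As an example of the pooling idea, a very simple buffer pool might look like this (a sketch only; the class name and sizing policy are made up, and a real pool needs limits and more careful lifetime rules):

using System.Collections.Concurrent;

// Hands out reusable byte[] buffers so the same few large arrays get
// recycled instead of repeatedly churning the Large Object Heap.
class SimpleBufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    public SimpleBufferPool(int bufferSize)
    {
        _bufferSize = bufferSize;
    }

    public byte[] Rent()
    {
        byte[] buffer;
        return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    public void Return(byte[] buffer)
    {
        // Only keep buffers of the expected size; drop anything else.
        if (buffer != null && buffer.Length == _bufferSize)
            _buffers.Add(buffer);
    }
}

Callers Rent() a buffer, use it, and Return() it when finished, rather than allocating a fresh large array for every operation.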
You might also want to look at this.
We have a 64-bit C#/.NET 3.0 application that runs on a 64-bit Windows server. From time to time the app can use a large amount of memory, which is available. In some instances the application stops allocating additional memory and slows down significantly (500+ times slower). When I check the memory from Task Manager, the amount of memory used barely changes. The application keeps running very slowly and never throws an out of memory exception.
Any ideas? Let me know if more data is needed.
You might try enabling server mode for the Garbage Collector. By default, all .NET apps run in Workstation Mode, where the GC tries to do its sweeps while keeping the application running. If you turn on server mode, it temporarily stops the application so that it can free up memory (much) faster, and it also uses different heaps for each processor/core.
Most server apps will see a performance improvement using the GC server mode, especially if they allocate a lot of memory. The downside is that your app will basically stall when it starts to run out of memory (until the GC is finished).
To enable this mode, insert the following into your app.config or web.config:
<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>
The moment you hit the physical memory limit, the OS will start paging (that is, writing memory to disk). This will indeed cause the kind of slowdown you are seeing.
Solutions?
Add more memory - this will only help until you hit the new memory limit
Rewrite your app to use less memory
Figure out if you have a memory leak and fix it
If memory is not the issue, perhaps your application is hitting the CPU very hard? Do you see the CPU getting close to 100%? If so, check for large collections that are being iterated over and over.
As on 32-bit Windows operating systems, there is a 2 GB limit on the size of a single object you can create while running a 64-bit managed application on a 64-bit Windows operating system.
Investigating Memory Issues (MSDN article)
There is an awful lot of good stuff mentioned in the other answers. However, I'm going to chip in my two pence (or cents - depending on where you're from!) anyway.
Assuming that this is indeed a 64-bit process as you have stated, here's a few avenues of investigation...
Which memory usage are you checking? Mem Usage or VMem Size? VMem size is the one that actually matters, since that applies to both paged and non-paged memory. If the two numbers are far out of whack, then the memory usage is indeed the cause of the slow-down.
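For reference, the same numbers can be read from code as well, which makes them easy to log alongside the slowdown (values are in bytes):

using System;
using System.Diagnostics;

class ProcessMemorySnapshot
{
    static void Main()
    {
        using (var p = Process.GetCurrentProcess())
        {
            // Roughly corresponds to Task Manager's "Mem Usage" column.
            Console.WriteLine("Working set:   {0:N0}", p.WorkingSet64);
            // Committed private memory, whether paged out or not.
            Console.WriteLine("Private bytes: {0:N0}", p.PrivateMemorySize64);
            // Total virtual address space in use (the "VMem Size" figure).
            Console.WriteLine("Virtual size:  {0:N0}", p.VirtualMemorySize64);
        }
    }
}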
What's the actual memory usage across the whole server when things start to slow down? Does the slowdown also apply to other apps? If so, then you may have a kernel memory issue - which can be due to huge amounts of disk access and low-level resource usage (for example, creating 20,000 mutexes, or loading a few thousand bitmaps via code that uses Win32 HBitmaps). You can get some indication of this in Task Manager (although Windows 2003's version is more directly informative on this than 2008's).
When you say that the app gets significantly slower, how do you know? Are you using vast dictionaries or lists? Could it not just be that the internal data structures are getting so big that they complicate the work the internal algorithms are performing? When you get to huge element counts, some algorithms can become slower by orders of magnitude.
What's the CPU load of the application when it's running at full pelt? Is it actually the same as when the slow-down occurs? If the CPU usage decreases as the memory usage goes up, that means whatever it's doing is taking the OS longer to fulfil, i.e. it's probably putting too much load on the OS. If there's no difference in CPU load, then my guess is that it's the internal data structures getting so big that they slow down your algorithms.
I would certainly look at running Perfmon against the application - starting off with some .NET and native memory counters, cache hits and misses, and disk queue length. Run it over the course of the application from startup until it starts to run like an asthmatic tortoise, and you might just get a clue from that as well.
Having skimmed through the other answers, I'd say there are a lot of good ideas. Here's one I didn't see:
Get a memory profiler, such as SciTech's MemProfiler. It will tell you what's being allocated and by what, and it will let you slice and dice the results.
It also has video tutorials in case you don't know how to use it. In my case, I discovered I had IDisposable instances that I wasn't wrapping in using(...) blocks.