I've got what I assume is a memory fragmentation issue.
We've recently ported our WinForms application to WPF. The application does some image processing, and this processing always worked in the WinForms version. We move to WPF, and the processing dies. Debugging into the library shows the failure at random spots, but always with an array that is null, i.e., an allocation failed.
The processing itself is done in a C++ library called via P/Invoke and is fairly memory-intensive; if the given image is N x M pixels, then the image is N x M x 2 bytes big (each pixel is an unsigned short, and it's a greyscale image). During the processing, image pyramids are made, which are in float space, so the total memory usage will be N x M x (2 + 2 + 4 + 4 + 4 + 4), where the first 2 is the input, the second 2 is the output, the first 4 is the input in floats, the second 4 is the 0th-level difference image, and the last two fours are the rest of the pyramid (since they're pyramids and each level is half the size in each direction, these 4s are upper bounds). So, for a 5000x6000 image, that's 600 MB, which should fit into memory just fine.
(There's the possibility that marshalling is increasing the memory requirement by another N x M x 4, i.e., the input and output images on the C# side plus the same arrays copied to the C++ side. Could the marshalling requirement be even bigger?)
How badly fragmented is a WPF application's memory compared to a WinForms one? Is there a way to consolidate memory before running this processing? I suspect that fragmentation is the issue because of the random nature of the breakages, when they happen, and because it's always a memory allocation problem.
Or should I avoid this problem entirely by making the processing run as a separate process, with data transfer via sockets or some such similar thing?
If I read this correctly, the memory allocation failure is happening on the non-managed side, not the managed side. It seems strange then to blame WPF. I recognize that you are drawing your conclusion based on the fact that "it worked in WinForms", but there are likely more changes than just that. You can use a tool like the .NET Memory Profiler to see the differences between how the WPF application and the WinForms application are treating memory. You might find that your application is doing something you don't expect. :)
Per comment: Yup, I understand. If you're confident that you've ruled out things like environment changes, I think you have to grab a copy of BoundsChecker and Memory Profiler (or DevPartner Studio) and dig in and see what's messing up your memory allocation.
I'm guessing that the GC is moving your memory. Try pinning the memory for as long as unmanaged code holds a raw pointer to the array, and unpin it as soon as possible. It's possible that WPF causes the GC to run more often, which would explain why the failure happens more often there, and if it is the GC, that would also explain why it happens at random places in your code.
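For instance, here's a minimal sketch of pinning around a native call; the `ProcessImage` P/Invoke signature and DLL name are hypothetical stand-ins for the real library entry point:

```csharp
using System;
using System.Runtime.InteropServices;

class Processor
{
    // Hypothetical P/Invoke signature for the native processing routine.
    [DllImport("ImageProcessing.dll")]
    static extern int ProcessImage(IntPtr input, IntPtr output, int width, int height);

    static void Process(ushort[] input, ushort[] output, int width, int height)
    {
        // Pin both arrays so the GC cannot move them while native code
        // holds raw pointers into them (also avoids marshalling copies).
        GCHandle hIn = GCHandle.Alloc(input, GCHandleType.Pinned);
        GCHandle hOut = GCHandle.Alloc(output, GCHandleType.Pinned);
        try
        {
            ProcessImage(hIn.AddrOfPinnedObject(), hOut.AddrOfPinnedObject(), width, height);
        }
        finally
        {
            // Unpin as soon as the native call returns.
            hOut.Free();
            hIn.Free();
        }
    }
}
```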
Edit: Out of curiosity, could you also pre-allocate all of your memory up front (I don't see the code, so I don't know if this is possible) and make sure all of your pointers are non-null, so you can verify that the failure is actually happening in the memory allocation rather than somewhere else?
It sounds like you want to be more careful about your memory management in general; i.e., either run the processing engine in a separate address space that carefully manages memory, or pre-allocate a sufficiently large chunk before memory gets too fragmented and manage images in that area only. If you're sharing an address space with the .NET runtime in a long-running process and you need large contiguous areas, it's always going to potentially fail at some point. Just my 2c.
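As an illustration of the pre-allocation idea, something along these lines (the class name and sizes are hypothetical, not taken from the question's code) reserves the native buffers once, at startup, before the address space has a chance to fragment:

```csharp
using System;
using System.Runtime.InteropServices;

// Reserve the native working memory once, at startup, and hand the same
// buffers to every processing call instead of allocating per image.
class NativeBufferPool : IDisposable
{
    public IntPtr InputBuffer { get; private set; }
    public IntPtr OutputBuffer { get; private set; }

    public NativeBufferPool(int maxWidth, int maxHeight)
    {
        long pixels = (long)maxWidth * maxHeight;
        // One ushort (2 bytes) per pixel for input and output, as in the question.
        InputBuffer = Marshal.AllocHGlobal(new IntPtr(pixels * sizeof(ushort)));
        OutputBuffer = Marshal.AllocHGlobal(new IntPtr(pixels * sizeof(ushort)));
    }

    public void Dispose()
    {
        Marshal.FreeHGlobal(InputBuffer);
        Marshal.FreeHGlobal(OutputBuffer);
    }
}
```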
This post might be useful
http://blogs.msdn.com/tess/archive/2009/02/03/net-memory-leak-to-dispose-or-not-to-dispose-that-s-the-1-gb-question.aspx
Related
As the title states, I have a problem with high page file activity.
I am developing a program that processes a lot of images, which it loads from the hard drive.
From every image it generates some data, which I save in a list. For every 3600 images, I save the list to the hard drive; its size is about 5 to 10 MB. The program runs as fast as it can, so it maxes out one CPU thread.
The program works, and it generates the data it is supposed to, but when I analyze it in Visual Studio I get a warning saying: DA0014: Extremely high rates of paging active memory to disk.
The memory consumption of the program, according to Task Manager, is about 50 MB and seems to be stable. When I ran the program I had about 2 GB free out of 4 GB, so I guess I am not running out of RAM.
http://i.stack.imgur.com/TDAB0.png
The DA0014 rule description says: "The number of Pages Output/sec is frequently much larger than the number of Page Writes/sec, for example, because Pages Output/sec also includes changed data pages from the system file cache. However, it is not always easy to determine which process is directly responsible for the paging or why."
Does this mean that I get this warning simply because I read a lot of images from the hard drive, or is it something else? Not really sure what kind of bug I am looking for.
EDIT: Link to image inserted.
EDIT1: The images are about 300 KB each. I dispose of each one before loading the next.
UPDATE: From experiments, it looks like the paging comes just from loading the large number of files. As I am no expert in C# or the underlying GDI+ API, I don't know which of the answers is most correct. I chose Andras Zoltan's answer because it was well explained and because it seems he did a lot of work to explain the reason to a newcomer like me :)
Updated following more info
The working set of your application might not be very big, but what about the virtual memory size? Paging can occur because of this and not just because of the application's physical size. See this screenshot from Process Explorer of VS2012 running on Windows 8:
And in Task Manager? Apparently the private working set for the same process is 305,376 KB.
We can take from this a) that Task Manager can't necessarily be trusted and b) that an application's size in memory, as far as the OS is concerned, is far more complicated than we'd like to think.
You might want to take a look at this.
The paging is almost certainly because of what you do with the files, and the high final figures are almost certainly because of the number of files you're working with. A simple test would be to experiment with different numbers of files and generate a dataset of final paging figures alongside those counts. If the number of files is causing the paging, you'll see a clear correlation.
Then take out any processing you do (but keep the image loading) and compare again - note the difference.
Then stub out the image-loading code completely - note the difference.
Clearly you'll see the biggest drop in faults when you take out the image loading.
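Here's a minimal sketch of the load-only step, assuming plain GDI+ loading (Emgu.CV wraps GDI+ internally); the directory path and file pattern are placeholders:

```csharp
using System;
using System.Drawing;
using System.IO;

// Load each image, dispose of it immediately, and do no processing.
// Compare this run's paging counters against the full run's.
class LoadOnlyTest
{
    static void Main()
    {
        foreach (string path in Directory.EnumerateFiles(@"C:\imageDir", "*.jpg"))
        {
            using (var bmp = new Bitmap(path))
            {
                // Touch one pixel so the decode isn't skipped.
                Color c = bmp.GetPixel(0, 0);
            }
        }
    }
}
```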
Now, looking at the Emgu.CV Image code, it uses the Image class internally to get the image bits - so that's firing up GDI+, via the function GdipLoadImageFromFile (second entry on this index), to decode the image (using system resources, plus potentially large byte arrays) - and then it copies the data to an uncompressed byte array containing the actual RGB values.
This byte array is allocated using GCHandle.Alloc (also surrounded by GC.AddMemoryPressure and GC.RemoveMemoryPressure) to create a pinned byte array holding the uncompressed image data. Now, I'm no expert on .NET memory management, but it seems to me that we have the potential for heap fragmentation here, even if each file is loaded sequentially and not in parallel.
Whether that's causing the hard paging I don't know. But it seems likely.
In particular, the in-memory representation of the image could be specifically geared around display rather than being the original file bytes. So if we're talking JPEGs, for example, then a 300 KB JPEG could be considerably larger in physical memory, depending on its dimensions. E.g. a 1024x768 32-bit image is 3 MB - and that's allocated twice for each image, since it's loaded (first allocation) and then copied (second allocation) into the EMGU image object before being disposed.
But you have to ask yourself if it's necessary to find a way around the problem. If your application is not consuming vast amounts of physical RAM, then it will have much less of an impact on other applications; one process hitting the page file a lot won't badly affect another process that doesn't, as long as there's sufficient physical memory.
However, it is not always easy to determine which process is directly responsible for the paging or why.
The devil is in that cop-out note. Bitmaps are mapped into memory from the file that contains the pixel data using a memory-mapped file. That's an efficient way to avoid reading and writing the data directly into/from RAM; you only pay for what you use. The mechanism that keeps the file in sync with RAM is paging. So it is inevitable that if you process a lot of images, you'll see a lot of page faults. The tool you use just isn't smart enough to know that this is by design.
Feature, not a bug.
I'm trying to insert data with big column values (1-25 MB), and after a couple of seconds, one of my nodes dies, either by throwing an OOM or by being stuck in an endless GC loop.
It usually tries to flush CFs, but then it says "Unable to reduce heap usage since there are no dirty column families".
Since the log advised me to reduce memtable/cache sizes, I tried to figure out what was using up all this memory in order to adapt my settings, so I ran nodetool flush / invalidaterowcache / invalidatekeycache and then triggered a GC through jconsole.
Unfortunately, my memory usage stayed high (>60%) even though the server is idling.
So, my problem is: why is the server running out of memory when inserting big values? And why isn't the server giving some memory back?
Edit
I did a heap dump, and the heap is full of byte[], mainly referenced by org.apache.cassandra.io.sstable.IndexSummary$KeyPosition.
I don't understand how this is possible since everything is supposed to have been flushed.
It seems to me that you've hit the infamous memory fragmentation issue. I'm not sure whether Cassandra works around some of the fragmentation issues, but generally, .NET, and potentially any Windows program, can run into this.
When you allocate anything above 85,000 bytes (yes, an odd number, but it is what it is), the object is stored on the Large Object Heap. The LOH gets GC'ed only as part of generation 2 and, worse, it never gets compacted. The reason is partly down to the way the OS is implemented.
Result: when you store objects of, say, 2 MB, 5 MB, 3 MB, 2 MB and 3 MB, and the two 2 MB objects get GC'ed, you potentially have 4 MB free. But if you then try to create a new 3 MB object, it cannot be placed in that space because of the fragmentation (two separate holes of 2 MB each), so it moves to the top of the heap. Eventually, this runs out of room. So: there can be enough memory available, but you will get an OOM regardless, due to this fragmentation.
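Here's a small sketch of that allocation pattern; the sizes mirror the example above, though whether you actually see an OOM depends on how much address space is left overall:

```csharp
using System;
using System.Collections.Generic;

// All of these arrays exceed the 85,000-byte threshold, so they land on the
// Large Object Heap; freeing the two 2 MB arrays leaves two non-adjacent
// holes that a single 3 MB allocation cannot use.
class LohFragmentationDemo
{
    static void Main()
    {
        const int MB = 1024 * 1024;
        var survivors = new List<byte[]>();

        byte[] a = new byte[2 * MB];
        survivors.Add(new byte[5 * MB]);
        survivors.Add(new byte[3 * MB]);
        byte[] b = new byte[2 * MB];
        survivors.Add(new byte[3 * MB]);

        a = null; b = null;          // drop the two 2 MB arrays
        GC.Collect();                // the LOH is collected but not compacted

        byte[] c = new byte[3 * MB]; // can't fit in either 2 MB hole; grows the heap

        GC.KeepAlive(survivors);
        GC.KeepAlive(c);
    }
}
```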
This issue is mostly seen in 32-bit x86 applications, whether on 64-bit Windows (WOW64) or 32-bit Windows. 64-bit applications have the fragmentation issue too, but since the virtual address space is much larger, you first hit paging (becoming really slow) before you hit actual fragmentation issues.
If this is indeed the issue (you can inspect the fragmentation visually with VMMap and with WinDbg), you can solve it by creating a large pool of byte arrays up front and reusing them from your own pool, thus preventing fragmentation.
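A minimal sketch of the pooling idea, with illustrative counts and sizes (the class is hypothetical, not from any particular library):

```csharp
using System;
using System.Collections.Generic;

// Allocate a fixed set of large buffers once, up front, and reuse them, so
// the Large Object Heap never sees a churn of differently sized allocations.
class BufferPool
{
    private readonly Queue<byte[]> _free = new Queue<byte[]>();
    private readonly object _lock = new object();

    public BufferPool(int bufferCount, int bufferSize)
    {
        for (int i = 0; i < bufferCount; i++)
            _free.Enqueue(new byte[bufferSize]); // all big allocations happen here
    }

    public byte[] Rent()
    {
        lock (_lock)
        {
            if (_free.Count == 0)
                throw new InvalidOperationException("Pool exhausted");
            return _free.Dequeue();
        }
    }

    public void Return(byte[] buffer)
    {
        lock (_lock)
        {
            _free.Enqueue(buffer);
        }
    }
}
```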
I investigated the heap dump with MAT and it turns out that the OutOfMemory happened because a lot of memory was used by Thrift.
Since I had to transfer big chunks of data for my column values, I changed those settings to 128, to "be safe":
thrift_framed_transport_size_in_mb
thrift_max_message_length_in_mb
But it turns out that Thrift allocates one byte[2 * thrift_max_message_length_in_mb] per receiving thread, and I had three of those, so I was using 768 MB just for receive buffers...
Changing the settings to 32 fixed my issue.
I have a simple C# console app which imports text data into SQL.
It uses around 300 KB of memory and 80% of the CPU. There are 2 GB of RAM available at any time, and yet the page fault counter shows 500K.
The app is 32-bit, the OS is either Windows 2000 or XP 32-bit, and the framework is .NET 3.5.
Can anyone explain what the problem could be and how I can investigate this further?
EDIT: I am now certain that the page faults are related to disk I/O (reads). I commented out the SQL part, and the pure disk reads alone generate that high number.
EDIT2: There are 200 hard faults/sec and 4000 soft faults/sec on average.
I wonder if the same would appear on W2008
First, how do you measure the memory the app is using? If you're looking at "working set", that's only the part that resides in physical memory. You should also take a look at the "VM Size" (or "Commit Size"), which is the actual virtual memory your process takes up.
If the Windows kernel's Balance Set Manager thinks that your app is inactive, or should be left behind to give other processes more power, it can decide to reduce the working set size. If the working set is smaller than what your application actually needs to work on, you can easily see a lot of page faults, because it simply becomes a race between the Balance Set Manager and the application. Usually the Balance Set Manager monitors memory usage and can also decide to increase the working set size accordingly. However, this can be prevented in certain circumstances, such as low free physical memory, high I/O (cache stress on physical memory), low process priority, or the background/foreground status of the application.
It could simply be the behavior of the .NET garbage collector: a vast number of small memory blocks getting allocated and disposed of in a very short time puts stress on both memory allocation and release. The "VM Size" could stay around the same, but behind the scenes the process could be continuously allocating/freeing memory, causing continuous page faults.
Also note that the DLLs the process uses are accounted for in the process statistics. It might not be your app but one of the COM or .NET DLLs you are using that is causing this behavior. You can deduce the actual culprit by changing your application's behavior (e.g., removing the DB access code and leaving only the object allocation code) to see which component is actually causing the thrashing.
EDIT: About your question on the GC's impact on memory thrashing: the CLR actually grows the heap dynamically and gives memory back to the OS as needed. That does not occur synchronously. The GC runs behind the scenes and frees memory in large chunks to avoid hindering application performance. Say you are allocating many small objects and freeing them almost immediately. That causes many references to stay in memory for a moment before being freed. It is easy to imagine that this becomes a head-to-head race between the garbage collector and the memory-allocating code. While the GC eventually catches up, the required new memory must be satisfied from "new memory", not the old, because the old is not freed up yet. Since the actual memory we are working on stays about the same, the Balance Set Manager may not think of giving our process more memory, because we're on the edge: always around the same physical memory size, but constantly needing "newly allocated memory" rather than "more memory", hence the page faults.
Page faults are normal. Memory gets swapped out, and when you next access it, that's a page fault: the system brings it back. This is by design.
I've got an app running on my machine right now with 500 million page faults. There's nothing to worry about!
Page faults can indicate memory issues; consider increasing memory if you have excessive page faults.
You may have a large working set size.
The working set is the set of memory pages currently loaded in RAM. This is measured by Process\Working Set. A high value might indicate that you have loaded a number of assemblies.
Process\Working Set has no specific threshold value to watch, although a high or fluctuating value can indicate a memory shortage. A high or fluctuating value accompanied by a high rate of page faults clearly indicates that your server does not have enough memory.
Further reading:
Check memory under System Resources in following MSDN article:
http://msdn.microsoft.com/en-us/library/ff647791.aspx#scalenetchapt15_topic9
Please provide some code to investigate.
A possible answer, which I am currently testing on my application: break up your working set into smaller chunks and work with the chunks.
For instance, I have a large list of objects (9,000-30,000). If I break up that list into chunks of around 500 at a time, the OS should keep those 500 objects in memory while I work on them.
You will want to increase or decrease the size of your chunk until you can work with it fast enough that the OS keeps it in memory. This is theory; I haven't fully tested it yet, but it should work.
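A minimal sketch of the chunking idea; `ProcessInChunks` and `processItem` are hypothetical names standing in for whatever per-object work you do:

```csharp
using System;
using System.Collections.Generic;

// Walk a large list in fixed-size slices so only one slice's worth of
// objects is actively touched at a time.
class ChunkedProcessor<T>
{
    public void ProcessInChunks(IList<T> items, int chunkSize, Action<T> processItem)
    {
        for (int start = 0; start < items.Count; start += chunkSize)
        {
            int end = Math.Min(start + chunkSize, items.Count);
            for (int i = start; i < end; i++)
                processItem(items[i]);
        }
    }
}
```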
I need to load a large number of bitmaps into memory for display in a WPF app (using .NET 4.0). Where I run into trouble is when I approach around 1,400 MB of memory (I am getting this from the process list in Task Manager).
The same thing happens whether the app is run on a machine with 4 GB of memory or 6 GB (and some other configurations whose details I do not have). It is easy to test by reducing the number of images loaded: when it works on one machine it works on them all, and when it crashes on one it crashes on all.
When I reduce the image count so the app loads without causing the memory exception, I can run multiple instances of the app (exceeding the 1.4 GB of the single instance) without a problem, so it appears to be some per-instance limit or a per-instance error on my part.
I load the images as BitmapImage objects, and they are either stored in a List<BitmapImage> or loaded into a List<byte[]>, where they are later used in a bunch of layered sequences (using a WriteableBitmap).
The error occurs when I load the images, not while they are in use. In the repeatable case I load 600 640x640 images plus another 200-300 smaller images ranging from 100x100 to 200x200, although it appears to be the overall bit count that is the problem.
So my questions are:
* Is there some built-in per-process memory limit in a situation like this?
* Is there a better technique to load large amounts of image data into memory?
Thanks,
Brian
Yes, there is a per-process limit on memory allocations. One of the solutions is to make your binary LARGEADDRESSAWARE so it can use more memory.
Refer to Out of memory? Easy ways to increase the memory available to your program; it has a great discussion of solutions to this.
Below may be a cause, but I am not sure.
The problem may not be loading a large amount of data as such, but that the CLR maintains a Large Object Heap for objects greater than 85 KB, and you don't have direct control over freeing this heap.
These objects become long-lived and are normally only deallocated when the AppDomain unloads.
I would suggest trying to load the larger images in a separate AppDomain and using that AppDomain to manipulate them.
See this MSDN entry on profiling the GC.
See if memory-mapped files help, since you are using .NET 4.0.
And one more example.
An x86 build can access 4 GB on 64-bit Windows, so that's the theoretical upper limit for the process. This requires the application to be large address aware. Additionally, .NET imposes a 2 GB limit on a single object.
You may be suffering from LOH fragmentation. Objects larger than 85,000 bytes are stored on the Large Object Heap, which is a special part of the managed heap that doesn't get compacted.
You say that the images are 640x640, but what is the pixel format, and is there a mask as well? If you use a byte per color channel plus a byte for the alpha channel, each picture is 640x640x4 bytes (about 1.6 MB), so trying to load 600 of them at once will be a problem in a 32-bit process.
You're running into the limitation of 32-bit processes, which can only access about 2 GB of data. If you were to run 64-bit, you wouldn't have these issues.
There are a number of ways to work around the issue, some of which are:
Simply don't load that much data; load only what is needed, and use caching (see the sketch after this list).
Use memory-mapped files to map whole chunks of data into memory. Not recommended, as you'll have to do all the memory management yourself.
Use multiple processes to hold the data and use an IPC mechanism to bring over only the data you need, similar to item 1.
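As a sketch of option 1, assuming WPF's BitmapImage (the cache capacity of 50 is an arbitrary illustration): load bitmaps on demand and keep only a bounded number alive at once.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Media.Imaging;

// Load bitmaps lazily and evict the oldest when the cache is full, so the
// process never holds all 600+ decoded images at the same time.
class BitmapCache
{
    private readonly Dictionary<string, BitmapImage> _cache = new Dictionary<string, BitmapImage>();
    private readonly Queue<string> _order = new Queue<string>();
    private const int MaxEntries = 50;

    public BitmapImage Get(string path)
    {
        BitmapImage image;
        if (_cache.TryGetValue(path, out image))
            return image;

        image = new BitmapImage();
        image.BeginInit();
        image.UriSource = new Uri(path);
        image.CacheOption = BitmapCacheOption.OnLoad; // decode now, release the file handle
        image.EndInit();
        image.Freeze(); // avoids extra copies and allows cross-thread use

        _cache[path] = image;
        _order.Enqueue(path);
        if (_order.Count > MaxEntries)
            _cache.Remove(_order.Dequeue()); // evict the oldest entry

        return image;
    }
}
```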
What is the largest heap you have personally used in a managed environment such as Java or .NET? What were some of the performance issues you ran into, and did you see diminishing returns the larger the heap was?
I work on a 64-bit .NET system that typically uses 9-12 GB, and sometimes as much as 20 GB. I have not seen any performance problems, even during garbage collection, and I have been looking hard, as I was not expecting it to work so well.
An earlier version hung on to some objects for too long, resulting in occasional GCs that freed up 3 GB+. Even then, there was no noticeable impact on performance. The system is running on a 16-core server with 32 GB of RAM, which probably helps...
In .NET on 32-bit Windows, you can only really get to about 1.4 GB of memory usage before things start getting really screwy (out-of-memory exceptions). This is due to a limitation of 32-bit Windows that prevents a single process from using more than 2 GB of RAM. There is a /3GB switch you can put in your boot.ini, but that will only get you a little further. If you want to use lots of memory, you should seriously consider running on a 64-bit version of Windows.
I currently have a production application using 6 GB of memory. You'll need a 64-bit box as well for the JVM to be able to address that much.
The garbage collector is really the only thing (that I've found so far) where performance degrades with size, and then only if you manually kick off a System.gc(), which forces the JVM to bring everything to a screeching halt as it traverses 6 GB worth of objects. It takes a good 20 seconds, too. The default GC behavior does not do this, BTW; you have to be dumb enough to make it do that. JVM tuning is also worth researching at this size.
You can also find things like distributed and clustered JVMs. Sorry, I don't have any good references, as I didn't look into this option too closely, although I did find mentions of larger installations.
I am unsure what you mean by heap, but if you mean memory used, I have used quite a bit: 2 GB+. I have a web app that does image processing, and it requires loading two large scan files into memory for analysis.
There were performance issues. Windows would swap out lots of RAM, and that would create a lot of page faults. There was never any need for more than two images at a time, as all requests were against those images (I only allowed one session per image set at a time).
For instance, setting up the files for initial viewing would take about 5 seconds. Doing simple analysis and zooming was fairly fast once they were in memory, on the order of 0.1 to 0.5 seconds.
I still had to optimize, so I ended up pre-parsing the files, chopping them into smaller pieces, and working only with the pieces that the user needed at the time.
I have used from 2 GB to 5 GB of memory in Java, but usually when I get past 2 GB I really start thinking about memory optimization. The diminishing returns can range from being able to skip optimization because you have a lot of memory, to leaving no memory available for the OS/disk caches (which can help your application overall).
For Java, I recommend watching your memory usage per generation over time. Do you create a lot of temporary objects, or do you have long-lived objects that consume a lot of memory? A lot of memory optimization can be done once you know these things.