I am running a large ASP.NET 4.0 website. It uses a popular .NET content management system, has thousands of content items and hundreds of concurrent users - it is, in short, a heavy website.
Over the course of a day the memory usage of the IIS7 worker process can rise to 8-10 GB. The server has 16 GB installed and is currently set to recycle the app pool once per day.
I am being pressured to reduce memory usage. Much of it is due to caching of large strings of data - but the cache interval is only set to 5-10 minutes, so these strings should eventually expire from memory.
However, after running RedGate Memory Profiler I can see what I think are memory leaks. I have filtered my Instance List results to objects that are "kept in memory exclusively by Disposed Objects" (I read on the RedGate forum that this is how you find memory leaks). This gave me a long list of strings that are being held in memory.
For each string I use the Instance Retention Graph to see what holds it in memory. The System.String objects seem to have been cached at some point by System.Web.Caching.CacheDependency. If I follow the graph all the way up, it goes through various other classes, including System.Collections.Specialized.ListDictionary, until it reaches System.Web.FileMonitor. This makes some sense, as the strings are paths to files (images, PDFs, etc.).
It seems that the CMS is caching paths to files, but these cached objects are then "leaked". Over time this builds up and eats up RAM.
Sorry this is long-winded... Is there a way for me to stop these memory leaks, or to clear them down without resorting to recycling the app pool? Can I find which class or code is doing the caching, to see if I can fix the leak?
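One diagnostic I am considering, assuming the CMS caches through the standard ASP.NET cache (HttpRuntime.Cache), which not every CMS does, is to enumerate the cache and dump what it holds - a rough sketch, with a made-up class name:

    using System.Collections;
    using System.Text;
    using System.Web;

    // Rough diagnostic: list what is currently in the ASP.NET cache.
    // Assumes the CMS caches via HttpRuntime.Cache; CacheInspector is my own name.
    public static class CacheInspector
    {
        public static string Dump()
        {
            var sb = new StringBuilder();
            foreach (DictionaryEntry entry in HttpRuntime.Cache)
            {
                var s = entry.Value as string;
                sb.AppendFormat("{0} -> {1}{2}\r\n",
                    entry.Key,
                    entry.Value == null ? "(null)" : entry.Value.GetType().FullName,
                    s != null ? " (length " + s.Length + ")" : string.Empty);
            }
            return sb.ToString();
        }
    }

At least that would show me which keys hold the large strings and hint at which component put them there.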
It sounds like the very common problem of data being left in memory as part of session state. If that's the case, your only options are: 1. don't put so much data in each user's session, 2. set the session timeout to something shorter (the default is 20 minutes, I think), and 3. periodically recycle the app pool.
As part of option 1, I found that there are "good ways" and "bad ways" of presenting data in a data grid control. You may want to check that you are copying only the data you need and not accidentally keeping references to the entire data grid or its data source.
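For example, a minimal sketch of what I mean by option 1 - copy only the fields you need into session rather than the grid's whole data source (all names here are made up for illustration):

    using System.Collections.Generic;
    using System.Data;
    using System.Web.UI;

    // Hypothetical example: store small summary objects in session instead of
    // the full DataTable the grid was bound to.
    public class OrderSummary
    {
        public int Id { get; set; }
        public decimal Total { get; set; }
    }

    public partial class OrdersPage : Page
    {
        private void CacheSummaries(DataTable ordersTable)
        {
            var summaries = new List<OrderSummary>(ordersTable.Rows.Count);
            foreach (DataRow row in ordersTable.Rows)
            {
                summaries.Add(new OrderSummary
                {
                    Id = (int)row["OrderId"],
                    Total = (decimal)row["Total"]
                });
            }

            Session["OrderSummaries"] = summaries;   // small objects only
            // not: Session["Orders"] = ordersTable; // keeps the whole table alive
        }
    }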
Some facts:
We have developed a WCF service that acts as a layer between the clients and the database.
It is self-hosted and runs as a Windows service.
The service keeps several caches, the largest of which are about 1-2 GB in memory. Total memory usage is usually about 5-8 GB.
Connections are duplex over TCP, and serialization is done with protobuf-net. Our connected client count usually ranges from 1000 to 1500.
The server is a fairly new 8-core Xeon with 64 GB of memory and runs nothing other than the service.
The problem: after some amount of time (it has been anywhere from a day to a week) the service gets extremely slow. Requests that normally take 0.5 seconds can take over a minute. This behaviour goes on for 15-40 minutes or until the service is restarted.
What we have done:
We have checked the network and the network connection to the server and there is no problem. CPU utilization goes up somewhat during this time, e.g. from an average of 30% to 40-50%.
We have taken memory dumps; there are no locks in our code blocking the users, and not much activity at all.
Our latest lead is the garbage collector. In perfmon we can see that "% time in GC" is constantly over 90% (90-97%) and the collection counts rise, for both Gen 0 and Gen 1. We suspect a blocking Gen 2 collection is running as well, but we had to restart the service (this is production), so that counter did not tick up during the 5-minute window we ran perfmon. Memory usage was 7.6 GB.
Note: the calls outstanding count rises, so the calls get there but the service does not handle them.
My questions are: can the garbage collector get into a state where it runs and blocks constantly for over 15 minutes? Or is the problem probably related to some other issue?
Our service ran the GC in workstation mode with latency mode Interactive.
We have now changed this to server GC and SustainedLowLatency and hope this will help somewhat. Is there anything else we can do if it is the garbage collector?
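For reference, the change we made looks roughly like this (class and method names below are just for illustration):

    using System.Runtime;

    internal static class GcConfiguration
    {
        // Server GC itself is enabled in the self-hosted service's app.config:
        //   <configuration>
        //     <runtime>
        //       <gcServer enabled="true" />
        //     </runtime>
        //   </configuration>
        //
        // The latency mode is then set once at service startup:
        public static void Apply()
        {
            GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
        }
    }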
Edit: the large memory usage is by design; the data in the caches really is that large, and there is plenty more memory available.
Excessive garbage collection is often caused by code issues. You either create too many objects in a short time, or you keep allocating memory without releasing it.
There is actually an extensive checklist available on MSDN that should help you diagnose the problem.
A very large Gen 2 heap means that the objects in it survived multiple garbage collections, which means they are kept in memory for a long period of time. That could be the root cause of your issue. Maybe there is a caching mechanism that could use some tuning or a retention policy (remove data that hasn't been used for a long time).
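For example, if the cache is (or can be put) behind something like MemoryCache, a sliding expiration gives you that kind of retention policy. A sketch, with made-up key and loader names:

    using System;
    using System.Runtime.Caching;

    internal static class CustomerCache
    {
        private static readonly MemoryCache Cache = MemoryCache.Default;

        public static object GetCustomer(int id)
        {
            string key = "customer:" + id;
            object customer = Cache.Get(key);
            if (customer == null)
            {
                customer = LoadCustomerFromDatabase(id); // hypothetical loader
                Cache.Set(key, customer, new CacheItemPolicy
                {
                    // Entries not touched for 30 minutes fall out of the cache
                    // instead of surviving in Gen 2 forever.
                    SlidingExpiration = TimeSpan.FromMinutes(30)
                });
            }
            return customer;
        }

        private static object LoadCustomerFromDatabase(int id)
        {
            return new { Id = id }; // placeholder
        }
    }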
I have a similar situation: a large database data cache in a service using protobuf with WCF for client communication. The cache is not purely for clients; the business layer uses the cache to perform operations. The memory footprint of the service can be anywhere between 2 and 10 GB. I release a segment of the cache after 8 hours of inactivity. The machine has 8 virtual cores and 32 GB of memory. I am using .NET 4.5.1.
The GC would consume 98% of the CPU for an hour as soon as I loaded the cache from the database. The interesting point is that in both our cases there is no memory pressure whatsoever.
I think the GC runs regardless because it tries to keep memory available for all threads; since one thread allocated a large amount of memory while loading the cache, the GC kicked in. I had to do several things to fix it:
1) Removed Tuples from the cache. I was using them as dictionary keys, and their implementation of structural equality is horrible: it compares all properties as objects, so there is a lot of boxing going on for value-type properties, and those boxed copies have to be garbage collected at some point.
2) When replacing the Tuples used as keys, I could not simply swap in structures without implementing Equals, because the default value comparison uses reflection and is too expensive. I ended up creating a generic Pair structure. I chose structures to reduce the number of objects when they are stored in arrays.
3) The Pair structure compares the properties using the default Equals of the property types - essentially the same thing that PowerCollections provides.
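A rough sketch of the kind of Pair structure I mean (the name and layout are illustrative, not my exact code):

    using System;

    // Generic pair key with strongly typed equality: no reflection and no
    // boxing for value-type members, unlike Tuple's structural comparison.
    public struct Pair<TFirst, TSecond> : IEquatable<Pair<TFirst, TSecond>>
        where TFirst : IEquatable<TFirst>
        where TSecond : IEquatable<TSecond>
    {
        private readonly TFirst _first;
        private readonly TSecond _second;

        public Pair(TFirst first, TSecond second)
        {
            _first = first;
            _second = second;
        }

        public bool Equals(Pair<TFirst, TSecond> other)
        {
            return _first.Equals(other._first) && _second.Equals(other._second);
        }

        public override bool Equals(object obj)
        {
            return obj is Pair<TFirst, TSecond> && Equals((Pair<TFirst, TSecond>)obj);
        }

        public override int GetHashCode()
        {
            unchecked
            {
                return (_first.GetHashCode() * 397) ^ _second.GetHashCode();
            }
        }
    }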
I have a question regarding the high memory usage of a Web Role running an MVC application, with Simple Injector for DI and Entity Framework 6 for the DAL. The application runs on an Azure Cloud Service as a Web Role with 2 x Standard A2 instances (2 cores, 3.5 GB RAM) and also runs a caching service (co-located role) configured to use 20% of memory.
The problem is that when an instance is started or rebooted, the memory usage of the w3wp.exe process is only around 500-600 MB (total memory usage with all other apps is around 50%), but even if no requests are coming in it keeps growing until it reaches around 1.7 GB and stops (total memory usage with all other apps is then around 90%). I also noticed that the memory sometimes drops randomly, and of course after a reboot or republishing.
After monitoring the memory heaps I noticed that it is the Gen 2 heap that grows and stays large, and after profiling locally with ANTS Memory Profiler I saw that the largest share of Gen 2 is taken by Entity Framework objects of the classes "TypeUsage" and "MetadataProperty" (in the "System.Data.Entity.Core.Metadata.Edm" namespace).
Now my questions are:
Is this a memory leak in our code, and if so how can I solve it? (I checked and have already tried disposing the DbContext that is created for every request; the pattern is sketched below.)
Is this a memory leak in EF? If so, what can I do about it - maybe use another DAL framework?
Is this normal behavior that I should leave as it is?
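The per-request disposal mentioned in the first question is essentially this pattern (controller and context names are placeholders, not our real code):

    using System.Data.Entity;
    using System.Web.Mvc;

    // Placeholder context; stands in for whatever DbContext the app actually uses.
    public class MyDbContext : DbContext
    {
    }

    public class ProductsController : Controller
    {
        private readonly MyDbContext _db = new MyDbContext();

        protected override void Dispose(bool disposing)
        {
            if (disposing)
            {
                _db.Dispose(); // dispose the per-request context with the controller
            }
            base.Dispose(disposing);
        }
    }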
There is a very low chance that this is a memory leak in EF, and it is not OK to leave it like this: your code leaks memory.
The best way to find the leak is to use a memory profiler (ANTS is a good option, I used dotMemory). The profiler will show you the leaked objects and it should also show you two other important things:
The stack trace of the location in code where the object was created
The object tree that keeps a reference to your leaked object and doesn't allow it to be collected.
These should help you understand how the objects were created and why they weren't GC'ed.
You mentioned that most of the memory is in Gen 2. That means your leaked objects are referenced by something "long lived". This could be a static variable, the ASP.NET application state, or something similar.
The random drops in memory may occur when IIS recycles your application. By default that happens every 29 hours, but IIS may be configured differently or may decide to recycle your application for some other reason.
"But what I noticed is that memory drops sometimes randomly..."
It's probably not a memory leak but an issue of uncontrolled heap growth before garbage collection. I faced something similar some years ago.
The problem is that, by default, the garbage collector lets the process memory grow until it exceeds some fraction of the total memory available to the OS. When your process runs in a cloud environment, which is a kind of shared hosting, it's possible that it never reaches that bound from the OS point of view, so the memory is not collected, even though it exceeds the memory limit for a shared process.
I'd recommend forcing the garbage collector to collect memory explicitly by calling GC.Collect(0) periodically, after a certain number of operations. Maybe that can solve the problem.
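A minimal sketch of what I mean (the interval is arbitrary; you could also trigger it after every N operations instead of on a timer):

    using System;
    using System.Threading;

    internal static class PeriodicGcTrigger
    {
        private static Timer _timer;

        public static void Start()
        {
            // Force a Gen 0 collection every 5 minutes (interval is illustrative).
            _timer = new Timer(_ => GC.Collect(0), null,
                               TimeSpan.FromMinutes(5), TimeSpan.FromMinutes(5));
        }
    }

Forcing collections is normally a last resort, so measure before and after to confirm it actually helps.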
I had a similar problem (a web app with EF and lots of TypeUsage objects taking up memory in the dump) and found that setting "Enable 32-Bit Applications" on the application pool reduced the memory use considerably.
I have an ASP.NET application whose memory usage grows over time to 2.5 GB and more. The website makes heavy use of System.Runtime.Caching.MemoryCache.
Is there any way, in code, to know how much of that memory is used by the MemoryCache, so I can tell whether the issue is too much caching or memory leaks elsewhere in the application?
First, look at PhysicalMemoryLimit. How big is it? Set it to a reasonable limit for your environment.
"The PhysicalMemoryLimit property returns the percentage of total physical computer memory that can be used by a single instance of the MemoryCache class. If the cache instance exceeds the specified limit, cache entries are removed."
Also look into CacheMemoryLimit
"Gets the amount of memory on the computer, in bytes, that can be used by the cache."
Also be aware of how many cache instances your application is using; hopefully it only uses Default.
Also look into the expirations of your cache entries and, more generally, the cache's DefaultCacheCapabilities.
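You can read those limits (and an approximate entry count) straight off the default instance - a small sketch:

    using System;
    using System.Runtime.Caching;

    internal static class CacheDiagnostics
    {
        public static void PrintLimits()
        {
            MemoryCache cache = MemoryCache.Default;

            Console.WriteLine("CacheMemoryLimit (bytes):  {0:N0}", cache.CacheMemoryLimit);
            Console.WriteLine("PhysicalMemoryLimit (%):   {0}", cache.PhysicalMemoryLimit);
            Console.WriteLine("PollingInterval:           {0}", cache.PollingInterval);
            Console.WriteLine("Approximate entry count:   {0}", cache.GetCount());
        }
    }

As far as I know there is no public property that reports the cache's current size in bytes, which is why your question is tricky; these at least show the thresholds the cache will trim against.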
I have a simple C# console app which imports text data into SQL.
It uses around 300K of memory and 80% of the CPU. There are 2 GB of RAM available at any time, and yet the page fault counter shows 500K.
The app is 32-bit, the OS is either Windows 2000 or XP 32-bit, and the framework is .NET 3.5.
Can anyone explain what the problem could be and how I can investigate this further?
EDIT: I am now certain that the page faults are related to disk I/O (reads). I commented out the SQL part and the pure disk reads generate that high number alone.
EDIT2: There are 200 hard faults/sec and 4000 soft faults/sec on average.
I wonder if the same would appear on Windows Server 2008.
First, how do you measure the memory the app is using? If you're looking at "working set", that's only the part that resides in physical memory. You should also take a look at the "VM Size" (or "Commit Size"), which shows the virtual memory your process actually takes up.
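A quick way to compare those numbers for your process from code is the Process class (or watch the corresponding counters in Task Manager / Process Explorer):

    using System;
    using System.Diagnostics;

    internal static class MemoryReport
    {
        public static void Print()
        {
            Process p = Process.GetCurrentProcess();

            Console.WriteLine("Working set (physical):  {0:N0} bytes", p.WorkingSet64);
            Console.WriteLine("Private bytes (commit):  {0:N0} bytes", p.PrivateMemorySize64);
            Console.WriteLine("Virtual size:            {0:N0} bytes", p.VirtualMemorySize64);
        }
    }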
If the Windows kernel's balance set manager thinks that your app is inactive, or should be left behind to give other processes more resources, it can decide to reduce the working set size. If the working set is smaller than what your application actually needs to work on, you can easily see a lot of page faults, because it simply becomes a race between the balance set manager and the application. Usually the balance set manager monitors memory usage and can also decide to increase the working set size accordingly. However, this might be prevented in certain circumstances, such as low free physical memory, high I/O (cache pressure on physical memory), low process priority, the background/foreground status of the application, etc.
It can simply be the behavior of the .NET garbage collector: a vast number of small memory blocks being allocated and released in a very short time puts stress on both allocation and freeing. The "VM Size" could stay around the same, but behind the scenes memory could continuously be allocated and freed, causing continuous page faults.
Also remember that the DLLs the process is using are accounted for in the process statistics. It might not be your code but one of the COM or .NET DLLs you are using that is causing this behavior. You can deduce the actual culprit by changing your application's behavior (e.g. removing the DB access code and leaving only the object allocation code) to see which component is actually causing the thrashing.
EDIT: About your question on the GC's impact on memory thrashing: the CLR actually grows the heap dynamically and gives memory back to the OS as needed, but that does not happen synchronously. The GC runs behind the scenes and frees memory in large chunks to avoid hindering application performance. Say you are allocating many small objects and freeing them almost immediately. That causes many references to stay in memory for a moment before being freed. It is easy to imagine how this becomes a head-to-head race between the garbage collector and the memory-allocating code. While the GC eventually catches up, the required memory must be satisfied from "new memory", not the old, because the old memory has not been freed yet. Since the memory we actually work on stays around the same size, the balance set manager may not think of giving our process more memory: we are on the edge, always around the same physical memory size, but constantly needing "newly allocated memory" rather than "more memory" - hence the page faults.
Page faults are normal. Memory gets swapped out and when you next access it that's a page fault and the system brings it back. This is by design.
I've got an app running on my machine right now with 500 million page faults. There's nothing to worry about!
Page faults mean memory issues.
Consider increasing memory if you have excessive page faults combined with a large working set size.
The working set is the set of memory pages currently loaded in RAM. This is measured by Process\Working Set. A high value might indicate that you have loaded a number of assemblies.
Process\Working Set has no specific threshold value to watch, although a high or fluctuating value can indicate a memory shortage. A high or fluctuating value accompanied by a high rate of page faults clearly indicates that your server does not have enough memory.
Further reading:
Check the Memory section under System Resources in the following MSDN article:
http://msdn.microsoft.com/en-us/library/ff647791.aspx#scalenetchapt15_topic9
Please provide some code to investigate.
A possible answer to this, which I am currently testing in my application: break your working set into smaller chunks and work on the chunks one at a time.
For instance, I have a large list of objects (9,000-30,000). If I break that list into chunks of 500 or so at a time, the OS should keep those 500 objects in memory while I work on them.
You will want to increase or decrease the chunk size until you can work with it fast enough that the OS keeps it in memory. This is theory - I haven't fully tested it yet - but it should work.
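Roughly what I mean, as a sketch (the chunk size of 500 matches the figure above; the item type and method names are placeholders):

    using System;
    using System.Collections.Generic;

    internal static class ChunkedWork
    {
        public static void ProcessInChunks(List<string> items, int chunkSize)
        {
            for (int offset = 0; offset < items.Count; offset += chunkSize)
            {
                int count = Math.Min(chunkSize, items.Count - offset);
                List<string> chunk = items.GetRange(offset, count);

                foreach (string item in chunk)
                {
                    Process(item); // only ~chunkSize objects are "hot" at a time
                }
            }
        }

        private static void Process(string item)
        {
            // Placeholder for whatever work you do on each object.
        }
    }

Call it as ProcessInChunks(myList, 500) and tune the chunk size from there.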
After reading a few enlightening articles about memory in .NET (such as "Out of Memory" does not refer to physical memory, 597499), I thought I understood why a C# app would throw an out-of-memory exception - until I started experimenting with two servers, both with 2.5 GB of RAM, running Windows Server 2003 and identical programs.
The only significant difference between the two is that one has 7% hard drive space left and the other more than 50%.
The server with 7% storage left consistently throws an out-of-memory exception while the other performs consistently well.
My app is a C# web application that processes hundreds of MB of String objects.
Why would this difference happen, seeing that the most likely reason for the out-of-memory issue is running out of contiguous virtual address space?
All I can think of is that you're exhausting the virtual memory. Sounds like you need to run a memory profiler on the app.
I've used the Red Gate profiler in similar situations in the past. You may be surprised how much memory your strings are actually using.
Is the paging file fragmentation different on each machine? High fragmentation could slow down paging operations and thus exacerbate memory issues. If the paging file is massively fragmented, sort it out: bring the server off-line, set the paging file size to zero, defrag the drive, and re-create the paging file.
It's hard to give any specific advice on how to deal with perf problems with your string handling without more detail of what you are doing.
"Why would this difference happen seeing that the most likely reason for the out of memory issue is out of contiguous virtual address space?"
With 7% free hard disk space, your server is probably running out of room to page out memory from your process or other processes, so it has to keep everything in RAM. As a result, you fail to allocate additional memory more often than on the server with 50% free space.
What solutions do you guys propose?
Since you've already run a profiler and seen at least 600 MB of usage from all the string data, you need to start tackling this problem.
The obvious answer would be to not hold all that data in memory. If you are processing a large data set then load a bit, process it and then throw that bit away and load the next bit instead of loading it all up front.
If it's data you need to serve, look at a caching strategy like LRU (least recently used) and keep only the hottest data in memory but leave the rest on disk.
You could even offload the strings into a database (in-memory or disk-based) and let that handle the cache management for you.
A slightly left-of-field solution I've had to use in the past was simply compressing the string data in memory as it arrived and decompressing it again when needed, using SharpZipLib. Surprisingly, it wasn't that slow.
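I used SharpZipLib at the time, but the same idea works with the framework's built-in GZipStream - a sketch:

    using System.IO;
    using System.IO.Compression;
    using System.Text;

    internal static class StringCompressor
    {
        // Keep only the compressed bytes in memory; inflate on demand.
        public static byte[] Compress(string text)
        {
            byte[] raw = Encoding.UTF8.GetBytes(text);
            using (var output = new MemoryStream())
            {
                using (var gzip = new GZipStream(output, CompressionMode.Compress))
                {
                    gzip.Write(raw, 0, raw.Length);
                }
                return output.ToArray(); // ToArray works even after the stream is closed
            }
        }

        public static string Decompress(byte[] compressed)
        {
            using (var input = new MemoryStream(compressed))
            using (var gzip = new GZipStream(input, CompressionMode.Decompress))
            using (var reader = new StreamReader(gzip, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }

It obviously only pays off if the strings compress well (text usually does) and are read far less often than they are stored.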
I would agree that your best bet is to use a memory profiler. I've used .Net Memory Profiler 3.5 and was able to diagnose the issue, which in my case were undisposed Regex statements. They have demo tutorials which will walk you through the process if you're not familiar.
As to your question: any single reference to the strings - the jagged array, for instance - would still prevent them from being collected. Without knowing more about your architecture, it is tough to make a specific recommendation. I would suggest trying to optimize your app before adding more memory, though; it will come back to bite you later.
An OutOfMemoryException is more likely to indicate fragmentation in your page file - not that you are out of RAM or disk space.
It is generally (wrongly) assumed that the page file is used as a swap disk - that RAM overflow is written to the page file. All allocated memory is stored in the page file and only data that is under heavy usage is copied to RAM.
There's no simple code fix to this problem other than trying to reduce the memory footprint of your application. But if you really get desperate you can always try PageDefrag, which is a free application originally developed by SysInternals.
There are a few tricks to increase the memory available to your program (I don't know if they work with a web app, but it looks like they do):
"Out of memory? Easy ways to increase the memory available to your program"
http://blogs.msdn.com/b/calvin_hsia/archive/2010/09/27/10068359.aspx