Causes for web service memory leak - c#

We have a web service that uses up more and more private bytes until the application stops responding. The managed heap (mostly Gen2) shows some 200-250 MB, while private bytes show over 1 GB. What are possible causes of a memory leak outside of the managed heap?
I've already checked for the following:
Prolific dynamic assemblies (Xml serialization, regex, etc.)
Session state (turned off)
System.Policy.Evidence memory leak (SP1 installed)
Threading deadlock (no use of Join, only lock)
Use of SQLOLEDB (using SqlClient)
What other sources can I check for?

Make sure your app is compiled in release mode. If you compile under debug mode and deploy that, simply instantiating a class that has an event defined (the event doesn't even need to be raised) will cause a small piece of memory to leak. Instantiating enough of these objects over a long enough period of time will cause all the memory to be used. I've seen web apps that would use up all the memory within a matter of hours, simply because a debug build was used. Compiling as a release build immediately and permanently fixed the problem.

I would recommend you view snapshots of the heap at various times and see what's using up the memory. If your application is using Java, then jmap works extremely well - you just give it the PID of the Java process.
If using something else, try Lambda Probe (http://www.lambdaprobe.org/d/index.htm). It doesn't show as much detail, but will at least show you memory use.
I had a bad memory leak in my JDBC code that ended up being traced to a change in the JDBC specification a few years ago that I missed (with respect to closing statements and such). It took a combination of Lambda Probe and then jmap to localize the problem enough to fix it.
Cheers,
-R

Also look for:
COM Assemblies being loaded
DB Connections not being closed
Cache & State (Session, Application)
Try forcing the Garbage Collector (GC) to run (write a page that does it when it loads, as in the sketch below), or try the instrumentation, but that's a bit hit and miss in my experience. Another thing would be to keep it running and see if it actually runs out of memory.
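A minimal sketch of such a page, written here as a generic ASP.NET handler (the handler name and output are made up for illustration, and this is a diagnostic aid rather than something to leave in production):

using System;
using System.Web;

// Illustrative only: e.g. map this to ForceGc.ashx and request it to force a full collection.
public class ForceGcHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        GC.Collect();                  // force a full collection
        GC.WaitForPendingFinalizers(); // let finalizers run
        GC.Collect();                  // collect anything the finalizers released

        context.Response.ContentType = "text/plain";
        context.Response.Write("GC forced. Total managed bytes: " + GC.GetTotalMemory(false));
    }

    public bool IsReusable { get { return true; } }
}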
What could be happening is that there is plenty of memory and Windows does not signal your app to clean up. This makes the app look like it's using more and more memory because it can, when in fact the system can reclaim the memory when it needs it. SQL Server and Exchange do this a lot. The idea is: why cause an unnecessary cleanup when there are plenty of resources?
Rob

Garbage collection does not necessarily run until a request for memory cannot be satisfied otherwise. This can often make things look like a memory leak when none exists.
Do you have any events and event handlers within the service? Services often have static variables, and if you create event handlers that hook a static instance's event up to a non-static instance object, the static will hold a reference to the instance forever, which prevents it from ever being collected (see the sketch below).
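A minimal sketch of that pattern (all names hypothetical): the static event's invocation list roots every subscribing instance until the handler is removed.

using System;

public static class Publisher
{
    // Static event: its invocation list is a GC root for every subscriber.
    public static event EventHandler SomethingHappened;
}

public class RequestProcessor
{
    private readonly byte[] _buffer = new byte[1024 * 1024]; // makes the leak easy to see

    public RequestProcessor()
    {
        // The static event now holds a reference to this instance...
        Publisher.SomethingHappened += OnSomethingHappened;
    }

    private void OnSomethingHappened(object sender, EventArgs e) { /* ... */ }

    public void Close()
    {
        // ...so it must be unsubscribed explicitly, or the instance is never collected.
        Publisher.SomethingHappened -= OnSomethingHappened;
    }
}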

Double-check that tracing is not enabled. I've seen instances of ASP.NET trace slowly consuming memory until the app reaches its app pool limit.
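For reference, tracing is controlled by the trace element in web.config; a minimal illustration (standard system.web settings) with it switched off:

<configuration>
  <system.web>
    <!-- enabled="true" keeps per-request trace data in memory until requestLimit is reached -->
    <trace enabled="false" requestLimit="10" pageOutput="false" localOnly="true" />
  </system.web>
</configuration>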

For what it is worth, my issue was not with the service, but with the HttpClient that was calling it.
The client was not properly disposed, so it kept the connection open and the memory locked.
After disposing the client, the service released the memory as expected.
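As a sketch of what that fix looks like on the calling side (the actual client code isn't shown in the post; the method and parameter below are made up):

using System;
using System.Net.Http;
using System.Threading.Tasks;

class ServiceCaller
{
    public static async Task<string> CallServiceAsync(Uri serviceUri)
    {
        using (var client = new HttpClient())   // disposed when the block exits
        {
            return await client.GetStringAsync(serviceUri);
        }
    }
}

Note that for callers issuing many requests, reusing one long-lived HttpClient is generally preferable to creating and disposing one per call; the point here is simply that an HttpClient that is neither disposed nor reused keeps its connections and buffers alive.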

Related

Memory leak because of pinned GC handles / no gc root visible

What is the reason for pinned GC handles when working with unmanaged .net components? This happens from time to time without any code changes or anything else. When investigating the issue, I see a lot of pinned GC handles.
These handles seem to stay in memory for the entire application lifetime. In this case, the library is GdPicture (14). Is there any way to investigate why those instances are not cleaned up? I'm using Dispose()/using everywhere and can't find any GC roots in the managed code.
Thanks a lot!
EDIT
Another strange behaviour is that Task Manager shows the application using about 6 GB of RAM, while the memory profiler shows a usage of 400 MB (the red line is live bytes).
What is the reason for pinned GC handles when working with unmanaged .net components?
Pinning is needed when working with unmanaged code. It prevents objects from being moved during garbage collection so that the unmanaged code can have a pointer to it. The garbage collector will update all .NET references, but it will not update unmanaged pointer values.
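For illustration (not GdPicture-specific code), this is what an explicit pin looks like; a pinned handle whose Free() is never reached behaves exactly like the handles described in the question:

using System;
using System.Runtime.InteropServices;

class PinExample
{
    static void Main()
    {
        byte[] buffer = new byte[4096];

        // Pin the array so unmanaged code can safely hold a raw pointer to it.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();
            // ... pass 'address' to unmanaged code here ...
        }
        finally
        {
            // If Free() is never called, the buffer stays pinned (and rooted) forever.
            handle.Free();
        }
    }
}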
Is there any way to investigate why those instances are not cleaned up?
No. The reason is always a bug in the code: either in your code (assume that first) or in a 3rd-party library (libraries are widely used, so chances are that a leak in the library would already have been found by someone else).
I'm using Dispose()/using everywhere
Seems like you missed one or it's not using the disposable pattern.
Another strange behaviour is that Task Manager shows the application using about 6 GB of RAM, while the memory profiler shows a usage of 400 MB (the red line is live bytes)
A .NET memory profiler may only show the .NET part of memory (400 MB) and omit the rest (5600 MB).
Task manager is not interested in .NET. It cares about physical RAM mostly, which is why Task Manager is not a good analytics tool in general. You don't want to analyze physical RAM, you want to analyze virtual memory.
To look for memory leaks, use Process Explorer and show the "Private Bytes" and "Virtual size" column. Process Explorer can also show you a graph over time per process.
How to proceed?
Forget about the unmanaged leak for a moment. Use a .NET profiler that can take memory snapshots and lets you see each individual object inside as well as statistics.
Try to figure out the steps that it takes to create more leaks in a consistent way. Then
Take a snapshot
Repeat the leak procedure 10 times
Take a snapshot
Repeat the leak procedure another 10 times
Take a snapshot
Compare the snapshots from steps 1 and 3. Check for managed types whose counts differ by a multiple of 10. Compare the snapshots from steps 3 and 5 and check the same types again; the difference must again be a multiple of 10. You can't leak 7 objects when you run a method 10 times.
Do a code review of the places where the affected types are used, based on your knowledge of the leak procedure (which methods are called) and the managed type. Make sure each instance is disposed or released properly.

Async Socket High Memory Usage and Possible Leak

I have a server application rewrite underway and am puzzled by the memory usage of the application. The earlier version was written with TcpListener while the new one is plain old Socket. This is mostly for performance and stability reasons which are secondary to this question and even this issue.
As mentioned, everything is heavily async'd with AcceptAsync, SendAsync, and ReceiveAsync. On top of that, I use ThreadPool.QueueUserWorkItem for utility tasks such as the initial kick-off for AcceptAsync and keeping the next AcceptAsync queued, the call after processing to write back to the Socket, and the call cleaning up disconnected clients. Further, there are a series of events that I fire with BeginInvoke and EndInvoke.
The detection of those disconnects, as well as the main driver for data availability, is handled by a custom class that I call AvailabilityNotifier. It peeks on a ReceiveAsync and detects when SocketAsyncEventArgs.BytesTransferred is zero, which fires a Disconnect event.
The throughput of the application is good, and there's almost zero (relatively speaking) lock contention thanks to a healthy usage of System.Collections.Concurrent objects. However, it clings to memory like a predator clinging to a kill.
I've debugged to verify my internal collections are getting cleared, the client sockets are being shut down and disposed of, and I'm utilizing a buffer pool instead of creating new buffers for each read. Running a test application that ultimately performs 1,000 connections (100 concurrent) and sends/receives 100,000 messages bloats the server process memory to around 800 MB, and it never goes down, even after Windows clears any TIME_WAITs that might have happened. I know for sure the disposal code is firing thanks to a ton of ObjectDisposedException and null-exception catch blocks that you can see in the linked GitHub below.
I say all that without quoted code as it's quite long for a post here, so here's a GitHub repo: https://github.com/hoagsie/TcpServer. Program.cs and ClientProgram.cs are provided as well if you want to run it yourself, but the main action is in NetworkServer.cs and AvailabilityNotifier.cs. The sample server I have running also talks to a WCF service, but that is just the standard WCF project with literally no modifications. I just needed it to match a sample scenario.
I'm also not sure if it matters on some level, but I do build this in x64 mode rather than AnyCPU/x86. This is mostly to take advantage of the resources on the target server it will run on, but I haven't noticed a difference in behavior with regard to this issue in either x86 or x64.
EDIT:
A coworker pointed out the Snapshot tool in Visual Studio. I had never seen this before and it displayed things differently from what I had been using, which was dotTrace. It pointed to a ton of allocations around the SocketAsyncEventArgs object which makes sense, but they kept building and building. I looked at its member list again and discovered it had a Dispose method. My issue has gone away. I didn't realize that was an IDisposable object.
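For anyone else hitting this, a simplified sketch of the idea (not the actual code from the repository): treat SocketAsyncEventArgs as the disposable, buffer-holding object it is, and either dispose each instance when you are done with it or pool and reuse a fixed set of them.

using System.Collections.Concurrent;
using System.Net.Sockets;

// Simplified pool: SocketAsyncEventArgs is IDisposable and holds buffers/unmanaged state,
// so instances should be reused or disposed rather than allocated per operation.
class SocketArgsPool
{
    private readonly ConcurrentBag<SocketAsyncEventArgs> _pool = new ConcurrentBag<SocketAsyncEventArgs>();

    public SocketAsyncEventArgs Rent()
    {
        SocketAsyncEventArgs args;
        return _pool.TryTake(out args) ? args : new SocketAsyncEventArgs();
    }

    public void Return(SocketAsyncEventArgs args)
    {
        args.AcceptSocket = null;
        args.UserToken = null;
        _pool.Add(args);
    }

    public void Drain()
    {
        SocketAsyncEventArgs args;
        while (_pool.TryTake(out args))
            args.Dispose(); // release the resources held by each instance
    }
}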

Limiting the allowed RAM for a service, possibly using MaxWorkingSet

I have a service that runs on a domain controller and is randomly accessed by other computers on the network. I can't shut down the service and run it only when needed (this would defeat the purpose of running it as a service anyway).
The problem is that the memory used by the service doesn't seem to ever get cleared, and increases every time the service is queried by a remote computer.
Is there a way to set a limit on the RAM used by the application?
I've found a few references to using MaxWorkingSet, but none of the references actually tell me how to use it. Can I use MaxWorkingSet to limit the RAM used to, for example, 35 MB? And if so, how (what is the syntax, etc.)?
Otherwise, is there a function like "clearall()" that I could use to reset the variables and memory at the end of each run through? I've tried using GC.Collect(), but it didn't work.
Literally, MaxWorkingSet only affects the working set, which is the amount of physical memory. To restrict overall memory usage, you need the Job Object API. But it is dangerous if your program really needs that much memory (a lot of code doesn't account for OutOfMemoryException, and the .NET runtime sometimes behaves strangely when memory is not enough).
You need to:
Create a Win32 Job object
Set the maximum memory to the job
Assign your process to the job
There are existing .NET wrappers for the Job Object API.
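For the literal MaxWorkingSet question, Process.GetCurrentProcess().MaxWorkingSet = (IntPtr)(35 * 1024 * 1024); should be the syntax, but as noted it only caps the working set. For the Job Object route, a rough sketch of the Win32 calls involved follows (illustrative only; error handling omitted, and struct packing may need attention on 32-bit):

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class JobMemoryLimit
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
    static extern IntPtr CreateJobObject(IntPtr lpJobAttributes, string lpName);

    [DllImport("kernel32.dll")]
    static extern bool SetInformationJobObject(IntPtr hJob, int infoClass,
        ref JOBOBJECT_EXTENDED_LIMIT_INFORMATION info, uint cbLength);

    [DllImport("kernel32.dll")]
    static extern bool AssignProcessToJobObject(IntPtr hJob, IntPtr hProcess);

    const int JobObjectExtendedLimitInformation = 9;
    const uint JOB_OBJECT_LIMIT_PROCESS_MEMORY = 0x00000100;

    [StructLayout(LayoutKind.Sequential)]
    struct IO_COUNTERS
    {
        public ulong ReadOperationCount, WriteOperationCount, OtherOperationCount;
        public ulong ReadTransferCount, WriteTransferCount, OtherTransferCount;
    }

    [StructLayout(LayoutKind.Sequential)]
    struct JOBOBJECT_BASIC_LIMIT_INFORMATION
    {
        public long PerProcessUserTimeLimit;
        public long PerJobUserTimeLimit;
        public uint LimitFlags;
        public UIntPtr MinimumWorkingSetSize;
        public UIntPtr MaximumWorkingSetSize;
        public uint ActiveProcessLimit;
        public UIntPtr Affinity;
        public uint PriorityClass;
        public uint SchedulingClass;
    }

    [StructLayout(LayoutKind.Sequential)]
    struct JOBOBJECT_EXTENDED_LIMIT_INFORMATION
    {
        public JOBOBJECT_BASIC_LIMIT_INFORMATION BasicLimitInformation;
        public IO_COUNTERS IoInfo;
        public UIntPtr ProcessMemoryLimit;
        public UIntPtr JobMemoryLimit;
        public UIntPtr PeakProcessMemoryUsed;
        public UIntPtr PeakJobMemoryUsed;
    }

    // Caps committed memory for the current process, e.g. LimitCurrentProcess(35UL * 1024 * 1024).
    public static void LimitCurrentProcess(ulong bytes)
    {
        IntPtr job = CreateJobObject(IntPtr.Zero, null);

        var info = new JOBOBJECT_EXTENDED_LIMIT_INFORMATION();
        info.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_PROCESS_MEMORY;
        info.ProcessMemoryLimit = (UIntPtr)bytes;

        SetInformationJobObject(job, JobObjectExtendedLimitInformation, ref info,
            (uint)Marshal.SizeOf(typeof(JOBOBJECT_EXTENDED_LIMIT_INFORMATION)));

        AssignProcessToJobObject(job, Process.GetCurrentProcess().Handle);
    }
}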
Besides, you could try this method of GC: (for .NET 4.6 or newer)
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(2, GCCollectionMode.Forced, true, true);
(for older versions, though it sometimes doesn't work)
GC.Collect(2, GCCollectionMode.Forced);
The third parameter in the .NET 4.6 overload of GC.Collect() tells the runtime whether to perform the collection immediately (blocking). In older versions, GC.Collect() only requests a collection and leaves the decision to the runtime.
As for some programming advice, I suggest wrapping each query in a class that is explicitly disposed after the query is done. It may help the GC behave more intelligently.
Finally, there are indeed some things in the .NET Framework that you need to manage yourself. The GDI handle returned by Bitmap.GetHbitmap(), for example, must be released manually.
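A sketch of that particular case: the handle returned by GetHbitmap() is invisible to the garbage collector and must be freed with the Win32 DeleteObject call.

using System;
using System.Drawing;
using System.Runtime.InteropServices;

static class GdiInterop
{
    [DllImport("gdi32.dll")]
    static extern bool DeleteObject(IntPtr hObject);

    public static void UseHBitmap(Bitmap bitmap)
    {
        IntPtr hBitmap = bitmap.GetHbitmap(); // unmanaged GDI handle, not tracked by the GC
        try
        {
            // ... hand the HBITMAP to native code or WPF interop here ...
        }
        finally
        {
            DeleteObject(hBitmap); // leaks unmanaged memory if this is skipped
        }
    }
}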

Why does unmanaged code increment memory to a specific limit?

I work with a flow control system written in .NET. The system interacts with external systems through TCP connections and routes transactions between different endpoints.
My problem:
At startup/initialization the private working set is about 25,000 KB. After initialization, when the system is in an idle state, the private working set steps up by about 50-100 KB per second until it reaches a limit of about 57,000 KB.
Information:
The system generates page faults while the memory is increasing.
When the limit is reached, the private working set stays very stable and only oscillates up and down by a few MB when I connect 300+ clients and exchange high-frequency transactions for a couple of hours; the garbage collection logic works very well.
I have profiled this system with a tool from Redgate called "Memory Profiler", which tells me the memory stepping up after initialization is allocated by unmanaged code. Unfortunately, this profiler gives no insight into memory allocated by unmanaged code, so I have difficulty finding out what this allocated memory contains, why it is allocated, and which code allocates it.
The whole codebase is developed in C#; there are no references to COM+ assemblies and there is no communication with native Windows APIs (during this memory increase).
My question:
I need to be pointed in the right direction to find out why the memory is continuously incrementing in small chunks to a specific level after initialization.
If a page is not in the working set, this does not mean the page is stored only on disk, or on disk at all. Pages on Windows can go to the standby list. If they do, they leave the working set and require a soft fault to bring them back. (I never understood why this mechanism is there, but it is.) A soft fault is cheap.
Using Process Explorer's System Information window you can see the number of hard and soft faults per second. It's probably also available through perfmon. I suggest you check whether you have hard faults (which I believe you don't, in which case you don't have a problem and can close the investigation).
Also, WS has nothing to do with memory usage, but I think you already knew that.

c# and garbage collection

There must be a secret in .NET garbage collection that I don't understand yet.
Our C# WinForms application has a dialog that loads a great many objects via an OR mapper tool. This process consumes a lot of memory, and we think that most of this memory could be consumed by strings. When we open this dialog, the task manager shows 900 MB of memory usage, and by doing the query again, we get an out-of-memory exception. Wow.
Now we got the tip that some kind of garbage collection is done when we minimize the application. By doing this and maximizing it again, the application only consumes 10 MB. Cool.
But now, when we do the query again, the memory consumption suddenly jumps back to 900MB and we get the out of memory exception again.
What happens here and how can we reduce our memory consumption? In such cases, how can the memory consumption be researched and reduced?
There are a bunch of .NET memory profilers available to help diagnose issues like this. My favorite is dotTrace by JetBrains.
Are you sure all your loops are completing? I know the biggest issue I have had with resources in my desktop applications has been loops that get stuck or do not end successfully. Are you sure your query is coming back correctly?
Are you doing more than just querying? I would make sure all your processes are starting and finishing correctly.
