Async Socket High Memory Usage and Possible Leak - c#

I have a server application rewrite underway and am puzzled by the memory usage of the application. The earlier version was written with TcpListener while the new one is plain old Socket. This is mostly for performance and stability reasons which are secondary to this question and even this issue.
As mentioned, everything is heavily async'd with AcceptAsync, SendAsync, and ReceiveAsync. On top of that, I use ThreadPool.QueueUserWorkItem for utility tasks such as the initial kick-off for AcceptAsync and keeping the next AcceptAsync queued, the call after processing to write back to the Socket, and the call cleaning up disconnected clients. Further, there are a series of events that I fire with BeginInvoke and EndInvoke.
The detection of those disconnects, as well as the main driver for data availability, is handled by a custom class that I call AvailabilityNotifier. It peeks on a ReceiveAsync and also watches for SocketAsyncEventArgs.BytesTransferred being zero, which fires a Disconnect event.
The throughput of the application is good, and there's almost zero (relatively speaking) lock contention thanks to a healthy usage of System.Collections.Concurrent objects. However, it clings to memory like a predator clinging to a kill.
I've debugged to verify that my internal collections are getting cleared, that the client sockets are being shut down and disposed of, and that I'm using a buffer pool instead of creating new buffers for each read. Running a test application that ultimately performs 1,000 connections (100 concurrent) and sends/receives 100,000 messages bloats the server process memory to around 800 MB, and it never goes down, even after Windows clears any TIME_WAITs that might have happened. I know for sure the disposal code is firing thanks to a ton of ObjectDisposedException and null exception catch blocks that you can see in the linked GitHub repository below.
I say all that without quoted code as it's quite long for a post here, so here's a GitHub repository: https://github.com/hoagsie/TcpServer. Program.cs and ClientProgram.cs are provided as well if you want to run it yourself, but the main action is in NetworkServer.cs and AvailabilityNotifier.cs. The sample server I have running also has a WCF service it talks to, but it is just the standard WCF project with literally no modifications. I just needed it to match a sample scenario.
I'm also not sure if it matters on some level, but I do build this in x64 mode rather than AnyCPU/x86. This is mostly for resource consumption opportunity on the target server it will be going on, but I haven't noticed a difference in behavior with regard to this issue in either x86 or x64.
EDIT:
A coworker pointed out the Snapshot tool in Visual Studio. I had never seen this before and it displayed things differently from what I had been using, which was dotTrace. It pointed to a ton of allocations around the SocketAsyncEventArgs object which makes sense, but they kept building and building. I looked at its member list again and discovered it had a Dispose method. My issue has gone away. I didn't realize that was an IDisposable object.
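As a rough illustration of the fix described above, here is a minimal sketch of reusing SocketAsyncEventArgs instances and disposing them on shutdown. The SocketAsyncEventArgsPool class and its member names are hypothetical and are not taken from the linked repository; only the SocketAsyncEventArgs members (SetBuffer, AcceptSocket, UserToken, Dispose) are real framework APIs.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Sockets;

// Hypothetical helper: keeps SocketAsyncEventArgs instances for reuse and
// disposes every pooled instance when the server shuts down.
public sealed class SocketAsyncEventArgsPool : IDisposable
{
    private readonly ConcurrentBag<SocketAsyncEventArgs> _items =
        new ConcurrentBag<SocketAsyncEventArgs>();
    private readonly int _bufferSize;

    public SocketAsyncEventArgsPool(int bufferSize)
    {
        _bufferSize = bufferSize;
    }

    public int Count => _items.Count;

    // Hands out a pooled instance, or creates one with its own buffer.
    public SocketAsyncEventArgs Rent()
    {
        if (_items.TryTake(out var args))
        {
            return args;
        }

        args = new SocketAsyncEventArgs();
        args.SetBuffer(new byte[_bufferSize], 0, _bufferSize);
        return args;
    }

    // Clears per-connection state so the instance does not keep a client alive.
    public void Return(SocketAsyncEventArgs args)
    {
        args.AcceptSocket = null;
        args.UserToken = null;
        _items.Add(args);
    }

    // SocketAsyncEventArgs is IDisposable; release every pooled instance.
    public void Dispose()
    {
        while (_items.TryTake(out var args))
        {
            args.Dispose();
        }
    }
}
```

Instances that are not pooled would need their own Dispose call once the operation they were created for has completed.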


Related

Application Intermittently goes to Not responding

We have an application built in .NET 4.5 (C#, WinForms) which, in only one production environment, intermittently goes into the Not Responding state for 10 to 20 seconds.
I have added logging at the important lines. It hangs when loading heavy user controls and when data-fetching calls are made.
The system on which it hangs has considerably low memory (2 GB). I have almost reproduced the situation on a local machine by lowering the memory. My question is: what are my options for avoiding these hangs?
The application's memory usage does rise 200 to 300 MB.
The behavior is not consistent. Sometimes it takes 30 seconds to complete a task, and the next time it takes barely 3 seconds.
The Not Responding state usually occurs at startup.
My last attempt was to load the important assemblies at startup, but I had no luck.
Lastly, let me tell you that we have several third-party controls.
If you are concerned about memory allocation,
Use the .NET Memory Allocation option of the Performance Wizard found in the Analyze menu to create a Performance Session which will help you figure out what changes you need to make in your code to reduce memory usage.
We really need more info here. But a few suggestions:
Use the using keyword or call Dispose() on all disposable objects when you are done with them
Make sure you unregister event handlers when you are done listening for events
If you have a lot of timers in your application,
This is only a problem with the System.Threading.Timer class if you don't otherwise store a reference to it somewhere. It has several constructor overloads, the ones that take the state object are important. The CLR pays attention to that state object. As long as it is referenced somewhere, the CLR keeps the timer in its timer queue and the timer object won't get garbage collected. Most programmers will not use that state object, the MSDN article certainly doesn't explain its role.
System.Timers.Timer is a wrapper for the System.Threading.Timer class, making it easier to use. In particular, it will use that state object and keep a reference to it as long as the timer is enabled.
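The suggestions above about Dispose() and unregistering event handlers can be sketched as follows. Publisher and Subscriber are hypothetical names used only for illustration; the point is that a publisher's event keeps a reference to every subscriber until the handler is removed.

```csharp
using System;

// Hypothetical long-lived object that raises an event.
public sealed class Publisher
{
    public event EventHandler Changed;

    // Number of handlers currently attached to the event.
    public int SubscriberCount =>
        Changed == null ? 0 : Changed.GetInvocationList().Length;

    public void Raise()
    {
        Changed?.Invoke(this, EventArgs.Empty);
    }
}

// Hypothetical short-lived listener. While it is subscribed, the publisher's
// delegate list references it, so the GC cannot reclaim it.
public sealed class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public int Notifications { get; private set; }

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.Changed += OnChanged;
    }

    private void OnChanged(object sender, EventArgs e)
    {
        Notifications++;
    }

    // Unregister the handler so the publisher no longer references this object.
    public void Dispose()
    {
        _publisher.Changed -= OnChanged;
    }
}
```

A typical use would wrap the listener in a using block, for example `using (var s = new Subscriber(publisher)) { ... }`, so the handler is removed even if an exception is thrown.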

WeakReferences are not freed in embedded OS

I've got a strange behavior here:
I get a massive memory leak in production running a WPF application on a DLOG terminal (Windows Embedded Standard SP1), while the same application behaves perfectly fine if I run it locally on a normal desktop (Win7 Prof.).
After many unsuccessful attempts to find the problem, I put one of those terminals directly beside my monitor, installed the ANTS Memory Profiler, and did a one-hour test run simulating user operations on both the terminal and my development PC.
The result is that, for some strange reason, the embedded system piles up a huge number of WeakReference and EffectiveValueEntry[] objects.
Here are some pictures: [memory profiler screenshots of the development PC and of the terminal; images not available]
Just look at the class list...
Has anyone seen something like this before and are there known solutions to this?
Where can I get help?
(PS: the terminals were installed with images prepared for .NET 4.)
PPS: for the close-voter: I think the question is clear: how can I fix this?
You could argue whether this is an IT/OS problem or a programming problem, but I think if I post this on Server Fault it will get an off-topic close in no time...
UPDATE:
I was able to find a big portion of the problem - but it feels a bit like C++:
I use a ViewModel-like Items class for a WPF list that provides (among other things) an ICommand (RelayCommand pattern). The items were created on the fly in the getter of a ViewModel property for the view, and it seems that the application/GC never freed those unused commands, or the subscriptions to their CanExecuteChanged; the memory profiler shows those as "held by a weak reference". I changed my code to reuse those item view models, and each one now disposes or sets to null every property it used in its Dispose method, which I also use as general cleanup. As I said: it feels like "delete" in those old C++ days.
On top of this I force a GC.Collect every 30 minutes (yeah, I know you never should, but I have no other solution so far).
With this setup the application has run for 6+ hours without problems so far, but it doesn't feel right.
I cannot understand why those WeakReferences are not reclaimed as they are on my desktop machine...
Any thoughts on this? Please!
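The workaround described in this update (reusing item view models instead of creating new ones in a property getter, and clearing their members in Dispose) might look roughly like the sketch below. All names here are hypothetical, and a plain Action stands in for the ICommand so that the example has no WPF dependency; it is not the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item view model; the Action stands in for the RelayCommand.
public sealed class ItemViewModel : IDisposable
{
    public string Name { get; private set; }
    public Action Command { get; private set; }
    public bool IsDisposed { get; private set; }

    public ItemViewModel(string name)
    {
        Name = name;
        Command = () => { };
    }

    // Drop references so nothing reachable from this item keeps other objects alive.
    public void Dispose()
    {
        Command = null;
        Name = null;
        IsDisposed = true;
    }
}

// Hypothetical parent view model that caches items instead of creating
// a fresh instance on every property access.
public sealed class ListViewModel : IDisposable
{
    private readonly Dictionary<string, ItemViewModel> _cache =
        new Dictionary<string, ItemViewModel>();

    public ItemViewModel GetItem(string key)
    {
        if (!_cache.TryGetValue(key, out var item))
        {
            item = new ItemViewModel(key);
            _cache[key] = item;
        }

        return item;
    }

    public void Dispose()
    {
        foreach (var item in _cache.Values)
        {
            item.Dispose();
        }

        _cache.Clear();
    }
}
```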
UPDATE:
I am still not able to pin down this problem but I see a strange behavior:
If I use PC-Anywhere to observe the operation of my software on one of the terminals the problem goes away!
Even after running 8hr. straight the software runs as it should - it will even free memory (I put a little memorycounter-display in the main-screen - let's say I connect to the terminal and see that memory is low - after waiting a few minutes the memory is reclaimed)
So I think Devin (one Answer below) has a lead in the right direction - something in the Remote-Control software unblocks the finalizer-thread or whatever is blocking the GC - be it the simulated keyboard/mouse or whatever.
Any thoughts on this?
We had a (somewhat) similar issue running my app on a tablet. The memory would be reclaimed when run on a desktop, but not when run on a tablet or some other device that used a PC Input panel. The problem is that the finalization queue was getting stuck. The COM object finalizer was waiting to run something on the main thread, which didn't have a message loop.
The solution was to find an adequate time to invoke Application.DoEvents(). We had a method that was called intermittently, and we invoked DoEvents() on every 10th call. I don't know if this is the same issue you are having, but maybe it can shed some light.
EDIT: I do need to make it clear, in general, calling DoEvents() is a bad idea. It works in that case because there isn't any UI on that thread or anything else happening that those events can interfere with.
From the screenshots it is interesting to see that the LOH grows while the used space does not grow much. The free space in the LOH is growing a lot, which indicates memory fragmentation due to pinned objects. This looks like a stuck finalizer thread, which prevents the cleanup of managed objects. You should get a memory dump and check in which method the finalizer thread is stuck. You can do this quite easily with WinDbg.

C# Production Server, Do I collect the garbage?

I know there's tons of threads about this. And I read a few of them.
I'm wondering if in my case it is correct to GC.Collect();
I have a server for an MMORPG; in production it is online day and night. The server is restarted every other day to deploy changes to the production codebase. Every twenty minutes the server pauses all other threads and serializes the current game state. This usually takes 0.5 to 4 seconds.
Would it be a good idea to GC.Collect(); after serialization?
The server is, obviously, constantly creating and destroying game items.
Would I see a noticeable gain in performance or memory usage?
Should I not manually collect?
I've read about how collecting can be bad if used in the wrong moments or too frequently, but I'm thinking these saves are both a good moment to collect, and not that frequent.
The server runs on .NET Framework 4.0.
Update in answer to a comment:
We are randomly experiencing server freezes. Sometimes, unexpectedly, the server's memory usage rises steadily until it reaches a point where the server takes far too long to handle any network operation. Thus, I'm considering a lot of different approaches to solve the issue, and this is one of them.
The garbage collector knows best when to run, and you shouldn't force it.
It will not improve performance or memory usage. The CLR triggers the GC to collect objects that are no longer used whenever there is a need to do so.
Answer to the updated part:
Forcing the collection is not a good solution to the problem. You should rather look a bit deeper into your code to find out what is wrong. If memory usage grows unexpectedly, you might have an issue with unmanaged resources that are not properly handled, or even "leaky" managed code.
One more thing: I would be surprised if calling GC.Collect fixed the problem.
"Every twenty minutes the server pauses all other threads, and serializes the current game state. This usually takes 0.5 to 4 seconds"
If all your threads are suspended already anyway you might as well call the garbage collection, since it should be fairly fast at this point. I suspect doing this will only mask your real problem though, not actually solve it.
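For anyone who wants to measure whether a forced collection at the save point changes anything, here is a minimal sketch. SnapshotGc and CollectAfterSave are hypothetical names, and the saveState delegate stands in for the server's own serialization routine. It reports how many bytes the forced collection freed; as noted above, a low number while the process keeps growing would suggest that the objects are still referenced somewhere.

```csharp
using System;

public static class SnapshotGc
{
    // Runs the (hypothetical) save routine, then forces a full collection
    // and returns the approximate number of managed bytes that were freed.
    public static long CollectAfterSave(Action saveState)
    {
        saveState();

        long before = GC.GetTotalMemory(false);

        GC.Collect();                  // full blocking collection
        GC.WaitForPendingFinalizers(); // let finalizers release their objects
        GC.Collect();                  // collect what the finalizers freed

        long after = GC.GetTotalMemory(false);
        return before - after;
    }
}
```

Logging the returned value alongside the process's private bytes at each save is one way to tell a managed leak apart from growth outside the managed heap.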
"We are randomly experiencing server freezes, sometimes, unexpectedly, the server memory usage will raise increasingly until it reaches a point when the server takes way too long to handle any network operation. Thus, I'm considering a lot of different approaches to solve the issue, this is one of them."
This sounds more like you are actually still referencing all these objects that use the memory. If you weren't, the GC would run due to the memory pressure and try to release those objects. You might be looking at an actual bug in your production code (i.e. objects that are still subscribed to events or otherwise are being referenced when they shouldn't be) rather than something you can fix by manually taking out the garbage.
If possible in this scenario you should run a performance analysis to see where your bottlenecks are and what part of your code is causing the brunt of the memory allocations.
Could the memory increase be an "attack" by a player with a fake/modified game-client? Is a lot of memory allocated by the server when it accepts a new client connection? Does the server handle bogus incoming data well?

Determining the source of a thread

I've been experiencing a high degree of flicker and UI lag in a small application I've developed to test a component that I've written for one of our applications. Because the flicker and lag were taking place during idle time (when there should--seriously--be nothing going on), I decided to do some investigating. I noticed a few threads in the Threads window that I wasn't aware of (not entirely unexpected), but what caught my eye was that one of the threads was set to Highest priority. This thread exists at the time Main() is called, even before any of my code executes. I've discovered that this thread appears to be present in every .NET application I write, even console applications.
Being the daring soul that I am, I decided to freeze the thread and see what happened. The flickering did indeed stop, but I experienced some oddness when it came to doing database interaction (I'm using SQL CE 3.5 SP1). My thought was that this might be the thread that the database is actually running on, but considering it's started at the time the application loads (before any references to the DB) and is present in other, non-database applications, I'm inclined to believe this isn't the case.
Because this thread (like a few others) shows up with no data in the Location column and no Call Stack listed if I switch to it in the debugger while paused, I tried matching the StartAddress property through GetCurrentProcess().Threads for the corresponding thread, but it falls outside all of the currently loaded modules' address ranges.
Does anyone have any idea what this thread is, or how I might find out?
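The StartAddress lookup described above can be sketched like this. ThreadInspector is a hypothetical helper; Process.Threads, ProcessThread.Id, PriorityLevel and StartAddress are real System.Diagnostics members, but some of them are not supported on every platform, so each read is wrapped defensively.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class ThreadInspector
{
    // Returns one line per OS thread in the current process.
    public static List<string> DescribeThreads()
    {
        var lines = new List<string>();

        using (var process = Process.GetCurrentProcess())
        {
            foreach (ProcessThread thread in process.Threads)
            {
                // These properties can throw on some platforms or for threads
                // that exit while being inspected, so fall back to "n/a".
                string priority = ReadOrNa(() => thread.PriorityLevel.ToString());
                string start = ReadOrNa(() => "0x" + thread.StartAddress.ToInt64().ToString("X"));

                lines.Add("id=" + thread.Id + " priority=" + priority + " start=" + start);
            }
        }

        return lines;
    }

    private static string ReadOrNa(Func<string> read)
    {
        try
        {
            return read();
        }
        catch (Exception)
        {
            return "n/a";
        }
    }
}
```

Printing the result with `foreach (var line in ThreadInspector.DescribeThreads()) Console.WriteLine(line);` gives a list that can be compared against the Threads window in the debugger.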
Edit
After doing some digging, it looks like the StartAddress is in kernel32.dll (based upon nearby memory contents). This leads me to think that this is just the standard system function used to start the thread, according to this page, which basically puts me back at square one as far as determining where this thread actually comes from. This is further confirmed by the fact that ALL of the threads in this list have the same value for StartAddress, leading me to ask exactly what the purpose is...?
Edit 2
Process Explorer led me to an actually meaningful start address. It looks like it's mscorwks.dll!CreateApplicationContext+0xbbef. This DLL is in %WINDOWS%\Microsoft.NET\Framework\v2.0.50, so it looks like it's clearly a runtime assembly. I'm still not sure why
it's Highest priority
it appears to be causing hiccups in my application
You could try using Sysinternals. Process Explorer lets you dig in pretty deep. Right-click on the process to access Properties, then open the "Threads" tab. In there, you can see the thread's stack and module.
EDIT:
After asking around some, it seems that your "Highest" priority thread is the Finalizer thread that runs due to a garbage collection. I still don't have a good reason as to why it would constantly keep running. Maybe you have some funky object lifetime behavior going on in your process?
I'm not sure what this is, but if you turn on unmanaged debugging, and set up Visual Studio with the Windows symbol server, you might get some more clues.
Might be the Garbage Collector thread. I noticed it too when I was once investigating a finalizer-related bug. Perhaps your system memory is low and the GC is trying to collect all the time? This was the case in the previously mentioned bug too. I couldn't reproduce it on my machine, but a co-worker of mine had a machine with less RAM where it would reappear like clockwork.

Causes for web service memory leak

We have a web service that uses up more and more private bytes until that application stops responding. The managed heap (mostly Gen2) will show some 200-250 MB, while private bytes shows over 1GB. What are possible causes of a memory leak outside of the managed heap?
I've already checked for the following:
Prolific dynamic assemblies (Xml serialization, regex, etc.)
Session state (turned off)
System.Policy.Evidence memory leak (SP1 installed)
Threading deadlock (no use of Join, only lock)
Use of SQLOLEDB (using SqlClient)
What other sources can I check for?
Make sure your app is compiled in release mode. If you compile under debug mode and deploy that, simply instantiating a class that has an event defined (the event doesn't even need to be raised) will cause a small piece of memory to leak. Instantiating enough of these objects over a long enough period of time will cause all the memory to be used. I've seen web apps that would use up all the memory within a matter of hours, simply because a debug build was used. Compiling as a release build immediately and permanently fixed the problem.
I would recommend you view snapshots of the stack at various times, and see what's using up the memory. If your application is using Java, then jmap works extremely well - you just give it the PID of the java process.
If using something else, try Lambda Probe (http://www.lambdaprobe.org/d/index.htm). It doesn't show as much detail, but will at least show you memory use.
I had a bad memory leak in my JDBC code that ended up being traced to a change in the JDBC specification a few years ago that I missed (with respect to closing statements and such). It took a combination of Lambda Probe and then jmap to localize the problem enough to fix it.
Cheers,
-R
Also look for:
COM Assemblies being loaded
DB Connections not being closed
Cache & State (Session, Application)
Try forcing the Garbage Collector (GC) to run (write a page that does it when it loads) or try the instrumentation, but that's a bit hit and miss in my experience. Another thing would be to keep it running and see if it runs out of memory.
What could be happening is that there is plenty of memory and Windows does not signal your app to clean up. This causes the app to look like it's using more and more memory because it can, when in fact the system can reclaim the memory when it needs to. SQL Server and Exchange do this a lot. The idea is: why cause an unnecessary cleanup when there are plenty of resources?
Rob
Garbage collection does not run until a request for memory is denied due to lack of available memory. This can often make things look like a memory leak when one is not around.
Do you have any events and event handlers within the service? Services often have static variables, and if you are creating event handlers from the static instances, connected to a non-static instance object, the static will hold a reference to the instance forever, which will stop it from releasing.
Double check that trace is not enabled. I've seen instances of trace slowly consuming memory until the app reaches its app pool limit.
For what it is worth, my issue was not with the service, but with the HttpClient that was calling it.
The client was not properly disposed, so it kept the connection open and the memory locked.
After disposing the client the service released the memory as expected.
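A minimal sketch of the fix described in this last answer, assuming the caller creates a client per call: both the HttpClient and the response are wrapped in using blocks so the connection is released. ServiceCaller and StubHandler are hypothetical; the stub only exists so the example can run without a real service. Reusing one long-lived HttpClient for the whole application is a common alternative to creating and disposing one per call.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the real service, so the sketch runs offline.
public sealed class StubHandler : HttpMessageHandler
{
    private readonly string _body;

    public bool Disposed { get; private set; }

    public StubHandler(string body)
    {
        _body = body;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(_body)
        };
        return Task.FromResult(response);
    }

    protected override void Dispose(bool disposing)
    {
        Disposed = true;
        base.Dispose(disposing);
    }
}

public static class ServiceCaller
{
    // Disposing the client also disposes its handler, which is what
    // closes the underlying connection when a real handler is used.
    public static async Task<string> FetchAsync(HttpMessageHandler handler, string url)
    {
        using (var client = new HttpClient(handler))
        using (var response = await client.GetAsync(url))
        {
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```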
