How to debug "Not enough storage is available to process this command" - c#

We've started to experience the error "Not enough storage is available to process this command". The application is WPF, and the exception starts to pop up after some hours of working normally.
    System.ComponentModel.Win32Exception (0x80004005): Not enough storage is available to process this command
       at MS.Win32.UnsafeNativeMethods.RegisterClassEx(WNDCLASSEX_D wc_d)
       at MS.Win32.HwndWrapper..ctor(Int32 classStyle, Int32 style, Int32 exStyle, Int32 x, Int32 y, Int32 width, Int32 height, String name, IntPtr parent, HwndWrapperHook[] hooks)
       at System.Windows.Interop.HwndSource.Initialize(HwndSourceParameters parameters)
       at System.Windows.Window.CreateSourceWindow(Boolean duringShow)
       at System.Windows.Window.CreateSourceWindowDuringShow()
       at System.Windows.Window.SafeCreateWindowDuringShow()
       at System.Windows.Window.ShowHelper(Object booleanBox)
       at System.Windows.Window.Show()
       at System.Windows.Window.ShowDialog()
My understanding is that this is some kind of out-of-memory exception, specific to the allocation of Windows resources. What could be causing it, and how can I debug it?
Update
I have reviewed the topic suggested by #Thili77 (this one). I used GDIView and Task Manager to look at the handles consumed while our app was running (Handles, USER Objects and GDI Objects in taskmgr), and they don't appear to be growing. My next test is to run it for a day without VS (previously it was running under the VS host process) and check whether this still happens. I'm still looking for any advice or tips if anybody has any.
Update #2
It happens on a new, clean PC without the VS host process. The Handles, USER Objects and GDI Objects counts are OK at the time of the crash. When the PC is in the crashed state, nothing works properly. It looks like handles really are leaked, but ProcMon doesn't show big numbers for these values. Also, weirdly, this always happens around 7-8 pm, when there is nobody in the office, regardless of when I started the app. This is already the third crash like that. Coincidence? The only thing I've noticed that seems weird is a large number of page faults for the app, which grows constantly. Could this be related? Does not appear anymore, see Update #3
Update #3
Below are the details of a crash I experienced. The system is x86, the app is x86, W7 SP1.
The state shown in the screenshots is from right after the crash, with WinDbg pausing the process.
For some reason the exception now has a different message: The operation completed successfully. But it is still the same Win32Exception coming from the same piece of code.
I should also point out that I'm running with a reduced desktop heap and with the AppAnalyzer Basic options on, in order to make the fault more frequent (which seems to work). The time correlation was indeed a coincidence; I no longer see any time-related pattern.

One possibility is that the global atom table has run out of available space. There is a limit of 0x4000 string atoms in the table, and there is also a limit on the total amount of space allocated to the table. Window classes are one of the things that go into this table.
I have never attempted to debug such an issue myself, but I did find an article about checking for this problem using WinDbg: Identifying Global Atom Table Leaks. You might want to look into that as a possible cause.
If this turns out to be the culprit, one possible cause is that the application is not closing Window instances. HwndWrapper cleans up its global atom in its Dispose, which happens in response to WM_DESTROY, which happens in response to calling Close on the Window (or setting DialogResult, which ends up closing the window if the value changes and the window was shown by calling ShowDialog rather than Show). There may be other possible causes for an atom leak as well.
P.S. The reason I suspect this is because "Not enough storage is available to process this command" is the error that is returned when RegisterClassEx is unable to add to the global atom table.

Looks like an issue which was not resolved on purpose by Microsoft, check this Connect link, in which it was stated:
We appreciate the feedback. However, this issue will not be addressed in the next version of WPF. Thank you.
–WPF Team.
A workaround is provided, it might help:
You can work around this bug by adding the following code to your thread proc:
    Dispatcher dispatcher = Dispatcher.CurrentDispatcher;
    dispatcher.BeginInvokeShutdown(DispatcherPriority.Normal);
    Dispatcher.Run();
This asks the dispatcher associated with the thread to shut down right away.

In my experience, I have received this type of exception when the UI thread hangs and other threads continue posting messages to the main application UI dispatcher. After some period of time the message queue fills up, and then you receive this exception.
To debug that, you may need to find thread 1 (which is the UI thread) in VS during a debug session and monitor its activity. Maybe there is an infinite wait on some external event, or something similar.

Related

Window "capture" application, upon unexpected termination, allows captured windows to disappear, how can I prevent/fix this issue?

I have an application (C# + WPF) that attempts to wrest control of the graphical interface of any process passed to it as an input and resize/reposition for my own purposes.
It does its job rather well, I think. Upon expected termination (the base class inherits from IDisposable) the "captured" process is released - its parent is set to the original, its windowstyle is reset, etc. etc.
In fact, on testing, I can capture, release, recapture, and so on, the same process as many times as I want with no issues.
However, upon unexpected termination (say another process forcefully kills it), the process never regains its graphical interface! I can tell it's still running, but I can never set that process back to its original state.
It almost seems like the process doesn't respond to window-based Win32 API calls that set specific window features anymore (for example, I can get information with GetParent, GetWindowThreadProcessId, etc but calling ShowWindow or related results in nothing).
Any suggestions on why this is happening? I'm guessing that since I set the parent of the process to my WPF application (which then unexpectedly closes) it causes some issue in trying to recover the initial interface?
This is why it's happening (or, at least, an indication of why I had so much difficulty finding the issue out on my own); can I recover from it? And, if so, how?
Edit -
IInspectable makes a good point in the comments, question adjusted to make better sense for this particular application.
It seems I've gotten my answer; so, for the sake of completeness I'll post what I've gotten here in case anyone else has a similar issue.
According to the information provided by IInspectable in here and here (with more context in the comments), it seems that what I'm trying to do here (assign a new parent cross-process) is essentially unsupported behavior.
My Solution:
Recovering (at least at the point that I'm talking about - i.e. unexpected crashes or exits) probably isn't feasible, as we've already gone off the end in undetermined/unknown behavior. So I've decided to go for the preventative route.
Our current project already makes use of the Nancy framework to communicate across servers/processes so I'm going to "refine" our shutdown procedure a bit for my portion of the program to allow it to exit more gracefully.
In the case of a truly unexpected termination, I'm still at a loss. I could just restart the processes (actually services with a console output, in our case, but w/e), but my application is just a GUI/interface and isn't very important compared to the function these processes serve. I may make some sort of semaphore file that indicates whether a successful shutdown occurred, and branch my code so that it indicates that the processes are no longer visible until the next time they're restarted.
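The semaphore-file idea could be sketched roughly as follows. This is only an illustration under assumptions: the class name `CleanShutdownMarker`, its method names, and the choice of a plain marker file are all hypothetical and not taken from the project described above.

```csharp
using System;
using System.IO;

// Hypothetical helper: a marker file that exists only while the app is running.
public sealed class CleanShutdownMarker
{
    private readonly string _path;

    public CleanShutdownMarker(string path)
    {
        _path = path;
    }

    // True if the marker from a previous run is still present,
    // i.e. that run never reached MarkCleanShutdown().
    public bool PreviousRunCrashed()
    {
        return File.Exists(_path);
    }

    // Call once at startup, after checking PreviousRunCrashed().
    public void MarkRunning()
    {
        File.WriteAllText(_path, DateTime.UtcNow.ToString("o"));
    }

    // Call at the end of the graceful shutdown path (e.g. from Dispose).
    public void MarkCleanShutdown()
    {
        File.Delete(_path);
    }
}
```

At startup the application would check `PreviousRunCrashed()` first and, if it returns true, treat the captured processes as not visible until they are restarted.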

createWindowEx failed exception

A "createWindowEx failed" exception is thrown by my server, which uses the overbyteICS DLL in a .NET C# Windows Forms application.
I have a server which handles a large number of clients throughout the day. But when the total connection count (i.e. connections and disconnections altogether) reaches 10000, the above error appears, the server stops accepting user connections, and the machine hangs.
I agree with Roger, but let's confirm it first. When this error occurs, run Spy++ (under Microsoft Visual Studio\Tools in the Start menu) and look through the window tree. Expand the branches and look for duplicates of some windows. There will surely be many of them, but you are interested in hundreds and thousands of copies. If you find that, then it's what Roger said, and there's almost no solution other than periodically restarting the connection-server process (or the whole machine, just in case) to be sure it doesn't hang (of course, a server restart will irritate the users almost as much), or fixing/patching/reimplementing the connection-server process to be more resource-friendly.
Note that while opening a hidden window per single connection is a very wasteful approach, it still should not hang the machine. It should simply drop the connections that it cannot handle. Here, it seems it has no limits implemented at all, which is a bug.
edit: on pre-NT (i.e. win9x) the limit is hardcoded. On NT class systems, you can try to tweak the pool:
http://weblogs.asp.net/israelio/archive/2007/02/07/max-num-of-open-windows-under-xp-2003-vista-resolved.aspx
But still, I'd consider that a last resort, as the problem will return when the number of connections rises again. First, try to ping the server developers to fix that permanently.
You diagnosed it well. Yes, a CreateWindowEx() failure and 10,000 belong together. 10,000 is the default user32 object quota for a process. In other words, a single process isn't allowed to create more than 10,000 windows. This is a counter-measure against apps that leak window handles, a very common bug. The total number of windows that can be created in a session is a limited resource, having one process consume them all would cause outright failure, you couldn't shut down Windows anymore.
Clearly it is not a leak in your case. You can find temporary relief by changing a registry setting, HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\USERProcessHandleQuota. Reboot to make it effective.
Increasing from 10,000 to the maximum of 18,000 should be okayish if the machine doesn't otherwise run processes that require a lot of windows. Something you can see with Taskmgr.exe, Processes tab. Choose View + Select Columns and tick USER objects. Also tick GDI objects and Handles while you are at it, other resources that have a quota.
Long term, this behavior does not scale well. You'll need to find the code that creates a window handle for every web request and fix it.
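The USER object count that Task Manager shows can also be read from inside the process, which makes it possible to log it or refuse new connections before the quota is hit. The following is a minimal sketch, assuming the Win32 `GetGuiResources` function from user32; the class name `UserObjectMonitor`, the helper `IsNearQuota`, and the 80% threshold used in the note below are illustrative assumptions, not part of the original server.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class UserObjectMonitor
{
    private const uint GR_GDIOBJECTS = 0;   // count of GDI objects
    private const uint GR_USEROBJECTS = 1;  // count of USER objects (windows, menus, ...)

    // Default per-process USER object quota mentioned above.
    public const int DefaultUserQuota = 10000;

    [DllImport("user32.dll")]
    private static extern uint GetGuiResources(IntPtr hProcess, uint uiFlags);

    // Windows only: USER object count of the current process.
    public static int CurrentUserObjects()
    {
        using (Process self = Process.GetCurrentProcess())
        {
            return (int)GetGuiResources(self.Handle, GR_USEROBJECTS);
        }
    }

    // Windows only: GDI object count of the current process.
    public static int CurrentGdiObjects()
    {
        using (Process self = Process.GetCurrentProcess())
        {
            return (int)GetGuiResources(self.Handle, GR_GDIOBJECTS);
        }
    }

    // Pure helper: true when the count has reached the given fraction of the quota.
    public static bool IsNearQuota(int count, int quota, double fraction)
    {
        return count >= quota * fraction;
    }
}
```

A server could call `IsNearQuota(CurrentUserObjects(), DefaultUserQuota, 0.8)` before accepting a connection and reject it instead of failing inside CreateWindowEx.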

WeakReferences are not freed in embedded OS

I've got a strange behavior here:
I get a massive memory leak in production running a WPF application on a DLOG terminal (Windows Embedded Standard SP1), while it behaves perfectly fine if I run it locally on a normal desktop (Win7 Prof.).
After many unsuccessful attempts to find any problem, I put one of those terminals directly beside my monitor, installed the ANTS Memory Profiler, and did a one-hour test run simulating user operations on both the terminal and my development PC.
The result is that, for some strange reason, the embedded system piles up a huge number of WeakReference and EffectiveValueEntry[] objects.
Here are some pictures:
Development (PC):
And the terminal:
Just look at the class list...
Has anyone seen something like this before and are there known solutions to this?
Where can I get help?
(PS: the terminals were installed with images prepared for .NET 4)
PPS: for the close-voter: I think the question is clear: how can I fix this.
You could argue if this is a IT/OS problem vs. a programming problem but I think if I post this in Server Fault it will get a off-topic close in no time...
UPDATE:
I was able to find a big portion of the problem - but it feels a bit like C++:
I use a ViewModel-like Items class for a WPF list that provides (among other things) an ICommand (RelayCommand pattern). The Items were created on the fly in the getter of a ViewModel property for the view, and it seems that the application/GC never freed those unused commands, or the subscriptions to their CanExecuteChanged; the memory profiler shows those as "held by a weak reference". I changed my code to reuse those item view models and to dispose/set to null every used property in their Dispose, and I use this for cleanup as well. As I said: it feels like "delete" in the old C++ days.
On top of this I use a forced GC.Collect every 30 minutes (yeah, I know you never should, but I have no other solution so far).
With this setup the application has run for 6+ hours without problems so far, but it doesn't feel right.
I cannot understand why those WeakReferences are not reclaimed as they are on my desktop machine...
Any thoughts on this? Please!
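The change described in this update (reusing the item view models instead of building them in the getter, and clearing them in Dispose) might look roughly like the sketch below. All type and member names here are hypothetical, and the ICommand that the real items expose is left out to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item view model; in the question it also exposes an ICommand.
public sealed class ItemViewModel : IDisposable
{
    public string Name { get; private set; }
    public bool IsDisposed { get; private set; }

    public ItemViewModel(string name)
    {
        Name = name;
    }

    // Drop references so nothing keeps the item (or its command) alive.
    public void Dispose()
    {
        Name = null;
        IsDisposed = true;
    }
}

public sealed class ListViewModel : IDisposable
{
    private readonly List<ItemViewModel> _items = new List<ItemViewModel>();

    public ListViewModel(IEnumerable<string> names)
    {
        foreach (string name in names)
            _items.Add(new ItemViewModel(name));
    }

    // Returns the same instances on every read instead of building new ones.
    public IReadOnlyList<ItemViewModel> Items
    {
        get { return _items; }
    }

    public void Dispose()
    {
        foreach (ItemViewModel item in _items)
            item.Dispose();
        _items.Clear();
    }
}
```

Because the getter no longer allocates, repeated binding reads by the view do not create new items whose command subscriptions then have to be collected.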
UPDATE:
I am still not able to pin down this problem but I see a strange behavior:
If I use PC-Anywhere to observe the operation of my software on one of the terminals the problem goes away!
Even after running 8hr. straight the software runs as it should - it will even free memory (I put a little memorycounter-display in the main-screen - let's say I connect to the terminal and see that memory is low - after waiting a few minutes the memory is reclaimed)
So I think Devin (one Answer below) has a lead in the right direction - something in the Remote-Control software unblocks the finalizer-thread or whatever is blocking the GC - be it the simulated keyboard/mouse or whatever.
Any thoughts on this?
We had a (somewhat) similar issue running my app on a tablet. The memory would be reclaimed when run on a desktop, but not when run on a tablet or some other device that used a PC Input panel. The problem is that the finalization queue was getting stuck. The COM object finalizer was waiting to run something on the main thread, which didn't have a message loop.
The solution was to find an adequate time to invoke Application.DoEvents(). We had a method that would be called intermittently and we invoke it with every 10th call. I don't know if this is the same issue you are having, but maybe it can shed some light.
EDIT: I do need to make it clear, in general, calling DoEvents() is a bad idea. It works in that case because there isn't any UI on that thread or anything else happening that those events can interfere with.
From the screenshots it is interesting to see that the LOH grows while the used space does not grow much. The free space in the LOH is growing a lot, which indicates memory fragmentation due to pinned objects. This looks like a stuck finalizer thread, which prevents the cleanup of managed objects. You should get a memory dump and check in which method the finalizer thread is stuck. You can do this quite easily with WinDbg.
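Before taking a dump, a stuck finalizer thread can also be detected from inside the application. The sketch below is one possible approach, not something from the answers above: it allocates a throwaway object with a finalizer, forces a collection, and waits a bounded time for that finalizer to run. The names `FinalizerProbe` and `Canary` are made up for illustration, and the probe is not meant to be called concurrently.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public static class FinalizerProbe
{
    private static readonly ManualResetEventSlim Finalized = new ManualResetEventSlim(false);

    private sealed class Canary
    {
        // Runs on the finalizer thread once the canary has been collected.
        ~Canary() { Finalized.Set(); }
    }

    // Separate, non-inlined method so no local keeps the canary alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateCanary()
    {
        new Canary();
    }

    // Returns false if the canary's finalizer did not run within the timeout,
    // which suggests the finalizer thread is blocked behind another finalizer.
    public static bool FinalizerRespondsWithin(TimeSpan timeout)
    {
        Finalized.Reset();
        CreateCanary();
        GC.Collect();
        return Finalized.Wait(timeout);
    }
}
```

Logging the result of `FinalizerProbe.FinalizerRespondsWithin(TimeSpan.FromSeconds(10))` periodically on a background thread would show roughly when finalization stops making progress on the terminal.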

Diagnose/Debug potential stack corruption .NET application

I think I have a curly one here... I have a WinForms application that crashes fairly regularly every hour or so when running as an x64 process. I suspect this is due to stack corruption and would like to know if anyone has seen a similar issue or has some advice for diagnosing and detecting the issue.
The program in question has no visible UI. It's just a message window that sits in the background and acts as a sort of 'middleware' between our other client programs and a server.
It dies in different ways on different machines. Sometimes it's an 'APPCRASH' dialog that reports a fault in ntdll.dll. Sometimes it's an 'APPCRASH' that reports our own dll as the culprit. Sometimes it's just a silent death. Sometimes our unhandled exception hook logs the error, sometimes it doesn't.
In the cases where Windows Error Reporting kicks in, I've examined memory dumps from several different crash scenarios and found the same managed exception in memory each time. This is the same exception I see reported as an unhandled exception in the cases where it logs before it dies.
I've also been lucky (?) enough to have the application crash while I was actively debugging with Visual Studio - and saw that same exception take down the program.
Now here's the kicker. This particular exception was thrown, caught and swallowed in the first few seconds of the program's life. I have verified this with additional trace logging and I have taken memory dumps of the application a couple of minutes after application startup and verified that exception is still sitting there in the heap somewhere. I've also run a memory profiler over the application and used that to verify that no other .NET object had a reference to it.
The code in question looks a bit like this (vastly simplified, but maintains the key points of flow control)
    public class AClass
    {
        public object FindAThing(string key)
        {
            object retVal = null;
            Collection<Place> places = GetPlaces();
            foreach (Place place in places)
            {
                try
                {
                    retVal = place.FindThing(key);
                    break;
                }
                catch {} // Guaranteed to only be a 'NotFound' exception
            }
            return retVal;
        }
    }
    public class Place
    {
        public object FindThing(string key)
        {
            bool found = InternalContains(key); // <snip> some complex if/else logic
            if (found)
                return InternalFetch(key);
            throw new NotFoundException(/*UsefulInfo*/);
        }
    }
The stack trace I see, both in the event log and when looking at the heap with windbg looks a bit like this.
    Company.NotFoundException:
       Place.FindThing()
       AClass.FindAThing()
Now... to me that reeks of something like stack corruption. The exception is thrown and caught while the application is starting up. But the pointer to it survives on the stack for an hour or more, like a bullet in the brain, and then suddenly breaches a crucial artery, and the application dies in a puddle.
Extra clues:
The code within 'InternalFetch' uses some Marshal.[Alloc/Free]CoTask and pinvoke code. I have run FxCop over it looking for portability issues, and found nothing.
This particular manifestation of the issue is only affecting x64 code built in release mode (with code optimization on). The code I listed for the 'Place.Find' method reflects the optimized .NET code. The unoptimized code returns the found object as the last statement, not 'throw exception'.
We make some COM calls during startup before the above code is run... and in a scenario where the above problem is going to manifest, the very first COM call fails. (Exception is caught and swallowed). I have commented out that particular COM call, and it does not stop the exception sticking around on the heap.
The problem might also affect 32 bit systems, but if it does - then the problem does not manifest in the same spot. I was only sent (typical users!) a few pixels worth of a screen shot of an 'APP CRASH' dialog, but the one thing I could make out was 'StackHash_2264' in the faulting module field.
EDIT:
Breakthrough!
I have narrowed down the problem to a particular call to SetTimer.
The pInvoke looks like this:
    [DllImport("user32")]
    internal static extern IntPtr SetTimer(IntPtr hwnd, IntPtr nIDEvent, int uElapse, TimerProc CB);
    internal delegate void TimerProc(IntPtr hWnd, uint nMsg, IntPtr nIDEvent, int dwTime);
There is a particular class that starts a timer in its constructor. Any timers set before that object is constructed work. Any timers set after that object is constructed work. Any timer set during that constructor causes the application to crash, more often than not. (I have a laptop that crashes maybe 95% of the time, but my desktop only crashes 10% of the time).
Whether the interval is set to 1 hour or 1 second seems to make no difference. The application dies when the timer is due, usually by throwing some previously handled exception as described above. The callback does not actually get executed. If I set the same timer on the very next line of managed code after the constructor returns, all is fine and happy.
I have had a debugger attached when the bad timer was about to fire, and it caused an access violation in 'DispatchMessage'. The timer callback was never called. I have enabled the MDAs that relate to managed callbacks being garbage collected, and it isn't triggering. I have examined the objects with sos and verified that the callback still existed in memory, and that the address it pointed to was the correct callback function.
If I run '!analyze -v' at this point, it usually (but not always) reports something along the lines of 'ERROR_SXS_CORRUPT_ACTIVATION_STACK'
Replacing the call to SetTimer with Microsoft's 'System.Windows.Forms.Timer' class also stops the crash. I've used Reflector on the class and can see that internally it still calls SetTimer, but does not register a procedure. Instead it has a native window that receives the callback. Its pInvoke definition actually looks wrong... it uses 'ints' for the eventId, where the MSDN documentation says it should be a UIntPtr.
Our own code originally also used 'int' for nIDEvent rather than IntPtr - I changed it during the course of this investigation - but the crash continued both before and after this declaration change. So the only real difference that I can see is that we are registering a callback, and the Windows class is not.
So... at this stage I can 'fix' the problem by shuffling one particular call to SetTimer to a slightly different spot. But I am still no closer to actually understanding what is so special about starting the timer within that constructor that causes this error. And I dearly would like to understand the root cause of this issue.
Just thinking about it briefly, it sounds like an x64 interop issue (i.e., calling x32 native functions from x64 managed code is fraught with danger). Does the problem go away if you force your application to compile for the x32 platform in the project properties?
You can read suggestions on forcing x32 compile during x32/x64 development on Dotnetrocks. Richard Campbell's suggestion is that Visual Studio should default to x32 platform and not AnyCPU.
http://www.dotnetrocks.com/default.aspx?showNum=341 (transcript).
With regard to advanced debugging, I have not had a chance to debug x64 interop code, but I hear that this book is a great resource: Advanced .NET Debugging.
Finally, one thing you might try is force Visual Studio to break when an exception is thrown.
Use something like DebugDiag for x64 or WinDbg to write a dump on Kernel32!TerminateProcess and on second-chance .NET exceptions, which should give you the actual .ecxr context frame of the exception that occurred.
This should help you in identifying the call-stack for the process terminate.
IMO it could be mostly because of PInvoke calls. You could use Managed Debugging Assistants to debug these issues.
If MDA is used along with Windbg it would give out messages that would be helpful in debugging
Also I have found tools from the http://clrinterop.codeplex.com/ team are extremely handy when dealing with interop
EDIT
This should explain why it is not working in 64-bit: Issue with callback method in SetTimer Windows API called from C# code.
This does sound like a corruption issue. I would go through all of your interop calls and ensure that all of the parameters to the DllImport'ed functions are the correct types. For example, using an int in place of an IntPtr will work in 32-bit code but can crash in 64-bit.
I would use a site like PInvoke.net to verify all of the signatures.
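As an illustration of the signature check suggested above, here is a sketch of SetTimer/KillTimer declarations whose parameter widths follow the documented Win32 prototypes (UINT_PTR for the timer id, UINT and DWORD as 32-bit unsigned values). The wrapper class `NativeTimers` and its `Start` method are hypothetical. The question reports that changing the id type alone did not stop the crash, so this shows the general technique rather than a confirmed fix.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeTimers
{
    // void CALLBACK TimerProc(HWND hwnd, UINT uMsg, UINT_PTR idEvent, DWORD dwTime)
    [UnmanagedFunctionPointer(CallingConvention.Winapi)]
    public delegate void TimerProc(IntPtr hWnd, uint uMsg, UIntPtr idEvent, uint dwTime);

    // UINT_PTR SetTimer(HWND hWnd, UINT_PTR nIDEvent, UINT uElapse, TIMERPROC lpTimerFunc)
    [DllImport("user32.dll", SetLastError = true)]
    public static extern UIntPtr SetTimer(IntPtr hWnd, UIntPtr nIDEvent, uint uElapse, TimerProc lpTimerFunc);

    // BOOL KillTimer(HWND hWnd, UINT_PTR uIDEvent)
    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool KillTimer(IntPtr hWnd, UIntPtr uIDEvent);

    // Rooted in a static field so the GC cannot collect the delegate
    // while Windows still holds the native function pointer.
    private static TimerProc _callback;

    // Windows only; the calling thread needs a message loop for the callback to fire.
    public static UIntPtr Start(uint intervalMs, TimerProc callback)
    {
        _callback = callback;
        return SetTimer(IntPtr.Zero, UIntPtr.Zero, intervalMs, _callback);
    }
}
```

The pointer-sized id types are what matters for x64: UIntPtr is 8 bytes there, whereas an int stays at 4 bytes.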

Determining the source of a thread

I've been experiencing a high degree of flicker and UI lag in a small application I've developed to test a component that I've written for one of our applications. Because the flicker and lag was taking place during idle time (when there should--seriously--be nothing going on), I decided to do some investigating. I noticed a few threads in the Threads window that I wasn't aware of (not entirely unexpected), but what caught my eye was one of the threads was set to Highest priority. This thread exists at the time Main() is called, even before any of my code executes. I've discovered that this thread appears to be present in every .NET application I write, even console applications.
Being the daring soul that I am, I decided to freeze the thread and see what happened. The flickering did indeed stop, but I experienced some oddness when it came to doing database interaction (I'm using SQL CE 3.5 SP1). My thought was that this might be the thread that the database is actually running on, but considering it's started at the time the application loads (before any references to the DB) and is present in other, non-database applications, I'm inclined to believe this isn't the case.
Because this thread (like a few others) shows up with no data in the Location column and no Call Stack listed if I switch to it in the debugger while paused, I tried matching the StartAddress property through GetCurrentProcess().Threads for the corresponding thread, but it falls outside all of the currently loaded modules address ranges.
Does anyone have any idea what this thread is, or how I might find out?
Edit
After doing some digging, it looks like the StartAddress is in kernel32.dll (based upon nearby memory contents). This leads me to think that this is just the standard system function used to start the thread, according to this page, which basically puts me back at square one as far as determining where this thread actually comes from. This is further confirmed by the fact that ALL of the threads in this list have the same value for StartAddress, leading me to ask exactly what the purpose is...?
Edit 2
Process Explorer led me to an actually meaningful start address. It looks like it's mscorwks.dll!CreateApplicationContext+0xbbef. This dll is in %WINDOWS%\Microsoft.NET\Framework\v2.0.50, so it looks like it's clearly a runtime assembly. I'm still not sure why
it's Highest priority
it appears to be causing hiccups in my application
You could try using Sysinternals. Process Explorer lets you dig in pretty deep. Right-click on the process to access Properties, then open the "Threads" tab. In there, you can see the thread's stack and module.
EDIT:
After asking around some, it seems that your "Highest" priority thread is the Finalizer thread that runs due to a garbage collection. I still don't have a good reason as to why it would constantly keep running. Maybe you have some funky object lifetime behavior going on in your process?
I'm not sure what this is, but if you turn on unmanaged debugging, and set up Visual Studio with the Windows symbol server, you might get some more clues.
Might be the Garbage Collector thread. I noticed it too when I was once investigating a finalizer-related bug. Perhaps your system memory is low and the GC is trying to collect all the time? This was the case in the previously mentioned bug too. I couldn't reproduce it on my machine, but a co-worker of mine had a machine with less RAM where it would reappear like clockwork.
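For reference, the thread list the question inspected through GetCurrentProcess().Threads can be dumped with a few lines of code. This is a minimal sketch with a hypothetical class name (`ThreadLister`). As the question found, the reported start address may just be the generic thread start routine, so a tool such as Process Explorer is still needed to resolve the real entry point.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class ThreadLister
{
    // One line per OS thread in the current process: id, priority level, start address.
    public static List<string> Describe()
    {
        var lines = new List<string>();
        using (Process self = Process.GetCurrentProcess())
        {
            foreach (ProcessThread thread in self.Threads)
            {
                string priority;
                string start;

                // These properties can throw for exited threads or on unsupported platforms.
                try { priority = thread.PriorityLevel.ToString(); }
                catch (Exception) { priority = "n/a"; }

                try { start = "0x" + thread.StartAddress.ToInt64().ToString("X"); }
                catch (Exception) { start = "n/a"; }

                lines.Add(thread.Id + " " + priority + " " + start);
            }
        }
        return lines;
    }
}
```

Printing the result of `ThreadLister.Describe()` at the start of Main() would show whether the high-priority thread already exists before any application code runs.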
