My .NET application uses BlockingCollection to process data received from RabbitMQ. Sometimes on production it stops processing messages. I made dump using procdump util and tried to inspect the dump using WinDbg. !threads command shows me one thread with LockCount=1. ~*e!clrstack command gave me this output for another thread:
Child SP IP Call Site
0f3bf068 7760c8ac [GCFrame: 0f3bf068]
0f3bf118 7760c8ac [HelperMethodFrame_1OBJ: 0f3bf118] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object)
0f3bf1a4 70ac5fb7 System.Threading.Monitor.Wait(System.Object, Int32, Boolean)
0f3bf1b4 70ad76a7 System.Threading.SemaphoreSlim.WaitUntilCountOrTimeout(Int32, UInt32, System.Threading.CancellationToken)
0f3bf1d0 70ad75f0 System.Threading.SemaphoreSlim.Wait(Int32, System.Threading.CancellationToken)
0f3bf230 703dc110 System.Collections.Concurrent.BlockingCollection`1[[System.__Canon, mscorlib]].TryAddWithNoTimeValidation(System.__Canon, Int32, System.Threading.CancellationToken)
I tried to find more information about locks. !locks command gave me only this Scanned 40 critical sections. !SyncBlock shows this:
Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner
-----------------------------
Total 1364
CCW 1
RCW 1
ComClassFactory 0
Free 713
What isGCFrame? What is CCW and RCW? I can't find it in documentation.
Also is it a way to find the thread which owns the lock?
Related
I got a memory dump. I can get the normal callstack (with line number)
When I use Debug Diag to analyze the dump I got this callstack on thread 62.
.NET Call Stack
[[HelperMethodFrame_1OBJ] (System.Threading.WaitHandle.WaitOneNative)] System.Threading.WaitHandle.WaitOneNative(System.Runtime.InteropServices.SafeHandle, UInt32, Boolean, Boolean)
mscorlib_ni!System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle, Int64, Boolean, Boolean)+21
mscorlib_ni!System.Threading.WaitHandle.WaitOne(Int32, Boolean)+31
CaptureServices.GenericInfrastructure.ExportLogic.ChannelsThread.ChannelsStateThread()+bb
mscorlib_ni!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)+15e
mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)+17
mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)+52
mscorlib_ni!System.Threading.ThreadHelper.ThreadStart()+52
[[GCFrame]]
[[DebuggerU2MCatchHandlerFrame]]
As I understand .NET has some mechanism to shows human readable names instead of adresses. Now I want this line in WinDbg:
CaptureUtilities.AudioProcessing.APProcessorThread.IterateAPStreamProcessorQueue()+49
I open WinDbg and load the dump. I execute ~62 k and get
Child-SP RetAddr Call Site
00000016`4965e0c8 00007ffc`b59113ed ntdll!NtWaitForMultipleObjects+0xa
00000016`4965e0d0 00007ffc`abde77be KERNELBASE!WaitForMultipleObjectsEx+0xe1
00000016`4965e3b0 00007ffc`abde7658 clr!WaitForMultipleObjectsEx_SO_TOLERANT+0x62
00000016`4965e410 00007ffc`abde7451 clr!Thread::DoAppropriateWaitWorker+0x1e4
00000016`4965e510 00007ffc`abdebd15 clr!Thread::DoAppropriateWait+0x7d
00000016`4965e590 00007ffc`a94ecdf1 clr!WaitHandleNative::CorWaitOneNative+0x165
00000016`4965e7c0 00007ffc`a94ecdc1 mscorlib_ni+0x48cdf1
00000016`4965e7f0 00007ffc`4cf2e97b mscorlib_ni+0x48cdc1
00000016`4965e830 00007ffc`a94e674e 0x00007ffc`4cf2e97b
00000016`4965e890 00007ffc`a94e65e7 mscorlib_ni+0x48674e
00000016`4965e960 00007ffc`a94e65a2 mscorlib_ni+0x4865e7
00000016`4965e990 00007ffc`a94ed1f2 mscorlib_ni+0x4865a2
00000016`4965e9e0 00007ffc`abc36a53 mscorlib_ni+0x48d1f2
00000016`4965ea20 00007ffc`abc36913 clr!CallDescrWorkerInternal+0x83
Ok, as I understand it is the same. Now we have
0x00007ffc`4cf2e97b
instead of
CaptureServices.GenericInfrastructure.ExportLogic.ChannelsThread.ChannelsStateThread()+bb
So I have Microsoft debug symbols, now I need to load my own symbols to see the callstack.
The question is - do I need to load all debug symbols for my projects or I need only debug symbols for dll which contains CaptureServices.GenericInfrastructure.ExportLogic?
Or maybe I need to load only part of my debug symbols to handle this thread?
Try !sosex.mk. It gives a user-friendly stack trace with interleaved managed and native frames. I do not believe that this is a symbol issue. Also, when you have a managed address, you can pass it to !sosex.mln to see what's located there, but I think you're already aware of this command.
The k command as in ~62k is a command for the native call stack. It does dot show any .NET stuff (except the native methods in clr.dll).
To see the .NET stack, you need to load the .NET extension for WinDbg:
.loadby sos clr
And then use a command of that extension to see the .NET call stack. Switch to thread 62 first
~62s
!clrstack
!dumpstack
IMHO those commands will load symbols from PDBs when needed. If you get symbol warnings, see How to fix symbols in WinDbg
You need the debug symbols of whatever library that function belongs to.
I'm investigating a handle leak in a WCF service, running on .NET 4.6.2. The service runs fine but over time handle count keeps increasing, with thousands of Token type handles sitting around in the process. It seems memory is also leaking very slowly (likely related to the handle leak).
Edit: it looks like event and thread handles are also leaked.
Process Explorer shows that the suspect handles all have the same name:
DOMAIN\some.username$:183db90
and all share the same address.
I attached WinDbg to the process and ran !htrace -enable and then !htrace -diff some time later. This gave me a list of almost 2000 newly opened handles and native stack traces, like this one:
Handle = 0x000000000000b02c - OPEN
Thread ID = 0x000000000000484c, Process ID = 0x0000000000002cdc
0x00007ffc66e80b3a: ntdll!NtCreateEvent+0x000000000000000a
0x00007ffc64272ce8: KERNELBASE!CreateEventW+0x0000000000000084
0x00007ffc5b392e0a: clr!CLREventBase::CreateManualEvent+0x000000000000003a
0x00007ffc5b3935c7: clr!Thread::AllocHandles+0x000000000000007b
0x00007ffc5b3943c7: clr!Thread::CreateNewOSThread+0x000000000000007f
0x00007ffc5b394308: clr!Thread::CreateNewThread+0x0000000000000090
0x00007ffc5b394afb: clr!ThreadpoolMgr::CreateUnimpersonatedThread+0x00000000000000cb
0x00007ffc5b394baf: clr!ThreadpoolMgr::MaybeAddWorkingWorker+0x000000000000010c
0x00007ffc5b1d8c74: clr!ManagedPerAppDomainTPCount::SetAppDomainRequestsActive+0x0000000000000024
0x00007ffc5b1d8d27: clr!ThreadpoolMgr::SetAppDomainRequestsActive+0x000000000000003f
0x00007ffc5b1d8cae: clr!ThreadPoolNative::RequestWorkerThread+0x000000000000002f
0x00007ffc5a019028: mscorlib_ni+0x0000000000549028
0x00007ffc59f5f48f: mscorlib_ni+0x000000000048f48f
0x00007ffc59f5f3b9: mscorlib_ni+0x000000000048f3b9
Another stack trace (a large portion of the ~2000 new handles have this):
Handle = 0x000000000000a0c8 - OPEN
Thread ID = 0x0000000000003614, Process ID = 0x0000000000002cdc
0x00007ffc66e817aa: ntdll!NtOpenProcessToken+0x000000000000000a
0x00007ffc64272eba: KERNELBASE!OpenProcessToken+0x000000000000000a
0x00007ffc5a01aa9b: mscorlib_ni+0x000000000054aa9b
0x00007ffc5a002ebd: mscorlib_ni+0x0000000000532ebd
0x00007ffc5a002e68: mscorlib_ni+0x0000000000532e68
0x00007ffc5a002d40: mscorlib_ni+0x0000000000532d40
0x00007ffc5a0027c7: mscorlib_ni+0x00000000005327c7
0x00007ffbfbfb3d6a: +0x00007ffbfbfb3d6a
When I run the !handle 0 0 command in WinDbg, I get the following result:
21046 Handles
Type Count
None 4
Event 2635 **
Section 360
File 408
Directory 4
Mutant 9
Semaphore 121
Key 77
Token 16803 **
Thread 554 **
IoCompletion 8
Timer 3
TpWorkerFactory 2
ALPC Port 7
WaitCompletionPacket 51
The ones marked with ** are increasing over time, although at a different rate.
Edit 3:
I ran !dumpheap to see the number of Thread objects (unrelated classes removed):
!DumpHeap -stat -type System.Threading.Thread
Statistics:
MT Count TotalSize Class Name
00007ffc5a152bb0 745 71520 System.Threading.Thread
Active thread count fluctuates between 56 and 62 in Process Explorer as the process is handling some background tasks periodically.
Some of the stack traces are different but they're all native traces so I don't know what managed code triggered the handle creation. Is it possible to get the managed function call that's running on the newly created thread when Thread::CreateNewThread is called? I don't know what WinDbg command I'd use for this.
Note:
I cannot attach sample code to the question because the WCF service loads hundreds of DLL-s, most of which are built from many source files - I have some very vague suspicions in what major area this may come from but I don't know any details to show an MCVE.
Edit: after Harry Johnston's comments below, I noticed that thread handle count is also increasing - I overlooked this earlier because of the high number of token handles.
Here is one simplified example I found that really hard to debug deadlock in awaited tasks in some cases:
class Program
{
static void Main(string[] args)
{
var task = Hang();
task.Wait();
}
static async Task Hang()
{
var tcs = new TaskCompletionSource<object>();
// do some more stuff. e.g. another await Task.FromResult(0);
await tcs.Task;
tcs.SetResult(0);
}
}
This example is easy to understand why it deadlocks, it is awaiting a task which is finished later on. It looks stupid but similar scenario could happen in more complicated production code and deadlocks could be mistakenly introduced due to lack of multithreading experience.
Interesting thing for this example is inside Hang method there is no thread blocking code like Task.Wait() or Task.Result. Then when I attach VS debugger, it just shows the main thread is waiting for the task to finish. However, there is no thread showing where the code has stopped inside Hang method using Parallel Stacks view.
Here are the call stacks on each thread (3 in all) I have in the Parallel Stacks:
Thead 1:
[Managed to Native Transition]
Microsoft.VisualStudio.HostingProcess.HostProc.WaitForThreadExit
Microsoft.VisualStudio.HostingProcess.HostProc.RunParkingWindowThread
System.Threading.ThreadHelper.ThreadStart_Context
System.Threading.ExecutionContext.RunInternal
System.Threading.ExecutionContext.Run
System.Threading.ExecutionContext.Run
System.Threading.ThreadHelper.ThreadStart
Thread 2:
[Managed to Native Transition]
Microsoft.Win32.SystemEvents.WindowThreadProc
System.Threading.ThreadHelper.ThreadStart_Context
System.Threading.ExecutionContext.RunInternal
System.Threading.ExecutionContext.Run
System.Threading.ExecutionContext.Run
System.Threading.ThreadHelper.ThreadStart
Main Thread:
System.Threading.Monitor.Wait
System.Threading.Monitor.Wait
System.Threading.ManualResetEventSlim.Wait
System.Threading.Tasks.Task.SpinThenBlockingWait
System.Threading.Tasks.Task.InternalWait
System.Threading.Tasks.Task.Wait
System.Threading.Tasks.Task.Wait
ConsoleApplication.Program.Main Line 12 //this is our Main function
[Native to Managed Transition]
[Managed to Native Transition]
System.AppDomain.ExecuteAssembly
Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly
System.Threading.ThreadHelper.ThreadStart_Context
System.Threading.ExecutionContext.RunInternal
System.Threading.ExecutionContext.Run
System.Threading.ExecutionContext.Run
System.Threading.ThreadHelper.ThreadStart
Is there anyway to find out where the task has stopped inside Hang method? And the call stack if possible? I believe internally there must be some states about each task and their continuation points so that the scheduler could work. But I don't know how to check that out.
Inside of Visual Studio, I am not aware of a way to debug this sort of situation simply. However, there are two other ways to visualize this for full framework applications, plus a bonus preview of a way to do this in .NET Core 3.
tldr version: Yep, its hard, and Yep, the information you want is there, just difficult to find. Once you find the heap objects as per the below methods, you can use the address of them in the VS watch window to use the visualizers to take a deeper dive.
WinDbg
WinDbg has a primitive but helpful extension that provides a !dumpasync command.
If you download the extension from the vs-threading release branch and copy the x64 and x86 AsyncDebugTools.dll to C:\Program Files (x86)\Windows Kits\10\Debuggers\[x86|x64]\winext folders, you can do the following:
.load AsyncDebugTools
!dumpasync
The output (taken from the link above) looks like:
07494c7c <0> Microsoft.Cascade.Rpc.RpcSession+<SendRequestAsync>d__49
.07491d10 <1> Microsoft.Cascade.Agent.WorkspaceService+<JoinRemoteWorkspaceAsync>d__28
..073c8be4 <5> Microsoft.Cascade.Agent.WorkspaceService+<JoinWorkspaceAsync>d__22
...073b7e94 <0> Microsoft.Cascade.Rpc.RpcDispatcher`1+<>c__DisplayClass23_2+<<BuildMethodMap>b__2>d[[Microsoft.Cascade.Contracts.IWorkspaceService, Microsoft.Cascade.Common]]
....073b60e0 <0> Microsoft.Cascade.Rpc.RpcServiceUtil+<RequestAsync>d__3
.....073b366c <0> Microsoft.Cascade.Rpc.RpcSession+<ReceiveRequestAsync>d__42
......073b815c <0> Microsoft.Cascade.Rpc.RpcSession+<>c__DisplayClass40_1+<<Receive>b__0>d
On your sample above the output is less interesting:
033a23c8 <0> StackOverflow41476418.Program+<Hang>d__1
The description of the output is:
The output above is a set of stacks – not exactly callstacks, but actually "continuation stacks". A continuation stack is synthesized based on what code has 'awaited' the call to an async method. It's possible that the Task returned by an async method was awaited from multiple places (e.g. the Task was stored in a field, then awaited by multiple interested parties). When there are multiple awaiters, the stack can branch and show multiple descendents of a given frame. The stacks above are therefore actually "trees", and the leading dots at each frame helps recognize when trees have multiple branches.
If an async method is invoked but not awaited on, the caller won't appear in the continuation stack.
Once you see the nested hierarchy for more complex situations, you can at least deep-dive into the state objects and find their continuations and roots.
LinqPad and ClrMd
Another useful too is LinqPad coupled with ClrMd and ClrMD.Extensions. The latter package is used to bridge ClrMd into LINQPad - there is a getting started guide. Once you have the packages/namespaces set, this query is what you want:
var session = ClrMD.Extensions.ClrMDSession.LoadCrashDump(#"dmpfile.dmp");
var stateMachineTypes = (
from type in session.Heap.EnumerateTypes()
where type.Interfaces.Any(item => item.Name == "System.Runtime.CompilerServices.IAsyncStateMachine")
select type);
session.Heap.EnumerateDynamicObjects(stateMachineTypes).Dump(2);
Below is a sample of the output running on your sample code:
DotNet Core 3
For .NET Core 3.x they added !dumpasync into the WinDbg sos extension. It is MUCH better than the extension described above, as it gives much more context. You can see it is part of a much larger user story to improve debugging of async code. Here is the output from that under .NET Core 3.0 preview 6 with a preview 7 version of SOS with extended options. Note that line numbers are present, something you don't get with the above options.:
0:000> !dumpasync -stacks -roots
Statistics:
MT Count TotalSize Class Name
00007ffb564e9be0 1 96 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]
Total 1 objects
In 1 chains.
Address MT Size State Description
00000209915d21a8 00007ffb564e9be0 96 0 StackOverflow41476418_Core.Program+<Hang>d__1
Async "stack":
.00000209915d2738 System.Threading.Tasks.Task+SetOnInvokeMres
GC roots:
Thread bc20:
000000e08057e8c0 00007ffbb580a292 System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs # 2939]
rbp+10: 000000e08057e930
-> 00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]
000000e08057e930 00007ffbb580a093 System.Threading.Tasks.Task.InternalWaitCore(Int32, System.Threading.CancellationToken) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs # 2878]
rsi:
-> 00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]
000000e08057e9b0 00007ffbb5809f0a System.Threading.Tasks.Task.Wait(Int32, System.Threading.CancellationToken) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs # 2789]
rsi:
-> 00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]
Windows symbol path parsing FAILED
000000e08057ea10 00007ffb56421f17 StackOverflow41476418_Core.Program.Main(System.String[]) [C:\StackOverflow41476418_Core\Program.cs # 12]
rbp+28: 000000e08057ea38
-> 00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]
000000e08057ea10 00007ffb56421f17 StackOverflow41476418_Core.Program.Main(System.String[]) [C:\StackOverflow41476418_Core\Program.cs # 12]
rbp+30: 000000e08057ea40
-> 00000209915d21a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[StackOverflow41476418_Core.Program+<Hang>d__1, StackOverflow41476418_Core]]
We have a rich client application developed using WPF/C#.Net 4.0 which interops with in-house COM DLLs. Regular events are raised via this COM interface containing video data.
As part of the application we render video via Windows Media Foundation and have created interops to use Window Media Foundation. We have multiple WMF pipelines rendering different video at the same time.
The application runs for 6-8 hours rendering video. Private bytes remaining consistently steady during this time (say around 500-600MB).
At some point the application appears to hang, at this point private bytes increases very rapidly until the process consumes approximately 1.4GB of memory and crashes with an OutOfMemoryException.
We have reproduced this on 5 different workstations with different graphic cards (NVIDIA and ATI cards) and a mixture of Windows 7 32 and 64bit.
We have analyzed 3 dump files and found that the finalizer thread is waiting on a call to the ole32.GetToSTA() method. We are unable to determine what causes the finalizer thread to block and how to resolve this. I have pasted excerpts from three dumps we've been analyzing:
Dump 1)
Thread 2:ae0 is waiting on an STA thread efc
Thread 28:efc is calling a WaitForSingleObject. The handle it is waiting on is actually a thread handle 5ab4 which is thread id 14a4
Thread 130:14a4 has the following stack:
37f4fdf4 753776a6 ntdll!NtRemoveIoCompletion+0x15
37f4fe20 63301743 KERNELBASE!GetQueuedCompletionStatus+0x29
37f4fe74 6330d0db WMNetMgr!CNSIoCompletionPortNT::WaitAndServeCompletionsLoop+0x5e
37f4fe94 633199bf WMNetMgr!CNSIoCompletionPortNT::WaitAndServeCompletions+0x4c
37f4fecc 63312dbd WMNetMgr!CWorkThreadManager::CWorkerThread::ThreadMain+0xa2
37f4fed8 769b3677 WMNetMgr!CWMThread::ThreadFunc+0x3b
37f4fee4 77679f42 kernel32!BaseThreadInitThunk+0xe
37f4ff24 77679f15 ntdll!__RtlUserThreadStart+0x70
37f4ff3c 00000000 ntdll!_RtlUserThreadStart+0x1b
Dump2)
STA thread:
1127f474 75f80a91 ntdll!ZwWaitForSingleObject+0x15
1127f4e0 77411184 KERNELBASE!WaitForSingleObjectEx+0x98
1127f4f8 77411138 kernel32!WaitForSingleObjectExImplementation+0x75
1127f50c 63ae5f29 kernel32!WaitForSingleObject+0x12
1127f530 63a8eb2e WMNetMgr!CWMThread::Wait+0x78
1127f54c 63a8f128 WMNetMgr!CWorkThreadManager::CThreadPool::Shutdown+0x70
1127f568 63a76e10 WMNetMgr!CWorkThreadManager::Shutdown+0x34
1127f59c 63a76f2d WMNetMgr!CNSClientNetManagerHelper::Shutdown+0xdd
1127f5a4 63cd228e WMNetMgr!CNSClientNetManager::Shutdown+0x66
WARNING: Stack unwind information not available. Following frames may be wrong.
1127f5bc 63cd23a6 WMVCORE!WMCreateProfileManager+0xeef6
1127f5dc 63c573ca WMVCORE!WMCreateProfileManager+0xf00e
1127f5e8 63c62f18 WMVCORE!WMIsAvailableOffline+0x2ba3b
1127f618 63c19da6 WMVCORE!WMIsAvailableOffline+0x37589
1127f630 63c1aca2 WMVCORE!WMIsContentProtected+0x56e4
1127f63c 63c14bd7 WMVCORE!WMIsContentProtected+0x65e0
1127f650 113de6e8 WMVCORE!WMIsContentProtected+0x515
1127f660 113de513 wmp!CWMDRMReaderStub::CExternalStub::ShutdownInternalRefs+0x1d0
1127f674 113c1988 wmp!CWMDRMReaderStub::ExternalRelease+0x4f
1127f67c 1160a5b9 wmp!CWMDRMReaderStub::CExternalStub::Release+0x13
1127f6a4 1161745f wmp!CWMGraph::CleanupUpStream_selfprotected+0xbe
Finalizer thread is trying to switch to STA:
0126eccc 75f80a91 ntdll!ZwWaitForSingleObject+0x15
0126ed38 77411184 KERNELBASE!WaitForSingleObjectEx+0x98
0126ed50 77411138 kernel32!WaitForSingleObjectExImplementation+0x75
0126ed64 75d78907 kernel32!WaitForSingleObject+0x12
0126ed88 75e9a819 ole32!GetToSTA+0xad
Dump3)
The finalizer thread is in the GetToSTA call, so it is waiting for a COM object to free
Thread 29 is a COM object in the STA, and it is waiting on a critical section owned by thread 53 (1bf4)
Thread 53 is doing:
1cbcf990 76310a91 ntdll!ZwWaitForSingleObject+0x15
1cbcf9fc 74cb1184 KERNELBASE!WaitForSingleObjectEx+0x98
1cbcfa14 74cb1138 kernel32!WaitForSingleObjectExImplementation+0x75
1cbcfa28 65dfb6bb kernel32!WaitForSingleObject+0x12
WARNING: Stack unwind information not available. Following frames may be wrong.
1cbcfa48 74cb3677 wmp!Ordinal3000+0x53280
1cbcfa54 77029f42 kernel32!BaseThreadInitThunk+0xe
1cbcfa94 77029f15 ntdll!__RtlUserThreadStart+0x701cbcfaac 00000000 ntdll!_RtlUserThreadStart+0x1b
Any ideas on how we might resolve this issue?
Well, the finalizer thread is deadlocked. That will certainly result in an eventual OOM. We can't see the full stack trace for the finalizer thread but some odds that you'll see SwitchAptAndDispatchCall() and ReleaseRCWListInCorrectCtx() in the trace, indicating that it is trying to call IUnknown::Release() to release a COM object. And that object is apartment threaded so a thread switch is required to safely make the call.
I don't see any decent candidates in the stack traces you posted, possibly because you didn't get the right one or the thread is already busy shutting down due to the exception. Try to catch it earlier with a debugger break as soon as you see the virtual memory size climb.
The most common cause for a deadlock like this is violating the requirements for an STA thread. Which state that it must never block and must pump a message loop. The never-block requirement is typically easily met in a .NET program, the CLR will pump a message loop when necessary when you use the lock statement or a WaitHandle.WaitXxx() call. It is however very common to forget to pump a message loop, especially since doing so is kinda painful. Application.Run() is required.
I have a multi-threaded .NET application that hangs on an OnUserPreferenceChanged event. This is typically caused by a UI control or message loop started on a background thread (see e.g. http://www.ikriv.com/en/prog/info/dotnet/MysteriousHang.html), but as far as I can tell that isn't the case here. I verified this by setting a breakpoint in the WindowsFormsSynchronizationContext (as suggested here http://www.aaronlerch.com/blog/2008/12/15/debugging-ui/) and it is only constructed once, in the main UI thread.
Here's the output from !clrstack in windbg:
0013eea8 7c90e514 [HelperMethodFrame_1OBJ: 0013eea8]
System.Threading.WaitHandle.WaitOneNative(Microsoft.Win32.SafeHandles.SafeWaitHandle,
UInt32, Boolean, Boolean) 0013ef54 792b68af
System.Threading.WaitHandle.WaitOne(Int64, Boolean) 0013ef70 792b6865
System.Threading.WaitHandle.WaitOne(Int32, Boolean) 0013ef84 7b6f1a4f
System.Windows.Forms.Control.WaitForWaitHandle(System.Threading.WaitHandle)
0013ef98 7ba2d68b
System.Windows.Forms.Control.MarshaledInvoke(System.Windows.Forms.Control,
System.Delegate, System.Object[], Boolean) 0013f038 7b6f33ac
System.Windows.Forms.Control.Invoke(System.Delegate, System.Object[])
0013f06c 7b920bd7
System.Windows.Forms.WindowsFormsSynchronizationContext.Send(System.Threading.SendOrPostCallback,
System.Object) 0013f084 7a92ed62
Microsoft.Win32.SystemEvents+SystemEventInvokeInfo.Invoke(Boolean,
System.Object[]) 0013f0b8 7a92dc8f
Microsoft.Win32.SystemEvents.RaiseEvent(Boolean, System.Object,
System.Object[]) 0013f104 7a92e227
Microsoft.Win32.SystemEvents.OnUserPreferenceChanged(Int32, IntPtr,
IntPtr) 0013f124 7aaa06ec
Microsoft.Win32.SystemEvents.WindowProc(IntPtr, Int32, IntPtr, IntPtr)
The last method I can get param info on is:
0013f084 7a92ed62
Microsoft.Win32.SystemEvents+SystemEventInvokeInfo.Invoke(Boolean,
System.Object[])
PARAMETERS:
this = 0x01404420
checkFinalization = 0x00000001
args = 0x0144a298
Here's my question: How can I get more information here? Ultimately, I'd like to know which objects and/or threads this Invoke is for. Something like "!do 0x01404420" or "!do 0x0144a298" but I don't know where to go from there.
Search for exceptions in the heap by using !dumpheap -type Exception.
Also you can see the value of variables in a class,which will be useful to understand the state of the class. Use !dumpheap -type ClassName. You will get a MT(Method Table) address. From MT address see the Object address. Use !do address to dump the class.
Use !syncblk to see the locked threads
Regarding problems caused by SystemEvents class and in particular OnUserPreferenceChanged event try to use the CheckSystemEventsHandlersForFreeze() function in this answer which can help to find root cause, i.e. which controls was created on wrong thread and thus causing freeze.