I have a complex system with several threads. Sometimes I see the application at 100% CPU and am forced to restart the system. I have no idea which thread or which code caused it.
I need something that will give me the state of each thread in the system (i.e. which line each thread is currently executing) so I can find the code that causes the 100% CPU.
(In Java you have the thread dump, kill -3, which gives you the state of each thread.)
Can you help please?
Tess's blog has some great debugging tutorials, including:
.NET Hang Debugging Walkthrough
People have suggested Process Explorer to me before.
You could use the debugger to break and then find out what all threads are doing. (Add the Debug Location toolbar to Visual Studio)
Another option is to remove the threads one by one until you find the guilty one.
I have found that in cases like this one of the best tools around these days is Microsoft's IntelliTrace. It allows for historical debugging and will give you the state of all threads etc. when you break execution.
Unfortunately it's only available in Visual Studio 2010 Ultimate edition, but if this is a really critical issue and you don't have this edition you could always download a 30-day evaluation.
Use VS debugger to attach to your process, and then press the "break" (pause symbol) to break execution. In this state, you can open the Debug window called "Threads" which should give you the state of each thread, and which line they're currently executing. It also helps at this point to give explicit names to your threads when debugging them.
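A minimal sketch of the thread-naming tip, in case it helps. The helper name CreateNamedThread and the thread name "FileWatcher" are made up for illustration; only Thread.Name and Thread.IsBackground are real framework members. A name set this way shows up in the Name column of the Threads window.

```csharp
using System;
using System.Threading;

public static class NamedThreadDemo
{
    // Creates (but does not start) a background thread with an explicit name,
    // so it is easy to spot in the debugger's Threads window.
    public static Thread CreateNamedThread(string name, ThreadStart work)
    {
        var thread = new Thread(work);
        thread.Name = name;          // set the name before starting the thread
        thread.IsBackground = true;  // do not keep the process alive on exit
        return thread;
    }

    public static void Main()
    {
        // "FileWatcher" is a hypothetical name chosen for illustration.
        Thread worker = CreateNamedThread("FileWatcher", () =>
            Console.WriteLine("Running on: " + Thread.CurrentThread.Name));
        worker.Start();
        worker.Join();
    }
}
```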
I find that 99% of the time (at least for me) it's because I accidentally made a loop infinite when I didn't mean to, or because there should be at least a few milliseconds of sleep before the loop continues.
You can use a profiler like VisualVM to better debug the behavior of threads in your application.
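Here is a rough sketch of what a loop with a short pause could look like, as opposed to one that spins flat out. PollUntil and its parameters are names I made up; the pause lengths are arbitrary.

```csharp
using System;
using System.Threading;

public static class PollingLoopDemo
{
    // Polls until 'isDone' returns true or the token is cancelled.
    // Returns the number of loop iterations performed.
    public static int PollUntil(Func<bool> isDone, TimeSpan pause, CancellationToken token)
    {
        int iterations = 0;
        while (!isDone() && !token.IsCancellationRequested)
        {
            iterations++;
            // Without this wait the loop would spin and keep one core busy.
            // WaitOne returns early if the token is cancelled.
            token.WaitHandle.WaitOne(pause);
        }
        return iterations;
    }

    public static void Main()
    {
        // Cancel after 200 ms so the demo loop is guaranteed to end.
        using (var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(200)))
        {
            int n = PollUntil(() => false, TimeSpan.FromMilliseconds(20), cts.Token);
            Console.WriteLine("Loop ran " + n + " times before cancellation.");
        }
    }
}
```

Waiting on the token's handle instead of calling Thread.Sleep means the loop also stops promptly when it is asked to shut down.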
Related
I am currently health checking an application which experiences UI stuttering during heavy usage.
The Microsoft Concurrency Visualizer extension for Visual Studio 2015 showed that quite a lot of short-lived threads are created and stopped after ~100ms of execution.
Unfortunately, their displayed callstack is like clr.dll!0x98071 ntdll.dll!0x634fb and I am not quite sure how to extract useful information out of it.
I have no clue what the purpose of those threads is or which part of the application's code is creating them.
How can I better identify where each one of them gets started?
In the code, I was able to grep a handful of Tasks, another of QueueUserWorkItems, several dozens of plain Thread instantiations, some System.Threading.Timer & System.Timers.Timer, no Reactive Extensions. I put breakpoints for all of them but it seems I am missing some...
I don't think those are from the thread pool, because they would be displayed in synchronisation state in Concurrency Visualizer; instead they just end, and another one with another Id gets created later. But maybe I am mistaken.
We also use a few third-party libs and a bunch of JuggerNET generated code, so maybe the origin is not even in the application itself.
I was finally able to find the culprit of those short-lived threads by looking closely at some of the cryptic callstacks, which included for instance:
mmdevapi.dll
wdmaud.drv
avrt.dll
audioses.dll
It led me to thinking that I should double check the sound alert system. It was indeed this one which spawned those threads.
Note:
I will not accept my answer however because I would like someone to share a better process or any kind of tips and tricks for diagnosing unwanted threads origin.
I have a seriously nasty bug which I'm not having much luck tracking down. It only manifests itself as the program freezing and Windows saying that the program is not responding when I run it without a debugger attached.
When I attach a debugger to the process from within visual studio and step through the code there is nothing untoward, and resuming execution sets the program running again, just fine, no longer frozen.
What type of bug could this possibly be which is dispelled by the very presence of a debugger?
You should look out for any race conditions in your code. Setting breakpoints and stepping through the code can mask timing issues: an action that would otherwise not have completed in time gets the chance to complete while execution is paused.
It might not actually be locked up - it's probably that the code that is executing is running on the same thread as the UI, so it LOOKS like it's locked up. Obviously if you're stepping through the code, you can see that it's actually doing something, but when some process is going on that freezes the UI it often appears to be locked up.
Look for extensive loops or processes that take time to complete, and try running them on a different thread and see if that takes care of it. You can also tell if your app is actually frozen or if it's just running a long process by looking at the CPU usage in the Task Manager.
As a temporary measure, you could update the UI from inside the loops that you suspect may be causing this. For example, in a WinForms application, you can add a label and modify its text to show where you are in the loop.
For example:
    foreach (Employee currentEmployee in myEmployees)
    {
        // Show progress, then let the UI process its pending messages.
        lblStatus.Text = "working on " + currentEmployee.FullName;
        Application.DoEvents();
    }
Updating the UI this way will slow down your app because calls to Application.DoEvents are expensive, BUT it will help to assure your users that the program is not locked up (if you keep it in production) or you could choose to leave it out of the final production version and just use it while developing/testing to see how the processing is going and assure yourself that the app is not locked up.
When I run into an issue where the act of running the debugger seems to change the behavior of my application, I'll fall back to the old sneaker-net debugging method of outputting comments to a file. This will give you great insight into the execution paths of your program and help you to identify where it may be getting stuck.
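A bare-bones sketch of that kind of file tracing, for what it's worth. TraceLog, the file name trace-demo.log and the "ProcessOrders" messages are all invented for the example. Each line carries the managed thread id and name so you can tell which thread got stuck where. Keep in mind that the lock and the file I/O change timing a little too, though usually far less than a debugger does.

```csharp
using System;
using System.IO;
using System.Threading;

public static class TraceLog
{
    private static readonly object Gate = new object();

    // Builds one log line: timestamp, thread id, thread name, message.
    public static string Format(string message)
    {
        Thread t = Thread.CurrentThread;
        return string.Format("{0:HH:mm:ss.fff} [{1}:{2}] {3}",
            DateTime.Now, t.ManagedThreadId, t.Name ?? "unnamed", message);
    }

    // Appends the line to 'path'; the lock keeps lines from interleaving.
    public static void Write(string path, string message)
    {
        lock (Gate)
        {
            File.AppendAllText(path, Format(message) + Environment.NewLine);
        }
    }

    public static void Main()
    {
        // Hypothetical log location, chosen only for this demo.
        string path = Path.Combine(Path.GetTempPath(), "trace-demo.log");
        Write(path, "entering ProcessOrders");
        Write(path, "leaving ProcessOrders");
        Console.WriteLine(File.ReadAllText(path));
    }
}
```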
Sounds like a race condition or deadlock.
The most common effect of attaching a debugger is a change in timing, which in turn affects race conditions that exist in the code. This is the first thing I think of when I have a scenario where attaching a debugger changes whether or not the bug reproduces.
Here are a couple of things you can try to work around this problem.
Try launching the process under the debugger vs. attaching
Use WinDbg over Visual Studio. It's a much lighter weight debugger and IME tends to affect the target process less.
One more change in behavior without a debugger attached is that optimizations are enabled during JIT compilation. As a result, the lifetime of variables can be different (shorter), and some objects can be garbage collected earlier (as soon as they are no longer reachable, which can be before the end of the method). When you attach a debugger before JIT compilation happens, these optimizations are normally disabled. (see http://naveensrinivasan.com/2010/05/04/net-%e2%80%93-how-can-debugtrue-extend-the-life-time-of-local-variable/)
Advice: attach the debugger AFTER the bug happens and investigate. Collect a memory dump and investigate with WinDbg if needed.
When I am working with multiple threads, how can I debug to find out which thread causes an abnormal behavior?
Can I use permonitor for debugging, or are there any other tools or debugging facilities that are available?
Tips for Debugging Threads From MSDN
Neat New Multithreaded Debugging Features in VS 2008 by John Robbins
As an alternative to debugging, you could do thread-related testing. The book The Art of Unit Testing has a section on this in Appendix B. The author mentions three tools (two of which he has a personal interest in):
Typemock Racer
Microsoft CHESS
Osherove.ThreadTester
You can use Visual Studio to set up breakpoints on certain threads. See here and here for how to do it.
It depends on what you mean by "abnormal behavior"...
Most of the time, the Visual Studio debugger should be enough. The Threads and Call Stack windows will give you a lot of information about what is going on.
For the heavy-duty stuff you can use WinDbg+SOS. Read about the !threads, !threadpool and !runaway commands.
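If you just want a quick look at which thread is burning CPU without leaving managed code, something along these lines gives a rough per-thread CPU listing. ThreadCpuReport and Snapshot are names I made up; Process.Threads and ProcessThread.TotalProcessorTime are the real System.Diagnostics members. Note that the ids are OS thread ids, not ManagedThreadId values, and that TotalProcessorTime may not be available on every platform, which is one reason for the try/catch.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class ThreadCpuReport
{
    // Returns (OS thread id, total CPU time) pairs for the current process,
    // busiest thread first.
    public static Tuple<int, TimeSpan>[] Snapshot()
    {
        using (Process process = Process.GetCurrentProcess())
        {
            return process.Threads
                .Cast<ProcessThread>()
                .Select(SafeRead)
                .Where(entry => entry != null)
                .OrderByDescending(entry => entry.Item2)
                .ToArray();
        }
    }

    // A thread can exit between enumeration and the read, hence the try/catch.
    private static Tuple<int, TimeSpan> SafeRead(ProcessThread thread)
    {
        try { return Tuple.Create(thread.Id, thread.TotalProcessorTime); }
        catch (Exception) { return null; }
    }

    public static void Main()
    {
        foreach (var entry in Snapshot())
            Console.WriteLine("OS thread {0}: {1} CPU", entry.Item1, entry.Item2);
    }
}
```

Taking two snapshots a few seconds apart and comparing them shows which thread is accumulating CPU time right now.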
If you have several threads of the same type* you could modify your code to only run one of each type of thread (or perhaps put it in the application's configuration file so you can change it quickly while debugging).
If the application still misbehaves then you know that it's an interaction between the different types of thread that's causing the problem. If it doesn't then it could be that there's some resource you've not thread locked properly (for example).
What I'm trying to say is: simplify your application to the point where it uses the minimum number of threads that still matches your original design.
* Not the best word to use, but for example if you spawn 10 threads to deal with file i/o only spawn 1.
How do you define abnormal behavior? Would that be an exception thrown? Not sure if this will help you, but what I often do is name the thread object when I create it; then, if I catch an exception or if certain criteria are met, I write to the event log. I include the time, the application name, the name of the thread and the exception information. I don't just use it for debugging: I also use it when a user complains about odd behavior or reports an error. Then I can go back and get information about it.
I've been experiencing a high degree of flicker and UI lag in a small application I've developed to test a component that I've written for one of our applications. Because the flicker and lag was taking place during idle time (when there should--seriously--be nothing going on), I decided to do some investigating. I noticed a few threads in the Threads window that I wasn't aware of (not entirely unexpected), but what caught my eye was one of the threads was set to Highest priority. This thread exists at the time Main() is called, even before any of my code executes. I've discovered that this thread appears to be present in every .NET application I write, even console applications.
Being the daring soul that I am, I decided to freeze the thread and see what happened. The flickering did indeed stop, but I experienced some oddness when it came to doing database interaction (I'm using SQL CE 3.5 SP1). My thought was that this might be the thread that the database is actually running on, but considering it's started at the time the application loads (before any references to the DB) and is present in other, non-database applications, I'm inclined to believe this isn't the case.
Because this thread (like a few others) shows up with no data in the Location column and no Call Stack listed if I switch to it in the debugger while paused, I tried matching the StartAddress property through GetCurrentProcess().Threads for the corresponding thread, but it falls outside all of the currently loaded modules address ranges.
Does anyone have any idea what this thread is, or how I might find out?
Edit
After doing some digging, it looks like the StartAddress is in kernel32.dll (based upon nearby memory contents). This leads me to think that this is just the standard system function used to start the thread, according to this page, which basically puts me back at square one as far as determining where this thread actually comes from. This is further confirmed by the fact that ALL of the threads in this list have the same value for StartAddress, leading me to ask exactly what the purpose is...?
Edit 2
Process Explorer led me to an actually meaningful start address. It looks like it's mscorwks.dll!CreateApplicationContext+0xbbef. This dll is in %WINDOWS%\Microsoft.NET\Framework\v2.0.50, so it looks like it's clearly a runtime assembly. I'm still not sure why
it's Highest priority
it appears to be causing hiccups in my application
You could try using Sysinternals. Process Explorer lets you dig in pretty deep. Right-click on the process to access Properties, then open the "Threads" tab. In there, you can see the thread's stack and module.
EDIT:
After asking around some, it seems that your "Highest" priority thread is the Finalizer thread that runs due to a garbage collection. I still don't have a good reason as to why it would constantly keep running. Maybe you have some funky object lifetime behavior going on in your process?
I'm not sure what this is, but if you turn on unmanaged debugging, and set up Visual Studio with the Windows symbol server, you might get some more clues.
Might be the Garbage Collector thread. I noticed it too when I was once investigating a finalizer-related bug. Perhaps your system memory is low and the GC is trying to collect all the time? This was the case in the previously mentioned bug too. I couldn't reproduce it on my machine, but a co-worker of mine had a machine with less RAM where it would reappear like clockwork.
I don't know if I can reproduce it now that it's happened (I've been using this particular application for a week or two now without issue). Assuming that I'm running my application in the VS debugger, how should I go about debugging a deadlock after it's happened? I thought I might be able to get at call stacks if I paused the program and hence see where the different threads were when it happened, but clicking pause just threw Visual Studio into a deadlock too until I killed my application.
Is there some way other than browsing through my source tree to find potential problems? Is there a way to get at the call stacks once the problem has occurred to see where the problem is? Any other tools/tips/tricks that might help?
What you did was the correct way. If Visual Studio also deadlocks, that happens now and then. It's just bad luck, unless there's some other issue.
You don't have to run the application in the debugger in order to debug it. Run the application normally, and if the deadlock happens, you can attach VS later. Ctrl+Alt+P, select the process, choose debugger type and click attach. Using a different set of debugger types might reduce the risk of VS crashing (especially if you don't debug native code)
A deadlock involves 2 or more threads. You probably know the first one (probably your UI thread) since you noticed the deadlock in your application. Now you only need to find the other one. With knowledge of the architecture, it should be easy to find (e.g. what other threads use the same locks, interact with the UI etc)
If VS doesn't work at all, you can always use WinDbg. Download here: http://www.microsoft.com/whdc/devtools/debugging/default.mspx
I'd try different approaches in the following order:
First, inspect the code to look for thread-safety violations, making sure that your critical regions don't call other functions that will in turn try to lock a critical region.
Use whatever tool you can get your hands on to visualize thread activity. I use an in-house Perl script that parses an OS log we made, graphs all the context switches and shows when a thread gets pre-empted.
If you can't find a good tool, do some logging to see the last threads that were running before the deadlock occurred. This will give you a clue as to where the issue might be caused. It helps if the locking mechanisms have unique names; for example, if an object has its own thread, create a dedicated semaphore or mutex just to manage that thread.
I hope this helps. Good luck!
You can use different programs like Intel(R) Parallel Inspector:
http://software.intel.com/en-us/intel-parallel-inspector/
Such programs can show you places in your code with potential deadlocks. However, you have to pay for it or use it only during the evaluation period. I don't know if there are any free tools like this.
Just like anywhere, there are no "silver bullet" tools to catch all deadlocks. It is all about the sequence in which different threads acquire resources, so your job is to find out where the order was violated. Usually Visual Studio or another debugger will provide stack traces and you will be able to find out where the discrepancy is. DevPartner Studio does provide deadlock analysis, but the last time I checked there were too many false positives. Some static analysis tools will find some potential deadlocks too.
Other than that, it helps to get the architecture straight to enforce resource acquisition order. For example, layering helps to make sure upper-level locks are taken before lower ones, but beware of callbacks.
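To illustrate what enforcing an acquisition order can look like in code, here is a small sketch. The Account type and the rule "lock the smaller Id first" are assumptions made up for the example, not something taken from any particular code base. The point is only that every thread takes the two locks in the same global order, so two opposite transfers cannot each hold one lock while waiting for the other.

```csharp
using System;
using System.Threading;

// Hypothetical Account type used only to illustrate ordered locking.
public sealed class Account
{
    private static int nextId;
    public readonly int Id = Interlocked.Increment(ref nextId);
    public decimal Balance;
    public readonly object Sync = new object();
}

public static class OrderedLocking
{
    // Always locks the account with the smaller Id first, regardless of the
    // direction of the transfer.
    public static void Transfer(Account from, Account to, decimal amount)
    {
        if (ReferenceEquals(from, to)) return;
        Account first = from.Id < to.Id ? from : to;
        Account second = from.Id < to.Id ? to : from;
        lock (first.Sync)
        lock (second.Sync)
        {
            from.Balance -= amount;
            to.Balance += amount;
        }
    }

    public static void Main()
    {
        var a = new Account { Balance = 1000m };
        var b = new Account { Balance = 1000m };
        // Two threads transferring in opposite directions at the same time.
        var t1 = new Thread(() => { for (int i = 0; i < 10000; i++) Transfer(a, b, 1m); });
        var t2 = new Thread(() => { for (int i = 0; i < 10000; i++) Transfer(b, a, 1m); });
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        Console.WriteLine("a={0} b={1}", a.Balance, b.Balance);
    }
}
```

If Transfer locked from.Sync and then to.Sync instead, the two threads in Main could end up each holding one lock and waiting forever for the other. As noted above, callbacks made while holding a lock can still break the ordering.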