I'll start by saying I do not have a lot of experience in troubleshooting multi-threading problems. So a lot of what I've read about debugging race conditions, dead locks, live locks, etc. are strictly theoretical to me.
I have this .NET application that is making use of a dynamically loaded native win32 dll. If the dll is never loaded the application terminates without a problem. However, if the dll is loaded then when the user exits the application the UI disappears but the process never terminates.
I've turned on native code debugging in the project settings so I can see all the threads that are running. When the user closes the application window the main thread appears to terminate. I know this because if I perform a Break All in the Threads windows in Visual Studio the main thread is re-categorized as a worker thread and there is no call stack available for it. There are 20 other threads still active, all with call stacks. I've looked through the call stacks for all of these threads and nothing sticks out at me (there's no mention of the dll in any of the call stacks).
What steps can I take to narrow down the cause of this problem? Are there any additional tools I should be using to help pin point the problem?
This means that some of your Foreground Threads are still alive. Unlike Background Threads, Foreground Threads keeps the process alive.
You should either use Background Threads or Stop your Foregrounds Threads to be able to exit the process gracefully.
Windows application will automatically exit when all of its thread(s) are stopped.
As you said
If the dll is never loaded the application terminates without a
problem
I'm assuming all the running threads are unmanaged threads(not created by clr). There is no concept of background threads in native code. All threads must be terminated to terminate the process.
You must find a way to signal all threads to exit in that library. See if you have any api exposed for that.
If you don't find anything, you've got a "Sledge Hammer" approach. i.e Terminate the process with Environment.Exit which works almost always. Be sure you use this approach as a last resort as this can corrupt the process state.
I have a need to create multiple processing threads in a new application. Each thread has the possibility of being "long running". Can someone comment on the viability of the built in .net threadpool or some existing custom threadpool for use in my application?
Requirements :
Works well within a windows service. (queued work can be removed from the queue, currently running threads can be told to halt)
Ability to spin up multiple threads.
Work needs to be started in sequential order, but multiple threads can be processing in parallel.
Hung threads can be detected and killed.
EDIT:
Comments seem to be leading towards manual threading. Unfortunately I am held to 3.5 version of the framework. Threadpool was appealing because it would allow me to queue work up and threads created for me when resources were available. Is there a good 3.5 compatable pattern (producer/consumer perhaps) that would give me this aspect of threadpool without actually using the threadpool?
Your requirements essentially rule out the use of the .NET ThreadPool;
It generally should not be used for long-running threads, due to the danger of exhausting the pool.
It does work well in Windows services, though, and you can spin up multiple threads - limited automatically by the pool's limits.
You can not guarantee thread starting times with the thread pool; it may queue threads for execution when it has enough free ones, and it does not even guarantee they will be started in the sequence you submit them.
There are no easy ways to detect and kill running threads in the ThreadPool
So essentially, you will want to look outside the ThreadPool; I might recommend that perhaps you might need 'full' System.Threading.Thread instances just due to all of your requirements. As long as you handle concurrency issues (as you must with any threading mechanism), I don't find the Thread class to be all that difficult to manage myself, really.
Simple answer, but the Task class (Fx4) meets most of your requirements.
Cancellation is cooperative, ie your Task code has to check for it.
But detecting hung threads is difficult, that is a very high requirement anyway.
But I can also read your requirements as for a JobQueue, where the 'work' consists of mostly similar jobs. You could roll your own system that Consumes that queue and monitors execution on a few Threads.
I've done essentially the same thing with .Net 3.5 by creating my own thread manager:
Instantiate worker classes that know how long they've been running.
Create threads that run a worker method and add them to a Queue<Thread>.
A supervisor thread reads threads from the Queue and adds them to a Dictionary<int, Worker> as it launches them until it hits its maximum running threads. Add the thread as a property of the Worker instance.
As each worker finishes it invokes a callback method from the supervisor that passes back its ManagedThreadId.
The supervisor removes the thread from the Dictionary and launches another waiting thread.
Poll the Dictionary of running workers to see if any have timed out, or put timers in the workers that invoke a callback if they take too long.
Signal a long-running worker to quit, or abort its thread.
The supervisor invokes callbacks to your main thread to inform of progress, etc.
We have a WinForms desktop application, which is heavily multithreaded. 3 threads run with Application.Run and a bunch of other background worker threads. Getting all the threads to shut down properly was kind of tricky, but I thought I finally got it right.
But when we actually deployed the application, users started experiencing the application not exiting. There's a System.Threading.Mutex to prevent them from running the app multiple times, so they have to go into task manager and kill the old one before they can run it again.
Every thread gets a Thread.Join before the main thread exits, and I added logging to each thread I spawn. According to the log, every single thread that starts also exits, and the main thread also exits. Even stranger, running SysInternals ProcessExplorer show all the threads disappear when the application exits. As in, there are 0 threads (managed or unmanaged), but the process is still running.
I can't reproduce this on any developers computers or our test environment, and so far I've only seen it happen on Windows XP (not Vista or Windows 7 or any Windows Server). How can a process keep running with 0 threads?
Edit:
Here's a bit more detail. One of event loops is hosting an Win32 interop DLL which uses a COM object to talk to a device driver. I put it in its own thread because the device driver is time sensitive, and whenever the UI thread would block for a significant amount of time (such as waiting for a database call to finish), it would interfere with the device driver.
So I changed the code so the main thread would do a Thread.Join with the device driver thread. That actually caused the application to lock up... it logs a few more calls on the UI thread after the Join completed and then everything stops. If the device is powered off, the driver never starts, and the problem goes away. So it looks like the driver must be responsible for keeping the application alive, even after it's supposedly been shut down.
When you create your threads, set IsBackground=true on them. When your main ui thread/app is closed, all created threads will automatically be shut down.
http://msdn.microsoft.com/en-us/library/system.threading.thread.isbackground.aspx
Is it possible that the children of your Application.Run calls aren't terminating? Also, what's actually causing the application to exit - does it automatically close when all the threads have terminated (automatically meaning you wrote some code to do that), or is it user-imitated?
I had an issue once where there was a race condition in my "thread complete" event code that would sometimes result in what you're seeing. The last two threads would finish at the same time, fire the event simultaneously, and each event would decide it wasn't the last thread, so the application would continue running, even though the thread count was zero. To resolve this, I was able to find and eliminate the race condition, but you could also use a timer that checks every second or two, gets a count of the threads, and if none are still open, it kills the application.
We never did figure out the root programmatic cause, but it was a specific driver version that caused the issue, and upgrading to a new driver fixed the problem.
Unfortunately, this is all the answer I can give, if someone else someday runs into a similar problem...
I got a thread that is just banishing.. i'd like to know who is killing my thread and why.
It occurs to me my thread is being killed by the OS, but i'd like to confirm this and if possible to know why it's killing it.
As for the thread, i can assert it has at least 40 min of execution before dying, but it suddenly dies around 5 min.
public void RunWorker()
{
Thread worker = new Thread(delegate()
{
try
{
DoSomethingForALongLongTime();
}
catch(Exception e)
{
//Nothing is never logged :(
LogException(e);
throw e;
}
});
worker.IsBackground = true;
worker.SetApartmentState(System.Threading.ApartmentState.STA);
worker.Start();
}
EDIT: Addressing answers
Try/Catch Possible exceptions:
It's implemented and it catches nothing :(
Main Thread dying:
This thread is created by the web server, which continues to run
Work completion:
The work is not completed, as it finally affects the database, i can check whether it's done or not when the thread dies.
Having thought of these things brought me to this question, who is killing my threads??
ps. It's not Lady Goldent in the living room with the candle stick :)
Various people (including myself, here) pointed out that hosting a long-running thread in IIS is a bad idea. Your thread will being running inside an IIS 'worker process'. These processes are periodically terminated (recycled) by IIS, which will cause your thread to die.
I suggest that you try turning-off IIS worker process recycling to see if that makes a difference. You can find more information here.
Your thread probably just threw an exception. Try putting a try/catch block around DoSomethingForALongLongTime and see what it picks up.
Update: I didn't notice before that you were starting this from a web server. That can be a very bad idea. In particular, is the separate thread using any information derived from HttpContext.Current? That would include Request, Response, Session, etc., as well as any information from the page.
This is bad because these things only last as long as the request lasts. Once the request is over, they become invalid, to say the very least.
If you need to kick off a long-running thread from within a web application or web service, then you should create a simple Windows Service and host a WCF service within it. Have the web page then send all the information needed to perform the task to the service. The service can even use MSMQ as a transport, which will ensure that no messages are lost, even if the service gets busy.
A potential way to get more information: attach a debugger and break on thread termination. Depending on how your thread is being terminated, this might not work.
Download Debugging Tools for Windows if you don't already have it
Run windbg.exe, attach to your process
Break into windbg, type sxe et to enable breaking on thread exit
When the debugger breaks, inspect the state of the system, other threads, etc.
To get the managed stack, load sos.dll (.loadby sos mscorsvr, .loadby sos mscorwks, or .loadby sos clr should work), then run !clrstack (see !help for other sos commands)
If you get a lot of noise from other threads exiting, script windbg to continue after breaking if it's not the thread ID you care about.
Edit: If you think the thread is being terminated from within your process, you can also set a breakpoint on TerminateThread (bp kernel32!TerminateThread) and ExitThread (bp kernel32!ExitThread) to catch the stack of the killer.
I don't know the answer, but some thoughts:
Could it be throwing an exception? Have you tried putting a try/catch around the DoSomethingForALongLongTime() call?
Are there any points where it exits normally? Try putting some logging on them.
Do you get the same behaviour in and out of the debugger? Does the output window in the debugger provide any hints?
UPDATE
You said:
This thread is created by the web
server, which continues to run
If the thread is running inside asp.net then it may be that the thread is being killed when the asp.net worker process recycles, which it will do periodically. You could try turning off worker process recycling and see if that makes any difference.
Your edit reveals the answer:
It's the butler web server.
How exactly do you host these threads? A webserver environment isn't exactly designed to host long living processes. In fact, it is probably configured to halt runaway sites, every 40 minutes maybe?
Edit:
For a quick fix, your best chance is to set worker.IsBackground = false; because your current setting of true allows the system to kill the parent-thread w/o waiting for your bgw.
On another note, there is little point in using a BackgroundWorker in an ASP.NET application, it is intended for WinForms and WPF. It would be better to create a separate thread for this, since you are changing some of the Threads properties. That is not advised for a ThreadPool (Bgw) thread.
The process might be terminating. That would be what worker.IsBackground = true; is designed to do, kill your thread when the main thread exits.
A background thread will only run as long there are foreground threads runnnig.
As soon that all foreground threads end, any background thread still running will aborted.
If checking for an exception doesn't show anything useful, get your thread code to write to a log file at key points. You'll then be able to see exactly when it stops working and hopefully why.
A simple answer would be: "The killer doesn't leave a name card" ;)
If your thread is hosted in IIS, probably the thread is killed by the app pool process which recycles. The server might continue running but the process which hosts your item is stopped untill a new request fires everything up again.
If your thread is hosted in an executable, the only way it can be killed is by killing the thread yourself, throwing an exception in the thread or terminating the host process
Hope this helps.
You can try to increase executionTimeout value of configuration\system.web\httpRuntime in web.config (default value is 110 seconds in .NET 4.0 and 90 in corresponds to http://msdn.microsoft.com/en-us/library/e1f13641.aspx). You can try to change it dynamically Server.ScriptTimeout = 300 (see http://www.beansoftware.com/ASP.NET-Tutorials/Long-Operations.aspx). It this parameter will not helps, then I think you have a problem other as thread recycling from IIS. How you can see default value of this parameter is much less as typical live time of your thread. I think, that your problem has another nature, but to be sure...
Why you set apartment state for the thread? Which COM objects you use in the working thread? Do you have an unmanaged code which do the most of work where you can also insert some code? I think you should have more information about SomethingForALongLongTime to be able to solve the problem.
And one more a little suggestion. Could you insert a line of code after calling SomethingForALongLongTime(); to be sure, that SomethingForALongLongTime not end without an exception?
UPDATED: To be absolutely sure that your thread will be not killed by IIS, you can try to create a process which do SomethingForALongLongTime(); instead of using threads.
When you call RunWorker(), you can add a reference to your thread to a list. Once you have detected that your thread has died, you can inspect the state of the thread, perhaps it will reveal how it died. Or, perhaps it hasn't died, its just waiting on some resource (like the connection to the database).
List runningThreads = ...
public void RunWorker() {
Thread worker = new Thread(delegate()
..
runningThreads.add(worker);
worker.Start();
}
public void checkThreads() {
for (Thread t : runningThreads) {
Console.WriteLine("ThreadState: {0}", t.ThreadState);
}
}
It could be throwing one of the various uncatcheable exceptions including Stack Overflow or Out of Memory. These are the hardest exceptions to track down.
What does memory consumption look like while this thread is running? Can you use a memory profiler on it to see if it's out of control? Can you add some logging in inner loops? If you have a recursive method, add a counter and throw an exception if it recurses an impossible number of times. Are you using large objects that could be causing large object heap fragmentation (causes out of memory errors even when you aren't really out).
You should instrument DoSomethingForALongLongTime() with lots of debug logs, so you can find out at what spot does the code stop executing. Or attach a debugger and break on all first chance exceptions.
use AsyncTasks to achieve your long running work in asp.net
Try use app domain UnhandledException event: http://msdn.microsoft.com/en-us/library/system.appdomain.unhandledexception.aspx
it may give you some information if you miss some exceptions
HI,
I have a .NET application which uses threads excessively. At the time of exit the process does not kill itself. Is there is any tool that can show what is causing the problem? although I have checked thoroughly but was unable to find the problem.
Abdul Khaliq
See: Why doesn't my process terminate even after main thread terminated
That means you have some foreground thread running. A .net application will not terminate unless all the foreground threads complete execution.
You can mark threads as Background Threads and then check (Thread.IsBackground property). Please note that all background threads terminate immediately when application exits. If you are doing some important work in those threads like serializing data to database then you should keep them as foreground threads only. Background threads are good for non critical things like spell cheker etc.
Normally, attaching with a debugger should tell you what threads are still active and what code is running on them.