I have several long-running threads in an MVC3 application that are meant to run forever.
I'm running into a problem where a ThreadAbortException is being thrown by some other code (not mine), and I need to recover from this gracefully and restart the thread. Right now, our only recourse is to recycle the worker process for the AppDomain, which is far from ideal.
Here are some details about how this code works:
A singleton service class exists for this MVC3 application. It has to be a singleton because it caches data. This service is responsible for making requests to a database. A 3rd-party library is used for the actual database connection code.
In this singleton class we use a collection of classes called "QueryRequestors". These classes identify unique package+stored_procedure names for requests to the database, so that we can queue those calls. That is the purpose of the QueryRequestor class: to make sure calls to the same package+stored_procedure (even though their parameters may differ) are queued and do not happen simultaneously. This eases our database strain considerably and improves performance.
The QueryRequestor class uses an internal BlockingCollection and an internal Task (thread) to monitor its queue (blocking collection). When a request comes into the singleton service, it finds the correct QueryRequestor class via the package+stored_procedure name, and it hands the query over to that class. The query gets put in the queue (blocking collection). The QueryRequestor's Task sees there's a request in the queue and makes a call to the database (now the 3rd party library is involved). When the results come back they are cached in the singleton service. The Task continues processing requests until the blocking collection is empty, and then it waits.
Once a QueryRequestor is created and up and running, we never want it to die. Requests come in to this service 24/7 every few minutes. If the cache in the service has data, we use it. When data is stale, the very next request gets queued (and subsequent simultaneous requests continue to use the cache, because they know someone (another thread) is already making a queued request, and this is efficient).
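The queue-and-consumer mechanism described above can be sketched roughly like this (class and method names are illustrative, not the actual service code; `QueryRequest` and `ExecuteAgainstDatabase` are hypothetical):

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class QueryRequestor
{
    private readonly BlockingCollection<QueryRequest> _queue =
        new BlockingCollection<QueryRequest>();

    public QueryRequestor()
    {
        // LongRunning hints the scheduler to give this its own thread.
        Task.Factory.StartNew(DoWork, TaskCreationOptions.LongRunning);
    }

    public void Enqueue(QueryRequest request)
    {
        _queue.Add(request);
    }

    private void DoWork()
    {
        // GetConsumingEnumerable blocks while the queue is empty,
        // so the task waits between requests instead of spinning.
        foreach (var request in _queue.GetConsumingEnumerable())
            ExecuteAgainstDatabase(request); // hypothetical 3rd-party call
    }
}
```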
So the issue here is what to do when the Task inside a QueryRequestor class encounters a ThreadAbortException. Ideally I'd like to recover from that and restart the thread. Or, at the very least, dispose of the QueryRequestor (it's in a "broken" state now as far as I'm concerned) and start over. Because the next request that matches the package+stored_procedure name will create a new QueryRequestor if one is not present in the service.
I suspect the thread is being killed by the 3rd party library, but I can't be certain. All I know is that nowhere do I abort or attempt to kill the thread/task. I want it to run forever. But clearly we have to have code in place for this exception. It's very annoying when the service bombs because a thread has been aborted.
What is the best way to handle this? How can we handle this gracefully?
You can stop re-throwing of ThreadAbortException by calling Thread.ResetAbort.
Note that the most common source of this exception is a Redirect call, and canceling the thread abort may cause undesired side effects: request code that would otherwise have been skipped (because the thread was killed) gets executed. This is a more common issue in WinForms (where the separation of code and rendering is less clear) than in MVC (where you can return special redirect results from controllers).
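A minimal sketch of the call site (the work inside the try block is whatever your thread was doing):

```csharp
using System.Threading;

try
{
    // long-running work here
}
catch (ThreadAbortException)
{
    // Without this call, the runtime automatically re-throws the
    // exception at the end of the catch block. ResetAbort cancels
    // the pending abort so the thread can keep running.
    Thread.ResetAbort();
}
```

Note that ResetAbort demands the ControlThread security permission, so it may be unavailable in partially trusted code.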
Here's what I came up with for a solution, and it works quite nicely.
The real issue here isn't preventing the ThreadAbortException, because you can't prevent it anyway, and we don't want to prevent it. It's actually a good thing if we get an error report telling us this happened. We just don't want our app coming down because of it.
So, what we really needed was a graceful way to handle this Exception without bringing down the application.
The solution I came up with was to create a bool flag property on the QueryRequestor class called "IsValid". This property is set to true in the constructor of the class.
In the DoWork() call that is run on the separate thread in the QueryRequestor class, we catch the ThreadAbortException and we set this flag to FALSE. Now we can tell other code that this class is in an Invalid (broken) state and not to use it.
So now, the singleton service that makes use of this QueryRequestor class knows to check for this IsValid property. If it's not valid, it replaces the QueryRequestor with a new one, and life moves on. The application doesn't crash and the broken QueryRequestor is thrown away, replaced with a new version that can do the job.
In testing, this worked quite well. I would intentionally call Thread.Abort() on the DoWork() thread, and watch the Debug window for output lines. The app would report that the thread had been aborted, and then the singleton service was correctly replacing the QueryRequestor. The replacement was then able to successfully handle the request.
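A rough sketch of the flag-based recovery described above (all names are illustrative; `Execute` and `CreateQueryRequestor` are hypothetical):

```csharp
using System.Diagnostics;
using System.Threading;

private void DoWork()
{
    try
    {
        foreach (var request in _queue.GetConsumingEnumerable())
            Execute(request); // hypothetical database call
    }
    catch (ThreadAbortException)
    {
        // The catch block runs before the abort is re-thrown at its end,
        // which is enough time to mark this instance as broken.
        IsValid = false;
        Debug.WriteLine("QueryRequestor thread aborted; marked invalid.");
    }
}

// In the singleton service, before handing a query to a requestor:
if (!requestor.IsValid)
    requestor = CreateQueryRequestor(packageAndProcedureName); // hypothetical factory
```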
Related
Is there a way to fire an Http call to an external web API within my own web API without having to wait for results?
The scenario I have is that I really don't care whether or not the call succeeds and I don't need the results of that query.
I'm currently doing something like this within one of my web API methods:
var client = new HttpClient() { BaseAddress = someOtherApiAddress };
client.PostAsync("DoSomething", null);
I cannot put this piece of code within a using statement because the call doesn't go through in that case. I also don't want to read .Result on the task because I don't want to wait for the query to finish.
I'm trying to understand the implications of doing something like this. I read all over that this is really dangerous, but I'm not sure why. What happens, for example, when my initial request ends? Will IIS dispose of the thread and the client object, and can this cause problems at the other end of the call?
Is there a way to fire an Http call to an external web API within my own web API without having to wait for results?
Yes. It's called fire and forget. However, it seems like you already have discovered it.
I'm trying to understand the implications of doing something like this
One of the links in the answers you cited above states the three risks:
An unhandled exception in a thread not associated with a request will take down the process. This occurs even if you have a handler setup via the Application_Error method.
This means that any exception thrown in your application or in the receiving application won't be caught (There are methods to get past this)
If you run your site in a Web Farm, you could end up with multiple instances of your app that all attempt to run the same task at the same time. A little more challenging to deal with than the first item, but still not too hard. One typical approach is to use a resource common to all the servers, such as the database, as a synchronization mechanism to coordinate tasks.
You could have multiple fire-and-forget calls when you mean to have just one.
The AppDomain your site runs in can go down for a number of reasons and take down your background task with it. This could corrupt data if it happens in the middle of your code execution.
Here is the danger. Should your AppDomain go down, it may corrupt the data that is being sent to the other API causing strange behavior at the other end.
I'm trying to understand the implications of doing something like this. I read all over that this is really dangerous
Dangerous is relative. If you execute something that you don't care at all if it completes or not, then you shouldn't care at all if IIS decides to recycle your app while it's executing either, should you? The thing you'll need to keep in mind is that offloading work without registration might also cause the entire process to terminate.
Will IIS dispose the thread and the client object?
IIS can recycle the AppDomain, causing your thread to abort abnormally. Whether it will do so depends on many factors, such as how recycling is configured in your IIS and whether you're doing any other operations that may trigger a recycle.
In many of his posts, Stephen Cleary tries to convey the point that offloading work without registering it with ASP.NET is dangerous and may cause undesirable side effects, for all the reasons you've read. That's also why there are libraries such as AspNetBackgroundTasks, or Hangfire for that matter.
The thing you should most worry about is that a thread which isn't associated with a request can cause your entire process to terminate:
An unhandled exception in a thread not associated with a request will take down the process. This occurs even if you have a handler setup via the Application_Error method.
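If you're on ASP.NET 4.5.2 or later, `HostingEnvironment.QueueBackgroundWorkItem` is the built-in way to register such work with the runtime; a sketch, assuming a hypothetical target URL:

```csharp
using System;
using System.Net.Http;
using System.Web.Hosting;

HostingEnvironment.QueueBackgroundWorkItem(async cancellationToken =>
{
    // Because the work is registered, ASP.NET delays AppDomain shutdown
    // (by up to roughly 30 seconds) until queued items finish.
    using (var client = new HttpClient { BaseAddress = new Uri("http://other-api/") })
    {
        await client.PostAsync("DoSomething", null, cancellationToken);
    }
});
```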
Yes, there are a few ways to fire-and-forget a "task" or piece of work without needing confirmation. I've used Hangfire and it has worked well for me.
The dangers, from what I understand, are that an exception in a fire-and-forget thread could bring down your entire IIS process.
See this excellent link about it.
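For completeness, a minimal Hangfire sketch (assumes Hangfire is already installed and configured with storage; the method name and URL here are hypothetical):

```csharp
using System;
using System.Net.Http;
using Hangfire;

// Enqueue persists the job, so it survives an AppDomain recycle
// and is retried automatically if it throws.
BackgroundJob.Enqueue(() => NotifyOtherApi("DoSomething"));

// Must be public so Hangfire's worker can invoke it later.
public static void NotifyOtherApi(string action)
{
    using (var client = new HttpClient { BaseAddress = new Uri("http://other-api/") })
    {
        client.PostAsync(action, null).Wait();
    }
}
```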
We're working with a 3rd-party legacy system that requires thread affinity for some of the tear-down logic. We're also hosting a WCF service inside IIS which, under heavy loads will do a rude unloading of our app domain. In these cases it falls to the critical finalizer to do cleanup. Unfortunately, without thread affinity in the finalizer, the 3rd-party system deadlocks.
So roughly:
public class FooEnvironment : CriticalFinalizerObject, IDisposable
{
public FooEnvironment()
{
// start up C API
}
public void Dispose()
{
// shutdown C API (from same thread ctor was called on)
}
~FooEnvironment()
{
// try to shutdown C API but deadlock!
}
}
I've tried various things where we Run with the ExecutionContext captured from the initializing thread, but this doesn't work (at least in IIS): we get an InvalidOperationException stating that the execution context can't be used (ostensibly because it may have been marshalled across AppDomains, which seems likely).
I've read several things basically stating that what I'm trying to do can't be done but I figured I would ask since there isn't a lot of information on this topic.
Back in the old days I developed a library that wrapped the hideous DDEML, which is a Win32 API wrapper around the DDE protocol. The DDEML has thread-affinity requirements as well, so I feel your pain.
The only strategy that is going to work is to create a dedicated thread that executes all of your library calls. This means biting the bullet and marshaling every single request to this API onto that dedicated thread, and then marshaling the result back to the original thread. It sucks and it's slow, but it is the only method guaranteed to work.
It can be done, but it is painful. You can see how I tackled the problem in my NDde library. Basically, the finalizer posts a message via static method calls to a thread that can accept and dispatch them to the appropriate API call. In my case I created a thread that called Application.Run to listen for messages, because DDE required a Windows message loop anyway. In your case you will want to create the thread in a manner that monitors a custom message queue. This is not terribly difficult if you use the BlockingCollection class, because the Take method blocks until an item appears in the queue.
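A sketch of that dedicated-thread marshaling using `BlockingCollection` (the API call itself is hypothetical; the wrapper class is illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ApiThread
{
    private static readonly BlockingCollection<Action> Queue =
        new BlockingCollection<Action>();

    static ApiThread()
    {
        // The one thread that is ever allowed to touch the library.
        var worker = new Thread(() =>
        {
            foreach (var action in Queue.GetConsumingEnumerable())
                action();
        }) { IsBackground = true };
        worker.Start();
    }

    public static T Invoke<T>(Func<T> call)
    {
        var tcs = new TaskCompletionSource<T>();
        Queue.Add(() =>
        {
            try { tcs.SetResult(call()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        // Blocks the caller until the worker thread marshals the result
        // (or the exception) back.
        return tcs.Task.Result;
    }
}

// Usage: var rows = ApiThread.Invoke(() => ThirdPartyApi.Query("..."));
```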
I have a somewhat unusual scenario where I need to be able to outright slaughter "hung", self-hosted WorkflowInstances after a given timeout threshold. I tried the Abort(), Terminate() and Cancel() methods, but these are all too "nice". They all appear to require a response from the WorkflowInstance before they are honored.
In my scenario, a workflow entered an infinite loop and was therefore unresponsive. Calls to the normal methods mentioned above would simply hang since the workflow was completely unresponsive. I was surprised to learn the WorkflowRuntime does not appear have a mechanism for dealing with this scenario, or that Abort() and Terminate() are merely suggestions as opposed to violent directives.
I scoured google/msdn/stackoverflow/etc trying to find out what to do when Terminate() simply isn't going to get the job done and came up dry. I considered creating my own base activity and giving it a timeout value so my "root" activity can kill itself if one of its child activities hangs. This approach seems like I'd be swatting at flies with a sledge hammer...
Is there a technique I overlooked?
The only true solution is to consider this a bug, fix whatever went wrong, and consider the matter closed.
The only way to forcibly abort any code that is locked in an infinite loop is to call Abort() on the thread. Of course, this is considered bad juju, and should only be done when the state of the application can be ensured after the call.
So, you must supply the WorkflowApplication with an implementation of SynchronizationContext, written by you, that can call Abort() on the thread the workflow Post()s to.
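One way to sketch such a context: run every posted callback on a single dedicated thread, and keep a reference to that thread so a watchdog can abort it. All names here are illustrative, and `Thread.Abort` carries all the usual caveats:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public class AbortableSynchronizationContext : SynchronizationContext
{
    private readonly BlockingCollection<Action> _work =
        new BlockingCollection<Action>();

    public Thread WorkerThread { get; private set; }

    public AbortableSynchronizationContext()
    {
        WorkerThread = new Thread(() =>
        {
            foreach (var action in _work.GetConsumingEnumerable())
                action();
        }) { IsBackground = true };
        WorkerThread.Start();
    }

    public override void Post(SendOrPostCallback d, object state)
    {
        _work.Add(() => d(state));
    }

    // Called by a watchdog when the workflow is deemed hung.
    public void Kill()
    {
        WorkerThread.Abort();
    }
}

// Assign before running: workflowApplication.SynchronizationContext = context;
```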
I am not sure if this will work, but have you tried the WorkflowInstance.TryUnload() function? I remember it firing off a few events inside the workflow (it's been a while since I did this), so you might be able to have an event handler in your workflow that catches this and does a kill switch on itself.
I have a function called ApiCalls() that is wrapped in a lock because the API I'm using is not thread-safe. Occasionally an API call fails to return, and I can't think of a way to handle this kind of situation. I was thinking about putting a timer on the lock, but the lock object doesn't support anything like that.
There's really no good answer for this. A bad, but probably workable, answer is to have a watchdog thread that Aborts the calling thread after a timeout. In other words, after acquiring the lock but before calling the API, you'd order the watchdog to kill you. When you get back from the call (if you get back), you'd call off the watchdog.
Again, this is not a great solution, as Abort is very messy.
I don't think you can reasonably recover from this problem. Suppose you could time out: you would then attempt to call the API again, but the previous call is still active, and you have said that the API is not thread-safe.
You simply cannot defend yourself from fundamentally flawed dependencies of this kind.
The only really safe thing to do is to restart the process. Steven Sudit's suggestion is one way to achieve that.
This can be solved by wrapping the API calls in a separate assembly and loading that assembly into a separate application domain using the AppDomain class.
Use application domains to isolate tasks that might bring down a process. If the state of the AppDomain that's executing a task becomes unstable, the AppDomain can be unloaded without affecting the process. This is important when a process must run for long periods without restarting.
You can then call Thread.Abort on the call in the separate AppDomain and signal the host domain that an abort has happened. The host domain would unload the offending domain, thus unloading the API, and start a new domain with the API reset. You would also want a watchdog on the API domain so the host could take action if the API domain freezes.
Miscellaneous links: C# Nutshell AppDomain Listings, cbrumme's WebLog, Good example of use of AppDomain, Using AppDomain to Load and Unload Dynamic Assemblies
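A rough sketch of that isolation (the wrapper type and its method are hypothetical; the key point is deriving from `MarshalByRefObject` so calls cross the domain boundary by proxy rather than by loading the API into the host domain):

```csharp
using System;

public class ApiWrapper : MarshalByRefObject
{
    public string CallApi(string argument)
    {
        // The 3rd-party API call goes here, isolated in its own AppDomain.
        return argument;
    }
}

public static class IsolatedApi
{
    public static string Call(string argument)
    {
        AppDomain domain = AppDomain.CreateDomain("ApiDomain");
        try
        {
            var wrapper = (ApiWrapper)domain.CreateInstanceAndUnwrap(
                typeof(ApiWrapper).Assembly.FullName,
                typeof(ApiWrapper).FullName);
            return wrapper.CallApi(argument);
        }
        finally
        {
            // Unloading tears down the API's state even if it misbehaved.
            AppDomain.Unload(domain);
        }
    }
}
```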
The only safe-ish solution is probably to start another process to handle the API calls, and then kill the process if they get stuck. Even that doesn't guarantee that the API's handlers won't get into a bogus state that can only be cured via system restart, but using Thread.Abort can mortally wound a process.
If you don't want to use "untrusted" means of killing the process, you could have one thread in the process perform the API calls while another watches for a "please die" message. Watchdogs can be tricky; if a watchdog is set for 15 seconds and an action would take 17 seconds to complete, one might request an action, time out after 15 seconds, retry the action, time out after 15 seconds, etc. indefinitely. It may be good to have the watchdog time adjust after each failure (e.g. try an action, letting it have up to 15 seconds; if that doesn't work, and nobody's complaining, try again and let it go 30 seconds; if that's still no good, give it 60 seconds.)
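The escalating-timeout idea can be sketched like this (`CallApi` is hypothetical, and note that a timed-out attempt is merely abandoned here, not killed):

```csharp
using System;
using System.Threading.Tasks;

TimeSpan allowance = TimeSpan.FromSeconds(15);
for (int attempt = 0; attempt < 3; attempt++)
{
    var task = Task.Run(() => CallApi()); // hypothetical API call
    if (task.Wait(allowance))
        return; // completed within the allowance
    // Double the allowance before retrying: 15s -> 30s -> 60s.
    allowance = TimeSpan.FromTicks(allowance.Ticks * 2);
}
throw new TimeoutException("API call did not complete after escalating retries.");
```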
As part of a large automation process, we are calling a third-party API that does some work calling services on another machine. We discovered recently that every so often when the other machine is unavailable, the API call will spin away sometimes up to 40 minutes while attempting to connect to the remote server.
The API we're using doesn't offer a way to specify a timeout and we don't want our program waiting around for that long, so I thought threads would be a nice way to enforce the timeout. The resulting code looks something like:
Thread _thread = new Thread(_caller.CallServices);
_thread.Start();
_thread.Join(timeout);
if (_thread.IsAlive)
{
_thread.Abort();
throw new Exception("Timed-out attempting to connect.");
}
Basically, I want to let CallServices() run, but if it is still going after timeout has elapsed, assume it is going to fail, kill it and move on.
Since I'm new to threading in C# and on the .net runtime I thought I'd ask two related questions:
Is there a better/more appropriate mechanism in the .net libraries for what I'm trying to do, and have I committed any threading gotchas in that bit of code?
Thread.Abort() is a request for the thread to abort and gives no guarantee that it will do so in a timely manner. It is also considered bad practice (it will throw a ThreadAbortException in the aborted thread), but it seems the 3rd-party API offers you no other choice.
If you know (programmatically) the address of the remote service host you should ping it before you transfer control to the 3rd party API.
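A quick reachability check along those lines (the host name and timeout are placeholders):

```csharp
using System;
using System.Net.NetworkInformation;

using (var ping = new Ping())
{
    // 2000 ms timeout; swap in the actual remote host name.
    PingReply reply = ping.Send("remote-service-host", 2000);
    if (reply.Status != IPStatus.Success)
        throw new InvalidOperationException(
            "Remote host unreachable; skipping the API call.");
}
```

Keep in mind ICMP may be blocked by firewalls even when the service itself is up, so treat a failed ping as a hint rather than proof.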
If you're not using a BackgroundWorker, you could set the thread's IsBackground property to true so it doesn't keep your program from terminating.
Bad idea. Thread.Abort doesn't necessarily clean up the mess left by such an interrupted API call.
If the call is expensive, consider writing a separate .exe that makes the call, and pass the arguments to/from it using the command line or temporary files. You can kill an .exe much more safely than killing a thread.
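A sketch of that separate-process approach (the executable name and timeout are placeholders):

```csharp
using System;
using System.Diagnostics;

var startInfo = new ProcessStartInfo("ApiCaller.exe", "arg1 arg2")
{
    UseShellExecute = false
};

using (Process process = Process.Start(startInfo))
{
    // Killing a whole process is far cleaner than aborting a thread:
    // the OS reclaims its handles, memory, and locks.
    if (!process.WaitForExit(30000)) // 30-second timeout
    {
        process.Kill();
        throw new TimeoutException("API call process timed out and was killed.");
    }
}
```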
You can also just use a delegate... Create a delegate for the method that does the work, then call BeginInvoke on the delegate, passing it the arguments and a callback function to handle the return values (if you want)...
Immediately after the BeginInvoke you can wait a designated time for the async delegate to finish, and if it does not finish in that specified time, move on...
// Illustrative signature: assume _caller.CallService takes a string
// and returns a string.
public delegate string CallerServiceDelegate(string arg);

CallerServiceDelegate callSvcDel = _caller.CallService;
DateTime cutoffDate = DateTime.Now.AddSeconds(timeoutSeconds);
IAsyncResult aR = callSvcDel.BeginInvoke("some input", null, null);
while (!aR.IsCompleted && DateTime.Now < cutoffDate)
    Thread.Sleep(500);
if (aR.IsCompleted)
{
    string returnValue = callSvcDel.EndInvoke(aR); // returns immediately once completed
    // whatever else you need to do to handle success
}
else
{
    // Do NOT call EndInvoke here: on an incomplete call it blocks until
    // the call finishes, which defeats the timeout. Instead, pass a
    // callback to BeginInvoke that calls EndInvoke, so the async
    // resources are eventually released, and just move on here.
    // whatever you need to do to handle timeout
}
NOTE: as written, the callback passed to BeginInvoke is null and the code retrieves the return value from EndInvoke(). If you prefer, you can pass a callback to BeginInvoke and have it call EndInvoke and handle the return value there instead...
It might work, but nobody could say for sure without an understanding of the third-party API. Aborting the thread like that could leave the component in an invalid state that it might not be able to recover from, or it might not free resources that it allocated (think: what if one of your routines just stopped executing half-way through; could you make any guarantees about the state your program would be in?).
As Cicil suggested, it might be a good idea to ping the server first.
Does your application run for long periods of time, or is it more of a run-as-needed application? If it's the latter, I personally would consider using the Thread.Abort() option. While it may not be the most desirable from a purist's perspective (resource management, etc.), it is certainly straightforward to implement and may fit the bill given the way your particular application works.
The idea of a separate executable makes sense. Perhaps another option would be to use AppDomains. I'm not an expert in this area (I welcome refinements/corrections to this), but as I understand it, you'd put the API call in a separate DLL and load it into a separate AppDomain. When the API call is finished or you have to abort it, you can unload the AppDomain along with the DLL. This may have the added benefit of cleaning up resources that a straightforward Thread.Abort() will not.