Are there any non-obvious dangers in using threads in ASP.NET? - c#

This is something of a sibling question to this programmers question.
Briefly, we're looking at pushing some work that's been piggy-backing on user requests into the background "properly." The linked question has given me plenty of ideas should we go the service route, but hasn't really provided any convincing arguments as to why, exactly, we should.
I will admit that, to me, the ability to do the moral equivalent of
WorkQueue.Push(delegate(object context) { ... });
is really compelling, so if it's just a little difficult (rather than inherently unworkable) I'm inclined to go with the background thread approach.
So, the problems with background threads I'm aware of (in the context of an AppPool):
They can die at any time due to the AppPool being recycled
Solution: track when a task is being executed, so it can be re-run* should a new thread be needed
The ThreadPool is used to respond to incoming HTTP queries, so using it can starve IIS
Solution: build our own thread pool, capping the number of threads as well (see the sketch at the end of this question).
My question is, what am I missing, if anything? What else can go wrongǂ with background threads in ASP.NET?
* The tasks in question are already safe to re-run, so this isn't a problem.
ǂ Assume we're not doing anything really dumb, like throwing exceptions in background threads.
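To make the "own thread pool" idea concrete, here is a minimal sketch of a capped background work queue. It is only an illustration under the assumptions in the question (tasks are safe to re-run, exceptions are caught); the class name and shape are hypothetical, not an existing API.

using System;
using System.Collections.Generic;
using System.Threading;

public class BackgroundWorkQueue
{
    private readonly Queue<Action> _work = new Queue<Action>();
    private readonly object _sync = new object();
    private readonly int _maxThreads;
    private int _runningThreads;

    public BackgroundWorkQueue(int maxThreads)
    {
        _maxThreads = maxThreads;
    }

    public void Push(Action task)
    {
        lock (_sync)
        {
            _work.Enqueue(task);
            // Only spin up another worker if we are below the cap, so IIS's
            // own ThreadPool is left alone.
            if (_runningThreads < _maxThreads)
            {
                _runningThreads++;
                new Thread(ProcessQueue) { IsBackground = true }.Start();
            }
        }
    }

    private void ProcessQueue()
    {
        while (true)
        {
            Action task;
            lock (_sync)
            {
                if (_work.Count == 0)
                {
                    _runningThreads--;
                    return; // let the worker die once the queue drains
                }
                task = _work.Dequeue();
            }

            try
            {
                task();
            }
            catch (Exception)
            {
                // Log and swallow: an unhandled exception on a background
                // thread would otherwise take down the whole process.
            }
        }
    }
}

None of this changes the fundamental problem that an app pool recycle can kill the queue at any time, which is why the tasks must stay safe to re-run.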

I would stay away from launching threads from within your IIS AppDomain for Stack Overflow. I don't have any hard evidence to support what I am going to say, but having worked with IIS for 10 years, I know that it works best when it is the only game in town.
There is also an alternative. I know this is going to be sort of a take-off on my answer over on the programmers thread, but as I understand it you already have a solution that works by piggy-backing the work on user requests. Why not use that code, but only launch it when a special internal API is called? Then use Task Scheduler to run a cURL command that calls that API every 30 seconds or so to launch the tasks. This way you are letting IIS handle the threading, and your code is handling something that it already does easily.
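A rough sketch of that arrangement, with all names hypothetical: an internal-only endpoint that runs the existing piggy-backed work, which a scheduled task pokes with cURL.

using System.Web.Mvc;

public static class PendingWork
{
    public static void RunAll()
    {
        // The code that used to piggy-back on user requests goes here.
    }
}

public class MaintenanceController : Controller
{
    // Scheduled externally, e.g. Task Scheduler running:
    //   curl -X POST http://localhost/maintenance/runpendingtasks
    // Restrict access (localhost only, shared secret, etc.) before doing
    // anything like this for real.
    [HttpPost]
    public ActionResult RunPendingTasks()
    {
        PendingWork.RunAll();
        return new HttpStatusCodeResult(200);
    }
}

The scheduled request runs on a normal IIS worker thread, so the usual request timeouts and recycling behaviour still apply; the win is simply that no end user is left waiting on the work.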

One danger I ran into personally is the CallContext. We were using the CallContext to set user identity data, because the same code was shared across our web application and our .NET Remoting based application services (.NET Remoting is designed to use the CallContext for storing call-specific data), so we weren't using the HttpContext.
We noticed that, sometimes, a new request would end up with a non-null identity in the CallContext. In other words, ASP.NET was not nulling out the data stored in the CallContext between requests, and thus an unauthenticated user might get into the application if they picked up a thread which still had a CallContext containing validated user identity info.
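As an illustration of that failure mode, here is a minimal sketch (the slot name and helper class are hypothetical): anything parked in the logical CallContext has to be cleared explicitly, for example in Application_EndRequest, because ASP.NET will not reliably do it for you between requests on a reused thread.

using System.Runtime.Remoting.Messaging;

public static class CurrentIdentity
{
    private const string Slot = "UserIdentity"; // illustrative slot name

    public static void Set(string identity)
    {
        CallContext.SetData(Slot, identity);
    }

    public static string Get()
    {
        return CallContext.GetData(Slot) as string;
    }

    // Call from Application_EndRequest (or a wrapping HttpModule) so the next
    // request that lands on this thread starts with a clean slate.
    public static void Clear()
    {
        CallContext.FreeNamedDataSlot(Slot);
    }
}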

Let me tell you about a non-obvious danger :)
I used threads to collect and update some RSS feeds in my database for a website I was hosting with GoDaddy. The threads worked fine (if they were terminated, they would be restarted automatically thanks to some checks I had built into some web pages).
It was working excellently and I was very happy, until GoDaddy (my host then) first started killing the threads, and then blocked them completely. So my app just died!
If that wasn't non-obvious, what is?

One danger could be that you are overly complicating your architecture without getting any benefit.
Your program will be more expensive to write, more expensive to maintain, and have a greater chance of containing bugs.

Related

Is there any drawback to using ThreadPool.QueueUserWorkItem in a web app?

I did some research on this topic, but I am unable to find the expected answer for this. In my application I have used
ThreadPool.QueueUserWorkItem
in the following way:
ThreadPool.QueueUserWorkItem(o => CaseBll.SendEmailNotificationForCaseUpdate(currentCase, caseUpdate));
My app is in ASP.NET MVC, and I have moved all background tasks that do not need to execute as part of a user operation off the request path, for faster execution and a quicker user response.
Now I want to know: is there any downside to using ThreadPool.QueueUserWorkItem when we have a larger audience for the application?
No, you should never use it. I can write a lot of reasons why, but instead you should read this article from Scott Hanselman, who is a genius IMHO:
http://www.hanselman.com/blog/ChecklistWhatNOTToDoInASPNET.aspx
Under "Reliability and Performance":
Fire-and-Forget Work - Avoid using ThreadPool.QueueUserWorkItem as your app pool could disappear at any time. Move this work outside or use WebBackgrounder if you must.
So, as recommended, don't use ThreadPool.QueueUserWorkItem. An excellent alternative for this is at:
https://www.asp.net/aspnet/overview/web-development-best-practices/what-not-to-do-in-aspnet-and-what-to-do-instead#fire
Edit: As mentioned by @Scott-Chamberlain, here's a better link:
http://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx
It really depends on what you are going to be doing, but generally speaking, your likely concerns will be:
Persistence. Threads in the managed pool are background threads. They will die when the application recycles, which is generally undesirable. In your case you want to send e-mails. Imagine if your process dies for some reason before the thread executes. Your e-mail will never be sent.
Local storage is shared, which means you need to make sure there are no leftovers from the last thread if using it. Applies to fields marked with ThreadStaticAttribute as well.
I would instead recommend that you implement a job scheme where you schedule the job somewhere and have some other component actually read from this list (e.g. database) and perform the job, then mark it as complete. That way it persists across application unload, there is no memory reuse and you can throttle the performance. You could implement the processing component inside your application or even as a Windows service if you prefer.
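A minimal sketch of that job scheme, assuming a hypothetical Jobs table with Id, Payload, Status (0 = pending, 1 = in progress, 2 = done) and CompletedAt columns; the processing component could live in the application or in a Windows service.

using System;
using System.Data.SqlClient;
using System.Threading;

public class JobProcessor
{
    private readonly string _connectionString;

    public JobProcessor(string connectionString)
    {
        _connectionString = connectionString;
    }

    public void Poll()
    {
        while (true)
        {
            using (var connection = new SqlConnection(_connectionString))
            {
                connection.Open();

                // Claim one pending job and mark it in progress in a single statement.
                var claim = new SqlCommand(
                    @"UPDATE TOP (1) Jobs SET Status = 1
                      OUTPUT inserted.Id, inserted.Payload
                      WHERE Status = 0", connection);

                int? jobId = null;
                string payload = null;
                using (var reader = claim.ExecuteReader())
                {
                    if (reader.Read())
                    {
                        jobId = reader.GetInt32(0);
                        payload = reader.GetString(1);
                    }
                }

                if (jobId.HasValue)
                {
                    DoWork(payload); // e.g. send the e-mail

                    var complete = new SqlCommand(
                        "UPDATE Jobs SET Status = 2, CompletedAt = GETUTCDATE() WHERE Id = @id", connection);
                    complete.Parameters.AddWithValue("@id", jobId.Value);
                    complete.ExecuteNonQuery();
                }
            }

            Thread.Sleep(TimeSpan.FromSeconds(5)); // crude throttle between polls
        }
    }

    private void DoWork(string payload)
    {
        // The actual job goes here; it should be safe to re-run, since a crash
        // between claiming and completing would leave the row in progress.
    }
}

Because the job row is persisted before any work happens, an application recycle only delays the e-mail instead of losing it.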

Call from Web API to another Web API without waiting for results

Is there a way to fire an Http call to an external web API within my own web API without having to wait for results?
The scenario I have is that I really don't care whether or not the call succeeds and I don't need the results of that query.
I'm currently doing something like this within one of my web API methods:
var client = new HttpClient() { BaseAddress = someOtherApiAddress };
client.PostAsync("DoSomething", null);
I cannot put this piece of code within a using statement because the call doesn't go through in that case. I also don't want to call .Result on the task because I don't want to wait for the query to finish.
I'm trying to understand the implications of doing something like this. I read all over that this is really dangerous, but I'm not sure why. What happens, for example, when my initial query ends? Will IIS dispose of the thread and the client object, and can this cause problems at the other end of the query?
Is there a way to fire an Http call to an external web API within my own web API without having to wait for results?
Yes. It's called fire and forget. However, it seems like you already have discovered it.
I'm trying to understand the implications of doing something like this
One of the links in the answers you linked above states the three risks:
An unhandled exception in a thread not associated with a request will take down the process. This occurs even if you have a handler setup via the Application_Error method.
This means that any exception thrown in your application or in the receiving application won't be caught (There are methods to get past this)
If you run your site in a Web Farm, you could end up with multiple instances of your app that all attempt to run the same task at the same time. A little more challenging to deal with than the first item, but still not too hard. One typical approach is to use a resource common to all the servers, such as the database, as a synchronization mechanism to coordinate tasks.
You could have multiple fire-and-forget calls when you mean to have just one.
The AppDomain your site runs in can go down for a number of reasons and take down your background task with it. This could corrupt data if it happens in the middle of your code execution.
Here is the danger. Should your AppDomain go down, it may corrupt the data that is being sent to the other API causing strange behavior at the other end.
I'm trying to understand the implications of doing something like this. I read all over that this is really dangerous
Dangerous is relative. If you execute something that you don't care at all if it completes or not, then you shouldn't care at all if IIS decides to recycle your app while it's executing either, should you? The thing you'll need to keep in mind is that offloading work without registration might also cause the entire process to terminate.
Will IIS dispose the thread and the client object?
IIS can recycle the AppDomain, causing your thread to abort abnormally. Whether it will do so depends on many factors, such as how recycling is configured in your IIS, and whether you're doing any other operations which may cause a recycle.
In many of his posts, Stephen Cleary tries to convey the point that offloading work without registering it with ASP.NET is dangerous and may cause undesirable side effects, for all the reasons you've read. That's also why there are libraries such as AspNetBackgroundTasks, or Hangfire for that matter.
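If you are on .NET 4.5.2 or later, the simplest form of "registering it with ASP.NET" is HostingEnvironment.QueueBackgroundWorkItem, which asks the runtime to delay AppDomain shutdown (for a limited time) until queued items finish. A minimal sketch reusing the call from the question; it is still best effort and does not survive a recycle:

using System;
using System.Net.Http;
using System.Web.Hosting;

public static class FireAndForget
{
    public static void PostToOtherApi(Uri someOtherApiAddress)
    {
        HostingEnvironment.QueueBackgroundWorkItem(async cancellationToken =>
        {
            // The work is now known to ASP.NET, so shutdown is delayed while
            // it runs, and the cancellation token signals an impending recycle.
            using (var client = new HttpClient { BaseAddress = someOtherApiAddress })
            {
                await client.PostAsync("DoSomething", null, cancellationToken);
            }
        });
    }
}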
The thing you should most worry about is that a thread which isn't associated with a request can cause your entire process to terminate:
An unhandled exception in a thread not associated with a request will take down the process. This occurs even if you have a handler setup via the Application_Error method.
Yes, there are a few ways to fire-and-forget a "task" or piece of work without needing confirmation. I've used Hangfire and it has worked well for me.
The dangers, from what I understand, are that an exception in a fire-and-forget thread could bring down your entire IIS process.
See this excellent link about it.
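For reference, the basic Hangfire shape looks something like this (assuming the Hangfire and Hangfire.SqlServer packages and a configured Hangfire server; the connection string name and e-mail helper are hypothetical). The job is written to persistent storage before it runs, which is what makes it safer than an in-process thread:

using Hangfire;

public static class EmailService
{
    public static void SendCaseUpdateEmail(int caseId)
    {
        // Actual e-mail sending goes here.
    }
}

public static class BackgroundJobs
{
    // One-time setup, e.g. from Application_Start or OWIN startup.
    public static void Configure()
    {
        GlobalConfiguration.Configuration.UseSqlServerStorage("HangfireConnection");
    }

    public static void QueueCaseUpdateEmail(int caseId)
    {
        // Persisted first, executed by a Hangfire worker, retried on failure.
        BackgroundJob.Enqueue(() => EmailService.SendCaseUpdateEmail(caseId));
    }
}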

How are IIS7 threads assigned?

I added log4net to my application and can now see the thread ids of user activities as they navigate through my website. Is there any specific algorithm to how thread assignment happens with IIS7, or is it just a random number assignment (I suspect it's not completely random, because my low-traffic site shows threads mostly in the range 10-30)? Is there any maximum to the number of threads available? And I notice that my scheduler shows up with a weird thread id -- any reason for this? The scheduler is Quartz.net and the id shows as "Scheduler_Worker-10", and not just a number.
This explains all you need to know.
An Excerpt:
When ASP.NET is hosted on IIS 7.0 in integrated mode, the use of threads is a bit different. First of all, the application-level queues are no more. Their performance was always really bad, there was no hope in fixing this, and so we got rid of them. But perhaps the biggest difference is that in IIS 6.0, or ISAPI mode, ASP.NET restricts the number of threads concurrently executing requests, but in IIS 7.0 integrated mode, ASP.NET restricts the number of concurrently executing requests. The difference only matters when the requests are asynchronous (the request either has an asynchronous handler or a module in the pipeline completes asynchronously). Obviously if the requests are synchronous, then the number of concurrently executing requests is the same as the number of threads concurrently executing requests, but if the requests are asynchronous then these two numbers can be quite different as you could have far more requests than threads.
So basically, if requests are synchronous, there is one thread per concurrently executing request. See here for the various parameters.
I've explained this in a blog post on my blog:
ASP.NET Performance-Instantiating Business Layers
The title doesn't coincide with your question, but I explain the way IIS handles requests, and I believe you'll find your answer there.
A quote from the article
When IIS fields a request for your application it hands it over to the worker process. The worker process in turn creates an instance of your Global class (which is of type HttpApplication). From that point on the typical flow of an ASP.NET application takes place (the ASP.NET pipeline). However, what you need to know and understand is that the worker process (think of it as IIS really) keeps the instance of your HttpApplication (an instance of your Global class) alive, in order to field other requests. In fact, by default it will create and cache up to 10 instances of your Global class, if required (lazy instantiation), depending on load, the number of requests your website receives and other factors. In Figure 1 [of the article] the instances of your ASP.NET application are shown as the red boxes. There could be up to 10 of these cached by the worker process. These are really threads that the worker process has created and cached, and each thread has its own instance of your Global class. Note that each of these threads is in the same App Domain. So any static classes you may have in your application are shared across each of these threads or application instances.
I suggest you read that article and I'll be happy to answer any questions you may have. Please note that I've intentionally kept the article simple, in that I don't talk about what happens in the kernel or go into details of the various components that participate. Keeping it simple helps people understand the concepts a lot better (I feel).
I'll answer some of your other questions here:
Is there any specific algorithm to how threads assignment happens with IIS7?
No, for all intents and purposes it's random. This is explained in the article I pointed to. The short answer is that if a cached thread is available, then IIS will use it. If not, it will create a new thread, create an instance of your HttpApplication (Global) and assign all of the context to it. So on a site that's not busy, you may see the same threads handle requests, but there are no guarantees. If there is more than one free thread, IIS will pick a thread at random to service that request. You should note here that even on a not-so-busy site, if your requests take a long time, IIS will be forced to create new threads to service other incoming requests.
Any maximum to the number of threads available?
Yes (as explained in the article), typically 10 threads per worker process. This can be adjusted, but I've worked on a number of extremely busy websites and I've never had to. The key is to make your applications respond as fast as possible. Mind you, an application can have multiple worker processes assigned to it (configured in your app pool), so for busy sites you may actually want multiple worker processes for your application; however, the implication is that you have the required hardware (CPU cores and memory).
The scheduler is Quartz.net and the id shows as "Scheduler_Worker-10", and not just a number
Threads can have names instead of ids. If the thread has been assigned a name, then you'll see that instead of an id. Of course, for the threads IIS creates you have no such control. Mind you, I've not used (nor do I know much about) Quartz, so I can't say for sure, but I'm guessing that's the case.

How can I send the HTTP response back to the user but still do more things on the server after that?

Sometimes there is a lot that needs to be done when a given Action is called. Many times, there is more that needs to be done than what is needed to generate the next HTML for the user. In order to give the user a faster experience, I want to do only what I need to do to get them their next view and send it off, but still do more things afterwards. How can I do this: multi-threading? Would I then need to worry about making sure different threads don't step on each other's feet? Is there any built-in functionality for this type of thing in ASP.NET MVC?
As others have mentioned, you can use a spawned thread to do this. I would take care to consider the 'criticality' of several edge cases:
If your background task encounters an error and fails to do what the user expected to be done, do you have a mechanism for reporting this failure to the user?
Depending on how 'business critical' the various tasks are, using a robust/resilient message queue to store 'background tasks to be processed' will help protect against a scenario where the user requests some action, and the server responsible crashes, or is taken offline, or the IIS service is restarted, etc. and the background thread never completes.
Just food for thought on other issues you might need to address.
How can I do this, multi-threading?
Yes!
Would I then need to worry about making sure different threads don't step on each others feet?
This is something you need to take care of anyway, since two different ASP.NET requests could arrive at the same time (from different clients) and be handled in two different worker threads simultaneously. So, any code accessing shared data needs to be coded in a thread-safe way anyway, even without your new feature.
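For example, here is a minimal sketch of guarding shared state with a lock (the class and counter are illustrative); the same discipline applies whether the competing threads are two requests or a request plus your new background work:

using System.Collections.Generic;

public static class SharedCounters
{
    private static readonly object Sync = new object();
    private static readonly Dictionary<string, int> Counts = new Dictionary<string, int>();

    public static void Increment(string key)
    {
        // Serialize access so concurrent threads don't corrupt the dictionary.
        lock (Sync)
        {
            int current;
            Counts.TryGetValue(key, out current);
            Counts[key] = current + 1;
        }
    }

    public static int Get(string key)
    {
        lock (Sync)
        {
            int current;
            Counts.TryGetValue(key, out current);
            return current;
        }
    }
}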
Is there any built in functionality for this type of thing in ASP.NET MVC?
The standard .NET multi-threading techniques should work just fine here (manually starting threads, using the Task features, or using the Async CTP, ...).
It depends on what you want to do, and how reliable you need it to be. If the operations pending after the response is sent are OK to be lost, then .NET async calls, the ThreadPool or a new Thread are all going to work just fine. If the process crashes, the pending work is lost, but you have already accepted that this can happen.
If the work requires any reliability guarantee, for instance if the work incurs updates in the site database, then you cannot rely on in-process threading; you need to persist the request to do the work and then process this work even after a process restart (an "app pool recycle", as IIS so politely calls it).
One way to do this is to use MSMQ. Another way is to use a database table as a queue. The most reliable way is to use the database activation mechanisms, as described in Asynchronous procedure execution.
You can start a background task, then return from the action. This example uses the Task Parallel Library, found in .NET 4.0:
public ActionResult DoSomething()
{
    Task t = new Task(() => DoSomethingAsynchronously());
    t.Start();
    return View();
}
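If you are on .NET 4.5 or later, Task.Run is the more idiomatic way to express the same thing (with the same caveat that the work is not registered with ASP.NET and dies with the app pool); a small variation on the example above, with the worker method stubbed in:

using System.Threading.Tasks;
using System.Web.Mvc;

public class SomeController : Controller
{
    public ActionResult DoSomething()
    {
        // Queue the work on the thread pool and return the view immediately.
        Task.Run(() => DoSomethingAsynchronously());
        return View();
    }

    private void DoSomethingAsynchronously()
    {
        // The extra work goes here.
    }
}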
I would use MSMQ for this kind of work. Rather than spawning threads in an ASP.NET application, I'd use an Asynchronous out of process way to do this. It's very simple and very clean.
In fact I've been using MSMQ in ASP.NET applications for a very long time and have never had any issues with this approach. Further, having a different process (that is, an executable in a different app domain) do the long-running work is an ideal way to handle it, since your web application is not being used to do this work. So IIS, the thread pool and your web application can continue to do what they need to, while other processes handle the long-running tasks.
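A sketch of what the hand-off side can look like with System.Messaging (the queue path is illustrative); a separate Windows service then reads the queue and does the long-running work:

using System.Messaging;

public static class WorkDispatcher
{
    private const string QueuePath = @".\private$\longRunningWork";

    public static void Dispatch(string payload)
    {
        // Make sure the queue exists, drop the message and return immediately;
        // the web request is done as soon as the message is queued.
        if (!MessageQueue.Exists(QueuePath))
        {
            MessageQueue.Create(QueuePath);
        }

        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Send(payload, "background work item");
        }
    }
}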
Maybe you should give it a try: Using an Asynchronous Controller in ASP.NET MVC

Looking at what happens when a c#/ASP.NET thread is terminated and how to get around problems

I'm working on an ASP.NET website that, on some requests, will run a very lengthy caching process. I'm wondering what happens exactly if the execution timeout is reached while it is still running, in terms of how the code handles it.
In particular, I am wondering about things like: if the code is in the try of a try/finally block, will the finally still be run?
Also, given that I am not sure I want the caching to terminate even if it goes on that long, is there a way, by spawning new threads etc., that I can circumvent this execution timeout? I am thinking it would be much nicer to return to the user immediately and say "a cache build is happening" rather than just letting them time out. I have recently started playing with some locking code to make sure only one cache build happens at a time, but am thinking about extending this to make it run out of sync.
I've not really played with creating threads and suchlike myself, so I'm not sure exactly how they work, particularly in terms of interacting with ASP.NET. E.g. if the parent thread that launched it is terminated, will that have any effect on the spawned thread?
I know there are kind of a lot of different questions in here, and I can split them if that is deemed best, but they all seem to go together... I'll try to summarise the questions though:
Will a finally block still be executed if a thread is terminated by ASP.NET while in the try block?
Would newly created threads be subject to the same timeouts as the original thread?
Would newly created threads die at the same time as the parent thread that created them?
And the general one of what is the best way to do long running background processes on an ASP.NET site?
Sorry for some noobish questions, I've never really played with threads and they still intimidate me a bit (my brain says they are hard). I could probably test the answer to a lot of these questions, but I wouldn't be confident enough of my tests. :)
Edit to add:
In response to Capital G:
The problem I have is that the ASP.NET execution timeout is currently set to one hour, which is not always long enough for some of these processes, I reckon. I've put some stuff in with locks to prevent more than one person setting off these long processes, and I was worried the locks might not be released (which, if finally blocks aren't always run, might happen I guess).
Your comments on not running long processes in ASP.NET are why I was thinking of moving them to other threads rather than blocking the request thread, but I don't know if that still counts as running within the ASP.NET architecture that you said was bad.
The code is not actually mine so I'm not allowed (and not sure I 100% understand it enough) to rework it into a service though that is certainly where it would best live.
Would using a BackgroundWorker process for something that could take an hour be feasible in this situation (with respect to the comments on long-running processes in ASP.NET)? I would then make the request return a "cache is building" page until it's finished, and then go back to serving normally... It's all a bit of a nightmare, but it's my job, so I've got to find a way to improve it. :)
Interesting question. I just tested, and no, it's not guaranteed to execute the code in the finally block; if a thread is aborted, it could stop at any point in the processing. You can design some sanity checking and other mechanisms to handle special cleanup routines and such, but it has a lot to do with your thread handling as well.
Not necessarily; it depends on how you're implementing your threads. If you are working with threads yourself, then you can easily get into situations where the parent thread is killed while its child threads are still out there processing; you generally want to do some cleanup in the parent thread that ends the child threads as well. Some objects might do a lot of this for you, so it's a tough call to say one way or the other. Never assume this, at the very least.
No, not necessarily; don't assume this at least. Again, it has to do with your design and whether you're doing threading yourself or using some higher-level threading object/pattern. I would never assume this regardless.
I don't recommend long-running processes within the ASP.NET architecture unless they are within the typical timeout: if it's 10-20s, okay, but if it's minutes, no. The reason is resource usage within ASP.NET, and it's awfully hard on the user. That being said, you could perform asynchronous operations where you hand off the work to the server and return to the user when the processing is finished (this is great for those 10-20s+ processes); the user can be given a little animation, or at least not have their browser stuck for that long waiting for whatever is happening on the server to happen.
If it is a long-running process, i.e. things that take more than 30-60s, then unless it absolutely has to be done in ASP.NET due to the nature of the process, I suggest moving it to a Windows service and scheduling it in some way to occur when required.
Note: Threading CAN be complicated. It's not that it's hard so much as that you have to be very aware of what you're doing, which requires a firm understanding of what threads are and how they work. I'm no expert, but I'm also not completely new, and I'll tell you that in most situations you don't need to get into the realm of threading, even when it seems like you do. If you must, however, I would suggest looking into the BackgroundWorker object, as it is simplified for the purposes of doing batched processing etc. (honestly, for many situations that DO need threads, this is usually a very simple solution); a minimal sketch follows the link below.
http://msdn.microsoft.com/en-us/library/system.componentmodel.backgroundworker.aspx
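A minimal BackgroundWorker sketch for something like the cache rebuild discussed above; note that inside ASP.NET it is still an in-process thread, so it remains subject to app pool recycling:

using System.ComponentModel;

public class CacheRebuilder
{
    public void Start()
    {
        var worker = new BackgroundWorker();

        worker.DoWork += (sender, e) =>
        {
            // The long-running cache build goes here.
        };

        worker.RunWorkerCompleted += (sender, e) =>
        {
            // Flip whatever flag tells pages to stop showing "cache is building".
        };

        worker.RunWorkerAsync();
    }
}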
Long or time-consuming processes that need to be started behind the web page, that should not hit the ASP.NET execution timeout, that should free up the user's page, that run requests under a lock, etc.: all these situations point towards using asynchronous services. In one of the products I architected, we used services for such scenarios. The service exposes an asynchronous method to initiate the work, and the status of the progress can be queried using another method. Every request is given an id, and no duplicate requests are ever fired. The work proceeds even if the user logs out, and the user can see the results at a later time.
If you have looked at such options already, let me know if there is any issue. If you are yet to look in this direction, please consider going this way. For any help, just send in your comments.
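A rough sketch of the "initiate and poll" shape described above; all names are illustrative, and in a real deployment the service would live outside IIS (e.g. in a Windows service) so it is not tied to the AppDomain lifetime:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class CacheBuildService
{
    private readonly ConcurrentDictionary<Guid, string> _status =
        new ConcurrentDictionary<Guid, string>();

    // Starts the work and returns immediately with an id the caller can poll.
    public Guid Start()
    {
        var id = Guid.NewGuid();
        _status[id] = "Running";

        Task.Run(() =>
        {
            try
            {
                BuildCache(); // the long-running work
                _status[id] = "Done";
            }
            catch (Exception ex)
            {
                _status[id] = "Failed: " + ex.Message;
            }
        });

        return id;
    }

    public string GetStatus(Guid id)
    {
        string status;
        return _status.TryGetValue(id, out status) ? status : "Unknown";
    }

    private void BuildCache()
    {
        // Build or refresh the cache here.
    }
}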
