Does an HttpHandler listen for a disconnect from the browser?
My guess is "no" since it seems to be mostly/only used for dynamic file creation, so why would it?
But I can't find an answer in the docs or Google.
Many thanks in advance!
Background
I'd like to "abort" an HttpHandler because currently, I allow huge excel exports (~150k sql rows, so ~600k html lines). For reasons almost as ridiculous as the code, I have a query that fires for as many sql rows that the user tries to export. As you can imagine, this takes a very long time.
I think I'm getting backed up with worker processes because users probably get frustrated with the lag, and try again with a smaller result. I currently flush the worker procs automatically every 30 min, but I'd rather cleanup more quickly.
I don't have the time to clean up the sql right now, so I'd like to just listen for an "abort" from the client and kill the handler if "aborted".
What you're hoping to accomplish by listening for a client connection drop won't really help solve your problem at all. The core of your problem is a long running task being kicked off in an HttpHandler directly.
In this case, even if you could listen for a client disconnect it wouldn't ever be acted upon as your code will be too busy executing to listen for it.
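For what it's worth, ASP.NET does expose HttpResponse.IsClientConnected, but it only helps if the work is already broken into chunks you can poll between; a single monolithic query never reaches the check. A minimal sketch, where WriteChunk is a hypothetical stand-in for the real per-chunk work:

using System.Web;

public class ExportHandler : IHttpHandler
{
    public bool IsReusable { get { return false; } }

    public void ProcessRequest(HttpContext context)
    {
        // Hypothetical: the export split into 1,000 small chunks.
        for (int chunk = 0; chunk < 1000; chunk++)
        {
            // Bail out as soon as the browser has gone away.
            if (!context.Response.IsClientConnected)
                return;

            WriteChunk(context, chunk); // stand-in for the real per-chunk query
            context.Response.Flush();
        }
    }

    private void WriteChunk(HttpContext context, int chunk)
    {
        context.Response.Write("<tr><td>chunk " + chunk + "</td></tr>");
    }
}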
The only way to properly determine progress and perform actions during long running processes such as this is to ensure that your code is multi-threaded. The problem with doing this in ASP.NET for long running processes is they'll suck up threads from the thread pool needed to serve your pages. This could result in your website hanging or responding very slowly, as you've been experiencing.
I would recommend writing a Windows Service to handle these long running jobs and having it spit the results into a staging directory. I would then use MSMQ or similar to throw the request to the service for processing.
Ultimately, you want to get this long-running work outside of ASP.NET, where you can take advantage of the benefits that multi-threading can offer you, such as the ability to report back progress and to abort when needed.
Related
Sometimes there is a lot that needs to be done when a given Action is called. Many times, there is more to do than just what is needed to generate the next HTML for the user. To give the user a faster experience, I want to do only what I need to get them their next view and send it off, but still do more work afterwards. How can I do this, multi-threading? Would I then need to worry about making sure different threads don't step on each other's feet? Is there any built-in functionality for this type of thing in ASP.NET MVC?
As others have mentioned, you can use a spawned thread to do this. I would take care to consider the 'criticality' of several edge cases:
If your background task encounters an error and fails to do what the user expected to be done, do you have a mechanism to report this failure to the user?
Depending on how 'business critical' the various tasks are, using a robust/resilient message queue to store 'background tasks to be processed' will help protect against a scenario where the user requests some action, the responsible server crashes or is taken offline, or the IIS service is restarted, and the background thread never completes.
Just food for thought on other issues you might need to address.
How can I do this, multi-threading?
Yes!
Would I then need to worry about making sure different threads don't step on each other's feet?
This is something you need to take care of anyway, since two different ASP.NET requests could arrive at the same time (from different clients) and be handled in two different worker threads simultaneously. So any code accessing shared data needs to be written in a thread-safe way anyway, even without your new feature.
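For illustration, here's a minimal sketch of guarding shared state with a lock; ReportCache and its contents are hypothetical, not anything built into ASP.NET:

using System.Collections.Generic;

public static class ReportCache
{
    private static readonly object Gate = new object();
    private static readonly Dictionary<int, string> Results =
        new Dictionary<int, string>();

    // Every thread (request threads and background threads alike)
    // must take the same lock before touching the shared dictionary.
    public static void Store(int jobId, string result)
    {
        lock (Gate) { Results[jobId] = result; }
    }

    public static string TryGet(int jobId)
    {
        lock (Gate)
        {
            string result;
            return Results.TryGetValue(jobId, out result) ? result : null;
        }
    }
}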
Is there any built in functionality for this type of thing in ASP.NET MVC?
The standard .net multi-threading techniques should work just fine here (manually starting threads, or using the Task features, or using the Async CTP, ...).
It depends on what you want to do and how reliable you need it to be. If the operations pending after the response is sent are OK to be lost, then .NET async calls, the ThreadPool, or a new Thread will all work just fine. If the process crashes, the pending work is lost, but you already accepted that this can happen.
If the work requires any reliability guarantee, for instance because it updates the site database, then you cannot rely on in-process .NET threading; you need to persist the request to do the work and then process that work even after a process restart (an app-pool recycle, as IIS so politely calls it).
One way to do this is to use MSMQ. Another way is to use a database table as a queue. The most reliable way is to use the database activation mechanisms, as described in Asynchronous procedure execution.
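As a rough sketch of the table-as-queue idea, assuming a hypothetical WorkQueue table with Payload and Status columns (0 = pending):

using System.Data.SqlClient;

public static class JobTable
{
    public static void Enqueue(string connectionString, string payload)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "INSERT INTO WorkQueue (Payload, Status) VALUES (@p, 0)", conn))
        {
            cmd.Parameters.AddWithValue("@p", payload);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}

A separate process then polls the table for Status = 0 rows and marks them as it processes them.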
You can start a background task, then return from the action. This example uses the Task Parallel Library, found in .NET 4.0:
using System.Threading.Tasks;

public ActionResult DoSomething()
{
    // Kick the work off on a background thread, then return the view
    // immediately without waiting for it to finish.
    Task t = new Task(() => DoSomethingAsynchronously());
    t.Start();
    return View();
}
I would use MSMQ for this kind of work. Rather than spawning threads in an ASP.NET application, I'd handle it asynchronously and out of process. It's very simple and very clean.
In fact I've been using MSMQ in ASP.NET applications for a very long time and have never had any issues with this approach. Further, having a different process (that is, an executable in a different app domain) do the long-running work is ideal, since your web application is not being tied up doing it. So IIS, the thread pool and your web application can continue to do what they need to, while other processes handle the long-running tasks.
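For reference, sending a job message with System.Messaging looks roughly like this; the queue path and message body are assumptions, and the queue must already exist (or be created with MessageQueue.Create):

using System.Messaging;

public static class JobSender
{
    // Hypothetical private queue on the local machine.
    private const string QueuePath = @".\Private$\LongRunningJobs";

    public static void EnqueueJob(string jobDescription)
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Send(jobDescription, "job"); // second argument is the message label
        }
    }
}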
Maybe you should give it a try: Using an Asynchronous Controller in ASP.NET MVC
Let's say we are building some public service that grabs the setup of a user (which server, user and pwd he wants to perform the call with), logs in to that server and does some processing...
the process takes about 15 seconds to complete
each user has a different setup (server/user/pwd), so the process needs to run against each one
if 1000 users tell the system to run the method at 1:00 PM
How can I ensure that the method is processed within the next 15 minutes?
What should be the correct approach to this little problem?
I'm thinking that I need to do something asynchronously, and parallel processing could speed things up; maybe throttle the work, say 100 calls every 30 seconds?
I've never done something like this and would love your feedback on ideas and likely problems, rather than spend 100 hours of work and realize that I took a wrong road :(
Thank you.
added
The only thing to have in consideration is that this should be a 100% web solution.
If one call to your method does not affect the result of another method call (which seems to be the case here), parallel programming seems to be the way to go.
Consider not processing this in the asp.net application directly, but rather placing such requests on a queue and having another process (windows service may be a good candidate here) pulling items off the queue for processing. The windows service can have multiple threads and can pull as many items off the queue at once as there are processing threads available. With an appropriate queuing mechanism, the windows service can run on separate hardware if needed to reach your performance goals.
You can have the original web page query the result using e.g. Ajax to provide the user feedback if that's a requirement.
UPDATE:
Microsoft has recommended a pattern for long running tasks that can be used in a hosted environment.
Well, 1000 * 15 seconds is more than 4 hours of serial work (15,000 seconds), so you can only complete the entire task within the 15-minute time frame if you parallelize the batch: 15,000 seconds of work in a 900-second window means at least 17 workers running concurrently.
I would set up a queue and have a sufficient number of threads or processes pull from that queue.
You can define an in-process queue with Queue<T> or out-of-process either with a database table or MSMQ.
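A minimal in-process sketch of that, assuming a console host and a hypothetical Process method standing in for the 15-second login-and-run step:

using System.Collections.Concurrent;
using System.Threading;

class BatchRunner
{
    static void Main()
    {
        var jobs = new BlockingCollection<string>();

        // 1000 jobs x 15 s = 15,000 s of work; a 900 s window
        // therefore needs at least 17 concurrent workers.
        var workers = new Thread[17];
        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = new Thread(() =>
            {
                foreach (var job in jobs.GetConsumingEnumerable())
                    Process(job);
            });
            workers[i].Start();
        }

        for (int u = 0; u < 1000; u++)
            jobs.Add("user" + u);   // one entry per user's setup
        jobs.CompleteAdding();      // lets the workers drain and exit

        foreach (var w in workers) w.Join();
    }

    // Stand-in for the real 15-second login-and-process call.
    static void Process(string job) { Thread.Sleep(15000); }
}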
If you don't want to write multithreaded code, you can just have a bunch of different processes running on different machines, all pulling from the same queue.
A console application can do this, but a Windows Service is definitely also an alternative.
I have a web application which, on a button click, runs multiple threads, each making an IO call to a different ipAddress (logging in with a Windows account and then performing file operations). There is a threshold value of 30 seconds. I assume that if the threshold is exceeded during a login attempt, the device at that ipAddress does not match my conditions, so I don't care about it. Thread.Abort() does not fit my situation, because it waits for the IO call to finish, which might take a long time.
I tried doing the DB operations according to the states of the threads right after the threshold timeout. It worked fine, but when I checked the log file, I noticed that the thread.IsAlive property of the non-responding threads was still true. After several debugging sessions on my local PC, I ran into a possible deadlock situation (which I suspect) that crashed my PC badly.
In short, do you have any idea how to kill (forcefully) non-responding threads (waiting on the IO operation) right after the execution of the button_click?
(PS: I am not using the threadpool)
Oguzhan
EDIT
For further clarification,
I need to validate the given local administrator credentials on each ipAddress, and insert a DB record for the successful ones. The rest I don't care about.
In my validation method, I first call the LogonUser method of Win32 (imported from advapi32.dll) to impersonate the administrator user. After that I attempt to create a temp dir on the remote system drive via Directory.CreateDirectory, just to check authorization. If an exception is thrown (UnauthorizedAccessException or IOException) then the remote machine is of no interest; otherwise we've got it, so insert into the DB.
I called the validation method synchronously for a given IP range and it worked fine for a bunch of successful endpoints. But when I tested the method on an irrelevant ipAddress range, each validation attempt took 20 secs to 5 mins to complete.
I then turned my design into a multithreaded one, in which I run each validation in a separate thread and abort the non-responding threads at the end of the threshold period. The problem was that Thread.Abort did not suit the situation well: it in fact waits for the IO instruction to return (which I don't want) and raises a ThreadAbortException after that.
In order to complete the execution of the successful threads, I ignored the non-responding ones, proceeded with the DB operations and returned from the button click method (the non-responding threads were still alive at that point in time). Everything seemed OK until I got a bad system crash after executing the button click (in debug mode) several times. The problem was probably the growing number of live threads under the IIS service.
SOLUTION
The cause of the threads not responding in a timely fashion is a "network path not found" situation. My solution is to check connectivity via TCP on port 135 (port 135 is mandatory for RPC on Windows) before making the IO call. The default connect timeout period is about 20 secs; if you need to set the timeout yourself, use BeginConnect. Another option would be pinging (if ICMP is enabled on the network).
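A sketch of that pre-check, using TcpClient.BeginConnect so the timeout is under your own control:

using System;
using System.Net.Sockets;

public static class RpcProbe
{
    public static bool CanReach(string ipAddress, TimeSpan timeout)
    {
        using (var client = new TcpClient())
        {
            // BeginConnect lets us enforce our own timeout instead of
            // waiting ~20 s for "network path not found".
            IAsyncResult ar = client.BeginConnect(ipAddress, 135, null, null);
            if (!ar.AsyncWaitHandle.WaitOne(timeout))
                return false;   // no TCP answer on the RPC port in time

            client.EndConnect(ar);
            return true;
        }
    }
}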
Why do you really need to abort the threads anyway? Why not just let them complete normally but ignore the results? (Keep a token to indicate which "batch" of requests it's in, and then remember which batch you're actually interested in at the moment.)
Another option is to keep hold of whatever you're using to make an IO call (e.g. a socket) and close it; that should cause an exception on the thread making the request.
Yet another option is to avoid putting the requests on different threads, but instead use asynchronous IO - you'll still get parallelism, but without tying up threads (just IO completion ports). Also, can't you put a timeout on the IO operation itself? The request should just time out naturally that way.
Thanks in advance for reading and answering this question.
I have a button in ASP.NET 2.0 that will process something BIG. It will take some time to finish (more than 30,000 comparisons), and I want to know: if the browser loses communication with the server, will the server still finish the process?
You probably want to modify your architecture so that the HTTP response is not dependent on the processing finishing within the timeout period. Judging from the question, it sounds as if you are not going to tell the user anything based on the results of the calculation anyway. There are different methods you could use, but most involve writing a message to a queue and then having a separate process, like a Windows Service, monitor that queue and do the long-running work separately.
You should not execute this work live on the button click but instead spawn a thread server-side.
You could use AJAX to tell the server to start the comparison and listen for the answer later on.
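A minimal sketch of the spawn-a-thread suggestion, where RunComparisons and StatusLabel are hypothetical placeholders for the page's work and UI:

using System.Threading;

protected void RunButton_Click(object sender, EventArgs e)
{
    // Queue the work on a pool thread; the response returns immediately
    // and the work finishes whether or not the browser stays connected.
    ThreadPool.QueueUserWorkItem(_ => RunComparisons());
    StatusLabel.Text = "Started; check back for results.";
}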
In my web application there is a process that queries data from all over the web, filters it, and saves it to the database. As you can imagine, this process takes some time. My current solution is to increase the page timeout and show the user an AJAX progress bar while it loads. This is a problem for two reasons: 1) it still takes too long and the user must wait, 2) it sometimes still times out.
I've dabbled in threading the process and have read I should async post it to a web service ("Fire and forget").
Some references I've read:
- MSDN
- Fire and Forget
So my question is - what is the best method?
UPDATE: After the user inputs their data I would like to redirect them to the results page that incrementally updates as the process is running in the background.
To avoid excessive architecture astronomy, I often use a hidden iframe to call the long running process and stream back progress information. Coupled with something like jsProgressBarHandler, you can pretty easily create great out-of-band progress indication for longer tasks where a generic progress animation doesn't cut it.
In your specific situation, you may want to use one LongRunningProcess.aspx call per task, to avoid those page timeouts.
For example, call LongRunningProcess.aspx?taskID=1 to kick it off and then at the end of that task, emit a
document.location = "LongRunningProcess.aspx?taskID=2".
Ad nauseam.
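A rough sketch of what LongRunningProcess.aspx's code-behind could stream back; DoStep and the parent frame's updateProgress function are assumptions:

protected void Page_Load(object sender, EventArgs e)
{
    Response.BufferOutput = false;   // push each script block out as it's written
    for (int step = 1; step <= 10; step++)
    {
        DoStep(step);                // hypothetical unit of the real work
        Response.Write(string.Format(
            "<script>parent.updateProgress({0});</script>", step * 10));
        Response.Flush();
    }
}

private void DoStep(int step)
{
    System.Threading.Thread.Sleep(1000); // stand-in for the real work
}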
We had a similar issue and solved it by starting the work via an asynchronous web service call (which meant that the user did not have to wait for the work to finish). The web service then started a SQL job which performed the work and periodically updated a table with the status of the work. We provided a UI which allowed the user to query that table.
I ran into this exact problem at my last job. The best way I found was to fire off an asynchronous process and notify the user when it's done (by email or something else). Making them wait that long is going to be problematic because of timeouts and wasted productivity. Having them watch a progress bar can also give them a false sense that they can cancel the process by closing the browser, which may not be the case depending on how you set up the system.
How are you querying the remote data?
How often does it change?
Are the results something that could be cached for a period of time?
How long a period of time are we actually talking about here?
The 'best method' is likely to depend in some way on the answers to these questions...
You can create another thread and store a reference to it in session or application state, depending on whether the thread can run only once per website or once per user session.
You can then redirect the user to a page where he can monitor the threads progress. You can set the page to refresh automatically, or display a refresh button to the user.
Upon completion of the thread, you can send an email to the user.
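A hedged sketch of that idea; rather than touching Session from the background thread (which isn't safe), it keys a static map by session ID, and GatherData is a hypothetical stand-in for the long-running work:

using System.Collections.Concurrent;
using System.Threading;

// One completion flag per session, shared across requests.
private static readonly ConcurrentDictionary<string, bool> JobDone =
    new ConcurrentDictionary<string, bool>();

protected void StartButton_Click(object sender, EventArgs e)
{
    string key = Session.SessionID;     // capture on the request thread
    JobDone[key] = false;
    new Thread(() =>
    {
        GatherData();                   // hypothetical long-running work
        JobDone[key] = true;
    }).Start();
    Response.Redirect("Progress.aspx"); // auto-refreshing status page
}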
My solution to this has been an out-of-band service that runs these queries and caches the results in the DB.
When a person asks for something the first time they get a bit of a wait, and then it shows up; but if they refresh, it's immediate, and because it's in the DB it's now part of the hourly update for the next 24 hours from the last request.
Add the job, with its relevant parameters, to a job queue table. Then, write a windows service that will pick up these jobs and process them, save the results to an appropriate location, and email the requester with a link to the results. It is also a nice touch to give some sort of a UI so the user can check the status of their job(s).
This way is much better than launching a separate thread or increasing the timeout, especially if your application is larger and needs to scale, as you can simply add more servers to process jobs if necessary.
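The service side might look roughly like this, assuming a hypothetical WorkQueue table (Id, Payload, Status) and a ProcessJob method:

using System.Data.SqlClient;
using System.Threading;

public void PollLoop(string connectionString)
{
    while (true)
    {
        int? id = null;
        string payload = null;

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            @"UPDATE TOP (1) WorkQueue SET Status = 1
              OUTPUT inserted.Id, inserted.Payload
              WHERE Status = 0", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                if (reader.Read())
                {
                    id = reader.GetInt32(0);        // claim one pending job
                    payload = reader.GetString(1);
                }
            }
        }

        if (id.HasValue)
            ProcessJob(id.Value, payload);  // hypothetical long-running work
        else
            Thread.Sleep(5000);             // back off while the queue is empty
    }
}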