I have a web application that runs multiple threads on a button click, each thread making an IO call to a different ipAddress (logging in with a Windows account and then performing file operations). There is a threshold value of 30 seconds. My assumption is that if the threshold is exceeded during the login attempt, the device at that ipAddress does not match my conditions, so I don't care about it. Thread.Abort() does not fit my situation because it waits for the IO call to finish, which might take a long time.
I tried doing the DB operations according to the states of the threads right after the threshold timeout. It worked fine, but when I checked the log file, I noticed that the Thread.IsAlive property of the nonresponding threads was still true. After several debugging sessions on my local PC, I ran into what I suspect was a deadlock situation that crashed my PC badly.
In short, do you have any idea how to kill (forcefully) nonresponding threads (those waiting on the IO operation) right after the execution of the button_click?
(PS: I am not using the thread pool.)
Oguzhan
EDIT
For further clarification,
I need to validate the given local administrator credentials on each ipAddress and insert a DB record for the successful ones. The rest I don't care about.
In my validation method, I first call the LogonUser function of Win32 (by importing advapi32.dll) to impersonate the administrator user. After that I attempt to create a temp dir on the remote system drive via Directory.CreateDirectory, just to check authorization. If an exception is thrown (UnauthorizedAccessException or IOException), the remote machine is out of interest; otherwise we've got it, so insert into the DB.
I called the validation method synchronously for a given IP range and it worked fine for a bunch of valid endpoints. But when I tested the method on an irrelevant ipAddress range, each validation attempt took 20 seconds to 5 minutes to complete.
I then turned my design into a multithreaded one, running each validation in a separate thread and aborting the nonresponding threads at the end of the threshold period. The problem was that Thread.Abort did not suit the situation well: it in fact waits for the IO instruction to return (which I don't want) and only raises a ThreadAbortException after that.
In order to complete the execution of the successful threads, I ignored the nonresponding ones, proceeded with the DB operations, and returned from the button click method (the nonresponding threads were still alive at that point in time). Everything seemed OK until I got a bad system crash after executing the button click (in debug mode) several times. The problem was probably the growing number of live threads under the IIS service.
SOLUTION
The threads fail to respond in a timely fashion because of a "network path not found" situation. My solution is to check connectivity via TCP on port 135 (port 135 is mandatory for RPC on Windows) before making the IO call. The default timeout period is about 20 seconds; if you need to set the timeout yourself, use BeginConnect. Another option would be pinging (if ICMP is enabled on the network).
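A minimal sketch of that pre-check, using TcpClient.BeginConnect with an explicit timeout (the 5-second value and the method name are illustrative):

using System;
using System.Net.Sockets;

static bool IsRpcReachable(string ipAddress, int timeoutMs = 5000)
{
    // Port 135 is the RPC endpoint mapper; if it is unreachable, the
    // later login/IO call would hang on "network path not found".
    using (var client = new TcpClient())
    {
        try
        {
            IAsyncResult result = client.BeginConnect(ipAddress, 135, null, null);
            if (!result.AsyncWaitHandle.WaitOne(timeoutMs))
            {
                client.Close();   // give up on hosts that do not answer in time
                return false;
            }
            client.EndConnect(result);   // throws if the connection was refused
            return true;
        }
        catch (SocketException)
        {
            return false;
        }
    }
}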
Why do you really need to abort the threads anyway? Why not just let them complete normally but ignore the results? (Keep a token to indicate which "batch" of requests it's in, and then remember which batch you're actually interested in at the moment.)
Another option is to keep hold of whatever you're using to make an IO call (e.g. a socket) and close it; that should cause an exception on the thread making the request.
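A rough sketch of that idea, assuming the IO goes through a Socket you control; closing it from a watchdog timer makes the blocked call throw instead of hanging (the buffer size is illustrative):

using System;
using System.Net.Sockets;
using System.Threading;

static byte[] ReceiveWithTimeout(Socket socket, int timeoutMs)
{
    // If the blocking Receive is still pending when the timer fires,
    // Close() makes it throw on the worker thread. Disposing the timer
    // on the way out stops the watchdog once the call completes.
    using (new Timer(_ => socket.Close(), null, timeoutMs, Timeout.Infinite))
    {
        var buffer = new byte[4096];
        try
        {
            int read = socket.Receive(buffer);   // blocks here
            Array.Resize(ref buffer, read);
            return buffer;
        }
        catch (SocketException)          // closed by the watchdog mid-call
        {
            return null;
        }
        catch (ObjectDisposedException)  // ditto, if Close won the race
        {
            return null;
        }
    }
}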
Yet another option is to avoid putting the requests on different threads, but instead use asynchronous IO - you'll still get parallelism, but without tying up threads (just IO completion ports). Also, can't you put a timeout on the IO operation itself? The request should just time out naturally that way.
Related
Does an HttpHandler listen for a disconnect from the browser?
My guess is "no" since it seems to be mostly/only used for dynamic file creation, so why would it?
But I can't find an answer in the docs or goog.
Many thanks in advance!
Background
I'd like to "abort" an HttpHandler because currently, I allow huge excel exports (~150k sql rows, so ~600k html lines). For reasons almost as ridiculous as the code, I have a query that fires for as many sql rows that the user tries to export. As you can imagine, this takes a very long time.
I think I'm getting backed up with worker processes because users probably get frustrated with the lag and try again with a smaller result set. I currently recycle the worker processes automatically every 30 minutes, but I'd rather clean up more quickly.
I don't have time to clean up the SQL right now, so I'd like to just listen for an "abort" from the client and kill the handler if "aborted".
What you're hoping to accomplish by listening for a client connection drop won't really help solve your problem at all. The core of your problem is a long running task being kicked off in an HttpHandler directly.
In this case, even if you could listen for a client disconnect it wouldn't ever be acted upon as your code will be too busy executing to listen for it.
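That said, if the export does write its output in a loop, ASP.NET exposes Response.IsClientConnected, which you can poll between chunks; a sketch, where HasMoreRows and WriteNextChunk are hypothetical helpers standing in for your export logic:

using System.Web;

public class ExportHandler : IHttpHandler
{
    public bool IsReusable { get { return false; } }

    public void ProcessRequest(HttpContext context)
    {
        while (HasMoreRows())                          // hypothetical helper
        {
            if (!context.Response.IsClientConnected)   // browser gave up?
                return;                                // abandon the export
            WriteNextChunk(context.Response);          // hypothetical helper
            context.Response.Flush();                  // keeps the check fresh
        }
    }

    private bool HasMoreRows() { return false; }
    private void WriteNextChunk(HttpResponse response) { }
}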
The only way to properly determine progress and perform actions during long-running processes like this is to make sure your code is multi-threaded. The problem with doing that in ASP.NET for long-running work is that it sucks up threads from the thread pool needed to serve your pages. This could result in your website hanging or responding very slowly, as you've been experiencing.
I would recommend writing a Windows Service to handle these long running jobs and having it spit the results into a staging directory. I would then use MSMQ or similar to throw the request to the service for processing.
Ultimately, you want to get this long running thread outside of ASP.NET where you can take advantage of the benefits that multi-threading can offer you. Such as, the ability to report back the progress and to abort when needed.
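As an illustration of that hand-off, a minimal sketch using System.Messaging; the queue path, the message format, and the ExportQueue name are all assumptions:

using System.Messaging;   // reference System.Messaging.dll

public static class ExportQueue
{
    const string Path = @".\private$\exportJobs";   // illustrative queue path

    // Called from the HttpHandler: enqueue the request and return at once;
    // the Windows Service reads the queue and does the slow work.
    public static void Enqueue(int userId, string query)
    {
        if (!MessageQueue.Exists(Path))
            MessageQueue.Create(Path);

        using (var queue = new MessageQueue(Path))
        {
            // The default XmlMessageFormatter handles simple string bodies.
            queue.Send(userId + "|" + query, "export-request");
        }
    }
}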
I need to setup an automated task that runs every minute and sends emails in the queue. I'm using ASP.NET 4.5 and C#. Currently, I use a scheduler class that starts in the global.asax and makes use of caching and cache callback. I've read this leads to several problems.
The reason I did it that way is because this app runs on multiple load balanced servers and this allows me to have the execution in one place and the code will run even if one or more servers are offline.
I'm looking for some direction to make this better. I've read about Quartz.NET but never used it. Does Quartz.NET call methods from the application? or from a windows service? or from a web service?
I've also read about using a Windows service, but as far as I can tell, those are installed directly to a server. The thing is, I need the task to execute regardless of how many servers are online, and I don't want to duplicate it. For example, if I have a scheduled task set up on server 1 and server 2, they would both run together, thereby duplicating the requests. However, if server 1 was offline, I'd need server 2 to run the task.
Any advice on how to move forward here, or is the global.asax method the best way for the multi-server environment? BTW, the web servers are running Windows Server 2012 with IIS 8.
EDIT
In response to a request for more information: the queue is stored in a database. I should also mention that the database servers are separate from the web servers. There are two database servers, but only one runs at a time; there is central storage they both read from, so there is only one instance of the database. When one database server goes down, the other comes online.
That being said, would it make more sense to put a Windows Service deployed to both database servers? That would make sure only one runs at a time.
Also, what are your thoughts about running Quartz.NET from the application? As millimoose mentions, I don't necessarily need it running on the web front end; however, doing so saves me from deploying a Windows service to multiple machines, and I don't think there would be a performance difference either way. Thoughts?
Thanks everyone for the input so far. If any additional info is needed, please let me know.
I have had to tackle the exact problem you're facing now.
First, you have to realize that you absolutely cannot reliably run a long-running process inside ASP.NET. If you instantiate your scheduler class from global.asax, you have no control over the lifetime of that class.
In other words, IIS may decide to recycle the worker process that hosts your class at any time. At best, this means your class will be destroyed (and there's nothing you can do about it). At worst, your class will be killed in the middle of doing work. Oops.
The appropriate way to run a long-lived process is by installing a Windows Service on the machine. I'd install the service on each web box, not on the database.
The Service instantiates the Quartz scheduler. This way, you know that your scheduler is guaranteed to continue running as long as the machine is up. When it's time for a job to run, Quartz simply calls a method on an IJob class that you specify.
class EmailSender : Quartz.IJob
{
    public void Execute(JobExecutionContext context)
    {
        // send your emails here
    }
}
Keep in mind that Quartz calls the Execute method on a separate thread, so you must be careful to be thread-safe.
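For completeness, here is roughly how the service might wire the job up, assuming Quartz.NET 2.x (note that in 2.x the Execute parameter is IJobExecutionContext rather than the JobExecutionContext shown above); the identities and the one-minute interval are illustrative:

using Quartz;
using Quartz.Impl;

// Typically run from the Windows Service's OnStart.
IScheduler scheduler = StdSchedulerFactory.GetDefaultScheduler();
scheduler.Start();

IJobDetail job = JobBuilder.Create<EmailSender>()
    .WithIdentity("emailSender")
    .Build();

ITrigger trigger = TriggerBuilder.Create()
    .WithIdentity("everyMinute")
    .StartNow()
    .WithSimpleSchedule(s => s.WithIntervalInMinutes(1).RepeatForever())
    .Build();

scheduler.ScheduleJob(job, trigger);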
Of course, you'll now have the same service running on multiple machines. While it sounds like you're concerned about this, you can actually leverage this into a positive thing!
What I did was add a "lock" column to my database. When a send job executes, it grabs a lock on specific emails in the queue by setting the lock column: generate a GUID, and then:
UPDATE EmailQueue SET Lock=someGuid WHERE Lock IS NULL LIMIT 1;
SELECT * FROM EmailQueue WHERE Lock=someGuid;
In this way, you let the database server deal with the concurrency. The UPDATE query tells the DB to assign one email in the queue (that is currently unassigned) to the current instance. You then SELECT the locked email and send it. Once sent, delete the email from the queue (or however you handle sent email), and repeat the process until the queue is empty.
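A sketch of that claim-send-delete loop in ADO.NET, using the T-SQL equivalent of the queries above (TOP (1) in place of LIMIT 1); the Recipient/Body columns and the send step are illustrative:

using System;
using System.Data.SqlClient;

static void DrainQueue(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        while (true)
        {
            Guid lockId = Guid.NewGuid();

            // Claim one unassigned email; TOP (1) is T-SQL's LIMIT 1.
            var claim = new SqlCommand(
                "UPDATE TOP (1) EmailQueue SET [Lock] = @lock WHERE [Lock] IS NULL",
                conn);
            claim.Parameters.AddWithValue("@lock", lockId);
            if (claim.ExecuteNonQuery() == 0)
                break;   // nothing left to claim: the queue is empty

            var fetch = new SqlCommand(
                "SELECT Id, Recipient, Body FROM EmailQueue WHERE [Lock] = @lock",
                conn);
            fetch.Parameters.AddWithValue("@lock", lockId);
            using (SqlDataReader email = fetch.ExecuteReader())
            {
                while (email.Read())
                {
                    // Send(email) -- hypothetical send step
                }
            }

            // Sent: remove it (or mark it, however you handle sent email).
            var done = new SqlCommand(
                "DELETE FROM EmailQueue WHERE [Lock] = @lock", conn);
            done.Parameters.AddWithValue("@lock", lockId);
            done.ExecuteNonQuery();
        }
    }
}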
Now you can scale in two directions:
By running the same job on multiple threads concurrently.
By virtue of the fact this is running on multiple machines, you're effectively load balancing your send work across all your servers.
Because of the locking mechanism, you can guarantee that each email in the queue gets sent only once, even though multiple threads on multiple machines are all running the same code.
In response to comments: There are a few differences in the implementation I ended up with.
First, my ASP application can notify the service that there are new emails in the queue. This means that I don't even have to run on a schedule; I can simply tell the service when to start work. However, this kind of notification mechanism is very difficult to get right in a distributed environment, so simply checking the queue every minute or so should be fine.
The interval you go with really depends on the time sensitivity of your email delivery. If emails need to be delivered ASAP, you might need to trigger every 30 seconds or even less. If it's not so urgent, you can check every 5 minutes. Quartz limits the number of jobs executing at once (configurable), and you can configure what should happen if a trigger is missed, so you don't have to worry about having hundreds of jobs backing up.
Second, I actually grab a lock on 5 emails at a time to reduce query load on the DB server. I deal with high volumes, so this helped efficiency (fewer network roundtrips between the service and the DB). The thing to watch out for here is what happens if a node goes down (for whatever reason, from an exception to the machine itself crashing) in the middle of sending a group of emails. You'll end up with "locked" rows in the DB and nothing servicing them; the larger the group, the bigger this risk. Also, an idle node obviously can't work on anything if all remaining emails are locked.
As far as thread safety, I mean it in the general sense. Quartz maintains a thread pool, so you don't have to worry about actually managing the threads themselves.
You do have to be careful about what the code in your job accesses. As a rule of thumb, local variables should be fine. However, if you access anything outside the scope of your function, thread safety is a real concern. For example:
class EmailSender : IJob
{
    static int counter = 0;

    public void Execute(JobExecutionContext context)
    {
        counter++; // BAD!
    }
}
This code is not thread-safe because multiple threads may try to access counter at the same time.
Thread A            Thread B
Execute()
                    Execute()
Get counter (0)
                    Get counter (0)
Increment (1)
                    Increment (1)
Store value
                    Store value

counter = 1
counter should be 2, but instead we have an extremely hard-to-debug race condition. Next time this code runs, it might happen this way:
Thread A            Thread B
Execute()
                    Execute()
Get counter (0)
Increment (1)
Store value
                    Get counter (1)
                    Increment (2)
                    Store value

counter = 2
...and you're left scratching your head why it worked this time.
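If you genuinely do need a shared counter, the fix is Interlocked.Increment, which makes the read-modify-write a single atomic operation:

using System.Threading;

class EmailSender : Quartz.IJob
{
    static int counter = 0;

    public void Execute(JobExecutionContext context)
    {
        Interlocked.Increment(ref counter);   // atomic read-modify-write
    }
}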
In your particular case, as long as you create a new database connection in each invocation of Execute and don't access any global data structures, you should be fine.
You'll have to be more specific about your architecture. Where is the email queue; in memory or a database? If they exist on a database, you could have a flag column named "processing" and when a task grabs an email from the queue it only grabs emails that are not currently processing, and sets the processing flag to true for emails it grabs. You then leave concurrency woes to the database.
Before I go into this question, I'd like to say that I have read the threading model for IIS 7 and 7.5, so I know how threads are handled.
My application starts a thread when a request comes in.
Think of the threads as cron jobs.
A GET request comes in, let's say /Handle.
In the scope of /Handle I start a thread from that action: THREAD A.
I am not long-polling the GET request, so it returns to the user right away, and the thread handling the GET is returned to the POOL.
Then I wait until THREAD A completes before doing anything else.
So no threads are running as far as I know; both the thread that was handling the GET and THREAD A have exited.
I make the same request a few times SEQUENTIALLY, always waiting for both threads to exit.
After a while, the Thread.Start() call blocks.
Questions:
I know that the threads are returning and I am not leaking any ghost threads.
Why is IIS not allowing me to start new threads after 4-5 requests?
What is the right way to create an application thread for the user application?
If I say Thread t = new Thread(), does this allocate a thread from the pool that handles the GETs, or from the CLR?
I am using IIS7.
I know that each thread exits; I call Join on THREAD A and it never blocks. At this point I am not worried about scalability, so I always have ONE user hitting the server sequentially.
So to answer your question "What is the right way to create application thread for the user application?" (i.e. ASP.NET application) - You have many options:
run on the ASP.NET thread, without any threading - ASP.NET will still handle more than one request
use async calls (see async operations) for long running operations
use the CLR ThreadPool (see the sketch after this list)
send a message to some other server (e.g. using WCF services), so the long running processing takes place outside the Web server.
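As a sketch of the thread-pool option, where DoLongRunningWork is a hypothetical stand-in for your per-request job:

using System;
using System.Threading;

static void HandleGet()
{
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try
        {
            DoLongRunningWork();
        }
        catch (Exception)
        {
            // Log and swallow: an unhandled exception on a pool thread
            // tears down the whole worker process.
        }
    });
    // Return to the client immediately; the pool thread is recycled
    // when the work completes.
}

static void DoLongRunningWork() { }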
You mentioned reading about threading in ASP.NET, but in "MSDN: Performing Asynchronous Work, or Tasks, in ASP.NET Applications" there's a relatively short description of how threading in ASP.NET works. At the end of the post, there's a question:
"Q4: Should I create my own threads (new Thread)?" and the answer for that question is "A4) Please don’t (create new Threads). Or to put it a different way, no!!! (...) ".
And to answer your question: "Why is IIS not allowing me to start new threads after 4-5 requests?"
That's really strange behaviour; maybe IIS knows that you are doing it wrong ;)
A client application initiates a process on a server (via RIA, but the implementation is not important). When I say process, I mean business code running, not an actual process running on the CPU.
Code is C#.
The client then checks the process status: failed, completed, or still running.
The basics are relatively easy to implement. I store a process id statically on the server; the client can poll the server regularly, and the server checks the status associated with that process id.
Edge cases around this require a bit more work: fatal and catastrophic failures that abnormally abort the thread (process) without allowing the code to handle the exception and gracefully set the status to failed. In this scenario, the client will continue to assume the process is still in progress.
I was thinking of running the process in a separate thread and tracking the thread id. When the client calls the server to check the process state, we could check the IsAlive property of the thread running the process.
I am wondering whether there are any scenarios where this could be problematic. For example, IsAlive might return true even though the thread is hung.
Another approach would be to have the process on the server periodically write a timestamp that could be checked when the client asks for the state. The state-checking code could see how old the timestamp is and, based on whatever interval we chose (say, 2 minutes), decide whether the process is still running (less than 2 minutes since the last timestamp) or has timed out without an exception (more than 2 minutes since the thread wrote the last timestamp). All timestamps would be kept in memory.
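A minimal in-memory sketch of that heartbeat variant, assuming process ids are Guids and using the 2-minute staleness window from above (names are illustrative):

using System;
using System.Collections.Concurrent;

static class ProcessHeartbeats
{
    static readonly ConcurrentDictionary<Guid, DateTime> beats =
        new ConcurrentDictionary<Guid, DateTime>();

    // The worker calls this periodically while the business process runs.
    public static void Beat(Guid processId)
    {
        beats[processId] = DateTime.UtcNow;
    }

    // The status check uses this when the client polls.
    public static bool LooksAlive(Guid processId)
    {
        DateTime last;
        if (!beats.TryGetValue(processId, out last))
            return false;   // unknown id: never started or already cleaned up
        return DateTime.UtcNow - last < TimeSpan.FromMinutes(2);
    }
}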
Are there any best practices around this that could be beneficial? Does anyone have special insight or tips on how best to approach it? I am also open to other scenarios or ideas.
I am sure the better approach is to have a fully controlled server app that handles everything, including exceptions, for each business process and maintains the state of them all. Even better, have the client receive a StateChanged event rather than polling (with WCF duplex channels, for example). What you are doing is basically 1999; .NET gives you all this almost for free, and the correct architecture is actually faster to write and cheaper to support over time.
Here's the situation: I am writing the framework for a code war contest. As the code runs, each turn it calls a method in the library provided by each contestant. The rule of the contest is that the method must return within 1 second, or we kill the task calling it and use a default result for that turn.
The method has no support for cancellation, because we cannot trust the called code to respond to a cancel. And we need to kill the thread, because if we have 10 or 20 ignored background tasks, every call going forward gets fewer clock cycles, and methods that previously took less than 1 second now take more.
On the plus side, the method we're killing should have no resources open, etc. so an abort should not leave anything hanging.
Update: Two things to keep in mind here. First, this is like a game - so performance is important. Second, the worker thread is unlikely to have any resources open. If one of the called methods goes overlong, I need to abort it and move on quickly.
You should run each contestant in his own AppDomain with low privileges. This has several advantages:
It's sandboxed
It can't interact with any other code in the process
Force unloading an AppDomain is relatively clean.
Even if you prefer killing the thread over unloading the AppDomain I'd still put each contestant into an AppDomain to get the isolation.
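A rough sketch of creating and tearing down such a sandbox; the IContestant contract and the "Contestant"/"Contestant.EntryPoint" names are hypothetical:

using System;
using System.Security;
using System.Security.Permissions;

public interface IContestant { int TakeTurn(); }   // hypothetical contract

static int RunTurnSandboxed(string contestantDir)
{
    // Execute-only permissions: the contestant code can run but cannot
    // touch the file system, network, and so on.
    var setup = new AppDomainSetup { ApplicationBase = contestantDir };
    var permissions = new PermissionSet(PermissionState.None);
    permissions.AddPermission(
        new SecurityPermission(SecurityPermissionFlag.Execution));

    AppDomain sandbox = AppDomain.CreateDomain("contestant", null, setup, permissions);
    try
    {
        // The contestant type must derive from MarshalByRefObject so that
        // only a proxy crosses the AppDomain boundary.
        var player = (IContestant)sandbox.CreateInstanceAndUnwrap(
            "Contestant", "Contestant.EntryPoint");
        return player.TakeTurn();   // wrap with your 1-second watchdog
    }
    finally
    {
        AppDomain.Unload(sandbox);  // aborts any threads still running inside
    }
}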
Unfortunately Thread.Abort is not enough. It still executes finally clauses which can take as long as they want.
I would recommend that you run the code in a second process and carefully define the interface for communicating with it, to ensure that it can handle not receiving a response. Most operating systems are designed to clean up fairly well after killing a process.
For communication, you should probably avoid .NET remoting, as that seems likely to be left in an inconsistent state on the server side. Some other choices: sockets, named pipes, web service.
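A rough sketch of the out-of-process approach with a kill on timeout; "ContestantHost.exe" (a host that runs one turn and prints a small result to stdout) is an illustrative name:

using System.Diagnostics;

static string RunTurn(string args, int timeoutMs)
{
    var psi = new ProcessStartInfo("ContestantHost.exe", args)
    {
        UseShellExecute = false,
        RedirectStandardOutput = true
    };

    using (Process host = Process.Start(psi))
    {
        if (!host.WaitForExit(timeoutMs))
        {
            host.Kill();    // the OS reclaims everything the process held
            return null;    // caller substitutes the default turn result
        }
        // Fine for small payloads; large output would need async reads
        // to avoid filling the stdout buffer before the wait completes.
        return host.StandardOutput.ReadToEnd();
    }
}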
The Thread.Interrupt() method may be what you are looking for.
As the MSDN documentation says, "If this thread is not currently blocked in a wait, sleep, or join state, it will be interrupted when it next begins to block."
It is not an abort; it forces the running thread to throw a ThreadInterruptedException when the thread next enters a wait state.
You can then use a timer on another thread to check whether the thread really refuses to terminate; if it has not terminated within, say, 30 seconds, you can abort it.
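A minimal sketch of that escalation (interrupt first, abort only as a last resort; the 30-second grace period is illustrative):

using System;
using System.Threading;

static void StopPolitelyThenForcefully(Thread worker)
{
    worker.Interrupt();   // throws ThreadInterruptedException at its next wait
    if (!worker.Join(TimeSpan.FromSeconds(30)))
    {
        worker.Abort();   // last resort if it never reached a wait state
    }
}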