Does Server.ClearError() prevent IIS Rapid Fail Protection - c#

In our ASP.Net application we usually try to handle all our exceptions by catching them in the relevant places so we can give the end user useful error messages, but some exceptions are impossible for us to catch because of where they are thrown.
This is an issue for our server setup, since we want IIS Rapid Fail Protection to keep working as intended and all errors to be written to our custom error log. To avoid unexpected resets of the server and flooding of our error log, I have added some code in Global.asax.cs to suppress certain kinds of errors. At the moment we are looking at two kinds of HttpException thrown by IIS itself: one that rejects overly long URLs (based on the maxUrlLength setting) and one that rejects faulty WebResource or ScriptResource requests. We cannot prevent these, because they are generated by web crawlers.
What I'm interested in knowing, and have had difficulty finding information on anywhere, is:
1. Can the referenced HttpExceptions even potentially cause Rapid Fail Protection to restart the server? I'm told that any uncaught exception can cause it, but it seems illogical to me that this kind of exception should be able to cause it.
2. If I call Server.ClearError() in the Application_Error() event, is that enough to suppress errors that could cause a Rapid Fail Protection restart? Or is it already too late at this point, since we're already in the process of responding to an unhandled exception?

The Rapid-Fail Protection (here, RFP) feature is meant to protect the system from application pools and worker processes that are not starting properly or are failing often. These issues could be caused by your application(s) or an IIS worker process. The official (albeit old) list of causes can be found here.
Not directly. If the logic that is attempting to handle the error fails, the worker process could crash. This would trigger RFP. Usually, this will not happen because IIS will try to handle an exception in Application_Error.
If your application has gracefully handled the exception in Application_Error, then it stops there. Your exception was "unhandled" at the application level, but IIS was able to handle it (usually serving the "yellow screen of death"). Therefore, the worker process is still healthy and RFP will not be triggered.
I have seen an IIS worker process crash under the following conditions:
A recursive call that never terminates (infinite recursion).
Insufficient system resources to process a request (out of memory or memory limit reached).
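To make the Application_Error discussion concrete, here is a minimal sketch of the kind of filter the question describes. It is not the asker's actual code: the IsCrawlerNoise helper is hypothetical, and matching on status codes 400 and 404 is an assumption about how the overly long URL and invalid WebResource requests surface, so verify it against your own logs before relying on it.

```csharp
using System;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_Error(object sender, EventArgs e)
    {
        Exception error = Server.GetLastError();
        var httpError = error as HttpException;

        if (httpError != null && IsCrawlerNoise(httpError))
        {
            // Remember the status before clearing, then mark the error as handled
            // so it is neither logged nor turned into an error page.
            int status = httpError.GetHttpCode();
            Server.ClearError();
            Response.StatusCode = status;
            return;
        }

        // Everything else falls through to the custom error log (not shown).
    }

    // Hypothetical helper. Assumption: the crawler-generated requests
    // surface as HttpExceptions with status 400 or 404.
    private static bool IsCrawlerNoise(HttpException error)
    {
        int status = error.GetHttpCode();
        return status == 400 || status == 404;
    }
}
```

Keeping the handler this small matters for the reason given above: if the error-handling logic itself fails badly enough to take down the worker process, that crash is what counts toward Rapid-Fail Protection.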

Related

Global exception handler in ASP.Net Core (Not UI)

I am working on an application that will run in Kubernetes. Kubernetes relies on the application to know if it is healthy or not.
So, I need to know when I get a critical exception thrown. By "Critical", I mean Out of Memory, Stack Overflow, etc. Things that mean that the container should be killed.
I have seen things in ASP.Net Core that allow you to show an error page when an exception happens, but I need this to work with both UI and Web API applications. And I don't really want it to interact with my UI at all (on the ones that have a UI).
Is there an event (or something similar) that is raised when an exception was thrown in an ASP.Net Core application?
A .NET application will not be able to handle “critical” issues like memory issues or stack overflows in a way that it can report about its own health. There are basically two possible outcomes with unexpected errors: The application can handle the problem, in which case the ASP.NET Core application is expected to work properly for future requests, or the process terminates abruptly.
Observing the latter should be done from the outside. You can do this for example by checking if the process is still alive in your container.
Another option would be to employ health checks, which are a way for an ASP.NET Core application to report on its own health:
Health probes can be used by container orchestrators and load balancers to check an app's status. For example, a container orchestrator may respond to a failing health check by halting a rolling deployment or restarting a container. A load balancer might react to an unhealthy app by routing traffic away from the failing instance to a healthy instance.
So your container orchestrator could check whether the ASP.NET Core application is still able to respond to a health probe and, if it isn't, assume that the application crashed in some way or another and requires a container restart.
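As a rough illustration of the health-check option, the sketch below registers the built-in health check services and exposes a probe endpoint. It assumes .NET 6 or later with the minimal hosting model, and the "/healthz" path is an arbitrary choice, not something the answer specifies.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Registers the health check services; with no checks added,
// the endpoint simply reports Healthy while the process can respond.
builder.Services.AddHealthChecks();

var app = builder.Build();

// Assumed path: point the orchestrator's probe at whatever path you choose here.
app.MapHealthChecks("/healthz");

app.Run();
```

If the process dies from a stack overflow or runs out of memory, nothing answers on this endpoint any more, and that missing response is the signal the orchestrator acts on.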

Stop website because of startup errors

If my web service runs into some initialization error during startup, such as not being able to connect to its database, is it possible for me to stop the website? This would make it easier for administrators to see that something has gone wrong without checking the log files, and would be analogous to terminating an application early if the command line parameters are wrong. One more caveat: I can't use Microsoft.Web.Administration.
You could raise a specific type of Exception for those fatal errors and handle it in Application_Error.
When those catastrophic events happen and are handled there, you could just generate an App_Offline.htm file, which will bring the website down without physically stopping it in IIS.
This will make your website return 503 Service Unavailable, which is probably the most semantically correct way for a web service to communicate that it is down because of a catastrophic event.
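A minimal sketch of that approach might look like the following. FatalStartupException is a hypothetical exception type introduced for illustration, and the sketch assumes the application pool identity has write permission on the site root.

```csharp
using System;
using System.IO;
using System.Web;
using System.Web.Hosting;

// Hypothetical marker type for errors that should take the site offline.
public class FatalStartupException : Exception
{
    public FatalStartupException(string message, Exception inner)
        : base(message, inner) { }
}

public class Global : HttpApplication
{
    protected void Application_Error(object sender, EventArgs e)
    {
        Exception last = Server.GetLastError();
        // Unwrap to the innermost exception, since ASP.NET may wrap the original.
        Exception root = last == null ? null : last.GetBaseException();

        if (root is FatalStartupException)
        {
            // While App_Offline.htm exists in the application root, ASP.NET shuts
            // the application down and serves this file instead of handling requests.
            string path = HostingEnvironment.MapPath("~/App_Offline.htm");
            File.WriteAllText(path,
                "<html><body>Service unavailable: " +
                HttpUtility.HtmlEncode(root.Message) + "</body></html>");
        }
    }
}
```

The file has to be deleted (manually or by a deployment step) before the site will serve requests again, which gives administrators an obvious, visible signal in the site folder.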

ASP application causing Error 500 and causing server to stop frequently

The ASP application running on the same machine as SQL Server is causing the IIS server to stop very frequently. The cause shown in the error log is:
"A significant part of sql server process memory has been paged out. This may result in a performance degradation."
Is there any tool which can identify the fault in the web application?
No. You might be able to play with some settings to get your apps to not crash but in the end, if you have reached your bandwidth cap, you are stuck.
There might not actually be any fault in the web application. Both IIS and SQL Server eat a lot of memory. Source: SQL Server eats RAM for lunch
There might not be anything wrong; you might just be running too much on one machine. You will have to provide an actual error or problem, because right now our only answer can be to use the admin tools and get more memory.
I have found the cause of my problem. For each URL redirection I used the syntax Response.Redirect("/NewPage.aspx"); which ends the response by calling Response.End and therefore raises a ThreadAbortException on every redirect. The fix was: Response.Redirect("/NewPage.aspx", false); which sends the redirect without aborting the thread, so the request completes normally (note that any code after the call still runs). That saved a lot of the memory used by each process!

IIS app pool crashing on Azure load-balanced VMs

We have a new ASP.NET website running on a pair of load balanced Azure VMs. The website is fairly simple and uses Kentico CMS. Twice in the 24 hours since going live the application pool on both web servers has suddenly stopped (within 5-10 minutes of each other) causing 503: Service unavailable errors.
Looking at Windows system logs I see the error which caused the problem:
Application pool '[[NAME]]' is being automatically disabled due to a
series of failures in the process(es) serving that application pool.
Leading up to this are a series of warnings:
A process serving application pool '[[NAME]]' suffered a fatal
communication error with the Windows Process Activation Service. The
process id was '[[PROCESS ID]]'. The data field contains the error
number.
Evidently this is IIS's rapid-fail protection kicking in. What's not clear is how to find the cause of this "fatal communication error".
After some web searching I've installed the Debug Diagnostics Tool which has helped me identify that in every case the relevant process was the IIS worker process (w3wp.exe). This tool is new to me and unfortunately the only time the problem occurred since I installed it, no dumps were generated. However, its logs contain a lot of messages like this:
First chance exception - 0xe0434352 caused by thread with System ID:
[[ID]]
The frustrating thing is that I don't know what steps to take to replicate the error conditions. It never occurred in UAT in a very similar environment, even under load test. Here are some facts about my setup:
ASP.NET version = 4.5.2
Application pool running with identity set to a domain account with modify permission on the website directory
Application set with max one worker process
Any advice much appreciated.
* UPDATE 1 *
I now have DebugDiag dump generated by the "fatal communication error" warning event. Dump summary reads:
Dump Summary
------------
Process Name: w3wp.exe : C:\Windows\SysWOW64\inetsrv\w3wp.exe
Process Architecture: x86
Exception Code: 0xC00000FD
Exception Information: The thread used up its stack.
Heap Information: Present
In the end I tracked this down to a bug in my code. Under very edge-case circumstances the CMS was returning an empty Guid instead of an actual ID which was causing a stack overflow in a recursive method.
The 0xC00000FD exception code I posted above is actually a stack overflow exception, so once I knew that and downloaded the Debug Diagnostics dump file I was able to replicate the crash scenario locally. That tool, by the way, is incredibly powerful and was able to demonstrate the exact conditions of the crash.
All I can say to people who arrive here with a similar issue is: firstly, don't assume the issue is not with your code! And secondly, use Debug Diagnostics.
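The recursive method itself is not shown, so the following is only a hypothetical sketch of how such a lookup can be guarded. It assumes the code walks a chain of parent IDs and that the CMS lookup (represented here by the getParentId delegate, an invented name) can return Guid.Empty or an ID that was already visited. A stack overflow cannot be caught with try/catch in .NET, so the guard has to prevent it.

```csharp
using System;
using System.Collections.Generic;

public static class ParentChain
{
    // Walks from 'start' up through its parents and returns the IDs visited.
    // Stops on Guid.Empty or on a repeated ID instead of recursing forever.
    public static List<Guid> Walk(Guid start, Func<Guid, Guid> getParentId)
    {
        var visited = new HashSet<Guid>();
        var chain = new List<Guid>();
        Guid current = start;

        // HashSet.Add returns false when the ID was already seen (a cycle).
        while (current != Guid.Empty && visited.Add(current))
        {
            chain.Add(current);
            current = getParentId(current);
        }

        return chain;
    }
}
```

Writing the walk as a loop rather than as recursion also means that a very deep but valid chain grows a list instead of the call stack.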
First of all, what are the regular recycle interval and overlapped recycle settings for your app pool in IIS? If these incidents occur when recycling is scheduled and overlapping is disabled, this behavior is to be expected. Even with overlapping enabled, I'd guess it is connected to automatic recycling of the app pool, since both instances are affected at roughly the same time and it happens twice a day. Automatic recycling can also cause the warning you mentioned to be logged (here you might find how to disable logging of this warning in case it is caused by automatic recycling).
If that leads nowhere, you can find more details about the warning event here:
IIS Application Pool Availability
And about the Debug Diagnostics tool here:
How to use the Debug Diagnostics tool to troubleshoot an IIS process that stops unexpectedly

How to find the deadlock in ASP.Net Website

I am using .Net 2.0 and my site seems to reach a deadlock state at certain times. It stops working until I recycle the application pool or change something in the web.config file. I think a deadlock is causing this issue.
I am wondering if there is any tool to debug/check the site to find the code that could be causing the deadlock.
Right now I have had to set the recycling interval to 10 minutes, which is really bad, but it is the only way to work around the problem. There is a lot of code on the site and I need to find the problem. If I use a DOS attack tool, can I find the page/code block that is causing this issue? If I can, what is the best tool to test it?
Cheers!
EDIT
I checked the Event Logs and found the following warning. I don't know if it is the issue; I will keep digging.
Exception information:
Exception type: HttpException
Exception message: Request timed out.
Check the event log
Turn on Health Monitoring
If you use 'Failed Request Tracing', it will produce a nice output that tells you what is causing the error, down to the module level. This gives you a first step towards finding where it is breaking down.
Have a read of this article on iis.net → Troubleshooting Failed Requests Using Tracing in IIS 7
I would attach Visual Studio to IIS and break into the debugger when a deadlock occurs. You can then inspect the call stacks of the running threads.
Code Project has a nice article on how to do IIS remote debugging.
Of course, you can just as well set up a test machine with a local IIS and a local Visual Studio .NET and do this without the need to debug remotely.
