What happens when an Azure Scheduled WebJob throws an unhandled exception?

My job runs every 60 seconds for about twelve hours, then it just stops and I have to start it again manually. The WebJob is a plain console application (CLI) that does not use the WebJobs SDK. It seems as if the WebJob is disabled when it throws an unhandled exception, after which I have to start it manually. Is this the expected behavior?

As we found in the comments, the root issue was that the Web App did not have the Always On option enabled. After enabling it, the issue went away.
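Separately from the Always On fix, it can help to make failures visible in the WebJob log instead of letting the process die on an unhandled exception. The following is a minimal sketch only; JobRunner and the job delegate are hypothetical names, not part of the original poster's code. It assumes a plain console WebJob where a non-zero exit code is what marks a run as failed.

```csharp
using System;

public static class JobRunner
{
    // Runs one iteration of the job. Any exception is written to stderr
    // and turned into a non-zero exit code instead of crashing the process.
    public static int Run(Action job)
    {
        try
        {
            job();
            return 0;
        }
        catch (Exception ex)
        {
            // Hypothetical logging choice: stderr, so the failure is easy to spot.
            Console.Error.WriteLine("Job failed: " + ex);
            return 1;
        }
    }
}
```

A console entry point could then be as small as static int Main() { return JobRunner.Run(DoWork); }, where DoWork stands in for the real job body.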

Related

IIS app pool crashing on Azure load-balanced VMs

We have a new ASP.NET website running on a pair of load balanced Azure VMs. The website is fairly simple and uses Kentico CMS. Twice in the 24 hours since going live the application pool on both web servers has suddenly stopped (within 5-10 minutes of each other) causing 503: Service unavailable errors.
Looking at Windows system logs I see the error which caused the problem:
Application pool '[[NAME]]' is being automatically disabled due to a
series of failures in the process(es) serving that application pool.
Leading up to this are a series of warnings:
A process serving application pool '[[NAME]]' suffered a fatal
communication error with the Windows Process Activation Service. The
process id was '[[PROCESS ID]]'. The data field contains the error
number.
Evidently this is IIS's rapid-fail protection kicking in. What's not clear is how to find the cause of this "fatal communication error".
After some web searching I've installed the Debug Diagnostics Tool which has helped me identify that in every case the relevant process was the IIS worker process (w3wp.exe). This tool is new to me and unfortunately the only time the problem occurred since I installed it, no dumps were generated. However, its logs contain a lot of messages like this:
First chance exception - 0xe0434352 caused by thread with System ID:
[[ID]]
The frustrating thing is that I don't know what steps to take to replicate the error conditions. It never occurred in UAT in a very similar environment, even under load test. Here are some facts about my setup:
ASP.NET version = 4.5.2
Application pool running with identity set to a domain account with modify permission on the website directory
Application set with max one worker process
Any advice much appreciated.
* UPDATE 1 *
I now have a DebugDiag dump generated by the "fatal communication error" warning event. The dump summary reads:
Dump Summary
------------
Process Name: w3wp.exe : C:\Windows\SysWOW64\inetsrv\w3wp.exe
Process Architecture: x86
Exception Code: 0xC00000FD
Exception Information: The thread used up its stack.
Heap Information: Present

In the end I tracked this down to a bug in my code. Under very edge-case circumstances the CMS was returning an empty Guid instead of an actual ID which was causing a stack overflow in a recursive method.
The 0xC00000FD exception code I posted above is actually a stack overflow exception, so once I knew that and downloaded the Debug Diagnostics dump file I was able to replicate the crash scenario locally. That tool, by the way, is incredibly powerful and was able to demonstrate the exact conditions of the crash.
All I can say to people who arrive here with a similar issue is: firstly, don't assume the issue is not with your code! And secondly, use Debug Diagnostics.
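For readers hitting the same kind of bug, here is a rough sketch of guarding a recursive lookup against an empty Guid and against runaway depth. It is not the original code: AncestorWalker, getParentId and MaxDepth are hypothetical names chosen for illustration. The point of the guard is that a StackOverflowException cannot be caught with try/catch in .NET and takes the whole worker process down, so the recursion has to be stopped before the stack runs out.

```csharp
using System;

public static class AncestorWalker
{
    // Hypothetical safety limit; pick a value that fits the real data.
    public const int MaxDepth = 64;

    // Counts the ancestors of a node by following parent IDs.
    // getParentId is assumed to return Guid.Empty when there is no parent.
    public static int CountAncestors(Guid id, Func<Guid, Guid> getParentId, int depth = 0)
    {
        // Fail with a normal, catchable exception long before the stack overflows.
        if (depth > MaxDepth)
            throw new InvalidOperationException("Recursion limit exceeded at " + id);

        // An empty ID is treated as "nothing to walk" rather than looked up.
        if (id == Guid.Empty)
            return 0;

        Guid parent = getParentId(id);
        if (parent == Guid.Empty)
            return 0;

        return 1 + CountAncestors(parent, getParentId, depth + 1);
    }
}
```

With a guard like this, a cycle or a bad ID surfaces as an InvalidOperationException in the application log instead of a crashed w3wp.exe.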

First of all, what are the regular recycle interval and the overlapped recycle settings for your app pool in IIS? If these incidents occur when recycling is scheduled and overlapping is disabled, this behavior is to be expected. Even with overlapping enabled, I'd guess it is connected to automatic recycling of the app pool, since both instances are affected at about the same time and it happens twice a day. Recycling can also cause the warning you mentioned to be logged. (Here you might find how to disable logging of this warning in case it is caused by automatic recycling.)
If that leads nowhere, you can find more details about the warning event here:
IIS Application Pool Availability
And about the Debug Diagnostics tool here:
How to use the Debug Diagnostics tool to troubleshoot an IIS process that stops unexpectedly

C# MVC app on Azure: Site works fine when freshly deployed but fails after being left idle

I have spent a good bit of time researching this but have not found anybody else reporting the issue that I am having.
I deploy my site and everything is good. I share the link with my QA and UI peeps and everything works for them. But if the site goes for a period of time, such as overnight, without being accessed then the web application is unable to start and reports "Method 'get_CurrentUser' in type ... does not have an implementation." Stopping and restarting the site does not resolve the issue but if I simply re-upload the main .dll file (the exact same version of the .dll) to the site then everything works fine again. For a while, at any rate.
Why would it be okay when newly published but fail after sitting for some time? Any suggestions on troubleshooting this would be greatly appreciated.

IIS has introduced a Suspend action for the idle timeout: when the site has been idle for the specified duration, IIS suspends the worker process. I don't know exactly what happens, but it looks like the process is simply hibernated, and waking it up may not always work. You can change the idle timeout action to Terminate instead.
A second option is to keep the site awake with a monitoring service such as Pingdom or Site24x7.

Could it help to try Azure Web App and set Always On?
http://azure.microsoft.com/updates/azure-web-sites-adds-always-on/

It looks like I found a solution.
The app start event in global.asax included "var assemblies = System.Web.Compilation.BuildManager.GetReferencedAssemblies();" in order to make sure that everything loads, but that line was -after- the call that was triggering all of the dependency registration. I cut/pasted that line so that it happens first and the site came up fine when I tried it this morning.

Windows service status appears as Running, but the service is not working

I am not sure this question has a definite answer, but it would be nice to hear some suggestions.
I have a Windows service with two threads. It worked perfectly fine for a while, but it seems to have stopped working last week. When I checked, the service status showed as Running and the startup type was Automatic, but the service didn't pick up data from the service queue. There is no error log. I think a thread stopped, but I don't know why.

Yes. We found that during deployment of the application the service had some dependency on Windows, which caused this issue.
We fixed it by resetting IIS, stopping all services, and then reinstalling them on every deployment.

In our case, the issue was caused by incorrect exception handling.
After the installer installs the service, any request that causes a crash during execution leaves the service showing as Running even though it has stopped working.
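One way to keep a worker thread alive and leave a trace when a single request fails is to catch and log inside the loop rather than around it. This is only a sketch under assumptions: WorkerLoop, processNext and logError are hypothetical names, and the real service would supply its own queue handling and logging.

```csharp
using System;
using System.Threading;

public static class WorkerLoop
{
    // Repeatedly calls processNext until cancellation is requested.
    // A failing iteration is logged and followed by a short pause, so one bad
    // item neither ends the loop nor goes unrecorded.
    // Returns the number of failed iterations.
    public static int Run(Action processNext, Action<Exception> logError,
                          CancellationToken token, TimeSpan retryDelay)
    {
        int failures = 0;
        while (!token.IsCancellationRequested)
        {
            try
            {
                processNext();
            }
            catch (Exception ex)
            {
                failures++;
                logError(ex);
                // Wait before retrying, but wake up early if the service is stopping.
                token.WaitHandle.WaitOne(retryDelay);
            }
        }
        return failures;
    }
}
```

The service's OnStart would start a thread running this loop, and OnStop would cancel the token. With the try/catch inside the loop, an exception shows up in the log instead of silently ending the thread while the service still reports Running.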

Windows 2008 R2 Standard scheduled task stopped working - Last run result 0x2

I have been scratching my head over this for a few days and can't get to the bottom of it.
I have a scheduled task that runs a batch file every morning. The batch file starts a windows service which calls a web service on another server which then performs various tasks after which the service is stopped.
This has been working without issue for the last few months, but starting last week, every morning the scheduled task is triggered at the appointed time, but it's not starting the service and the Last Run Result is 0x2.
I've tried just about everything I could think of, checked 'Run whether user is logged in or not', 'Run with highest privileges'. I've turned on History for scheduled tasks and everything seems to run fine. There are no errors in Event Viewer nor is the service throwing any exceptions. Running the batch file manually executes just fine.
In the end I deleted and recreated the scheduled task which resolved the issue. Today this error started occurring again. I've been unable to get solid info on what exactly 0x2 means. Does any one have any more information on what could be the cause of this issue?
I've come to believe that the issue lies with the server and not the service. I exported the task from Task Scheduler and imported it again with no changes, and it's been running fine over the weekend. Apparently some server maintenance was carried out around the time the issue started occurring, so I will investigate further with our server management department.

0x2 result means that the file could not be found.
Make sure that when the Task starts (which is apparently the case), the batch file exists and is accessible.
See the system error codes here on MSDN.
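If you want to look up such a code yourself, the sketch below asks the operating system for its message. It assumes the Last Run Result really is a Win32 system error code, as this answer suggests; it could also be an exit code returned by the batch file itself. Win32Error is a hypothetical helper name.

```csharp
using System;
using System.ComponentModel;

public static class Win32Error
{
    // Returns the operating system's message text for a system error code.
    // Win32Exception looks the text up when constructed with a numeric code.
    public static string Describe(int code)
    {
        return new Win32Exception(code).Message;
    }
}
```

On Windows, Win32Error.Describe(2) returns the text for ERROR_FILE_NOT_FOUND. For comparison, ERROR_ACCESS_DENIED is code 5.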

As explained here 0x2 could also mean "Access denied". So maybe you are facing a permission issue.
You can try to configure the Scheduled Task to Run with highest privileges in the Properties dialog of the task (General tab).

How to find the deadlock in ASP.Net Website

I am using .NET 2.0 and my site seems to reach a deadlocked state from time to time. It stops working until I recycle the application pool or change something in the web.config file. I think a deadlock is causing this issue.
I am wondering if there is any tool to debug or check the site to find the code that could be causing the deadlock.
Right now I have had to set the recycling interval to 10 minutes, which is really bad, but it is the only way to work around the problem. There is a lot of code on the site and I need to find the cause. If I use a DoS attack tool, can I find the page or code block that is causing this issue? If so, what is the best tool to test it with?
Cheers!
EDIT
I checked the Event Logs and found the following warning. I don't know whether it is the issue, so I will keep digging.
Exception information:
Exception type: HttpException
Exception message: Request timed out.

Check the event log
Turn on Health Monitoring

If you use 'Failed Request Tracing', it will produce a nice output that tells you what is causing the error, down to the module level. This gives you a first step towards finding where it's breaking down.
Have a read of this article on iis.net → Troubleshooting Failed Requests Using Tracing in IIS 7

I would attach visual studio to IIS and break the debugger when a deadlock occurs. You can then inspect the call stack of the running threads.
Code Project has a nice article on how to do IIS remote debugging.
Of course, you can just as well set up a test machine with a local IIS and a local Visual Studio .NET and do this without the need to debug remotely.
