WCF Service hard crashing Windows 2016

I have a web service running in IIS 10 on a Windows Server 2016 instance within a VM hypervisor. A separate scheduled task calls functions of that web service during off-peak times to retrieve status updates from a third-party system. The scheduled task splits the items whose statuses need to be pulled into small batches and calls a function that retrieves and updates the records in parallel via Tasks, returning once all Tasks have completed.
Sometimes (every third time?), during this scheduled task, the app pool that the service is running on hangs. Log4Net stops logging, requests to the service do not get a response, IIS logging for the service is not updated with requests. There are no errors recorded in either my logs or in the windows event logs. When this happens, the app pool will stay in this state indefinitely. If I recycle the App Pool that the service is running on, the service will respond normally for ~30 seconds, and then the server will do a hard restart.
After the restart the event logs show the below error:
The computer has rebooted from a bugcheck. The bugcheck was: 0x00000139 (0x0000000000000003, 0xffffd60019506680, 0xffffd600195065d8, 0x0000000000000000).
The dmp file that is generated shows the same error code and identifies the file as ntoskrnl.exe.
All drivers are fully up to date. I have made sure all tasks and requests have timeouts. I have increased server resources past the point where that could be the cause. I have adjusted the batch size of items being processed.
I am out of troubleshooting ideas and would appreciate any help I can get.
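For readers trying to picture the setup, the batching pattern described above might look roughly like the sketch below. This is not the asker's actual code: `StatusBatcher`, `ToBatches`, `ProcessBatchAsync` and the `fetchStatus` delegate are hypothetical names, and the third-party call is represented only by a delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class StatusBatcher
{
    // Splits the ids into consecutive batches of at most batchSize items.
    public static List<List<int>> ToBatches(IEnumerable<int> ids, int batchSize)
    {
        return ids
            .Select((id, index) => new { id, index })
            .GroupBy(x => x.index / batchSize)
            .Select(g => g.Select(x => x.id).ToList())
            .ToList();
    }

    // Starts one task per item and returns once all of them have completed.
    // Throws TimeoutException if the batch as a whole exceeds the timeout.
    public static async Task<string[]> ProcessBatchAsync(
        IReadOnlyList<int> batch,
        Func<int, CancellationToken, Task<string>> fetchStatus, // hypothetical third-party call
        TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource())
        {
            Task<string[]> all = Task.WhenAll(batch.Select(id => fetchStatus(id, cts.Token)));
            Task finished = await Task.WhenAny(all, Task.Delay(timeout, cts.Token));

            if (finished != all)
            {
                cts.Cancel(); // ask the outstanding calls to stop
                throw new TimeoutException("Batch did not complete within " + timeout);
            }

            cts.Cancel();     // stop the pending delay timer
            return await all; // results are in the same order as the batch
        }
    }
}
```

The overall timeout is enforced with `Task.WhenAny` so that the batch returns even if an individual call ignores its cancellation token.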

I figured I would close this out in case anyone else has this very specific issue.
Digging through the dump, BHDRVX64.SYS (Symantec Antivirus) was on the stack immediately before the crash.
Four days later, Symantec pushed an update (https://support.symantec.com/en_US/article.INFO4367.html) with a fix for the issue.
If you hit a similar issue, start by uninstalling the antivirus and seeing whether the issue persists. After that, work through the list of kernel-level filter drivers returned by the 'fltmc' command in an admin command prompt.

Related

100 % CPU by IIS process by Service Model-4 module after the wcf request has passed

I have been fighting a problem for a week now. I have a WCF service running in IIS 8.5 on Windows Server 2012 R2 and a Windows service client that makes one or two requests every 30 seconds. At some point (usually within two hours of the service starting) one of the requests causes the service's app pool process (the app pool is separate from the other app pools) to start consuming CPU. The IIS worker process section shows that this request never ends and is hanging in the ServiceModel-4 module in the AuthenticateRequest state (i.e. it is most likely in an infinite loop somewhere). Later another such request joins the first one, until there are four of them, staying forever and causing 100 % CPU usage (there are 4 logical processors on the machine). What I did to investigate and fix this problem:
Used WCF tracing and custom logging to determine where the problem is. WCF tracing actually shows that all the requests made to the server completed successfully in milliseconds (!) (at the same time, WCF tracing on the client side of course shows timeouts on the same requests). Custom logging also shows that the service code reaches the return of the requested operation. The result of the method is two simple DTO objects, so there is no obvious serialization issue, and there are no endpoint behaviors or any other custom code executing before the reply is sent from the service (except the method code, which, as I mentioned, returns successfully).
Used IIS failed request tracing, which shows the request reaching ServiceModel-4 and not continuing, with the following information:
ModuleName : ServiceModel-4.0
Notification: AUTHENTICATE_REQUEST
HttpStatus: 500
HttpReason: Internal Server Error
HttpSubStatus: 0
ErrorCode: The operation completed successfully (0x0)
Used DebugDiag to trace requests running for more than 10 minutes and saw the long-running threads. The stack trace is as follows:
or as follows:
I've seen that these are called from the IIS process. Since these are .NET functions, I first suspected a corrupted .NET installation; moreover, both .NET 4.5 and .NET 4 were installed on the server (I don't know exactly how that could happen). So:
I uninstalled .NET 4, turned off the .NET 4.5 features from "Turn Windows features on or off", restarted, then turned them back on and restarted again, without success.
After that I reinstalled IIS the same way (from Windows features). Again no success.
I don't have any more ideas.

It seems I have found the answer (although I haven't used dotTrace or other tools). A generic Dictionary was being accessed from multiple threads. This seems to be a known problem:
https://blogs.msdn.microsoft.com/tess/2009/12/21/high-cpu-in-net-app-using-a-static-generic-dictionary/
https://blogs.msdn.microsoft.com/asiatech/2009/05/11/asp-net-application-100-cpu-caused-by-system-collections-generic-dictionary/
Actually, I noticed this problem at the beginning of the research but ruled it out because I couldn't reproduce it (probably because I wasn't testing the dictionary inside an IIS app; I did get various exceptions, but not 100 % CPU), and mainly because all the logs showed that the code accessing the dictionary had completed. The stack trace above also has nothing to do with the dictionary.
However, I think the problem happened during the serialization of this dictionary (which is a data contract), which would explain the logged information.
I still cannot explain exactly how this happens. If anyone can explain it, I think it would be useful knowledge for everyone.
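A minimal sketch of one common way to avoid this class of problem, assuming the shared dictionary can be replaced by a `ConcurrentDictionary` and that callers serialize a point-in-time copy rather than the live collection. `SharedCache` and its members are hypothetical names, not taken from the service in question.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class SharedCache
{
    // Thread-safe replacement for a static Dictionary<int, string> shared across requests.
    private static readonly ConcurrentDictionary<int, string> Items =
        new ConcurrentDictionary<int, string>();

    // Safe to call from many threads at once.
    public static string GetOrAdd(int key)
    {
        return Items.GetOrAdd(key, k => "value-" + k);
    }

    // ToArray() takes a consistent snapshot; the copy can then be returned or
    // serialized while other threads keep modifying the live collection.
    public static Dictionary<int, string> Snapshot()
    {
        return Items.ToArray().ToDictionary(p => p.Key, p => p.Value);
    }

    // Small demonstration: concurrent writers, no corruption.
    public static int FillInParallel(int count)
    {
        Parallel.For(0, count, i => GetOrAdd(i));
        return Items.Count;
    }
}
```

Guarding every read and write of a plain `Dictionary` with a single `lock` is an equally valid alternative; what matters is that no thread touches the dictionary while another is writing to it.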

First webservice response is slow even with Application Initialization module installed

A test WCF webservice that I have hosted using IIS 7.5 is consistently slow to respond to calls made after a period of inactivity (i.e. the first call of each day).
From researching this topic I gather that the problem of "application warmup" is commonly encountered when using IIS (e.g. see here).
I have taken the usual steps that are recommended to try and mitigate this problem:
Installed the Application Initialization Module.
Disabled the application pool Idle Time-out, and the Regular Recycling Time Interval (i.e. set them to '0').
Edited the applicationhost.config file so that autoStart=True and startMode="alwaysRunning" for the necessary app pool, and preloadEnabled="true" for my application.
With these settings, I expect the application pool to immediately spin up a worker process when IIS is started, and spin up a new worker process when the existing one exits. Also, I expect the application to be loaded within the worker process.
However, for the first call of each day, the logs show the difference in time between the client making a call, and the webservice receiving the call, can be as much as 10 seconds. Subsequent calls are typically handled in well under 2 seconds.
Curiously, the long response time is not reproduced by making a call following an iisreset command. I would expect that such a heavy-handed operation would put the webservice in a similarly "cold" situation, however this does not seem to be the case.
I would like to know:
What could still be causing the delay in the application "warming up"?
What is the difference in the state of the webservice following iisreset and a long period of inactivity?
Should I resort to a "heart beat" solution to regularly ping the service to keep it alive?
Thanks in advance for any tips or insight.

I'll try to help with your questions:
What could still be causing the delay in the application "warming up"?
Warming up an application does not mean warming up its resources. For instance, if you configure Auto-Start with AppFabric in your WCF application (https://msdn.microsoft.com/en-us/library/ee677260(v=azure.10).aspx), and this application accesses a database using EF, it will not initialize your DbContext.
If you want these resources initialized once your application has warmed up, you need to implement a method that initializes them (cache, DbContext, etc.).
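A rough sketch of such an initialization method, assuming the expensive resource can be built once and reused. `ServiceWarmup`, `LoadCache` and the cache contents are hypothetical placeholders; in a real service the factory would perform the first EF query or fill the real cache.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ServiceWarmup
{
    // Counts how often the expensive load actually ran (for illustration only).
    public static int LoadCount;

    // Lazy<T> ensures the factory runs only once, even if several requests race.
    private static readonly Lazy<Dictionary<string, string>> Cache =
        new Lazy<Dictionary<string, string>>(
            LoadCache, LazyThreadSafetyMode.ExecutionAndPublication);

    private static Dictionary<string, string> LoadCache()
    {
        Interlocked.Increment(ref LoadCount);
        // Placeholder for the expensive work, e.g. the first DbContext query.
        return new Dictionary<string, string> { { "status", "ready" } };
    }

    // Call once when the application starts so the first real request
    // does not pay the initialization cost. Returns the number of cached items.
    public static int Initialize()
    {
        return Cache.Value.Count;
    }

    public static string Get(string key)
    {
        return Cache.Value[key];
    }
}
```

Where `Initialize()` gets called depends on the hosting setup (an application start hook or a preload request, for example), so that part is left out here.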
What is the difference in the state of the webservice following iisreset and a long period of inactivity?
After a long period of inactivity, the application pool has probably shut down, and it is started again when the next request arrives, much like after a recycle.
This link has interesting information about the difference between iisreset and an application pool recycle, and it can help to answer your question: https://fullsocrates.wordpress.com/2012/07/25/iisreset-vs-recycling-application-pools/
Should I resort to a "heart beat" solution to regularly ping the service to keep it alive?
If you keep accessing your service, it will probably keep its resources initialized in memory, so this can be a good approach.
However, if your application pool is configured to recycle at a regular interval, it will still be recycled and the resources held in memory will be lost.
If that is a problem for you, turn this feature off by going to IIS -> Application Pool -> Advanced Settings and setting Regular Time Interval=0.
These are just suggestions; you will need to run some tests to find the best solution.

Windows 2008 R2 Standard scheduled task stopped working - Last run result 0x2

I have been scratching my head over this for a few days and can't get to the bottom of it.
I have a scheduled task that runs a batch file every morning. The batch file starts a windows service which calls a web service on another server which then performs various tasks after which the service is stopped.
This has been working without issue for the last few months, but starting last week, every morning the scheduled task is triggered at the appointed time, but it's not starting the service and the Last Run Result is 0x2.
I've tried just about everything I could think of, checked 'Run whether user is logged in or not', 'Run with highest privileges'. I've turned on History for scheduled tasks and everything seems to run fine. There are no errors in Event Viewer nor is the service throwing any exceptions. Running the batch file manually executes just fine.
In the end I deleted and recreated the scheduled task, which resolved the issue. Today this error started occurring again. I've been unable to get solid information on what exactly 0x2 means. Does anyone have any more information on what could be the cause of this issue?
I've come to believe that the issue lies with the server and not the service. I exported the task from Task Scheduler and imported it again with no changes, and it's been running fine over the weekend. Apparently some server maintenance was carried out around the time that the issue started occurring, so I will investigate further with our server management department.

0x2 result means that the file could not be found.
Make sure that when the Task starts (which is apparently the case), the batch file exists and is accessible.
See the system error codes on MSDN.

As explained here 0x2 could also mean "Access denied". So maybe you are facing a permission issue.
You can try to configure the Scheduled Task to Run with highest privileges in the Properties dialog of the task (General tab).

Quartz on a lightly loaded server

I have several important Quartz events that MUST go off at specific times of the night. Lately I have been noticing that not all the events are run. I have a feeling that overnight our server load is very light (i.e. zero users) and that the web server kind of goes to sleep, and hence so does Quartz. Does this seem plausible? I am using Quartz.NET within the web server, and not as a separate service.

Yes, that is plausible. In general, it is considered bad practice to have IIS run scheduled tasks; that job is better left to a Windows Service or the built-in Windows Task Scheduler (which has been much improved in Windows Server 2008).
Your worker process might have been shut down because there is no load. By default, IIS shuts down worker processes after 20 minutes without incoming requests (you can alter this in the Application Pool settings). Also, worker processes are likely to be restarted after a certain amount of time or number of requests, or if they consume too much memory.
A quick-fix for your specific problem might be to use the Windows Task Scheduler to request the site periodically to keep it alive - or have it request a URL that triggers your task at the predefined time.

ASP.NET, by default, will shut down AppDomains after a period of inactivity.
The recommended course of action is to implement timed events either in a Windows Service or as an executable launched from Windows Scheduled Tasks.
It is also possible to change the IIS configuration so that it will not shut down your AppDomain. How exactly this is accomplished varies between versions of IIS, but instructions can easily be found by searching.

Service In Perpetual "Starting" Status

I've written a Windows service in C# that converts WAV files into MP3 and then stores them on a remote server. On my development rig (OS: WinXP SP3) the service starts up fine and runs the way it's supposed to.
When I installed it on the production machine (OS: Windows 2000 Server), the service fails to start in a timely fashion and remains in a constant "Starting" status. The program is clearly working, though, since the files are being converted and transferred.
My hunch is that the problem is in the timer component: I think that on the Windows 2000 Server machine, the timer may be causing the system to register the program as "Starting".
Is there something I'm missing about Windows Server 2000?

I'm not familiar with writing services in .NET, but in general, the thread that is used to start the service and report its initial status should not be the same thread that performs the actual work. The service should spawn a worker thread so the entry point can return status to the SCM quickly.
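In .NET terms, that advice could be sketched as below. `ConversionWorker` is a hypothetical helper, not part of the framework: the idea is that the service's `OnStart` only calls `Start()` (which returns immediately) and `OnStop` calls `Stop()`, while the conversion work runs on its own thread. The `ServiceBase` subclass itself is omitted to keep the sketch self-contained.

```csharp
using System;
using System.Threading;

public sealed class ConversionWorker
{
    private readonly Action _doWork;      // hypothetical unit of work, e.g. convert pending files
    private readonly TimeSpan _interval;  // pause between runs
    private readonly ManualResetEventSlim _stop = new ManualResetEventSlim(false);
    private Thread _thread;

    public ConversionWorker(Action doWork, TimeSpan interval)
    {
        _doWork = doWork;
        _interval = interval;
    }

    // Call from OnStart: only spawns the thread, so the SCM sees "Running" quickly.
    public void Start()
    {
        _thread = new Thread(Run) { IsBackground = true, Name = "ConversionWorker" };
        _thread.Start();
    }

    // Call from OnStop: signals the loop and waits for the current run to finish.
    public void Stop()
    {
        _stop.Set();
        if (_thread != null)
        {
            _thread.Join();
        }
    }

    private void Run()
    {
        // Do one unit of work, then wait for the interval or a stop signal,
        // whichever comes first. Wait returns true once Stop() was called.
        do
        {
            _doWork();
        }
        while (!_stop.Wait(_interval));
    }
}
```

A real service would also want to catch and log exceptions inside `Run`, since an unhandled exception on the worker thread would take the process down.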
