We have an application that reads messages from an IBM MQ topic and interacts with users via SignalR WebSockets.
Case:
Open the web.config of the IIS ASP.NET application.
Change it and save it (this causes an AppDomain restart).
Repeat step 2 ten times.
After that we can see many Application_Start/Dispose events in the logs, but for ONE of the AppDomain restarts there is no Dispose call. Because of that, our IBM MQ listener keeps handling messages from the old AppDomain, so messages are handled twice and we get business errors.
It seems that something is preventing the AppDomain from unloading.
I know it's very hard to say what is happening here, but maybe somebody knows how we can trace this problem.
Disable Overlapped Recycle is true
Shutdown Time Limit is 3s
What I have done in a similar situation is to handle this event in global.asax:
void Application_End(object sender, EventArgs e)
{
    // signal the listener to close here, and wait until it has done so
    // also raise the shutdown time limit above 3 seconds to give it time to close
}
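A minimal sketch of that idea, assuming the listener runs as a background loop that can be cancelled. ListenerHost, receiveOnce and Stop are hypothetical names used only for illustration; they are not part of the IBM MQ or ASP.NET APIs. It assumes the receive call uses a short timeout so the loop notices the stop request quickly.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper around the listener loop (illustrative names only).
public sealed class ListenerHost
{
    private readonly CancellationTokenSource _stop = new CancellationTokenSource();
    private readonly Task _loop;

    // receiveOnce is assumed to perform one receive with a short timeout.
    public ListenerHost(Action<CancellationToken> receiveOnce)
    {
        _loop = Task.Run(() =>
        {
            while (!_stop.IsCancellationRequested)
            {
                receiveOnce(_stop.Token);
            }
        });
    }

    // Signals the loop to stop and waits for it to exit.
    // Returns true if the loop ended within the timeout.
    public bool Stop(TimeSpan timeout)
    {
        _stop.Cancel();
        try
        {
            return _loop.Wait(timeout);
        }
        catch (AggregateException)
        {
            // The loop ended because receiveOnce threw; it is no longer running.
            return true;
        }
    }
}
```

Application_End could then call something like listener.Stop(TimeSpan.FromSeconds(10)). The timeout passed to Stop should stay below the application pool's Shutdown Time Limit, otherwise IIS may kill the worker process before the wait completes.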
Related
My problem is that, for some time now, we have been unable to stop an NServiceBus Windows service. When we try, we get this error:
Error 1061: The service cannot accept control messages at this time.
Unfortunately, I couldn't find anything about this apart from this GitHub issue: https://github.com/Particular/NServiceBus/issues/1898
Sadly, this doesn't help, because we do need the IConfigureThisEndpoint interface to configure the BusConfiguration, and that configuration isn't long-running either. We also use almost exactly the same template for other NServiceBus endpoints, which don't have any problems.
Interestingly, it also worked for this endpoint for quite some time, and the problem seems to occur on only one specific server.
Is there a way to find out more about the error, be it from Microsoft or NServiceBus?
Graceful shutdown
This error can have many causes. NServiceBus only attempts a graceful shutdown.
NServiceBus will not abort messages that are currently being processed, but it will stop processing new messages.
Log file
The log file should indicate that a shutdown is triggered so that is the first thing you can verify.
I would advise setting the log level to DEBUG to help diagnose the shutdown sequence, and also adding this:
var appDomainLogger = LogManager.GetLogger("AppDomain");
var appDomain = AppDomain.CurrentDomain;
appDomain.FirstChanceException += (o, ea) => {
    appDomainLogger.Debug("FirstChanceException", ea.Exception);
};
appDomain.UnhandledException += (o, ea) => {
    appDomainLogger.Debug("UnhandledException", ea.ExceptionObject as Exception);
};
Exceptions may be occurring that prevent shutdown, and this adds diagnostics for them.
Long running messages
If, for example, a message being processed is waiting for a database lock to be released, that message could take more time than the Windows service interface allows.
The same applies to other long-running tasks, such as converting media files.
Eventually, the Windows service should shut down once all resources are freed and all messages are done processing, unless they contain a bug that prevents shutdown.
Disposing of resources
Also, the container is disposed during shutdown. Your container may hold resources that have a lot of cleanup/teardown to do, for example resources that flush in-memory caches to disk or remote storage so that the service can start up faster than normal the next time.
I have a Windows service that spawns a set of child activities on separate threads and that should only terminate when all those activities have successfully completed. I do not know in advance how long it might take to terminate an activity after a stop signal is received. During OnStop(), I wait in intervals for that stop signal and keep requesting additional time for as long as the system is willing to grant it.
Here is the basic structure:
class MyService : ServiceBase
{
    private CancellationTokenSource stopAllActivities;
    private CountdownEvent runningActivities;

    protected override void OnStart(string[] args)
    {
        // ... start a set of activities that signal runningActivities
        //     when they stop
        // ... initialize runningActivities to the number of activities
    }

    protected override void OnStop()
    {
        stopAllActivities.Cancel();
        while (!runningActivities.Wait(10000))
        {
            RequestAdditionalTime(15000); // NOTE: 5000 added for overhead
        }
    }
}
Just how much "overhead" should I be adding in the RequestAdditionalTime call? I'm concerned that the requests are cumulative, instead of based on the point in time when each RequestAdditionalTime call is made. If that's the case, adding overhead could result in the system eventually denying the request because it's too far out in the future. But if I don't add any overhead then my service could be terminated before it has a chance to request the next block of additional time.
This post wasn't exactly encouraging:
The MSDN documentation doesn’t mention this but it appears that the value specified in RequestAdditionalTime is not actually ‘additional’ time. Instead, it replaces the value in ServicesPipeTimeout. Worse still, any value greater than two minutes (120000 milliseconds) is ignored, i.e. capped at two minutes.
I hope that's not the case, but I'm posting this as a worst-case answer.
UPDATE: The author of that post was kind enough to post a very detailed reply to my comment, which I've copied below.
Lars, the short answer is no.
What I would say is that I now realise that Windows Services ought to be designed to start and terminate processing quickly when requested to do so.
As developers, we tend to focus on the implementation of the processing and then package it up and deliver it as a Windows Service.
However, this really isn’t the correct approach to designing Windows Services. Services must be able to respond quickly to requests to start and stop, not only when an administrator is making the request from the services console but also when the operating system is requesting a start as part of its start-up processing or a stop because it is shutting down.
Consider what happens when Windows is configured to shut down when a UPS signals that the power has failed. It’s not appropriate for the service to respond with “I need a few more minutes…”.
It’s possible to write services that react quickly to stop requests even when they implement long running processing tasks. Usually a long running process will consist of batch processing of data and the processing should check if a stop has been requested at the level of the smallest unit of work that ensures data consistency.
As an example, the first service where I found the stop timeout was a problem involved the processing of a notifications queue on a remote server. The processing retrieved a notification from the queue, called a web service to retrieve data related to the subject of the notification, and then wrote a data file for processing by another application.
I implemented the processing as a timer driven call to a single method. Once the method is called it doesn’t return until all the notifications in the queue have been processed. I realised this was a mistake for a Windows Service because occasionally there might be tens of thousands of notifications in the queue and processing might take several minutes.
The method is capable of processing 50 notifications per second. So, what I should have done was implement a check to see if a stop had been requested before processing each notification. This would have allowed the method to return when it has completed the processing of a notification but before it has started to process the next notification. This would have ensured that the service responds quickly to a stop request and any pending notifications remained queued for processing when the service is restarted.
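The check described above could look roughly like the following sketch. NotificationProcessor and Drain are hypothetical names, and an in-memory ConcurrentQueue stands in for the remote notification queue, which is an assumption made only to keep the example self-contained.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class NotificationProcessor
{
    // Processes queued items one at a time and returns how many were handled.
    // The stop flag is checked before each dequeue, so once a stop is requested
    // the current item is finished and the remaining items stay in the queue.
    public static int Drain<T>(ConcurrentQueue<T> queue, Action<T> handle, CancellationToken stop)
    {
        int processed = 0;
        while (!stop.IsCancellationRequested && queue.TryDequeue(out T item))
        {
            handle(item);
            processed++;
        }
        return processed;
    }
}
```

OnStop would cancel the token source and then wait for the call to return. At the quoted rate of 50 notifications per second, that wait would be on the order of the time needed for a single notification rather than for the whole queue.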
I have MSMQ running on Windows 2008. Messages are available in a private queue. I have one WCF subscriber (written in C#) which is installed as a Windows service. The problem is that the WCF subscriber sometimes stops picking up messages from the queue. If I restart the service, it works fine again. I have attached an IErrorHandler to log the reason and the exception.
To handle this issue, I want to set the service's recovery property to restart the service on first failure. The problem now is how to throw the error from the HandleError() method of the IErrorHandler class.
Please tell me the best way to throw an exception in a Windows service so that it gets restarted.
While it is probably better to address the underlying cause of your exceptions, it is certainly valid in certain scenarios to implement a fail fast methodology. Indeed, this ability to kill processes which have become "flawed" in some manner is critical to the concept of fault tolerance.
So, to make a windows service commit suicide:
void KillSelf()
{
    try
    {
        // Code to close open connections/dispose
        // of unmanaged resources etc
        ...
    }
    finally
    {
        Environment.Exit(1);
    }
}
Service recovery options should be set to restart automatically. This will ensure your service comes straight back up again.
As far as I know one cannot throw an exception to restart a windows service.
I usually wrap the code in a try/catch (with logging) to prevent any exception from crashing the service, which is the opposite of what you are suggesting.
It may be that you can catch an error and stop the service (not sure) and configure the service to restart if it stops?
I am creating a Windows service app that I would like to pause programmatically when a system error, an ODBC connection error, or a missing-file error occurs. Does anyone know how to do this? The Windows service app uses an ODBC connection and a data reader to connect to an MS Access database and an Oracle table, so those are the probable errors I would be handling. I just want to pause so the user can deal with the errors if and when they occur.
ServiceController service = new ServiceController(serviceName);
TimeSpan timeout = TimeSpan.FromMilliseconds(timeoutValue);
service.Pause(); // or whatever you want here
service.WaitForStatus(ServiceControllerStatus.Paused, timeout);
...
Then, to resume, do the same thing except for
service.Continue();
service.WaitForStatus(ServiceControllerStatus.Running, timeout);
You can do this for any state you want. Check out the MSDN documentation by searching for ServiceController. It will be the first result returned.
Also, you will need to handle the OnPause and OnContinue events in your service.
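One way to make the worker honour those pause and continue requests is a small gate that the processing loop waits on before each unit of work. This is only a sketch: PauseGate is a hypothetical helper, not part of System.ServiceProcess, and it assumes OnPause would call Pause() and OnContinue would call Resume().

```csharp
using System.Threading;

// Hypothetical helper: the worker loop calls WaitWhilePaused before each unit of work.
public sealed class PauseGate
{
    // Set (signalled) means "running"; reset means "paused".
    private readonly ManualResetEventSlim _running = new ManualResetEventSlim(true);

    public bool IsPaused => !_running.IsSet;

    public void Pause() => _running.Reset();

    public void Resume() => _running.Set();

    // Blocks while paused. Returns true once the gate is open,
    // or false if the timeout elapsed while still paused.
    // Throws OperationCanceledException if stop is cancelled while waiting.
    public bool WaitWhilePaused(int millisecondsTimeout, CancellationToken stop)
        => _running.Wait(millisecondsTimeout, stop);
}
```

The service would also need CanPauseAndContinue set to true for the service control manager to send pause and continue commands at all.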
Have you tried?
System.Threading.Thread.Sleep(1000); // sleep for 1 second
Adjust the 1000 to 1000 times however long you want it to sleep in seconds.
Assuming that your service has a continual loop that checks for data, add a check to an external source for pause/continue commands. This source can be a message queue like MSMQ or a database table.
I implemented something along these lines by having my service continually check a table for commands and report its status in another table. When it gets a start command, it launches a processing loop on another thread. A stop command causes it to signal the thread to exit gracefully. The service core never stops running.
The user interacts via a separate app with a UI that lets them view the service's status and submit commands. Since the app does its control via a database it doesn't have to run on the same machine that the service is running on.
Quick summary with what I now know
I've got an EventWaitHandle that I created and then closed. When I try to re-create it with this ctor, an "Access to the path ... is denied" exception is thrown. This exception is rare; most of the time it re-creates the EventWaitHandle just fine. With the answer posted below (by me), I'm able to call EventWaitHandle.OpenExisting successfully and continue when the exception is thrown. However, the ctor for EventWaitHandle should have done this for me, right? Isn't that what the out parameter, createdNew, is for?
Initial question
I've got the following architecture, a windows service and a web service on the same server. The web service tells the windows service that it has to do work by opening and setting the wait handle that the windows service is waiting on.
Normally everything is flawless and I'm able to start / stop the windows service without any issue popping up. However, some times when I stop the web service and then start it up again, it will be completely unable to create the wait handle, breaking the whole architecture.
I specifically need to find out what is breaking the event wait handle and stop it. When the wait handle "breaks", I have to reboot Windows before it will function properly again, and that's obviously not ideal.
UPDATE: Exception thrown & Log of Issue
I restarted the windows service while the web service was doing work in the hope of causing the issue, and it did! Some of the class names have been censored for corporate anonymity.
12:00:41,250 [7] - Stopping execution due to a ThreadAbortException
System.Threading.ThreadAbortException: Thread was being aborted.
at System.Threading.Thread.SleepInternal(Int32 millisecondsTimeout)
at OurCompany.OurProduct.MyClass.MyClassCore.MonitorRequests()
12:00:41,328 [7] - Closing Event Wait Handle
12:00:41,328 [7] - Finally block reached
12:00:42,781 [6] - Application Start
12:00:43,031 [6] - Creating EventWaitHandle: Global\OurCompany.OurProduct.MyClass.EventWaitHandle
12:00:43,031 [6] - Creating EventWaitHandle with the security entity name of : Everyone
12:00:43,078 [6] - Unhandled Exception
System.UnauthorizedAccessException: Access to the path 'Global\OurCompany.OurProduct.MyClass.EventWaitHandle' is denied.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.Threading.EventWaitHandle..ctor(Boolean initialState, EventResetMode mode, String name, Boolean& createdNew, EventWaitHandleSecurity eventSecurity)
at OurCompany.OurProduct.MyClassLibrary.EventWaitHandleFactory.GetNewWaitHandle(String handleName, String securityEntityName, Boolean& created)
at OurCompany.OurProduct.MyClassLibrary.EventWaitHandleFactory.GetNewEventWaitHandle()
at OurCompany.OurProduct.MyClass.MyClassCore..ctor()
Rough timeline:
11:53:09,937: The last thread on the web service to open that existing wait handle, COMPLETED its work (as in terminated connection with the client)
12:00:30,234: The web service gets a new connection, not yet using the wait handle. The thread ID for this connection is the same as the thread ID for the last connection at 11:53
12:00:41,250: The windows service stops
12:00:42,781: The windows service starts up
12:00:43,078: The windows service finished crashing
12:00:50,234: The web service was actually able to open the wait handle and call Set() on it without any exception being thrown.
12:02:00,000: I tried rebooting the windows service, same exception
12:36:57,328: After arbitrarily waiting 36 minutes, I was able to start the windows service up without a full system reboot.
Windows Service Code
Initialization:
// I ran into security issues so I open the global EWH
// and grant access to Everyone
var ewhSecurity = new EventWaitHandleSecurity();
ewhSecurity.AddAccessRule(
    new EventWaitHandleAccessRule(
        "Everyone",
        EventWaitHandleRights.Synchronize | EventWaitHandleRights.Modify,
        AccessControlType.Allow));
bool created;
this.ewh = new EventWaitHandle(
    false,
    EventResetMode.AutoReset,
    @"Global\OurCompany.OurProduct.MyClass.EventWaitHandle",
    out created,
    ewhSecurity);
// the variable "created" is logged
Utilization:
// wait until the web service tells us to loop again
this.ewh.WaitOne();
Disposal / closing:
try
{
    while (true)
    {
        // entire service logic here
    }
}
catch (Exception e)
{
    // should this be in a finally, instead?
    if (this.ewh != null)
    {
        this.ewh.Close();
    }
}
Web Service Code
Initialization:
// NOTE: the wait handle is a member variable on the web service
this.existing_ewh = EventWaitHandle.OpenExisting(
    @"Global\OurCompany.OurProduct.MyClass.EventWaitHandle");
Utilization:
// wake up the windows service
this.existing_ewh.Set();
Since the EventWaitHandle is a member variable on the web service, I don't have any code that specifically closes it. Actually, the only code that interacts with the EventWaitHandle on the web service is posted above.
Looking back, I should probably have put the Close() that is in the catch block, in a finally block instead. I probably should have done the same for the web service but I didn't think that it was needed.
At any rate, can anyone see if I'm doing anything specifically wrong? Is it crucially important to put the close statements within a finally block? Do I need to manually control the Close() of the existing_ewh on the web service?
Also, I know this is a slightly complex issue so let me know if you need any additional info, I'll be monitoring it closely and add any needed information or explanations.
Reference material
EventWaitHandleSecurity Class
EventWaitHandleAccessRule Class
EventWaitHandle Class
In the code that creates the wait handle on the windows service, if it fails (as in access denied), you could try to "open an existing wait handle" via
EventWaitHandle.OpenExisting(
    @"Global\OurCompany.OurProduct.MyClass.EventWaitHandle",
    EventWaitHandleRights.Synchronize | EventWaitHandleRights.Modify);
Though, I'm not entirely sure if the behaviour would stay the same at that point.
Note: I'd appreciate feedback. It's a potential answer, so I'm answering my own question; again, comments are quite welcome!
Note 2: Amazingly, applying EventWaitHandleRights.FullControl instead of the above flags (Synchronize + Modify) doesn't work well. You must use the sample above.
MSDN says:
UnauthorizedAccessException - The named event exists and has access control security, but the user does not have EventWaitHandleRights.FullControl.
and
The caller has full control over the newly created EventWaitHandle object even if eventSecurity denies or fails to grant some access rights to the current user.
Your service does not have the rights to obtain the existing event through the EventWaitHandle constructor: EventWaitHandleRights.FullControl is not granted, and the named event still exists because there are open handles on it. You can open the existing event using EventWaitHandle.OpenExisting instead.
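Putting the two answers together, a create-or-open fallback might look like the sketch below. It assumes .NET Framework on Windows, where the EventWaitHandle constructor and OpenExisting accept the security and rights arguments shown in the question. NamedEventHelper and CreateOrOpen are hypothetical names introduced for illustration.

```csharp
using System;
using System.Security.AccessControl;
using System.Threading;

public static class NamedEventHelper
{
    // Tries to create the named event with the given security; if the event already
    // exists and the constructor is denied access, opens it with Synchronize | Modify.
    public static EventWaitHandle CreateOrOpen(
        string name, EventWaitHandleSecurity security, out bool createdNew)
    {
        try
        {
            return new EventWaitHandle(
                false, EventResetMode.AutoReset, name, out createdNew, security);
        }
        catch (UnauthorizedAccessException)
        {
            // The event exists but this caller lacks FullControl on it.
            createdNew = false;
            return EventWaitHandle.OpenExisting(
                name,
                EventWaitHandleRights.Synchronize | EventWaitHandleRights.Modify);
        }
    }
}
```

This only works around the symptom. The event stays alive as long as some process holds an open handle to it, so closing the handle in the web service when it is no longer needed is still worth doing.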