How to make a Windows service restart on error

I am running MSMQ on Windows 2008. Messages are available in a private queue. I have a WCF subscriber (written in C#) that is installed as a Windows service. The problem is that the WCF subscriber sometimes stops picking up messages from the queue. If I restart the service, it works fine again. I have attached an IErrorHandler to log the reason and the exception.
To handle this issue, I want to set the service's recovery option to restart the service on first failure. The problem now is: how do I throw the error from the HandleError() method of my IErrorHandler class?
Please tell me the best way to throw an exception in a Windows service so that it gets restarted.

While it is probably better to address the underlying cause of your exceptions, it is certainly valid in certain scenarios to implement a fail fast methodology. Indeed, this ability to kill processes which have become "flawed" in some manner is critical to the concept of fault tolerance.
So, to make a Windows service terminate itself:
void KillSelf()
{
    try
    {
        // Code to close open connections/dispose
        // of unmanaged resources etc
        ...
    }
    finally
    {
        Environment.Exit(1);
    }
}
Service recovery options should be set to restart automatically. This will ensure your service comes straight back up again.
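As a rough sketch of how this could be wired into the IErrorHandler from the question: rather than throwing from HandleError(), the handler can log the exception and then end the process with a non-zero exit code, so that the configured recovery action can take over. The class name FailFastErrorHandler and the log, isFatal and exit delegates are hypothetical and exist only for illustration; IErrorHandler, HandleError and ProvideFault are the standard WCF members. Which exceptions should count as fatal is an assumption you would have to decide for your own service.

```csharp
using System;
using System.ServiceModel.Channels;
using System.ServiceModel.Dispatcher;

// Hypothetical handler: logs the error, then terminates the process so the
// service's recovery options can restart it.
public sealed class FailFastErrorHandler : IErrorHandler
{
    private readonly Action<Exception> _log;
    private readonly Func<Exception, bool> _isFatal;
    private readonly Action<int> _exit;

    public FailFastErrorHandler(
        Action<Exception> log,
        Func<Exception, bool> isFatal = null,
        Action<int> exit = null)
    {
        _log = log ?? (ex => { });
        // Assumption: every error is treated as fatal unless a filter is supplied.
        _isFatal = isFatal ?? (ex => true);
        // Injectable so the handler can be exercised without killing the process.
        _exit = exit ?? new Action<int>(Environment.Exit);
    }

    public void ProvideFault(Exception error, MessageVersion version, ref Message fault)
    {
        // Leave the fault untouched so WCF produces its default fault message.
    }

    public bool HandleError(Exception error)
    {
        _log(error);
        if (_isFatal(error))
        {
            _exit(1); // non-zero exit code, same as KillSelf() above
        }
        return true; // the error has been dealt with (logged)
    }
}
```

The handler still has to be added to the ErrorHandlers collection of each ChannelDispatcher on the service host, for example from a custom service behavior.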

As far as I know, one cannot throw an exception to restart a Windows service.
I usually wrap the service code in a try/catch (with logging) to prevent any exception from crashing the service, which is the opposite of what you are suggesting.
It may be that you can catch the error and stop the service (I am not sure), and configure the service to restart if it stops.

Related

How to trigger failure in a windows service?

For every Windows service you can define a "recovery policy" indicating what to do in case of a failure.
This works when the service fails to start, but how can I trigger a "failure" if the service started successfully but did something wrong while running? I tried throwing an exception, but the service controller just hides it. I tried setting the exit code to 1 and calling Stop(), but the service just stops gracefully.
Is there a way to tell Windows from code, after the service has started successfully, "this service has crashed and wants to use the recovery policy"? If it's not possible to trigger the recovery policy after startup, that's OK, but what's the best way to stop while indicating to Windows that there was an error?
Thanks,

Do I need to close the connection to a WCF service from a console that exits straight away?

I have a console app that I want to do a "fire-and-forget" call to a WCF service, and then close down without waiting for a response. It is just supposed to initiate a cleanup job. The job can take several hours to finish, so I don't want the console app to stay open and wait for it.
I have added "IsOneWay=true" to the methods in the contract, but the console app still waits for the task to finish before calling client.Close() and exiting.
If I remove the client.Close() call, the console app works the way I want, but I am not sure whether the channel will remain open even though the console app is no longer running.
Here is my console app code:
static void Main(string[] args)
{
    Console.WriteLine("Starting Cleanup");
    var client = new IntegrationWcfServiceClient(EndPointConfigurationName);
    try
    {
        client.ExecuteCleanup();
        //client.Close();
    }
    catch (Exception ex)
    {
        client.Abort();
        WriteLineRed($"Couldn't start cleanup: {ex.Message}");
        return;
    }
    WriteLineGreen("Cleanup started successfully");
}
And here is the operation contract code:
[OperationContract(IsOneWay = true)]
void ExecuteCleanup();

There are a few things you need to consider when making a one-way call.
From the book Programming WCF Services:
Ideally, when the client calls a one-way method, it should be blocked only for the briefest moment required to dispatch the call. However, in reality, one-way calls do not equate to asynchronous calls. When one-way calls reach the service, they may not be dispatched all at once but may instead be buffered on the service side to be dispatched one at a time, according to the service’s configured concurrency mode behavior.
Although one-way operations do not return values or exceptions from the service itself, it’s wrong to perceive them as a one-way street or a “black hole” from which nothing can come out. The client should still expect exceptions from a one-way call, and can even deduce that the call failed on the service. When dispatching a one-way operation, any error because of communication problems (such as a wrong address or the host being unavailable) will throw an exception on the side of the client trying to invoke the operation.
If I remove the client.Close() then the console app works the way I want, but I am not sure if the channel will remain open even though the console app is not running anymore?
A one-way call is not fire-and-forget in nature, since the client can discover
that something went wrong on the service during a one-way invocation.
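Here you are trying to invoke a one-way operation as if it were asynchronous, which is why you are not able to close the connection or proxy without waiting. To make the call itself asynchronous, declare the operation with the asynchronous pattern. With AsyncPattern = true, the method must be named Begin<Operation> and paired with a matching End<Operation> method:
[OperationContract(IsOneWay = true, AsyncPattern = true)]
IAsyncResult BeginExecuteCleanup(AsyncCallback callback, object asyncState);
void EndExecuteCleanup(IAsyncResult result);
// Client side: no callback and no state object
client.BeginExecuteCleanup(null, null);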
Note: If you don't want to complicate things, make sure ExecuteCleanup is the last call you make on the proxy. You can then close it without affecting any later operations.
For a possible implementation, see: How to properly close a client proxy (An existing connection was forcibly closed by the remote host)?
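A minimal sketch of the usual close-or-abort pattern for a WCF proxy, assuming the generated client implements ICommunicationObject (as ClientBase<T> does). The names ProxyCloser and CloseOrAbort are hypothetical. Note that this only makes closing safe; it does not by itself make Close() return sooner when the binding keeps a session open for the one-way call.

```csharp
using System;
using System.ServiceModel;

// Hypothetical helper, not part of WCF.
public static class ProxyCloser
{
    // Closes the proxy gracefully and falls back to Abort() when the channel
    // is already faulted or when Close() itself fails.
    public static void CloseOrAbort(ICommunicationObject proxy)
    {
        if (proxy == null)
        {
            return;
        }

        if (proxy.State == CommunicationState.Faulted)
        {
            // Close() would throw on a faulted channel, so abort directly.
            proxy.Abort();
            return;
        }

        try
        {
            proxy.Close();
        }
        catch (CommunicationException)
        {
            proxy.Abort();
        }
        catch (TimeoutException)
        {
            proxy.Abort();
        }
    }
}
```

In the console app above, this would replace the commented-out client.Close() call, for example ProxyCloser.CloseOrAbort(client) after ExecuteCleanup() returns.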

NServiceBus Windows-Service: Service can't be stopped

For some time now, we have been unable to stop an NServiceBus Windows service. When we try to, we get this error:
Error 1061: The service cannot accept control messages at this time.
Unfortunately, the only thing I could find on this matter is this GitHub issue: https://github.com/Particular/NServiceBus/issues/1898
Sadly, this doesn't help, since we do in fact need the IConfigureThisEndpoint interface to configure the BusConfiguration, and that configuration isn't long-running either. We also use almost exactly the same template for other NServiceBus endpoints, which don't have any problems.
Interestingly, it also worked for this endpoint for quite some time, and it seems to be a problem on only one specific server.
Is there a way to find out more about the error, be it from Microsoft or NServiceBus?

Graceful shutdown
This error can be caused for a lot of reasons. NServiceBus will only try to perform a graceful shutdown.
NServiceBus will not abort messages that are currently being processed but will stop processing new messages.
Log file
The log file should indicate that a shutdown is triggered so that is the first thing you can verify.
I would advise to set the log level to DEBUG to help diagnose the shutdown sequence and to also add this:
var appDomainLogger = LogManager.GetLogger("AppDomain");
var appDomain = AppDomain.CurrentDomain;
// Log every exception as it is first thrown, even if it is caught later
appDomain.FirstChanceException += (o, ea) => {
    appDomainLogger.Debug("FirstChanceException", ea.Exception);
};
// Log exceptions that nothing handled
appDomain.UnhandledException += (o, ea) => {
    appDomainLogger.Debug("UnhandledException", ea.ExceptionObject as Exception);
};
It could be that exceptions occur that prevent shutdown and this adds additional diagnostics.
Long running messages
If, for example, a message is being processed that is waiting for a lock on a database to be released, then this message could take more time than the Windows service interface allows.
The same applies to other long-running tasks, such as converting media files.
Eventually, the Windows service should shut down once all resources are freed and all messages are done processing, unless they contain a bug that prevents shutdown.
Disposing of resources
Also, during shutdown the container is disposed. It might be that your container holds resources that have a lot of cleanup/teardown to do, for example resources that flush in-memory caches to disk or remote storage so that the next time the service is started it can start up faster than normal.

Lync UserEndpoint appears online to users but can't be called

I have a Lync 2013-based application which:
connects to a UserEndpoint (hereinafter CallCenter)
redirects calls made to CallCenter according to some business logic.
At times, a user will see CallCenter in their standard Lync 2013 Client as Online, but if that user attempts to start an IM call with CallCenter, the user receives the message "We couldn't send this message because CallCenter is unavailable or offline."
I haven't been able to identify the process that leads up to this, but if it's happened to one user, then all of the other users experience the same problem when attempting to call CallCenter. The only way I have been able to recover CallCenter has been to restart my application. Regular interaction with CallCenter then resumes without a problem.
If CallCenter is indeed "unavailable or offline", then why does its presence appear as "Online"? Is there a need to renew / keep CallCenter's connection alive every so often?
For reference, I connect CallCenter like so:
UserEndpointSettings settings = new UserEndpointSettings(userURI, _ProxyHost, _ProxyPort);
settings.AutomaticPresencePublicationEnabled = true;
settings.Presence.UserPresenceState = PresenceState.UserAvailable;
_userEndpoint = new UserEndpoint(_Platform.CollabPlatform, settings);
_userEndpoint.BeginEstablish(res =>
{
    try
    {
        _userEndpoint.EndEstablish(res);
        _userEndpoint.StateChanged += new EventHandler<LocalEndpointStateChangedEventArgs>(_userEndpoint_StateChanged);
    }
    catch (Exception ex)
    {
        LogError(ex, ErrorReference.EndpointEstablishFailed);
    }
}, null);

In the client, when you go offline or experience an error, your presence reflects that (most of the time, that is). This can lead you to believe that the status portion of presence [1] is somehow tied to actual availability.
When you're working with UCMA, you are given ultimate control over everything related to your endpoint. As you've seen, you can make your UCMA application do things that would otherwise be impossible in the regular client. You don't have to publish any presence status (leaving you "offline" to your users), yet the service can still send/receive IMs. And, as you've seen, your service can be "Available" and yet ... have no capability to do anything but publish its status [2].
If you fail to wire up the appropriate modality (in your case IM), or your application encounters an exception which results in a particular modality no longer working (I suspect this may be your actual problem), the status of your service will still be available.
Begin/EndTerminate on the UserEndpoint should publish Offline for you automatically and publishing a presence other than Available is the only way to guarantee the presence won't be "Available" for the lifetime of your application (and even after the application ends/dies prematurely, though this is sometimes rectified by the server -- sometimes).
Here's how I'd attack resolving this issue. Ignore the presence problem and ignore the error. They're red herrings. Many problems result in the "unavailable or offline" message that have nothing to do with the service actually being stopped.
Instead, figure out why your calls aren't connecting.
If the call takes a while before you receive the error, check for deadlocks or circumstances where the thread pool has no room for another thread. Troubleshooting involves reviewing your code for race conditions and the myriad other things that multi-threaded applications throw your way. If the IM call fails instantly, check the parts that handle incoming calls. In the latter case, your subscription may be gone (there are too many causes to list here, most of which are .NET-related, not UCMA-related), or your service may be dead.
If the importance of presence to your application is only to show it as "available" or "offline" when it is actually able to send/receive an IM, you're going to want to ensure your application terminates the endpoint properly during tear-down (including in the case of a critical failure: catch-terminate-rethrow or whatever is appropriate in your case).
[1] Be careful when thinking about the term "presence" as it relates to Lync. Presence contains availability status, modality specific states, capabilities (IM/Voice, etc), the "note" and contact information.
[2] This seems like a bizarre thing to do, however, it gave me the ability to use an ApplicationEndpoint to report on the availability of a web service (unrelated to Lync) that I wanted to be able to view in the Mobile client without connecting via VPN. When doing something like this, it's really important to publish the capabilities of your endpoint -- this will explicitly signal to your connected clients what your service can and cannot do.
[Final Footnote] There are a few ways to publish presence. The mechanism you're using to publish is the simplest and most logical to use if you're just interested in telling your users that the "service is here"/"service is not here" which is documented rather well here: Simplified Presence Publication for Endpoints

My EventWaitHandle says "Access to the path is denied", but it's not

Quick summary with what I now know
I've got an EventWaitHandle that I created and then closed. When I try to re-create it with this ctor, an "Access to the path ... is denied" exception is thrown. This exception is rare; most of the time the ctor re-creates the EventWaitHandle just fine. With the answer posted below (by me), I'm able to successfully call EventWaitHandle.OpenExisting and continue in the case that an exception was thrown. However, the ctor for EventWaitHandle should have done this for me, right? Isn't that what the out parameter, createdNew, is for?
Initial question
I've got the following architecture, a windows service and a web service on the same server. The web service tells the windows service that it has to do work by opening and setting the wait handle that the windows service is waiting on.
Normally everything is flawless and I'm able to start / stop the windows service without any issue popping up. However, sometimes when I stop the windows service and then start it up again, it is completely unable to create the wait handle, breaking the whole architecture.
I specifically need to find out what is breaking the event wait handle and stop it. When the wait handle "breaks", I have to reboot Windows before it will function properly again, and that's obviously not ideal.
UPDATE: Exception thrown & Log of Issue
I rebooted the windows service while the web service was doing work in hopes of causing the issue and it did! Some of the class names have been censored for corporate anonymity
12:00:41,250 [7] - Stopping execution due to a ThreadAbortException
System.Threading.ThreadAbortException: Thread was being aborted.
at System.Threading.Thread.SleepInternal(Int32 millisecondsTimeout)
at OurCompany.OurProduct.MyClass.MyClassCore.MonitorRequests()
12:00:41,328 [7] - Closing Event Wait Handle
12:00:41,328 [7] - Finally block reached
12:00:42,781 [6] - Application Start
12:00:43,031 [6] - Creating EventWaitHandle: Global\OurCompany.OurProduct.MyClass.EventWaitHandle
12:00:43,031 [6] - Creating EventWaitHandle with the security entity name of : Everyone
12:00:43,078 [6] - Unhandled Exception
System.UnauthorizedAccessException: Access to the path 'Global\OurCompany.OurProduct.MyClass.EventWaitHandle' is denied.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.Threading.EventWaitHandle..ctor(Boolean initialState, EventResetMode mode, String name, Boolean& createdNew, EventWaitHandleSecurity eventSecurity)
at OurCompany.OurProduct.MyClassLibrary.EventWaitHandleFactory.GetNewWaitHandle(String handleName, String securityEntityName, Boolean& created)
at OurCompany.OurProduct.MyClassLibrary.EventWaitHandleFactory.GetNewEventWaitHandle()
at OurCompany.OurProduct.MyClass.MyClassCore..ctor()
Rough timeline:
11:53:09,937: The last thread on the web service to open that existing wait handle, COMPLETED its work (as in terminated connection with the client)
12:00:30,234: The web service gets a new connection, not yet using the wait handle. The thread ID for this connection is the same as the thread ID for the last connection at 11:53
12:00:41,250: The windows service stops
12:00:42,781: The windows service starts up
12:00:43,078: The windows service finished crashing
12:00:50,234: The web service was actually able to open the wait handle and call Set() on it without any exception being thrown.
12:02:00,000: I tried rebooting the windows service, same exception
12:36:57,328: After arbitrarily waiting 36 minutes, I was able to start the windows service up without a full system reboot.
Windows Service Code
Initialization:
// I ran into security issues so I open the global EWH
// and grant access to Everyone
var ewhSecurity = new EventWaitHandleSecurity();
ewhSecurity.AddAccessRule(
    new EventWaitHandleAccessRule(
        "Everyone",
        EventWaitHandleRights.Synchronize | EventWaitHandleRights.Modify,
        AccessControlType.Allow));
this.ewh = new EventWaitHandle(
    false,
    EventResetMode.AutoReset,
    @"Global\OurCompany.OurProduct.MyClass.EventWaitHandle",
    out created,
    ewhSecurity);
// the variable "created" is logged
Utilization:
// wait until the web service tells us to loop again
this.ewh.WaitOne();
Disposal / closing:
try
{
    while (true)
    {
        // entire service logic here
    }
}
catch (Exception e)
{
    // should this be in a finally, instead?
    if (this.ewh != null)
    {
        this.ewh.Close();
    }
}
Web Service Code
Initialization:
// NOTE: the wait handle is a member variable on the web service
this.existing_ewh = EventWaitHandle.OpenExisting(
    @"Global\OurCompany.OurProduct.MyClass.EventWaitHandle");
Utilization:
// wake up the windows service
this.existing_ewh.Set();
Since the EventWaitHandle is a member variable on the web service, I don't have any code that specifically closes it. Actually, the only code that interacts with the EventWaitHandle on the web service is posted above.
Looking back, I should probably have put the Close() that is in the catch block, in a finally block instead. I probably should have done the same for the web service but I didn't think that it was needed.
At any rate, can anyone see if I'm doing anything specifically wrong? Is it crucially important to put the close statements within a finally block? Do I need to manually control the Close() of the existing_ewh on the web service?
Also, I know this is a slightly complex issue so let me know if you need any additional info, I'll be monitoring it closely and add any needed information or explanations.
Reference material
EventWaitHandleSecurity Class
EventWaitHandleAccessRule Class
EventWaitHandle Class

In the code that creates the wait handle on the windows service, if it fails (as in access denied), you could try to "open an existing wait handle" via
EventWaitHandle.OpenExisting(
    @"Global\OurCompany.OurProduct.MyClass.EventWaitHandle",
    EventWaitHandleRights.Synchronize | EventWaitHandleRights.Modify);
Though, I'm not entirely sure if the behaviour would stay the same at that point.
Note: I'd appreciate feedback. It's a potential answer, so I'm answering my own question. Again, comments are quite welcome!
Note 2: Amazingly, applying EventWaitHandleRights.FullControl instead of the above flags (Synchronize + Modify) doesn't work well. You must use the sample above.

MSDN says:
UnauthorizedAccessException - The named event exists and has access control security, but the user does not have EventWaitHandleRights.FullControl.
and
The caller has full control over the newly created EventWaitHandle object even if eventSecurity denies or fails to grant some access rights to the current user.
Your service does not have the rights needed to get the existing event through the EventWaitHandle constructor, because EventWaitHandleRights.FullControl was not granted, and the named event continues to exist for as long as there are open handles to it. You can open the existing event using EventWaitHandle.OpenExisting instead.
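Putting the two answers together, a create-or-open fallback might look roughly like this. It is only a sketch: the class and method names (NamedEventFactory, CreateOrOpen) are hypothetical, it uses the .NET Framework constructor overload that takes an EventWaitHandleSecurity (as in the question), and it assumes that the existing event grants Synchronize and Modify to the caller.

```csharp
using System;
using System.Security.AccessControl;
using System.Threading;

// Hypothetical helper, not part of the original service code.
public static class NamedEventFactory
{
    private const EventWaitHandleRights Rights =
        EventWaitHandleRights.Synchronize | EventWaitHandleRights.Modify;

    public static EventWaitHandle CreateOrOpen(string name, out bool createdNew)
    {
        // Same access rule as in the question: Everyone may wait on and set the event.
        var security = new EventWaitHandleSecurity();
        security.AddAccessRule(
            new EventWaitHandleAccessRule("Everyone", Rights, AccessControlType.Allow));

        try
        {
            // Creates the event, or opens it if it already exists and the
            // caller has FullControl on it.
            return new EventWaitHandle(
                false, EventResetMode.AutoReset, name, out createdNew, security);
        }
        catch (UnauthorizedAccessException)
        {
            // The event still exists (another process holds a handle) and
            // FullControl was not granted: open it with the lesser rights.
            createdNew = false;
            return EventWaitHandle.OpenExisting(name, Rights);
        }
    }
}
```

The windows service would then call NamedEventFactory.CreateOrOpen(@"Global\OurCompany.OurProduct.MyClass.EventWaitHandle", out created) in place of the constructor and log created as before.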
