Assume that you have a multi-threaded Windows service that performs many different operations, each of which takes a fair amount of time, e.g. extracting data from different data stores, parsing said data, posting it to an external server etc. Operations may be performed in different layers, e.g. application layer, repository layer or service layer.
At some point in the lifespan of this Windows service you may wish to shut it down or restart it by way of services.msc. However, if you can't stop all operations and terminate all threads within the time that services.msc allows for the stop procedure, the service will hang and you will have to kill it from Task Manager.
Because of the issue mentioned above, my question is as follows: How would you implement a fail-safe way of handling shutdown of your Windows service? I have a volatile boolean that acts as a shutdown signal. It is set by OnStop() in my service base class and should gracefully stop my main loop, but that isn't worth anything if an operation in some other layer is taking its time doing whatever it is up to.
How should this be handled? I'm currently at a loss and need some creative input.
I would use a CancellationTokenSource and propagate the cancellation token from the OnStop method down to all layers and all threads and tasks started there. It's in the framework, so it will not break your loose coupling if you care about that (I mean, wherever you use a thread/Task you also have `CancellationToken` available).
This means you need to adjust your async methods to take the cancellation token into consideration.
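As a rough sketch of what propagating the token could look like: the class below is a hypothetical stand-in for the service (CancellableWorker, RepositoryLayerFetchAsync and the 30 second delay are all made up for illustration). In a real service the bodies of Start and Stop would sit in OnStart and OnStop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the service class; not part of the original post.
public sealed class CancellableWorker
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private Task _mainLoop;

    public void Start()
    {
        // Hand the token to the main loop; every layer below takes it as a parameter.
        _mainLoop = Task.Run(() => RunAsync(_cts.Token));
    }

    // Signals cancellation and waits for the loop to end.
    // Returns false if the loop is still running after the timeout.
    public bool Stop(TimeSpan timeout)
    {
        _cts.Cancel();
        try
        {
            return _mainLoop.Wait(timeout);
        }
        catch (AggregateException ex)
        {
            // A cancelled task surfaces as OperationCanceledException; anything else is rethrown.
            ex.Handle(inner => inner is OperationCanceledException);
            return true;
        }
    }

    private async Task RunAsync(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            await RepositoryLayerFetchAsync(token);
        }
    }

    // Hypothetical lower-layer operation; Task.Delay stands in for real I/O.
    private static Task RepositoryLayerFetchAsync(CancellationToken token)
    {
        return Task.Delay(TimeSpan.FromSeconds(30), token);
    }
}
```

The point is that the slow operation itself observes the token, so Stop returns quickly even though the operation would otherwise run for 30 seconds.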
You should also be aware of ServiceBase.RequestAdditionalTime. In case it is not possible to cancel all tasks in due time, you can request an extension period.
Alternatively, maybe you can explore the IsBackground alternative. All threads in your Windows service that have this enabled are stopped by the CLR when the process is about to exit. From the documentation:
A thread is either a background thread or a foreground thread.
Background threads are identical to foreground threads, except that
background threads do not prevent a process from terminating. Once all
foreground threads belonging to a process have terminated, the common
language runtime ends the process. Any remaining background threads
are stopped and do not complete.
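A minimal sketch of the IsBackground option, assuming a hypothetical helper named BackgroundThreadDemo.CreateWorker. It only shows where the property is set; whether it is acceptable for the work to be cut off at process exit is something you have to decide per thread.

```csharp
using System;
using System.Threading;

public static class BackgroundThreadDemo
{
    // Creates (but does not start) a thread that will not keep the process alive.
    public static Thread CreateWorker(ThreadStart work)
    {
        var thread = new Thread(work);
        // The thread is simply stopped when the process exits,
        // so do not rely on it finishing its current piece of work.
        thread.IsBackground = true;
        return thread;
    }
}
```

Usage would be something like BackgroundThreadDemo.CreateWorker(DoWork).Start() inside OnStart.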
After more research and some brainstorming I came to realise that the problems I've been experiencing were being caused by a very common design flaw regarding threads in Windows services.
The design flaw
Imagine you have a thread which does all your work. Your work consists of tasks that should be run again and again indefinitely. This is quite often implemented as follows:
volatile bool keepRunning = true;
Thread workerThread;

protected override void OnStart(string[] args)
{
    workerThread = new Thread(() =>
    {
        while(keepRunning)
        {
            DoWork();
            Thread.Sleep(10 * 60 * 1000); // Sleep for ten minutes
        }
    });
    workerThread.Start();
}

protected override void OnStop()
{
    keepRunning = false;
    workerThread.Join();
    // Ended gracefully
}
This is the very common design flaw I mentioned. The code compiles and runs as expected, but eventually your Windows service will stop responding to commands from the service console in Windows. The reason is that Thread.Sleep() blocks the worker thread, and OnStop() cannot return from workerThread.Join() until the sleep (or the work) has finished. You will only experience this error if the thread blocks for longer than the timeout configured by Windows in HKLM\SYSTEM\CurrentControlSet\Control\WaitToKillServiceTimeout. Because of this registry value, the implementation may work for you if your thread sleeps only for a very short period and does its work in an acceptable period of time.
The alternative
Instead of using Thread.Sleep() I decided to go for ManualResetEvent and System.Threading.Timer. The implementation looks something like this:
OnStart:
this._workerTimer = new Timer(new TimerCallback(this._worker.DoWork));
this._workerTimer.Change(0, Timeout.Infinite); // This tells the timer to perform the callback right now
Callback:
if (MyServiceBase.ShutdownEvent.WaitOne(0)) // My static ManualResetEvent
return; // Exit callback
// Perform lots of work here
ThisMethodDoesAnEnormousAmountOfWork();
(stateInfo as Timer).Change(_waitForSeconds * 1000, Timeout.Infinite); // This tells the timer to execute the callback after a specified period of time. This is the amount of time that was previously passed to Thread.Sleep()
OnStop:
MyServiceBase.ShutdownEvent.Set(); // This signals the callback to never ever perform any work again
this._workerTimer.Dispose(); // Dispose of the timer so that the callback is never ever called again
The conclusion
By implementing System.Threading.Timer and ManualResetEvent you will avoid your service becoming unresponsive to service console commands as a result of Thread.Sleep() blocking.
PS! You may not be out of the woods just yet!
I believe there are cases in which a callback is given so much work that the service may still become unresponsive to service console commands while that work is executing. If that happens you may wish to look at alternative solutions, like checking your ManualResetEvent deeper in your code, or perhaps implementing CancellationTokenSource.
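Checking the event deeper in the code could look roughly like this. It is only a sketch and assumes the big job can be split into independent units; ChunkedWork and ProcessAll are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ChunkedWork
{
    // Runs the work items one at a time and stops early once shutdownEvent is set.
    // Returns the number of items that were actually run.
    public static int ProcessAll(IEnumerable<Action> items, ManualResetEvent shutdownEvent)
    {
        int processed = 0;
        foreach (Action item in items)
        {
            // WaitOne(0) only polls the event; it never blocks.
            if (shutdownEvent.WaitOne(0))
                break;

            item();
            processed++;
        }
        return processed;
    }
}
```

With this shape the worst-case delay before the callback notices the shutdown is the duration of a single unit of work, not of the whole job.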
Related
I am developing a Windows Service application, in .NET, which executes many functions (it is a WCF service host), and one of the targets is running scheduled tasks.
I chose to create a System.Threading.Timer for every operation, with a dueTime set to the next execution and no period to avoid reentrancy.
Every time the operation ends, it changes the dueTime to match the next scheduled execution.
Most of the operations are scheduled to run every minute, not all together but offset from each other by a few seconds.
Now, after adding a number of operations, about 30, it seems that the timers start to be inaccurate, starting the operations many seconds late, or even minutes late.
I am running the operation logic directly in the callback method of the timer, so the running thread should be the same as the timer.
Should I create a Task to run the operation instead of running it in the callback method to improve accuracy?
Or should I use a single timer with a fixed (1 second) dueTime to check which operations need to be started?
I don't like this last option because it would be more difficult to handle reentrancy.
Timers fire on a thread pool thread, so you are probably finding that as you add lots of timers that you are exhausting the thread pool.
You could increase the size of the thread pool, or alternatively ensure you have fewer timers than the thread pool size.
Firing off Tasks from the callback likely won't help - since you are going to be fighting for threads from the same thread pool. Unless you use long-running tasks.
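For the long-running variant, a minimal sketch (LongRunningDispatch is a hypothetical helper name). The LongRunning option is a hint to the default scheduler that the work should not occupy an ordinary pool worker.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDispatch
{
    // Starts the operation with the LongRunning hint, which tells the default
    // scheduler that a dedicated thread is preferable to a pool worker.
    public static Task Run(Action operation)
    {
        return Task.Factory.StartNew(
            operation,
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}
```

The timer callback would then just call LongRunningDispatch.Run(DoOperation) and return straight away, so the pool thread that fired the timer is freed immediately.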
We usually set up multiple timers to handle different actions within a single service. We set the intervals and start/stop the timers on the Service Start/Stop/Shutdown events (and have a variable indicating the status for each one, i.e. bool Stopped).
When the timer ticks over, we stop the timer and run the processing, which may take a while depending on the process, i.e. it may take longer than the interval if the interval is short. (This code needs to be in a try-catch so it keeps going on errors.)
After the code has processed, we check the Stopped variable and if it's not stopped we start the timer again. (This handles the reentrancy that you've mentioned and allows the code to stick to the interval as much as possible.)
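A sketch of that stop / process / restart cycle, assuming System.Timers.Timer with AutoReset switched off so that it fires once per Start(). RestartingTimerJob is a hypothetical name and the error handling is reduced to writing to stderr.

```csharp
using System;
using System.Timers;

public sealed class RestartingTimerJob : IDisposable
{
    private readonly Timer _timer;
    private readonly Action _work;
    private volatile bool _stopped = true;

    public RestartingTimerJob(double intervalMs, Action work)
    {
        _work = work;
        _timer = new Timer(intervalMs) { AutoReset = false }; // fire once per Start()
        _timer.Elapsed += OnElapsed;
    }

    public void Start() { _stopped = false; _timer.Start(); }
    public void Stop()  { _stopped = true;  _timer.Stop();  }

    private void OnElapsed(object sender, ElapsedEventArgs e)
    {
        try
        {
            _work();
        }
        catch (Exception ex)
        {
            // Keep going on errors; a real service would log here.
            Console.Error.WriteLine(ex);
        }
        finally
        {
            if (!_stopped)
                _timer.Start(); // re-arm only after the work has finished
        }
    }

    public void Dispose() { Stop(); _timer.Dispose(); }
}
```

Because the timer is only re-armed after the work has returned, two runs of the same job can never overlap, at the cost of the schedule drifting by however long each run takes.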
As far as I know, timers only become reasonably accurate at intervals above roughly 100ms, but that should be close enough for what you want to do.
We have run this concept for years, and it hasn't let us down.
If you are running these tasks as a sub-system of an ASP.NET app, you should also look at HangFire, which can handle background processing, eliminating the need for the Windows service.
How accurate do the timers need to be? You could always use a single timer and run multiple processing threads at the same time, or queue the calls to some operations if they are less critical.
Ok, I came to a decision: since I am not able to easily reproduce the behavior, I chose to solve the root problem and use the Service process to only:
serve WCF requests done by clients
schedule operations (which was problematic)
Every operation that could eat CPU is executed by another process, which is controlled directly by the main process (with System.Diagnostics.Process and its events) and communicates with it through WCF.
When I start the secondary process, I pass to it the PID of the main process through command line. If the latter gets killed, the Process.Exited event fires, and I can close the child process too.
This way the main service usually doesn't use much CPU time, and is free to schedule happily without delays.
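On the child side, watching the parent could be sketched like this. ParentWatcher is a hypothetical helper; it relies only on Process.GetProcessById, EnableRaisingEvents and the Exited event.

```csharp
using System;
using System.Diagnostics;

public static class ParentWatcher
{
    // Invokes onParentExit once the process with the given PID terminates.
    // Returns the Process object so the caller can keep it alive or dispose it.
    public static Process Watch(int parentPid, Action onParentExit)
    {
        // Throws ArgumentException if no process with that PID is running.
        Process parent = Process.GetProcessById(parentPid);
        parent.EnableRaisingEvents = true; // required for Exited to be raised
        parent.Exited += (sender, e) => onParentExit();
        return parent;
    }
}
```

In the child's Main this would be used along the lines of ParentWatcher.Watch(int.Parse(args[0]), () => Environment.Exit(0)), assuming the PID is the first command line argument.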
Thanks to all who gave me advice!
I want to design an application that provides some bi-directional communication between two otherwise completely separate systems. (bridge)
One of them can call my application via web services - the other one is a piece of 3rd party hardware. It speaks RS232. Using a RS232<->ETH transceiver we manage to talk to the piece of hardware using TCP.
The program has the following requirements.
There is a main thread running the "management service". This might be a WCF endpoint or a self-hosted webapi REST service for example. It provides methods to start new worker instances or get a list of all worker instances and their respective states.
There are numerous "worker" threads. Each of them has a state model with 5 states.
In one state for example a TCP listener must be spawned to accept incoming connections from a connected hardware device (socket based programming is mandatory). Once it gets the desired information it sends back a response and transitions into the next state.
It should be possible from the main (manager) thread to (gracefully) end single worker threads (for example if the worker thread is stuck in a state where it cannot recover from)
This is where I am coming from:
I considered WCF workflow services (state model activity) however I wasn't sure how to spawn a TcpListener there - and keep it alive. I do not need any workflow "suspend/serialize and resume/deserialize" like behavior.
The main thread is probably not that much of a concern - it just has to be there and running. It's the child (background) threads and their internal state machine that worry me.
I tried to wrap my mind around how Tasks might help here, but I ended up thinking threads are actually a better fit for the job.
Since there has been a lot of development in .NET (4+) I am not sure which approach to follow... the internet is full of 2005 to 2010 examples which are probably more than just outdated. It is very difficult to separate the DOs from the DONTs.
I'm glad for any hints.
UPDATE: Okay I'll try to clarify what my question is...
I think the easiest way is to provide some pseudo code.
public static void Main()
{
    // Start self-hosted WCF service (due to backwards compatibility, otherwise I'd go with katana/owin) on a worker thread
    StartManagementHeadAsBackgroundThread();
    // Stay alive forever
    while(running)
    {
        // not sure what to put here. Maybe Thread.Sleep(500)?
    }
    // Ok, application is shutting down => somehow "running" is not true anymore.
    // One possible reason might be: The management service's "Shutdown()" method is being called
    // Or the windows service is being stopped...
    WaitForAllChildrenToReachFinalState();
}

private static void StartManagementHeadAsBackgroundThread()
{
    ThreadStart ts = new ThreadStart(...);
    Thread t = new Thread(ts);
    t.Start();
}
The management head (= wcf service) offers a few methods
StartCommunicator() to start new worker threads doing the actual work with 5 states
Shutdown() to shut down the whole application, letting all worker threads finish gracefully (usually a question of minutes)
GetAllCommunicatorInstances() to show a summary of all worker threads and the current state they are in.
DestroyCommunicatorInstance(port) to forcefully end a worker thread - for example if communicator is stuck in a state where it cannot recover from.
Anyway I need to spawn new background threads from the "management" service (StartCommunicator method).
public class Communicator
{
    private MyStateEnum _state;

    public Communicator(int port)
    {
        _state = MyStateEnum.Initializing;
        // do something
        _state = MyStateEnum.Ready;
    }

    public void Run()
    {
        while(true)
        {
            // again a while(true) loop?!
            switch(_state)
            {
                case MyStateEnum.Ready:
                {
                    // start TcpListener - wait for TCP packets to arrive.
                    // parse packets. If "OK" set next good state. Otherwise set error state.
                    break;
                }
            }
            // leave the loop only once the error state has been reached
            if(_state == MyStateEnum.Error)
            {
                Stop();
                break;
            }
        }
    }

    public void Stop()
    {
        // some cleanup.. disposes maybe. Not sure yet.
    }
}
public enum MyStateEnum
{
Initializing, Ready, WaitForDataFromDevice, SendingDataElsewhere, Done, Error
}
So the question is whether my approach will take me anywhere or if I'm completely on the wrong track.
How do I implement this best? Threads? Tasks? Is while(true) a valid thing to do? How do I interact with the communicator instances from within my "management service"? What I am looking for is an annotated boiler plate kind of solution :)
My suggestion would be to use an ASP.NET Web API service and mark the Controller actions as async. From that point on, use Tasks as much as possible, so that when you end up with blocking IO the HTTP server will not be blocked. Personally, I would avoid Threads until you are absolutely sure that you can't achieve the same thing with Tasks.
I would recommend looking at using a thread pool. This will help with managing resources and make for more efficient use of the resources.
As far as terminating threads, thread pool threads are background workers and will be terminated when your service stops, however, from your description above that is not sufficient. Your threads should always have the ability to receive a message asking them to terminate.
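One way to give every worker such a "please terminate" message is a CancellationToken per worker, kept in a registry keyed by port. The sketch below is only an assumption about how the management service from the question might hold its workers; CommunicatorManager and Entry are made-up names, and the method names are borrowed from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class CommunicatorManager
{
    private sealed class Entry
    {
        public Thread Worker;
        public CancellationTokenSource Cts;
    }

    private readonly object _sync = new object();
    private readonly Dictionary<int, Entry> _workers = new Dictionary<int, Entry>();

    // Starts a worker for the port. 'run' is the worker's state-machine loop and
    // is expected to return soon after the token has been cancelled.
    public bool StartCommunicator(int port, Action<CancellationToken> run)
    {
        lock (_sync)
        {
            if (_workers.ContainsKey(port))
                return false; // a worker already exists for this port

            var cts = new CancellationTokenSource();
            var thread = new Thread(() => run(cts.Token)) { IsBackground = true };
            _workers.Add(port, new Entry { Worker = thread, Cts = cts });
            thread.Start();
            return true;
        }
    }

    // Asks the worker to stop and waits up to 'timeout' for it to finish.
    public bool DestroyCommunicatorInstance(int port, TimeSpan timeout)
    {
        Entry entry;
        lock (_sync)
        {
            if (!_workers.TryGetValue(port, out entry))
                return false;
            _workers.Remove(port);
        }

        entry.Cts.Cancel();                 // the "please terminate" message
        return entry.Worker.Join(timeout);  // false means the worker is still stuck
    }
}
```

The worker loop has to cooperate, for example by checking token.IsCancellationRequested between states and by closing its listener when the token fires. A worker that never looks at the token cannot be stopped this way.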
How can a thread blocking method like WaitOne method exposed by AutoResetEvent not take up resources (CPU etc.)?
I would imagine that such a method would simply have a while loop like:
public void WaitOne()
{
    while(IsSet == false)
    {
        // some code to make the thread sleep
    }
    // finally call delegate
}
But that's clearly wrong, since it will make the CPU spin. So what's the secret behind all this black magic?
The method is implemented in the kernel. For each thread that isn't ready to run, Windows keeps a list of all the waitable objects (events, etc.) that the thread is waiting on. When a waitable object is signalled, Windows checks if it can wake up any of the waiting threads. No polling required.
This channel9 talk has a lot of information about how it works:
http://channel9.msdn.com/shows/Going+Deep/Arun-Kishan-Farewell-to-the-Windows-Kernel-Dispatcher-Lock/
Typically, these concepts rely on underlying operating system event constructs to wake up the suspended thread once the event is triggered (or a timeout occurs if applicable). Thus, the thread is in a suspended state and not consuming CPU cycles.
That said, there are other variations of wait in other event types, some of which attempt to spin for a few cycles before suspending the thread in case the event is triggered either before or quickly after the call. There are also some lightweight locking primitives that DO perform spins waiting for a trigger (like SpinWait) but they must be used with care as long waits can drive up the CPU.
The AutoResetEvent and ManualResetEvent take advantage of OS functions. See CreateEvent for more information on this topic.
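To see the behaviour from managed code, here is a small sketch (EventWaitDemo is a hypothetical name, and the 200 ms pause is arbitrary). The waiting thread contains no loop at all; it is parked in WaitOne until the other thread calls Set.

```csharp
using System;
using System.Threading;

public static class EventWaitDemo
{
    // Blocks a worker on an AutoResetEvent, signals it from the calling thread,
    // and reports whether the worker woke up.
    public static bool SignalWakesWaiter()
    {
        using (var signal = new AutoResetEvent(false))
        {
            bool woke = false;
            var waiter = new Thread(() =>
            {
                signal.WaitOne(); // the thread is not scheduled while it waits here
                woke = true;
            });
            waiter.Start();

            Thread.Sleep(200); // the waiter stays blocked during this time
            signal.Set();      // the waiter becomes runnable again
            waiter.Join();     // wait for the waiter to finish before reading 'woke'
            return woke;
        }
    }
}
```

If you watch the process in Task Manager during the 200 ms pause, the waiting thread should not show up as CPU load, which is the difference from the while loop in the question.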
I am working on a project in C#.NET using the .NET framework version 3.5.
My project has a class called Focuser.cs which represents a physical device, a telescope focuser, that can communicate with a PC via a serial (RS-232) port. My class (Focuser) has properties such as CurrentPosition, CurrentTemperature, etc. which represent the current conditions of the focuser and can change at any time. So, my Focuser class needs to continually poll the device for these values and update its internal fields. My question is, what is the best way to perform this continual polling sequence? Occasionally, the user will need to switch the device into a different mode which will require the ability to stop the polling, perform some action, and then resume polling.
My first attempt was to use a timer that ticks every 500ms and then calls up a background worker which polls for one position and one temperature then returns. When the timer ticks, if the background worker isBusy then it just returns and tries again 500ms later. Someone suggested that I get rid of the background worker altogether and just do the poll in the timer tick event. So I set the AutoReset property of the timer to false and then just restart the timer every time a poll finishes. These two techniques seemed to behave the exact same way in my application so I am not sure if one is better than the other. I also tried creating a new thread every time I want to do a poll operation using a new ThreadStart and all that. This also seemed to work fine.
I should mention one other thing. This class is part of a COM object server which basically means that the class library that is produced will be called upon via COM. I am not sure if this has any influence on the answer but I just thought I should throw it out there.
The reason I am asking all of this is that all of my test harness runs and debug builds work just fine but when I do a release build and try to make calls to my class from another application, that application freezes up and I am having a hard time determining the cause.
Any advice, suggestions, comments would be appreciated.
Thanks, Jordan
Remember that the timer hides its own background worker thread, which basically sleeps for the interval, then fires its Elapsed event. Knowing that, it makes sense just to put the polling in Elapsed. This would be the best practice IMO, rather than starting a thread from a thread. You can start and stop Timers as well, so the code that switches modes can Stop() the Timer, perform the task, then Start() it again, and the Timer doesn't even have to know the telescope IsBusy.
However, what I WOULD keep track of is whether another instance of the Elapsed event handler is still running. You could lock the Elapsed handler's code, or you could set a flag, visible from any thread, that indicates another Elapsed() event handler is still working; Elapsed event handlers that see this flag set can exit immediately, avoiding concurrency problems working with the serial port.
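The flag variant can be sketched with Interlocked.CompareExchange, which makes the "test and set" a single atomic step. NonReentrantHandler and TryRun are hypothetical names; the poll delegate stands in for the serial-port code.

```csharp
using System;
using System.Threading;

public sealed class NonReentrantHandler
{
    private readonly Action _poll;
    private int _running; // 0 = idle, 1 = a handler is currently polling

    public NonReentrantHandler(Action poll) { _poll = poll; }

    // Call this from the timer's Elapsed handler.
    // Returns false if another invocation was still running and this one was skipped.
    public bool TryRun()
    {
        // Atomically flip 0 -> 1; if the flag was already 1, someone else is polling.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
            return false;
        try
        {
            _poll();
            return true;
        }
        finally
        {
            Interlocked.Exchange(ref _running, 0);
        }
    }
}
```

Compared with a lock, late handlers do not queue up behind the running one; they just return, which is usually what you want for polling.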
So it looks like you have looked at 2 options:
Timer. The Timer is non-blocking while waiting (uses another thread), so the rest of the program can continue running and be responsive. When the timer event kicks off, you simply get/update the current values.
Timer + BackgroundWorker. The background worker is also simply a separate thread. It may take longer to actually start the thread than to simply get the current values. Unless it takes a long time to get the current values and causes your program to become unresponsive, this is unnecessary complexity.
If getting values is fast enough, stick to #1 for simplicity.
If getting values is slow, #2 will work but unnecessarily has a thread start a thread. Instead, do it with only a BackgroundWorker (no Timer). Create the BackgroundWorker once and store in a variable. No need to recreate it every time. Make sure to set WorkerSupportsCancellation to true. Whenever you want to start checking values, on your main program thread do bgWorker.RunWorkerAsync(). When you want to stop, do bgWorker.CancelAsync(). Inside your DoWork method, have a loop that checks the values and does a Thread.Sleep(500). Since it's a separate thread, it won't make your program unresponsive. In the loop conditions, also check to see if the polling was cancelled and break out. You'll probably need a way to get the values back to the main thread. You can use ReportProgress() if an integer is good enough. Otherwise you can create an object to hold the content, but make sure to lock (object) { } before reading and modifying it. This is a quick summary, but if you go this route I would recommend you read: http://www.albahari.com/threading/part3.aspx#_BackgroundWorker
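A compact sketch of that BackgroundWorker-only approach. PollingWorker is a hypothetical name, readValue stands in for the actual device query, and a single double is used as the shared value to keep it short.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public sealed class PollingWorker
{
    private readonly BackgroundWorker _worker = new BackgroundWorker();
    private readonly object _sync = new object();
    private double _latest;

    // readValue is a hypothetical stand-in for the serial-port query.
    public PollingWorker(Func<double> readValue, int intervalMs)
    {
        _worker.WorkerSupportsCancellation = true;
        _worker.DoWork += (sender, e) =>
        {
            // Poll until CancelAsync() has been called.
            while (!_worker.CancellationPending)
            {
                double value = readValue();
                lock (_sync) { _latest = value; }
                Thread.Sleep(intervalMs);
            }
            e.Cancel = true;
        };
    }

    // Most recent value read by the worker (0 until the first poll completes).
    public double Latest { get { lock (_sync) { return _latest; } } }
    public bool IsBusy { get { return _worker.IsBusy; } }

    public void Start() { if (!_worker.IsBusy) _worker.RunWorkerAsync(); }
    public void Stop()  { _worker.CancelAsync(); }
}
```

Switching the device into another mode would then be Stop(), wait until IsBusy is false, do the mode change, and Start() again.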
Does the process of contacting the telescope and getting the current values actually take long enough to warrant polling on a separate thread? Have you tried dropping the multithreading and just blocking while you get the current value?
To answer your question, however, I would suggest not using a background worker but an actual Thread that updates the properties continuously.
If all these properties are read only (can you set the temp of the telescope?) and there are no dependencies between them (e.g., no transactions are required to update multiple properties at once) you can drop all the blocking code and let your thread update willy-nilly while other threads access the properties.
I suggest a real, dedicated Thread rather than the thread pool just because of a lack of knowledge of what might happen when mixing background threads and COM servers. Also, apartment state might play into this; with a Thread you can try STA but you can't do that with a threadpool thread.
You say the app freezes up in a release build?
To eliminate extra variables, I'd take all the timer/multi-threaded code out of the application (just comment it out), and try it with a straightforward blocking method.
i.e. You click a button, it calls a function, that function hits the COM object for data, and then updates the UI. All in a blocking, synchronous fashion. This will tell you for sure whether it's the multi-threading code that's freezing you up, or if it's the COM interaction itself.
How about starting a background thread with ThreadPool? Then enter a loop based on a bool (While (bContinue)) that loops and does your work and then a Thread.Sleep at the end of the loop - exiting the program would include setting bContinue to false so the thread stops - perhaps hook it up to the OnStop event in a windows service
bool bRet = ThreadPool.QueueUserWorkItem(new WaitCallback(ThreadFunc));

private void ThreadFunc(object objState)
{
    // enter loop
    bContinue = true;
    while (bContinue) {
        // do stuff
        // sleep
        Thread.Sleep(m_iWaitTime_ms);
    }
}
In my application I have to send periodic heartbeats to a "brother" application.
Is this better accomplished with System.Timers.Timer/Threading.Timer or Using a Thread with a while loop and a Thread.Sleep?
The heartbeat interval is 1 second.
while(!exit)
{
//do work
Thread.Sleep(1000);
}
or
myTimer.Start( () => {
//do work
}, 1000); //pseudo code (not actual syntax)...
System.Threading.Timer has my vote.
System.Timers.Timer is meant for server-based timer functionality (your code is running as a server/service on a host machine rather than being run by a user).
A Thread with a While loop and Thread.Sleep command is truly a bad idea given the existence of more robust Timer mechanisms in .NET.
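For the one-second heartbeat, a System.Threading.Timer version might look like the sketch below. Heartbeat and sendBeat are hypothetical names; sendBeat is whatever notifies the brother application.

```csharp
using System;
using System.Threading;

public sealed class Heartbeat : IDisposable
{
    private readonly Timer _timer;

    // sendBeat is a hypothetical callback that notifies the other application.
    public Heartbeat(Action sendBeat, TimeSpan interval)
    {
        // First beat immediately (dueTime zero), then one per interval.
        _timer = new Timer(_ => sendBeat(), null, TimeSpan.Zero, interval);
    }

    // Stops scheduling further beats.
    public void Dispose() { _timer.Dispose(); }
}
```

Usage would be along the lines of new Heartbeat(SendHeartbeat, TimeSpan.FromSeconds(1)), kept in a field for as long as the heartbeat should run. Note that the callback runs on a thread pool thread and, if sendBeat ever takes longer than the interval, two beats can overlap.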
Server Timers are a different creature than sleeping threads.
For one thing, based on the priority of your thread, and what else is running, your sleeping thread may or may not be awoken and scheduled to run at the interval you ask. If the interval is long enough, and the precision of scheduling doesn't really matter, Thread.Sleep() is a reasonable choice.
Timers, on the other hand, can raise their events on any thread, allowing for better scheduling capabilities. The cost of using timers, however, is a little bit more complexity in your code - and the fact that you may not be able to control which thread runs the logic that the timer event fires on. From the docs:
The server-based Timer is designed for
use with worker threads in a
multithreaded environment. Server
timers can move among threads to
handle the raised Elapsed event,
resulting in more accuracy than
Windows timers in raising the event on
time.
Another consideration is that timers invoke their Elapsed delegate on a ThreadPool thread. Depending on how time-consuming and/or complicated your logic is, you may not want to run it on the thread pool - you may want a dedicated thread. Another factor with timers, is that if the processing takes long enough, the timer event may be raised again (concurrently) on another thread - which can be a problem if the code being run is not intended or structured for concurrency.
Don't confuse Server Timers with "Windows Timers". The latter usually refers to WM_TIMER messages that can be delivered to a window, allowing an app to schedule and respond to timed processing on its main thread without sleeping. However, Windows Timers can also refer to the Win API for low-level timing (which is not the same as WM_TIMER).
Neither :)
Sleeping is typically frowned upon (unfortunately I cannot remember the particulars, but for one, it is an uninterruptible "block"), and Timers come with a lot of baggage. If possible, I would recommend System.Threading.AutoResetEvent, like this:
// initially set to a "non-signaled" state, ie will block
// if inspected
private readonly AutoResetEvent _isStopping = new AutoResetEvent (false);

public void Process()
{
    TimeSpan waitInterval = TimeSpan.FromMilliseconds (1000);
    // will block for 'waitInterval', unless another thread,
    // say a thread requesting termination, wakes you up. if
    // no one signals you, WaitOne returns false, otherwise
    // if someone signals WaitOne returns true
    for (; !_isStopping.WaitOne (waitInterval); )
    {
        // do your thang!
    }
}
Using an AutoResetEvent (or its cousin ManualResetEvent) guarantees a true block with thread-safe signalling (for such things as the graceful termination above). At worst, it is a better alternative to Sleep.
Hope this helps :)
I've found that the only timer implementation that actually scales is System.Threading.Timer. All the other implementations seem pretty bogus if you're dealing with a non trivial number of scheduled items.