SpinWait vs Sleep waiting. Which one to use?

SpinWait vs Sleep waiting. Which one to use? - c#

Is it efficient to
SpinWait.SpinUntil(() => myPredicate(), 10000)
for a timeout of 10000ms
or
Is it more efficient to use Thread.Sleep polling for the same condition
For example something along the lines of the following SleepWait function:
public bool SleepWait(int timeOut)
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
while (!myPredicate() && stopwatch.ElapsedMilliseconds < timeOut)
{
Thread.Sleep(50)
}
return myPredicate()
}
I'm concerned that all the yielding of SpinWait may not be a good usage pattern if we are talking about timeouts over 1sec? Is this a valid assumption?
Which approach do you prefer and why? Is there another even better approach?
Update - Becoming more specific:
Is there a way to Make BlockingCollection Pulse a sleeping thread when it reaches bounded capacity? I rather avoid a busy waits alltogether as Marc Gravel suggests.

In .NET 4 SpinWait performs CPU-intensive spinning for 10 iterations before yielding. But it does not return to the caller immediately after each of those cycles; instead, it calls Thread.SpinWait to spin via the CLR (essentially the OS) for a set time period. This time period is initially a few tens of nano-seconds but doubles with each iteration until the 10 iterations are complete. This enables clarity/predictability in the total time spent spinning (CPU-intensive) phase, which the system can tune according to conditions (number of cores etc.). If SpinWait remains in the spin-yielding phase for too long it will periodically sleep to allow other threads to proceed (see J. Albahari's blog for more information). This process is guaranteed to keep a core busy...
So, SpinWait limits the CPU-intensive spinning to a set number of iterations, after which it yields its time slice on every spin (by actually calling Thread.Yield and Thread.Sleep), lowering its resource consumption. It will also detect if the user is running a single core machine and yield on every cycle if that is the case.
With Thread.Sleep the thread is blocked. But this process will not be as expensive as the above in terms of CPU.

The best approach is to have some mechanism to actively detect the thing becoming true (rather than passively polling for it having become true); this could be any kind of wait-handle, or maybe a Task with Wait, or maybe an event that you can subscribe to to unstick yourself. Of course, if you do that kind of "wait until something happens", that is still not as efficient as simply having the next bit of work done as a callback, meaning: you don't need to use a thread to wait. Task has ContinueWith for this, or you can just do the work in an event when it gets fired. The event is probably the simplest approach, depending on the context. Task, however, already provides most-everything you are talking about here, including both "wait with timeout" and "callback" mechanisms.
And yes, spinning for 10 seconds is not great. If you want to use something like your current code, and if you have reason to expect a short delay, but need to allow for a longer one - maybe SpinWait for (say) 20ms, and use Sleep for the rest?
Re the comment; here's how I'd hook an "is it full" mechanism:
private readonly object syncLock = new object();
public bool WaitUntilFull(int timeout) {
if(CollectionIsFull) return true; // I'm assuming we can call this safely
lock(syncLock) {
if(CollectionIsFull) return true;
return Monitor.Wait(syncLock, timeout);
}
}
with, in the "put back into the collection" code:
if(CollectionIsFull) {
lock(syncLock) {
if(CollectionIsFull) { // double-check with the lock
Monitor.PulseAll(syncLock);
}
}
}

Related

Is it Really Busy Waiting If I Thread.Sleep()?

My question is a bit nit-picky on definitions:
Can the code below be described as "busy waiting"? Despite the fact that it uses Thread.Sleep() to allow for context switching?
while (true) {
if (work_is_ready){
doWork();
}
Thread.Sleep(A_FEW_MILLISECONDS);
}
PS - The current definition for busy waiting in Wikipedia suggests that it is a "less wasteful" form of busy waiting.

Any polling loop, regardless of the time between polling operations, is a busy wait. Granted, sleeping a few milliseconds is a lot less "busy" than no sleep at all, but it still involves processing: thread context switches and some minimal condition checking.
A non-busy wait is a blocking call. The non-busy version of your example would involve waiting on a synchronization primitive such as an event or a condition variable. For example, this pseudocode:
// initialize an event to be set when work is ready
Event word_is_ready;
work_is_ready.Reset();
// in code that processes work items
while (true)
{
work_is_ready.Wait(); // non-busy wait for work item
do_work();
}
The difference here is that there is no periodic polling. The Wait call blocks and the thread is never scheduled until the event is set.

That's not busy waiting. Busy waiting, or spinning, involves the opposite: avoiding context switching.
If you want to allow other threads to run, if and only if other threads are ready to run, to avoid deadlock scenarios in single threaded CPUs (e.g., the current thread needs work_is_ready to be set to true, but if this thread doesn't give up the processor and lets others run, it will never be set to true), you can use Thread.Sleep(0).
A much better option would be to use SpinWait.SpinUntil
SpinWait.SpinUntil(() => work_is_ready);
doWork();
SpinWait emits a special rep; nop (repeat no-op) or pause instruction that lets the processor know you're busy waiting, and is optimized for HyperThreading CPUs.
Also, in single core CPUs, this will yield the processor immediately (because busy waiting is completely useless if there's only one core).
But spinning is only useful if you're absolutely sure you won't be waiting on a condition for longer than it would take the processor to switch the context out and back in again. I.e., no more than a few microseconds.
If you want to poll for a condition every few milliseconds, then you should use a blocking synchronization primitive, as the wiki page suggests. For your scenario, I'd recommend an AutoResetEvent, which blocks the thread upon calling WaitOne until the event has been signaled (i.e, the condition has become true).
Read also: Overview of Synchronization Primitives

It depends on the operating system and the exact number of milliseconds you are sleeping. If the sleep is sufficiently long that the operating system can switch to another task, populate its caches, and usefully run that task until your task is ready-to-run again, then it's not busy waiting. If not, then it is.
To criticize this code, I would say something like this: "This code may busy wait if the sleep is too small to allow the core to do useful work between checks. It should be changed so that the code that makes this code need to do work triggers that response."
This poor design creates a needless design problem -- how long should the sleep be? If it's too short, you busy wait. If it's too long, the work sits undone. Even if it's long enough that you don't busy wait, you force needless context switches.

When your code is sleeping for a moment, technically it will be in sleep state freeing up a CPU. While in busy waiting your code is holding the CPU until condition is met.
Can the code below be described as "busy waiting"? Despite the fact that it uses Thread.Sleep() to allow for context switching?
It is not busy waiting, rather polling which is more performant that busy waiting. There is a difference between both
Simply put, Busy-waiting is blocking where as Polling is non-blocking.

Busy waiting is something like this:
for(;;) {
if (condition) {
break;
}
}
The condition could be "checking the current time" (for example performance counter polling). With this you can get a very accurate pause in your thread. This is useful for example for low level I/O (toggling GPIOs etc.). Because of this your thread is running all the time, and if you are on cooperative multi threading, the you are fully in control, how long the thread will stay in that wait for you. Usually this kind of threads have a high priority set and are uninterruptible.
Now a non-busy waiting means, the thread is non-busy. It allows another threads to execute, so there is a context switch. To allow a context switch, in most languages and OS you can simply use a sleep(). There are another similar functions, like yield(), wait(), select(), etc. It depends on OS and language, if they are non-busy or busy implemented. But in my experience in all cases a sleep > 0 was always non-busy.
Advantage of non-busy waiting is allowing another threads to run, which includes idle threads. With this your CPU can go into power saving mode, clock down, etc. It can also run another tasks. After the specified time the scheduler tries to go back to your thread. But is is just a try. It is not exact and it may be a little bit longer, than your sleep defines.
I think. This is clear now.
And now the big question: Is this busy, or non-busy waiting:
for(;;) {
if (condition) {
break;
}
sleep(1);
}
The answer is: is is a non-busy waiting. sleep(1) allows the thread to perform a context-switch.
Now the next question: Is the second for() busy, or non-busy waiting:
function wait() {
for(;;) {
if (condition) {
break;
}
}
}
for(;;) {
wait();
if (condition) {
break;
}
sleep(1);
}
It is hard to say. It depends on the real execution time of the wait() function. If it does nothing, then the CPU is almost the entire time in sleep(1). And this would be a non-blocking for-loop. But if wait() is a heavy calculation function without allowing a thread context switch, then this whole for-loop may become a blocking function, even if there is a sleep(1). Think of the worst-case: the wait() function is never returning back to caller, because the condition isn't hit for a long time.
This here is hard to answer, because we don't know the conditions. You can imagine the problem, where you cannot answer the question, because you don't know the conditions, in the following way:
if (unkonwnCondition) {
for(;;) {
if (condition) {
break;
}
}
} else {
for(;;) {
if (condition) {
break;
}
sleep(1);
}
}
As you see, its the same: because you don't know the conditions, you cannot say if the wait is busy or non-busy.

How safe are Interlocked.Exchange?

Beeing a threading noob, I'm trying to find a way w/o locking objects that allows me to enqueue a threadpool task, in such way that it has a max degree of parallelism = 1.
Will this code do what I think it does?
private int status;
private const int Idle = 0;
private const int Busy = 1;
private void Schedule()
{
// only schedule if we idle
// we become busy and get the old value to compare with
// in an atomic way (?)
if (Interlocked.Exchange(ref status, Busy) == Idle)
{
ThreadPool.QueueUserWorkItem(Run);
}
}
That is, in a threadsafe way enqueue the Run method if the status is Idle.
It seems to work fine in my tests, but since this is not my area, I'm not sure.

Yes, this will do what you want. It will never allow you to get a return value of Idle when in fact status is Busy, and it will set status to Busy in the same operation, with no chance of a conflict. So far so good.
However, if you're using a ConcurrentQueue<T> later on, why do you even do this? Why do you use a ThreadPool to enqueue Run over and over again, instead of just having a single thread continually take data from the concurrent queue using TryDequeue?
In fact, there is a producer-consumer collection that is specifically designed for this, the BlockingCollection<T>. Your consumer thread would just call Take (with a cancellation token if necessary - probably a good idea) and that either returns a value as in ConcurrentQueue<T>; or if no value is available, blocks the thread until there is something to take. When some other thread adds an item to the collection, it will notify as many consumers as it has data for (in your case, no need for anything complex, since you only have one consumer).
That means you only have to handle starting and stopping a single thread, which will run an "infinite" cycle, which will call col.Take, while the producers call col.Add.
Of course, this assumes you have .NET 4.0+ available, but then again, you probably do, since you're using ConcurrentQueue<T>.

Large Number of Timers

I need to write a component that receives an event (the event has a unique ID). Each event requires me to send out a request. The event specifies a timeout period, which to wait for a response from the request.
If the response comes before the timer fires, great, I cancel the timer.
If the timer fires first, then the request timed out, and I want to move on.
This timeout period is specified in the event, so it's not constant.
The expected timeout period is in the range of 30 seconds to 5 minutes.
I can see two ways of implementing this.
Create a timer for each event and put it into a dictionary linking the event to the timer.
Create an ordered list containing the DateTime of the timeout, and a new thread looping every 100ms to check if something timed out.
Option 1 would seem like the easiest solution, but I'm afraid that creating so many timers might not be a good idea because timers might be too expensive. Are there any pitfalls when creating a large number of timers? I suspect that in the background, the timer implementation might actually be an efficient implementation of Option 2. If this option is a good idea, which timer should I use? System.Timers.Timer or System.Threading.Timer.
Option 2 seems like more work, and may not be an efficient solution compared to Option 1.
Update
The maximum number of timers I expect is in the range of 10000, but more likely in the range of 100. Also, the normal case would be the timer being canceled before firing.
Update 2
I ran a test using 10K instances of System.Threading.Timer and System.Timers.Timer, keeping an eye on thread count and memory. System.Threading.Timer seems to be "lighter" compared to System.Timers.Timer judging by memory usage, and there was no creation of excessive number of threads for both timers (ie - thread pooling working properly). So I decided to go ahead and use System.Threading.Timer.

I do this a lot in embedded systems (pure c), where I can't burn a lot of resources (e.g. 4k of RAM is the system memory). This is one approach that has been used (successfully):
Create a single system timer (interrupt) that goes off on a periodic basis (e.g. every 10 ms).
A "timer" is an entry in a dynamic list that indicates how many "ticks" are left till the timer goes off.
Each time the system timer goes off, iterate the list and decrement each of the "timers". Each one that is zero is "fired". Remove it from the list and do whatever the timer was supposed to do.
What happens when the timer goes off depends on the application. It may be a state machine gets run. It may be a function gets called. It may be an enumeration telling the execution code what to do with the parameter sent it the "Create Timer" call. The information in the timer structure is whatever is necessary in the context of the design. The "tick count" is the secret sauce.
We also have created this returning an "ID" for the timer (usually the address of the timer structure, which is drawn from a pool) so it can be cancelled or status on it can be obtained.
Convenience functions convert "seconds" to "ticks" so the API of creating the timers is always in terms of "seconds" or "milliseconds".
You set the "tick" interval to a reasonable value for granularity tradeoff.
I have done other implementations of this in C++, C#, objective-C, with little change in the general approach. It is a very general timer subsystem design/architecture. You just need something to create the fundamental "tick".
I even did it once with a tight "main" loop and a stopwatch from the high-precision internal timer to create my own "simulated" tick when I did not have a timer. I do not recommend this approach; I was simulating hardware in a straight console app and did not have access to the system timers, so it was a bit of an extreme case.
Iterating over a list of a hundreds of timers 10 times a second is not that big a deal on a modern processor. There are ways you can overcome this as well by inserting the items with "delta seconds" and putting them into the list in sorted order. This way you only have to check the ones at the front of the list. This gets you past scaling issues, at least in terms of iterating the list.
Was this helpful?

You should do it the simplest way possible. If you are concerned about performance, you should run your application through a profiler and determine the bottlenecks. You might be very surprised to find out it was some code which you least expected, and you had optimized your code for no reason. I always write the simplest code possible as this is the easiest. See PrematureOptimization
I don't see why there would be any pitfalls with a large number of timers. Are we talking about a dozen, or 100, or 10,000? If it's very high you could have issues. You could write a quick test to verify this.
As for which of those Timer classes to use: I don't want to steal anyone elses answer who probably did much more research: check out this answer to that question`

The first option simply isn't going to scale, you are going to need to do something else if you have a lot of concurrent timeouts. (If you don't know if how many you have is enough to be a problem though, feel free to try using timers to see if you actually have a problem.)
That said, your second option would need a bit of tweaking. Rather than having a tight loop in a new thread, just create a single timer and set its interval (each time it fires) to be the timespan between the current time and the "next" timeout time.

Let me propose a different architecture: for each event, just create a new Task and send the request and wait1 for the response there.
The ~1000 tasks should scale just fine, as shown in this early demo. I suspect ~10000 tasks would still scale, but I haven't tested that myself.
1 Consider implementing the wait by attaching a continuation on Task.Delay (instead of just Thread.Sleep), to avoid under-subscription.

I think Task.Delay is a really good option. Here is the test code for measuring how many concurrent tasks can be executed in different delay times. This code is also calculating error statistics for waiting time accuracy.
static async Task Wait(int delay, double[] errors, int index)
{
var sw = new Stopwatch();
sw.Start();
await Task.Delay(delay);
sw.Stop();
errors[index] = Math.Abs(sw.ElapsedMilliseconds - delay);
}
static void Main(string[] args)
{
var trial = 100000;
var minDelay = 1000;
var maxDelay = 5000;
var errors = new double[trial];
var tasks = new Task[trial];
var rand = new Random();
var sw = new Stopwatch();
sw.Start();
for (int i = 0; i < trial; i++)
{
var delay = rand.Next(minDelay, maxDelay);
tasks[i] = Wait(delay, errors, i);
}
sw.Stop();
Console.WriteLine($"{trial} tasks started in {sw.ElapsedMilliseconds} milliseconds.");
Task.WaitAll(tasks);
Console.WriteLine($"Avg Error: {errors.Average()}");
Console.WriteLine($"Min Error: {errors.Min()}");
Console.WriteLine($"Max Error: {errors.Max()}");
Console.ReadLine();
}
You may change the parameters to see different results. Here are several results in milliseconds:
100000 tasks started in 9353 milliseconds.
Avg Error: 9.10898
Min Error: 0
Max Error: 110

Purpose of Thread.Sleep(1)?

I was reading over some threading basics and on the msdn website I found this snippet of code.
// Put the main thread to sleep for 1 millisecond to
// allow the worker thread to do some work:
Thread.Sleep(1);
Here is a link to the the page: http://msdn.microsoft.com/en-us/library/7a2f3ay4(v=vs.80).aspx.
Why does the main thread have sleep for 1 millisecond? Will the secondary thread not start its tasks if the main thread is continuously running? Or is the example meant for a task that takes 1 millisecond to do? As in if the task generally takes 5 seconds to complete the main thread should sleep for 5000 milliseconds?
If this is solely regarding CPU usage, here is a similar Question about Thread.Sleep.
Any comments would be appreciated.
Thanks.

The 1 in that code is not terribly special; it will always end up sleeping longer than that, as things aren't so precise, and giving up your time slice does not equal any guarantee from the OS when you will get it back.
The purpose of the time parameter in Thread.Sleep() is that your thread will yield for at least that amount of time, roughly.
So that code is just explicitly giving up its time slot. Generally speaking, such a bit of code should not be needed, as the OS will manage your threads for you, preemptively interrupting them to work on other threads.
This kind of code is often used in "threading examples", where the writer wants to force some artificial occurrence to prove some race condition, or the like (that appears to be the case in your example)
As noted in Jon Hanna's answer to this same question, there is a subtle but important difference between Sleep(0) and Sleep(1) (or any other non-zero number), and as ChrisF alludes to, this can be important in some threading situations.
Both of those involve thread priorities; Threads can be given higher/lower priorities, such that lower priority threads will never execute as long as there are higher priority threads that have any work to do. In such a case, Sleep(1) can be required... However...
Low-priority threads are also subject to what other processes are doing on the same system; so while your process might have no higher-priority threads running, if any others do, yours still won't run.
This isn't usually something you ever need to worry about, though; the default priority is the 'normal' priority, and under most circumstances, you should not change it. Raising or lowering it has numerous implications.

Thread.Sleep(0) will give up the rest of a thread's time-slice if a thread of equal priority is ready to schedule.
Thread.Sleep(1) (or any other value, but 1 is the lowest to have this effect) will give up the rest of the thread's time-slice unconditionally. If it wants to make sure that even threads with lower priority have a chance to run (and such a thread could be doing something that is blocking this thread, it has to), then it's the one to go for.
http://www.bluebytesoftware.com/blog/PermaLink,guid,1c013d42-c983-4102-9233-ca54b8f3d1a1.aspx has more on this.

If the main thread doesn't sleep at all then the other threads will not be able to run at all.
Inserting a Sleep of any length allows the other threads some processing time. Using a small value (of 1 millisecond in this case) means that the main thread doesn't appear to lock up. You can use Sleep(0), but as Jon Hanna points out that has a different meaning to Sleep(1) (or indeed any positive value) as it only allows threads of equal priority to run.
If the task takes 5 seconds then the main thread will sleep for a total of 5,000 milliseconds, but spread out over a longer period.

It's only for the sake of the example- they want to make sure that the worker thread has the chance to print "worker thread: working..." at least once before the main thread kills it.
As Andrew implied, this is important in the example especially because if you were running on a single-processor machine, the main thread may not give up the processor, killing the background thread before it has a chance to iterate even once.

Interesting thing I noticed today. Interrupting a thread throws a ThreadInterruptedException. I was trying to catch the exception but could not for some reason. My coworker recommended that I put Thread.Sleep(1) prior to the catch statement and that allowed me to catch the ThreadInterruptedException.
// Start the listener
tcpListener_ = new TcpListener(ipAddress[0], int.Parse(portNumber_));
tcpListener_.Start();
try
{
// Wait for client connection
while (true)
{
// Wait for the new connection from the client
if (tcpListener_.Pending())
{
socket_ = tcpListener_.AcceptSocket();
changeState(InstrumentState.Connected);
readSocket();
}
Thread.Sleep(1);
}
}
catch (ThreadInterruptedException) { }
catch (Exception ex)
{
MessageBox.Show(ex.Message, "Contineo", MessageBoxButtons.OK, MessageBoxIcon.Error);
Console.WriteLine(ex.StackTrace);
}
Some other class...
if (instrumentThread_ != null)
{
instrumentThread_.Interrupt();
instrumentThread_ = null;
}

does Monitor.Wait Needs synchronization?

I have developed a generic producer-consumer queue which pulses by Monitor in the following way:
the enqueue :
public void EnqueueTask(T task)
{
_workerQueue.Enqueue(task);
Monitor.Pulse(_locker);
}
the dequeue:
private T Dequeue()
{
T dequeueItem;
if (_workerQueue.Count > 0)
{
_workerQueue.TryDequeue(out dequeueItem);
if(dequeueItem!=null)
return dequeueItem;
}
while (_workerQueue.Count == 0)
{
Monitor.Wait(_locker);
}
_workerQueue.TryDequeue(out dequeueItem);
return dequeueItem;
}
the wait section produces the following SynchronizationLockException :
"object synchronization method was called from an unsynchronized block of code"
do i need to synch it? why ? Is it better to use ManualResetEvents or the Slim version of .NET 4.0?

Yes, the current thread needs to "own" the monitor in order to call either Wait or Pulse, as documented. (So you'll need to lock for Pulse as well.) I don't know the details for why it's required, but it's the same in Java. I've usually found I'd want to do that anyway though, to make the calling code clean.
Note that Wait releases the monitor itself, then waits for the Pulse, then reacquires the monitor before returning.
As for using ManualResetEvent or AutoResetEvent instead - you could, but personally I prefer using the Monitor methods unless I need some of the other features of wait handles (such as atomically waiting for any/all of multiple handles).

From the MSDN description of Monitor.Wait():
Releases the lock on an object and blocks the current thread until it reacquires the lock.
The 'releases the lock' part is the problem, the object isn't locked. You are treating the _locker object as though it is a WaitHandle. Doing your own locking design that's provably correct is a form of black magic that's best left to our medicine man, Jeffrey Richter and Joe Duffy. But I'll give this one a shot:
public class BlockingQueue<T> {
private Queue<T> queue = new Queue<T>();
public void Enqueue(T obj) {
lock (queue) {
queue.Enqueue(obj);
Monitor.Pulse(queue);
}
}
public T Dequeue() {
T obj;
lock (queue) {
while (queue.Count == 0) {
Monitor.Wait(queue);
}
obj = queue.Dequeue();
}
return obj;
}
}
In most any practical producer/consumer scenario you will want to throttle the producer so it cannot fill the queue unbounded. Check Duffy's BoundedBuffer design for an example. If you can afford to move to .NET 4.0 then you definitely want to take advantage of its ConcurrentQueue class, it has lots more black magic with low-overhead locking and spin-waiting.

The proper way to view Monitor.Wait and Monitor.Pulse/PulseAll is not as providing a means of waiting, but rather (for Wait) as a means of letting the system know that the code is in a waiting loop which can't exit until something of interest changes, and (for Pulse/PulseAll) as a means of letting the system know that code has just changed something that might cause satisfy the exit condition some other thread's waiting loop. One should be able to replace all occurrences of Wait with Sleep(0) and still have code work correctly (even if much less efficiently, as a result of spending CPU time repeatedly testing conditions that haven't changed).
For this mechanism to work, it is necessary to avoid the possibility of the following sequence:
The code in the wait loop tests the condition when it isn't satisfied.
The code in another thread changes the condition so that it is satisfied.
The code in that other thread pulses the lock (which nobody is yet waiting on).
The code in the wait loop performs a Wait since its condition wasn't satisfied.
The Wait method requires that the waiting thread have a lock, since that's the only way it can be sure that the condition it's waiting upon won't change between the time it's tested and the time the code performs the Wait. The Pulse method requires a lock because that's the only way it can be sure that if another thread has "committed" itself to performing a Wait, the Pulse won't occur until after the other thread actually does so. Note that using Wait within a lock doesn't guarantee that it's being used correctly, but there's no way that using Wait outside a lock could possibly be correct.
The Wait/Pulse design actually works reasonably well if both sides cooperate. The biggest weaknesses of the design, IMHO, are (1) there's no mechanism for a thread to wait until any of a number of objects is pulsed; (2) even if one is "shutting down" an object such that all future wait loops should exit immediately (probably by checking an exit flag), the only way to ensure that any Wait to which a thread has committed itself will get a Pulse is to acquire the lock, possibly waiting indefinitely for it to become available.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.