Under what circumstances should each of the following synchronization objects be used?
ReaderWriter lock
Semaphore
Mutex
Since wait() will return once for each time post() is called, semaphores are a basic producer-consumer model - the simplest form of inter-thread message except maybe signals. They are used so one thread can tell another thread that something has happened that it's interested in (and how many times), and for managing access to resources which can have at most a fixed finite number of users. They offer ordering guarantees needed for multi-threaded code.
Mutexes do what they say on the tin - "mutual exclusion". They ensure that the right to access some resource is "held" by only on thread at a time. This gives guarantees of atomicity and ordering needed for multi-threaded code. On most OSes, they also offer reasonably sophisticated waiter behaviour, in particular to avoid priority inversion.
Note that a semaphore can easily be used to implement mutual exclusion, but that because a semaphore does not have an "owner thread", you don't get priority inversion avoidance with semaphores. So they are not suitable for all uses which require a "lock".
ReaderWriter locks are an optimisation over mutexes, in cases where you will have a lot of contention, most accesses are read-only, and simultaneous reads are permissible for the data structure being protected. In such cases, exclusion is required only when a writer is involved - readers don't need to be excluded from each other. To promote a reader to writer all other readers must finish (or abort and start waiting to retry if they also wish to become writers) before the writer lock is acquired. ReaderWriter locks are likely to be slower in cases where they aren't faster, due to the additional book-keeping they do over mutexes.
Condition variables are for allowing threads to wait on certain facts or combinations of facts being true, where the condition in question is more complex than just "it has been poked" as for semaphores, or "nobody else is using it" for mutexes and the writer part of reader-writer locks, or "no writers are using it" for the reader part of reader-writer locks. They are also used where the triggering condition is different for different waiting threads, but depends on some or all of the same state (memory locations or whatever).
Spin locks are for when you will be waiting a very short period of time (like a few cycles) on one processor or core, while another core (or piece of hardware such as an I/O bus) simultaneously does some work that you care about. In some cases they give a performance enhancement over other primitives such as semaphores or interrupts, but must be used with extreme care (since lock-free algorithms are difficult in modern memory models) and only when proven necessary (since bright ideas to avoid system primitives are often premature optimisation).
Btw, these answers aren't C# specific (hence for example the comment about "most OSes"). Richard makes the excellent point that in C# you should be using plain old locks where appropriate. I believe Monitors are a mutex/condition variable pair rolled into one object.
I would say each of them can be "the best" - depends on the use case ;-)
Simple answer: almost never.
The best type of locking is to not need a lock (no shared mutable state).
If you do need a lock, try and use a Monitor (via a lock statement), unless you have specific needs for something different (in which case see Onebyone's answer
Additionally, prefer ReaderWriteLockSlim to ReaderWriterLock (except in the extremely rare case of requiring the latter's fairness).
Related
Imagine you are utilizing Parallelism in a multi-core system.
Is it not completely possible that the same instructions may be executed simultaneously?
Take the following code:
int i = 0;
if( blockingCondition )
{
lock( objLock )
{
i++;
}
}
In my head, it seems that it is very possible on a system with multiple cores and parallelism that the blockingCondition could be checked at precisely the same moment, leading to the lock being attempted at the same moment, and so on...Is this true?
If so, how can you ensure synchronization across processors?
Also, does the .net TPL handle this type of synchronization? What about other languages?
EDIT
Please note that this is not about threads, but Tasks and Parallel-Processisng.
EDIT 2
OK, thanks for the information everyone. So is it true that the OS will ensure that writing to memory is serialized, ensuring multi-core synchronization via volatile reads?
To understand why this works, bear in mind:
Locking a lock (i.e. incrementing a
lock semaphore on the object) is an
operation that blocks if the object
is already locked.
The two steps of lock, a) checking the lock
semaphore is free, b) and actually
locking the object, are performed
'simultaneously' - i.e. they are a
monolithic or atomic operation as
far as relationship between CPU and
memory is concerned.
Therefore, you can see that, if 2 threads enter your if-block, one of the two threads will acquire the lock, and the other will block until the first one has finished the if.
A lock like you have described here is a "Monitor" style lock on the objLock. As you've noted, it is entirely possible, under a multi-core system, for the two "lock" calls to begin simultaneously. However, any high level application environment which uses monitors will have translated the monitor into semaphore requests (or, depending on your OS and language particulars, mutex requests) in the compiled byte code.
Semaphores are implemented at the operating system and/or hardware level, and higher level languages bind to them. At the OS level, they are "guaranteed" to be atomic. That is, any program acquiring a semaphore is guaranteed to be the only one doing so at that point in time. If two programs, or two threads within a program attempt to acquire the lock at the same time, one will go first (and succeed), and the other will go second (and fail).
At this point, the "how do you ensure synchronisation" stops being a problem for the application programmer to worry about, and starts being a problem for the operating system designer and the hardware designer.
The upshot of it is, as an application coder, you can safely assume that "lock(objLock)" will be an atomic call no matter how many CPUs you plug into your system.
Your concern is precisely why we need a special mechanism like lock and cannot simply use a boolean flag.
The solution to your 'simultaneous' problem is in the algorithm that lock (which calls Monitor.Enter()) uses. It involves memory barriers and knowledge of the very lowlevel memory mechanics to ensure that no 2 threads can acquire the lock at the same time.
Note: I'm talking about .NET only, not Java.
Please explain what are the main differences and when should I use what.
The focus on web multi-threaded applications.
lock allows only one thread to execute the code at the same time. ReaderWriterLock may allow multiple threads to read at the same time or have exclusive access for writing, so it might be more efficient. If you are using .NET 3.5 ReaderWriterLockSlim is even faster. So if your shared resource is being read more often than being written, use ReaderWriterLockSlim. A good example for using it is a file that you read very often (on each request) and you update the contents of the file rarely. So when you read from the file you enter a read lock so that many requests can open it for reading and when you decide to write you enter a write lock. Using a lock on the file will basically mean that you can serve one request at a time.
Consider using ReaderWriterLock if you have lots of threads that only need to read the data and these threads are getting blocked waiting for the lock and and you don’t often need to change the data.
However ReaderWriterLock may block a thread that is waiting to write for a long time.
Therefore only use ReaderWriterLock after you have confirmed you get high contention for the lock in “real life” and you have confirmed you can’t redesign your locking design to reduce how long the lock is held for.
Also consider if you can't rather store the shared data in a database and let it take care of all the locking, as this is a lot less likely to give you a hard time tracking down bugs, iff a database is fast enough for your application.
In some cases you may also be able to use the Aps.net cache to handle shared data, and just remove the item from the cache when the data changes. The next read can put a fresh copy in the cache.
Remember
"The best kind of locking is the
locking you don't need (i.e. don't
share data between threads)."
Monitor and the underlying "syncblock" that can be associated with any reference object—the underlying mechanism under C#'s lock—support exclusive execution. Only one thread can ever have the lock. This is simple and efficient.
ReaderWriterLock (or, in V3.5, the better ReaderWriterLockSlim) provide a more complex model. Avoid unless you know it will be more efficient (i.e. have performance measurements to support yourself).
The best kind of locking is the locking you don't need (i.e. don't share data between threads).
ReaderWriterLock allows you to have multiple threads hold the ReadLock at the same time... so that your shared data can be consumed by many threads at once. As soon as a WriteLock is requested no more ReadLocks are granted and the code waiting for the WriteLock is blocked until all the threads with ReadLocks have released them.
The WriteLock can only ever be held by one thread, allow your 'data updates' to appear atomic from the point of view of the consuming parts of your code.
The Lock on the other hand only allows one thread to enter at a time, with no allowance for threads that are simply trying to consume the shared data.
ReaderWriterLockSlim is a new more performant version of ReaderWriterLock with better support for recursion and the ability to have a thread move from a Lock that is essentially a ReadLock to the WriteLock smoothly (UpgradeableReadLock).
ReaderWriterLock/Slim is specifically designed to help you efficiently lock in a multiple consumer/ single producer scenario. Doing so with the lock statement is possible, but not efficient. RWL/S gets the upper hand by being able to aggressively spinlock to acquire the lock. That also helps you avoid lock convoys, a problem with the lock statement where a thread relinquishes its thread quantum when it cannot acquire the lock, making it fall behind because it won't be rescheduled for a while.
It is true that ReaderWriterLockSlim is FASTER than ReaderWriterLock. But the memory consumption by ReaderWriterLockSlim is outright outrageous. Try attaching a memory profiler and see for yourself. I would pick ReaderWriterLock anyday over ReaderWriterLockSlim.
I would suggest looking through http://www.albahari.com/threading/part4.aspx#_Reader_Writer_Locks. It talks about ReaderWriterLockSlim (which you want to use instead of ReaderWriterLock).
I have used generic queue in C# collection and everyone says that it is better to use the object of System.Collection.Generic.Queue because of thread safety.
Please advise on the right decision to use Queue object, and how it is thread safe?
"Thread safe" is a bit of an unfortunate term because it doesn't really have a solid definition. Basically it means that certain operations on the object are guaranteed to behave sensibly when the object is being operated on via multiple threads.
Consider the simplest example: a counter. Suppose you have two threads that are incrementing a counter. If the sequence of events goes:
Thread one reads from counter, gets
zero.
Thread two reads from counter, gets
zero.
Thread one increments zero, writes
one to counter.
Thread two increments zero, writes
one to counter.
Then notice how the counter has "lost" one of the increments. Simple increment operations on counters are not threadsafe; to make them threadsafe you can use locks, or InterlockedIncrement.
Similarly with queues. Not-threadsafe-queues can "lose" enqueues the same way that not-threadsafe counters can lose increments. Worse, not threadsafe queues can even crash or produce crazy results if you use them in a multi-threaded scenario improperly.
The difficulty with "thread safe" is that it is not clearly defined. Does it simply mean "will not crash"? Does it mean that sensible results will be produced? For example, suppose you have a "threadsafe" collection. Is this code correct?
if (!collection.IsEmpty) Console.WriteLine(collection[0]);
No. Even if the collection is "threadsafe", that doesn't mean that this code is correct; another thread could have made the collection empty after the check but before the writeline and therefore this code could crash, even if the object is allegedly "threadsafe". Actually determining that every relevant combination of operations is threadsafe is an extremely difficult problem.
Now to come to your actual situation: anyone who is telling you "you should use the Queue class, it is better because it is threadsafe" probably does not have a clear idea of what they're talking about. First off, Queue is not threadsafe. Second, whether Queue is threadsafe or not is completely irrelevant if you are only using the object on a single thread! If you have a collection that is going to be accessed on multiple threads, then, as I indicated in my example above, you have an extremely difficult problem to solve, regardless of whether the collection itself is "threadsafe". You have to determine that every combination of operations you perform on the collection is also threadsafe. This is a very difficult problem, and if it is one you face, then you should use the services of an expert on this difficult topic.
A type that is thread safe can be safely accessed from multiple threads without concern for concurrency. This usually means that the type is read-only.
Interestingly enough, Queue<T> is not thread safe - it can support concurrent reads as long as the queue isn't modified but that isn't the same thing as thread safety.
In order to think about thread safety consider what would happen if two threads were accessing a Queue<T> and a third thread came along and began either adding to or removing from this Queue<T>. Since this type does not restrict this behavior it is not thread safe.
In dealing with multithreading, you usually have to deal with concurrency issues. The term "concurrency issues" refers to issues that are specifically introduced by the possibility of interleaving instructions from two different execution contexts on a resource shared by both. Here, in terms of thread safety, the execution contexts are two threads within a process; however, in related subjects they might be processes.
Thread safety measures are put in place to achieve two goals primarily. First is to regain determinism with regard to what happens if the threads context-switch (which is otherwise controlled by the OS and thus basically nondeterministic in user-level programs), to prevent certain tasks from being left half-finished or two contexts writing to the same location in memory one after the other. Most measures simply use a little bit of hardware-supported test-and-set instructions and the like, as well as software-level synchronization constructs to force all other execution contexts to stay away from a data type while another one is doing work that should not be interrupted.
Usually, objects that are read-only are thread-safe. Many objects that are not read-only are able to have data accesses (read-only) occur with multiple threads without issue, if the object is not modified in the middle. But this is not thread safety. Thread safety is when all manner of things are done to a data type to prevent any modifications to it by one thread from causing data corruption or deadlock even when dealing with many concurrent reads and writes.
So it's my understanding that on a ReaderWriterLock (or ReaderWriterLockSlim more specifically), both the read and write need acquire a mutex to take the lock. I'd like to optimize the read access of the lock, such that if there are no writes pending, no lock need be acquired. (And I'm willing to sacrifice the performance of writes, add some constraints to the reads, make the first read slow and second fast, etc.. if necessary, as long as the vast majority of the reads are as fast as possible.)
So, how would one do this, or even better, is there a framework or "standard" implementation one could point me to? (Or if I've misunderstood and it's supported already, great!)
So for my piece:
It would seem that if one were to have a counter for the number of readers/writers (protected by Interlocked.Increment), that would be enough for the reader to check if the writer count was non-zero, and only acquire the lock then. (And increment within the lock if acquired.)
Writers would always increment, acquire the lock, spin till the reader count went to 0 (willing to assume readers always finish quickly, or even bypass the reader count entirely in an optimistic scenario), and finally decrement. (It'd be nice to throw in some form priority too when we do block, or potentially clear all pending readers/writers in one pass since I'm only protecting one value, but I'll forgo that for now..)
So.. anyone seen anything similar or have a suggestion? If there's nothing out there after a bit, I'd be happy to throw together an initial implementation and talk more concretely.
What you've described is, at a basic level, already how the reader/writer locks work. They don't need to take a mutex out as the reader/writer lock controls access by using an internal count of readers and writers (and, indeed, a mutex would imply that readers would block each other, whereas in fact multiple concurrent readers are allowed -- that's the whole point of the lock type!).
So yes, there is a framework/standard implementation for this: ReaderWriterLockSlim. I really doubt you'll be able to write a reader/writer lock with better performance than this. In any case -- are you sure that this lock is the root of your performance problems?
I am afraid you are wrong, since ReaderWriterLockSlim is based on spin locking, not on mutexes (you can see this in Reflector).
I understand the main function of the lock key word from MSDN
lock Statement (C# Reference)
The lock keyword marks a statement
block as a critical section by
obtaining the mutual-exclusion lock
for a given object, executing a
statement, and then releasing the
lock.
When should the lock be used?
For instance it makes sense with multi-threaded applications because it protects the data. But is it necessary when the application does not spin off any other threads?
Is there performance issues with using lock?
I have just inherited an application that is using lock everywhere, and it is single threaded and I want to know should I leave them in, are they even necessary?
Please note this is more of a general knowledge question, the application speed is fine, I want to know if that is a good design pattern to follow in the future or should this be avoided unless absolutely needed.
When should the lock be used?
A lock should be used to protect shared resources in multithreaded code. Not for anything else.
But is it necessary when the application does not spin off any other threads?
Absolutely not. It's just a time waster. However do be sure that you're not implicitly using system threads. For example if you use asynchronous I/O you may receive callbacks from a random thread, not your original thread.
Is there performance issues with using lock?
Yes. They're not very big in a single-threaded application, but why make calls you don't need?
...if that is a good design pattern to follow in the future[?]
Locking everything willy-nilly is a terrible design pattern. If your code is cluttered with random locking and then you do decide to use a background thread for some work, you're likely to run into deadlocks. Sharing a resource between multiple threads requires careful design, and the more you can isolate the tricky part, the better.
All the answers here seem right: locks' usefulness is to block threads from acessing locked code concurrently. However, there are many subtleties in this field, one of which is that locked blocks of code are automatically marked as critical regions by the Common Language Runtime.
The effect of code being marked as critical is that, if the entire region cannot be entirely executed, the runtime may consider that your entire Application Domain is potentially jeopardized and, therefore, unload it from memory. To quote MSDN:
For example, consider a task that attempts to allocate memory while holding a lock. If the memory allocation fails, aborting the current task is not sufficient to ensure stability of the AppDomain, because there can be other tasks in the domain waiting for the same lock. If the current task is terminated, other tasks could be deadlocked.
Therefore, even though your application is single-threaded, this may be a hazard for you. Consider that one method in a locked block throws an exception that is eventually not handled within the block. Even if the exception is dealt as it bubbles up through the call stack, your critical region of code didn't finish normally. And who knows how the CLR will react?
For more info, read this article on the perils of Thread.Abort().
Bear in mind that there might be reasons why your application is not as single-threaded as you think. Async I/O in .NET may well call-back on a pool thread, for example, as do some of the various timer classes (not the Windows Forms Timer, though).
Generally speaking if your application is single threaded, you're not going to get much use out of the lock statement. Not knowing your application exactly, I don't know if they're useful or not - but I suspect not. Further, if you're application is using lock everywhere I don't know that I would feel all that confident about it working in a multi-threaded environment anyways - did the original developer actually know how to develop multi-threaded code, or did they just add lock statements everywhere in the vague hope that that would do the trick?
lock should be used around the code that modifies shared state, state that is modified by other threads concurrently, and those other treads must take the same lock.
A lock is actually a memory access serializer, the threads (that take the lock) will wait on the lock to enter until the current thread exits the lock, so memory access is serialized.
To answer you question lock is not needed in a single threaded application, and it does have performance side effects. because locks in C# are based on kernel sync objects and every lock you take creates a transition to kernel mode from user mode.
If you're interested in multithreading performance a good place to start is MSDN threading guidelines
You can have performance issues with locking variables, but normally, you'd construct your code to minimize the lengths of time that are spent inside a 'locked' block of code.
As far as removing the locks. It'll depend on what exactly the code is doing. Even though it's single threaded, if your object is implemented as a Singleton, it's possible that you'll have multiple clients using an instance of it (in memory, on a server) at the same time..
Yes, there will be some performance penalty when using lock but it is generally neglible enough to not matter.
Using locks (or any other mutual-exclusion statement or construct) is generally only needed in multi-threaded scenarios where multiple threads (either of your own making or from your caller) have the opportunity to interact with the object and change the underlying state or data maintained. For example, if you have a collection that can be accessed by multiple threads you don't want one thread changing the contents of that collection by removing an item while another thread is trying to read it.
Lock(token) is only used to mark one or more blocks of code that should not run simultaneously in multiple threads. If your application is single-threaded, it's protecting against a condition that can't exist.
And locking does invoke a performance hit, adding instructions to check for simultaneous access before code is executed. It should only be used where necessary.
See the question about 'Mutex' in C#. And then look at these two questions regarding use of the 'lock(Object)' statement specifically.
There is no point in having locks in the app if there is only one thread and yes, it is a performance hit although it does take a fair number of calls for that hit to stack up into something significant.