I'm quite new to C# so please bear with me. I'm a bit confused with the thread safety. When is something thread safe and when something isn't?
Is reading (just reading from something that was initialized before) from a field always thread safe?
//EXAMPLE
RSACryptoServiceProvider rsa = new RSACrytoServiceProvider();
rsa.FromXmlString(xmlString);
//Is this thread safe if xml String is predifined
//and this code can be called from multiple threads?
Is accessing an object from an array or list always thread safe (in case you use a for loop for enumeration)?
//EXAMPLE (a is local to thread, array and list are global)
int a = 0;
for(int i=0; i<10; i++)
{
a += array[i];
a -= list.ElementAt(i);
}
Is enumeration always/ever thread safe?
//EXAMPLE
foreach(Object o in list)
{
//do something with o
}
Can writing and reading to a particular field ever result in a corrupted read (half of the field is changed and half is still unchanged) ?
Thank you for all your answers and time.
EDIT: I meant if all threads are only reading & using (not writing or changing) object. (except for the last question where it is obvious that I meant if threads both read and write). Because I do not know if plain access or enumeration is thread safe.
It's different for different cases, but in general, reading is safe if all threads are reading. If any are writing, neither reading or writing is safe unless it can be done atomically (inside a synchronized block or with an atomic type).
It isn't definite that reading is ok -- you never know what is happening under the hoods -- for example, a getter might need to initialize data on first usage (therefore writing to local fields).
For Strings, you are in luck -- they are immutable, so all you can do is read them. With other types, you will have to take precautions against them changing in other threads while you are reading them.
Is reading (just reading from something that was initialized before) from a field always thread safe?
The C# language guarantees that reads and writes are consistently ordered when the reads and writes are on a single thread in section 3.10:
Data dependence is preserved within a thread of execution. That is, the value of each variable is computed as if all statements in the thread were executed in original program order. Initialization ordering rules are preserved.
Events in a multithreaded, multiprocessor system do not necessarily have a well-defined consistent ordering in time with respect to each other. The C# language does not guarantee there to be a consistent ordering. A sequence of writes observed by one thread may be observed to be in a completely different order when observed from another thread, so long as no critical execution point is involved.
The question is therefore unanswerable because it contains an undefined word. Can you give a precise definition of what "before" means to you with respect to events in a multithreaded, multiprocessor system?
The language guarantees that side effects are ordered only with respect to critical execution points, and even then, does not make any strong guarantees when exceptions are involved. Again, to quote from section 3.10:
Execution of a C# program proceeds such that the side effects of each executing thread are preserved at critical execution points. A side effect is defined as a read or write of a volatile field, a write to a non-volatile variable, a write to an external resource, and the throwing of an exception. The critical execution points at which the order of these side effects must be preserved are references to volatile fields, lock statements, and thread creation and termination. [...] The ordering of side effects is preserved with respect to volatile reads and writes.
Additionally, the execution environment need not evaluate part of an expression if it can deduce that that expression’s value is not used and that no needed side effects are produced (including any caused by calling a method or accessing a volatile field). When program execution is interrupted by an asynchronous event (such as an exception thrown by another thread), it is not guaranteed that the observable side effects are visible in the original program order.
Is accessing an object from an array or list always thread safe (in case you use a for loop for enumeration)?
By "thread safe" do you mean that two threads will always observe consistent results when reading from a list? As noted above, the C# language makes very limited guarantees about observation of results when reading from variables. Can you give a precise definition of what "thread safe" means to you with respect to non-volatile reading?
Is enumeration always/ever thread safe?
Even in single threaded scenarios it is illegal to modify a collection while enumerating it. It is certainly unsafe to do so in multithreaded scenarios.
Can writing and reading to a particular field ever result in a corrupted read (half of the field is changed and half is still unchanged) ?
Yes. I refer you to section 5.5, which states:
Reads and writes of the following data types are atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type in the previous list are also atomic. Reads and writes of other types, including long, ulong, double, and decimal, as well as user-defined types, are not guaranteed to be atomic. Aside from the library functions designed for that purpose, there is no guarantee of atomic read-modify-write, such as in the case of increment or decrement.
Well, I generally assume everything is thread unsafe. For quick and dirty access to global objects in an threaded environment I use the lock(object) keyword. .Net have an extensive set of synchronization methods like different semaphores and such.
Reading can be thread-unsafe if there are any threads that are writing (if they write in the middle of a read, for example, they'll be hit with an exception).
If you must do this, then you can do:
lock(i){
i.GetElementAt(a)
}
This will force thread-safety on i (as long as other threads similarly attempt to lock i before they use it. only one thing can lock a reference type at a time.
In terms of enumeration, I'll refer to the MSDN:
The enumerator does not have exclusive access to the collection; therefore, enumerating
through a collection is intrinsically not a thread-safe procedure. To guarantee thread
safety during enumeration, you can lock the collection during the entire enumeration. To
allow the collection to be accessed by multiple threads for reading and writing, you must
implement your own synchronization.
An example of no thread-safety: When several threads increment an integer. You can set it up in a way that you have a predeterminded number of increments. What youmay observe though, is, that the int has not been incremented as much as you thought it would. What happens is that two threads may increment the same value of the integer.This is but an example of aplethora of effects you may observe when working with several threads.
PS
A thread-safe increment is available through Interlocked.Increment(ref i)
Related
(This is a repeat of: How to correctly read an Interlocked.Increment'ed int field? but, after reading the answers and comments, I'm still not sure of the right answer.)
There's some code that I don't own and can't change to use locks that increments an int counter (numberOfUpdates) in several different threads. All calls use:
Interlocked.Increment(ref numberOfUpdates);
I want to read numberOfUpdates in my code. Now since this is an int, I know that it can't tear. But what's the best way to ensure that I get the latest value possible? It seems like my options are:
int localNumberOfUpdates = Interlocked.CompareExchange(ref numberOfUpdates, 0, 0);
Or
int localNumberOfUpdates = Thread.VolatileRead(numberOfUpdates);
Will both work (in the sense of delivering the latest value possible regardless of optimizations, re-orderings, caching, etc.)? Is one preferred over the other? Is there a third option that's better?
I'm a firm believer in that if you're using interlocked to increment shared data, then you should use interlocked everywhere you access that shared data. Likewise, if you use insert you favorite synchronization primitive here to increment shared data, then you should use insert you favorite synchronization primitive here everywhere you access that shared data.
int localNumberOfUpdates = Interlocked.CompareExchange(ref numberOfUpdates, 0, 0);
Will give you exactly what your looking for. As others have said interlocked operations are atomic. So Interlocked.CompareExchange will always return the most recent value. I use this all the time for accessing simple shared data like counters.
I'm not as familiar with Thread.VolatileRead, but I suspect it will also return the most recent value. I'd stick with interlocked methods, if only for the sake of being consistent.
Additional info:
I'd recommend taking a look at Jon Skeet's answer for why you may want to shy away from Thread.VolatileRead(): Thread.VolatileRead Implementation
Eric Lippert discusses volatility and the guarantees made by the C# memory model in his blog at http://blogs.msdn.com/b/ericlippert/archive/2011/06/16/atomicity-volatility-and-immutability-are-different-part-three.aspx. Straight from the horses mouth: "I don't attempt to write any low-lock code except for the most trivial usages of Interlocked operations. I leave the usage of "volatile" to real experts."
And I agree with Hans's point that the value will always be stale at least by a few ns, but if you have a use case where that is unacceptable, its probably not well suited for a garbage collected language like C# or a non-real-time OS. Joe Duffy has a good article on the timeliness of interlocked methods here: http://joeduffyblog.com/2008/06/13/volatile-reads-and-writes-and-timeliness/
Thread.VolatileRead(numberOfUpdates) is what you want. numberOfUpdates is an Int32, so you already have atomicity by default, and Thread.VolatileRead will ensure volatility is dealt with.
If numberOfUpdates is defined as volatile int numberOfUpdates; you don't have to do
this, as all reads of it will already be volatile reads.
There seems to be confusion about whether Interlocked.CompareExchange is more appropriate. Consider the following two excerpts from the documentation.
From the Thread.VolatileRead documentation:
Reads the value of a field. The value is the latest written by any processor in a computer, regardless of the number of processors or the state of processor cache.
From the Interlocked.CompareExchange documentation:
Compares two 32-bit signed integers for equality and, if they are equal, replaces one of the values.
In terms of the stated behavior of these methods, Thread.VolatileRead is clearly more appropriate. You do not want to compare numberOfUpdates to another value, and you do not want to replace its value. You want to read its value.
Lasse makes a good point in his comment: you might be better off using simple locking. When the other code wants to update numberOfUpdates it does something like the following.
lock (state)
{
state.numberOfUpdates++;
}
When you want to read it, you do something like the following.
int value;
lock (state)
{
value = state.numberOfUpdates;
}
This will ensure your requirements of atomicity and volatility without delving into more-obscure, relatively low-level multithreading primitives.
Will both work (in the sense of delivering the latest value possible regardless of optimizations, re-orderings, caching, etc.)?
No, the value you get is always stale. How stale the value might be is entirely unpredictable. The vast majority of the time it will be stale by a few nanoseconds, give or take, depending how quickly you act on the value. But there is no reasonable upper-bound:
your thread can lose the processor when it context-switches another thread onto the core. Typical delays are around 45 msec with no firm upper-bound. This does not mean that another thread in your process also gets switched-out, it can keep motoring and continue to mutate the value.
just like any user-mode code, your code is subjected to page-faults as well. Incurred when the processor needs RAM for another process. On a heavily loaded machine that can and will page-out active code. As sometimes happens to the mouse driver code for example, leaving a frozen mouse cursor.
managed threads are subject to near-random garbage collection pauses. Tends to be the lesser problem since it is likely that another thread that's mutating the value will be paused as well.
Whatever you do with the value needs to take this into account. Needless to say perhaps, that's very, very difficult. Practical examples are hard to come by. The .NET Framework is a very large chunk of battle-scarred code. You can see the cross-reference to usage of VolatileRead from the Reference Source. Number of hits: 0.
Well, any value you read will always be somewhat stale as Hans Passant said. You can only control a guarantee that other shared values are consistent with the one you've just read using memory fences in the middle of code reading several shared values without locks (ie: are at the same degree of "staleness")
Fences also have the effect of defeating some compiler optimizations and reordering thus preventing unexpected behavior in release mode on different platforms.
Thread.VolatileRead will cause a full memory fence to be emitted so that no reads or writes can be reordered around your read of the int (in the method that's reading it). Obviously if you're only reading a single shared value (and you're not reading something else shared and the order and consistency of them both is important), then it may not seem necessary...
But I think that you will need it anyway to defeat some optimizations by the compiler or CPU so that you don't get the read more "stale" than necessary.
A dummy Interlocked.CompareExchange will do the same thing as Thread.VolatileRead (full fence and optimization defeating behavior).
There is a pattern followed in the framework used by CancellationTokenSource
http://referencesource.microsoft.com/#mscorlib/system/threading/CancellationTokenSource.cs#64
//m_state uses the pattern "volatile int32 reads, with cmpxch writes" which is safe for updates and cannot suffer torn reads.
private volatile int m_state;
public bool IsCancellationRequested
{
get { return m_state >= NOTIFYING; }
}
// ....
if (Interlocked.CompareExchange(ref m_state, NOTIFYING, NOT_CANCELED) == NOT_CANCELED) {
}
// ....
The volatile keyword has the effect of emitting a "half" fence. (ie: it blocks reads/writes from being moved before the read to it, and blocks reads/writes from being moved after the write to it).
It seems like my options are:
int localNumberOfUpdates = Interlocked.CompareExchange(ref numberOfUpdates, 0, 0);
Or
int localNumberOfUpdates = Thread.VolatileRead(numberOfUpdates);
Starting from the .NET Framework 4.5, there is a third option:
int localNumberOfUpdates = Volatile.Read(ref numberOfUpdates);
The Interlocked methods impose full fences, while the Volatile methods impose half fences¹. So using the static methods of the Volatile class is a potentially more economic way of reading atomically the latest value of an int variable or field.
Alternatively, if the numberOfUpdates is a field of a class, you could declare it as volatile. Reading a volatile field is equivalent with reading it with the Volatile.Read method.
I should mention one more option, which is to simply read the numberOfUpdates directly, without the help of either the Interlocked or the Volatile. We are not supposed to do this, but demonstrating an actual problem caused by doing so might be impossible. The reason is that the memory models of the most commonly used CPUs are stronger than the C# memory model. So if your machine has such a CPU (for example x86 or x64), you won't be able to write a program that fails as a result of reading directly the field. Nevertheless personally I never use this option, because I am not an expert in CPU architectures and memory protocols, nor I have the desire to become one. So I prefer to use either the Volatile class or the volatile keyword, whatever is more convenient in each case.
¹ With some exceptions, like reading/writing an Int64 or double on a 32-bit machine.
Not sure why nobody mentioned Interlocked.Add(ref localNumberOfUpdates, 0), but seems the simplest way to me...
From MSDN here (http://msdn.microsoft.com/en-us/library/aa691278(v=vs.71).aspx) you can see that basic types such as int,byte……are all readable/writable atomic. So I wonder since they are all "atomic":
What's the relationship between "An atomic operation" and "lock"? In my mind, if an operation is "atomic", we don't need lock anymore because they must be "thread-safe", am I right?
And anyway, I was told by someone on the website's forum, are they right?
1) Any reference type is "Atomic Operation".
2) Types such as i++ or --i aren't atomic operations(So how can I check which operation is "atomic", which is NOT?)
I'm always feeling puzzled by these problems....
#Daniel Brückner and #keshlam:
It seems that a mixed operation such as ++i (first read/then write) isn't an atomic. However I feel interested in that if I seperate them into two steps like below:
//suppose I have "i" defined and assigned a value
void Fun()
{
if(i==1)
{
i=2;
}
}
If the codes above will be called by several threads, and "if" part only reads (atomic step), and "i=2" (another atomic step), so I don't need lock?
An atomic operation is one which is inherently uninterruptable -- there is no way for anyone else to see the system in an inconsistent state, only before or after the whole operation.
A single read or write of a primitive value is atomic FOR THAT CORE. It may not be safe if there's code running in other processor cores; values often aren't flushed out of processor cache and made visible to other cores until the operating system explicitly performs that operation. I don't know enough about C# threading to know how much of a risk exists there.
A read-modify-write, which is what we're more often concerned with, often isn't -- which includes the ++ and -- operations. In SOME architectures, there are single instructions with implement an inherently atomic test-and-set, which is a specialized case of read-modify-write that can be used to implement a very lightweight form of threadsafe locking/counting.... but that depends very much on which processor you're running on.
An operation which involves reading and/or changing several primitive values, or which is doing a nontrivial operation upon a value, won't be atomic no matt1er how you slice it. And most real-world tasks above the level of implementing primitive semaphores do involve more than one value that have to be kept synchronized with each other. So while knowing that some operations are atomic is helpful for the simplest case, you will find that this isn't enough once you get above the very lowest level of subroutines. At that point you need to use an explicit lock to keep other threads from interrupting the aggregate operation. Semaphores/locks basically use a primitive atomic as a fence around the more complicated operation, taking advantage of operating system functions to make the second (or later) thread to try to access to lock wait for it VERY efficiently (basically, going to sleep until the lock is released, rather than going into a spin loop). This lets you create larger operations that behave atomically, though they have to cooperate with each other to achieve that.
Summary: If you don't KNOW something is inherently atomic, always assume that it isn't, and that if it's going to be accessed from more than one thread it had to be protected by a lock. Failing to do so results in garbled values and extremely annoying debugging scenarios. Don't take chances; if in doubt, protect it explicitly.
(And don't forget that having a structure be atomic doesn't ensure that its contents are atomic. I had to debug a case last year where someone was using one of Java's atomic collection types but forgot that the structure they were storing into the collection also had to protect its own contents.)
Data types are not atomic, some operation with them are or can be atomic. Assigning a 32 bit integer variable
Int32 foo = 123456789;
is, for example, an atomic operation. It will never happen that another thread observes foo to be 52501, that is the least significant 16 bits are assigned but not yet the most significant 16 bits. The same is not true for Int64 - it may happen that another thread sees an assignment operation only partially executed.
Many other operations are not atomic. For example
foo++;
requires reading, modifying and writing the value and therefore another thread may read and change the value of foo after you read it but before your were able to write the updated value back. You can use Interlocked.Increment() to perform this operation atomically and there are methods for a couple of other operations, too.
So in essence what you called atomic data types only guarantees that variable assignments are performed atomically and this is true for primitive data types and references except Double, Decimal, Int64and UInt64.
void Fun()
{
if (i == 1)
{
i = 2; // I may have any value here because another thread might
// have assigned a new value after the test i == 1 but
// before this assignment.
}
}
Simplified illustration below, how does .NET deal with such a situation?
and if it would cause problems, would i have to lock/gate access to each and every field/property that might at times be written to + accessed from different threads?
A field somewhere
public class CrossRoads(){
public int _timeouts;
}
A background thread writer
public void TimeIsUp(CrossRoads crossRoads){
crossRoads._timeouts++;
}
Possibly at the same time, trying to read elsewhere
public void HowManyTimeOuts(CrossRoads crossRoads){
int timeOuts = crossRoads._timeouts;
}
The simple answer is that the above code has the ability to cause problems if accessed simultaneously from multiple threads.
The .Net framework provides two solutions: interlocking and thread synchronization.
For simple data type manipulation (i.e. ints), interlocking using the Interlocked class will work correctly and is the recommended approach.
In fact, interlocked provides specific methods (Increment and Decrement) that make this process easy:
Add an IncrementCount method to your CrossRoads class:
public void IncrementCount() {
Interlocked.Increment(ref _timeouts);
}
Then call this from your background worker:
public void TimeIsUp(CrossRoads crossRoads){
crossRoads.IncrementCount();
}
The reading of the value, unless of a 64-bit value on a 32-bit OS, are atomic. See the Interlocked.Read method documentation for more detail.
For class objects or more complex operations, you will need to use thread synchronization locking (lock in C# or SyncLock in VB.Net).
This is accomplished by creating a static synchronization object at the level the lock is to be applied (for example, inside your class), obtaining a lock on that object, and performing (only) the necessary operations inside that lock:
private static object SynchronizationObject = new Object();
public void PerformSomeCriticalWork()
{
lock (SynchronizationObject)
{
// do some critical work
}
}
The good news is that reads and writes to ints are guaranteed to be atomic, so no torn values. However, it is not guaranteed to do a safe ++, and the read could potentially be cached in registers. There's also the issue of instruction re-ordering.
I would use:
Interlocked.Increment(ref crossroads._timeouts);
For the write, which will ensure no values are lost, and;
int timeouts = Interlocked.CompareExchange(ref crossroads._timeouts, 0, 0);
For the read, since this observes the same rules as the increment. Strictly speaking "volatile" is probably enough for the read, but it is so poorly understood that the Interlocked seems (IMO) safer. Either way, we're avoiding a lock.
Well, I'm not a C# developer, but this is how it typically works at this level:
how does .NET deal with such a situation?
Unlocked. Not likely to be guaranteed to be atomic.
Would i have to lock/gate access to each and every field/property that might at times be written to + accessed from different threads?
Yes. An alternative would be to make a lock for the object available to the clients, then tell the clients they must lock the object while using the instance. This will reduce the number of locks acquisitions, and guarantee a more consistent, predictable, state for your clients.
Forget dotnet. At the machine language level, crossRoads._timeouts++ will be implemented as an INC [memory] instruction. This is known as a Read-Modify-Write instruction. These instructions are atomic with respect to multi-threading on a single processor*, (essentially implemented with time-slicing,) but are not atomic with respect to multi-threading using multiple processors or multiple cores.
So:
If you can guarantee that only TimeIsUp() will ever modify crossRoads._timeouts, and if you can guarantee that only one thread will ever execute TimeIsUp(), then it will be safe to do this. The writing in TimeIsUp() will work fine, and the reading in HowManyTimeOuts() (and any place else) will work fine. But if you also modify crossRoads._timeouts elsewhere, or if you ever spawn one more background thread writer, you will be in trouble.
In either case, my advice would be to play it safe and lock it.
(*) They are atomic with respect to multi-threading on a single processor because context switches between threads happen on a periodic interrupt, and on the x86 architectures these instructions are atomic with respect to interrupts, meaning that if an interrupt occurs while the CPU is executing such an instruction, the interrupt will wait until the instruction completes. This does not hold true with more complex instructions, for example those with the REP prefix.
Although an int may be 'native' size to a CPU (dealing in 32 or 64 bits at a time), if you are reading and writing from different threads to the same variable, you are best off locking this variable and synchronizing access.
There is never a guarantee that reads/writes maybe atomic to an int.
You can also use Interlocked.Increment for your purposes here.
While going through some database code looking for a bug unrelated to this question, I noticed that in some places List<T> was being used inappropriately. Specifically:
There were many threads concurrently accessing the List as readers, but using indexes into the list instead of enumerators.
There was a single writer to the list.
There was zero synchronization, readers and writers were accessing the list at the same time, but because of code structure the last element would never be accessed until the method that executed the Add() returned.
No elements were ever removed from the list.
By the C# documentation, this should not be thread safe.
Yet it has never failed. I am wondering, because of the specific implementation of the List (I am assuming internally it's an array that re-allocs when it runs out of space), it the 1-writer 0-enumerator n-reader add-only scenario accidentally thread safe, or is there some unlikely scenario where this could blow up in the current .NET4 implementation?
edit: Important detail I left out reading some of the replies. The readers treat the List and its contents as read-only.
This can and will blow. It just hasn't yet. Stale indices is usually the first thing that goes. It will blow just when you don't want it to. You are probably lucky at the moment.
As you are using .Net 4.0, I'd suggest changing the list to a suitable collection from System.Collections.Concurrent which is guaranteed to be thread safe. I'd also avoid using array indices and switch to ConcurrentDictionary if you need to look up something:
http://msdn.microsoft.com/en-us/library/dd287108.aspx
Because of it has never failed or your application doesn't crash that doesn't mean that this scenario is thread safe. for instance suppose the writer thread does update a field within the list, lets say that is was a long field, at the same time the reader thread reading that field. the value returned maybe a bitwise combination of the two fields the old one and the new one! that could happen because the reader thread start reading the value from memory but before it finishes reading it the writer thread just updated it.
Edit: That of course if we suppose that the reader threads will just read all the data without updating anything, I am sure that they doesn't change the values of the arrays them self but, but they could change a property or field within the value they read. for instance:
for (int index =0 ; index < list.Count; index++)
{
MyClass myClass = list[index];//ok we are just reading the value from list
myClass.SomeInteger++;//boom the same variable will be updated from another threads...
}
This example not talking about thread safe of the list itself rather than the shared variables that the list exposed.
The conclusion is that you have to use a synchronization mechanism such as lock before interaction with the list, even if it has only one writer and no item removed, that will help you prevent tinny bugs and failure scenarios you are dispensable for in the first place.
Thread safety only matters when data is modified more than once at a time. The number of readers does not matter. Even when someone is writing while someone reads, the reader either gets the old data or the new, it still works. The fact that elements can only be accessed after the Add() returns, prevents parts of the element being read seperately. If you would start using the Insert() method readers could get the wrong data.
It follows then, that if the architecture is 32 bits, writing a field bigger than 32 bits, such as long and double, is not a thread safe operation; see the documentation for System.Double:
Assigning an instance of this type is not thread safe on all hardware platforms because the
binary representation of that instance might be too large to assign in a single atomic
operation.
If the list is fixed in size, however, this situation matters only if the List is storing value types greater than 32 bits. If the list is only holding reference types, then any thread safety issues stem from the reference types themselves, not from their storage and retrieval from the List. For instance, immutable reference types are less likely to cause thread safety issues than mutable reference types.
Moreover, you can't control the implementation details of List: that class was mainly designed for performance, and it's likely to change in the future with that aspect, rather than thread safety, in mind.
In particular, adding elements to a list or otherwise changing its size is not thread safe even if the list's elements are 32 bits long, since there is more involved in inserting, adding, or removing than just placing the element in the list. If such operations are needed after other threads have access to the list, then locking access to the list or using a concurrent list implementation is a better choice.
First off, to some of the posts and comments, since when was documentation reliable?
Second, this answer is more to the general question than the specifics of the OP.
I agree with MrFox in theory because this all boils down to two questions:
Is the List class is implemented as a flat array?
If yes, then:
Can a write instruction be preempted in the middle of a write>
I believe this is not the case -- the full write will happen before anything can read that DWORD or whatever. In other words, it will never happen that I write two of the four bytes of a DWORD and then you read 1/2 of the new value and 1/2 of the old one.
So, if you're indexing an array by providing an offset to some pointer, you can read safely without thread-locking. If the List is doing more than just simple pointer math, then it is not thread safe.
If the List was not using a flat array, I think you would have seen it crash by now.
My own experience is that it is safe to read a single item from a List via index without thread-locking. This is all just IMHO though, so take it for what it's worth.
Worst case, such as if you need to iterate through the list, the best thing to do is:
lock the List
create an array the same size
use CopyTo() to copy the List to the array
unlock the List
then iterate through the array instead of the list.
in (whatever you call the .net) C++:
List<Object^>^ objects = gcnew List<Object^>^();
// in some reader thread:
Monitor::Enter(objects);
array<Object^>^ objs = gcnew array<Object^>(objects->Count);
objects->CopyTo(objs);
Monitor::Exit(objects);
// use objs array
Even with the memory allocation, this will be faster than locking the List and iterating through the entire thing before unlocking it.
Just a heads up though: if you want a fast system, thread-locking is your worst enemy. Use ZeroMQ instead. I can speak from experience, message-based synch is the right way to go.
When implementing a thread safe list or queue; does it require to lock on the List.Count property before returning the Count i.e:
//...
public int Count
{
lock (_syncObject)
{
return _list.Count;
}
}
//...
Is it necessary to do a lock because of the original _list.Count variable maybe not a volatile variable?
Yes, that is necessary, but mostly irrelevant. Without the lock you could be reading stale values or even, depending on the inner working and the type of _list.Count, introduce errors.
But note that using that Count property is problematic, any calling code cannot really rely on it:
if (myStore.Count > 0) // this Count's getter locks internally
{
var item = myStore.Dequeue(); // not safe, myStore could be empty
}
So you should aim for a design where checking the Count and acting on it are combined:
ItemType GetNullOrFirst()
{
lock (_syncObject)
{
if (_list.Count > 0)
{
....
}
}
}
Additional:
is it nessesary to do a lock because of the original _list.Count variable maybe not a volatile variable?
The _list.Count is not a variable but a property. It cannot be marked volatile. Whether it is thread-safe depends on the getter code of the property, but it will usually be safe. But unreliable.
It depends.
Now, theoretically, because the underlying value is not threadsafe, just about anything could go wrong because of something the implementation does. In practice though, it's a read from a variable of 32-bits and so guaranteed to be atomic. Hence it might be stale, but it will be a real stale value rather than garbage caused by reading half a value before a change and the other half after it.
So the question is, does staleness matter? Possibly it doesn't. The more you can put up with staleness the better for performance (because the more you can put up with it the less you need to do to ensure you don't have anything stale). E.g. if you were putting the count out to the UI and the collection was changing rapidly, then just the time it takes for the person to read the number and process it in their own brain is going to be enough to make it obsolete anyway, and hence stale.
If however, you needed it to ensure an operation was reasonable before it was attempted, then that staleness is going to cause problems.
However, that staleness is going to happen anyway, because in between your locked (and guaranteed to be fresh) read and the operation happening, there's the opportunity for the collection to change, so you've got a race condition even when locking.
Hence, if freshness is important, you are going to have to lock at a higher level. With this being the case, the lower-level lock is just a waste. Hence you can probably do without it at that point.
The important thing here is that even with a class that is threadsafe in every regard, the best that can guarantee is that each operation will be fresh (though even that may not be guaranteed; indeed when there are more than one core involved "fresh" begins to become meaningless as changes that are near to truly simultaneous can happen) and that each operation won't put the object into an invalid safe. It is still possible to write non-threadsafe code with threadsafe objects (indeed very many non-threadsafe objects are composed of ints, strings and bools or objects that are in turn composed of them, and each of those is threadsafe in and of itself).
One thing that can be useful with mutable classes intended for multithreaded use are synchronised "Try" methods. E.g. to use a List as a stack we need the following operations:
See if the list is empty, reporting failure if it is.
Obtain the "top" value.
Remove the top value.
Even if we synchronise each of those individually, the code doing so won't be threadsafe as something can happen between each step. However, we can provide a Pop() method that does each three in one synchronised method. Analogous approaches are also useful with lock-free classes (where different techniques are used to ensure the method either succeeds or fails completely, and without damage to the object).
No, you don't need to lock, but the caller should otherwise something like this might happen
count is n
thread asks for count
Count returns n
another thread dequeues
count is n - 1
thread asking for count sees count is n when count is actually n - 1