Thread-safety of primitive concurrent read and write - C#

Simplified illustration below: how does .NET deal with such a situation?
And if it would cause problems, would I have to lock/gate access to each and every field/property that might at times be written to and accessed from different threads?
A field somewhere:
public class CrossRoads
{
    public int _timeouts;
}
A background thread writer:
public void TimeIsUp(CrossRoads crossRoads)
{
    crossRoads._timeouts++;
}
Possibly at the same time, trying to read elsewhere:
public void HowManyTimeOuts(CrossRoads crossRoads)
{
    int timeOuts = crossRoads._timeouts;
}

The simple answer is that the above code can cause problems if accessed simultaneously from multiple threads.
The .NET Framework provides two solutions: interlocking and thread synchronization.
For simple data type manipulation (i.e. ints), interlocking using the Interlocked class will work correctly and is the recommended approach.
In fact, the Interlocked class provides specific methods (Increment and Decrement) that make this process easy:
Add an IncrementCount method to your CrossRoads class:
public void IncrementCount()
{
    Interlocked.Increment(ref _timeouts);
}
Then call this from your background worker:
public void TimeIsUp(CrossRoads crossRoads)
{
    crossRoads.IncrementCount();
}
Reading the value is atomic, unless it is a 64-bit value on a 32-bit platform; in that case, use Interlocked.Read. See the Interlocked.Read method documentation for more detail.
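For example, here is a minimal sketch of that 64-bit case (the Statistics class and its members are illustrative, not from the question):
using System.Threading;

public class Statistics
{
    private long _totalBytes; // 64-bit: plain reads can tear on a 32-bit platform

    public void Add(long bytes)
    {
        Interlocked.Add(ref _totalBytes, bytes); // atomic read-modify-write
    }

    public long ReadTotal()
    {
        // Interlocked.Read performs an atomic 64-bit read on any platform
        return Interlocked.Read(ref _totalBytes);
    }
}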
For class objects or more complex operations, you will need to use thread synchronization locking (lock in C# or SyncLock in VB.Net).
This is accomplished by creating a static synchronization object at the level the lock is to be applied (for example, inside your class), obtaining a lock on that object, and performing (only) the necessary operations inside that lock:
private static object SynchronizationObject = new Object();

public void PerformSomeCriticalWork()
{
    lock (SynchronizationObject)
    {
        // do some critical work
    }
}

The good news is that reads and writes of ints are guaranteed to be atomic, so there are no torn values. However, ++ is not guaranteed to be safe (it is a read-modify-write), and a read could potentially be cached in a register. There's also the issue of instruction re-ordering.
I would use:
Interlocked.Increment(ref crossroads._timeouts);
For the write, which will ensure no updates are lost, and:
int timeouts = Interlocked.CompareExchange(ref crossroads._timeouts, 0, 0);
For the read, since this observes the same rules as the increment. Strictly speaking, "volatile" is probably enough for the read, but it is so poorly understood that Interlocked seems (IMO) safer. Either way, we're avoiding a lock.
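Putting the two calls together, a minimal sketch of the whole class (the method names are my own) might look like this:
using System.Threading;

public class CrossRoads
{
    private int _timeouts;

    // Writer: atomic increment, so no updates are lost
    public void IncrementTimeouts()
    {
        Interlocked.Increment(ref _timeouts);
    }

    // Reader: exchanging 0 for 0 never changes the value, but the call
    // returns the current value with the same fencing as the increment
    public int ReadTimeouts()
    {
        return Interlocked.CompareExchange(ref _timeouts, 0, 0);
    }
}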

Well, I'm not a C# developer, but this is how it typically works at this level:
how does .NET deal with such a situation?
Unlocked. Not likely to be guaranteed to be atomic.
Would I have to lock/gate access to each and every field/property that might at times be written to and accessed from different threads?
Yes. An alternative would be to make a lock for the object available to the clients, then tell the clients they must lock the object while using the instance. This will reduce the number of lock acquisitions and guarantee a more consistent, predictable state for your clients.
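A minimal sketch of that alternative (the SyncRoot member and field layout are illustrative), where the object publishes a single lock that clients agree to hold while touching its state:
public class CrossRoads
{
    // Clients must hold this lock for the duration of any access
    public readonly object SyncRoot = new object();

    public int Timeouts; // protected by SyncRoot, by convention
}

public void TimeIsUp(CrossRoads crossRoads)
{
    lock (crossRoads.SyncRoot)
    {
        crossRoads.Timeouts++; // the whole read-modify-write happens inside the lock
    }
}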

Forget dotnet. At the machine language level, crossRoads._timeouts++ will be implemented as an INC [memory] instruction. This is known as a read-modify-write instruction. Such instructions are atomic with respect to multi-threading on a single processor* (where multi-threading is essentially implemented with time-slicing), but are not atomic with respect to multi-threading using multiple processors or multiple cores.
So:
If you can guarantee that only TimeIsUp() will ever modify crossRoads._timeouts, and if you can guarantee that only one thread will ever execute TimeIsUp(), then it will be safe to do this. The writing in TimeIsUp() will work fine, and the reading in HowManyTimeOuts() (and any place else) will work fine. But if you also modify crossRoads._timeouts elsewhere, or if you ever spawn one more background thread writer, you will be in trouble.
In either case, my advice would be to play it safe and lock it.
(*) They are atomic with respect to multi-threading on a single processor because context switches between threads happen on a periodic interrupt, and on the x86 architectures these instructions are atomic with respect to interrupts, meaning that if an interrupt occurs while the CPU is executing such an instruction, the interrupt will wait until the instruction completes. This does not hold true with more complex instructions, for example those with the REP prefix.
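To make the danger concrete, here is a hypothetical repro (none of this code is from the question): four threads doing an unsynchronized ++ on a multi-core machine will almost always lose updates:
using System;
using System.Threading;

class LostUpdateDemo
{
    static int _counter;

    static void Main()
    {
        var threads = new Thread[4];
        for (int t = 0; t < threads.Length; t++)
        {
            threads[t] = new Thread(() =>
            {
                for (int i = 0; i < 1000000; i++)
                    _counter++; // read-modify-write, not atomic across cores
            });
            threads[t].Start();
        }
        foreach (var thread in threads)
            thread.Join();

        // Expected 4000000; typically prints less because increments collide
        Console.WriteLine(_counter);
    }
}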

Although an int may be 'native' size to a CPU (dealing in 32 or 64 bits at a time), if you are reading and writing from different threads to the same variable, you are best off locking this variable and synchronizing access.
Even though reads and writes of a properly aligned int are atomic, atomicity alone does not guarantee that one thread promptly sees another thread's updates.
You can also use Interlocked.Increment for your purposes here.

Is it safe to use Volatile.Read combined with Interlocked.Exchange for concurrently accessing a shared memory location from multiple threads in .NET?

Experts on threading/concurrency/memory model in .NET, could you verify that the following code is correct under all circumstances (that is, regardless of OS, .NET runtime, CPU architecture, etc.)?
class SomeClassWhoseInstancesAreAccessedConcurrently
{
    private Strategy _strategy;

    public SomeClassWhoseInstancesAreAccessedConcurrently()
    {
        _strategy = new SomeStrategy();
    }

    public void DoSomething()
    {
        Volatile.Read(ref _strategy).DoSomething();
    }

    public void ChangeStrategy()
    {
        Interlocked.Exchange(ref _strategy, new AnotherStrategy());
    }
}
This pattern comes up pretty frequently. We have an object which is used concurrently by multiple threads, and at some point the value of one of its fields needs to be changed. We want to guarantee that from that point on, every access to that field coming from any thread observes the new value.
Considering the example above, we want to make sure that after the point in time when ChangeStrategy is executed, it can't happen that SomeStrategy.DoSomething is called instead of AnotherStrategy.DoSomething because some of the threads don't observe the change and use the old value cached in a register/CPU cache/whatever.
To my knowledge of the topic, we need at least a volatile read to prevent such caching. The main question is: is that enough, or do we need Interlocked.CompareExchange(ref _strategy, null, null) instead to achieve the correct behavior?
If a volatile read is enough, a further question arises: do we need Interlocked.Exchange at all, or would even a volatile write be OK in this case?
As I understand it, volatile reads/writes use half-fences, which allow a write followed by a read to be reordered, whose implications I still can't fully understand, to be honest. However, as per the ECMA-335 specification, section I.12.6.5: "The class library provides a variety of atomic operations in the System.Threading.Interlocked class. These operations (e.g., Increment, Decrement, Exchange, and CompareExchange) perform implicit acquire/release operations." So, if I understand this correctly, Interlocked.Exchange should create a full fence, which looks like enough.
But, to complicate things further, it seems that not all Interlocked operations were implemented according to the specification on every platform.
I'd be very grateful if someone could clear this up.
Yes, your code is safe. It is functionally equivalent to using a lock, like this:
public void DoSomething()
{
    Strategy strategy;
    lock (_locker) strategy = _strategy;
    strategy.DoSomething();
}

public void ChangeStrategy()
{
    Strategy strategy = new AnotherStrategy();
    lock (_locker) _strategy = strategy;
}
Your code is more performant though, because the lock imposes a full fence, while the Volatile.Read imposes a potentially cheaper half fence.
You could improve the performance even more by replacing the Interlocked.Exchange (full fence) with a Volatile.Write (half fence). The only reason to prefer the Interlocked.Exchange over the Volatile.Write is when you want to retrieve the previous strategy as an atomic operation. Apparently this is not needed in your case.
For simplicity you could even get rid of the Volatile.Write/Volatile.Read calls, and just declare the _strategy field as volatile.
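A sketch of that simplified version, reusing the types from the question; with the field declared volatile, every plain access to it is a volatile read or write:
class SomeClassWhoseInstancesAreAccessedConcurrently
{
    private volatile Strategy _strategy = new SomeStrategy();

    public void DoSomething()
    {
        _strategy.DoSomething(); // volatile read of the field
    }

    public void ChangeStrategy()
    {
        _strategy = new AnotherStrategy(); // volatile write of the field
    }
}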

Read int value in thread safe way [duplicate]

(This is a repeat of: How to correctly read an Interlocked.Increment'ed int field? but, after reading the answers and comments, I'm still not sure of the right answer.)
There's some code that I don't own and can't change to use locks that increments an int counter (numberOfUpdates) in several different threads. All calls use:
Interlocked.Increment(ref numberOfUpdates);
I want to read numberOfUpdates in my code. Now since this is an int, I know that it can't tear. But what's the best way to ensure that I get the latest value possible? It seems like my options are:
int localNumberOfUpdates = Interlocked.CompareExchange(ref numberOfUpdates, 0, 0);
Or
int localNumberOfUpdates = Thread.VolatileRead(ref numberOfUpdates);
Will both work (in the sense of delivering the latest value possible regardless of optimizations, re-orderings, caching, etc.)? Is one preferred over the other? Is there a third option that's better?
I'm a firm believer in that if you're using Interlocked to increment shared data, then you should use Interlocked everywhere you access that shared data. Likewise, if you use [insert your favorite synchronization primitive here] to increment shared data, then you should use [insert your favorite synchronization primitive here] everywhere you access that shared data.
int localNumberOfUpdates = Interlocked.CompareExchange(ref numberOfUpdates, 0, 0);
Will give you exactly what you're looking for. As others have said, interlocked operations are atomic, so Interlocked.CompareExchange will always return the most recent value. I use this all the time for accessing simple shared data like counters.
I'm not as familiar with Thread.VolatileRead, but I suspect it will also return the most recent value. I'd stick with the Interlocked methods, if only for the sake of being consistent.
Additional info:
I'd recommend taking a look at Jon Skeet's answer for why you may want to shy away from Thread.VolatileRead(): Thread.VolatileRead Implementation
Eric Lippert discusses volatility and the guarantees made by the C# memory model in his blog at http://blogs.msdn.com/b/ericlippert/archive/2011/06/16/atomicity-volatility-and-immutability-are-different-part-three.aspx. Straight from the horse's mouth: "I don't attempt to write any low-lock code except for the most trivial usages of Interlocked operations. I leave the usage of "volatile" to real experts."
And I agree with Hans's point that the value will always be stale, at least by a few ns, but if you have a use case where that is unacceptable, it's probably not well suited for a garbage-collected language like C# or a non-real-time OS. Joe Duffy has a good article on the timeliness of interlocked methods here: http://joeduffyblog.com/2008/06/13/volatile-reads-and-writes-and-timeliness/
Thread.VolatileRead(ref numberOfUpdates) is what you want. numberOfUpdates is an Int32, so you already have atomicity by default, and Thread.VolatileRead will ensure volatility is dealt with.
If numberOfUpdates is defined as volatile int numberOfUpdates; you don't have to do this, as all reads of it will already be volatile reads.
There seems to be confusion about whether Interlocked.CompareExchange is more appropriate. Consider the following two excerpts from the documentation.
From the Thread.VolatileRead documentation:
Reads the value of a field. The value is the latest written by any processor in a computer, regardless of the number of processors or the state of processor cache.
From the Interlocked.CompareExchange documentation:
Compares two 32-bit signed integers for equality and, if they are equal, replaces one of the values.
In terms of the stated behavior of these methods, Thread.VolatileRead is clearly more appropriate. You do not want to compare numberOfUpdates to another value, and you do not want to replace its value. You want to read its value.
Lasse makes a good point in his comment: you might be better off using simple locking. When the other code wants to update numberOfUpdates it does something like the following.
lock (state)
{
    state.numberOfUpdates++;
}
When you want to read it, you do something like the following.
int value;
lock (state)
{
    value = state.numberOfUpdates;
}
This will ensure your requirements of atomicity and volatility without delving into more-obscure, relatively low-level multithreading primitives.
Will both work (in the sense of delivering the latest value possible regardless of optimizations, re-orderings, caching, etc.)?
No, the value you get is always stale. How stale the value might be is entirely unpredictable. The vast majority of the time it will be stale by a few nanoseconds, give or take, depending how quickly you act on the value. But there is no reasonable upper-bound:
your thread can lose the processor when it context-switches another thread onto the core. Typical delays are around 45 msec with no firm upper-bound. This does not mean that another thread in your process also gets switched-out, it can keep motoring and continue to mutate the value.
just like any user-mode code, your code is subjected to page-faults as well. Incurred when the processor needs RAM for another process. On a heavily loaded machine that can and will page-out active code. As sometimes happens to the mouse driver code for example, leaving a frozen mouse cursor.
managed threads are subject to near-random garbage collection pauses. Tends to be the lesser problem since it is likely that another thread that's mutating the value will be paused as well.
Whatever you do with the value needs to take this into account. Needless to say perhaps, that's very, very difficult. Practical examples are hard to come by. The .NET Framework is a very large chunk of battle-scarred code. You can see the cross-reference to usage of VolatileRead from the Reference Source. Number of hits: 0.
Well, any value you read will always be somewhat stale, as Hans Passant said. You can only guarantee that other shared values are consistent with the one you've just read (i.e. at the same degree of "staleness") by using memory fences in the middle of code that reads several shared values without locks.
Fences also have the effect of defeating some compiler optimizations and reordering thus preventing unexpected behavior in release mode on different platforms.
Thread.VolatileRead will cause a full memory fence to be emitted so that no reads or writes can be reordered around your read of the int (in the method that's reading it). Obviously if you're only reading a single shared value (and you're not reading something else shared and the order and consistency of them both is important), then it may not seem necessary...
But I think that you will need it anyway to defeat some optimizations by the compiler or CPU so that you don't get the read more "stale" than necessary.
A dummy Interlocked.CompareExchange will do the same thing as Thread.VolatileRead (full fence and optimization defeating behavior).
There is a pattern followed in the framework by CancellationTokenSource:
http://referencesource.microsoft.com/#mscorlib/system/threading/CancellationTokenSource.cs#64
// m_state uses the pattern "volatile int32 reads, with cmpxch writes",
// which is safe for updates and cannot suffer torn reads.
private volatile int m_state;

public bool IsCancellationRequested
{
    get { return m_state >= NOTIFYING; }
}

// ....

if (Interlocked.CompareExchange(ref m_state, NOTIFYING, NOT_CANCELED) == NOT_CANCELED)
{
}

// ....
The volatile keyword has the effect of emitting a "half" fence (i.e. it blocks reads/writes from being moved before a read of it, and blocks reads/writes from being moved after a write to it).
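Applied to a plain counter, the same "volatile int32 reads, with cmpxch writes" idea might look like this sketch (the UpdateCounter class is illustrative, not from the framework):
using System.Threading;

public class UpdateCounter
{
    private volatile int _count; // every plain read of this field is a volatile read

    public int Count
    {
        get { return _count; }
    }

    public void Increment()
    {
        // Interlocked supplies atomicity plus a full fence for the write;
        // the CS0420 "volatile field passed by ref" warning is benign here
        Interlocked.Increment(ref _count);
    }
}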
It seems like my options are:
int localNumberOfUpdates = Interlocked.CompareExchange(ref numberOfUpdates, 0, 0);
Or
int localNumberOfUpdates = Thread.VolatileRead(ref numberOfUpdates);
Starting from the .NET Framework 4.5, there is a third option:
int localNumberOfUpdates = Volatile.Read(ref numberOfUpdates);
The Interlocked methods impose full fences, while the Volatile methods impose half fences¹. So using the static methods of the Volatile class is a potentially more economical way of atomically reading the latest value of an int variable or field.
Alternatively, if numberOfUpdates is a field of a class, you could declare it as volatile. Reading a volatile field is equivalent to reading it with the Volatile.Read method.
I should mention one more option, which is to simply read numberOfUpdates directly, without the help of either the Interlocked or the Volatile class. We are not supposed to do this, but demonstrating an actual problem caused by doing so might be impossible. The reason is that the memory models of the most commonly used CPUs are stronger than the C# memory model. So if your machine has such a CPU (for example x86 or x64), you won't be able to write a program that fails as a result of reading the field directly. Nevertheless, personally I never use this option, because I am not an expert in CPU architectures and memory protocols, nor do I have the desire to become one. So I prefer to use either the Volatile class or the volatile keyword, whichever is more convenient in each case.
¹ With some exceptions, like reading/writing an Int64 or double on a 32-bit machine.
Not sure why nobody mentioned Interlocked.Add(ref numberOfUpdates, 0), but it seems the simplest way to me...

Possibly incorrect implementation of double-checked locking

I've found in our project's code the following implementation of double-lock checking:
public class SomeComponent
{
    private readonly object mutex = new object();

    public SomeComponent()
    {
    }

    public bool IsInitialized { get; private set; }

    public void Initialize()
    {
        this.InitializeIfRequired();
    }

    protected virtual void InitializeIfRequired()
    {
        if (!this.OnRequiresInitialization())
        {
            return;
        }
        lock (this.mutex)
        {
            if (!this.OnRequiresInitialization())
            {
                return;
            }
            try
            {
                this.OnInitialize();
            }
            catch (Exception)
            {
                throw;
            }
            this.IsInitialized = true;
        }
    }

    protected virtual void OnInitialize()
    {
        // some code here
    }

    protected virtual bool OnRequiresInitialization()
    {
        return !this.IsInitialized;
    }
}
From my point of view, this is the wrong implementation due to the absence of guarantees that different threads will see the freshest value of the IsInitialized property.
And the question is "Am I right?".
Update:
The scenario that I'm afraid to happen, is the following:
Step 1. Thread1 is executed on Processor1 and writes true into IsInitialized inside the lock section. At this time the old value of IsInitialized (it's false) is in the cache of Processor1. As we know, processors have store buffers, so Processor1 can put the new value (true) into its store buffer, not into its cache.
Step 2. Thread2 is inside InitializeIfRequired, executed on Processor2, and reads IsInitialized. There is no value of IsInitialized inside the cache of Processor2, so Processor2 asks for the value of IsInitialized from the other processors' caches or from memory. Processor1 has the value of IsInitialized inside its cache (but remember, it's the old value; the updated value is still in the store buffer of Processor1), so it sends the old value to Processor2. As a result, Thread2 can read false instead of true.
Update 2:
If the lock (this.mutex) flushes processors' store buffers, then everything is ok, but is that guaranteed?
this is the wrong implementation due to the absence of guarantees that different threads will see the freshest value of the IsInitialized property. The question is "Am I right?".
You are correct that this is a broken implementation of double-checked locking. You are wrong in multiple subtle ways about why it is wrong.
First, let's disabuse you of your wrongness.
The belief that there is a "freshest" value of any variable in a multithreaded program is a bad belief, for two reasons. The first reason is that yes, C# makes guarantees about certain constraints on how reads and writes may be re-ordered. However, those guarantees do not include any promise that a globally consistent ordering exists and can be deduced by all threads. It is legal in the C# memory model for there to be reads and writes on variables, and for there to be ordering constraints on those reads and writes. But in cases where those constraints are not strong enough to enforce exactly one ordering of reads and writes, it is permissible for there to be no "canonical" order observed by all threads. It is permitted for two threads to agree that the constraints were all met, but still disagree upon what order was chosen. This logically implies that the notion that there is a single, canonical "freshest" value for each variable is simply wrong. Different threads can disagree as to which writes are "fresher" than others.
The second reason is that even without this weird property that the model admits two threads to disagree on the sequence of reads and writes, it would still be wrong to say that in any low-lock program you have a way to read the "freshest" value. All the primitive operations you have guarantee you is that certain writes and reads will not be moved forwards or backwards in time past certain points in the code. Nothing in there says anything whatsoever about "freshest", whatever that means. The best you can say is that some reads will read a fresher value. The notion of "freshest" is not defined by the memory model.
Another way you are wrong is very subtle indeed. You are doing a great job of reasoning about what might happen based on processors flushing caches. But nowhere in the C# documentation does it say one word about processors flushing caches! That's a chip implementation detail that is subject to change any time your C# program runs on a different architecture. Do not reason about processors flushing caches unless you know your program will run on exactly one architecture, and that you thoroughly understand that architecture. Rather, reason about the constraints imposed by the memory model. I am aware that the documentation on the model is sorely lacking, but that's the thing you should be reasoning about, because that's what you can actually depend on.
The other way that you are wrong is that though yes, the implementation is broken, it is not broken because you are not reading an up-to-date value of the initialized flag. The problem is that the initialized state that is controlled by the flag is not subject to restrictions on being moved around in time!
Let's make your example a bit more concrete:
private C c = null;

protected virtual void OnInitialize()
{
    c = new C();
}
And a usage site:
this.InitializeIfRequired();
this.c.Frob();
Now we come to the real problem. Nothing is stopping the reads of IsInitialized and c from being moved around in time.
Suppose threads Alpha and Bravo are both running this code. Thread Bravo wins the race and the first thing it does is read c as null. Remember, it is allowed to do so because there is no ordering constraint on the reads and writes, because Bravo is never going to enter the lock.
Realistically, how might this happen? The C# compiler or the jitter are permitted to move the read instruction earlier, but they don't. Briefly returning to the real world of cached architectures, the read of c might be logically moved up in front of the read of the flag because c is already in the cache. Maybe it was close to a different variable that was read recently. Or maybe branch prediction is predicting that the flag is going to cause you to skip the lock, and the processor pre-fetches the value. But again, it doesn't matter what the real-world scenario is; that's all chip implementation details. The C# spec permits this read to be done early, so assume that at some point it will be done early!
Back to our scenario. We immediately switch to thread Alpha.
Thread Alpha runs as you expect it to. It sees that the flag says that initialization is required, takes the lock, initializes c, sets the flag, and leaves.
Now thread Bravo runs again, the flag now says that initialization is not required, and so we use the version of c that we read earlier, and dereference null.
Double-checked locking is correct in C# as long as you strictly follow the exact double-checked locking pattern. The moment you diverge from it even slightly you are off in the weeds of horrible, unreproducible, race condition bugs like the one I just described. Just don't go there:
Don't share memory across threads. The takeaway that I take from knowing everything I just told you is I am not smart enough to write multithreaded code that shares memory and works by design. I am only smart enough to write multithreaded code that works by accident, and that's not acceptable to me.
If you must share memory across threads, lock every access, without exception. It's not that expensive! And you know what is more expensive? Dealing with a series of unreproducible fatal crashes that all lose user data.
If you must share memory across threads and you must have low-lock lazy initialization, good heavens, do not write it yourself. Use Lazy&lt;T&gt;; it contains a correct implementation of low-lock lazy initialization that you can rely on being correct on all processor architectures.
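For example, a minimal sketch of the component from the question rebuilt on Lazy<T> (keeping the member names where possible; the bool payload is just a placeholder):
using System;

public class SomeComponent
{
    private readonly Lazy<bool> _initializer;

    public SomeComponent()
    {
        // The default LazyThreadSafetyMode.ExecutionAndPublication guarantees
        // OnInitialize runs at most once and its effects are published to all threads
        _initializer = new Lazy<bool>(() =>
        {
            OnInitialize();
            return true;
        });
    }

    public bool IsInitialized
    {
        get { return _initializer.IsValueCreated; }
    }

    public void Initialize()
    {
        bool ignored = _initializer.Value; // triggers the one-time initialization
    }

    protected virtual void OnInitialize()
    {
        // some code here
    }
}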
Follow-up question:
If the lock (this.mutex) flushes processors' store buffers, then everything is ok, but is that guaranteed?
To clarify, this question is about whether the initialized flag is read correctly in the double-checked locking scenario. Let's again address your misconceptions here.
The initialized flag is guaranteed to be read correctly inside the lock because it is written inside the lock.
However, the correct way to think about this, as I mentioned before, is not to reason anything about flushing caches. The correct way to reason about this is that the C# specification puts restrictions on how reads and writes can be moved around in time with respect to locks.
In particular, a read inside a lock may not be moved to before the lock, and a write inside a lock may not be moved to after the lock. Those facts, combined with the fact that locks provide mutual exclusion, are sufficient to conclude that the read of the initialized flag is correct inside the lock.
Again, if you are not comfortable making these kinds of deductions -- and I am not! -- then do not write low-lock code.

Benefits of locking a variable? [duplicate]

Let's just say you have a simple operation that runs on a background thread. You want to provide a way to cancel this operation, so you create a boolean flag that you set to true from the click event handler of a cancel button.
private bool _cancelled;

private void CancelButton_Click(object sender, ClickEventArgs e)
{
    _cancelled = true;
}
Now you're setting the cancel flag from the GUI thread, but you're reading it from the background thread. Do you need to lock before accessing the bool?
Would you need to do this (and obviously lock in the button click event handler too):
while (operationNotComplete)
{
    // Do complex operation
    lock (_lockObject)
    {
        if (_cancelled)
        {
            break;
        }
    }
}
Or is it acceptable to do this (with no lock):
while (!_cancelled && operationNotComplete)
{
    // Do complex operation
}
Or what about marking the _cancelled variable as volatile? Is that necessary?
[I know there is the BackgroundWorker class with its built-in CancelAsync() method, but I'm interested in the semantics and use of locking and threaded variable access here, not the specific implementation; the code is just an example.]
There seems to be two theories.
1) Because it is a simple built-in type (and access to built-in types is atomic in .NET), and because we are only writing to it in one place and only reading it on the background thread, there is no need to lock or mark it as volatile.
2) You should mark it as volatile, because if you don't, the compiler may optimise out the read in the while loop, since it thinks nothing is capable of modifying the value.
Which is the correct technique? (And why?)
[Edit: There seem to be two clearly defined and opposing schools of thought on this. I am looking for a definitive answer on this so please if possible post your reasons and cite your sources along with your answer.]
Firstly, threading is tricky ;-p
Yes, despite all the rumours to the contrary, it is required to either use lock or volatile (but not both) when accessing a bool from multiple threads.
For simple types and access such as an exit flag (bool), volatile is sufficient - this ensures that threads don't cache the value in their registers (which would mean one of the threads never sees updates). A sketch follows at the end of this answer.
For larger values (where atomicity is an issue), or where you want to synchronize a sequence of operations (a typical example being "check if not exists, then add" dictionary access), a lock is more versatile. This acts as a memory barrier, so it still gives you thread safety, but provides other features such as pulse/wait. Note that you shouldn't lock on a value-type or a string, nor on Type or this; the best option is to have your own locking object as a field (readonly object syncLock = new object();) and lock on that.
For an example of how badly it breaks (i.e. looping forever) if you don't synchronize - see here.
To span multiple programs, an OS primitive like a Mutex or *ResetEvent may also be useful, but this is overkill for a single exe.
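Here is the promised sketch of the volatile exit-flag approach, reusing the fields from the question (ClickEventArgs and operationNotComplete come from there):
private volatile bool _cancelled;
private bool operationNotComplete = true; // from the question; only the worker touches it

private void CancelButton_Click(object sender, ClickEventArgs e)
{
    _cancelled = true; // volatile write, visible to the worker thread
}

private void DoWork()
{
    while (!_cancelled && operationNotComplete)
    {
        // Do complex operation; because _cancelled is volatile, the JIT
        // cannot hoist the read out of the loop into a register
    }
}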
_cancelled must be volatile (if you don't choose to lock).
If one thread changes the value of _cancelled, other threads might not see the updated result.
Also, I think the read/write operations of _cancelled are atomic:
Section 12.6.6 of the CLI spec states: "A conforming CLI shall guarantee that read and write access to properly aligned memory locations no larger than the native word size is atomic when all the write accesses to a location are the same size."
Locking is not required because you have a single-writer scenario, and a boolean field is a simple structure with no risk of a corrupted state (which would mean getting a boolean value that is neither false nor true). But you have to mark the field as volatile to prevent the compiler from doing some optimizations. Without the volatile modifier, the compiler could cache the value in a register during the execution of the loop on your worker thread, and in turn the loop would never recognize the changed value. This MSDN article (How to: Create and Terminate Threads (C# Programming Guide)) addresses this issue.
While there is no need for locking here, a lock would have the same effect as marking the field volatile.
For thread synchronization, it's recommended that you use one of the EventWaitHandle classes, such as ManualResetEvent. While it's marginally simpler to employ a simple boolean flag as you do here (and yes, you'd want to mark it as volatile), IMO it's better to get into the practice of using the threading tools. For your purposes, you'd do something like this...
private System.Threading.ManualResetEvent threadStop;

void StartThread()
{
    // do your setup

    // instantiate it unset
    threadStop = new System.Threading.ManualResetEvent(false);

    // start the thread
}
In your thread..
while (!threadStop.WaitOne(0) && !operationComplete)
{
    // work
}
Then in the GUI to cancel...
threadStop.Set();
Look up Interlocked.Exchange(). It does a very fast copy into a local variable which can be used for comparison. It is faster than lock().

What is thread safe (C#) ? (Strings, arrays, ... ?)

I'm quite new to C#, so please bear with me. I'm a bit confused about thread safety. When is something thread safe and when is it not?
Is reading (just reading from something that was initialized before) from a field always thread safe?
// EXAMPLE
RSACryptoServiceProvider rsa = new RSACryptoServiceProvider();
rsa.FromXmlString(xmlString);
// Is this thread safe if xmlString is predefined
// and this code can be called from multiple threads?
Is accessing an object from an array or list always thread safe (in case you use a for loop for enumeration)?
// EXAMPLE (a is local to the thread; array and list are global)
int a = 0;
for (int i = 0; i < 10; i++)
{
    a += array[i];
    a -= list.ElementAt(i);
}
Is enumeration always/ever thread safe?
// EXAMPLE
foreach (Object o in list)
{
    // do something with o
}
Can writing to and reading from a particular field ever result in a corrupted read (half of the field is changed and half is still unchanged)?
Thank you for all your answers and time.
EDIT: I meant if all threads are only reading and using (not writing or changing) the object (except for the last question, where it is obvious that I meant threads both read and write). Because I do not know if plain access or enumeration is thread safe.
It's different for different cases, but in general, reading is safe if all threads are reading. If any are writing, neither reading nor writing is safe unless it can be done atomically (inside a synchronized block or with an atomic type).
It isn't definite that reading is OK -- you never know what is happening under the hood -- for example, a getter might need to initialize data on first usage (therefore writing to its fields).
For Strings, you are in luck -- they are immutable, so all you can do is read them. With other types, you will have to take precautions against them changing in other threads while you are reading them.
Is reading (just reading from something that was initialized before) from a field always thread safe?
The C# language guarantees that reads and writes are consistently ordered when the reads and writes are on a single thread in section 3.10:
Data dependence is preserved within a thread of execution. That is, the value of each variable is computed as if all statements in the thread were executed in original program order. Initialization ordering rules are preserved.
Events in a multithreaded, multiprocessor system do not necessarily have a well-defined consistent ordering in time with respect to each other. The C# language does not guarantee there to be a consistent ordering. A sequence of writes observed by one thread may be observed to be in a completely different order when observed from another thread, so long as no critical execution point is involved.
The question is therefore unanswerable because it contains an undefined word. Can you give a precise definition of what "before" means to you with respect to events in a multithreaded, multiprocessor system?
The language guarantees that side effects are ordered only with respect to critical execution points, and even then, does not make any strong guarantees when exceptions are involved. Again, to quote from section 3.10:
Execution of a C# program proceeds such that the side effects of each executing thread are preserved at critical execution points. A side effect is defined as a read or write of a volatile field, a write to a non-volatile variable, a write to an external resource, and the throwing of an exception. The critical execution points at which the order of these side effects must be preserved are references to volatile fields, lock statements, and thread creation and termination. [...] The ordering of side effects is preserved with respect to volatile reads and writes.
Additionally, the execution environment need not evaluate part of an expression if it can deduce that that expression’s value is not used and that no needed side effects are produced (including any caused by calling a method or accessing a volatile field). When program execution is interrupted by an asynchronous event (such as an exception thrown by another thread), it is not guaranteed that the observable side effects are visible in the original program order.
Is accessing an object from an array or list always thread safe (in case you use a for loop for enumeration)?
By "thread safe" do you mean that two threads will always observe consistent results when reading from a list? As noted above, the C# language makes very limited guarantees about observation of results when reading from variables. Can you give a precise definition of what "thread safe" means to you with respect to non-volatile reading?
Is enumeration always/ever thread safe?
Even in single threaded scenarios it is illegal to modify a collection while enumerating it. It is certainly unsafe to do so in multithreaded scenarios.
Can writing and reading to a particular field ever result in a corrupted read (half of the field is changed and half is still unchanged) ?
Yes. I refer you to section 5.5, which states:
Reads and writes of the following data types are atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type in the previous list are also atomic. Reads and writes of other types, including long, ulong, double, and decimal, as well as user-defined types, are not guaranteed to be atomic. Aside from the library functions designed for that purpose, there is no guarantee of atomic read-modify-write, such as in the case of increment or decrement.
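To illustrate the non-atomic cases, here is a hypothetical repro of a torn read. It only reproduces in a 32-bit process, where a long is written as two 32-bit halves (on x64, aligned 64-bit accesses happen to be atomic):
using System;
using System.Threading;

class TornReadDemo
{
    static long _value; // written as two 32-bit halves in a 32-bit process

    static void Main()
    {
        var writer = new Thread(() =>
        {
            bool flip = false;
            while (true)
            {
                _value = flip ? 0L : -1L; // the halves of 0 and -1 differ completely
                flip = !flip;
            }
        });
        writer.IsBackground = true;
        writer.Start();

        while (true)
        {
            long read = _value; // plain, non-atomic read
            if (read != 0L && read != -1L)
            {
                Console.WriteLine("Torn read: 0x{0:X16}", read);
                return;
            }
        }
    }
}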
Well, I generally assume everything is thread-unsafe. For quick and dirty access to global objects in a threaded environment I use the lock(object) keyword. .NET has an extensive set of synchronization primitives, like the different semaphores and such.
Reading can be thread-unsafe if there are any threads that are writing (if they modify a collection in the middle of a read, for example, you'll be hit with an exception).
If you must do this, then you can do:
lock (i)
{
    i.GetElementAt(a);
}
This will force thread-safety on i (as long as other threads similarly attempt to lock i before they use it). Only one thing can lock a reference type at a time.
In terms of enumeration, I'll refer to the MSDN:
"The enumerator does not have exclusive access to the collection; therefore, enumerating through a collection is intrinsically not a thread-safe procedure. To guarantee thread safety during enumeration, you can lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization."
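A minimal sketch of that first option (the Totals class and its lock object are illustrative), holding the lock for the entire enumeration:
using System.Collections.Generic;

public class Totals
{
    private readonly object _listLock = new object();
    private readonly List<int> _list = new List<int>();

    public void Add(int item)
    {
        lock (_listLock) // writers take the same lock as readers
        {
            _list.Add(item);
        }
    }

    public int Sum()
    {
        lock (_listLock) // hold the lock for the whole enumeration
        {
            int sum = 0;
            foreach (int item in _list)
                sum += item;
            return sum;
        }
    }
}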
An example of no thread-safety: several threads increment an integer. You can set it up so that there is a predetermined number of increments. What you may observe, though, is that the int has not been incremented as much as you thought it would be. What happens is that two threads may increment from the same value of the integer. This is but one example of a plethora of effects you may observe when working with several threads.
PS
A thread-safe increment is available through Interlocked.Increment(ref i)
