volatile with release/acquire semantics - C#

Since Java 5, the volatile keyword has release/acquire semantics to make side-effects visible to other threads (including assignments to non-volatile variables!). Take these two variables, for example:
int i;
volatile int v;
Note that i is a regular, non-volatile variable. Imagine thread 1 executing the following statements:
i = 42;
v = 0;
At some later point in time, thread 2 executes the following statements:
int some_local_variable = v;
print(i);
According to the Java memory model, the write of v in thread 1 followed by the read of v in thread 2 ensures that thread 2 sees the write to i executed in thread 1, so the value 42 is printed.
My question is: does volatile have the same release/acquire semantics in C#?

The semantics of "volatile" in C# are defined in sections 7.10 and 14.5.4 of the C# 6.0 specification. Rather than reproduce them here, I encourage you to look it up in the spec, and then decide that it is too complicated and dangerous to use "volatile", and go back to using locks. That's what I always do.
See "7.10 Execution order" and "14.5.4 Volatile fields" in the C# 6.0 specification.
In the C# 5.0 specification, see "3.10 Execution Order" and "10.5.3 Volatile Fields"

Well, I believe it ensures that if some_local_variable is read as 0 (due to the write to v), i will be read as 42.
The tricky part is "at some later point in time". While volatility is commonly talked about in terms of "flushing" writes, that's not how it's actually defined in either the Java or the C# spec.
From the C# 4 language spec, section 10.5.3:
For non-volatile fields, optimization techniques that reorder instructions can lead to unexpected and unpredictable results in multi-threaded programs that access fields without synchronization such as that provided by the lock-statement (§8.12). These optimizations can be performed by the compiler, by the run-time system, or by hardware. For volatile fields, such reordering optimizations are restricted:
A read of a volatile field is called a volatile read. A volatile read has “acquire semantics”; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence.
A write of a volatile field is called a volatile write. A volatile write has “release semantics”; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence.
There's then an example which is quite similar to yours, but conditional on the value read from the volatile variable.
And like Eric, I'd strongly avoid relying on volatile myself. It's hard to reason about, and best left to the Joe Duffy/Stephen Toubs of the world.
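For concreteness, here is a minimal C# sketch of the same pattern (the class and method names are mine, mirroring the Java example above; as noted, the guarantee is conditional on actually observing the volatile write):

using System;

class ReleaseAcquireDemo
{
    static int i;             // regular, non-volatile field
    static volatile int v;    // volatile field

    static void Thread1()
    {
        i = 42;  // ordinary write
        v = 1;   // volatile write: release -- the write to i cannot be moved after it
    }

    static void Thread2()
    {
        if (v == 1)               // volatile read: acquire -- later reads cannot be moved before it
            Console.WriteLine(i); // prints 42 once the write to v has been observed
    }
}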

Does volatile prevent introduced reads or writes?

In C#, the volatile keyword ensures that reads and writes have acquire and release semantics, respectively. However, does it say anything about introduced reads or writes?
For instance:
volatile Thing something;
volatile int aNumber;
void Method()
{
    // Are these lines...
    var local = something;
    if (local != null)
        local.DoThings();

    // ...guaranteed not to be transformed into these by compiler, jitter or processor?
    if (something != null)
        something.DoThings(); // <-- Second read!

    // Are these lines...
    if (aNumber == 0)
        aNumber = 1;

    // ...guaranteed not to be transformed into these by compiler, jitter or processor?
    var temp = aNumber;
    if (temp == 0)
        temp = 1;
    aNumber = temp; // <-- An out-of-thin-air write!
}
Here's what the C# spec [1] has to say about Execution Order:
Execution of a C# program proceeds such that the side effects of each executing thread are preserved at critical execution points. A side effect is defined as a read or write of a volatile field ...
The execution environment is free to change the order of execution of a C# program, subject to the following constraints:
...
The ordering of side effects is preserved with respect to volatile reads and writes ...
I would certainly consider introducing new side effects to be changing the order of side effects, but it's not explicitly stated like that here.
[1] The link in the answer is to the C# 6 spec, which is listed as a draft. The C# 5 spec isn't a draft but is not available online, only as a download. The wording in this section is identical, so far as I can see.
This wording from the C# spec:
The ordering of side effects is preserved with respect to volatile reads and writes...
may be interpreted as implying that read and write introductions on volatile variables are not allowed, but it is really ambiguous and it depends on the meaning of "ordering." If it is referring to relative ordering of existing accesses, then introducing new reads or writes does not change that and so it would not violate this part of the spec. If it is referring to the exact position of all memory accesses in program order, then introducing new accesses would violate the spec.
This article says that reads of non-volatile variables might be introduced, but does not say explicitly whether such introduction is also disallowed for volatile variables.
This Q/A discusses how to prevent read introduction (but no discussion on write introduction).
In the comments under this article, two Microsoft employees (at least at the time the comments were written) explicitly state that read and write introductions on volatile variables are not allowed.
Stephen Toub
"read introduction" is one mechanism by which a memory reordering
might be introduced.
Igor Ostrovsky
Elsewhere in the C# specification, a volatile read is defined to be a "side effect". As a result, repeating the read of m_paused would be equivalent to adding another side effect, which is not allowed.
I think we can conclude from these comments that introducing a side effect out-of-thin-air in C#, any kind of side effect, anywhere in the code is not allowed.
A related quote from the CLI standard states the following in Section I.12.6.7:
An optimizing compiler that converts CIL to native code shall not remove any volatile operation, nor shall it coalesce multiple volatile operations into a single operation.
As far as I know, the CLI does not explicitly talk about introducing new side effects.
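If you would rather not depend on this reading of the spec, one defensive option is to make the single read explicit. A sketch (my own, not from the answers above; the field is deliberately non-volatile here, since passing a volatile field by ref triggers warning CS0420):

using System.Threading;

class Defensive
{
    private Thing something; // non-volatile on purpose; all access goes through Volatile.Read/Write

    public void Method()
    {
        Thing local = Volatile.Read(ref something); // exactly one (acquire) read of the field
        if (local != null)
            local.DoThings(); // operates on the value we actually read, even if the field changes
    }
}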
I wonder whether you have misunderstood what volatile means. Volatile can be used with types that can be read or written as an atomic action.
There is no acquire/release of a lock, only a barrier against compile-time and run-time reordering, to provide lockless acquire/release semantics (https://preshing.com/20120913/acquire-and-release-semantics/). On non-x86 this may require barrier instructions in the asm, but not taking a lock.
volatile indicates that a field may be modified by other threads, which is why reads/writes of it need to be treated as atomic and not optimised away.
Your question is a little ambiguous.
1/ If you mean, will the compiler transform:
var local = something;
if (local != null) local.DoThings();
into:
if (something != null) something.DoThings();
then the answer is no.
2/ If you mean, will "DoThings()" be called twice on the same object:
var local = something;
if (local != null) local.DoThings();
if (something != null) something.DoThings();
then the answer is mostly yes, unless another thread has changed the value of "something" before the second "DoThings()" is invoked. Note that this can also give you a runtime error: if, after the "if" condition is evaluated and before "DoThings" is called, another thread sets "something" to null, you will get a NullReferenceException. I assume this is why you have your "var local = something;".
3/ If you mean will the following cause two reads:
if (something != null) something.DoThings();
then yes, one read for the condition and a second read when it invokes DoThings() (assuming that something is not null). Were it not marked volatile then the compiler might manage that with a single read.
In any event, the implementation of the function "DoThings()" needs to be aware that it could be called by multiple threads, so it would need to incorporate a combination of locks and its own volatile members.
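A sketch of what that might look like (class and field names are mine, purely illustrative):

class Thing
{
    private readonly object _gate = new object();
    private int _timesCalled;

    public void DoThings()
    {
        lock (_gate) // serialize concurrent callers around the mutable state
        {
            _timesCalled++;
            // ... work on other shared state safely here ...
        }
    }
}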

Read elimination and concurrency

Given the following simple code:
using System.Threading;

class Program
{
    static bool finish = false;

    static void Main(string[] args)
    {
        new Thread(ThreadProc).Start();
        int x = 0;
        while (!finish)
        {
            x++;
        }
    }

    static void ThreadProc()
    {
        Thread.Sleep(1000);
        finish = true;
    }
}
and running it in Release mode with MSVS2015 (.NET 4.6), we get an application that never terminates. That happens because the JIT compiler generates code that reads finish only once, and hence ignores any later updates.
The question is: why is the JIT-compiler allowed to do such an optimization? What part of the specification allows it?
This is covered in section 10.5.3 - Volatile Fields in the C# Specification:
(I've emphasized the part which covers your observations below)
10.5.3 Volatile Fields
When a field-declaration includes a volatile modifier, the fields introduced by that declaration are volatile fields.
For non-volatile fields, optimization techniques that reorder instructions can lead to unexpected and unpredictable results in multithreaded programs that access fields without synchronization, such as that provided by the lock-statement (§8.12). These optimizations can be performed by the compiler, by the runtime system, or by hardware. For volatile fields, such reordering optimizations are restricted:
* A read of a volatile field is called a volatile read. A volatile read has “acquire semantics”; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence.
* A write of a volatile field is called a volatile write. A volatile write has “release semantics”; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence.
These restrictions ensure that all threads will observe volatile writes performed by any other thread in the order in which they were performed. A conforming implementation is not required to provide a single total ordering of volatile writes as seen from all threads of execution.
The compiler, in effect, makes the following promise:
I will execute your code in the order that you've asked me to; that is, any instructions that run in your single-threaded code will be performed in the order they were written, and based on this I will do any optimizations I please that conform with single-threaded sequentiality.
The compiler sees that the finish variable is not marked volatile, assumes it will not be changed by other threads, and so hoists the read out of the loop, effectively treating the condition as always true.
In Debug mode the JIT is far less aggressive and does not perform this optimization.
More on this here.
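One minimal fix (my sketch; a lock, Volatile.Read, or a CancellationToken would work just as well) is to mark the field volatile, which forbids the JIT from hoisting the read out of the loop:

// The decisive change to the program above: with a volatile field the loop
// re-reads it on every iteration and terminates once ThreadProc sets the flag.
static volatile bool finish = false;

// Alternatively, keep the field non-volatile and make each read explicit:
// while (!Volatile.Read(ref finish)) { x++; }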

Why make a bool volatile when multithreading in C#?

I understand the volatile keyword in C++ fairly well. But in C#, it appears to take on a different meaning, more closely related to multithreading. I thought bool operations were atomic, and I thought that if an operation were atomic, you wouldn't have threading concerns. What am I missing?
https://msdn.microsoft.com/en-us/library/x13ttww7.aspx
I thought bool operations were atomic
They are indeed atomic.
and I thought that if the operation were atomic, you wouldn't have threading concerns.
That is where you have an incomplete picture. Imagine you have two threads running on separate cores, each with their own cache layers. Thread #1 has foo in its cache, and Thread #2 updates the value of foo. Thread #1 won't see the change unless foo is marked as volatile, a lock is acquired, the Interlocked class is used, or Thread.MemoryBarrier() is explicitly called, which would cause the value to be invalidated in the cache, thus guaranteeing that you read the most up-to-date value:
Using the volatile modifier ensures that one thread retrieves the most up-to-date value written by another thread.
Eric Lippert has a great post about volatility where he explains:
The true semantics of volatile reads and writes are considerably more complex than I've outlined here; in fact they do not actually guarantee that every processor stops what it is doing and updates caches to/from main memory. Rather, they provide weaker guarantees about how memory accesses before and after reads and writes may be observed to be ordered with respect to each other.
Edit:
As per xanatos' comment, this doesn't mean that volatile guarantees an immediate read; it guarantees a read of the most up-to-date value.
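The canonical use case (essentially the example on the MSDN page linked in the question) is a stop flag polled by a worker thread; without volatile, the JIT could hoist the read and the worker might spin forever:

using System.Threading;

class Worker
{
    private volatile bool _shouldStop; // volatile: every iteration re-reads the flag

    public void DoWork()
    {
        while (!_shouldStop)
        {
            Thread.SpinWait(100); // simulate a unit of work
        }
    }

    public void RequestStop() => _shouldStop = true; // called from another thread
}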
The volatile keyword in C# is all about reading/writing reordering, so it is something quite esoteric.
http://www.albahari.com/threading/part4.aspx#_Memory_Barriers_and_Volatility
(which I consider one of the "bibles" on threading) writes:
The volatile keyword instructs the compiler to generate an acquire-fence on every read from that field, and a release-fence on every write to that field. An acquire-fence prevents other reads/writes from being moved before the fence; a release-fence prevents other reads/writes from being moved after the fence. These “half-fences” are faster than full fences because they give the runtime and hardware more scope for optimization.
It is something quite unreadable :-)
Now... What it doesn't mean:
It doesn't mean that a value will be read now/will be written now
it simply means that if you read something from a volatile variable, everything that is read/written after this "special" read won't be moved before it. So it creates a barrier. Paired with a volatile write in the publishing thread (whose release semantics keep earlier writes from being moved after it), this guarantees that once you observe the value of the volatile variable, you also observe all the writes the other thread made to any other variable (volatile or not) before it published that value.
A volatile write is probably even more important. The no-write-reordering it provides is partly guaranteed by Intel CPUs anyway, but it is something that wasn't guaranteed by the first version of Java. The problem was:
object myrefthatissharedwithotherthreads = new MyClass(5);
where
class MyClass
{
    public int Value;

    public MyClass(int value)
    {
        Value = value;
    }
}
Now... that expression can be imagined to be:
var temp = new MyClass();
temp.Value = 5;
myrefthatissharedwithotherthreads = temp;
where temp is something generated by the compiler that you can't see.
If the writes can be reordered, you could have:
var temp = new MyClass();
myrefthatissharedwithotherthreads = temp;
temp.Value = 5;
and another thread could see a partially initialized MyClass, because myrefthatissharedwithotherthreads becomes readable before the MyClass instance has finished initializing.
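A sketch of the standard C# fix (my own, reusing the MyClass above): declaring the shared reference volatile gives the publishing store release semantics and the consuming load acquire semantics:

using System;

class Publisher
{
    static volatile MyClass myrefthatissharedwithotherthreads;

    static void Publish()
    {
        // Volatile write (release): the write to Value inside the constructor
        // cannot be moved after the store that publishes the reference.
        myrefthatissharedwithotherthreads = new MyClass(5);
    }

    static void Consume()
    {
        // Volatile read (acquire): if we see the reference, we also see Value == 5.
        MyClass local = myrefthatissharedwithotherthreads;
        if (local != null)
            Console.WriteLine(local.Value); // never observes a half-initialized object
    }
}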

Where to place fences/memory barriers to guarantee a fresh read/committed writes?

Like many other people, I've always been confused by volatile reads/writes and fences. So now I'm trying to fully understand what these do.
So, a volatile read is supposed to (1) exhibit acquire-semantics and (2) guarantee that the value read is fresh, i.e., it is not a cached value. Let's focus on (2).
Now, I've read that, if you want to perform a volatile read, you should introduce an acquire fence (or a full fence) after the read, like this:
int local = shared;
Thread.MemoryBarrier();
How exactly does this prevent the read operation from using a previously cached value?
According to the definition of a fence (no read/stores are allowed to be moved above/below the fence), I would insert the fence before the read, preventing the read from crossing the fence and being moved backwards in time (aka, being cached).
How does preventing the read from being moved forwards in time (or subsequent instructions from being moved backwards in time) guarantee a volatile (fresh) read? How does it help?
Similarly, I believe that a volatile write should introduce a fence after the write operation, preventing the processor from moving the write forward in time (aka, delaying the write). I believe this would make the processor flush the write to the main memory.
But to my surprise, the C# implementation introduces the fence before the write!
[MethodImplAttribute(MethodImplOptions.NoInlining)] // disable optimizations
public static void VolatileWrite(ref int address, int value)
{
    MemoryBarrier(); // Call MemoryBarrier to ensure the proper semantic in a portable way.
    address = value;
}
Update
According to this example, apparently taken from "C# 4 in a Nutshell", fence 2, placed after a write, is supposed to force the write to be flushed to main memory immediately, and fence 3, placed before a read, is supposed to guarantee a fresh read:
class Foo
{
    int _answer;
    bool _complete;

    void A()
    {
        _answer = 123;
        Thread.MemoryBarrier(); // Barrier 1
        _complete = true;
        Thread.MemoryBarrier(); // Barrier 2
    }

    void B()
    {
        Thread.MemoryBarrier(); // Barrier 3
        if (_complete)
        {
            Thread.MemoryBarrier(); // Barrier 4
            Console.WriteLine(_answer);
        }
    }
}
The ideas in this book (and my own personal beliefs) seem to contradict the ideas behind C#'s VolatileRead and VolatileWrite implementations.
How exactly does this prevent the read operation from using a previously cached value?
It does no such thing. A volatile read does not guarantee that the latest value will be returned. In plain English all it really means is that the next read will return a newer value and nothing more.
How does preventing the read from being moved forwards in time (or subsequent instructions from being moved backwards in time) guarantee a volatile (fresh) read? How does it help?
Be careful with the terminology here. Volatile is not synonymous with fresh. As I already mentioned above, its real usefulness lies in how two or more volatile reads are chained together. The next read in a sequence of volatile reads will absolutely return a newer value than the previous read of the same address. Lock-free code should be written with this premise in mind. That is, the code should be structured to work on the principle of dealing with a newer value, not the latest value. This is why most lock-free code spins in a loop until it can verify that the operation completed successfully.
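As a concrete illustration of that principle (my own sketch, not from the original answer), a lock-free add spins until a compare-and-swap confirms the update was applied against the value that was actually read:

using System.Threading;

class LockFreeCounter
{
    private int _value;

    public void Add(int amount)
    {
        while (true)
        {
            int observed = _value;            // a newer value, possibly stale a moment later
            int desired = observed + amount;
            // Publish only if _value still holds what we observed;
            // otherwise another thread won the race, so retry with a newer value.
            if (Interlocked.CompareExchange(ref _value, desired, observed) == observed)
                return;
        }
    }
}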
The ideas in this book (and my own personal beliefs) seem to contradict the ideas behind C#'s VolatileRead and VolatileWrite implementations.
Not really. Remember volatile != fresh. Yes, if you want a "fresh" read then you need to place an acquire-fence before the read. But that is not the same as doing a volatile read. What I am saying is that if the implementation of VolatileRead had the call to Thread.MemoryBarrier before the read instruction, then it would not actually produce a volatile read. It would produce a fresh read, though.
The important thing to understand is that volatile does not only mean "cannot cache value"; it also gives important visibility guarantees. (To be exact, it's entirely possible to have a volatile write that only goes to cache; it depends solely on the hardware and the cache coherency protocols it uses.)
A volatile read gives acquire semantics, while a volatile write has release semantics. An acquire fence means that you cannot reorder reads or writes before the fence, while a release fence means you cannot move them after the fence. The linked answer in the comments explains that actually quite nicely.
Now the question is, if we don't have any memory barrier before the load how is it guaranteed that we'll see the newest value? The answer to that is: Because we also put memory barriers after each volatile write to guarantee that.
Doug Lea wrote a great summary of which barriers exist, what they do, and where to put them for volatile reads/writes in the JMM. It was written as a help for compiler writers, but the text is also quite useful for other people. Volatile reads and writes give the same guarantees in both Java and the CLR, so it's generally applicable.
Source - scroll down to the "Memory Barriers" section (I'd copy the interesting parts, but the formatting doesn't survive it..)

What is thread safe (C#) ? (Strings, arrays, ... ?)

I'm quite new to C#, so please bear with me. I'm a bit confused about thread safety. When is something thread safe and when isn't it?
Is reading (just reading from something that was initialized before) from a field always thread safe?
//EXAMPLE
RSACryptoServiceProvider rsa = new RSACryptoServiceProvider();
rsa.FromXmlString(xmlString);
//Is this thread safe if the XML string is predefined
//and this code can be called from multiple threads?
Is accessing an object from an array or list always thread safe (in case you use a for loop for enumeration)?
//EXAMPLE (a is local to thread, array and list are global)
int a = 0;
for (int i = 0; i < 10; i++)
{
    a += array[i];
    a -= list.ElementAt(i);
}
Is enumeration always/ever thread safe?
//EXAMPLE
foreach (Object o in list)
{
    // do something with o
}
Can writing to and reading from a particular field ever result in a corrupted read (half of the field changed and half still unchanged)?
Thank you for all your answers and time.
EDIT: I meant the case where all threads are only reading & using (not writing or changing) the object (except for the last question, where it is obvious that I meant threads that both read and write). I ask because I do not know whether plain access or enumeration is thread safe.
It's different for different cases, but in general, reading is safe if all threads are reading. If any are writing, neither reading nor writing is safe unless it can be done atomically (inside a synchronized block or with an atomic type).
It isn't certain that reading is OK -- you never know what is happening under the hood -- for example, a getter might need to initialize data on first use (and therefore write to local fields).
For Strings, you are in luck -- they are immutable, so all you can do is read them. With other types, you will have to take precautions against them changing in other threads while you are reading them.
Is reading (just reading from something that was initialized before) from a field always thread safe?
In section 3.10, the C# language guarantees that reads and writes are consistently ordered when they occur on a single thread:
Data dependence is preserved within a thread of execution. That is, the value of each variable is computed as if all statements in the thread were executed in original program order. Initialization ordering rules are preserved.
Events in a multithreaded, multiprocessor system do not necessarily have a well-defined consistent ordering in time with respect to each other. The C# language does not guarantee there to be a consistent ordering. A sequence of writes observed by one thread may be observed to be in a completely different order when observed from another thread, so long as no critical execution point is involved.
The question is therefore unanswerable because it contains an undefined word. Can you give a precise definition of what "before" means to you with respect to events in a multithreaded, multiprocessor system?
The language guarantees that side effects are ordered only with respect to critical execution points, and even then, does not make any strong guarantees when exceptions are involved. Again, to quote from section 3.10:
Execution of a C# program proceeds such that the side effects of each executing thread are preserved at critical execution points. A side effect is defined as a read or write of a volatile field, a write to a non-volatile variable, a write to an external resource, and the throwing of an exception. The critical execution points at which the order of these side effects must be preserved are references to volatile fields, lock statements, and thread creation and termination. [...] The ordering of side effects is preserved with respect to volatile reads and writes.
Additionally, the execution environment need not evaluate part of an expression if it can deduce that that expression’s value is not used and that no needed side effects are produced (including any caused by calling a method or accessing a volatile field). When program execution is interrupted by an asynchronous event (such as an exception thrown by another thread), it is not guaranteed that the observable side effects are visible in the original program order.
Is accessing an object from an array or list always thread safe (in case you use a for loop for enumeration)?
By "thread safe" do you mean that two threads will always observe consistent results when reading from a list? As noted above, the C# language makes very limited guarantees about observation of results when reading from variables. Can you give a precise definition of what "thread safe" means to you with respect to non-volatile reading?
Is enumeration always/ever thread safe?
Even in single threaded scenarios it is illegal to modify a collection while enumerating it. It is certainly unsafe to do so in multithreaded scenarios.
Can writing and reading to a particular field ever result in a corrupted read (half of the field is changed and half is still unchanged) ?
Yes. I refer you to section 5.5, which states:
Reads and writes of the following data types are atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type in the previous list are also atomic. Reads and writes of other types, including long, ulong, double, and decimal, as well as user-defined types, are not guaranteed to be atomic. Aside from the library functions designed for that purpose, there is no guarantee of atomic read-modify-write, such as in the case of increment or decrement.
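For example (a sketch of mine, not from the answer): a long field can tear on a 32-bit runtime, so its reads and writes should go through Interlocked (Volatile.Read/Volatile.Write also guarantee 64-bit atomicity):

using System.Threading;

class Ticker
{
    private long _ticks; // 64-bit: a plain read could observe half of an in-flight write on a 32-bit CLR

    public void Update(long newTicks) => Interlocked.Exchange(ref _ticks, newTicks);

    public long Read() => Interlocked.Read(ref _ticks); // reads all 64 bits as one atomic operation
}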
Well, I generally assume everything is thread-unsafe. For quick-and-dirty access to global objects in a threaded environment I use the lock(object) keyword. .NET has an extensive set of synchronization primitives, such as the various semaphores.
Reading can be thread-unsafe if any threads are writing (if a collection is modified in the middle of another thread's enumeration, for example, the enumerating thread gets an exception).
If you must do this, then you can do:
lock (list)
{
    a -= list.ElementAt(i);
}
This serializes access to the list (as long as other threads similarly lock on list before they use it); only one thread can hold the lock on a given reference object at a time.
In terms of enumeration, I'll refer to the MSDN:
The enumerator does not have exclusive access to the collection; therefore, enumerating through a collection is intrinsically not a thread-safe procedure. To guarantee thread safety during enumeration, you can lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
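Following that guidance, a sketch (the lock object and names are mine; any single object that all threads agree to lock on works):

using System.Collections.Generic;

class SafeEnumeration
{
    private readonly object _listLock = new object();
    private readonly List<object> _list = new List<object>();

    public void Add(object item)
    {
        lock (_listLock) { _list.Add(item); } // writers take the same lock
    }

    public void ProcessAll()
    {
        lock (_listLock) // hold the lock for the entire enumeration
        {
            foreach (object o in _list)
            {
                // do something with o
            }
        }
    }
}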
An example of no thread safety: several threads incrementing an integer. You can set it up so that a predetermined number of increments is performed. What you may observe, though, is that the int has not been incremented as much as you expected. What happens is that two threads may increment from the same value of the integer. This is but one example of a plethora of effects you may observe when working with several threads.
PS: A thread-safe increment is available through Interlocked.Increment(ref i).
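A quick sketch of the difference (names are mine): with Interlocked the final count is exact; with a plain increment it usually is not:

using System.Threading;
using System.Threading.Tasks;

class IncrementDemo
{
    static int _unsafeCount;
    static int _safeCount;

    static void Main()
    {
        Parallel.For(0, 1000000, _ =>
        {
            _unsafeCount++;                        // read-modify-write race: updates can be lost
            Interlocked.Increment(ref _safeCount); // atomic: never loses an update
        });
        System.Console.WriteLine(_unsafeCount + " vs " + _safeCount); // e.g. "823417 vs 1000000"
    }
}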
