I think I'm missing something big here.
What I'm trying to do:
I have an object that is known to multiple threads, which may read or manipulate it. I want accesses to the object to block: when one thread calls obj.setProperty(T type), every other thread should have to wait until the property is set. How do I do this? I know that there is volatile for primitive types, but how does this translate to non-primitive types?
Use the lock statement in the property getter and setter.
Also, you don't understand what volatile does. Volatile is to prevent blocking, not to cause blocking.
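A minimal sketch of what "lock in the getter and setter" could look like. The names SharedBox, _gate and Update are made up for illustration; they are not from the question.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical wrapper: every read and write of Value takes the same private lock,
// so while one thread is inside the setter, other readers and writers wait.
public class SharedBox<T>
{
    private readonly object _gate = new object();
    private T _value;

    public T Value
    {
        get { lock (_gate) { return _value; } }
        set { lock (_gate) { _value = value; } }
    }

    // A read-modify-write has to happen under a single lock acquisition,
    // otherwise another thread could slip in between the get and the set.
    public void Update(Func<T, T> change)
    {
        lock (_gate) { _value = change(_value); }
    }
}

public static class SharedBoxDemo
{
    public static void Main()
    {
        var box = new SharedBox<int>();
        Parallel.For(0, 1000, _ => box.Update(v => v + 1));
        Console.WriteLine(box.Value); // 1000
    }
}
```

Note that locking only the getter and setter protects each individual access; a sequence like "get, compute, set" still needs one lock around the whole sequence, which is what Update does here.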
I need to check, in a TPL program, whether a variable has been changed. For example, if one thread changes a volatile string variable, the other threads don't need to change it. Since the variable is volatile, the other threads can use it. How can I do this?
volatile probably does not do what you think it does. Don't use it because it does not give you what you need.
You cannot find out whether a variable has been changed just like that. Maybe you can add a bool wasChanged = false and set it to true when the variable is written to. Remember to use proper synchronization for this (probably lock).
A sure-fire way to check a shared variable versus its expected value, given restrictions on the variable type, is using Interlocked operations.
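One way the Interlocked suggestion could be applied to the "only the first thread should change the string" case is a compare-and-swap against null. This is only a sketch; the SetOnce class and its members are hypothetical names.

```csharp
using System;
using System.Threading;

// Hypothetical holder: the first thread to publish a value wins,
// later callers can tell that the value was already set.
public class SetOnce
{
    private string _value; // null means "not set yet"

    // Returns true only for the caller whose write actually changed the field.
    public bool TrySet(string candidate)
    {
        // Atomically: if _value is null, store candidate. The call returns the previous value.
        return Interlocked.CompareExchange(ref _value, candidate, null) == null;
    }

    public string Value => Volatile.Read(ref _value);
}

public static class SetOnceDemo
{
    public static void Main()
    {
        var slot = new SetOnce();
        Console.WriteLine(slot.TrySet("first"));  // True
        Console.WriteLine(slot.TrySet("second")); // False
        Console.WriteLine(slot.Value);            // first
    }
}
```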
I can't find any example of VolatileRead/write (try...) but still:
When should I use volatile vs VolatileRead?
AFAIK the whole purpose of volatile is to create half fences, so:
For a READ operation, reads/writes (on other threads) that come AFTER the current operation won't pass before the fence. Hence we read the latest value.
Question #1
So why do I need VolatileRead? It seems that volatile already does the work.
Plus, in C# all writes are volatile (unlike, say, in Java), regardless of whether you write to a volatile or a non-volatile field. So I ask: why do I need VolatileWrite?
Question #2
This is the implementation for VolatileRead :
[MethodImpl(MethodImplOptions.NoInlining)]
public static int VolatileRead(ref int address)
{
    int num = address;
    MemoryBarrier();
    return num;
}
Why is the line int num = address; there? They already have the address argument, which is clearly holding the value.
You should never use Thread.VolatileRead/Write(). It was a design mistake in .NET 1.1: it uses a full memory barrier. This was corrected in .NET 2.0, but they couldn't fix these methods anymore and had to add a new way to do it, provided by the System.Threading.Volatile class. The jitter is aware of this class and replaces its methods at JIT time with a version that's suitable for the specific processor type.
The comments in the source code for the Volatile class, as available through the Reference Source, tell the tale (edited to fit):
// Methods for accessing memory with volatile semantics. These are preferred over
// Thread.VolatileRead and Thread.VolatileWrite, as these are implemented more
// efficiently.
//
// (We cannot change the implementations of Thread.VolatileRead/VolatileWrite
// without breaking code that relies on their overly-strong ordering guarantees.)
//
// The actual implementations of these methods are typically supplied by the VM at
// JIT-time, because C# does not allow us to express a volatile read/write from/to
// a byref arg. See getILIntrinsicImplementationForVolatile() in jitinterface.cpp.
And yes, you'll have trouble finding examples of its usage. The Reference Source is an excellent guide with megabytes of carefully written, tested and battle-scarred C# code that deals with threading. The number of times it uses VolatileRead/Write: zero.
Frankly, the .NET memory models are a mess with conflicting assumptions made by the CLR mm and C# mm with new rules added for ARM cores just recently. The weirdo semantics of the volatile keyword that means different things for different architectures is some evidence for that. Albeit that for a processor with a weak memory model you can typically assume that what the C# language spec says is accurate.
Do note that Joe Duffy has given up all hope and just flat out discourages all use of it. It is in general very unwise to assume that you can do better than the primitives provided by the language and framework. The Remarks section of the Volatile class brings the point home:
Under normal circumstances, the C# lock statement, the Visual Basic SyncLock statement, and the Monitor class provide the easiest and least error-prone way of synchronizing access to data, and the Lazy class provides a simple way to write lazy initialization code without directly using double-checked locking.
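As a rough illustration of the Lazy route mentioned in those remarks, a lazily created singleton might look like the following. The Settings class and its CreatedAt property are invented for the example.

```csharp
using System;

public sealed class Settings
{
    // Lazy<T> created with just a factory uses LazyThreadSafetyMode.ExecutionAndPublication,
    // so the factory runs once even if several threads read Instance at the same time.
    private static readonly Lazy<Settings> _instance =
        new Lazy<Settings>(() => new Settings());

    public static Settings Instance => _instance.Value;

    public DateTime CreatedAt { get; } = DateTime.UtcNow;

    private Settings() { }
}

public static class SettingsDemo
{
    public static void Main()
    {
        // Both reads return the same object; no volatile or double-checked locking is written by hand.
        Console.WriteLine(ReferenceEquals(Settings.Instance, Settings.Instance)); // True
    }
}
```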
When you need more fine-grained control over the way fences are applied to the code, you can use the static Thread.VolatileRead or Thread.VolatileWrite methods.
Declaring a variable volatile means that the compiler doesn't cache its value and always reads the field value, and when a write is performed the compiler writes the assigned value immediately.
The two methods Thread.VolatileRead and Thread.VolatileWrite give you finer-grained control without declaring the variable as volatile: you decide when to perform a volatile read and when to perform a volatile write, instead of being bound to uncached reads and immediate writes on every access, as you are when you declare the variable volatile. In short, you have more control and more freedom ...
VolatileRead() reads the latest version of a memory address, and VolatileWrite() writes to the address, making the address available to all threads. Using both VolatileRead() and VolatileWrite() consistently on a variable has the same effect as marking it as volatile.
Take a look at this blog post that explains by example the difference ...
Why is the line int num = address; there? They already have the address argument, which is clearly holding the value.
It is a defensive copy: the integer value is copied to a local variable so that something outside the method cannot change it while we are inside the method.
Notes
Since the volatile keyword doesn't exist in Visual Basic, your only choice there is to use the VolatileRead() and VolatileWrite() static methods consistently to achieve the same effect as the volatile keyword in C#.
Why is the line int num = address; there? They already have the address argument, which is clearly holding the value.
address is not an int. It is an int* (so it really is an address). The code is dereferencing the pointer and copying it to a local so that the barrier occurs after the dereference.
To elaborate on aleroot's answer:
Volatile.Read and Volatile.Write have the same effect as the volatile modifier, as Royi Namir argues. But you can use them selectively.
For example, if you declare a field with the volatile modifier, then every access to this field, whether a read or a write, bypasses any value cached in a CPU register and goes to memory, which is not free. This is not required in most cases and is an unnecessary performance hit if the field has many read operations.
Think of a scenario where a private singleton variable is declared volatile and is returned from a property getter. Once it is initialized, you no longer need a volatile read on every access, so you can use Volatile.Read / Volatile.Write until the instance is created; once it is created, all reads can be done as normal field reads, avoiding a big performance hit.
With Volatile.Read and Volatile.Write, you pay that cost only on demand.
The best approach is to declare the field without the volatile modifier and use Volatile.Read or Volatile.Write when needed.
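A small sketch of the "plain field plus Volatile.Read/Volatile.Write on demand" idea, using a stop flag. The Worker class and its members are hypothetical.

```csharp
using System;
using System.Threading;

public class Worker
{
    private bool _stopRequested; // deliberately NOT declared volatile

    // The write is published with volatile semantics only here.
    public void RequestStop() => Volatile.Write(ref _stopRequested, true);

    public long Run()
    {
        long iterations = 0;
        // Volatile.Read makes each loop check a real read of the field,
        // so the JIT is not allowed to hoist the check out of the loop.
        while (!Volatile.Read(ref _stopRequested))
        {
            iterations++;
        }
        return iterations;
    }
}

public static class WorkerDemo
{
    public static void Main()
    {
        var worker = new Worker();
        long count = 0;
        var thread = new Thread(() => count = worker.Run());
        thread.Start();
        Thread.Sleep(50);
        worker.RequestStop();
        thread.Join();
        Console.WriteLine("Iterations before stop: " + count);
    }
}
```

Any other code in the class that reads _stopRequested where ordering does not matter can do so as an ordinary field read.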
Should I use static fields and Interlocked together when I need thread safety and atomic operations on static fields? Are static fields atomic by default? For example:
Interlocked.Increment(ref Factory.DefectivePartsCount);
Thanks.
Yes.
The field (assuming Int32) is atomic, not because it's static but because it's 32 bits.
However, Factory.DefectivePartsCount += 1 requires both a read and a write of the variable, so the operation as a whole is not thread-safe.
static doesn't guarantee anything in terms of thread-safety. Hence, an increment will still not be atomic even if the variable is static. As such, you will still need to use classic synchronization mechanisms depending on the situation. In your case Interlocked.Increment is fine.
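To make that concrete, here is a short sketch built around the Factory.DefectivePartsCount field from the question. The surrounding Factory class and the demo are assumptions made for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Factory
{
    // Has to be a field, not a property: Interlocked needs a ref to the storage location.
    public static int DefectivePartsCount;
}

public static class CounterDemo
{
    public static void Main()
    {
        // Each increment is one atomic read-modify-write, so no update is lost.
        Parallel.For(0, 100000, _ => Interlocked.Increment(ref Factory.DefectivePartsCount));
        Console.WriteLine(Factory.DefectivePartsCount); // 100000

        // By contrast, Factory.DefectivePartsCount += 1 inside the same loop
        // could lose updates, because the read and the write are separate steps.
    }
}
```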
From the specification 10.5.3 Volatile fields:
The type of a volatile field must be one of the following:
A reference-type.
The type byte, sbyte, short, ushort, int, uint, char, float, bool, System.IntPtr, or System.UIntPtr.
An enum-type having an enum base type of byte, sbyte, short, ushort, int, or uint.
First I want to confirm that my understanding is correct: I guess the above types can be volatile because they are stored as a 4-byte unit in memory (for reference types, because of the address), which guarantees that a read or write is atomic. A double, long, etc. can't be volatile because reading and writing them is not atomic, since they occupy more than 4 bytes in memory. Is my understanding correct?
Second, if the first guess is correct, why can't a user-defined struct with only one int field in it (or something similar; 4 bytes is OK) be volatile? Theoretically it's atomic, right? Or is it not allowed simply because, by design, no user-defined struct (which may be larger than 4 bytes) is allowed to be volatile?
So, I suppose you propose the following point to be added:
A value type consisting only of one field which can be legally marked volatile.
First, fields are usually private, so in external code, nothing should depend on a presence of a certain field. Even though the compiler has no issue accessing private fields, it is not such a good idea to restrict a certain feature based on something the programmer has no proper means to affect or inspect.
Since a field is usually a part of the internal implementation of a type, it can be changed at any time in a referenced assembly, but this could make a piece of C# code that used the type illegal.
These theoretical and practical reasons mean that the only feasible way would be to introduce a volatile modifier for value types, which would ensure that the point specified above holds. However, since the only group of types that would benefit from this modifier are value types with a single field, this feature probably wasn't very high on the list.
Basically, usage of the volatile keyword can sometimes be misleading. Its purpose is to ensure that the latest value (or actually, an eventually fresh enough value) [1] of the respective member is returned when accessed by any thread.
In fact, this is true for value types only [2]. Reference type members are represented in memory as pointers to a location on the heap where the object is actually stored. So, when used on a reference type, volatile ensures you only get the fresh value of the reference (the pointer) to the object, not the object itself.
If you have a volatile List<String> myVolatileList which is modified by multiple threads by having elements added or removed, and you expect it to safely give you the latest modification of the list, you are actually wrong. In fact, you are prone to the same issues as if the volatile keyword was not there -- race conditions and/or having the object instance corrupted -- it does not assist you in this case, nor does it provide you with any thread safety.
If, however, the list itself is not modified by the different threads, but rather, each thread would only assign a different instance to the field (meaning the list is behaving like an immutable object), then you are fine. Here is an example:
public class HasVolatileReferenceType
{
    public volatile List<int> MyVolatileMember;
}
The following usage is correct with respect to multi-threading, as each thread would replace the MyVolatileMember pointer. Here, volatile ensures that the other threads will see the latest list instance stored in the MyVolatileMember field.
HasVolatileReferenceType example = new HasVolatileReferenceType();
// instead of modifying `example.MyVolatileMember`
// we are replacing it with a new list. This is OK with volatile.
example.MyVolatileMember = example.MyVolatileMember
    .Where(x => x > 42).ToList();
In contrast, the below code is error prone, because it directly modifies the list. If this code is executed simultaneously with multiple threads, the list may become corrupted, or behave in an inconsistent manner.
example.MyVolatileMember.RemoveAll(x => x <= 42);
Let us return to value types for a while. In .NET all value types are actually reassigned when they are modified, so they are safe to use with the volatile keyword. See the code:
public class HasVolatileValueType
{
    public volatile int MyVolatileMember;
}
// usage
HasVolatileValueType example = new HasVolatileValueType();
example.MyVolatileMember = 42;
[1] The notion of latest value here is a little misleading, as noted by Eric Lippert in the comments section. In fact, latest here means that the .NET runtime will attempt (no guarantees here) to prevent writes to volatile members from happening in between read operations whenever it deems that possible. This would contribute to different threads reading a fresh value of the volatile member, as their read operations would probably be ordered after a write operation to the member. But this relies on probability more than on any guarantee.
[2] In general, volatile is OK to use on any immutable object, since modifications always imply reassignment of the field with a different value. The following code is also a correct example of the use of the volatile keyword:
public class HasVolatileImmutableType
{
    public volatile string MyVolatileMember;
}
// usage
HasVolatileImmutableType example = new HasVolatileImmutableType();
example.MyVolatileMember = "immutable";
// string is a reference type, but it is *immutable*,
// so we need to reassign the result of the modification
// in order to work with the new value later
example.MyVolatileMember = example.MyVolatileMember.Substring(2);
I'd recommend you to take a look at this article. It thoroughly explains the usage of the volatile keyword, the way it actually works and the possible consequences to using it.
I think it is because a struct is a value type, which is not one of the types listed in the specs. It is interesting to note that reference types can be a volatile field. So it can be accomplished with a user-defined class. This may disprove your theory that the above types are volatile because they can be stored in 4 bytes (or maybe not).
This is an educated guess at the answer... please don't shoot me down too much if I am wrong!
The documentation for volatile states:
The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access.
This implies that part of the design intent for volatile fields is to implement lock-free multithreaded access.
A member of a struct can be updated independently of the other members. So in order to write the new struct value where only part of it has been changed, the old value must be read. Writing is therefore not guaranteed to require a single memory operation. This means that in order to update the struct reliably in a multithreaded environment, some kind of locking or other thread synchronization is required. Updating multiple members from several threads without synchronization could soon lead to counter-intuitive, if not technically corrupt, results: to make a struct volatile would be to mark a non-atomic object as atomically updateable.
Additionally, only some structs could be volatile - those of size 4 bytes. The code that determines the size - the struct definition - could be in a completely separate part of the program to that which defines a field as volatile. This could be confusing as there would be unintended consequences of updating the definition of a struct.
So, whereas it would be technically possible to allow some structs to be volatile, the caveats for correct usage would be sufficiently complex that the disadvantages would outweigh the benefits.
My recommendation for a workaround would be to store your 4-byte struct as a 4-byte base type and implement static conversion methods to use each time you want to use the field.
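A sketch of that workaround, assuming a made-up 4-byte struct Rgba and a made-up Canvas class that wants a volatile field of that type:

```csharp
using System;

// Hypothetical 4-byte struct used only for illustration.
public readonly struct Rgba
{
    public readonly byte R, G, B, A;

    public Rgba(byte r, byte g, byte b, byte a) { R = r; G = g; B = b; A = a; }

    // Static conversion methods: pack the four bytes into one int and back.
    public static int ToInt32(Rgba c) =>
        (c.R << 24) | (c.G << 16) | (c.B << 8) | c.A;

    public static Rgba FromInt32(int v) => unchecked(
        new Rgba((byte)(v >> 24), (byte)(v >> 16), (byte)(v >> 8), (byte)v));
}

public class Canvas
{
    // The struct itself cannot be volatile, but its int representation can.
    private volatile int _background;

    public Rgba Background
    {
        get => Rgba.FromInt32(_background);
        set => _background = Rgba.ToInt32(value);
    }
}

public static class CanvasDemo
{
    public static void Main()
    {
        var canvas = new Canvas { Background = new Rgba(255, 128, 0, 7) };
        Rgba c = canvas.Background;
        Console.WriteLine($"{c.R},{c.G},{c.B},{c.A}"); // 255,128,0,7
    }
}
```

Each get or set is a single volatile int access, so the whole struct value is read or written in one step.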
To address the second part of your question, I would support the language designers' decision based on two points:
KISS (Keep It Simple, Simon): this feature would make the spec more complex and implementations harder. All language features start at minus 100 points; is adding the ability to have a small minority of structs volatile really worth 101 points?
Compatibility: questions of serialization aside, adding a new field to a type [class, struct] is usually a safe, backwards source-compatible move. Adding a field should not break anyone's compile. If the behavior of structs changed when a field was added, that guarantee would break.
What happens when we use lock on an object? I am aware that the runtime makes use of the Monitor.Enter and Monitor.Exit methods. But what really happens under the hood?
Why can only reference types be used for locking?
Even though the object used for the locking is changed, how come it still provides thread safety?
In the current sample, we are modifying the object that is used for locking. Ideally, this is not the preferred way of doing it, and best practice is to use a dedicated, privately scoped variable.
static List<string> stringList = new List<string>();
static void AddItems(object o)
{
    for (int i = 0; i < 100; i++)
    {
        lock (stringList)
        {
            Thread.Sleep(20);
            stringList.Add(string.Format("Thread-{0},No-{1}", Thread.CurrentThread.ManagedThreadId, i));
        }
    }
    string[] listArray = null;
    lock (stringList)
        listArray = stringList.ToArray();
    foreach (string s in listArray)
    {
        Console.WriteLine(s);
    }
}
What happens under the hood is approximately this:
Imagine the object type has a hidden field in it.
Monitor.Enter() and Monitor.Exit() use that field to communicate with each other.
Every reference type inherits that field from object.
Of course, the type of that field is something special: It’s a synchronization lock that works in a thread-safe manner. In reality, of course, it is not really a field in the CLR sense, but a special feature of the CLR that uses a chunk of memory within each object’s memory to implement that synchronization lock. (The exact implementation is described in “Safe Thread Synchronization” in the MSDN Magazine.)
How come it still provides thread safety? I think what you mean is: why doesn’t it break thread safety for objects that are thread safe? The answer is easy: because you can have objects that are partly thread safe and partly not. You could have an object with two methods, and using one of them is thread safe while the other isn’t. Monitor.Enter() is thread safe irrespective of what the rest of the object does.
Why only reference types to be used for locking? Because only reference types actually have this special magic in their memory chunk. Value types are really literally just the value itself: a 32-bit integer in the case of int; a concatenation of all the fields in the case of a custom struct. You can pass a value type into Monitor.Enter(), and it won’t complain, but it won’t work because the value type will be boxed — i.e., wrapped into an object of a reference type. When you call Monitor.Exit(), it will be boxed again, so it will try to release the lock on a different object reference.
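The boxing problem can be observed directly. This is a minimal sketch with made-up names; note that the lock statement itself refuses a value type at compile time, so the example calls Monitor directly.

```csharp
using System;
using System.Threading;

public static class BoxingLockDemo
{
    public static void Main()
    {
        int counter = 0;

        // Boxes counter into a temporary object and locks that temporary object.
        Monitor.Enter(counter);
        try
        {
            // Boxes again: this is a different object, which the current thread never locked.
            Monitor.Exit(counter);
        }
        catch (SynchronizationLockException)
        {
            Console.WriteLine("Exit was called on an object that is not locked");
        }
    }
}
```

The first boxed object stays locked and unreachable, so no other thread is ever blocked by it and no mutual exclusion takes place.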
Regarding your code sample: I see nothing wrong with it. All your access to the stringList variable is wrapped in a lock, and you never assign to the stringList field itself except during initialisation. There is nothing that can go wrong with this; it is thread safe. (Of course something could go wrong if some other code accesses the field without locking it. If you were to make the field public, there is a very great chance of that happening accidentally. There is no need to use only locally-scoped variables for such a lock unless you really can’t ensure otherwise that the variable won’t be accessed by code you don’t control.)
But what really happens under the hood?
Refer to this MSDN article for an in-depth description.
Essentially, each CLR object that gets allocated has an associated field that holds a sync block index. This index points into a pool of sync blocks that the CLR maintains. A sync block holds the same information as a critical section which gets used during synchronization. Initially, an object's sync block index is meaningless (uninitialized). When you lock on the object, however, it gets a valid index into the pool.
Why only reference types to be used for locking?
You need a reference type since value types don't have the associated sync block index field (less overhead).
Even though the object used for accomplishing the locking is changed, how come it still provides thread safety?
Locking on a CLR object and then modifying it is akin to having a C++ object with a CRITICAL_SECTION member that's used for locking while that same object is modified. There are no thread safety issues there.
In the current sample, we are modifying the object that is used for locking. Ideally, this is not the preferred way of doing it, and best practice is to use a dedicated, privately scoped variable.
Correct, this situation is also described in the article. If you're not using a privately scoped variable that is in complete control of the owning class, then you can run into deadlock issues when two separate classes decide to lock on the same referenced object (e.g. if for some reason your stringList gets passed to another class that then decides to lock on it as well). This is unlikely, but if you use a strictly-controlled, privately scoped variable that never gets passed around, you will avoid such deadlocks altogether.
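For comparison, a sketch of the sample reworked to lock on a dedicated private object instead of the list itself. StringCollector, _sync, Add and Snapshot are names invented here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class StringCollector
{
    // Never handed out, so no outside code can take this lock and cause a deadlock.
    private readonly object _sync = new object();
    private readonly List<string> _items = new List<string>();

    public void Add(string item)
    {
        lock (_sync) { _items.Add(item); }
    }

    // Returns a copy, so callers can enumerate it without holding the lock.
    public string[] Snapshot()
    {
        lock (_sync) { return _items.ToArray(); }
    }
}

public static class StringCollectorDemo
{
    public static void Main()
    {
        var collector = new StringCollector();
        Parallel.For(0, 100, i => collector.Add("item-" + i));
        Console.WriteLine(collector.Snapshot().Length); // 100
    }
}
```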