I understand the functionality of Interlocked.Increment and lock(). But I'm confused about when to use one or the other. As far as I can tell, Interlocked.Increment increments a shared int/long value, whereas lock() is meant to lock a region of code.
For example, if I want to update a string value, it is possible with lock():
lock (_object)
{
    sharedString = "Hi";
}
However, this is not possible with Interlocked class.
Why can't this be done via Interlocked?
What's the difference between these synchronization mechanisms?
Interlocked.Increment and related methods rely on hardware instructions to perform synchronized modification of a single 32-bit or 64-bit memory value, ensuring that multiple threads accessing the same value do not read/write stale data. This is necessary because at a hardware level each processor keeps a local copy of memory values (for performance, often referred to as the CPU cache).
lock(){} performs synchronization for a section of code rather than a single integral value. And instead of relying on hardware instructions to synchronize access to a variable, the resulting code relies on operating system synchronization primitives (software, not hardware) to protect memory and code execution.
Further, the use of lock() emits a memory barrier, ensuring that accessing the same variables from multiple CPUs yields synchronized (non-stale) data. This is not true in some other languages/platforms, where memory barriers and fencing must be performed explicitly.
It's more efficient to use Interlocked methods on integral values because the hardware has native support for performing the necessary synchronization. But this hardware support only exists for native integrals such as __int32 and __int64; since the hardware has no notion of higher-level complex types, no such high-level method is exposed by the Interlocked type. Thus you can't use Interlocked to synchronize the assignment of System.String or any System.Object-derived type.
(Even though the assignment of a pointer to a string value could be done with a single hardware instruction in a lower-level language, in .NET a string object is not exposed as a raw pointer, so this is simply not possible in any "pure" .NET language. I am setting aside the fact that you could use unsafe code to resolve the pointer and do an interlocked assignment of string values if you -really- wanted to, but I don't feel this is what you are asking about; further, this is not supported by Interlocked because under the hood GC pinning would need to occur, which would likely become more expensive and invasive than using lock().)
Thus, for synchronized modification/assignment of reference types you will need to use a synchronization primitive (e.g. lock(){}, Monitor, etc.). If all you need to synchronize is a single integral value (Int32, Int64), it is more efficient to use the Interlocked methods. It may still make sense to use the lock() statement if there are multiple integral values to synchronize, for example incrementing one integer while decrementing a second, where both need to be updated as a single logical operation, as in the sketch below.
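A minimal sketch of that last case (the class and field names are made up for illustration):

public class Pool
{
    private readonly object _gate = new object();
    private int _available = 10;
    private int _inUse;

    public void Acquire()
    {
        // Both counters must change together as one logical operation;
        // using Interlocked on each separately could let another thread
        // observe an inconsistent intermediate state.
        lock (_gate)
        {
            _available--;
            _inUse++;
        }
    }
}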
Interlocked.Increment can and should be used to increment a shared int variable.
Functionally, using Interlocked.Increment is the same as:
lock (_object)
{
    counter++;
}
but Interlocked.Increment is much cheaper performance-wise.
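For comparison, the Interlocked version might look like this (a sketch; the type and method names are illustrative):

using System.Threading;

public class RequestCounter
{
    private int _counter;

    public void OnRequest()
    {
        // Atomically increments _counter; no lock object is needed.
        Interlocked.Increment(ref _counter);
    }
}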
If you want to exchange a reference value, and return the original value in an atomic operation, you can use Interlocked.Exchange. Interlocked.Increment does exactly what it says it does: it increments a number.
But simply assigning a reference value to a variable, or any 32-bit value type, is atomic in .NET anyway. The only other case I can think of in which the latter doesn't hold is if you create a packed structure and set attributes that force the compiler not to align members at 4-byte boundaries (but this is not something you do very often).
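For instance, a minimal sketch of swapping a reference atomically (the holder type is made up for illustration):

using System.Threading;

public class MessageHolder
{
    private string _sharedString = "Hi";

    public string Replace(string newValue)
    {
        // Atomically stores newValue and returns the previous value.
        return Interlocked.Exchange(ref _sharedString, newValue);
    }
}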
Related
What is the difference, if any, of the Read(Int64) method of the .NET system classes System.Threading.Volatile and System.Threading.Interlocked?
Specifically, what are their respective guarantees / behaviour with regard to (a) atomicity and (b) memory ordering.
Note that this is about the Volatile class, not the volatile (lower case) keyword.
The MS docs state:
Volatile.Read Method
Reads the value of a field. On systems that require it, inserts a
memory barrier that prevents the processor from reordering memory
operations as follows: If a read or write appears after this method in
the code, the processor cannot move it before this method.
...
Returns Int64
The value that was read. This value is the latest written by any processor
in the computer, regardless of the number of processors or the state of
processor cache.
vs.
Interlocked.Read(Int64) Method
Returns a 64-bit value, loaded as an atomic operation.
Particularly confusing seems that the Volatile docs do not talk about atomicity and the Interlocked docs do not talk about ordering / memory barriers.
Side Note: Just as a reference: I'm more familiar with the C++ atomic API where atomic operations always also specify a memory ordering semantic.
The question link (and transitive links) helpfully provided by Pavel do a good job of explaining the difference / orthogonality of volatile-as-in-memory-barrier and atomic-as-in-no-torn-reads, but they do not explain how the two concepts apply to these two classes.
Does Volatile.Read make any guarantees about atomicity?
Does Interlocked.Read (or, really, any of the Interlocked functions) make any guarantees about memory order?
Interlocked.Read translates into a CompareExchange operation:
public static long Read(ref long location)
{
    return Interlocked.CompareExchange(ref location, 0, 0);
}
Therefore it has all the benefits of CompareExchange:
Full memory barrier
Atomicity
Volatile.Read, on the other hand, has only acquire semantics. It helps you ensure the ordering of your read operations, without any atomicity or freshness guarantee.
The documentation of the Volatile.Read(long) method doesn't mention anything about atomicity, but the source code is quite revealing:
private struct VolatileIntPtr { public volatile IntPtr Value; }
[Intrinsic]
[NonVersionable]
public static long Read(ref long location) =>
#if TARGET_64BIT
(long)Unsafe.As<long, VolatileIntPtr>(ref location).Value;
#else
// On 32-bit machines, we use Interlocked, since an ordinary volatile read would not be atomic.
Interlocked.CompareExchange(ref location, 0, 0);
#endif
On 32-bit machines, the Volatile.Read method indirectly invokes Interlocked.CompareExchange, just like Interlocked.Read does (source code), so there is no difference between the two. A full fence is emitted by both methods.
On 64-bit machines the atomicity of the reading is guaranteed by the CPU architecture, so a cheaper half fence is emitted instead.
So Volatile.Read seems to be the preferable option overall. Although its atomicity is not guaranteed by the documentation, if it weren't atomic its usefulness would be severely limited. What use would you have for a value that can potentially be torn?
Note: the Intrinsic attribute means that the code of the decorated method can potentially be replaced/optimized by the jitter. This can be slightly concerning, so please make your own judgment about whether it's safe to use Volatile.Read for reading long values in a multithreaded environment.
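To make the two options concrete, a small sketch (the counter type is illustrative):

using System.Threading;

public class LongCounter
{
    private long _value;

    public void Add(long amount) => Interlocked.Add(ref _value, amount);

    // Both reads avoid torn values, even on 32-bit runtimes;
    // Interlocked.Read implies a full fence, Volatile.Read an acquire fence.
    public long ReadFullFence() => Interlocked.Read(ref _value);
    public long ReadAcquire() => Volatile.Read(ref _value);
}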
There is a Volatile.Read method for all primitives and reference types, why is there no Volatile.Read for structures? Same applies to Volatile.Write. Likewise the old Thread.VolatileRead methods didn't have one for structs either.
What is the reason behind this? I can declare volatile structs in a class, why can't I do volatile reads with these methods?
There is only a guarantee in volatile operations if they're also atomic, which is not the case for all but the simplest structs (e.g. one field of a primitive or reference type, or any struct that fits 64 bits/8 bytes).
For instance, what would you expect from such Volatile methods on a 768 bits/96 bytes struct? Anything bigger than the greatest supported atomic operation would actually result in multiple volatile writes, each of which would be immediately visible without any guarantee.
In Microsoft implementations of .NET, the long and double Volatile methods are atomic even on 32-bit architectures, at the cost of using interlocked operations on those architectures.
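To illustrate the boundary, a sketch (the type names are made up): a long has a Volatile.Read overload, but a larger struct does not, so it needs a lock to avoid torn reads.

using System.Threading;

public struct Triple          // 24 bytes: beyond any single atomic read/write
{
    public long A, B, C;
}

public class Holder
{
    private long _counter;    // OK: Volatile.Read(ref long) exists
    private Triple _triple;   // no Volatile.Read overload for this
    private readonly object _gate = new object();

    public long ReadCounter() => Volatile.Read(ref _counter);

    public Triple ReadTriple()
    {
        lock (_gate) { return _triple; }   // copy out under the lock
    }
}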
I can't find any example of VolatileRead/Write (try...) but still:
When should I use volatile vs VolatileRead?
AFAIK the whole purpose of volatile is to create half fences so:
For a READ operation, reads/writes (on other threads) which come AFTER the current operation won't be moved before the fence; hence we read the latest value.
Question #1
So why do I need VolatileRead? It seems that volatile already does the work.
Plus, in C# all writes are volatile (unlike, say, in Java), regardless of whether you write to a volatile or a non-volatile field. So I ask: why do I need VolatileWrite?
Question #2
This is the implementation of VolatileRead:
[MethodImpl(MethodImplOptions.NoInlining)]
public static int VolatileRead(ref int address)
{
    int num = address;
    MemoryBarrier();
    return num;
}
Why is the line int num = address; there? They already have the address argument, which is clearly holding the value.
You should never use Thread.VolatileRead/Write(). It was a design mistake in .NET 1.1: it uses a full memory barrier. This was corrected in .NET 2.0, but they couldn't fix these methods anymore and had to add a new way to do it, provided by the System.Threading.Volatile class. That is a class the jitter is aware of; it replaces the methods at JIT time with a version that's suitable for the specific processor type.
The comments in the source code for the Volatile class, as available through the Reference Source, tell the tale (edited to fit):
// Methods for accessing memory with volatile semantics. These are preferred over
// Thread.VolatileRead and Thread.VolatileWrite, as these are implemented more
// efficiently.
//
// (We cannot change the implementations of Thread.VolatileRead/VolatileWrite
// without breaking code that relies on their overly-strong ordering guarantees.)
//
// The actual implementations of these methods are typically supplied by the VM at
// JIT-time, because C# does not allow us to express a volatile read/write from/to
// a byref arg. See getILIntrinsicImplementationForVolatile() in jitinterface.cpp.
And yes, you'll have trouble finding examples of its usage. The Reference Source is an excellent guide with megabytes of carefully written, tested and battle-scarred C# code that deals with threading. The number of times it uses VolatileRead/Write: zero.
Frankly, the .NET memory models are a mess, with conflicting assumptions made by the CLR memory model and the C# memory model, and new rules added for ARM cores only recently. The weirdo semantics of the volatile keyword, which means different things on different architectures, is some evidence for that. For a processor with a weak memory model you can typically assume that what the C# language spec says is accurate.
Do note that Joe Duffy has given up all hope and flat out discourages all use of it. It is in general very unwise to assume that you can do better than the primitives provided by the language and framework. The Remarks section of the Volatile class brings the point home:
Under normal circumstances, the C# lock statement, the Visual Basic SyncLock statement, and the Monitor class provide the easiest and least error-prone way of synchronizing access to data, and the Lazy class provides a simple way to write lazy initialization code without directly using double-checked locking.
When you need more fine-grained control over the way fences are applied to the code, you can use the static Thread.VolatileRead or Thread.VolatileWrite methods.
Declaring a variable volatile means that the compiler doesn't cache its value: every read goes to the field, and every write is performed immediately.
The two methods Thread.VolatileRead and Thread.VolatileWrite give you finer-grained control without declaring the variable as volatile: you can decide when to perform a volatile read and when to perform a volatile write, without being bound to the always-read-fresh, always-write-immediately behavior that a volatile variable imposes. In short, you have more control and more freedom.
VolatileRead() reads the latest version of a memory address, and VolatileWrite() writes to the address, making the address available to all threads. Using both VolatileRead() and VolatileWrite() consistently on a variable has the same effect as marking it as volatile.
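For instance, a minimal sketch of the equivalence (a simple int flag; the class names are made up):

using System.Threading;

public class WithKeyword
{
    private volatile int _flag;   // every access has volatile semantics

    public void Set() => _flag = 1;
    public bool IsSet => _flag == 1;
}

public class WithMethods
{
    private int _flag;            // plain field, volatile access on demand

    public void Set() => Thread.VolatileWrite(ref _flag, 1);
    public bool IsSet => Thread.VolatileRead(ref _flag) == 1;
}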
Take a look at this blog post that explains by example the difference ...
Why is the line int num = address; there? They already have the
address argument, which is clearly holding the value.
It is a defensive copy: the integer value is copied to the local variable so that something outside cannot change the value while we are inside the method.
Notes
Since the volatile keyword doesn't exist in Visual Basic, your only choice there is to use the VolatileRead() and VolatileWrite() static methods consistently to achieve the same effect as the volatile keyword in C#.
Why is the line int num = address; there? They already have the
address argument, which is clearly holding the value.
address is not a plain int. It is a ref int, a managed pointer (so it really is an address). The code dereferences the pointer and copies the value to a local so that the barrier occurs after the dereference.
To elaborate on aleroot's answer:
Volatile.Read and Volatile.Write have the same effect as the volatile modifier, as Royi Namir argued. But you can apply them selectively, where it matters.
For example, if you declare a field with the volatile modifier, then every access to the field, whether a read or a write, bypasses register caching, which is not free. This is unnecessary in most cases and becomes a real performance hit if the field is read frequently.
Think of a scenario where a private singleton field is declared volatile and returned from a property getter. Once it is initialized, there is no need for a fresh read on every access; you could use Volatile.Read/Write only until the instance is created, and afterwards read it as a normal field. Otherwise every read pays the cost.
With Volatile.Read and Volatile.Write you get exactly that: volatile semantics on demand.
The best usage is to declare the field without the volatile modifier and use Volatile.Read or Volatile.Write when needed, as in the sketch below.
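A sketch of that pattern, using the classic double-checked lazy initialization (the Cache type is illustrative):

using System.Threading;

public sealed class Cache
{
    private static Cache _instance;   // note: no volatile modifier
    private static readonly object _gate = new object();

    public static Cache Instance
    {
        get
        {
            // Volatile read only on the lazy path; once a non-null value
            // has been seen, the caller can keep it in a local.
            var inst = Volatile.Read(ref _instance);
            if (inst == null)
            {
                lock (_gate)
                {
                    inst = _instance;
                    if (inst == null)
                    {
                        inst = new Cache();
                        Volatile.Write(ref _instance, inst);
                    }
                }
            }
            return inst;
        }
    }
}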
The following are the only ways classes are different from structs in C# (please correct me if I'm wrong):
Class variables are references, while struct variables are values; therefore the entire value of a struct is copied in assignments and parameter passes.
Class variables are pointers stored on the stack that point to memory on the heap, while struct variables are stored on the stack as values.
Suppose I have an immutable struct, that is, a struct with fields that cannot be modified once initialized. Each time I pass this struct as a parameter or use it in an assignment, the value is copied and stored on the stack.
Then suppose I make this immutable struct an immutable class. A single instance of this class would be created once, and only the reference to it would be copied in assignments and parameter passes.
If the object were mutable, the behavior in these two cases would differ: when one changed the object, in the first case the copy of the struct would be modified, while in the second case the original object would be changed. However, in both cases the object is immutable, so there is no difference to the user of this object whether it is actually a class or a struct.
Since copying reference is cheaper than copying struct, why would one use an immutable struct?
Also, since mutable structs are evil, it looks like there is no reason to use structs at all.
Where am I wrong?
Since copying reference is cheaper than copying struct, why would one use an immutable struct?
This isn't always true. Copying a reference is going to be 8 bytes on a 64-bit OS, which is potentially larger than many structs.
Also note that creation of the class is likely more expensive. Creating a struct is often done completely on the stack (though there are many exceptions), which is very fast. Creating a class requires creating the object handle (for the garbage collector), creating the reference on the stack, and tracking the object's lifetime. This can add GC pressure, which also has a real cost.
That being said, creating a large immutable struct is likely not a good idea, which is part of why the guidelines for choosing between classes and structures recommend always using a class if your struct will be more than 16 bytes or will be boxed, among other considerations that make the difference smaller.
That being said, I often base my decision more on the intended usage and meaning of the type in question. Value types should be used to refer to a single value (again, refer to guidelines), and often have a semantic meaning and expected usage different than classes. This is often just as important as the performance characteristics when making the choice between class or struct.
Reed's answer is quite good but just to add a few extra points:
please correct me if I'm wrong
You are basically on the right track here. You've made the common error of confusing variables with values. Variables are storage locations; values are stored in variables. And you are flirting with the commonly-stated myth that "value types go on the stack"; rather, variables go into either short-term or long-term storage, because variables are storage locations. Whether a variable goes into short- or long-term storage depends on its known lifetime, not its type.
But all of that is not particularly relevant to your question, which boils down to asking for a refutation of this syllogism:
Mutable structs are evil.
Reference copying is cheaper than struct copying, so immutable structs are always worse.
Therefore, there is never any use for structs.
We can refute the syllogism in several ways.
First, yes, mutable structs are evil. However, they are sometimes very useful because in some limited scenarios, you can get a performance advantage. I do not recommend this approach unless other reasonable avenues have been exhausted and there is a real performance problem.
Second, reference copying is not necessarily cheaper than struct copying. References are typically implemented as 4 or 8 byte managed pointers (though that is an implementation detail; they could be implemented as opaque handles). Copying a reference-sized struct is neither cheaper nor more expensive than copying a reference-sized reference.
Third, even if reference copying is cheaper than struct copying, references must be dereferenced in order to get at their fields. Dereferencing is not zero cost! Not only does it take machine cycles to dereference a reference, doing so might mess up the processor cache, and that can make future dereferences far more expensive!
Fourth, even if reference copying is cheaper than struct copying, who cares? If that is not the bottleneck that is producing an unacceptable performance cost then which one is faster is completely irrelevant.
Fifth, references are far, far more expensive in memory space than structs are.
Sixth, references add expense because the network of references must be periodically traced by the garbage collector; "blittable" structs may be ignored by the garbage collector entirely. Garbage collection is a large expense.
Seventh, immutable value types cannot be null, unlike reference types. You know that every value is a good value. And as Reed pointed out, in order to get a good value of a reference type you have to run both an allocator and a constructor. That's not cheap.
Eighth, value types represent values, and programs are often about the manipulation of values. It makes sense to "bake in" the metaphors of both "value" and "reference" in a language, regardless of which is "cheaper".
From MSDN;
Classes are reference types and structures are value types. Reference
types are allocated on the heap, and memory management is handled by
the garbage collector. Value types are allocated on the stack or
inline and are deallocated when they go out of scope. In general,
value types are cheaper to allocate and deallocate. However, if they
are used in scenarios that require a significant amount of boxing and
unboxing, they perform poorly as compared to reference types.
Do not define a structure unless the type has all of the following characteristics:
It logically represents a single value, similar to primitive types (integer, double, and so on).
It has an instance size smaller than 16 bytes.
It is immutable.
It will not have to be boxed frequently.
So, you should always use a class instead of a struct if your struct will be more than 16 bytes. Also read http://www.dotnetperls.com/struct
There are two usage cases for structures. Opaque structures are useful for things which could be implemented using immutable classes, but are sufficiently small that even in the best of circumstances there wouldn't be much--if any--benefit to using a class, especially if the frequency with which they are created and discarded is a significant fraction of the frequency with which they will be simply copied. For example, Decimal is a 16-byte struct, so holding a million Decimal values would take 16 megabytes. If it were a class, each reference to a Decimal instance would take 4 or 8 bytes, but each distinct instance would probably take another 20-32 bytes. If one had many large arrays whose elements were copied from a small number of distinct Decimal instances, the class could win out, but in most scenarios one would be more likely to have an array with a million references to a million distinct instances of Decimal, which would mean the struct would win out.
Using structures in this way is generally only good if the guidelines quoted from MSDN apply (though the immutability guideline is mainly a consequence of the fact that there isn't yet any way via which struct methods can indicate that they modify the underlying struct). If any of the last three guidelines don't apply, one is likely better off using an immutable class than a struct. If the first guideline does not apply, however, that means one shouldn't use an opaque struct, but not that one should use a class instead.
In some situations, the purpose of a data type is simply to fasten a group of variables together with duct tape so that their values can be passed around as a unit, but they still remain semantically distinct variables. For example, a lot of methods may need to pass around groups of three floating-point numbers representing 3d coordinates. If one wants to draw a triangle, it's a lot more convenient to pass three Point3d parameters than nine floating-point numbers. In many cases, the purpose of such types is not to impart any domain-specific behavior, but rather to simply provide a means of passing things around conveniently. In such cases, structures can offer major performance advantages over classes, if one uses them properly. A struct which is supposed to represent three variables of type double fastened together with duct tape should simply have three public fields of type double. Such a struct will allow two common operations to be performed efficiently:
Given an instance, take a snapshot of its state so the instance can be modified without disturbing the snapshot
Given an instance which is no longer needed, somehow come up with an instance which is slightly different
Immutable class types allow the first to be performed at fixed cost regardless of the amount of data held by the class, but they are inefficient at the second. The greater the amount of data the variable is supposed to represent, the greater the advantage of immutable class types versus structs when performing the first operation, and the greater the advantage of exposed-field structs when performing the second.
Mutable class types can be efficient in scenarios where the second operation dominates, and the first is needed seldom if ever, but it can be difficult for an object to expose the present values in a mutable class object without exposing the object itself to outside modification.
Note that depending upon usage patterns, large exposed-field structures may be much more efficient than either opaque structures or class types. Structures larger than 17 bytes are often less efficient than smaller ones, but they can still be vastly more efficient than classes. Further, the cost of passing a structure as a ref parameter does not depend upon its size. Large structs are inefficient if one accesses them via properties rather than fields, passes them by value needlessly, etc., but if one is careful to avoid redundant copy operations, there are usage patterns where there is no break-even point for classes versus structs: structs will simply perform better.
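A sketch of the ref-passing point (the struct and method are illustrative):

public struct Sample3d
{
    public double X, Y, Z;
}

public static class Geometry
{
    // Passing by ref hands over a reference-sized address no matter how
    // large the struct is; passing by value would copy all 24 bytes.
    public static double LengthSquared(ref Sample3d p) =>
        p.X * p.X + p.Y * p.Y + p.Z * p.Z;
}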
Some people may recoil in horror at the idea of a type having exposed fields, but I would suggest that a struct such as I describe shouldn't be thought of so much as an entity unto itself, but rather an extension of the things that read or write it. For example:
public struct SlopeAndIntercept
{
    public double Slope, Intercept;
}
public SlopeAndIntercept FindLeastSquaresFit() ...
Code which is going to perform a least-squares-fit of a bunch of points will have to do a significant amount of work to find either the slope or Y intercept of the resulting line; finding both would not cost much more. Code which calls the FindLeastSquaresFit method is likely going to want to have the slope in one variable and the intercept in another. If such code does:
var resultLine = FindLeastSquaresFit();
the result will be to effectively create two variables, resultLine.Slope and resultLine.Intercept, which the calling code can manipulate as it sees fit. The fields of resultLine don't really belong to SlopeAndIntercept, nor to FindLeastSquaresFit; they belong to the code that declares resultLine. The situation is little different from the method being used as:
double Slope, Intercept;
FindLeastSquaresFit(out Slope, out Intercept);
In that context, it would be clear that immediately following the function call, the two variables have the meaning assigned by the method, but that their meaning at any other time will depend upon what else the method does with them. Likewise for the fields of the aforementioned structure.
There are some situations where it may be better to return data using an immutable class rather than a transparent structure. Among other things, using a class will make it easier for future versions of a function that returns a Foo to return something which includes additional information. On the other hand, there are many situations where code is going to expect to deal with a specific set of discrete things, and changing that set of things would fundamentally change what clients have to do with it. For example, if one has a bunch of code that deals with (x,y) points, adding a "z" coordinate is going to require that code to be rewritten, and there's nothing the "point" type can do to mitigate that.
From the specification 10.5.3 Volatile fields:
The type of a volatile field must be one of the following:
A reference-type.
The type byte, sbyte, short, ushort,
int, uint, char, float, bool,
System.IntPtr, or System.UIntPtr.
An enum-type having an enum base type
of byte, sbyte, short, ushort, int,
or uint.
First I want to confirm my understanding is correct: I guess the above types can be volatile because they are stored as a 4-byte unit in memory (for reference types, because of their address), which guarantees that read/write operations are atomic. A double/long/etc. type can't be volatile because reads/writes of it are not atomic, since it occupies more than 4 bytes in memory. Is my understanding correct?
And second, if the first guess is correct, why can't a user-defined struct with only one int field in it (or something similar; 4 bytes is OK) be volatile? Theoretically it's atomic, right? Or is it not allowed simply because all user-defined structs (which may well be more than 4 bytes) are disallowed from being volatile by design?
So, I suppose you propose the following point to be added:
A value type consisting only of one field which can be legally marked volatile.
First, fields are usually private, so in external code, nothing should depend on a presence of a certain field. Even though the compiler has no issue accessing private fields, it is not such a good idea to restrict a certain feature based on something the programmer has no proper means to affect or inspect.
Since a field is usually a part of the internal implementation of a type, it can be changed at any time in a referenced assembly, but this could make a piece of C# code that used the type illegal.
This theoretical and practical reason means that the only feasible way would be to introduce a volatile modifier for value types that would ensure that point specified above holds. However, since the only group of types that would benefit from this modifier are value types with a single field, this feature probably wasn't very high on the list.
Basically, usage of the volatile keyword can sometimes be misleading. Its purpose is to ensure that the latest value (or, actually, an eventually fresh enough value)1 of the respective member is returned when it is accessed by any thread.
In fact, this is true for value types only2. Reference-type members are represented in memory as pointers to a location on the heap where the object is actually stored. So, when used on a reference type, volatile ensures you only get a fresh value of the reference (the pointer) to the object, not of the object itself.
If you have a volatile List<String> myVolatileList which is modified by multiple threads adding and removing elements, and you expect to safely see the latest modifications of the list, you are actually wrong. In fact, you are prone to the same issues as if the volatile keyword were not there: race conditions and/or a corrupted object instance. volatile does not assist you in this case, nor does it provide you with any thread safety.
If, however, the list itself is not modified by the different threads, but rather each thread only assigns a different instance to the field (meaning the list behaves like an immutable object), then you are fine. Here is an example:
public class HasVolatileReferenceType
{
    public volatile List<int> MyVolatileMember;
}
The following usage is correct with respect to multi-threading, as each thread would replace the MyVolatileMember pointer. Here, volatile ensures that the other threads will see the latest list instance stored in the MyVolatileMember field.
HasVolatileReferenceType example = new HasVolatileReferenceType();
// instead of modifying `example.MyVolatileMember`
// we are replacing it with a new list. This is OK with volatile.
example.MyVolatileMember = example.MyVolatileMember
    .Where(x => x > 42).ToList();
In contrast, the below code is error prone, because it directly modifies the list. If this code is executed simultaneously with multiple threads, the list may become corrupted, or behave in an inconsistent manner.
example.MyVolatileMember.RemoveAll(x => x <= 42);
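If the list really must be mutated in place, the synchronization has to come from elsewhere, for example a lock (a sketch; the ListGuard class and _gate field are illustrative additions):

public static class ListGuard
{
    private static readonly object _gate = new object();

    public static void RemoveSmall(HasVolatileReferenceType example)
    {
        // volatile does not make the list itself thread-safe;
        // every mutation (and read) must happen under the same lock.
        lock (_gate)
        {
            example.MyVolatileMember.RemoveAll(x => x <= 42);
        }
    }
}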
Let us return to value types for a while. In .NET, values of value types are actually reassigned when they are modified, so they are safe to use with the volatile keyword; see the code:
public class HasVolatileValueType
{
    public volatile int MyVolatileMember;
}
// usage
HasVolatileValueType example = new HasVolatileValueType();
example.MyVolatileMember = 42;
1The notion of latest value here is a little misleading, as noted by Eric Lippert in the comments section. In fact, latest here means that the .NET runtime will attempt (no guarantees) to prevent writes to volatile members from happening in between read operations whenever it deems it possible. This contributes to different threads reading a fresh value of the volatile member, as their read operations will probably be ordered after a write operation to the member. But it is more about probability than a guarantee.
2In general, volatile is OK to use on any immutable object, since modifications always imply reassignment of the field with a different value. The following code is also a correct example of the use of the volatile keyword:
public class HasVolatileImmutableType
{
    public volatile string MyVolatileMember;
}
// usage
HasVolatileImmutableType example = new HasVolatileImmutableType();
example.MyVolatileMember = "immutable";
// string is a reference type, but is *immutable*,
// so we need to reassign the modification result in order
// to work with the new value later
example.MyVolatileMember = example.MyVolatileMember.Substring(2);
I'd recommend you take a look at this article. It thoroughly explains the usage of the volatile keyword, how it actually works, and the possible consequences of using it.
I think it is because a struct is a value type, which is not one of the types listed in the spec. It is interesting to note that reference types can be volatile fields, so it can be accomplished with a user-defined class. This may disprove your theory that the above types can be volatile because they can be stored in 4 bytes (or maybe not).
This is an educated guess at the answer... please don't shoot me down too much if I am wrong!
The documentation for volatile states:
The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access.
This implies that part of the design intent for volatile fields is to implement lock-free multithreaded access.
A member of a struct can be updated independently of the other members. So in order to write the new struct value where only part of it has been changed, the old value must be read. Writing is therefore not guaranteed to require a single memory operation. This means that in order to update the struct reliably in a multithreaded environment, some kind of locking or other thread synchronization is required. Updating multiple members from several threads without synchronization could soon lead to counter-intuitive, if not technically corrupt, results: to make a struct volatile would be to mark a non-atomic object as atomically updateable.
Additionally, only some structs could be volatile - those of size 4 bytes. The code that determines the size - the struct definition - could be in a completely separate part of the program to that which defines a field as volatile. This could be confusing as there would be unintended consequences of updating the definition of a struct.
So, whereas it would be technically possible to allow some structs to be volatile, the caveats for correct usage would be sufficiently complex that the disadvantages would outweigh the benefits.
My recommendation for a workaround would be to store your 4-byte struct as a 4-byte base type and implement static conversion methods to use each time you want to use the field.
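For example, a sketch of that workaround (Rgba is a made-up 4-byte struct packed into an int):

public struct Rgba                    // exactly 4 bytes
{
    public byte R, G, B, A;

    public static int ToInt32(Rgba c) =>
        c.R | (c.G << 8) | (c.B << 16) | (c.A << 24);

    public static Rgba FromInt32(int v) => new Rgba
    {
        R = (byte)v,
        G = (byte)(v >> 8),
        B = (byte)(v >> 16),
        A = (byte)(v >> 24)
    };
}

public class SharedColor
{
    private volatile int _packed;     // the 4-byte base type can be volatile

    public Rgba Color
    {
        get => Rgba.FromInt32(_packed);
        set => _packed = Rgba.ToInt32(value);
    }
}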
To address the second part of your question, I would support the language designers' decision based on two points:
KISS - Keep It Simple, Simon - it would make the spec more complex and implementations harder to get right. All language features start at minus 100 points; is adding the ability to make a small minority of structs volatile really worth 101 points?
Compatibility - questions of serialization aside, adding a new field to a type [class, struct] is usually a safe, backwards source-compatible move. Adding a field should not break anyone's compile. If the behavior of structs changed when a field was added, this guarantee would be broken.