I'm using Visual Studio 2010 + ReSharper and it shows a warning on the following code:
if (rect.Contains(point))
{
...
}
rect is a readonly Rectangle field, and ReSharper shows me this warning:
"Impure Method is called for readonly field of value type."
What are impure methods and why is this warning being shown to me?
First off, Jon, Michael and Jared's answers are essentially correct but I have a few more things I'd like to add to them.
What is meant by an "impure" method?
It is easier to characterize pure methods. A "pure" method has the following characteristics:
Its output is entirely determined by its input; its output does not depend on externalities like the time of day or the bits on your hard disk. Its output does not depend on its history; calling the method with a given argument twice should give the same result.
A pure method produces no observable mutations in the world around it. A pure method may choose to mutate private state for efficiency's sake, but a pure method does not, say, mutate a field of its argument.
For example, Math.Cos is a pure method. Its output depends only on its input, and the input is not changed by the call.
An impure method is a method which is not pure.
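To make the distinction concrete, here is a minimal sketch. The names Counter, Add, Next and PurityDemo are hypothetical and chosen only for illustration. Add depends only on its arguments, while Next mutates the struct it is called on, so repeated calls give different results.

```csharp
public struct Counter
{
    private int count;

    // Pure: the result depends only on the arguments and nothing is mutated.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Impure: mutates the receiver, so each call returns a different value.
    public int Next()
    {
        count = count + 1;
        return count;
    }
}

public static class PurityDemo
{
    public static int[] Run()
    {
        Counter c = new Counter();   // c is an ordinary (mutable) local variable
        int first = Counter.Add(2, 3);
        int second = Counter.Add(2, 3);
        int third = c.Next();
        int fourth = c.Next();
        return new[] { first, second, third, fourth }; // 5, 5, 1, 2
    }
}
```

Calling PurityDemo.Run() shows Add returning the same answer twice and Next returning a new answer each time.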
What are some of the dangers of passing readonly structs to impure methods?
There are two that come to mind. The first is the one pointed out by Jon, Michael and Jared, and this is the one that ReSharper is warning you about. When you call a method on a struct, we always pass a reference to the variable that is the receiver, in case the method wishes to mutate the variable.
So what if you call such a method on a value, rather than a variable? In that case, we make a temporary variable, copy the value into it, and pass a reference to the variable.
A readonly variable is considered a value, because it cannot be mutated outside the constructor. So we are copying the variable to another variable, and the impure method is possibly mutating the copy, when you intend it to mutate the variable.
That's the danger of passing a readonly struct as a receiver. There is also a danger of passing a struct that contains a readonly field. A struct that contains a readonly field is a common practice, but it is essentially writing a cheque that the type system does not have the funds to cash; the "read-only-ness" of a particular variable is determined by the owner of the storage. An instance of a reference type "owns" its own storage, but an instance of a value type does not!
struct S
{
private readonly int x;
public S(int x) { this.x = x; }
public void Badness(ref S s)
{
Console.WriteLine(this.x);
s = new S(this.x + 1);
// This should be the same, right?
Console.WriteLine(this.x);
}
}
One thinks that this.x is not going to change because x is a readonly field and Badness is not a constructor. But...
S s = new S(1);
s.Badness(ref s);
... clearly demonstrates the falsity of that. this and s refer to the same variable, and that variable is not readonly!
An impure method is one which isn't guaranteed to leave the value as it was.
In .NET 4, you can decorate methods and types with [Pure] to declare them to be pure, and R# will take notice of this. Unfortunately, you can't apply it to someone else's members, and you can't convince R# that a type/member is pure in a .NET 3.5 project as far as I'm aware. (This bites me in Noda Time all the time.)
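As a rough sketch of what that looks like, the attribute lives in System.Diagnostics.Contracts. The Celsius type and its members below are hypothetical. Note that [Pure] is only a declaration for tools; the compiler does not check that the method really is pure.

```csharp
using System.Diagnostics.Contracts;

public struct Celsius
{
    private readonly double degrees;

    public Celsius(double degrees)
    {
        this.degrees = degrees;
    }

    // Declares (but does not prove) that this method has no side effects.
    [Pure]
    public double ToFahrenheit()
    {
        return degrees * 9.0 / 5.0 + 32.0;
    }
}

public static class PureAttributeDemo
{
    // A readonly field of a value type, as in the question.
    private static readonly Celsius Boiling = new Celsius(100.0);

    public static double Run()
    {
        // The call still goes through a copy, but nothing is lost
        // because the method does not mutate anything.
        return Boiling.ToFahrenheit(); // 212
    }
}
```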
The idea is that if you're calling a method which mutates a variable, but you call it on a read-only field, it's probably not doing what you want, so R# will warn you about this. For example:
public struct Nasty
{
public int value;
public void SetValue()
{
value = 10;
}
}
class Test
{
static readonly Nasty first;
static Nasty second;
static void Main()
{
first.SetValue();
second.SetValue();
Console.WriteLine(first.value); // 0
Console.WriteLine(second.value); // 10
}
}
This would be a really useful warning if every method which was actually pure was declared that way. Unfortunately they're not, so there are a lot of false positives :(
The short answer is that this is a false positive, and you can safely ignore the warning.
The longer answer is that accessing a read-only value type creates a copy of it, so that any changes to the value made by a method would only affect the copy. ReSharper doesn't realize that Contains is a pure method (meaning it has no side effects). Eric Lippert talks about it here: Mutating Readonly Structs
It sounds like ReSharper believes that the method Contains can mutate the rect value. Because rect is a readonly field of a struct type, the C# compiler makes a defensive copy of the value to prevent the method from mutating the readonly field. Essentially, the final code looks like this:
Rectangle temp = rect;
if (temp.Contains(point)) {
...
}
ReSharper is warning you here that Contains may mutate rect in a way that would be immediately lost because it happened on a temporary.
An Impure method is a method that could have side-effects. In this case, ReSharper seems to think it could change rect. It probably doesn't but the chain of evidence is broken.
Consider the following executable example:
namespace MyNamespace;
public record struct Record()
{
public bool DoSomething { get; set; } = false;
public void SetDoSomething(bool newValue)
{
DoSomething = newValue;
}
}
public static class Program
{
public static readonly Record MyObject = new();
public static void Main()
{
MyObject.SetDoSomething(true);
Console.WriteLine($"MyObject.DoSomething: {MyObject.DoSomething}");
/* Output:
* false - current version
* true - if MyObject is not readonly or Record is defined as record class
*/
}
}
I'm trying to understand why DoSomething is still false after calling the method which sets the property to true.
My guess is that a copy gets created when calling the method. It makes sense that this does not happen if Record is a reference type (record class). But why is MyObject not copied if I remove the readonly modifier?
This is called a defensive copy. The C# compiler makes it to preserve the semantics of value types held in read-only storage. It is generally not recommended to mark a field as readonly when its type is a non-readonly struct, since such copies will then occur and can also cause performance regressions. There are some similar scenarios worth mentioning. More specifically:
x.Y causes a defensive copy of the x if:
x is a readonly field and
the type of x is a non-readonly struct and
Y is not a field.
The same rules are applied when x is an in-parameter, ref readonly local variable or a result of a method invocation that returns a value by readonly reference.
The record modifier here really doesn't matter. You marked a field of value type as readonly, so the compiler has to preserve that semantic, i.e., the immutability of the value, through and through. When you invoke a method or access a property on that field, the compiler does not know whether the method or property is actually side-effect free, so it makes the conservative decision: it works on a defensive copy so that the field itself cannot change.
You can find more information in The ‘in’-modifier and the readonly structs in C# and Avoiding struct and readonly reference performance pitfalls with ErrorProne.NET.
The behaviour you see is present not only in record structs but in non-record structs too. Try removing the keyword record and the () after the name Record, and you will see the same behaviour.
This is just how calling mutating methods on structs is supposed to work. When you call a mutating method on a struct variable, say x.F(), you actually pass a reference to x, and F can then mutate x through that reference.
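One way to sidestep the problem, sketched here under the assumption that C# 7.2 or later is available, is to declare the struct itself as readonly and return new values instead of mutating. The names Flag, With and FlagDemo are hypothetical. Because no member of a readonly struct can mutate this, the compiler has no reason to make a defensive copy.

```csharp
public readonly struct Flag
{
    public bool Value { get; }

    public Flag(bool value)
    {
        Value = value;
    }

    // Returns a new value instead of mutating the receiver.
    public Flag With(bool newValue)
    {
        return new Flag(newValue);
    }
}

public static class FlagDemo
{
    public static readonly Flag Shared = new Flag(false);

    public static bool[] Run()
    {
        Flag updated = Shared.With(true);
        // Shared is unchanged; the new state lives in 'updated'.
        return new[] { Shared.Value, updated.Value }; // false, true
    }
}
```

With this design there is no silent loss of a mutation, because the type offers no way to express one.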
For example, if Record is a non-record struct, and MyObject is not readonly, MyObject.SetDoSomething(true); is compiled to the following IL (Try it yourself with SharpLab):
ldsflda valuetype Record Program::MyObject
ldc.i4.1
call instance void Record::SetDoSomething(bool)
ldsflda means "load static field address". I've only found a small section of the spec that talks about this when it is talking about boxing of structs (emphasis mine):
Similarly, boxing never implicitly occurs when accessing a member on a constrained type parameter when the member is implemented within the value type. For example, suppose an interface ICounter contains a method Increment, which can be used to modify a value. If ICounter is used as a constraint, the implementation of the Increment method is called with a reference to the variable that Increment was called on, never a boxed copy.
Basically, if you don't box structs (you clearly don't here!), their methods are supposed to be called by reference. No copies are supposed to be made.
On the other hand, if you call x.F() but x is readonly, you obviously can't translate it to the same code above, since that would mutate the field. What the compiler does, according to SharpLab, is:
ldsfld valuetype Record Program::MyObject
stloc.0
ldloca.s 0
ldc.i4.1
call instance void Record::SetDoSomething(bool)
Basically, it loads the value of the struct into a temporary variable first, and then passes a reference to that variable to SetDoSomething:
var temp = MyObject;
temp.SetDoSomething(true);
Hence the "copy" behaviour that you see.
In C#, if I have the following struct:
internal struct myStruct : IDisposable
{
public int x;
public void Dispose()
{
x = 0;
}
}
then do this in Main:
using (myStruct myStruct = new myStruct())
{
myStruct.x = 5;
}
it fails saying that myStruct is readonly. That makes sense as myStruct is a value-type.
Now if I add the following function to the struct:
public void myfunc(int x)
{
this.x = x;
}
and change the Main code to this:
using (myStruct myStruct = new myStruct())
{
myStruct.myfunc(5);
Console.WriteLine(myStruct.x);
}
it works. Why?
The short answer is "because the C# specification says so". Which, I admit, may be a bit unsatisfying. But that's how it is.
The motivation is, I'm sure, as commenter Blogbeard suggests: while it's practical to enforce read-only on the field access, it's not practical to do so from within a type. After all, the type itself has no way to know how a variable containing a value of that type was declared.
The key part of the C# specification (from the v5.0 spec) is here, on page 258 (in the section on the using statement):
Local variables declared in a resource-acquisition are read-only, and must include an initializer. A compile-time error occurs if the embedded statement attempts to modify these local variables (via assignment or the ++ and -- operators), take the address of them, or pass them as ref or out parameters.
Since in the case of a value type, the variable itself contains the value of the object rather than a reference to an object, modifying any field of the object via that variable is the same as modifying the variable, and so is a "modification via assignment", which is specifically prohibited by the specification.
This is exactly the same as if you had declared the value type variable as a field in another object, with the readonly modifier.
But note that this is a compile-time rule, enforced by the C# compiler, and that there's no way for the compiler to similarly enforce the rule for a value type that modifies itself.
I will point out that this is one of many excellent reasons that one should never ever implement a mutable value type. Mutable value types frequently wind up being able to be modified when you don't want them to be, while at the same time find themselves failing to be modified when you do want them to be (in completely different scenarios from this one).
If you treat a value type as something that is truly a value, i.e. a single value that is itself never changing, they work much better and find themselves in the middle of many fewer bugs. :)
Apparently you can change the this value from anywhere in your struct (but not in classes):
struct Point
{
public Point(int x, int y)
{
this = new Point();
X = x; Y = y;
}
int X; int Y;
}
I've neither seen this before nor ever needed it. Why would one ever want to do that? Eric Lippert reminds us that a feature must be justified to be implemented. What great use case could justify this? Are there any scenarios where this is invaluable? I couldn't find any documentation on it (1).
Also, for calling constructors there is already a better known alternative syntax, so this feature is sometimes redundant:
public Point(int x, int y)
: this()
{
X = x; Y = y;
}
I found this feature in an example in Jeffrey Richter's CLR via C# 4th edition.
1) Apparently it is in the C# specification.
Good question!
Value types are, by definition, copied by value. If this was not actually an alias to a storage location then the constructor would be initializing a copy rather than initializing the variable you intend to initialize. Which would make the constructor rather less useful! And similarly for methods; yes, mutable structs are evil but if you are going to make a mutable struct then again, this has to be the variable that is being mutated, not a copy of its value.
The behaviour you are describing is a logical consequence of that design decision: since this aliases a variable, you can assign to it, same as you can assign to any other variable.
It is somewhat odd to assign directly to this like that, rather than assigning to its fields. It is even more odd to assign directly to this and then overwrite 100% of that assignment!
An alternative design which would avoid making this an alias to the receiver's storage would be to allocate this off the short-term storage pool, initialize it in the ctor, and then return it by value. The down side of that approach is that it makes copy elision optimizations pretty much impossible, and it makes ctors and methods weirdly inconsistent.
Also, I couldn't find any documentation on it.
Did you try looking in the C# spec? Because I can find documentation on it (7.6.7):
When this is used in a primary-expression within an instance constructor of a struct, it is classified as a variable. The type of the variable is the instance type (§10.3.1) of the struct within which the usage occurs, and the variable represents the struct being constructed. The this variable of an instance constructor of a struct behaves exactly the same as an out parameter of the struct type—in particular, this means that the variable must be definitely assigned in every execution path of the instance constructor.
When this is used in a primary-expression within an instance method or instance accessor of a struct, it is classified as a variable. The type of the variable is the instance type (§10.3.1) of the struct within which the usage occurs.
If the method or accessor is not an iterator (§10.14), the this variable represents the struct for which the method or accessor was invoked, and behaves exactly the same as a ref parameter of the struct type.
If the method or accessor is an iterator, the this variable represents a copy of the struct for which the method or accessor was invoked, and behaves exactly the same as a value parameter of the struct type.
As to a use case for it, I can't immediately think of many - about the only thing I've got is if the values you want to assign in the constructor are expensive to compute, and you've got a cached value you want to copy into this, it might be convenient.
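A minimal sketch of that use case follows. The Settings type, its fields and the Compute helper are hypothetical stand-ins for something that is expensive to build. Assigning to this copies every field of the cached value in one step, after which individual fields can be overridden.

```csharp
public struct Settings
{
    public int Width;
    public int Height;

    // Hypothetical "expensive" value, computed once and cached.
    private static readonly Settings Cached = Compute();

    private static Settings Compute()
    {
        Settings s = new Settings();
        s.Width = 640;
        s.Height = 480;
        return s;
    }

    public Settings(int width)
    {
        // Copies all fields of the cached value; 'this' is now definitely assigned.
        this = Cached;
        // Override just the field that differs.
        Width = width;
    }
}

public static class SettingsDemo
{
    public static int[] Run()
    {
        Settings s = new Settings(800);
        return new[] { s.Width, s.Height }; // 800, 480
    }
}
```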
A storage location of value type is an aggregation of storage locations comprising that type's public and private fields. Passing a value type as an ordinary (value) parameter will physically and semantically pass the contents of all its fields. Passing a value type as a ref parameter will semantically pass all of its fields by reference, though a single "byref" is used to pass all of them.
Calling a method on a struct is equivalent to passing the struct (and thus all its fields) as a ref parameter, except for one wrinkle: normally, neither C# nor vb.net will allow a read-only value to be passed as a ref parameter. Both, however, will allow struct methods to be invoked on read-only values or temporary values. They do this by making a copy of the whole struct (and thus all of its fields), and then passing that copy as a ref parameter.
Because of this behavior, some people call mutable structs "evil", but the only thing that's evil is the fact that neither C# nor vb.net defines any attribute to indicate whether a struct member or property should be invokable on things that can't be directly passed by ref.
I created a "const" for a value previously explicitly stated several times in my code:
private static readonly int QUARTER_HOUR_COUNT = 96;
When I did a search-and-replace of 96 for QUARTER_HOUR_COUNT, I inadvertently also replaced the declaration, so it became:
private static readonly int QUARTER_HOUR_COUNT = QUARTER_HOUR_COUNT;
...yet it compiled. I would think that it would disallow that. Why was it accepted by the compiler as a valid declaration?
I would think that it would disallow that. Why was it accepted by the compiler as a valid declaration?
Presumably because the language specification allows it. Do you have a specific rule in the language specification which you think prohibits it?
If your question is really "why doesn't the language specification prohibit this" - I suspect it's because it's probably quite hard to make sure you only prohibit things you really want to prohibit, while actually prohibiting all such things.
You could argue that for simple cases of assignment directly back to itself, it would be good to have a special case in the language spec, but it would introduce complexity into the language for relatively little benefit.
Note that even if you didn't get an error, I'd expect you to get a warning - something like this:
Test.cs(3,33): warning CS1717: Assignment made to same variable; did you mean to assign something else?
Also note that if you make it a const instead of just a static readonly variable, then you do get a compile-time error:
Test.cs(3,23): error CS0110: The evaluation of the constant value for 'Program.QUARTER_HOUR_COUNT' involves a circular definition
Also note that by .NET naming conventions, this ought to be called QuarterHourCount, rather than having a SHOUTY_NAME.
The IL code generated by the code is this:
IL_0007: ldsfld int32 Example.Quat::QUARTER_HOUR_COUNT // Load the value of the static field onto the stack
IL_000c: stsfld int32 Example.Quat::QUARTER_HOUR_COUNT // Store the value from the stack into the static field
Since the default value of QUARTER_HOUR_COUNT is 0, that 0 is what gets assigned to QUARTER_HOUR_COUNT.
Because the variable was initialized as 0 and then set to itself.
My guess would be that it would be doing a new Int() prior to setting to itself which would initialize it to zero.
Because the compiler will break this line down:
private static readonly int QUARTER_HOUR_COUNT = QUARTER_HOUR_COUNT;
Basically into the IL equivalent of:
private static readonly int QUARTER_HOUR_COUNT;
QUARTER_HOUR_COUNT = QUARTER_HOUR_COUNT;
And then obviously that'll get broken down more too, but the above should suffice to illustrate my point.
So technically it'll exist with a default value of zero at the time it gets used.
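The same effect is easier to see with two fields. This is only a sketch, and the InitOrder class and its field names are hypothetical. Static field initializers run in textual order, so a field read before its own initializer has run still holds its default value.

```csharp
public static class InitOrder
{
    // Second has not been initialized yet when this line runs,
    // so it still holds default(int), which is 0.
    public static readonly int First = Second + 1;

    public static readonly int Second = 41;
}

public static class InitOrderDemo
{
    public static int[] Run()
    {
        return new[] { InitOrder.First, InitOrder.Second }; // 1, 41
    }
}
```

The self-assignment in the question is the degenerate case of this: the field reads its own default value and stores it back.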
As others have implied value types like int have a default value so declaring a variable without explicitly initializing it means it still has a value.
You can find out the default value for any type like so:
int i = default(int);
Or more generally:
T t = default(T);
Note that for reference types the default is null; only value types have non-null default values.
I'm continuing my study of C# and the language specification, and here is another behavior that I don't quite understand:
The C# Language Specification clearly states the following in section 10.4:
The type specified in a constant declaration must be sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, bool, string, an enum-type, or a reference-type.
It also states in section 4.1.4 the following:
Through const declarations it is possible to declare constants of the simple types (§10.4). It is not possible to have constants of other struct types, but a similar effect is provided by static readonly fields.
Ok, so a similar effect can be gained by using static readonly. Reading this I went and tried the following code:
static void Main()
{
OffsetPoints();
Console.Write("Hit a key to exit...");
Console.ReadKey();
}
static Point staticPoint = new Point(0, 0);
static readonly Point staticReadOnlyPoint = new Point(0, 0);
public static void OffsetPoints()
{
PrintOutPoints();
staticPoint.Offset(1, 1);
staticReadOnlyPoint.Offset(1, 1);
Console.WriteLine("Offsetting...");
Console.WriteLine();
PrintOutPoints();
}
static void PrintOutPoints()
{
Console.WriteLine("Static Point: X={0};Y={1}", staticPoint.X, staticPoint.Y);
Console.WriteLine("Static readonly Point: X={0};Y={1}", staticReadOnlyPoint.X, staticReadOnlyPoint.Y);
Console.WriteLine();
}
The output of this code is:
Static Point: X=0;Y=0
Static readonly Point: X=0;Y=0
Offsetting...
Static Point: X=1;Y=1
Static readonly Point: X=0;Y=0
Hit a key to exit...
I really expected the compiler to give me some kind of warning about mutating a static readonly field or failing that, to mutate the field as it would with a reference type.
I know mutable value types are evil (why did Microsoft ever implement Point as mutable is a mystery) but shouldn't the compiler warn you in some way that you are trying to mutate a static readonly value type? Or at least warn you that your Offset() method will not have the "desired" side effects?
Eric Lippert explains what's going on here:
...if the field is readonly and the reference occurs outside an
instance constructor of the class in which the field is declared, then
the result is a value, namely the value of the field I in the object
referenced by E.
The important word here is that the result is the value of the field,
not the variable associated with the field. Readonly fields are not
variables outside of the constructor. (The initializer here is
considered to be inside the constructor; see my earlier post on that
subject.)
Oh and just to stress on the evilness of mutable structs, here is his conclusion:
This is yet another reason why mutable value types are evil. Try to
always make value types immutable.
The point of the readonly is that you cannot reassign the reference or value.
In other words if you attempted this
staticReadOnlyPoint = new Point(1, 1);
you would get a compiler error because you are attempting to reassign staticReadOnlyPoint. The compiler will prevent you from doing this.
However, readonly doesn't enforce whether the value or referenced object itself is mutable - that is a behaviour that is designed into the class or struct by the person creating it.
[EDIT: to properly address the odd behaviour being described]
The reason you see the behaviour where staticReadOnlyPoint appears to be immutable is not because it is immutable itself, but because it is a struct held in a readonly field. This means that every time you access it, you are taking a full copy of it.
So your line
staticReadOnlyPoint.Offset(1, 1);
is accessing, and mutating, a copy of the field, not the actual value in the field. When you subsequently write out the value you are then writing out yet another copy of the original (not the mutated copy).
The copy you did mutate with the call to Offset is discarded, because it is never assigned to anything.
The compiler simply doesn't have enough information available about a method to know that the method mutates the struct. A method may well have a side-effect that's useful but doesn't otherwise change any members of the struct. It would technically be possible to add such analysis to the compiler. But that won't work for any types that live in another assembly.
The missing ingredient is a metadata token that indicates that a method doesn't mutate any members. Like the const keyword in C++. Not available. It would have been drastically non-CLS compliant if it was added in the original design. There are very few languages that support the notion. I can only think of C++ but I don't get out much.
Fwiw, the compiler does generate explicit code to ensure that the statement cannot accidentally modify the readonly. This statement
staticReadOnlyPoint.Offset(1, 1);
gets translated to
Point temp = staticReadOnlyPoint; // makes a copy
temp.Offset(1, 1);
Adding code that then compares the value and generates a runtime error is also only technically possible. It costs too much.
The observed behavior is an unfortunate consequence of the fact that neither the Framework nor C# provides any means by which member function declarations can specify whether this should be passed by ref, const-ref, or value. Instead, value types always pass this by (non-const-restricted) ref, and reference types always pass this by value.
The 'proper' behavior for a compiler would be to forbid passing immutable or temporary values by non-const-restricted ref. If such restriction could be imposed, ensuring proper semantics for mutable value types would mean following a simple rule: if you make an implicit copy of a struct, you're doing something wrong. Unfortunately, the fact that member functions can only accept this by non-const-restricted ref means a language designer must make one of three choices:
Guess that a member function won't modify `this`, and simply pass immutable or temporary variables by `ref`. This would be most efficient for functions which do not, in fact, modify `this`, but could dangerously expose to modification things that should be immutable.
Don't allow member functions to be used on immutable or temporary entities. This would avoid improper semantics, but would be a really annoying restriction, especially given that most member functions do not modify `this`.
Allow the use of member functions except those deemed most likely to modify `this` (e.g. property setters), but instead of passing immutable entities directly by ref, copy them to temporary locations and pass those.
Microsoft's choice protects constants from improper modification, but has the unfortunate consequences that code will run needlessly slowly when calling functions that don't modify this, while generally working incorrectly for those which do.
Given the way this is actually handled, one's best bet is to avoid making any changes to it in structure member functions other than property setters. Having property setters or mutable fields is fine, since the compiler will correctly forbid any attempt to use property setters on immutable or temporary objects, or to modify any fields thereof.
If you look at the IL, you will see that on usage of the readonly field, a copy is made before calling Offset:
IL_0014: ldsfld valuetype [System.Drawing]System.Drawing.Point Program::staticReadOnlyPoint
IL_0019: stloc.0
IL_001a: ldloca.s CS$0$0000
Why this is happening, is beyond me.
It could be part of the spec, or a compiler bug (but it looks a bit too intentional for the latter).
The effect is due to several well-defined features coming together.
readonly means that the field in question cannot be changed, but not that the target of the field cannot be changed. This is more easily understood (and more often useful in practice) with readonly fields of a mutable reference type, where you can do x.SomeMutatingMethod() but not x = someNewObject.
So, the first item is: you can mutate the target of a readonly field.
Second item is, that when you access a non-variable value type you obtain a copy of the value. The least confusing example of this is giveMeAPoint().Offset(1, 1) because there isn't a known location for us to later observe that the value-type returned by giveMeAPoint() may or may not have been mutated.
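The second item can be sketched as follows. MutablePoint, GiveMeAPoint and TemporaryDemo are hypothetical names, and the struct is a stand-in for System.Drawing.Point so that the example has no external dependencies. Mutating the value returned by a method changes only the returned copy, while mutating the variable itself sticks.

```csharp
public struct MutablePoint
{
    public int X;
    public int Y;

    public void Offset(int dx, int dy)
    {
        X += dx;
        Y += dy;
    }
}

public static class TemporaryDemo
{
    private static MutablePoint stored = new MutablePoint();

    // Returns the value of 'stored', i.e. a copy, not the variable itself.
    private static MutablePoint GiveMeAPoint()
    {
        return stored;
    }

    public static int[] Run()
    {
        GiveMeAPoint().Offset(1, 1);   // mutates the returned copy, which is then discarded
        int afterTemporary = stored.X; // still 0

        stored.Offset(1, 1);           // mutates the variable itself
        int afterVariable = stored.X;  // now 1

        return new[] { afterTemporary, afterVariable };
    }
}
```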
This is why value types are not evil, but are in some ways worse. Truly evil code doesn't have a well-defined behaviour, and all of this is well-defined. It's still confusing though (confusing enough for me to get this wrong on my first answer), and confusing is worse than evil when you're trying to code. Easily understood evil is so much more easily avoided.