Why are increment operators possible on properties while ref is not

Why are increment operators possible on properties while ref is not - c#

Say you have a class like this
public class Foo
{
public int Bar { get; set; } = 42;
}
If you try to pass the property as a ref parameter the compiler issues the error
CS0206 A property or indexer may not be passed as an out or ref
parameter
This is understandable since in practice the property in the above example is compiled into get_Bar() and set_Bar() methods. But if you use an increment operator on the property, like
var foo = new Foo();
foo.Bar++;
it works as expected. To achieve this, the compiler needs to produce something like this pseudo code:
var foo = new Foo();
int tmp = foo.get_Bar();
tmp++;
foo.set_Bar(tmp);
So in theory the compiler could do a similar thing for ref like:
var foo = new Foo();
int tmp = foo.get_Bar();
DoSomething(ref tmp);
foo.set_Bar(tmp);
Is there a technical reason why the compiler doesn't do that or was this just a design decision of the C# team?

Like HansPassant said, this was a design decision that the C# team made when writing the C# specification, so you would have to ask one of them in order to get a proper answer.
If I was to hazard a guess, though, it would be that the amount of compiler magic to get passing a property with ref would cause enough unobvious operations to happen behind the scenes as to make the solution undesirable. For example, how incrementing/decrementing a property works currently is as you say: the program assigns the value of the property's backing field to a temporary variable, performs the operation, and reassigns the result to the property. It's a straightforward process that doesn't incorporate any difficult concepts.
To do the same behind the scenes magic to pass a property with ref, however, the process becomes a bit more involved. When a value type is passed by ref, the actual value getting passed through the parameter is a pointer to the value type variable. To do this for a property, though, you would have to do something akin to your second example. That would result in the address of the temporary variable, not the property itself, to be passed to the method. That kind of behavior could cause some unforeseen and difficult to understand consequences for someone trying to manipulate the ref parameter in certain ways.
So there's my guess, that the increment operator is simple to wrap because it only deals with values, whereas the ref keyword is more complicated because it has to worry about scopes and memory addresses as well.
EDIT: Another reason that occurred to me, for a field, any manipulations inside the called method would be reflected on the field itself. These manipulations could be seen by other threads and such that accessed that field during the method's execution (best practices about concurrent field accessibility aside).
For a parameter, however, any changes that happen inside the method wouldn't be visible until the method returned and the value copied back over. This would lead to inconsistent behavior between fields and properties, the cause of which wouldn't be readily apparent.
(Personally, I think this is a more probable reason to not support ref on properties.)

Related

Why can't "this" for value types be boxed?

So I wanted to be able to mimic the "with" functionality of VB in C#, and came across a rather clever solution through StackOverflow:
public static x with<x>(this x item, Func<x,x> f)
{
item = f(item);
return item;
}
to implement you would do something like:
myObject = myObject.with(myVariable => {
//do things
});
However, I ran into a snag when I tried to implement this inside a struct with one of the struct's fields. It said that "anonymous methods [...] inside structs cannot access members of 'this' [...]."
I did some research on this and found an answer to this question, which states ultimately that "this" for value types cannot be boxed. After researching what boxing means for C#, it makes sense, considering that the parameter for the function doesn't have a defined type.
My question is, why can't "this" for value types be boxed?

The main point here is that for a method on a struct, the meaning of this is not a value (i.e. an value of your SomeStruct), but rather a reference (a ref SomeStruct, or effectively an in SomeStruct in the case of a readonly struct).
You can't box a managed pointer of this form - that isn't a scenario supported by the runtime. Managed pointers are only meant to be on the stack. In fact, currently you can't even have a ref SomeStruct field in a custom ref struct that can't escape the stack.
The compiler could cheat by pretending to do that - i.e. by dereferencing the managed pointer from a ref SomeStruct into a SomeStruct and creating a capture context where this is interpreted as "the SomeStruct that we dereferenced earlier", but ... then the compiler can't guarantee the same behaviours and outcomes (actually, I suspect a case could be made for this in the readonly struct scenario, but ... it is probably easier not to introduce that subtle distinction).
Instead, the compiler suggests that you effectively do the above step manually; since the compiler is no longer dealing in terms of this, it no longer has to pretend to respect the usual outcomes for dealing with this, and instead only has to guarantee the behaviours of an explicitly dereferenced copy of the value. That's why it advises (at least on current compiler versions):
Consider copying 'this' to a local variable outside the anonymous method, lambda expression or query expression and using the local instead.
Most lambdas, local methods, etc can therefore be achieved by the pragmatic step of:
MyStruct copy = this; // dereference
then in your lambda / local method / etc: instead of touching Something aka this.Something - touch copy.Something. It is now only copy that is being included in the capture-context, and copy is not bound by the ref SomeStruct rules (because: it isn't a ref SomeStruct - it is a SomeStruct).
It does, however, mean that if your intent was to mutate this (and have that visible in-place, rather than as a return value), then it won't work. You'll only be mutating the copy (i.e. copy). This is exactly what the compiler would have had to do anyway if it had lied, but at least now the copy step (dereference) is explicit and visible in your code.

I presume you know about setting properties during construction, which is similar (and more idiomatic of c#) ?
class MySpecialClass
{
public string Property1 {get;set;}
public int Length {get;set;}
public float Width {get;set;}
}
var p = new MySpecialClass
{
Property1 = "PropertyValue",
Length = 12,
OtherThing = 1.234F
};

Why can we not set properties of properties?

I frequently find myself wanting to do something along these lines:
Form form = new Form();
form.ClientSize.Width = 500;
Of course the compiler will now complain that this code is not valid, since ClientSize is a property, and not a variable.
We can fix this by setting the ClientSize in its entirety:
form.ClientSize = new Size(500, ClientSize.Height);
Or, in general:
Size s = form.ClientSize;
s.Width = 500;
form.ClientSize = s; //only necessary if s is a value-type. (Right?)
But this is all looks unnecessary and obfuscated. Why can't the compiler do this for me? And of course, I'm asking about the general case, possibly involving even deeper levels of properties, not just the mundane example above
Basically, I'm asking why there is no syntactic sugar to translate the line form.ClientSize.Width = 500 into the above code. Is this simply a feature which hasn't yet been implemented, is it to avoid stacking of side effects from different getters and setters, to prevent confusion when one of the setters isn't defined, or is there a completely different reason why something like this doesn't exist?

Why can't the compiler do this for me?
It can. In fact, if you were programming in VB it would do exactly that. The C# compiler doesn't do this because it generally takes the philosophy of doing what you tell it to do; it is not a language that tries to guess at what you want to do and do that instead. If you tell it to do something silly, it'll just let you do something silly. Now this particular case is such a common source of bugs, and given these exact semantics is virtually certain to be a bug, so it does result in an error, because it's nice like that.
C# programmers learn to rely on the C# compiler never deciding to do something that you never told it to do, which can be a cause of confusion and problems when the compiler guess wrong about what you wanted to do.

I believe that you have a fundamental misunderstanding of .NET here. You can set properties of properties all day for class types because you're modifying the data of a reference without changing the reference. Take this code for example which compiles and runs fine:
class Program
{
public class Complex1
{
public Complex2 Complex2Property { get; set; }
}
public class Complex2
{
public int IntProperty { get; set; }
}
static void Main( string[] args )
{
// You must create instances of all properties to avoid a NullReferenceException
// prior to accessing said properties
var complex1 = new Complex1();
complex1.Complex2Property = new Complex2();
// Set property of property
complex1.Complex2Property.IntProperty = 7;
}
}
I assume your object is a struct or value type. The reason you can't do this for structs is that a struct is a value type - it gets copied around by value, not reference. So if I changed the above example to make Complex2 a struct, I could not do this line:
complex1.Complex2Property.IntProperty = 7;
Because the property is syntactic sugar for a get method which would return the struct by value which means a copy of the struct, not the same struct that the property holds. This means my change to that copy would not affect the original property's struct at all, accomplishing nothing except modifying a copy of my data that isn't the data in my property.
As for why the compiler doesn't do this for you? it definitely could, but it won't because you'd never want to actually do this. There's no value in modifying a copy of an object that you don't actually reassign to your property. This situation is a common error for developers who don't understand value vs reference types entirely (myself included!) and so the compiler chooses to warn you of your mistake.

For the compiler to allow myForm.ClientSize.Width = 500;, one of two things would be necessary: either the compiler would have to assume that the intended behavior is equivalent to:
var temp = myForm.ClientSize;
temp.Width = 500;
myForm.ClientSize = temp;
or else myForm would have to associate the name ClientSize with a method whose signature was:
void actupon_ClientSize<TParam>(ref Rectangle it, ref TParam param);
in which case the compiler could generate code similar to
myForm.actupon_ClientSize<int>((ref Rectangle r, ref int dummy)=>r.Width = 500, ref someDummyIntvar);
where someDummyIntVar would be an arbitrary value of type int [the second ref parameter would make it possible to pass parameters to the lambda without generating a closure]. If the Framework described a standard way for objects to properties to be exposed like that, it would make many types of programming safer and more convenient. Unfortunately, no such feature exists nor do I expect any future version of .NET to include it.
With regard to the first transformation, there are many cases where it would yield the desired effect, but also many where it would be unsafe. IMHO, there is no good reason why .NET shouldn't specify attributes which would indicate when various transformations are and are not safe, but they need for them has existed since Day One, and since the programmers responsible for .NET have consistently decided that they'd rather declare mutable structures to be "evil" than do anything that would make them be not evil, I doubt that will ever change either.

Passing objects by reference vs value

I just want to check my understanding of C#'s ways of handling things, before I delve too deeply into designing my classes. My current understanding is that:
Struct is a value type, meaning it actually contains the data members defined within.
Class is a reference type, meaning it contains references to the data members defined within.
A method signature passes parameters by value, which means a copy of the value is passed to the inside of the method, making it expensive for large arrays and data structures.
A method signature that defines a parameter with the ref or out keywords will instead pass a parameter by reference, which means a pointer to the object is provided instead.
What I don't understand is what happens when I invoke a method, what actually happens. Does new() get invoked? Does it just automagically copy the data? Or does it actually just point to the original object? And how does using ref and out affect this?

What I don't understand is what happens when I invoke a method, what actually happens. Does new() get invoked? Does it just automagically copy the data? Or does it actually just point to the original object? And how does using ref and out affect this?
The short answer:
The empty constructor will not be called automatically, and it actually just points to the original object.
using ref and out does not affect this.
The long answer:
I think it would be easier to understand how C# handles passing arguments to a function.
Actually everything is being passed by value
Really?! Everything by value?
Yes! Everything!
Of course there must be some kind of a difference between passing classes and simple typed objects, such as an Integer, otherwise, it would be a huge step back performance wise.
Well the thing is, that behind the scenes when you pass a class instance of an object to a function, what is really being passed to the function is the pointer to the class. the pointer, of course, can be passed by value without causing performance issues.
Actually, everything is being passed by value; it's just that when
you're "passing an object", you're actually passing a reference to that
object (and you're passing that reference by value).
once we are in the function, given the argument pointer, we can relate to the object passed by reference.
You don't actually need to do anything for this, you can relate directly to the instance passed as the argument (as said before, this whole process is being done behind the scenes).
After understanding this, you probably understand that the empty constructor will not be called automatically, and it actually just points to the original object.
EDITED:
As to the out and ref, they allow functions to change the value of an arguments and have that change persist outside of the scope of the function.
In a nutshell, using the ref keyword for value types will act as follows:
int i = 42;
foo(ref i);
will translate in c++ to:
int i = 42;
int* ptrI = &i;
foo(ptrI)
while omitting the ref will simply translate to:
int i = 42;
foo(i)
using those keywords for reference type objects, will allow you to reallocate memory to the passed argument, and make the reallocation persist outside of the scope of the function (for more details please refer to the MSDN page)
Side note:
The difference between ref and out is that out makes sure that the called function must assign a value to the out argument, while ref does not have this restriction, and then you should handle it by assigning some default value yourself, thus, ref Implies the the initial value of the argument is important to the function and might affect it's behaviour.

Passing a value-type variable to a method means passing a copy of the variable to the method. Any changes to the parameter that take place inside the method have no affect on the original data stored in the variable.
If you want the called method to change the value of the parameter, you have to pass it by reference, using the ref or out keyword.
When you pass a reference-type parameter by value, it is possible to change the data pointed to by the reference, such as the value of a class member. However, you cannot change the value of the reference itself; that is, you cannot use the same reference to allocate memory for a new class and have it persist outside the block. To do that, pass the parameter using the ref (or out) keyword.
Reference: Passing Parameters(C#)

Tragically, there is no way to pass an object by value in C# or VB.NET. I suggest instead you pass, for example, New Class1(Object1) where Object1 is an instance of Class1. You will have to write your own New method to do this but at least you then have an easy pass-by-value capability for Class1.

In C#, where do you use "ref" in front of a parameter?

There are a number of questions already on the definition of "ref" and "out" parameter but they seem like bad design. Are there any cases where you think ref is the right solution?
It seems like you could always do something else that is cleaner. Can someone give me an example of where this would be the "best" solution for a problem?

In my opinion, ref largely compensated for the difficulty of declaring new utility types and the difficulty of "tacking information on" to existing information, which are things that C# has taken huge steps toward addressing since its genesis through LINQ, generics, and anonymous types.
So no, I don't think there are a lot of clear use cases for it anymore. I think it's largely a relic of how the language was originally designed.
I do think that it still makes sense (like mentioned above) in the case where you need to return some kind of error code from a function as well as a return value, but nothing else (so a bigger type isn't really justified.) If I were doing this all over the place in a project, I would probably define some generic wrapper type for thing-plus-error-code, but in any given instance ref and out are OK.

Well, ref is generally used for specialized cases, but I wouldn't call it redundant or a legacy feature of C#. You'll see it (and out) used a lot in XNA for example. In XNA, a Matrix is a struct and a rather massive one at that (I believe 64 bytes) and it's generally best if you pass it to functions using ref to avoid copying 64 bytes, but just 4 or 8. A specialist C# feature? Certainly. Of not much use any more or indicative of bad design? I don't agree.

One area is in the use of small utility functions, like :
void Swap<T>(ref T a, ref T b) { T tmp = a; a = b; b = tmp; }
I don't see any 'cleaner' alternatives here. Granted, this isn't exactly Architecture level.

P/Invoke is the only place I can really think of a spot where you must use ref or out. Other cases, they can be convenient, but like you said, there is generally another, cleaner way.

What if you wanted to return multiple objects, that for some unknown reason are not tied together into a single object.
void GetXYZ( ref object x, ref object y, ref object z);
EDIT: divo suggested using OUT parameters would be more appropriate for this. I have to admit, he's got a point. I'll leave this answer here as a, for the record, this is an inadaquate solution. OUT trumps REF in this case.

I think the best uses are those that you usually see; you need to have both a value and a "success indicator" that is not an exception from a function.

One design pattern where ref is useful is a bidirectional visitor.
Suppose you had a Storage class that can be used to load or save values of various primitive types. It is either in Load mode or Save mode. It has a group of overloaded methods called Transfer, and here's an example for dealing with int values.
public void Transfer(ref int value)
{
if (Loading)
value = ReadInt();
else
WriteInt(value);
}
There would be similar methods for other primitive types - bool, string, etc.
Then on a class that needs to be "transferable", you would write a method like this:
public void TransferViaStorage(Storage s)
{
s.Transfer(ref _firstName);
s.Transfer(ref _lastName);
s.Transfer(ref _salary);
}
This same single method can either load the fields from the Storage, or save the fields to the Storage, depending what mode the Storage object is in.
Really you're just listing all the fields that need to be transferred, so it closely approaches declarative programming instead of imperative. This means that you don't need to write two functions (one for reading, one for writing) and given that the design I'm using here is order-dependent then it's very handy to know for sure that the fields will always be read/written in identical order.
The general point is that when a parameter is marked as ref, you don't know whether the method is going to read it or write to it, and this allows you to design visitor classes that work in one of two directions, intended to be called in a symmetrical way (i.e. with the visited method not needing to know which direction-mode the visitor class is operating in).
Comparison: Attributes + Reflection
Why do this instead of attributing the fields and using reflection to automatically implement the equivalent of TransferViaStorage? Because sometimes reflection is slow enough to be a bottleneck (but always profile to be sure of this - it's hardly ever true, and attributes are much closer to the ideal of declarative programming).

The real use for this is when you create a struct. Structs in C# are value types and therefore always are copied completely when passed by value. If you need to pass it by reference, for example for performance reasons or because the function needs to make changes to the variable, you would use the ref keyword.
I could see if someone has a struct with 100 values (obviously a problem already), you'd likely want to pass it by reference to prevent 100 values copying. That and returning that large struct and writing over the old value would likely have performance issues.

The obvious reason for using the "ref" keyword is when you want to pass a variable by reference. For example passing a value type like System.Int32 to a method and alter it's actual value. A more specific use might be when you want to swap two variables.
public void Swap(ref int a, ref int b)
{
...
}
The main reason for using the "out" keyword is to return multiple values from a method. Personally I prefer to wrap the values in a specialized struct or class since using the out parameter produces rather ugly code. Parameters passed with "out" - is just like "ref" - passed by reference.
public void DoMagic(out int a, out int b, out int c, out int d)
{
...
}

There is one clear case when you must use the 'ref' keyword. If the object is defined but not created outside the scope of the method that you intend to call AND the method you want to call is supposed to do the 'new' to create it, you must use 'ref'. e.g.{object a; Funct(a);} {Funct(object o) {o = new object; o.name = "dummy";} will NOT do a thing with object 'a' nor will it complain about it at either compile or run time. It just won't do anything. {object a; Funct(ref a);} {Funct(object ref o) {o = new object(); o.name = "dummy";} will result in 'a' being a new object with the name of "dummy". But if the 'new' was already done, then ref not needed (but works if supplied). {object a = new object(); Funct(a);} {Funct(object o) {o.name = "dummy";}

C# property and ref parameter, why no sugar?

I just ran across this error message while working in C#
A property or indexer may not be passed as an out or ref parameter
I known what caused this and did the quick solution of creating a local variable of the correct type, calling the function with it as the out/ref parameter and then assigning it back to the property:
RefFn(ref obj.prop);
turns into
{
var t = obj.prop;
RefFn(ref t);
obj.prop = t;
}
Clearly this would fail if the property doesn't support get and set in the current context.
Why doesn't C# just do that for me?
The only cases where I can think of where this might cause problems are:
threading
exceptions
For threading that transformation affects when the writes happen (after the function call vs. in the function call), but I rather suspect any code that counts on that would get little sympathy when it breaks.
For exceptions, the concern would be; what happens if the function assigns to one of several ref parameters than throws? Any trivial solution would result in all or none of the parameters being assigned to when some should be and some should not be. Again I don't think this would be supported use of the language.
Note: I understand the mechanics of why this error messages is generated. What I'm looking for is the rationale for why C# doesn't automatically implement the trivial workaround.

Because you're passing the result of the indexer, which is really the result of a method call. There's no guarantee that the indexer property also has a setter, and passing it by ref would lead to a false security on the developer's part when he thinks that his property is going to be set without the setter being called.
On a more technical level, ref and out pass the memory address of the object passed into them, and to set a property, you have to call the setter, so there's no guarantee that the property would actually be changed especially when the property type is immutable. ref and out don't just set the value upon return of the method, they pass the actual memory reference to the object itself.

Properties are nothing more than syntactic sugar over the Java style getX/setX methods. It doesn't make much sense for 'ref' on a method. In your instance it would make sense because your properties are merely stubbing out fields. Properties don't have to just be stubs, hence the framework cannot allow 'ref' on Properties.
EDIT: Well, the simple answer is that the mere fact that a Property getter or setter could include far more than just a field read/write makes it undesirable, not to mention possibly unexpected, to allow the sort of sugar you are proposing. This isn't to say I haven't been in need of this functionality before, just that I understand why they wouldn't want to provide it.

Just for info, C# 4.0 will have something like this sugar, but only when calling interop methods - partly due to the sheer propensity of ref in this scenario. I haven't tested it much (in the CTP); we'll have to see how it pans out...

You can use fields with ref/out, but not properties. The reason is that properties are really just a syntax short cut for special methods. The compiler actually translates get / set properties to corresponding get_X and set_X methods as the CLR has no immediate support for properties.

It wouldn't be thread-safe; if two threads simultaneously create their own copies of the property value and pass them to functions as ref parameters, only one of them ends up back in the property.
class Program
{
static int PropertyX { get; set; }
static void Main()
{
PropertyX = 0;
// Sugared from:
// WaitCallback w = (o) => WaitAndIncrement(500, ref PropertyX);
WaitCallback w = (o) => {
int x1 = PropertyX;
WaitAndIncrement(500, ref x1);
PropertyX = x1;
};
// end sugar
ThreadPool.QueueUserWorkItem(w);
// Sugared from:
// WaitAndIncrement(1000, ref PropertyX);
int x2 = PropertyX;
WaitAndIncrement(1000, ref x2);
PropertyX = x2;
// end sugar
Console.WriteLine(PropertyX);
}
static void WaitAndIncrement(int wait, ref int i)
{
Thread.Sleep(wait);
i++;
}
}
PropertyX ends up as 1, whereas a field or local variable would be 2.
That code sample also highlights the difficulties introduced by things like anonymous methods when asking the compiler to do sugary stuff.

The reason for this is that C# does not support "parameterful" properties that accept parameters passed by reference. It is interesting to note that the CLR does support this functionalty but C# does not.

When you pass ref/out prepended it means that you are passing a reference type which is stored in the heap.
Properties are wrapper methods, not variables.

If you're asking why the compiler doesn't substitute the field returned by the property's getter, it's because the getter can return a const or readonly or literal or something else that shouldn't be re-initialized or overwritten.

This site appears to have a work around for you. I have not tested it though, so I can't guarantee it will work. The example appears to use reflection in order to gain access to the get and set functions of the property. This is probably not a recommended approach, but it might accomplish what you're asking for.
http://www.codeproject.com/KB/cs/Passing_Properties_byref.aspx

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.