C# property and ref parameter, why no sugar?

C# property and ref parameter, why no sugar? - c#

I just ran across this error message while working in C#
A property or indexer may not be passed as an out or ref parameter
I known what caused this and did the quick solution of creating a local variable of the correct type, calling the function with it as the out/ref parameter and then assigning it back to the property:
RefFn(ref obj.prop);
turns into
{
var t = obj.prop;
RefFn(ref t);
obj.prop = t;
}
Clearly this would fail if the property doesn't support get and set in the current context.
Why doesn't C# just do that for me?
The only cases where I can think of where this might cause problems are:
threading
exceptions
For threading that transformation affects when the writes happen (after the function call vs. in the function call), but I rather suspect any code that counts on that would get little sympathy when it breaks.
For exceptions, the concern would be; what happens if the function assigns to one of several ref parameters than throws? Any trivial solution would result in all or none of the parameters being assigned to when some should be and some should not be. Again I don't think this would be supported use of the language.
Note: I understand the mechanics of why this error messages is generated. What I'm looking for is the rationale for why C# doesn't automatically implement the trivial workaround.

Because you're passing the result of the indexer, which is really the result of a method call. There's no guarantee that the indexer property also has a setter, and passing it by ref would lead to a false security on the developer's part when he thinks that his property is going to be set without the setter being called.
On a more technical level, ref and out pass the memory address of the object passed into them, and to set a property, you have to call the setter, so there's no guarantee that the property would actually be changed especially when the property type is immutable. ref and out don't just set the value upon return of the method, they pass the actual memory reference to the object itself.

Properties are nothing more than syntactic sugar over the Java style getX/setX methods. It doesn't make much sense for 'ref' on a method. In your instance it would make sense because your properties are merely stubbing out fields. Properties don't have to just be stubs, hence the framework cannot allow 'ref' on Properties.
EDIT: Well, the simple answer is that the mere fact that a Property getter or setter could include far more than just a field read/write makes it undesirable, not to mention possibly unexpected, to allow the sort of sugar you are proposing. This isn't to say I haven't been in need of this functionality before, just that I understand why they wouldn't want to provide it.

Just for info, C# 4.0 will have something like this sugar, but only when calling interop methods - partly due to the sheer propensity of ref in this scenario. I haven't tested it much (in the CTP); we'll have to see how it pans out...

You can use fields with ref/out, but not properties. The reason is that properties are really just a syntax short cut for special methods. The compiler actually translates get / set properties to corresponding get_X and set_X methods as the CLR has no immediate support for properties.

It wouldn't be thread-safe; if two threads simultaneously create their own copies of the property value and pass them to functions as ref parameters, only one of them ends up back in the property.
class Program
{
static int PropertyX { get; set; }
static void Main()
{
PropertyX = 0;
// Sugared from:
// WaitCallback w = (o) => WaitAndIncrement(500, ref PropertyX);
WaitCallback w = (o) => {
int x1 = PropertyX;
WaitAndIncrement(500, ref x1);
PropertyX = x1;
};
// end sugar
ThreadPool.QueueUserWorkItem(w);
// Sugared from:
// WaitAndIncrement(1000, ref PropertyX);
int x2 = PropertyX;
WaitAndIncrement(1000, ref x2);
PropertyX = x2;
// end sugar
Console.WriteLine(PropertyX);
}
static void WaitAndIncrement(int wait, ref int i)
{
Thread.Sleep(wait);
i++;
}
}
PropertyX ends up as 1, whereas a field or local variable would be 2.
That code sample also highlights the difficulties introduced by things like anonymous methods when asking the compiler to do sugary stuff.

The reason for this is that C# does not support "parameterful" properties that accept parameters passed by reference. It is interesting to note that the CLR does support this functionalty but C# does not.

When you pass ref/out prepended it means that you are passing a reference type which is stored in the heap.
Properties are wrapper methods, not variables.

If you're asking why the compiler doesn't substitute the field returned by the property's getter, it's because the getter can return a const or readonly or literal or something else that shouldn't be re-initialized or overwritten.

This site appears to have a work around for you. I have not tested it though, so I can't guarantee it will work. The example appears to use reflection in order to gain access to the get and set functions of the property. This is probably not a recommended approach, but it might accomplish what you're asking for.
http://www.codeproject.com/KB/cs/Passing_Properties_byref.aspx

Related

Why can't "this" for value types be boxed?

So I wanted to be able to mimic the "with" functionality of VB in C#, and came across a rather clever solution through StackOverflow:
public static x with<x>(this x item, Func<x,x> f)
{
item = f(item);
return item;
}
to implement you would do something like:
myObject = myObject.with(myVariable => {
//do things
});
However, I ran into a snag when I tried to implement this inside a struct with one of the struct's fields. It said that "anonymous methods [...] inside structs cannot access members of 'this' [...]."
I did some research on this and found an answer to this question, which states ultimately that "this" for value types cannot be boxed. After researching what boxing means for C#, it makes sense, considering that the parameter for the function doesn't have a defined type.
My question is, why can't "this" for value types be boxed?

The main point here is that for a method on a struct, the meaning of this is not a value (i.e. an value of your SomeStruct), but rather a reference (a ref SomeStruct, or effectively an in SomeStruct in the case of a readonly struct).
You can't box a managed pointer of this form - that isn't a scenario supported by the runtime. Managed pointers are only meant to be on the stack. In fact, currently you can't even have a ref SomeStruct field in a custom ref struct that can't escape the stack.
The compiler could cheat by pretending to do that - i.e. by dereferencing the managed pointer from a ref SomeStruct into a SomeStruct and creating a capture context where this is interpreted as "the SomeStruct that we dereferenced earlier", but ... then the compiler can't guarantee the same behaviours and outcomes (actually, I suspect a case could be made for this in the readonly struct scenario, but ... it is probably easier not to introduce that subtle distinction).
Instead, the compiler suggests that you effectively do the above step manually; since the compiler is no longer dealing in terms of this, it no longer has to pretend to respect the usual outcomes for dealing with this, and instead only has to guarantee the behaviours of an explicitly dereferenced copy of the value. That's why it advises (at least on current compiler versions):
Consider copying 'this' to a local variable outside the anonymous method, lambda expression or query expression and using the local instead.
Most lambdas, local methods, etc can therefore be achieved by the pragmatic step of:
MyStruct copy = this; // dereference
then in your lambda / local method / etc: instead of touching Something aka this.Something - touch copy.Something. It is now only copy that is being included in the capture-context, and copy is not bound by the ref SomeStruct rules (because: it isn't a ref SomeStruct - it is a SomeStruct).
It does, however, mean that if your intent was to mutate this (and have that visible in-place, rather than as a return value), then it won't work. You'll only be mutating the copy (i.e. copy). This is exactly what the compiler would have had to do anyway if it had lied, but at least now the copy step (dereference) is explicit and visible in your code.

I presume you know about setting properties during construction, which is similar (and more idiomatic of c#) ?
class MySpecialClass
{
public string Property1 {get;set;}
public int Length {get;set;}
public float Width {get;set;}
}
var p = new MySpecialClass
{
Property1 = "PropertyValue",
Length = 12,
OtherThing = 1.234F
};

Why are increment operators possible on properties while ref is not

Say you have a class like this
public class Foo
{
public int Bar { get; set; } = 42;
}
If you try to pass the property as a ref parameter the compiler issues the error
CS0206 A property or indexer may not be passed as an out or ref
parameter
This is understandable since in practice the property in the above example is compiled into get_Bar() and set_Bar() methods. But if you use an increment operator on the property, like
var foo = new Foo();
foo.Bar++;
it works as expected. To achieve this, the compiler needs to produce something like this pseudo code:
var foo = new Foo();
int tmp = foo.get_Bar();
tmp++;
foo.set_Bar(tmp);
So in theory the compiler could do a similar thing for ref like:
var foo = new Foo();
int tmp = foo.get_Bar();
DoSomething(ref tmp);
foo.set_Bar(tmp);
Is there a technical reason why the compiler doesn't do that or was this just a design decision of the C# team?

Like HansPassant said, this was a design decision that the C# team made when writing the C# specification, so you would have to ask one of them in order to get a proper answer.
If I was to hazard a guess, though, it would be that the amount of compiler magic to get passing a property with ref would cause enough unobvious operations to happen behind the scenes as to make the solution undesirable. For example, how incrementing/decrementing a property works currently is as you say: the program assigns the value of the property's backing field to a temporary variable, performs the operation, and reassigns the result to the property. It's a straightforward process that doesn't incorporate any difficult concepts.
To do the same behind the scenes magic to pass a property with ref, however, the process becomes a bit more involved. When a value type is passed by ref, the actual value getting passed through the parameter is a pointer to the value type variable. To do this for a property, though, you would have to do something akin to your second example. That would result in the address of the temporary variable, not the property itself, to be passed to the method. That kind of behavior could cause some unforeseen and difficult to understand consequences for someone trying to manipulate the ref parameter in certain ways.
So there's my guess, that the increment operator is simple to wrap because it only deals with values, whereas the ref keyword is more complicated because it has to worry about scopes and memory addresses as well.
EDIT: Another reason that occurred to me, for a field, any manipulations inside the called method would be reflected on the field itself. These manipulations could be seen by other threads and such that accessed that field during the method's execution (best practices about concurrent field accessibility aside).
For a parameter, however, any changes that happen inside the method wouldn't be visible until the method returned and the value copied back over. This would lead to inconsistent behavior between fields and properties, the cause of which wouldn't be readily apparent.
(Personally, I think this is a more probable reason to not support ref on properties.)

Why can we not set properties of properties?

I frequently find myself wanting to do something along these lines:
Form form = new Form();
form.ClientSize.Width = 500;
Of course the compiler will now complain that this code is not valid, since ClientSize is a property, and not a variable.
We can fix this by setting the ClientSize in its entirety:
form.ClientSize = new Size(500, ClientSize.Height);
Or, in general:
Size s = form.ClientSize;
s.Width = 500;
form.ClientSize = s; //only necessary if s is a value-type. (Right?)
But this is all looks unnecessary and obfuscated. Why can't the compiler do this for me? And of course, I'm asking about the general case, possibly involving even deeper levels of properties, not just the mundane example above
Basically, I'm asking why there is no syntactic sugar to translate the line form.ClientSize.Width = 500 into the above code. Is this simply a feature which hasn't yet been implemented, is it to avoid stacking of side effects from different getters and setters, to prevent confusion when one of the setters isn't defined, or is there a completely different reason why something like this doesn't exist?

Why can't the compiler do this for me?
It can. In fact, if you were programming in VB it would do exactly that. The C# compiler doesn't do this because it generally takes the philosophy of doing what you tell it to do; it is not a language that tries to guess at what you want to do and do that instead. If you tell it to do something silly, it'll just let you do something silly. Now this particular case is such a common source of bugs, and given these exact semantics is virtually certain to be a bug, so it does result in an error, because it's nice like that.
C# programmers learn to rely on the C# compiler never deciding to do something that you never told it to do, which can be a cause of confusion and problems when the compiler guess wrong about what you wanted to do.

I believe that you have a fundamental misunderstanding of .NET here. You can set properties of properties all day for class types because you're modifying the data of a reference without changing the reference. Take this code for example which compiles and runs fine:
class Program
{
public class Complex1
{
public Complex2 Complex2Property { get; set; }
}
public class Complex2
{
public int IntProperty { get; set; }
}
static void Main( string[] args )
{
// You must create instances of all properties to avoid a NullReferenceException
// prior to accessing said properties
var complex1 = new Complex1();
complex1.Complex2Property = new Complex2();
// Set property of property
complex1.Complex2Property.IntProperty = 7;
}
}
I assume your object is a struct or value type. The reason you can't do this for structs is that a struct is a value type - it gets copied around by value, not reference. So if I changed the above example to make Complex2 a struct, I could not do this line:
complex1.Complex2Property.IntProperty = 7;
Because the property is syntactic sugar for a get method which would return the struct by value which means a copy of the struct, not the same struct that the property holds. This means my change to that copy would not affect the original property's struct at all, accomplishing nothing except modifying a copy of my data that isn't the data in my property.
As for why the compiler doesn't do this for you? it definitely could, but it won't because you'd never want to actually do this. There's no value in modifying a copy of an object that you don't actually reassign to your property. This situation is a common error for developers who don't understand value vs reference types entirely (myself included!) and so the compiler chooses to warn you of your mistake.

For the compiler to allow myForm.ClientSize.Width = 500;, one of two things would be necessary: either the compiler would have to assume that the intended behavior is equivalent to:
var temp = myForm.ClientSize;
temp.Width = 500;
myForm.ClientSize = temp;
or else myForm would have to associate the name ClientSize with a method whose signature was:
void actupon_ClientSize<TParam>(ref Rectangle it, ref TParam param);
in which case the compiler could generate code similar to
myForm.actupon_ClientSize<int>((ref Rectangle r, ref int dummy)=>r.Width = 500, ref someDummyIntvar);
where someDummyIntVar would be an arbitrary value of type int [the second ref parameter would make it possible to pass parameters to the lambda without generating a closure]. If the Framework described a standard way for objects to properties to be exposed like that, it would make many types of programming safer and more convenient. Unfortunately, no such feature exists nor do I expect any future version of .NET to include it.
With regard to the first transformation, there are many cases where it would yield the desired effect, but also many where it would be unsafe. IMHO, there is no good reason why .NET shouldn't specify attributes which would indicate when various transformations are and are not safe, but they need for them has existed since Day One, and since the programmers responsible for .NET have consistently decided that they'd rather declare mutable structures to be "evil" than do anything that would make them be not evil, I doubt that will ever change either.

In C#, where do you use "ref" in front of a parameter?

There are a number of questions already on the definition of "ref" and "out" parameter but they seem like bad design. Are there any cases where you think ref is the right solution?
It seems like you could always do something else that is cleaner. Can someone give me an example of where this would be the "best" solution for a problem?

In my opinion, ref largely compensated for the difficulty of declaring new utility types and the difficulty of "tacking information on" to existing information, which are things that C# has taken huge steps toward addressing since its genesis through LINQ, generics, and anonymous types.
So no, I don't think there are a lot of clear use cases for it anymore. I think it's largely a relic of how the language was originally designed.
I do think that it still makes sense (like mentioned above) in the case where you need to return some kind of error code from a function as well as a return value, but nothing else (so a bigger type isn't really justified.) If I were doing this all over the place in a project, I would probably define some generic wrapper type for thing-plus-error-code, but in any given instance ref and out are OK.

Well, ref is generally used for specialized cases, but I wouldn't call it redundant or a legacy feature of C#. You'll see it (and out) used a lot in XNA for example. In XNA, a Matrix is a struct and a rather massive one at that (I believe 64 bytes) and it's generally best if you pass it to functions using ref to avoid copying 64 bytes, but just 4 or 8. A specialist C# feature? Certainly. Of not much use any more or indicative of bad design? I don't agree.

One area is in the use of small utility functions, like :
void Swap<T>(ref T a, ref T b) { T tmp = a; a = b; b = tmp; }
I don't see any 'cleaner' alternatives here. Granted, this isn't exactly Architecture level.

P/Invoke is the only place I can really think of a spot where you must use ref or out. Other cases, they can be convenient, but like you said, there is generally another, cleaner way.

What if you wanted to return multiple objects, that for some unknown reason are not tied together into a single object.
void GetXYZ( ref object x, ref object y, ref object z);
EDIT: divo suggested using OUT parameters would be more appropriate for this. I have to admit, he's got a point. I'll leave this answer here as a, for the record, this is an inadaquate solution. OUT trumps REF in this case.

I think the best uses are those that you usually see; you need to have both a value and a "success indicator" that is not an exception from a function.

One design pattern where ref is useful is a bidirectional visitor.
Suppose you had a Storage class that can be used to load or save values of various primitive types. It is either in Load mode or Save mode. It has a group of overloaded methods called Transfer, and here's an example for dealing with int values.
public void Transfer(ref int value)
{
if (Loading)
value = ReadInt();
else
WriteInt(value);
}
There would be similar methods for other primitive types - bool, string, etc.
Then on a class that needs to be "transferable", you would write a method like this:
public void TransferViaStorage(Storage s)
{
s.Transfer(ref _firstName);
s.Transfer(ref _lastName);
s.Transfer(ref _salary);
}
This same single method can either load the fields from the Storage, or save the fields to the Storage, depending what mode the Storage object is in.
Really you're just listing all the fields that need to be transferred, so it closely approaches declarative programming instead of imperative. This means that you don't need to write two functions (one for reading, one for writing) and given that the design I'm using here is order-dependent then it's very handy to know for sure that the fields will always be read/written in identical order.
The general point is that when a parameter is marked as ref, you don't know whether the method is going to read it or write to it, and this allows you to design visitor classes that work in one of two directions, intended to be called in a symmetrical way (i.e. with the visited method not needing to know which direction-mode the visitor class is operating in).
Comparison: Attributes + Reflection
Why do this instead of attributing the fields and using reflection to automatically implement the equivalent of TransferViaStorage? Because sometimes reflection is slow enough to be a bottleneck (but always profile to be sure of this - it's hardly ever true, and attributes are much closer to the ideal of declarative programming).

The real use for this is when you create a struct. Structs in C# are value types and therefore always are copied completely when passed by value. If you need to pass it by reference, for example for performance reasons or because the function needs to make changes to the variable, you would use the ref keyword.
I could see if someone has a struct with 100 values (obviously a problem already), you'd likely want to pass it by reference to prevent 100 values copying. That and returning that large struct and writing over the old value would likely have performance issues.

The obvious reason for using the "ref" keyword is when you want to pass a variable by reference. For example passing a value type like System.Int32 to a method and alter it's actual value. A more specific use might be when you want to swap two variables.
public void Swap(ref int a, ref int b)
{
...
}
The main reason for using the "out" keyword is to return multiple values from a method. Personally I prefer to wrap the values in a specialized struct or class since using the out parameter produces rather ugly code. Parameters passed with "out" - is just like "ref" - passed by reference.
public void DoMagic(out int a, out int b, out int c, out int d)
{
...
}

There is one clear case when you must use the 'ref' keyword. If the object is defined but not created outside the scope of the method that you intend to call AND the method you want to call is supposed to do the 'new' to create it, you must use 'ref'. e.g.{object a; Funct(a);} {Funct(object o) {o = new object; o.name = "dummy";} will NOT do a thing with object 'a' nor will it complain about it at either compile or run time. It just won't do anything. {object a; Funct(ref a);} {Funct(object ref o) {o = new object(); o.name = "dummy";} will result in 'a' being a new object with the name of "dummy". But if the 'new' was already done, then ref not needed (but works if supplied). {object a = new object(); Funct(a);} {Funct(object o) {o.name = "dummy";}

What's so bad about ref parameters?

I'm faced with a situation that I think can only be solved by using a ref parameter. However, this will mean changing a method to always accept a ref parameter when I only need the functionality provided by a ref parameter 5% of the time.
This makes me think "whoa, crazy, must find another way". Am I being stupid? What sort of problems can be caused by a ref parameter?
Edit
Further details were requested, I don't think they are entirely relevant to what I was asking but here we go.
I'm wanting to either save a new instance (which will update with the ID which may later be used) or retrieve an existing instance that matches some logic and update that, save it then change the reference of the new instance to point to the existing one.
Code may make it clearer:
protected override void BeforeSave(Log entity)
{
var newLog = entity;
var existingLog = (from log in repository.All()
where log.Stuff == newLog.Stuff
&& log.Id != newLog.Id
select log).SingleOrDefault();
if (existingLog != null)
{
// update the time
existingLog.SomeValue = entity.SomeValue;
// remove the reference to the new entity
entity = existingLog;
}
}
// called from base class which usually does nothing before save
public void Save(TEntity entity)
{
var report = validator.Validate(entity);
if (report.ValidationPassed)
{
BeforeSave(entity);
repository.Save(entity);
}
else
{
throw new ValidationException { Report = report };
}
}
It's the fact that I would be adding it in only for one child (so far) of the base class that prevents me using an overload (due to the fact I would have to duplicate the Save method). I also have the problem whereby I need to force them to use the ref version in this instance otherwise things won't work as expected.

Can you add an overload? Have one signature without the ref parameter, and one with it.
Ref parameters can be useful, and I'm glad they exist in C#, but they shouldn't be used without thought. Often if a method is effectively returning two values, it would be better either to split the method into two parts, or encapsulate both values in a single type. Neither of these covers every case though - there are definitely times when ref is the best option.

Perhaps use an overloaded function for this 5% case and leave the other function as is.
Unnecessary ref parameters can lead to bad design patterns, but if you have a specific need, there's no problem with doing this.

If you take the .NET Framework as a barometer of people's expectations of an API, consider that almost all of the String methods return the modified value, but leave the passed argument unchanged. String.Trim(), for instance, returns the trimmed String - it doesn't trim the String that was passed in as an argument.
Now, obviously, this is only feasible if you're willing to put return-values into your API. Also, if your function already returns a value, you run into the nasty possibility of creating a custom structure that contains your original return value as well as the newly changed object.
Ultimately, it's up to you and how you document your API. I've found in my experience though that my fellow programmers tend to expect my functions to act "like the .NET Framework functions". :)

A ref parameter won't cause problems per se. It's a documented feature of the language.
However it could cause social problems. Specifically, clients of your API might not expect a ref parameter simply because it's rare. You might modify a reference that the client doesn't expect.
Of course you can argue that this is the client's fault for not reading your API spec, and that would be true. But sometimes it's best to reduce surprise. Writing good code isn't just about following the rules and documenting stuff, it's also about making something naturally obvious to a human user.

An overload won't kill your application or its design. As long as the intent is clearly documented, it should be okay.
One thing that might be considered is mitigating your fears about the ref parameter through a different type of parameter. For example, consider this:
public class SaveArgs
{
public SaveArgs(TEntity value) { this.Value = value; }
public TEntity Value { get; private set;}
public int NewId { get; internal set; }
public bool NewIdGenerated { get; internal set; }
}
In your code, you simply pass a SaveArgs rather than the TEntity, so that you can modify its properties with more meaningful information. (Naturally, it'd be a better-designed class than what I have above.) But then you wouldn't have to worry about vague method interfaces, and you could return as much data as you needed to in a verbose class.
Just a thought.
EDIT: Fixed the code. My bad.

There is nothing wrong with using ref parameters that I can think of, in fact they can be very handy sometimes. I think they sometimes get a bad rap due to debugging since the value of the variable can change in the code logic and can sometimes be hard to track. It also makes things difficult when converting to things like WebServices where a "value" only pass will suffice.

The biggest issue I have with ref parameters is they make it difficult to use type inference as you must sometimes explicitly declare the type of the variable being used as the ref parameter.
Most of the time I use a ref parameter it's in a TryGet scenario. In general I've stopped using ref's in that scenario and instead opted for using a more functional style method by way of an option.
For instance. TryGetValue in dictionary switches from
bool TryGetValue(TKey key, out TValue value)
To
Option<Value> TryGetValue(TKey key)
Option available here: http://blogs.msdn.com/jaredpar/archive/2008/10/08/functional-c-providing-an-option-part-2.aspx

The most common use I've seen for ref parameters is as a way of returning multiple values. If that's the case, you should consider creating a class or struct that returns all the values as one object.
If you still want to use a ref but want to make it optional, add a function overload.

ref is just a tool. You should think: What is the best design pattern for what I am building?
Sometimes will be better to use an overloaded method.
Others will be better to return a custom type or a tuple.
Others will be better to use a global variable.
And others ref will be the right decision.

If your method only needs this ref parameter 5% of the time perhaps you need to break this method down. Of course without more details its hard to say but this to me smells like a case of violating single responsability principal. Perhaps overloading it will help.
As for your question there is no issue in my opinion passing a parameter as a reference although it is not a common thing to run into.

This is one of those things that F# or other functional programming languages solve a lot better with returning Tuple values. Its a lot more cleaner and terse syntax. In the book I am reading on F#, it actually points out the C# equivelant of using ref as doing the same thing in C# to return multiple parameters.
I have no idea if it is a bad practice or there is some underlying "booga booga" about ref parameters, to me they just feel as not clean syntax.

A void-returning function with a single reference parameter certainly looks funny to me. If I were reviewing this code, I'd suggest refactoring the BeforeSave() to include the call to Repository.Save() - renaming it, obviously. Why not just have one method that takes your possibly-new entity and guarantees that everything is saved properly? The caller doesn't do anything with the returned entity anyway.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.