Consider the following code (I have purposefully written MyPoint to be a reference type for this example)
public class MyPoint
{
    public int x;
    public int y;
}
It is universally acknowledged (in C# at least) that when you pass by reference, the method contains a reference to the object being manipulated, whereas when you pass by value, the method copies the value being manipulated, thus the value in global scope is not affected.
Example:
void Replace<T>(T a, T b)
{
    a = b;
}
int a = 1;
int b = 2;
Replace<int>(a, b);
// a and b remain unaffected in global scope since a and b are value types.
Here is my problem: MyPoint is a reference type, so I would expect the same operation on MyPoint to replace a with b in global scope.
Example:
MyPoint a = new MyPoint { x = 1, y = 2 };
MyPoint b = new MyPoint { x = 3, y = 4 };
Replace<MyPoint>(a, b);
// a and b remain unaffected in global scope since a and b...ummm!?
I expected a and b to point to the same reference in memory...can someone please clarify where I have gone wrong?
Re: OP's Assertion
It is universally acknowledged (in C# at least) that when you pass by reference, the method contains a reference to the object being manipulated, whereas when you pass by value, the method copies the value being manipulated ...
TL;DR
There's more to it than that. Unless you pass variables with the ref or out keywords, C# passes variables to methods by value, irrespective of whether the variable is a value type or a reference type.
If a variable is passed by reference, the called function may change what the caller's variable holds (i.e. it can reassign the original variable in the calling function).
If a variable is passed by value:
if the called function re-assigns the variable, this change is local to the called function only, and will not affect the original variable in the calling function
however, if changes are made to the variable's fields or properties by the called function, it will depend on whether the variable is a value type or a reference type in order to determine whether the calling function will observe the changes made to this variable.
Since this is all rather complicated, I would recommend avoiding passing by reference where possible. If you need to return multiple values from a function, use a composite class, a struct, or a Tuple as the return type instead of putting the ref or out keywords on parameters.
Also, when passing reference types around, a lot of bugs can be avoided by not changing (mutating) fields and properties of an object passed into a method (for example, use C#'s immutable properties to prevent changes to properties, and strive to assign properties only once, during construction).
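As a minimal sketch of the "return a tuple instead of using ref or out" advice above: the class name TupleReturnSketch and the method Divide are hypothetical names invented for this illustration, and the code assumes C# 7 or later for value tuples.

```csharp
using System;

public static class TupleReturnSketch
{
    // Hypothetical helper: hands back both results in a value tuple
    // instead of writing them through `out` or `ref` parameters.
    public static (int Quotient, int Remainder) Divide(int dividend, int divisor)
    {
        return (dividend / divisor, dividend % divisor);
    }

    public static void Main()
    {
        // Deconstruct the tuple into two local variables.
        var (quotient, remainder) = Divide(17, 5);
        Console.WriteLine($"{quotient} remainder {remainder}"); // prints "3 remainder 2"
    }
}
```

The caller's variables are only ever assigned by the caller, so there is no question about which method changed them.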
In Detail
The problem is that there are two distinct concepts:
Value Types (e.g. int) vs Reference Types (e.g. string, or custom classes)
Passing by Value (default behaviour) vs Passing by Reference (ref, out)
Unless you explicitly pass (any) variable by reference, by using the out or ref keywords, parameters are passed by value in C#, irrespective of whether the variable is a value type or reference type.
When passing value types (such as int, float or structs like DateTime) by value (i.e. without out or ref), the called function gets a copy of the entire value type (via the stack).
Any change to the value type, and any changes to any properties / fields of the copy will be lost when the called function is exited.
However, when passing reference types (e.g. custom classes like your MyPoint class) by value, it is the reference to the same, shared object instance which is copied and passed on the stack.
This means that:
If the passed object has mutable (settable) fields and properties, any changes to those fields or properties of the shared object are permanent (i.e. any changes to x or y are seen by anyone observing the object)
However, during method calls, the reference itself is still copied (passed by value), so if the parameter variable is reassigned, the change is made only to the local copy of the reference and will not be seen by the caller. This is why your code doesn't work as expected.
What happens here:
void Replace<T>(T a, T b) // Both a and b are passed by value
{
    a = b; // reassignment is localized to method `Replace`
}
for reference types T, means that the local (stack) variable a, which referenced the first object, is reassigned to reference the same object as the local variable b. This reassignment is local to this function only: as soon as execution leaves the function, the reassignment is lost.
If you really want to replace the caller's references, you'll need to change the signature like so:
void Replace<T>(ref T a, T b) // a is passed by reference
{
    a = b; // a is reassigned, and is also visible to the calling function
}
This changes the call to call by reference - in effect we are passing the address of the caller's variable to the function, which then allows the called method to alter the calling method's variable.
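To make the difference observable, here is a small sketch that runs both variants side by side. The method names ReplaceByValue and ReplaceByRef and the class ReplaceDemo are hypothetical, chosen only so both versions can live in one program; MyPoint is the class from the question.

```csharp
using System;

public class MyPoint
{
    public int x;
    public int y;
}

public static class ReplaceDemo
{
    // Both parameters are copies of the caller's references.
    public static void ReplaceByValue<T>(T a, T b)
    {
        a = b; // only the local copy is reassigned
    }

    // `a` is an alias for the caller's variable.
    public static void ReplaceByRef<T>(ref T a, T b)
    {
        a = b; // the caller's variable is reassigned
    }

    public static void Main()
    {
        var a = new MyPoint { x = 1, y = 2 };
        var b = new MyPoint { x = 3, y = 4 };

        ReplaceByValue(a, b);
        Console.WriteLine(object.ReferenceEquals(a, b)); // prints "False"

        ReplaceByRef(ref a, b);
        Console.WriteLine(object.ReferenceEquals(a, b)); // prints "True"
        Console.WriteLine(a.x);                          // prints "3"
    }
}
```

Note that the call site must also say ref, so the reader of the calling code can see that the variable may be reassigned.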
However, nowadays:
Passing by reference is generally regarded as a bad idea. Instead, pass data back in the return value, and if more than one value needs to be returned, use a Tuple or a custom class or struct which contains all of the return values.
Changing ('mutating') a shared value (and even reference) variable in a called method is frowned upon, especially by the Functional Programming community, as this can lead to tricky bugs, especially when using multiple threads. Instead, give preference to immutable variables, or if mutation is required, then consider changing a (potentially deep) copy of the variable. You might find topics around 'pure functions' and 'const correctness' interesting further reading.
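One possible shape for the "prefer immutable values, or change a copy" advice is sketched below. ImmutablePoint, WithX and MoveRight are hypothetical names introduced for this example, not part of the question's code; the sketch assumes C# 6 or later for get-only auto-properties.

```csharp
using System;

// Hypothetical immutable counterpart of MyPoint: the properties can
// only be assigned in the constructor.
public sealed class ImmutablePoint
{
    public int X { get; }
    public int Y { get; }

    public ImmutablePoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Returns a modified copy and leaves this instance untouched.
    public ImmutablePoint WithX(int newX) => new ImmutablePoint(newX, Y);
}

public static class ImmutableDemo
{
    // The method cannot alter the caller's object; it can only
    // hand back a new one.
    public static ImmutablePoint MoveRight(ImmutablePoint p, int dx) => p.WithX(p.X + dx);

    public static void Main()
    {
        var original = new ImmutablePoint(1, 2);
        var moved = MoveRight(original, 10);
        Console.WriteLine($"{original.X}, {moved.X}"); // prints "1, 11"
    }
}
```

With this design, whether the reference was passed by value or by reference stops mattering for the object's contents, because nothing can change them after construction.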
Edit
These two diagrams may help with the explanation.
Pass by value (reference types):
In your first instance (Replace<T>(T a,T b)), a and b are passed by value. For reference types, this means the references are copied onto the stack and passed to the called function.
Your initial code (I've called this main) allocates two MyPoint objects on the managed heap (I've called these point1 and point2), and then assigns two local variable references a and b, to reference the points, respectively (the light blue arrows):
MyPoint a = new MyPoint { x = 1, y = 2 }; // point1
MyPoint b = new MyPoint { x = 3, y = 4 }; // point2
The call to Replace<MyPoint>(a, b) then pushes a copy of the two references onto the stack (the red arrows). Method Replace sees these as the two parameters also named a and b, which still point to point1 and point2, respectively (the orange arrows).
The assignment a = b; then changes the Replace method's local variable a so that it now points to the same object as referenced by b (i.e. point2). However, note that this change is only to Replace's local (stack) variables, and it will only affect subsequent code in Replace (the dark blue line). It does NOT affect the calling function's variable references in any way, NOR does it change the point1 and point2 objects on the heap at all.
Pass by reference:
If, however, we change the signature to Replace<T>(ref T a, T b) and then change main to pass a by reference, i.e. Replace(ref a, b):
As before, two point objects are allocated on the heap.
Now, when Replace(ref a, b) is called, main's reference b (pointing to point2) is still copied during the call, but a is now passed by reference, meaning that the "address" of main's variable a is passed to Replace.
Now when the assignment a = b is made ...
It is the calling function's variable, main's a, which is updated to reference point2. The change made by the reassignment to a is now seen by both main and Replace. There are now no references to point1.
Changes to (heap allocated) object instances are seen by all code referencing the object
In both scenarios above, no changes were actually made to the heap objects point1 and point2; it was only the local variable references which were passed and reassigned.
However, if any changes were actually made to the heap objects point1 and point2, then all variable references to these objects would see these changes.
So, for example:
void main()
{
    MyPoint a = new MyPoint { x = 1, y = 2 }; // point1
    MyPoint b = new MyPoint { x = 3, y = 4 }; // point2
    // Passed by value, but the fields x and y are being changed
    DoSomething(a, b);
    // a and b have been changed!
    Assert.AreEqual(53, a.x);
    Assert.AreEqual(21, b.y);
}
public void DoSomething(MyPoint a, MyPoint b)
{
    a.x = 53;
    b.y = 21;
}
Now, when execution returns to main, all references to point1 and point2, including main's variables a and b, will 'see' the changes when they next read the values of x and y. Note that the variables a and b were still passed by value to DoSomething.
Changes to value types affect the local copy only
Value types (primitives like System.Int32 and System.Double, and structs like System.DateTime or your own structs) are copied in their entirety when passed into a call; local variables and parameters of value types typically live on the stack rather than on the heap. This leads to a major difference in behaviour, since changes made by the called function to a field or property of a value type will only be observed locally by the called function, because it is only mutating its own local copy of the value.
For example, consider the following code with an instance of the mutable struct System.Drawing.Rectangle:
public void SomeFunc(System.Drawing.Rectangle aRectangle)
{
    // Only the local SomeFunc copy of aRectangle is changed:
    aRectangle.X = 99;
    // Passes - the changes last for the scope of the copied variable
    Assert.AreEqual(99, aRectangle.X);
} // The copy aRectangle will be lost when the stack is popped.
// Which when called:
var myRectangle = new System.Drawing.Rectangle(10, 10, 20, 20);
// A copy of `myRectangle` is passed on the stack
SomeFunc(myRectangle);
// Test passes - the caller's struct has NOT been modified
Assert.AreEqual(10, myRectangle.X);
The above can be quite confusing and highlights why it is good practice to create your own custom structs as immutable.
The ref keyword works similarly for value types: the 'address' of the caller's value type variable is passed, so the called method can directly assign to, or mutate, the caller's variable.
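A short sketch of ref applied to a value type follows. Box is a hypothetical mutable struct standing in for System.Drawing.Rectangle so that the example has no dependency outside System; MoveByValue and MoveByRef are likewise invented names.

```csharp
using System;

// Hypothetical mutable struct used in place of System.Drawing.Rectangle.
public struct Box
{
    public int X;
    public int Width;
}

public static class RefStructDemo
{
    // Receives a full copy of the caller's struct.
    public static void MoveByValue(Box box)
    {
        box.X = 99; // changes the copy only
    }

    // Receives an alias for the caller's variable.
    public static void MoveByRef(ref Box box)
    {
        box.X = 99; // changes the caller's struct
    }

    public static void Main()
    {
        var myBox = new Box { X = 10, Width = 20 };

        MoveByValue(myBox);
        Console.WriteLine(myBox.X); // prints "10"

        MoveByRef(ref myBox);
        Console.WriteLine(myBox.X); // prints "99"
    }
}
```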
C# is actually pass by value. You get the illusion it's pass by reference, because when you pass a reference type you get a copy of the reference (the reference was passed by value). However, since your replace method is replacing that reference copy with another reference, it's effectively doing nothing (The copied reference goes out of scope immediately). You can actually pass by reference by adding the ref keyword:
void Replace<T>(ref T a, T b)
{
    a = b;
}
This will get you your desired result, but in practice is a little strange.
In C# all the params that you pass to a method are passed by value.
Now, before you shout, keep on reading:
A value type's value is the data itself, which is copied, while a reference type's value is actually a reference.
So when you pass an object's reference to a method and change that object, the changes will be visible outside the method as well, since you are manipulating the same memory in which the object was allocated.
public void Func(Point p){p.x = 4;}
Point p = new Point {x=3,y=4};
Func(p);
// p.x = 4, p.y = 4
Now let's look at this method:
public void Func2(Point p){
    p = new Point{x=5,y=5};
}
Func2(p);
// p.x = 4, p.y = 4
So no change occurred here. Why? Your method simply created a new Point and changed p's reference (which was passed by value), and therefore the change was local. You didn't manipulate the point; you changed the reference, and you did so locally.
And there comes the ref keyword that saves the day:
public void Func3(ref Point p){
    p = new Point{x=5,y=5};
}
Func3(ref p);
// p.x = 5, p.y = 5
The same occurred in your example. You assigned a point with a new reference, but you did it locally.
C# does not pass reference type objects by reference; rather, it passes the reference by value. This means you can mess around with their insides, but you can't change the caller's assignment itself.
Read this great piece by Jon Skeet for deeper understanding.
Have a look at the behaviour of a simple program in C#:
class Program
{
    static int intData = 0;
    static string stringData = string.Empty;

    public static void CallByValueForValueType(int data)
    {
        data = data + 5;
    }

    public static void CallByValueForRefrenceType(string data)
    {
        data = data + "Changes";
    }

    public static void CallByRefrenceForValueType(ref int data)
    {
        data = data + 5;
    }

    public static void CallByRefrenceForRefrenceType(ref string data)
    {
        data = data + "Changes";
    }

    static void Main(string[] args)
    {
        intData = 0;
        CallByValueForValueType(intData);
        Console.WriteLine($"CallByValueForValueType : {intData}");

        stringData = string.Empty;
        CallByValueForRefrenceType(stringData);
        Console.WriteLine($"CallByValueForRefrenceType : {stringData}");

        intData = 0;
        CallByRefrenceForValueType(ref intData);
        Console.WriteLine($"CallByRefrenceForValueType : {intData}");

        stringData = string.Empty;
        CallByRefrenceForRefrenceType(ref stringData);
        Console.WriteLine($"CallByRefrenceForRefrenceType : {stringData}");

        Console.ReadLine();
    }
}
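Output:
CallByValueForValueType : 0
CallByValueForRefrenceType : 
CallByRefrenceForValueType : 5
CallByRefrenceForRefrenceType : Changes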
You're not understanding what passing by reference means. Your Replace method receives a copy of the reference to each MyPoint object, i.e. the arguments are passed by value (which is actually the better way of doing it).
To pass by reference, so that a and b both reference the same point in memory, you need to add "ref" to the signature.
You don't have it quite right.
It is similar to Java: everything is passed by value! But you do have to know what the value is.
For primitive data types, the value is the number itself. In other cases it is a reference.
BUT, if you copy a reference to another variable, the new variable holds the same reference; it does not reference the original variable (thus it is not pass by reference as known in C++).
By default C# passes ALL arguments by value... that is why a and b remain unaffected in global scope in your examples. Here's a reference for those down voters.
To add more detail: in .NET, with the default "pass by value" applied to all C# method parameters, reference types behave differently in two scenarios. For every reference type (classes, which derive from System.Object), a copy of the "pointer" (to a memory block) to the original object is passed in and assigned to the method's parameter. This pointer is itself a value, and it is copied just as any value type would be. The object's contents are not copied, only the pointer, which points back to the original class object. I believe this is a 4-byte value in a 32-bit process (8 bytes in a 64-bit process). That is what is physically passed to methods for all reference types. So you now have a new method parameter holding a pointer that still points back to the original class object outside the method. You can now do two things with the variable holding the copied pointer:
You can change the ORIGINAL object outside the method by changing its properties inside your method. If "MyObject" is your variable with the copied pointer, you would do MyObject.myproperty = 6;, which changes "myproperty" inside the original object outside the method. This works because you passed in a pointer to the original object and assigned it to a new variable in your method. Note that this DOES change the referenced object outside the method.
Or, you can set your variable with the copied pointer to a new object and a new pointer, like so: MyObject = new SomeObject(); Here, we discarded the old copied pointer held by the variable above and assigned the variable a new pointer to a new object! Now we have lost the connection to the outside object and are only changing a new object.
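As a rough check of the "pointer size" remark above, the sketch below prints the native pointer size of the running process. It rests on the assumption that a managed object reference occupies one native pointer, which is why IntPtr.Size is used as a stand-in; ReferenceSizeSketch and PointerSizeInBytes are hypothetical names.

```csharp
using System;

public static class ReferenceSizeSketch
{
    // Assumption: a managed reference is the size of a native pointer,
    // so IntPtr.Size is used here as a stand-in for its size.
    public static int PointerSizeInBytes() => IntPtr.Size;

    public static void Main()
    {
        // The value depends on whether the process is 32-bit or 64-bit.
        Console.WriteLine($"Pointer size in this process: {PointerSizeInBytes()} bytes");
    }
}
```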
Take a look at the following program:
class Test
{
    List<int> myList = new List<int>();

    public void TestMethod()
    {
        myList.Add(100);
        myList.Add(50);
        myList.Add(10);
        ChangeList(myList);
        foreach (int i in myList)
        {
            Console.WriteLine(i);
        }
    }

    private void ChangeList(List<int> myList)
    {
        myList.Sort();
        List<int> myList2 = new List<int>();
        myList2.Add(3);
        myList2.Add(4);
        myList = myList2;
    }
}
I assumed myList would have been passed by ref, and the output would be
3
4
The list is indeed "passed by ref", but only the sort function takes effect. The following statement myList = myList2; has no effect.
So the output is in fact:
10
50
100
Can you help me explain this behavior? If indeed myList is not passed-by-ref (as it appears from myList = myList2 not taking effect), how does myList.Sort() take effect?
I was assuming even that statement to not take effect and the output to be:
100
50
10
Initially, it can be represented graphically as follows:
Then, the sort is applied: myList.Sort();
Finally, when you did myList' = myList2, you lost one of the references but not the original, and the collection stayed sorted.
If you use by reference (ref) then myList' and myList will become the same (only one reference).
Note: I use myList' to represent the parameter that you use in ChangeList (because you gave the same name as the original)
You are passing a reference to the list, but you aren't passing the list variable by reference - so when you call ChangeList the value of the variable (i.e. the reference - think "pointer") is copied - and changes to the value of the parameter inside ChangeList aren't seen by TestMethod.
try:
private void ChangeList(ref List<int> myList) {...}
...
ChangeList(ref myList);
This then passes a reference to the variable myList (the one used by TestMethod); now, if you reassign the parameter inside ChangeList you are also reassigning the variable that TestMethod reads.
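For completeness, here is a rough, self-contained version of the question's program with the ref change applied. The class name RefListDemo and the Run method are invented for the sketch; the list contents are the ones from the question.

```csharp
using System;
using System.Collections.Generic;

class RefListDemo
{
    private List<int> myList = new List<int> { 100, 50, 10 };

    public List<int> Run()
    {
        ChangeList(ref myList);  // pass the variable itself, not a copy of its value
        return myList;
    }

    private static void ChangeList(ref List<int> list)
    {
        list.Sort();                    // sorts the original list object
        list = new List<int> { 3, 4 };  // with ref, this rebinds the caller's variable
    }

    static void Main()
    {
        foreach (int i in new RefListDemo().Run())
        {
            Console.WriteLine(i);       // prints 3, then 4
        }
    }
}
```

The sorted original list is simply no longer referenced by myList once ChangeList returns.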
Here is an easy way to understand it
Your List is an object created on heap. The variable myList is a
reference to that object.
In C# you never pass objects, you pass their references by value.
When you access the list object via the passed reference in
ChangeList (while sorting, for example) the original list is changed.
The assignment in the ChangeList method changes only the method's local copy of the reference, hence no changes are made to the original list (it is still on the heap, but the method's variable no longer refers to it).
This link will help you in understanding pass by reference in C#.
Basically, when an object of a reference type is passed by value to a method, the method can modify the contents of the object only through that object's own members.
For example, the List.Sort() method changes the list's contents, but if you assign some other object to the same variable, that assignment is local to the method. That is why myList remains unchanged.
If we pass an object of a reference type using the ref keyword, then we can assign some other object to the same variable, and that changes which object the caller's variable refers to.
(Edit: this is the updated version of the documentation linked above.)
C# does not copy the object at all when it passes a reference type by value; it copies only the reference. ICloneable plays no part in this (List<T> does not implement it, and it would not be called during parameter passing even if it did).
What this means is that the List itself is not copied: the parameter and the caller's variable both refer to the same List, and therefore to the same elements.
If you change the List through the parameter, for example by sorting it, you change the original List too, since it is the same object. However, you then change what the parameter myList references entirely, to a new List, and from then on only the caller's variable refers to the original List and its integers.
Read the Passing Reference-Type Parameters section from this MSDN article on "Passing Parameters" for more information.
"How do I Clone a Generic List in C#" from StackOverflow talks about how to make a deep copy of a List.
While I agree with what everyone has said above, I have a different take on this code.
Basically you're assigning the new list to the local variable myList, not the global one.
If you change the signature from ChangeList(List myList) to private void ChangeList(), you'll see the output of 3, 4.
Here's my reasoning...
Even though the list is said to be passed by reference, think of it as passing a pointer variable by value.
When you call ChangeList(myList) you're passing the pointer held by (global)myList. This is now stored in the (local)myList variable, so your (local)myList and (global)myList are pointing to the same list.
Now you do a sort => it works because (local)myList is referencing the same list as (global)myList.
Next you create a new list and assign its pointer to your (local)myList. But as soon as the function exits, the (local)myList variable is destroyed.
HTH
class Test
{
    List<int> myList = new List<int>();
    public void TestMethod()
    {
        myList.Add(100);
        myList.Add(50);
        myList.Add(10);
        ChangeList();
        foreach (int i in myList)
        {
            Console.WriteLine(i);
        }
    }
    private void ChangeList()
    {
        myList.Sort();
        List<int> myList2 = new List<int>();
        myList2.Add(3);
        myList2.Add(4);
        myList = myList2;
    }
}
Use the ref keyword.
Look at the definitive reference here to understand passing parameters.
To be specific, look at this, to understand the behavior of the code.
EDIT: Sort works on the same reference (which is passed by value), and hence the values are ordered. However, assigning a new instance to the parameter has no effect on the caller, because the parameter is passed by value unless you add ref.
Adding ref lets you make the caller's variable refer to a new instance of List in your case. Without ref, you can work on the existing object, but you can't make the caller's variable point to something else.
Two pieces of memory are involved for an object of a reference type: the variable, which for a local variable typically lives on the stack, and the object on the heap. The variable (in effect a pointer) holds a reference to the object on the heap, where the actual values are stored.
When the ref keyword is not used, only a copy of the variable is created and passed to the method; it refers to the same object on the heap. Therefore, if you change something in the heap object, those changes persist. If you change the copied pointer, by assigning it to refer to another place on the heap, it does not affect the original pointer outside the method.
Passing Value Type parameters to functions in c# is by value unless you use the ref or out keyword on the parameter. But does this also apply to Reference Types?
Specifically I have a function that takes an IList<Foo>. Will the list passed to my function be a copy of the list with copy of its contained objects? Or will modifications to the list also apply for the caller? If so - Is there a clever way I can go about passing a copy?
public void SomeFunction()
{
    IList<Foo> list = new List<Foo>();
    list.Add(new Foo());
    DoSomethingWithCopyOfTheList(list);
    ..
}
public void DoSomethingWithCopyOfTheList(IList<Foo> list)
{
    // Do something
}
All parameters are passed by value unless you explicitly use ref or out. However, when you pass an instance of a reference type, you pass the reference by value. I.e. the reference itself is copied, but since it is still pointing to the same instance, you can still modify the instance through this reference. I.e. the instance is not copied. The reference is.
If you want to make a copy of the list itself, List<T> has a handy constructor, that takes an IEnumerable<T>.
You're not alone; this confuses a lot of people.
Here's how I like to think of it.
A variable is a storage location.
A variable can store something of a particular type.
There are two kinds of types: value types and reference types.
The value of a variable of reference type is a reference to an object of that type.
The value of a variable of value type is an object of that type.
A formal parameter is a kind of variable.
There are three kinds of formal parameters: value parameters, ref parameters, and out parameters.
When you use a variable as an argument corresponding to a value parameter, the value of the variable is copied into the storage associated with the formal parameter. If the variable is of value type, then a copy of the value is made. If the variable is of reference type, then a copy of the reference is made, and the two variables now refer to the same object. Either way, a copy of the value of the variable is made.
When you use a variable as an argument corresponding to an out or ref parameter the parameter becomes an alias for the variable. When you say:
void M(ref int x) { ...}
...
int y = 123;
M(ref y);
what you are saying is "x and y now are the same variable". They both refer to the same storage location.
I find that much easier to comprehend than thinking about how the alias is actually implemented -- by passing the managed address of the variable to the formal parameter.
Is that clear?
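A small sketch of the aliasing described above, contrasted with a plain value parameter. M, x and y are from the answer; the method N and the class name AliasDemo are added purely for comparison.

```csharp
using System;

static class AliasDemo
{
    // x is an alias for the caller's variable.
    public static void M(ref int x)
    {
        x = x + 1;
    }

    // x is a separate variable that starts with a copy of the caller's value.
    public static void N(int x)
    {
        x = x + 1;
    }

    static void Main()
    {
        int y = 123;

        N(y);
        Console.WriteLine(y); // 123: only the copy was incremented

        M(ref y);
        Console.WriteLine(y); // 124: x and y were the same variable
    }
}
```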
The reference to the list is passed by value, so the method receives the same list, not a copy; if you modify the list in DoSomethingWithCopyOfTheList, you modify the list for the caller as well.
You can create a copy of a list by creating a new one:
var newList = new List<Foo>(oldList);
Your list is not copied: the method receives a reference to the same list. If you want to pass a copy of the list you can do:
IList<Foo> clone = new List<Foo>(list);
If you add or remove elements in clone, it won't modify list,
but modifications to the elements themselves will be visible through both lists.
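To make that distinction concrete, here is a minimal sketch. Foo is the type from the question, but its Name field is an assumption added so there is something to modify.

```csharp
using System;
using System.Collections.Generic;

// Foo is from the question; the Name field is hypothetical.
class Foo
{
    public string Name = "";
}

static class ShallowCopyDemo
{
    static void Main()
    {
        IList<Foo> list = new List<Foo> { new Foo { Name = "first" } };
        IList<Foo> clone = new List<Foo>(list);  // new list, same Foo instances

        clone.Add(new Foo { Name = "second" });
        Console.WriteLine(list.Count);           // 1: the original list is unchanged

        clone[0].Name = "renamed";
        Console.WriteLine(list[0].Name);         // renamed: both lists share this Foo
    }
}
```

If the elements must be independent as well, each element has to be copied explicitly.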
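When you pass a reference type by value (without the ref or out keywords), you may modify the referenced object inside the method, and all changes will be visible to the caller's code.
To solve your problem you may explicitly create a copy and pass this copy to your function, or, if the method should not be able to add or remove elements at all, you may pass a read-only wrapper (AsReadOnly is defined on List<T>, not on IList<T>, and it wraps the list rather than copying it):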
list.AsReadOnly();
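A quick sketch of how the wrapper returned by List<T>.AsReadOnly() behaves. The TryAdd helper and the class name are invented for the example; note that the wrapper is a view over the original list, not a copy.

```csharp
using System;
using System.Collections.Generic;

static class ReadOnlyDemo
{
    // Hypothetical helper: reports whether the list accepted the new item.
    public static bool TryAdd(IList<int> list, int item)
    {
        try
        {
            list.Add(item);
            return true;
        }
        catch (NotSupportedException)
        {
            return false;  // read-only collections reject modification
        }
    }

    static void Main()
    {
        var source = new List<int> { 1, 2 };
        IList<int> readOnly = source.AsReadOnly();

        Console.WriteLine(TryAdd(readOnly, 3)); // False

        source.Add(3);                          // the owner can still modify the list
        Console.WriteLine(readOnly.Count);      // 3: the wrapper reflects the change
    }
}
```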
When passing reference types, you pass the reference. This is an important concept.
If you pass a reference
byref, you pass the reference (pointer) directly.
byval, you pass a copy of the reference (pointer).
A reference is not the instance referenced. A reference is analogous to a pointer.
To pass a copy of an instance of a reference type, you must first make a copy yourself and pass a reference to the copy. That way you will not be modifying the original instance.
While using keyword ref, calling code needs to initialize passed arguments, but with keyword out we need not do so.
Why don't we use out everywhere?
What is exact difference between the two?
Please give example of a situation in which we need to use ref and can't use out?
The answer is given in this MSDN article. From that post:
The two parameter passing modes addressed by out and ref are subtly different, however they are both very common. The subtle difference between these modes leads to some very common programming errors. These include:
not assigning a value to an out parameter in all control flow paths
not assigning a value to variable which is used as a ref parameter
Because the C# language assigns different definite assignment rules to these different parameter passing modes, these common coding errors are caught by the compiler as being incorrect C# code.
The crux of the decision to include both ref and out parameter passing modes was that allowing the compiler to detect these common coding errors was worth the additional complexity of having both ref and out parameter passing modes in the language.
out is a special form of ref where the referenced memory does not need to be initialized before the call.
In this case the C# compiler enforces that the out parameter is assigned before the method returns and that it is not read before it has been assigned.
Two examples where out doesn't work but ref does:
void NoOp(out int value) // value must be assigned before method returns
{
}
void Increment(out int value) // value cannot be used before it has been assigned
{
value = value + 1;
}
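For contrast, here is a sketch of variants that do compile. SetToZero is a made-up replacement for NoOp (an out method has to assign something), and Increment is switched to ref so it can read the incoming value.

```csharp
using System;

static class OutVersusRef
{
    // out: the method must assign the parameter before it returns.
    public static void SetToZero(out int value)
    {
        value = 0;
    }

    // ref: the method may read the caller's value before writing to it.
    public static void Increment(ref int value)
    {
        value = value + 1;
    }

    static void Main()
    {
        int a;                // no initial value needed for out
        SetToZero(out a);
        Console.WriteLine(a); // 0

        Increment(ref a);     // a is definitely assigned by now
        Console.WriteLine(a); // 1
    }
}
```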
None of these answers satisfied me, so here's my take on ref versus out.
My answer is a summary of the following two pages:
ref (C# Reference)
out (C# Reference)
Compare
Both the method definition and the calling method must explicitly use the ref / out keyword
Both keywords cause parameters to be passed by reference (even value types)
However, there is no boxing when a value type is passed by reference
Properties cannot be passed via out or ref, because properties are really methods
ref / out are not considered to be part of the method signature at compile time, thus methods cannot be overloaded if the only difference between them is that one of the methods takes a ref argument and the other takes an out argument
Contrast
ref
Must be initialized before it is passed
Can use to pass a value to the method
out
Does not have to be initialized before it is passed
Called method is required to assign a value before the method returns
Cannot be used to pass a value to the method
Examples
Won't compile because only difference in method signatures is ref / out:
public void Add(out int i) { }
public void Add(ref int i) { }
Using ref keyword:
public void PrintNames(List<string> names)
{
    int index = 0; // initialize first (#1)
    foreach(string name in names)
    {
        IncrementIndex(ref index);
        Console.WriteLine(index.ToString() + ". " + name);
    }
}
public void IncrementIndex(ref int index)
{
    index++; // initial value was passed in (#2)
}
Using out keyword:
public void PrintNames(List<string> names)
{
    foreach(string name in names)
    {
        int index; // not initialized (#1)
        GetIndex(out index);
        Console.WriteLine(index.ToString() + ". " + name);
    }
}
public void GetIndex(out int index)
{
    index = IndexHelper.GetLatestIndex(); // needs to be assigned a value (#2 & #3)
}
Author's Random Remarks
In my opinion, the concept of using the out keyword is similar to using the Output enum value of ParameterDirection for declaring output parameters in ADO.NET
Arrays in C# are reference types, so a method receives a reference to the same array, but in order for a reassignment of the array reference to affect the reference in the calling code, the ref keyword must be used
Example:
public void ReassignArray(ref int[] array)
{
array = new int[10]; // now the array in the calling code
// will point to this new object
}
For more info on reference types versus value types, see Passing Reference-Type Parameters (C# Programming Guide)
A contrived example of when you'd need to use ref and not out is as follows:
public void SquareThisNumber(ref int number)
{
number = number * number;
}
int number = 4;
SquareThisNumber(ref number);
Here we want number to be an in-out variable, so we use ref. If we had used out, the compiler would have given an error saying we used an out parameter before assigning it.
The ref keyword allows you to change the value of a parameter. The method being called can be an intermediate link in the calling chain. A method using the out keyword can only be used at the beginning of a calling chain.
Another advantage is that the existing value can be used in the logic of the method and still hold the return value.
In Oracle, functions have explicit IN (the default, and what you get if you don't set a direction), IN/OUT, and OUT parameters. The C# equivalents are a normal parameter (just the parameter), ref [parameter], and out [parameter].
The compiler knows that out variables need not be set before the call. This allows them to be merely declared before use. However, it also knows that they must be set before the function they are passed to returns.
When we pass a variable to a method using the out keyword, it is treated quite differently: we are not really passing a value into the method. Instead, we are collecting (outing) a value from the method's body into the out variable at the place where we call the method.
So an out variable holds the output of the processing done within the method body, and this is why it must be assigned, and may then be modified, within the method body.
An out variable is used when we need to return multiple values from a particular method.
With a ref variable, by contrast, we need to initialize it first, because its memory location is handed to the method as a parameter and the method may read it. Think what would happen if we did not initialize it before passing it?
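As a rough illustration of returning multiple values through out parameters, here is a sketch in the style of the framework's TryParse methods. TryDivide and its parameter names are invented for the example.

```csharp
using System;

static class MultipleResults
{
    // Hypothetical helper: returns success plus two results via out parameters.
    public static bool TryDivide(int dividend, int divisor,
                                 out int quotient, out int remainder)
    {
        if (divisor == 0)
        {
            // out parameters must be assigned on every path, including failure.
            quotient = 0;
            remainder = 0;
            return false;
        }

        quotient = dividend / divisor;
        remainder = dividend % divisor;
        return true;
    }

    static void Main()
    {
        int q, r;  // declared but not initialized; the method fills them in

        if (TryDivide(17, 5, out q, out r))
        {
            Console.WriteLine(q + " remainder " + r); // 3 remainder 2
        }
    }
}
```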