List passed by ref - help me explain this behaviour - c#

Take a look at the following program:
class Test
{
    List<int> myList = new List<int>();

    public void TestMethod()
    {
        myList.Add(100);
        myList.Add(50);
        myList.Add(10);
        ChangeList(myList);
        foreach (int i in myList)
        {
            Console.WriteLine(i);
        }
    }

    private void ChangeList(List<int> myList)
    {
        myList.Sort();
        List<int> myList2 = new List<int>();
        myList2.Add(3);
        myList2.Add(4);
        myList = myList2;
    }
}
I assumed myList would be passed by ref, and the output would be:
3
4
The list is indeed "passed by ref", but only the sort function takes effect. The following statement myList = myList2; has no effect.
So the output is in fact:
10
50
100
Can you help me explain this behavior? If indeed myList is not passed-by-ref (as it appears from myList = myList2 not taking effect), how does myList.Sort() take effect?
I was assuming even that statement to not take effect and the output to be:
100
50
10

Initially, myList and the parameter (call it myList') both refer to the same list object.
Then the sort is applied through the parameter: myList.Sort() sorts that shared object.
Finally, when you did myList' = myList2, the parameter stopped referring to the original list, but the original variable still refers to it, so the collection stayed sorted.
If you pass by reference (ref), then myList' and myList become the same variable (only one reference).
Note: I use myList' to represent the parameter that you use in ChangeList (because you gave it the same name as the original).

You are passing a reference to the list, but you aren't passing the list variable by reference, so when you call ChangeList the value of the variable (i.e. the reference; think "pointer") is copied, and changes to the value of the parameter inside ChangeList aren't seen by TestMethod.
Try:
private void ChangeList(ref List<int> myList) {...}
...
ChangeList(ref myList);
This then passes a reference to the variable myList itself (the one used in TestMethod); now, if you reassign the parameter inside ChangeList, you are also reassigning the variable that TestMethod sees.
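The difference can be seen side by side in a minimal sketch. The class name RefDemo and the helper names ChangeListByRef, WithoutRef and WithRef are made up for this illustration; only ChangeList mirrors the question.

```csharp
using System;
using System.Collections.Generic;

public static class RefDemo
{
    // The parameter is a copy of the caller's reference.
    static void ChangeList(List<int> list)
    {
        list.Sort();                    // mutates the shared list object
        list = new List<int> { 3, 4 };  // rebinds only the local copy
    }

    // With ref, the parameter is an alias for the caller's variable.
    static void ChangeListByRef(ref List<int> list)
    {
        list.Sort();
        list = new List<int> { 3, 4 };  // rebinds the caller's variable too
    }

    public static string WithoutRef()
    {
        var myList = new List<int> { 100, 50, 10 };
        ChangeList(myList);
        return string.Join(",", myList);
    }

    public static string WithRef()
    {
        var myList = new List<int> { 100, 50, 10 };
        ChangeListByRef(ref myList);
        return string.Join(",", myList);
    }

    public static void Main()
    {
        Console.WriteLine(WithoutRef()); // 10,50,100
        Console.WriteLine(WithRef());    // 3,4
    }
}
```

Without ref the caller keeps the original (now sorted) list; with ref the caller's variable ends up pointing at the new list.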

Here is an easy way to understand it.
Your List is an object created on the heap. The variable myList is a reference to that object.
In C# you never pass objects, you pass their references by value.
When you access the list object via the passed reference in ChangeList (while sorting, for example), the original list is changed.
The assignment in the ChangeList method is made to the method's copy of the reference, hence no changes are made to the original list (it is still on the heap, just no longer referenced by the method's variable).

This link will help you in understanding pass by reference in C#.
Basically, when an object of a reference type is passed by value to a method, the method can modify the contents of the object only through the object's own members.
For example, the List.Sort() method changes the list's contents, but if you assign some other object to the same variable, that assignment is local to the method. That is why myList remains unchanged.
If we pass an object of a reference type using the ref keyword, then we can assign some other object to the same variable, and that replaces the object the caller sees.
(Edit: this is the updated version of the documentation linked above.)

C# does not copy the object when it passes a reference type by value; it copies only the reference. (ICloneable plays no part in parameter passing, and the List class does not implement it anyway.)
What this means is that the List itself is not copied: the parameter and the caller's variable both refer to the same List, and therefore to the same elements.
If you change the List through the parameter (for example by sorting it), you change the original List also, since both references point to the same object. However, you then change what the parameter myList references entirely, to a new List, and now only the caller's variable references the original List.
Read the Passing Reference-Type Parameters section from this MSDN article on "Passing Parameters" for more information.
"How do I Clone a Generic List in C#" from StackOverflow talks about how to make a deep copy of a List.

While I agree with what everyone has said above, I have a different take on this code.
Basically, you're assigning the new list to the local variable myList (the parameter), not to the class-level field (which I'll call "global" below).
If you change the signature of ChangeList(List myList) to private void ChangeList(), you'll see the output of 3, 4.
Here's my reasoning...
Even though the list is passed by reference, think of it as passing a pointer variable by value.
When you call ChangeList(myList), you're passing the pointer held in (global)myList. This is stored in the (local)myList variable, so now your (local)myList and (global)myList are pointing to the same list.
Now you do a sort => it works because (local)myList is referencing the same list as (global)myList.
Next you create a new list and assign the pointer to it to your (local)myList. But as soon as the function exits, the (local)myList variable is destroyed.
HTH
class Test
{
    List<int> myList = new List<int>();

    public void TestMethod()
    {
        myList.Add(100);
        myList.Add(50);
        myList.Add(10);
        ChangeList();
        foreach (int i in myList)
        {
            Console.WriteLine(i);
        }
    }

    private void ChangeList()
    {
        myList.Sort();
        List<int> myList2 = new List<int>();
        myList2.Add(3);
        myList2.Add(4);
        myList = myList2;
    }
}

Use the ref keyword.
Look at the definitive reference here to understand passing parameters.
To be specific, look at this, to understand the behavior of the code.
EDIT: Sort works on the same reference (that is passed by value), and hence the values are ordered. However, assigning a new instance to the parameter won't affect the caller, because the parameter is passed by value, unless you put ref.
Putting ref lets you make the caller's variable refer to a new instance of List in your case. Without ref, you can work on the existing object, but you can't make the caller's variable point to something else.

Two pieces of memory are involved for an object of a reference type: the variable (often on the stack) and the object on the heap. The variable (aka a pointer) contains a reference to the part on the heap, where the actual values are stored.
When the ref keyword is not used, just a copy of the variable's contents is created and passed to the method: a reference to the same part of the heap. Therefore, if you change something in the heap part, those changes will stay. If you change the copied pointer, by assigning it to refer to another place in the heap, it will not affect the original pointer outside of the method.

Related

Why reversing a string array inside method does not persist

So I've got a simple string array. Passed it to a function that reverses its input and then displayed the content of the array. I was expecting the contents of the array to reverse since arrays are passed by reference, but the string array did not change.
string[] words = { "Metal", "Gear", "is", "Awesome!" };
mutateArray(ref words);
foreach (string word in words)
    Console.Write(word + " ");
This is my mutateArray function:
public static void mutateArray(ref string[] arr)
{
    arr = arr.Reverse().ToArray();
}
I know that mutateArray's changes to the array persist once I state that the parameter must be passed in with the keyword ref.
Aren't all arrays passed in by reference by default?
Why do the changes persist when the keyword ref is involved?
What's the difference between passing a reference type (classes, interfaces, array, delegates) by value vs passing them by reference (with the keyword ref)?
The reason this doesn't work how you expect is that Reverse() does not actually reverse the contents of the array in place but rather produces a new sequence with the reversed contents of the original (which ToArray() then turns into a new array). That's why it works once you pass the array by reference: then, you're actually replacing the entire original array in the calling method with a new one created in mutateArray.
If you had a method that did the reversing in-place, you could pass in the original array (not using ref), and after the method call, the array would be in reverse order.
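As a rough sketch of such an in-place method, Array.Reverse rearranges the elements of the existing array, so no ref is needed. The names ReverseDemo and ReverseInPlace are invented for this example.

```csharp
using System;

public static class ReverseDemo
{
    // No ref needed: Array.Reverse reorders the elements of the
    // array object that both the caller and this method refer to.
    public static void ReverseInPlace(string[] arr)
    {
        Array.Reverse(arr);
    }

    public static void Main()
    {
        string[] words = { "Metal", "Gear", "is", "Awesome!" };
        ReverseInPlace(words);
        Console.WriteLine(string.Join(" ", words)); // Awesome! is Gear Metal
    }
}
```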
All parameters are passed by value by default in C#. For reference types like array, this means the reference is passed by value.
ref causes the variable to be passed by reference into the function. This effectively means the arr parameter in mutateArray is an alias for words in the caller. This is why an assignment to arr results in a change in words after mutateArray has exited.
Passing a reference type by value into a function means a copy of the reference is made. Without the ref modifier, arr in mutateArray is a different variable containing a reference to the same object as words in the caller. Assigning to arr in this case has no effect on words in the caller. Note that you can mutate the array through the shared reference, but arr and words are separate storage locations.
You're confusing a Reference Type with passing by Reference. The ref keyword can be applied to both Value and Reference types. Even Reference types aren't passed by reference by default. Instead the reference is passed by value to the method.
Based on the documentation from MSDN, the changes should persist. That's the whole purpose of using the ref keyword with a Reference Type.
The difference is that when you pass a Reference Type by Reference, you are able to change which object the original variable refers to, rather than just the instance you see inside your method. Check the previously linked documentation for more details.
words is a reference to an array. Just consider it to contain the memory address of that array.
When you give it to mutateArray as a parameter (without the ref keyword), its VALUE will be copied into arr. So arr is a different variable than words, but they contain the same value (= memory address). This means that they refer to the same object (the string array).
You can change the contents of the object, but words (and arr) will still be referring to it.
If you assign arr to a different object, then its value changes, so it will refer to a different object than words.
However, if you use the ref keyword, then arr and words are the SAME variable. That means if you change arr's value (= assign it to a new object), you are also changing words's value, so words will refer to the same, new, object.
Maybe all of this is not technically 100% correct, but it's the way I like to think about it in order to understand how it works.
The ref keyword is why the following Swap method works in C#; without it, the method would just change its own inner variables (and basically do nothing).
public void Swap<T>(ref T a, ref T b) {
    T temp = a;
    a = b;
    b = temp;
}
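To see the effect of ref here, the following sketch puts the same body in two methods. SwapDemo and SwapCopies are hypothetical names, and the methods are made static only so the example is self-contained.

```csharp
using System;

public static class SwapDemo
{
    // a and b are aliases for the caller's variables.
    public static void Swap<T>(ref T a, ref T b)
    {
        T temp = a;
        a = b;
        b = temp;
    }

    // Same body without ref: only the method's own copies are exchanged.
    public static void SwapCopies<T>(T a, T b)
    {
        T temp = a;
        a = b;
        b = temp;
    }

    public static void Main()
    {
        int x = 1, y = 2;
        SwapCopies(x, y);
        Console.WriteLine(x + " " + y); // 1 2
        Swap(ref x, ref y);
        Console.WriteLine(x + " " + y); // 2 1
    }
}
```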
Your question is totally understandable and for a moment I was confused as well.
So here is the explanation.
Short explanation
When you pass a reference type to a method and make a change to its property, the object outside the method CAN SEE the property change because the object itself remains the same.
When you pass a reference type to a method and make a change to the instance itself, the object outside the method CANNOT SEE the change because inside the method you basically started pointing to another object. So the object changed inside the method and remained there as a stranger.
Long explanation
Suppose you have a reference type instance where you get the value from database.
using (var context = new MyAdventureWorksEntities2())
{
    Product p = context.Products.Where(item => item.ProductID == 1000).First();
    Console.WriteLine(p.Name); // p.Name = "INITIAL NAME"
    UpdateName(p);
    Console.WriteLine(p.Name);
}
And here is your UpdateName method:
public static void UpdateName(Product p)
{
p.Name = "UPDATED NAME";
}
This code emits the following result:
INITIAL NAME
UPDATED NAME
HOWEVER, if you change the method to the following:
public static void UpdateName(Product p)
{
    using (var context = new MyAdventureWorksEntities2())
    {
        p = context.Products.Where(item => item.ProductID == 1003).First();
        // p.Name = "ANOTHER PRODUCT NAME"
    }
}
your result will be:
INITIAL NAME
INITIAL NAME
Note that I didn't touch the ref keyword at all.
And perhaps after those examples the short description will be much more comprehensible.

"Storing" value types inside an ArrayList

The ArrayList class can only contain references to objects but what happens when you store a value type such as integers?
string str = "Hello";
int i = 50;
ArrayList arraylist = new ArrayList();
arraylist.Add(str); // Makes perfect sense: a reference to the string
                    // object (instance) "Hello" is added at index 0.
arraylist.Add(i);   // What happens here? How can a reference point to a
                    // value type? Is the value type automatically converted
                    // to an object and thereafter added to the ArrayList?
It's called "boxing": automagically the int is converted to a reference type. This does cost some performance.
See also Boxing and Unboxing.
If you pull up the ArrayList class in ILSpy, you'll see that the backing store is:
private object[] _items;
and that the Add method accepts an instance of type object:
public virtual int Add(object value) { ... }
So when you call Add with an integer, .NET boxes the integer and then it gets added to the _items array in the ArrayList as an object.
Incidentally, if you need an ArrayList of just integers and you are using the .NET 2.0 Framework or later, you should use the List<T> (a.k.a. generic List) class instead, which will perform better since it avoids having to box an int when storing or retrieving it from the list (see the Performance Considerations section in that last link).
It's called boxing. A "box" holds a copy of the struct along with details of what type it is.
MSDN : http://msdn.microsoft.com/en-us/library/yz2be5wk%28v=vs.80%29.aspx
In Framework 2.0+ Microsoft gave us generics, which are faster and more effective:
MSDN : http://msdn.microsoft.com/en-us/library/ms172192.aspx
ArrayList.Add() takes any value and adds it as an object, so the integer value is automatically converted (boxed) and then added to the ArrayList.
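A small sketch of what boxing implies in practice: the box holds its own copy of the value, so later changes to the original variable do not reach the list. The class and method names (BoxingDemo, BoxedValueAfterChange, GenericValue) are invented for this illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    public static int BoxedValueAfterChange()
    {
        int i = 50;
        var arrayList = new ArrayList();
        arrayList.Add(i);          // boxing: 50 is copied into a new object
        i = 99;                    // changes only the local variable
        return (int)arrayList[0];  // unboxing cast; the box still holds 50
    }

    public static int GenericValue()
    {
        var list = new List<int>();
        list.Add(50);              // stored as an int, no box is created
        return list[0];            // no cast needed
    }

    public static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange()); // 50
        Console.WriteLine(GenericValue());          // 50
    }
}
```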

C# odd object behavior

I noticed something in C# when dealing with custom objects that I found to be a little odd. I am certain it is just a lack of understanding on my part so maybe someone can enlighten me.
If I create a custom object and then I assign that object to the property of another object and the second object modifies the object assigned to it, those changes are reflected in the same class that did the assigning even though nothing is returned.
You want that in English? Here is an example:
class MyProgram
{
    static void Main()
    {
        var myList = new List<string>();
        myList.Add("I was added from MyProgram.Main().");
        var myObject = new SomeObject();
        myObject.MyList = myList;
        myObject.DoSomething();
        foreach (string s in myList)
            Console.WriteLine(s); // This displays both strings.
    }
}

public class SomeObject
{
    public List<string> MyList { get; set; }

    public void DoSomething()
    {
        this.MyList.Add("I was added from SomeObject.DoSomething()");
    }
}
In the above sample I would have thought that, because SomeObject.DoSomething() returns void, this program would only display "I was added from MyProgram.Main().". However, the List<string> in fact contains both that line and "I was added from SomeObject.DoSomething()".
Here is another example. In this example the string remains unchanged. What is the difference and what am I missing?
class MyProgram
{
    static void Main()
    {
        var myString = "I was set in MyProgram.Main()";
        var myObject = new SomeObject();
        myObject.MyString = myString;
        myObject.DoSomething();
        Console.WriteLine(myString); // Displays original string.
    }
}

public class SomeObject
{
    public string MyString { get; set; }

    public void DoSomething()
    {
        this.MyString = "I was set in SomeObject.DoSomething().";
    }
}
This program sample ends up displaying "I was set in MyProgram.Main()". After seeing the results of the first sample I would have assumed that the second program would have overwritten the string with "I was set in SomeObject.DoSomething().". I think I must be misunderstanding something.
This isn't odd, or strange. When you create a class, you create reference type. When you pass references to objects around, modifications to the objects they refer to are visible to anyone that holds a reference to that object.
var myList = new List<string>();
myList.Add("I was added from MyProgram.Main().");
var myObject = new SomeObject();
myObject.MyList = myList;
myObject.DoSomething();
So in this block of code, you instantiate a new instance of List<string> and assign a reference to that instance to the variable myList. Then you add "I was added from MyProgram.Main()." to the list referred to by myList. Then you assign a reference to that same list to myObject.MyList (to be explicit, both myList and myObject.MyList are referring to the same List<string>!). Then you invoke myObject.DoSomething(), which adds "I was added from SomeObject.DoSomething()" to myObject.MyList. Since both myList and myObject.MyList are referring to the same List<string>, they will both see this modification.
Let's go by way of analogy. I have a piece of paper with a telephone number on it. I photocopy that piece of paper and give it to you. We both have a piece of paper with the same telephone number on it. Now I call up that number and tell the person on the other end of the line to put a banner up on their house that says "I was added from MyProgram.Main()." You call up the person on the other end of the line to put a banner up on their house that says "I was added from SomeObject.DoSomething()". Well, the person who lives at the house that has that telephone number is now going to have two banners outside their house. One that says
I was added from MyProgram.Main().
and another that says
I was added from SomeObject.DoSomething()
Make sense?
Now, in your second example, it's a little trickier.
var myString = "I was set in MyProgram.Main()";
var myObject = new SomeObject();
myObject.MyString = myString;
myObject.DoSomething();
You start by creating a new string whose value is "I was set in MyProgram.Main()" and assign a reference to that string to myString. Then you assign a reference to that same string to myObject.MyString. Again, both myString and myObject.MyString are referring to that same string whose value is "I was set in MyProgram.Main()". But then you invoke myObject.DoSomething which has this interesting line
this.MyString = "I was set in SomeObject.DoSomething().";
Well, now you've created a new string whose value is "I was set in SomeObject.DoSomething()." and assigned a reference to that string to myObject.MyString. Note that you never changed the reference that myString holds. So now, myString and myObject.MyString are referring to different strings!
Let's go by analogy again. I have a piece of paper with a web address on it. I photocopy that piece of paper and give it to you. We both have a piece of paper with the same web address on it. You cross out that web address and write down a different address. It doesn't affect what I see on my piece of paper!
Finally, a lot of people in this thread are yammering about the immutability of string. What is going on here has nothing to do with the immutability of string.
It's absolutely correct:
myObject.MyList = myList;
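This line assigns the reference held by myList to the myObject property.
To check this, call GetHashCode() on myList and on myObject.MyList and compare the results; Object.ReferenceEquals(myList, myObject.MyList) is the more reliable test, since equal hash codes alone do not prove identity.
We are talking about different pointers to the same memory location, if you wish.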
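A minimal sketch of that identity check, assuming a SomeObject class shaped like the one in the question; SameInstanceDemo and its method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public class SomeObject
{
    public List<string> MyList { get; set; }
}

public static class SameInstanceDemo
{
    // True: the property and the local variable hold the same reference.
    public static bool AssignedListIsSameInstance()
    {
        var myList = new List<string>();
        var myObject = new SomeObject();
        myObject.MyList = myList;
        return object.ReferenceEquals(myList, myObject.MyList);
    }

    // False: a copied list is a different object with equal contents.
    public static bool CopiedListIsSameInstance()
    {
        var myList = new List<string> { "a" };
        var copy = new List<string>(myList);
        return object.ReferenceEquals(myList, copy);
    }

    public static void Main()
    {
        Console.WriteLine(AssignedListIsSameInstance()); // True
        Console.WriteLine(CopiedListIsSameInstance());   // False
    }
}
```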
Whether or not a method returns something, has nothing to do with what happens inside it.
You seem to be confused regarding what assignment actually means.
Let's start from the beginning.
var myList = new List<string>();
allocates a new List<string> object in memory and puts a reference to it into myList variable.
There is currently just one instance of List<string> created by your code but you can store references to it in different places.
var theSameList = myList;
var sameOldList = myList;
someObject.MyList = myList;
Right now myList, theSameList, sameOldList and someObject.MyList (which is in turn stored in a private field of SomeObject automagically generated by compiler) all refer to the same object.
Have a look at these:
var bob = new Person();
var guyIMetInTheBar = bob;
alice.Daddy = bob;
harry.Uncle = bob;
itDepartment.Head = bob;
There is just one instance of Person, and many references to it.
It's only natural that if our Bob grew a year older, the Age seen through each of these references would have increased.
It's the same object.
If a city was renamed, you'd expect all maps to be re-printed with its new name.
You find it strange that
those changes are reflected in the same class that did the assigning
—but wait, changes are not reflected. There's no copying under the hood. They're just there, because it's the same object, and if you change it, wherever you access it from, you access its current state.
So it matters not where you add an item to the list: as long as you're referring to the same list, you'll see the item being added.
As for your second example, I see Jason has already provided you with a much better explanation than I could possibly deliver so I won't go into that.
It will suffice if I say:
Strings are immutable in .NET, you can't modify an instance of string for a variety of reasons.
Even if they were mutable (like List<T> that has its internal state modifiable via methods), in your second example, you're not changing the object, you're changing the reference.
var goodGuy = jack;
alice.Lover = jack;
alice.Lover = mike;
Would alice's change of mood make jack a bad guy? Certainly not.
Similarly, changing myObject.MyString doesn't affect local variable myString. You don't do anything to the string itself (and in fact, you can't).
You are confusing the two kinds of objects.
A List<string> is a list of type string, which means it can take strings :)
When you call the Add method, it adds the string literal to its collection of strings.
At the time you call your DoSomething() method, the same list reference is available to it as the one you had in Main. Hence you could see both strings when you printed to the console.
Don't forget that your variables hold references to objects. In the first example, you create a List<> object and assign it to your new object. You only hold a reference to a list; in this case, you now hold two references to the same list.
In the second example, you assign a specific string object to your instance.
Alex - you wrote -
In the above sample I would have thought that, because SomeObject.DoSomething() returns void, this program would only display "I was added from MyProgram.Main().". However, the List in fact contains both that line and "I was added from SomeObject.DoSomething()".
This is not the case. The void return type just means the function does not return a value. This has nothing to do with the this.MyList.Add method you are invoking in the DoSomething() method. You do have two references to the same object: myList and the MyList in the SomeObject.
This is how reference types behave and is expected. myList and myObject.MyList are references to the same List object in heap memory.
In the second example, strings are immutable, but they are still reference types, so on the line
myObject.MyString = myString;
the reference held by myString is copied to myObject.MyString (the reference is copied by value; the string's contents are not copied).
String is a bit special because it is a reference type that often feels like a value type thanks to its immutability (once you have created a string you can't change it, only make a new one, but this is somewhat hidden from you by the implementation).
In the first example, you are working with a mutable object, and it is always accessed by reference. All references to MyList in different objects refer to the same thing.
In the other case, strings behave a bit differently. Assigning a different string literal (i.e. text between quotes) makes the property refer to another instance of String, completely separate from the original one. You CANNOT modify a string, just create a new one.
UPDATE
Jason is right, it has nothing to do with String immutability ... but ....
I can't help but think that string immutability has its say in here. Not in THIS concrete example, but if SomeObject.DoSomething's code was this: this.MyString += "I was updated in SomeObject.DoSomething()."; then you would have to explain that a new String is created by the concatenation, and the first string is not updated.
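The += case can be sketched as follows. SomeObject here is a cut-down version of the class from the question, and StringDemo and CallerStringUnchanged are hypothetical names.

```csharp
using System;

public class SomeObject
{
    public string MyString { get; set; }

    public void DoSomething()
    {
        // += builds a new string and stores its reference in MyString;
        // the string object the caller still refers to is not touched.
        MyString += " I was updated in SomeObject.DoSomething().";
    }
}

public static class StringDemo
{
    public static bool CallerStringUnchanged()
    {
        string myString = "I was set in MyProgram.Main().";
        var myObject = new SomeObject { MyString = myString };
        myObject.DoSomething();
        return myString == "I was set in MyProgram.Main()."
            && !object.ReferenceEquals(myString, myObject.MyString);
    }

    public static void Main()
    {
        Console.WriteLine(CallerStringUnchanged()); // True
    }
}
```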

Are ILists passed by value?

Passing Value Type parameters to functions in c# is by value unless you use the ref or out keyword on the parameter. But does this also apply to Reference Types?
Specifically I have a function that takes an IList<Foo>. Will the list passed to my function be a copy of the list with copy of its contained objects? Or will modifications to the list also apply for the caller? If so - Is there a clever way I can go about passing a copy?
public void SomeFunction()
{
    IList<Foo> list = new List<Foo>();
    list.Add(new Foo());
    DoSomethingWithCopyOfTheList(list);
    ..
}

public void DoSomethingWithCopyOfTheList(IList<Foo> list)
{
    // Do something
}
All parameters are passed by value unless you explicitly use ref or out. However, when you pass an instance of a reference type, you pass the reference by value. I.e. the reference itself is copied, but since it is still pointing to the same instance, you can still modify the instance through this reference. I.e. the instance is not copied. The reference is.
If you want to make a copy of the list itself, List<T> has a handy constructor, that takes an IEnumerable<T>.
You're not alone; this confuses a lot of people.
Here's how I like to think of it.
A variable is a storage location.
A variable can store something of a particular type.
There are two kinds of types: value types and reference types.
The value of a variable of reference type is a reference to an object of that type.
The value of a variable of value type is an object of that type.
A formal parameter is a kind of variable.
There are three kinds of formal parameters: value parameters, ref parameters, and out parameters.
When you use a variable as an argument corresponding to a value parameter, the value of the variable is copied into the storage associated with the formal parameter. If the variable is of value type, then a copy of the value is made. If the variable is of reference type, then a copy of the reference is made, and the two variables now refer to the same object. Either way, a copy of the value of the variable is made.
When you use a variable as an argument corresponding to an out or ref parameter the parameter becomes an alias for the variable. When you say:
void M(ref int x) { ...}
...
int y = 123;
M(ref y);
what you are saying is "x and y now are the same variable". They both refer to the same storage location.
I find that much easier to comprehend than thinking about how the alias is actually implemented -- by passing the managed address of the variable to the formal parameter.
Is that clear?
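The aliasing described above can be tried out with a short sketch. M is taken from the snippet in this answer; N, AliasDemo, AfterRef and AfterValue are names invented for the comparison.

```csharp
using System;

public static class AliasDemo
{
    // x is an alias for the caller's variable.
    static void M(ref int x) { x = 456; }

    // x is a separate variable holding a copy of the caller's value.
    static void N(int x) { x = 456; }

    public static int AfterRef()
    {
        int y = 123;
        M(ref y);
        return y;
    }

    public static int AfterValue()
    {
        int y = 123;
        N(y);
        return y;
    }

    public static void Main()
    {
        Console.WriteLine(AfterRef());   // 456
        Console.WriteLine(AfterValue()); // 123
    }
}
```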
A reference to the list is passed (by value), so if you modify the list in DoSomethingWithCopyOfTheList, you modify the list for the caller as well.
You can create a copy of a list by creating a new one:
var newList = new List<Foo>(oldList);
Your list is not copied when it is passed; only the reference is. If you want to pass a copy of the list you can do:
IList<Foo> clone = new List<Foo>(list);
If you add or remove elements in clone, it won't modify list,
but modifications to the elements themselves will be visible through both lists.
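Both effects of such a shallow copy can be observed in a small sketch. Foo's Value property is an assumption made for the example (the question does not show Foo's members), as are the names CopyDemo, OriginalCountAfterAddToClone and OriginalValueAfterElementChange.

```csharp
using System;
using System.Collections.Generic;

public class Foo
{
    public int Value { get; set; } // hypothetical member, for illustration
}

public static class CopyDemo
{
    public static int OriginalCountAfterAddToClone()
    {
        IList<Foo> list = new List<Foo> { new Foo { Value = 1 } };
        IList<Foo> clone = new List<Foo>(list);
        clone.Add(new Foo { Value = 2 });  // affects the clone only
        return list.Count;
    }

    public static int OriginalValueAfterElementChange()
    {
        IList<Foo> list = new List<Foo> { new Foo { Value = 1 } };
        IList<Foo> clone = new List<Foo>(list);
        clone[0].Value = 42;               // same Foo object in both lists
        return list[0].Value;
    }

    public static void Main()
    {
        Console.WriteLine(OriginalCountAfterAddToClone());    // 1
        Console.WriteLine(OriginalValueAfterElementChange()); // 42
    }
}
```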
When you pass a reference type by value (without the ref or out keywords), you may modify the referenced object inside the method, and all changes will be visible in the caller's code.
To solve your problem, you may explicitly create a copy and pass this copy to your function, or you may pass a read-only wrapper:
list.AsReadOnly();
When passing reference types, you pass the reference. This is an important concept.
If you pass a reference
byref, you pass the reference (pointer) directly.
byval, you pass a copy of the reference (pointer).
A reference is not the instance referenced. A reference is analogous to a pointer.
To pass a copy of an instance of a reference type, you must first make a copy yourself and pass a reference to the copy. That way you will not be modifying the original instance.

Does foreach() iterate by reference?

Consider this:
List<MyClass> obj_list = get_the_list();
foreach( MyClass obj in obj_list )
{
obj.property = 42;
}
Is obj a reference to the corresponding object within the list so that when I change the property the change will persist in the object instance once constructed somewhere?
Yes, obj is a reference to the current object in the collection (assuming MyClass is in fact a class). If you change any properties via the reference, you're changing the object, just like you would expect.
Be aware however, that you cannot change the variable obj itself as it is the iteration variable. You'll get a compile error if you try. That means that you can't null it and if you're iterating value types, you can't modify any members as that would be changing the value.
The C# language specification states (8.8.4): "The iteration variable corresponds to a read-only local variable with a scope that extends over the embedded statement."
Yes, until you change the generic type from List to IEnumerable..
You've asked two different questions here; let's take them in order.
Does a foreach loop iterate by reference?
If you mean in the same sense as a C++ for loop by reference, then no. C# does not have local variable references in the same sense as C++ and hence doesn't support this type of iteration.
Will the change be persisted
Assuming that MyClass is a reference type, the answer is yes. A class is a reference type in .NET, and hence the iteration variable holds a reference to the same object that the list holds, not a copy of it. This would not be true for a value type.
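The class versus value type difference can be sketched like this. ClassItem, StructItem and ForeachDemo are hypothetical types for the illustration, and the indexed loop is just one possible way to update structs stored in a List<T>.

```csharp
using System;
using System.Collections.Generic;

public class ClassItem { public int Value; }
public struct StructItem { public int Value; }

public static class ForeachDemo
{
    public static int UpdateClassItems()
    {
        var items = new List<ClassItem> { new ClassItem(), new ClassItem() };
        foreach (ClassItem item in items)
        {
            item.Value = 42;  // item refers to the object stored in the list
        }
        return items[0].Value;
    }

    public static int UpdateStructItems()
    {
        var items = new List<StructItem> { new StructItem(), new StructItem() };
        // foreach (StructItem item in items) { item.Value = 42; }
        // The loop above does not compile: item is a read-only copy.
        for (int i = 0; i < items.Count; i++)
        {
            StructItem copy = items[i];  // the indexer returns a copy
            copy.Value = 42;
            items[i] = copy;             // write the modified copy back
        }
        return items[0].Value;
    }

    public static void Main()
    {
        Console.WriteLine(UpdateClassItems());  // 42
        Console.WriteLine(UpdateStructItems()); // 42
    }
}
```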
Well, it happened to me that my changes were not updated in a foreach loop when I iterated through var collection:
var players = this.GetAllPlayers();
foreach (Player player in players)
{
player.Position = 1;
}
When I changed var to List it started working.
You can in this instance (using a List<T>), but if you were iterating over the generic IEnumerable<T>, then it becomes dependent on its implementation.
If it was still a List<T> or T[] for instance, all would work as expected.
The big gotcha comes when you are working with an IEnumerable<T> that was constructed using yield. In this case, you can no longer modify properties of T within an iteration and expect them to be present if you iterate the same IEnumerable<T> again.
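A sketch of that gotcha, under the assumption that GetAllPlayers is an iterator method that creates its Player objects on the fly (the original answer does not show its implementation). YieldDemo and the two Sum... helpers are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Player { public int Position; }

public static class YieldDemo
{
    // Hypothetical stand-in for GetAllPlayers(): a lazy iterator.
    static IEnumerable<Player> GetAllPlayers()
    {
        yield return new Player();
        yield return new Player();
    }

    public static int SumAfterLazyUpdate()
    {
        IEnumerable<Player> players = GetAllPlayers();
        foreach (Player player in players)
        {
            player.Position = 1;
        }
        // Enumerating again re-runs the iterator and builds fresh Players.
        return players.Sum(p => p.Position);
    }

    public static int SumAfterMaterializedUpdate()
    {
        List<Player> players = GetAllPlayers().ToList(); // objects created once
        foreach (Player player in players)
        {
            player.Position = 1;
        }
        return players.Sum(p => p.Position);
    }

    public static void Main()
    {
        Console.WriteLine(SumAfterLazyUpdate());         // 0
        Console.WriteLine(SumAfterMaterializedUpdate()); // 2
    }
}
```

Materializing the sequence once (here with ToList()) is one way to make the updates stick, because later iterations then see the same objects.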
Maybe it's interesting for you to learn that since C# 7.3 it's possible to change values by reference, provided that the enumerator's Current property returns a reference return value (ref T). The following would be valid (verbatim copy from the MS docs):
Span<int> storage = stackalloc int[10];
int num = 0;
foreach (ref int item in storage)
{
item = num++;
}
Read more about this new feature at
C# foreach statement | Microsoft Docs.
This is true as long as it is not a struct.
Well, without understanding exactly what you mean by "iterate by reference", I can't answer specifically yes or no, but I can say that what's going on under the surface is that, each time client code runs a foreach, an "enumerator" object is obtained from the collection for the life of the foreach; it keeps track of the current position in the collection being iterated over, and each time your foreach iterates, it "delivers" one item and advances to the next item...
This happens regardless of whether the items in the collection you are iterating over are value types or reference types.
obj is a reference to an item inside the List, hence if you change its properties the change will persist. Now what you should be concerned about is whether get_the_list(); is making a deep copy of the List or returning the same instance.
Yes, that's also why you cannot alter the enumerable object in the context of the foreach statement.
