I am reading the following blog post by Eric Lippert: The Truth About Value Types
In the opening, he mentions that there are three kinds of values:
Instance of Value types
Instance of Reference types
References
It is incomplete. What about references? References are neither value types nor instances of reference types, but they are values.
So, in the following example:
int i = 10;
string s = "Hello";
The first is an instance of a value type and the second is an instance of a reference type. So what is the third kind, references, and how do we obtain one?
So, what is the third type, References and how do we obtain that?
The variable s is a variable which holds the value of the reference. This value is a reference to a string (with a value of "Hello") in memory.
To make this more clear, say you have:
string s1 = "Hello";
string s2 = s1;
In this case, s1 and s2 are both variables that are each a reference to the same reference type instance (the string). There is only a single actual string instance (the reference type) involved here, but there are two references to that instance.
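The point about two references to one instance can be checked directly. The following is a minimal sketch; the class name SharedReferenceDemo and its method are hypothetical names chosen for illustration, and object.ReferenceEquals is used to ask whether two references point at the same object.

```csharp
using System;

public static class SharedReferenceDemo
{
    // Returns true when both variables hold a reference to the same string instance.
    public static bool BothReferToSameInstance()
    {
        string s1 = "Hello";
        string s2 = s1; // copies the reference, not the characters
        return object.ReferenceEquals(s1, s2);
    }

    public static void Main()
    {
        Console.WriteLine(BothReferToSameInstance()); // True
    }
}
```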
Fields and variables of reference type, such as your s, are references to an instance of a reference type that lives on the heap.
You never use an instance of a reference type directly; instead, you use it through a reference.
A reference is not really a 'third type'. It's actually a pointer that refers to a concrete instance of an object. Take a look at this example:
class MyClass
{
public string Str { get; set; }
}
class Program
{
static void Main(string[] args)
{
int a = 1;
int b = 2;
int c = 3;
var myObj = new MyClass
{
Str = "Whatever"
};
Console.WriteLine("{0};\t{1};\t{2};\t{3}", a, b, c, myObj.Str);
MyFunction(a, ref b, out c, myObj);
Console.WriteLine("{0};\t{1};\t{2};\t{3}", a, b, c, myObj.Str);
Console.ReadLine();
}
static void MyFunction(int justValue, ref int refInt, out int outInt, MyClass obj)
{
obj.Str = "Hello";
justValue = 101;
refInt = 102;
outInt = 103; // similar to refInt, but you MUST assign the parameter when it uses the 'out' keyword
}
}
The output of this program is:
1; 2; 3; Whatever
1; 102; 103; Hello
Focus on the MyFunction:
The first parameter we pass is a simple int, which is a value type. By default, value types are copied when passed as a parameter (a new instance is created). That's why the value of 'a' didn't change.
You can change this behavior by adding the 'ref' or 'out' keyword to the parameter. In that case you actually pass a reference to that very instance of your int, and MyFunction overwrites the value of that instance.
Here you can read more about ref and out
The last example is the object of MyClass. All classes are reference types, which is why you always pass them as references (no special keywords needed).
You can think of a reference as an address in computer memory. The bytes at that address compose your object. If you pass it as a value, you take those bytes out and pass them to a function. If you pass it as a reference, you only pass the address. Then, in the called function, you can read bytes from that address or write to that address. Every change affects the calling function's variables, because they point to exactly the same bytes in computer memory. It's not exactly what happens in .NET (it runs in a virtual machine), but I think this analogy will help you understand the concept.
Why do we use references? There are many reasons. One of them is that passing a big object by value would be very slow and would require cloning it. When you pass a reference to an object, then no matter how big that object is, you only pass a few bytes that contain its 'address' in memory.
Moreover, your object may contain elements that cannot be cloned (like an open socket). Using a reference, you can easily pass such an object between functions.
It's also worth mentioning that structs, even though they look very similar to classes, are actually value types and behave as value types (when you pass a struct to a function, you actually pass a clone, a new instance).
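To make the struct-versus-class difference concrete, here is a small sketch. PointStruct, PointClass, and StructVsClassDemo are hypothetical types invented for this illustration; the only thing that differs between the two calls is whether the argument type is a struct or a class.

```csharp
using System;

public struct PointStruct { public int X; }
public class PointClass { public int X; }

public static class StructVsClassDemo
{
    static void SetX(PointStruct p) { p.X = 99; } // modifies the method's own copy
    static void SetX(PointClass p) { p.X = 99; }  // modifies the object the caller also sees

    public static int StructAfterCall()
    {
        var s = new PointStruct { X = 1 };
        SetX(s);
        return s.X; // the caller's struct is untouched
    }

    public static int ClassAfterCall()
    {
        var c = new PointClass { X = 1 };
        SetX(c);
        return c.X; // the shared object was changed
    }

    public static void Main()
    {
        Console.WriteLine(StructAfterCall()); // 1
        Console.WriteLine(ClassAfterCall());  // 99
    }
}
```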
Classes are reference types and the traditional data types are value types. For example:
int i=5;
int j=i;
i=3; // now i is 3 and j is still 5, because they occupy different memory locations
Similarly, consider an object of a class, say a point class:
class point
{
public int x,y;
public void somefucnt(point p,int x)
{
Console.WriteLine("value of x is "+p.x);
x=22;
Console.WriteLine("value of x is "+p.x);
}
}
class someotherclass
{
static void Main(string [] args )
{
point p1 = new point();
p1.x=10;
p1.somefucnt(p1,p1.x);
}
}
Both Console.WriteLine statements print 10, even though I changed x to another value. Why is that? Since p is just a reference to the object that contains x, I expected it to be updated when x changes. This is really confusing me.
The observed behavior has nothing to do with Value types vs Reference types. It has to do with the Evaluation Strategy (or "calling conventions") used when invoking a method.
Without ref/out, C# is always Call by Value [1], which means re-assignments to parameters do not affect the caller's bindings. As such, the re-assignment to the x parameter is independent of the argument value (or the source of that value); it doesn't matter whether it's a Value type or a Reference type.
See Reference type still needs pass by ref? (on why caller does not see parameter re-assignment):
Everything is passed by value in C#. However, when you pass a reference type, the reference itself is being passed by value, i.e., a copy of the original reference is passed. So, you can change the state of object that the reference copy points to, but if you assign a new value to the reference [parameter] you are only changing what the [local variable] copy points to, not the original reference [in the argument expression].
And Passing reference type in C# (on why ref is not needed to mutate Reference types)
I.e. the address of the object is passed by value, but the address to the object and the object is the same. So when you call your method, the VM copies the reference; you're just changing a copy.
[1] For reference types, the phrasing "Call By Value [of the Reference]" or "Call by [Reference] Value" may help clear up the issue. Eric Lippert has written a popular article, The Truth About Value Types, which encourages treating references as values that are distinct from instances of Reference types.
void somefucnt(point p,int x){
Console.WriteLine("value of x is "+p.x);
x=22;
Console.WriteLine("value of x is "+p.x);
}
Here, x=22 won't change p.x; it changes the parameter x of (point p,int x).
Normally, your assumption about values/references is OK (if I understood it correctly).
Tip: Google for c# this instead of passing an object to its own method
You change the value of the parameter (x), not the value of p.x. Value types are passed by value unless you use the ref keyword.
As in your first example, there is no relationship between i and j, nor between the parameter x and p1.x. Each variable has its own space in memory, so changing one of them doesn't affect the other.
You have two different variables named x in the somefucnt function. One is the member variable x which you are trying to change, the other is the function input parameter in void somefucnt(point p, int x). When you say x = 22, the input parameter x is changed instead of the member variable x.
If you change the line x = 22 to this.x = 22 then it should work as you expect.
Side note:
A good practice to avoid confusion is to always make class members private and name them like _x. Otherwise, use public auto-properties in PascalCase, like this:
public int X { get; set; }
These methods avoid ambiguity between class variables and function input variables.
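The difference between assigning to the parameter and assigning to the field through this can be seen side by side. This is a sketch only; the Point class and the two method names below are hypothetical and merely mirror the shadowing situation from the question.

```csharp
using System;

public class Point
{
    public int x;

    // Inside both methods the parameter x shadows the field x.
    public void AssignParameter(int x) { x = 22; }   // changes only the local parameter
    public void AssignField(int x) { this.x = 22; }  // changes the field of this instance
}

public static class ShadowingDemo
{
    public static int AfterAssignParameter()
    {
        var p = new Point { x = 10 };
        p.AssignParameter(p.x);
        return p.x;
    }

    public static int AfterAssignField()
    {
        var p = new Point { x = 10 };
        p.AssignField(p.x);
        return p.x;
    }

    public static void Main()
    {
        Console.WriteLine(AfterAssignParameter()); // 10
        Console.WriteLine(AfterAssignField());     // 22
    }
}
```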
This below code compiles and works out as intended.
class MyClass1
{
public void test()
{
string one = "testString1";
Console.WriteLine("MyClass1: " + one);
new MyClass2().test(one);
Console.WriteLine(one); //again testString1 is printed.
}
}
class MyClass2
{
public void test(string two)
{
Console.WriteLine("Test method");
Console.WriteLine(two);
two = "pilot";
Console.WriteLine(two);
}
}
All I infer from this is:
The value assigned to the string in the test method is local to that method, and the change will be reflected only if I use ref or out.
The question is:
We all know that string is a reference type (because it is of type String).
So, for all reference types, when passing their objects around, the changes should be reflected, right? (For example, if I pass around an object of a class, any changes are reflected back, right?)
Why is this rule not followed here?
Can anyone help me understand what happens under the hood?
Although strings are reference objects, they are also immutable. Since references are passed by value *, changes to the variable holding the reference are not reflected in the original.
To demonstrate the effect of passing reference objects, replace string with StringBuilder, and change the content inside the test method:
class MyClass1
{
public void test()
{
StringBuilder one = new StringBuilder("testString1");
Console.WriteLine("MyClass1: " + one);
new MyClass2().test(one);
Console.WriteLine(one); //testString1pilot is printed.
}
}
class MyClass2
{
public void test(StringBuilder two)
{
Console.WriteLine("Test method");
Console.WriteLine(two);
two.Append("pilot");
Console.WriteLine(two);
}
}
* Unless the method specifies a different mode of parameter passing, e.g. out or ref.
So, for all the reference types : when passing around their objects,
the changes should be reflected right ?
It is not true that all reference types are passed by reference.
All types, whether reference types or value types, are passed by value by default.
If you want to pass any type by reference, you need to use the ref or out keyword.
Note: String is an immutable type, which means a string's contents cannot be changed.
The called function can therefore only assign a new string to its own parameter, which is why you do not see the change in the caller.
You need to use StringBuilder to get the changes back.
Jon Skeet has explained parameter passing well here
In your example, the fact that String is a reference type does not matter. The exact same thing would happen with any value type or even a mutable reference type (like a class).
This is because the parameter to a method normally acts like a local variable within the method. Changes made to the parameter are local to the method.
As you stated, the exception is when the parameter is ref or out.
You have to understand the difference between the string which is a reference type and the variable itself that points to that object.
two = "pilot";
When you do this, you are creating a new string object and telling variable two to now point to this new string. The variable one still points to the original string, which is a different object.
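A short sketch of the same reassignment with and without ref may help. StringRefDemo and its method names are hypothetical; the string values are borrowed from the question.

```csharp
using System;

public static class StringRefDemo
{
    // The parameter is a copy of the caller's reference; reassigning it is local.
    static void Reassign(string two) { two = "pilot"; }

    // With ref, the parameter is an alias for the caller's variable.
    static void ReassignByRef(ref string two) { two = "pilot"; }

    public static string WithoutRef()
    {
        string one = "testString1";
        Reassign(one);
        return one;
    }

    public static string WithRef()
    {
        string one = "testString1";
        ReassignByRef(ref one);
        return one;
    }

    public static void Main()
    {
        Console.WriteLine(WithoutRef()); // testString1
        Console.WriteLine(WithRef());    // pilot
    }
}
```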
I always thought that a method parameter with a class type is passed as a reference parameter by default. Apparently that is not always the case. Consider these unit tests in C# (using MSTest).
[TestClass]
public class Sandbox
{
private class TestRefClass
{
public int TestInt { get; set; }
}
private void TestDefaultMethod(TestRefClass testClass)
{
testClass.TestInt = 1;
}
private void TestAssignmentMethod(TestRefClass testClass)
{
testClass = new TestRefClass() { TestInt = 1 };
}
private void TestAssignmentRefMethod(ref TestRefClass testClass)
{
testClass = new TestRefClass() { TestInt = 1 };
}
[TestMethod]
public void DefaultTest()
{
var testObj = new TestRefClass() { TestInt = 0 };
TestDefaultMethod(testObj);
Assert.IsTrue(testObj.TestInt == 1);
}
[TestMethod]
public void AssignmentTest()
{
var testObj = new TestRefClass() { TestInt = 0 };
TestAssignmentMethod(testObj);
Assert.IsTrue(testObj.TestInt == 1);
}
[TestMethod]
public void AssignmentRefTest()
{
var testObj = new TestRefClass() { TestInt = 0 };
TestAssignmentRefMethod(ref testObj);
Assert.IsTrue(testObj.TestInt == 1);
}
}
The results are that AssignmentTest() fails and the other two test methods pass. I assume the issue is that assigning a new instance to the testClass parameter breaks the parameter reference, but somehow explicitly adding the ref keyword fixes this.
Can anyone give a good, detailed explanation of what's going on here? I'm mainly just trying to expand my knowledge of C#; I don't have any specific scenario I'm trying to solve...
The thing that is nearly always forgotten is that a class isn't passed by reference, the reference to the class is passed by value.
This is important. Instead of copying the entire class (pass by value in the stereotypical sense), the reference to that class (I'm trying to avoid saying "pointer") is copied. This is 4 or 8 bytes; much more palatable than copying the whole class and in effect means the class is passed "by reference".
At this point, the method has its own copy of the reference to the class. Assignment to that reference is scoped within the method (the method re-assigned only its own copy of the reference).
Dereferencing that reference (as in, talking to class members) would work as you'd expect: you'd see the underlying class unless you change it to look at a new instance (which is what you do in your failing test).
Using the ref keyword is effectively passing the reference itself by reference (pointer to a pointer sort of thing).
As always, Jon Skeet has provided a very well written overview:
http://www.yoda.arachsys.com/csharp/parameters.html
Pay attention to the "Reference parameters" part:
Reference parameters don't pass the values of the variables used in
the function member invocation - they use the variables themselves.
If the method assigns something to a ref reference, then the caller's copy is also affected (as you have observed) because they are looking at the same reference to an instance in memory (as opposed to each having their own copy).
The default convention for parameters in C# is pass by value. This is true whether the parameter is a class or struct. In the class case just the reference is passed by value while in the struct case a shallow copy of the entire object is passed.
When you enter the TestAssignmentMethod there are 2 references to a single object: testObj which lives in AssignmentTest and testClass which lives in TestAssignmentMethod. If you were to mutate the actual object via testClass or testObj it would be visible to both references since they both point to the same object. In the first line though you execute
testClass = new TestRefClass() { TestInt = 1 }
This creates a new object and points testClass to it. This doesn't alter where the testObj reference points in any way because testClass is an independent copy. There are now 2 objects and 2 references, with each reference pointing to a different object instance.
If you want pass by reference semantics you need to use a ref parameter.
My 2 cents
When a class is passed to a method, a copy of its memory space address is being sent (a direction to your house is being sent). So any operation on that address will affect the house but will not change the address itself. (This is default).
Passing a class (object) by reference has an effect of passing its actual address instead of a copy of an address. That means if you assign a new object to an argument passed by reference it will change the actual address (similar to relocation). :D
This is how I see it.
The AssignmentTest uses TestAssignmentMethod which only changes the object reference passed by value.
So the object itself is effectively passed by reference, but the reference to the object is passed by value. So when you do:
testClass = new TestRefClass() { TestInt = 1 };
You are changing the local copied reference passed to the method not the reference you have in the test.
So here:
[TestMethod]
public void AssignmentTest()
{
var testObj = new TestRefClass() { TestInt = 0 };
TestAssignmentMethod(testObj);
Assert.IsTrue(testObj.TestInt == 1);
}
testObj is a reference variable. When you pass it to TestAssignmentMethod(testObj);, the reference is passed by value. So when you change it in the method, the original reference still points to the same object.
There are lots of subtleties missed in the answers posted here that will create unexpected results and confuse new C# developers. There are actually two ways to process a reference passed by value in C# methods.
All methods in C# pass arguments in BY VALUE by default unless you use the ref, in, or out keywords. Passing a REFERENCE BY VALUE means a COPY of the MEMORY ADDRESS of the object used by the outside reference is passed in and assigned to the method parameter. The original outside variable address is not passed in nor the original object in memory, just the memory address to the object.
Both variables now point to the same object in memory.
This copy of the address to the object in memory is the VALUE for pass by value for all reference types. That means the original reference variable that points to the object address remains the same, and a new copy of that memory address is assigned to a new variable in the method parameter. They BOTH point to the same object. That means if either change properties on the object, it will affect the original object and will be seen by both variables.
This seems to act like a PASS BY REFERENCE, but it is not. That is what confuses many developers.
But this means some "weird" and unexpected things can happen passing a reference by value in methods if you are not careful. It means your method variable can connect to the same object and change the properties and fields of the original shared object ...BUT... as soon as you reassign the method variable to a new instance of the same type of object, it loses a connection to the original instance and no longer affects the original object used by the outside reference.
You might assume the method has assigned a fresh object to the outside reference variable, but it has not! Changing that new object's properties in the method no longer affects the outside reference. So BE CAREFUL!
Let's test this weirdness in C#:
// First, create my cat class. I can change its name
// to anything I want. But instead, I want it to have
// a special name assigned by the next class via a method.
class MyCat
{
public string Name { get; set; }
}
// This special class will assign a popular name to my cat.
class CatNames
{
public enum PopularNames {
Felix,
Fluffy
}
public void ChangeName(MyCat c)
{
PopularNames p = PopularNames.Felix;
c.Name = p.ToString();
}
public void ChangeNameAndCat(MyCat c)
{
PopularNames p = PopularNames.Fluffy;
MyCat d = new MyCat();
d.Name = p.ToString();
c = d;
// Note: In this case, you might want to return the new "MyCat"
// object and its name to the caller.
}
}
// Testing passing by value and how references are passed...
CatNames catnamechanger = new CatNames();
// I created two cats with the same name so you can see
// what names actually changed below.
MyCat cat1 = new MyCat();
cat1.Name = "Bubba";
MyCat cat2 = new MyCat();
cat2.Name = "Bubba";
catnamechanger.ChangeName(cat1);
catnamechanger.ChangeNameAndCat(cat2);
Console.WriteLine("My Cat1's Name is: " + cat1.Name);
Console.WriteLine("My Cat2's Name is: " + cat2.Name);
// ============== OUTPUT ==================
// My Cat1's Name is: Felix
// My Cat2's Name is: Bubba <<< OOPS! My cat name kept the original
RESULTS
Notice the first cat had its name changed on the original object, but the second cat kept its original name, "Bubba", as a new cat was assigned to the method variable. It lost connection to the original object. The reason is, passing a reference by value still allows you to affect properties of the passed in address to the original object. But as soon as you change where the method variable points, that reference is lost.
I understand (or at least I believe I do) what it means to pass an instance of a class to a method by ref versus not passing by ref. When or under what circumstances should one pass a class instance by ref? Is there a best practice when it comes to using the ref keyword for class instances?
The clearest explanation I've ever run across for output and ref parameters is ... Jon Skeet's.
Parameter Passing in C#
He doesn't go into "best practices", but if you understand the examples he's given, you'll know when you need to use them.
When the method may replace the original object, you should pass it as ref. If the parameter is only for output and can be uninitialized before calling the function, use out.
Put succinctly, you would pass a value as a ref parameter if you want the function you're calling to be able to alter the value of that variable.
This is not the same as passing a reference type as a parameter to a function. In those cases, you're still passing by value, but the value is a reference. In the case of passing by ref, then an actual reference to the variable is sent; essentially, you and the function you're calling "share" the same variable.
Consider the following:
public void Foo(ref int bar)
{
bar = 5;
}
...
int baz = 2;
Foo(ref baz);
In this case, the baz variable has a value of 5, since it was passed by reference. The semantics are entirely clear for value types, but not as clear for reference types.
public class MyClass
{
public int PropName { get; set; }
}
public void Foo(MyClass bar)
{
bar.PropName = 5;
}
...
MyClass baz = new MyClass();
baz.PropName = 2;
Foo(baz);
As expected, baz.PropName will be 5, since MyClass is a reference type. But let's do this:
public void Foo(MyClass bar)
{
bar = new MyClass();
bar.PropName = 5;
}
With the same calling code, baz.PropName will remain 2. This is because even though MyClass is a reference type, Foo has its own variable for bar; bar and baz just start out with the same value, but once Foo assigns a new value, they are just two different variables. If, however, we do this:
public void Foo(ref MyClass bar)
{
bar = new MyClass();
bar.PropName = 5;
}
...
MyClass baz = new MyClass();
baz.PropName = 2;
Foo(ref baz);
We'll end up with PropName being 5, since we passed baz by reference, making the two functions "share" the same variable.
The ref keyword allows you to pass an argument by reference. For reference types this means that the actual reference to an object is passed (rather than a copy of that reference). For value types this means that a reference to the variable holding the value of that type is passed.
This is used for methods that need to return more than one result but don't return a complex type to encapsulate those results. It allows you to pass a reference to an object into the method so that the method can modify that object.
The important thing to remember is that reference types are not normally passed by reference, a copy of a reference is passed. This means that you are not working with the actual reference that was passed to you. When you use ref on a class instance you are passing the actual reference itself so all modifications to it (like setting it to null for example) will be applied to the original reference.
When passing reference types (non-value-types) to a method, only the reference is passed in both cases. But when you use the ref keyword, the method being called can change the reference.
For example:
public void MyMethod(ref MyClass obj)
{
obj = new MyClass();
}
elsewhere:
MyClass x = y; // y is an instance of MyClass
// x points to y
MyMethod(ref x);
// x points to a new instance of MyClass
when calling MyMethod(ref x), x will point to the newly created object after the method call. x no longer points to the original object.
Most use cases for passing a reference variable by reference involve initialization and out is more appropriate than ref. And they compile to the same thing (the compiler enforces different constraints - that ref variables be initialized before being passed in and that out variables are initialized in the method). So the only case I can think of where this would be useful is where you need to do some checking of an instantiated ref variable and may need to reinitialize under certain circumstances.
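The two cases described above, initialization with out and conditional re-initialization with ref, might look like the following sketch. OutVersusRefDemo, Create, and ResetIfTooLong are hypothetical names, and the length threshold is an arbitrary assumption for the example.

```csharp
using System;
using System.Text;

public static class OutVersusRefDemo
{
    // out: the caller need not initialize the variable; the method must assign it.
    static void Create(out StringBuilder sb)
    {
        sb = new StringBuilder("fresh");
    }

    // ref: the caller must pass an initialized variable; the method may inspect it
    // and replace it only under certain conditions.
    static void ResetIfTooLong(ref StringBuilder sb, int maxLength)
    {
        if (sb.Length > maxLength)
        {
            sb = new StringBuilder();
        }
    }

    public static string CreatedText()
    {
        StringBuilder sb; // deliberately unassigned before the call
        Create(out sb);
        return sb.ToString();
    }

    public static bool ResetReplacesInstance()
    {
        var original = new StringBuilder("0123456789");
        StringBuilder current = original;
        ResetIfTooLong(ref current, 5);
        // current now refers to a new, empty builder; original is unchanged.
        return !object.ReferenceEquals(original, current) && current.Length == 0;
    }

    public static void Main()
    {
        Console.WriteLine(CreatedText());           // fresh
        Console.WriteLine(ResetReplacesInstance()); // True
    }
}
```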
This might also be necessary to modify an immutable class (like string) as pointed out by Asaf R.
I found that it is easy to run into trouble using the ref keyword.
The following method will modify f even without the ref keyword in the method signature because f is a reference type:
public void TrySet(Foo f,string s)
{
f.Bar = s;
}
In this second case however, the original Foo is affected only by the first line of code, the rest of the method somehow creates and affects only a new local variable.
public void TryNew(Foo f, string s)
{
f.Bar = ""; //original f is modified
f = new Foo(); //new f is created
f.Bar = s; //new f is modified, no effect on original f
}
It would be good if the compiler gave you a warning in that case. Basically what you are doing is replacing the reference you received with another one referencing a different memory area.
If you actually want to replace the object with a new instance, use the ref keyword:
public void TryNew(ref Foo f, string s)...
But are you not shooting yourself in the foot? If the caller is not aware that a new object is created, the following code will probably not work as intended:
Foo f = SomeClass.AFoo;
TryNew(ref f, "some string"); //this will clear SomeClass.AFoo.Bar and then create a new distinct object
And if you try to "fix" the problem by adding the line:
SomeClass.AFoo = f;
If the code holds a reference to SomeClass.AFoo somewhere else, that reference will become invalid...
As a general rule, you probably should avoid using the new keyword to alter an object which you read from another class or received as a parameter in a method.
Regarding the use of the ref keyword with reference types, I can suggest this approach:
1) Don't use it if simply setting the values of the reference type but be explicit in your function or parameter names and in the comments:
public void SetFoo(Foo fooToSet, string s)
{
fooToSet.Bar = s;
}
2) When there is a legitimate reason to replace the input parameter with a new, different instance, use a function with a return value instead:
public Foo TryNew(string s)
{
Foo f = new Foo();
f.Bar = s;
return f;
}
But using this function may still have unwanted consequences with the SomeClass.AFoo scenario:
SomeClass.AFoo = TryNew("some string");//stores a different object in SomeClass.AFoo
3) In some cases such as the string swapping example here it is handy to use ref params, but just as in case 2 make sure that swapping the object addresses does not affect the rest of your code.
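For case 3, a string swap with ref parameters could be sketched as follows. This is not the linked example itself, only an assumed minimal version; SwapDemo and Swap are hypothetical names.

```csharp
using System;

public static class SwapDemo
{
    // Both parameters alias the caller's variables, so the caller sees the exchange.
    public static void Swap(ref string a, ref string b)
    {
        string temp = a;
        a = b;
        b = temp;
    }

    public static void Main()
    {
        string a = "first";
        string b = "second";
        Swap(ref a, ref b);
        Console.WriteLine(a + " " + b); // second first
    }
}
```

Without ref, the method would only exchange its own two local copies of the references and the caller would see no change.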
Because it manages memory allocation for you, C# makes it all too easy to forget everything about memory management but it really helps to understand how pointers and references work. Otherwise you may introduce subtle bugs that are difficult to find.
Finally, this is typically the case where one would want to use a memcpy like function but there is no such thing in C# that I know of.
I have a question about strings.
How can we access the memory of a string?
Is String a reference type or a value type?
1) If it is a reference type, then consider the following code:
List<String> strLst = new List<string>() { "abc","def","12"};
strLst.ForEach(delegate(string s)
{
if (s == "12")
{
s = "wser";
// I doubted whether = operator is overloaded in string class
//StringBuilder sb = new StringBuilder(s);
//s = sb.Append("eiru").ToString();
s = String.Concat(s, "sdf");
}
});
Notice that the value of the string is not changed. My question is: why is the string value not changed?
If it is a reference type, then the string value should be changed.
class emp
{
public string id;
}
List<emp> e = new List<emp>() { new emp() { id = "sdf" }, new emp() { id = "1" }, new emp() { id = "2" } };
e.ForEach(delegate(emp em)
{
if (em.id == "1")
em.id = "fghe";
});
Here the value is changed because emp is a reference type.
2) If string is a value type:
public sealed class String : IComparable, ICloneable, IConvertible, IComparable<string>, IEnumerable<char>, IEnumerable, IEquatable<string>
then why is string declared as a class?
This is happening because of this part:
s = "wser";
As far as the variable s is concerned, this is comparable to:
s = new String("wser".ToCharArray());
Either way, s now holds a reference to a different string object. The previous reference is lost within the function scope, and no change is visible outside that scope.
Thus, for a change made in another scope to be visible, you cannot assign a new instance to the variable; you must modify the object that the original reference points to (and this is not possible for a string object in C#, because strings are immutable).
The System.String type is indeed a reference type, albeit a rather strange one. It is immutable from a developer's perspective and behaves in many ways like a value type, so you won't go far wrong treating it as one.
Jon Skeet's article on parameter passing, value types, and reference types in C#, also gives a good explanation of this curiosity:
Note that many types (such as string)
appear in some ways to be value types,
but in fact are reference types. These
are known as immutable types. This
means that once an instance has been
constructed, it can't be changed. This
allows a reference type to act
similarly to a value type in some ways
- in particular, if you hold a reference to an immutable object, you
can feel comfortable in returning it
from a method or passing it to another
method, safe in the knowledge that it
won't be changed behind your back.
This is why, for instance, the
string.Replace doesn't change the
string it is called on, but returns a
new instance with the new string data
in - if the original string were
changed, any other variables holding a
reference to the string would see the
change, which is very rarely what is
desired.
If you store the reference to an object in a variable and then change the reference, the object does not change, only the variable. You're changing a property of em, thus affecting the object referenced by the variable em. If you instead did em = something, it would behave like the String example and not affect anything either.
The String type is a reference type, but it is immutable. This means you can't change the contents of the string.
It actually behaves like a value type, but only references are passed around, instead of the whole string object.
From the MSDN online page - string
Strings are immutable--the contents of a string object cannot be changed. Although string is a reference type, the equality operators (== and !=) are defined to compare the values of string objects, not references.
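The difference between value equality and reference equality for strings can be observed with a short sketch. StringEqualityDemo is a hypothetical name; the second string is built from a char array so that it is a separate instance with the same characters.

```csharp
using System;

public static class StringEqualityDemo
{
    // == on strings compares the characters.
    public static bool ValuesEqual()
    {
        string a = "hello";
        string b = new string("hello".ToCharArray()); // distinct instance, same contents
        return a == b;
    }

    // ReferenceEquals asks whether both references point at the same object.
    public static bool SameInstance()
    {
        string a = "hello";
        string b = new string("hello".ToCharArray());
        return object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(ValuesEqual());  // True
        Console.WriteLine(SameInstance()); // False
    }
}
```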
Why can't strings be mutable in Java and .NET?
String is a reference type.
The reason why the string in the list is not changing in your example, is that the method doesn't get access to the item in the list, it only gets a copy of the reference. The argument s is just a local variable in the method, so assigning a new value to it doesn't affect the contents of the list.
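To actually replace an item in the list, the assignment has to target the list slot rather than the local parameter. The sketch below contrasts the two; ListReplaceDemo and its methods are hypothetical, and the list contents are taken from the question.

```csharp
using System;
using System.Collections.Generic;

public static class ListReplaceDemo
{
    public static List<string> ViaForEach()
    {
        var list = new List<string> { "abc", "def", "12" };
        // s is a local copy of each element's reference; assigning to it leaves the list alone.
        list.ForEach(s => { if (s == "12") s = "wser"; });
        return list;
    }

    public static List<string> ViaIndexer()
    {
        var list = new List<string> { "abc", "def", "12" };
        // Assigning through the indexer stores a new reference in the list itself.
        for (int i = 0; i < list.Count; i++)
        {
            if (list[i] == "12") list[i] = "wser";
        }
        return list;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ViaForEach())); // abc,def,12
        Console.WriteLine(string.Join(",", ViaIndexer())); // abc,def,wser
    }
}
```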