Is equal sign passing a reference or object copy? - c#

When I have this code:
class A
{
public int X = 0;
...
}
public void Function()
{
// here I create a new instance of class
A a = new A();
a.X = 10;
// here I create a pointer to null
A b = null;
// here I assign a to b
b = a;
b.X = 20;
}
did I pass the reference to instance of class A now? or I cloned the instance of A to new instance and created a reference to it in b?
is changing X in b also changing X in a? Why? If not, what is a proper way to create a copy of a and insert that to b?
Why the same with strings would always create a copy? Is equal operator overridden in strings?
string a = "hello";
string b = a;
b = "world";
// "hello world"
Console.WriteLine( a + " " + b );

C# uses references, not pointers. Classes are reference types.
In your example, b holds the same reference as a. Both refer to the same location in memory.
changing X in b also changing X in a? Why?
Yes, because both variables refer to the same object, so a change made through one reference is visible through the other.
string a = "hello";
string b = a;
b = "world";
// "hello world"
Console.WriteLine( a + " " + b );
Strings are also reference types, but they are immutable, which means you can't change them. Even when it looks as if you are changing one, you are actually creating a new string object.
First line: you create an object containing "hello" and a reference to it called a.
Second line: you create a new reference called b that refers to the same object ("hello").
Third line: you assign a new object, "world", to b. Your b reference no longer refers to the "hello" object.

did I pass the pointer to instance of class A now? or I cloned the
instance of A to new instance and created a pointer to it in b?
b is holding the same reference as a, both of them pointing to the same location.
changing X in b also changing X in a? Why?
Because both of them are pointing to the same object.
what is a proper way to create a copy of a and insert that to b?
Implement the ICloneable interface
Supports cloning, which creates a new instance of a class with the
same value as an existing instance
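A minimal sketch of what implementing ICloneable could look like for the question's class A. The shallow copy via MemberwiseClone is an assumption that only holds because A contains nothing but a value-type field; a class with reference-type fields would need to copy those as well. The CloneDemo class is hypothetical and exists only to show the effect.

```csharp
using System;

// Stand-in for the question's class A, extended with ICloneable.
class A : ICloneable
{
    public int X = 0;

    // MemberwiseClone creates a shallow copy of the current object.
    public object Clone()
    {
        return MemberwiseClone();
    }
}

// Hypothetical demo class, not part of the original question.
static class CloneDemo
{
    // Returns (a.X, b.X) after changing only the clone.
    public static (int, int) Run()
    {
        A a = new A();
        a.X = 10;
        A b = (A)a.Clone(); // b refers to a separate object
        b.X = 20;           // does not touch a
        return (a.X, b.X);
    }

    static void Main()
    {
        var (ax, bx) = Run();
        Console.WriteLine(ax + " " + bx); // prints "10 20"
    }
}
```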
EDIT
Since you edited the question to add strings: strings are reference types, but they are immutable as well.
string (C# Reference)
Strings are immutable--the contents of a string object cannot be
changed after the object is created, although the syntax makes it
appear as if you can do this.

Variable b points to the same object as a. To get an independent copy you have to clone the object (a deep clone), for example through the ICloneable interface.

When you assign, you pass a copy of the return value of the assigned expression.
For value types, this is the value you usually get to see when you use them (like the numerical value of an integer).
For reference types, the actual value is something like an address pointing to the referenced object (but, what it really is, is an implementation detail). So, even though you pass a copy of that address, that copy points to the same object.
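The difference can be sketched with one struct and one class. PointValue, PointRef, and AssignmentDemo are hypothetical names invented for this illustration.

```csharp
using System;

struct PointValue { public int X; } // value type (hypothetical)
class PointRef { public int X; }    // reference type (hypothetical)

static class AssignmentDemo
{
    // Returns (v1.X, r1.X) after modifying the assigned copies.
    public static (int, int) Run()
    {
        PointValue v1 = new PointValue { X = 1 };
        PointValue v2 = v1; // copies the whole value
        v2.X = 2;           // v1.X stays 1

        PointRef r1 = new PointRef { X = 1 };
        PointRef r2 = r1;   // copies only the reference
        r2.X = 2;           // r1.X is now 2 as well

        return (v1.X, r1.X);
    }

    static void Main()
    {
        var (valueX, refX) = Run();
        Console.WriteLine(valueX); // prints 1
        Console.WriteLine(refX);   // prints 2
    }
}
```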

Related

Difference between Reference Types and References

I am reading following blog by Eric Lippert: The truth about Value types
In this, he mentions there are 3 kinds of values in the opening:
Instance of Value types
Instance of Reference types
References
It is incomplete. What about references? References are neither value types nor instances of reference types, but they are values.
So, in the following example:
int i = 10;
string s = "Hello";
First is instance of value type and second is instance of reference type. So, what is the third type, References and how do we obtain that?
So, what is the third type, References and how do we obtain that?
The variable s is a variable which holds the value of the reference. This value is a reference to a string (with a value of "Hello") in memory.
To make this more clear, say you have:
string s1 = "Hello";
string s2 = s1;
In this case, s1 and s2 are both variables that are each a reference to the same reference type instance (the string). There is only a single actual string instance (the reference type) involved here, but there are two references to that instance.
Fields and variables of reference type, such as your s, are references to an instance of a reference type that lives on the heap.
You never use an instance of a reference type directly; instead, you use it through a reference.
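One way to observe that two variables hold the same reference is object.ReferenceEquals. The sketch below is only an illustration; ReferenceDemo is a hypothetical name.

```csharp
using System;

static class ReferenceDemo
{
    // Returns whether s1 and s2 refer to the same instance
    // before and after s2 is reassigned.
    public static (bool, bool) Run()
    {
        string s1 = "Hello";
        string s2 = s1; // two references, one string instance
        bool sameBefore = object.ReferenceEquals(s1, s2);

        s2 = "World";   // only the variable s2 changes
        bool sameAfter = object.ReferenceEquals(s1, s2);

        return (sameBefore, sameAfter);
    }

    static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine(before); // prints True
        Console.WriteLine(after);  // prints False
    }
}
```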
A reference is not really a 'third type'. It's actually a pointer that refers to a concrete instance of an object. Take a look at this example:
class MyClass
{
public string Str { get; set; }
}
class Program
{
static void Main(string[] args)
{
int a = 1;
int b = 2;
int c = 3;
var myObj = new MyClass
{
Str = "Whatever"
};
Console.WriteLine("{0};\t{1};\t{2};\t{3}", a, b, c, myObj.Str);
MyFunction(a, ref b, out c, myObj);
Console.WriteLine("{0};\t{1};\t{2};\t{3}", a, b, c, myObj.Str);
Console.ReadLine();
}
static void MyFunction(int justValue, ref int refInt, out int outInt, MyClass obj)
{
obj.Str = "Hello";
justValue = 101;
refInt = 102;
outInt = 103; // similar to refInt, but you MUST assign the parameter if it uses the 'out' keyword
}
}
The output of this program is:
1; 2; 3; Whatever
1; 102; 103; Hello
Focus on the MyFunction:
The first parameter we pass is a simple int, which is a value type. By default, value types are copied when passed as a parameter (a new instance is created). That's why the value of 'a' didn't change.
You can change this behavior by adding the 'ref' or 'out' keyword to the parameter. In that case you actually pass a reference to that very instance of your int. In MyFunction the value of that instance is overwritten.
Here you can read more about ref and out
The last example is the object of MyClass. All classes are reference types, which is why you always pass them as references (no special keywords needed).
You can think of a reference as an address in computer memory. The bytes at that address make up your object. If you pass it by value, you take those bytes out and pass them to a function. If you pass it by reference, you only pass the address. Then in the called function you can read bytes from that address or write to that address. Every change affects the calling function's variables, because they point to exactly the same bytes in computer memory. It's not exactly what happens in .NET (it runs in a virtual machine), but I think this analogy will help you understand the concept.
Why do we use references? There are many reasons. One of them is that passing a big object by value would be very slow and would require cloning it. When you pass a reference to an object, then no matter how big that object is you only pass a few bytes that contain its 'address' in memory.
Moreover, your object may contain elements that cannot be cloned (like an open socket). Using a reference you can easily pass such an object between functions.
It's also worth mentioning that structs, even though they look very similar to classes, are actually value types and behave as value types (when you pass a struct to a function, you actually pass a clone, a new instance).
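The struct behavior can be sketched as follows. Counter and the two helper methods are hypothetical names chosen for this illustration.

```csharp
using System;

struct Counter { public int Count; } // hypothetical value type

static class StructPassingDemo
{
    // Receives a copy of the struct; the caller's value is untouched.
    static void IncrementCopy(Counter c) { c.Count++; }

    // Receives a reference to the caller's variable.
    static void IncrementInPlace(ref Counter c) { c.Count++; }

    // Returns the caller's Count after each call.
    public static (int, int) Run()
    {
        Counter counter = new Counter { Count = 0 };

        IncrementCopy(counter);
        int afterCopy = counter.Count; // still 0

        IncrementInPlace(ref counter);
        int afterRef = counter.Count;  // now 1

        return (afterCopy, afterRef);
    }

    static void Main()
    {
        var (afterCopy, afterRef) = Run();
        Console.WriteLine(afterCopy); // prints 0
        Console.WriteLine(afterRef);  // prints 1
    }
}
```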

Reference to a reference in C#

Suppose I create a variable B that references another one, A, both of them being reference type variables. If I set B or A to null, the other one would still be pointing to the object instance, which would remain untouched.
SomeClass A = new SomeClass();
SomeClass B = A;
B = null; //A still has a valid reference
This is also true:
SomeClass A = new SomeClass();
SomeClass B = A;
A = null; //B still has a valid reference
But I don't want B to reference the instance referenced by A, I want B to reference A itself. That way, if B was set to null, A would be set to null as well. Is there any elegant, safe (no pointers) way of doing this? Or am I trying to do something that is against C# principles?
Thanks.
You can't do this the way you would do it in C++ or C. The only time you can have a reference to an object handle is when you call a method with a ref parameter: viz:
void main_method()
{
SomeClass A = new SomeClass();
secondary_method(ref A);
}
void secondary_method(ref SomeClass B)
{
B = null; // this has the side effect of clearing the A of main_method
}
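For completeness, a sketch assuming C# 7.0 or later: ref locals allow a similar alias to a variable inside a single method. This is a language feature that the answer above does not cover, and it only works for locals; a ref local cannot be stored in a field of a class. RefLocalDemo is a hypothetical name.

```csharp
using System;

class SomeClass { }

static class RefLocalDemo
{
    // Returns true if clearing the alias also cleared the original variable.
    public static bool Run()
    {
        SomeClass a = new SomeClass();
        ref SomeClass b = ref a; // b is an alias for the variable a, not for the object
        b = null;                // assigns through the alias, so a becomes null
        return a == null;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints True
    }
}
```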
The solution here is for neither of those variables to directly refer to the object, but to instead refer to an object instance that has a field pointing to an actual SomeClass instance:
public class Pointer<T>
{
public T Value {get;set;}
}
Pointer<SomeClass> A = new Pointer<SomeClass>(){ Value = new SomeClass()};
Pointer<SomeClass> B = A;
B.Value = null;
//A.Value is null
MSDN has the following definition for a reference type: "Variables of reference types store references to the actual data" (http://msdn.microsoft.com/en-us/library/490f96s2%28v=vs.110%29.aspx). In your case, setting the second variable to null is only causing the reference of the second variable to be broken without having any effect on the actual data. This is clearly shown in the post by Olivier available at Setting a type reference type to null doesn't affect copied type?.
A possible solution to your problem is to make use of a WeakReference.
According to MSDN: "A weak reference allows the garbage collector to collect an object while still allowing an application to access the object. If you need the object, you can still obtain a strong reference to it and prevent it from being collected" (http://msdn.microsoft.com/en-us/library/system.weakreference%28v=vs.110%29.aspx).
So as long as the second (local) reference is accessing your weak reference, the object won't be garbage collected. Once you break the local reference by setting that to null, the GC would clear the weakly referenced object. More information on WeakReference is available at: http://msdn.microsoft.com/en-us/library/ms404247.aspx

Original Values changes are not reflecting in reference types variable

object a = "1411";
object b = a;
Console.WriteLine("Before Value of a " + a);
Console.WriteLine("Before Value of b " + b);
a = "5555";
Console.WriteLine("After Value of a " + a);
Console.WriteLine("After Value of b " + b);
Console.ReadKey();
output:
Before Value of a 1411
Before Value of b 1411
After Value of a 5555
After Value of b 1411
Shouldn't "After Value of b" also have changed to 5555, since b is a reference type variable?
Let's do this with numbers:
int x = 1411; // now x is 1411
int y = x; // now y is 1411
x = 5555; // now x is 5555
Console.WriteLine(y);
Now: what is y? Simple: it is still 1411. Assigning a new value to x doesn't change y. The same is true with reference-types, but simply the "value" of a reference-type variable is the reference. Not the object.
If you assign a reference-type variable to be a different value (i.e. to point at a different object), that only affects that single variable.
Now, if you did:
var x = new SomeClass { Foo = "1411" };
var y = x;
x.Foo = "5555";
Console.WriteLine(y.Foo);
Then this would print "5555". The difference now is that we have one object, and both reference-type variables point at the same object. We have changed a value of the object (not the reference), so updating that changes it no matter how you get to the same object.
Let's take this code piece by piece to see what it does:
a = "1411";
This will store a reference to an object into the variable a. The object is a string, and allocated on the heap (since it's a reference type).
So there are two pieces involved here:
The variable a
The object (string) that it refers to
Then we have this:
b = a;
This will make the variable b reference the same object that a refers to.
References internally are implemented as memory addresses, and thus if (example) the string object lives at address 1234567890, then the values of the two variables would both be that address.
Now, then you do this:
a = "5555";
This will change the contents of the a variable, but the b variable will be left unchanged.
This means that b still refers to the old object, at address 1234567890, whereas a will refer to a different string object.
You did not change the object itself, that both a and b were referring to, you changed a.
As Marc said in a comment, you can liken this to giving you the address of a house on a piece of paper. If you write the same address on a second piece of paper and give it to your friend, both papers refer to the same house.
However, if you give your friend a paper with a different address on it, even if the two houses look the same, they're not the same house.
So there's a big difference between reference type and variable containing a reference.

Passing a class as a ref parameter in C# does not always work as expected. Can anyone explain?

I always thought that a method parameter with a class type is passed as a reference parameter by default. Apparently that is not always the case. Consider these unit tests in C# (using MSTest).
[TestClass]
public class Sandbox
{
private class TestRefClass
{
public int TestInt { get; set; }
}
private void TestDefaultMethod(TestRefClass testClass)
{
testClass.TestInt = 1;
}
private void TestAssignmentMethod(TestRefClass testClass)
{
testClass = new TestRefClass() { TestInt = 1 };
}
private void TestAssignmentRefMethod(ref TestRefClass testClass)
{
testClass = new TestRefClass() { TestInt = 1 };
}
[TestMethod]
public void DefaultTest()
{
var testObj = new TestRefClass() { TestInt = 0 };
TestDefaultMethod(testObj);
Assert.IsTrue(testObj.TestInt == 1);
}
[TestMethod]
public void AssignmentTest()
{
var testObj = new TestRefClass() { TestInt = 0 };
TestAssignmentMethod(testObj);
Assert.IsTrue(testObj.TestInt == 1);
}
[TestMethod]
public void AssignmentRefTest()
{
var testObj = new TestRefClass() { TestInt = 0 };
TestAssignmentRefMethod(ref testObj);
Assert.IsTrue(testObj.TestInt == 1);
}
}
The results are that AssignmentTest() fails and the other two test methods pass. I assume the issue is that assigning a new instance to the testClass parameter breaks the parameter reference, but somehow explicitly adding the ref keyword fixes this.
Can anyone give a good, detailed explanation of what's going on here? I'm mainly just trying to expand my knowledge of C#; I don't have any specific scenario I'm trying to solve...
The thing that is nearly always forgotten is that a class isn't passed by reference, the reference to the class is passed by value.
This is important. Instead of copying the entire class (pass by value in the stereotypical sense), the reference to that class (I'm trying to avoid saying "pointer") is copied. This is 4 or 8 bytes; much more palatable than copying the whole class and in effect means the class is passed "by reference".
At this point, the method has its own copy of the reference to the class. Assignment to that reference is scoped within the method (the method re-assigned only its own copy of the reference).
Dereferencing that reference (as in, talking to class members) would work as you'd expect: you'd see the underlying class unless you change it to look at a new instance (which is what you do in your failing test).
Using the ref keyword is effectively passing the reference itself by reference (pointer to a pointer sort of thing).
As always, Jon Skeet has provided a very well written overview:
http://www.yoda.arachsys.com/csharp/parameters.html
Pay attention to the "Reference parameters" part:
Reference parameters don't pass the values of the variables used in
the function member invocation - they use the variables themselves.
If the method assigns something to a ref reference, then the caller's copy is also affected (as you have observed) because they are looking at the same reference to an instance in memory (as opposed to each having their own copy).
The default convention for parameters in C# is pass by value. This is true whether the parameter is a class or struct. In the class case just the reference is passed by value while in the struct case a shallow copy of the entire object is passed.
When you enter the TestAssignmentMethod there are 2 references to a single object: testObj which lives in AssignmentTest and testClass which lives in TestAssignmentMethod. If you were to mutate the actual object via testClass or testObj it would be visible to both references since they both point to the same object. In the first line though you execute
testClass = new TestRefClass() { TestInt = 1 }
This creates a new object and points testClass to it. This doesn't alter where the testObj reference points in any way because testClass is an independent copy. There are now 2 objects and 2 references, with each reference pointing to a different object instance.
If you want pass by reference semantics you need to use a ref parameter.
My 2 cents
When a class is passed to a method, a copy of its memory space address is being sent (a direction to your house is being sent). So any operation on that address will affect the house but will not change the address itself. (This is default).
Passing a class (object) by reference has an effect of passing its actual address instead of a copy of an address. That means if you assign a new object to an argument passed by reference it will change the actual address (similar to relocation). :D
This is how I see it.
The AssignmentTest uses TestAssignmentMethod which only changes the object reference passed by value.
So the object itself is passed by reference, but the reference to the object is passed by value. So when you do:
testClass = new TestRefClass() { TestInt = 1 };
You are changing the local copied reference passed to the method not the reference you have in the test.
So here:
[TestMethod]
public void AssignmentTest()
{
var testObj = new TestRefClass() { TestInt = 0 };
TestAssignmentMethod(testObj);
Assert.IsTrue(testObj.TestInt == 1);
}
testObj is a reference variable. When you pass it to TestAssignmentMethod(testObj);, the reference is passed by value. So when you reassign it in the method, the original reference still points to the same object.
There are lots of subtleties missed in the posted answers here that can create unexpected results and confuse new C# developers. There are actually two ways to process a reference passed by value in C# methods.
All methods in C# pass arguments in BY VALUE by default unless you use the ref, in, or out keywords. Passing a REFERENCE BY VALUE means a COPY of the MEMORY ADDRESS of the object used by the outside reference is passed in and assigned to the method parameter. The original outside variable address is not passed in nor the original object in memory, just the memory address to the object.
Both variables now point to the same object in memory.
This copy of the address of the object in memory is the VALUE in pass by value for all reference types. That means the original reference variable that points to the object address remains the same, and a new copy of that memory address is assigned to a new variable, the method parameter. They BOTH point to the same object. That means if either one changes properties on the object, it will affect the original object and will be seen by both variables.
This seems to act like a PASS BY REFERENCE, but it is not. That is what confuses many developers.
But this means some "weird" and unexpected things can happen passing a reference by value in methods if you are not careful. It means your method variable can connect to the same object and change the properties and fields of the original shared object ...BUT... as soon as you reassign the method variable to a new instance of the same type of object, it loses a connection to the original instance and no longer affects the original object used by the outside reference.
You might assume the method has assigned a fresh object to the outside reference variable, but it has not! Changing that new object's properties in the method no longer affects the outside reference. So BE CAREFUL!
Let's test this weirdness in C#:
// First, create my cat class. I can change its name
// to anything I want. But instead, I want it to have
// a special name assigned by the next class via a method.
class MyCat
{
public string Name { get; set; }
}
// This special class will assign a popular name to my cat.
class CatNames
{
public enum PopularNames {
Felix,
Fluffy
}
public void ChangeName(MyCat c)
{
PopularNames p = PopularNames.Felix;
c.Name = p.ToString();
}
public void ChangeNameAndCat(MyCat c)
{
PopularNames p = PopularNames.Fluffy;
MyCat d = new MyCat();
d.Name = p.ToString();
c = d;
// Note: In this case, you might want to return the new "MyCat"
// object and its name to the caller.
}
}
// Testing passing by value and how references are passed...
CatNames catnamechanger = new CatNames();
// I created two cats with the same name so you can see
// what names actually changed below.
MyCat cat1 = new MyCat();
cat1.Name = "Bubba";
MyCat cat2 = new MyCat();
cat2.Name = "Bubba";
catnamechanger.ChangeName(cat1);
catnamechanger.ChangeNameAndCat(cat2);
Console.WriteLine("My Cat1's Name is: " + cat1.Name);
Console.WriteLine("My Cat2's Name is: " + cat2.Name);
// ============== OUTPUT ==================
// My Cat1's Name is: Felix
// My Cat2's Name is: Bubba <<< OOPS! My cat name kept the original
RESULTS
Notice the first cat had its name changed on the original object, but the second cat kept its original name, "Bubba", as a new cat was assigned to the method variable. It lost connection to the original object. The reason is, passing a reference by value still allows you to affect properties of the passed in address to the original object. But as soon as you change where the method variable points, that reference is lost.
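If the caller is meant to end up with the new cat, two common options are returning the new object or taking the parameter by ref. The sketch below shows both; CreateRenamedCat, ReplaceCat, and CatRenamer are hypothetical names that do not appear in the answer above.

```csharp
using System;

class MyCat
{
    public string Name { get; set; }
}

static class CatRenamer
{
    // Option 1 (hypothetical): return the new cat and let the caller reassign.
    public static MyCat CreateRenamedCat(string name)
    {
        return new MyCat { Name = name };
    }

    // Option 2 (hypothetical): take the caller's variable by ref and replace it.
    public static void ReplaceCat(ref MyCat c, string name)
    {
        c = new MyCat { Name = name };
    }

    // Returns the names seen by the caller after each option.
    public static (string, string) Run()
    {
        MyCat cat1 = new MyCat { Name = "Bubba" };
        MyCat cat2 = new MyCat { Name = "Bubba" };

        cat1 = CreateRenamedCat("Fluffy"); // caller reassigns its own variable
        ReplaceCat(ref cat2, "Fluffy");    // method reassigns the caller's variable

        return (cat1.Name, cat2.Name);
    }

    static void Main()
    {
        var (name1, name2) = Run();
        Console.WriteLine(name1); // prints Fluffy
        Console.WriteLine(name2); // prints Fluffy
    }
}
```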

why don't string object refs behave like other object refs?

string a = "a";
string b = a;
string a = "c";
Why does string b still have the value "a" and not "c"?
As string is an object and not a stack value type, what's with this behaviour?
Thanks
You're pointing the variable to something new, it's no different than if you said
Foo a = new Foo();
Foo b = a;
a = new Foo();
// a no longer equal to b
In this example, b is pointing to what a initially referenced. By changing the value of a, a and b are no longer referencing the same object in memory. This is different than working with properties of a and b.
Foo a = new Foo();
Foo b = a;
a.Name = "Bar";
Console.WriteLine(b.Name);
In this case, "Bar" gets written to the screen because a and b still reference the same object.
Let me start by saying that your choices for variables and data are poor. It makes it very difficult for someone to say "the string a in your example..." because "a" could be the content of the string, or the variable containing the reference. (And it is easily confused with the indefinite article 'a'.)
Also, your code doesn't compile because it declares variable "a" twice. You are likely to get better answers if you ask questions in a way that makes them amenable to being answered clearly.
So let's start over.
We have two variables and two string literals.
string x = "hello";
string y = x;
x = "goodbye";
Now the question is "why does y equal 'hello' and not 'goodbye'"?
Let's go back to basics. What is a variable? A variable is a storage location.
What is a value of the string type? A value of the string type is a reference to string data.
What is a variable of type string? Put it together. A variable of type string is a storage location which holds a reference to string data.
So, what is x? a storage location. What is its first value? a reference to the string data "hello".
What is y? a storage location. What is its first value? a reference to the string data "hello", same as x.
Now we change the contents of storage location x to refer to the string data "goodbye". The contents of storage location y do not change; we didn't set y.
Make sense?
why don’t string object refs behave like other object refs?
I deny the premise of the question. String object refs do behave like other object refs. Can you give an example of where they don't?
Part of what confuses people so much about this is thinking of the following as an append operation:
str1 = str1 + str2;
If string were a mutable type, and the above were shorthand for something like this:
str1.Append(str2);
Then what you're asking would make sense.
But str1 = str1 + str2 is not just some method call on a mutable object; it is an assignment. Realizing this makes it clear that setting a = "c" in your example is no different from assigning any variable (reference type or not) to something new.
The below comparison between code that deals with two List<char> objects and code that deals with two string objects should hopefully make this clearer.
var a = new List<char>();
var b = a; // at this point, a and b refer to the same List<char>
b.Add('a'); // since a and b refer to the same List<char> ...
if (b.Contains('a')) { /* ...this is true... */ }
if (a.Contains('a')) { /* ...and so is this */ }
// HOWEVER...
a = new List<char>(); // now a and b do NOT refer to the same List<char>...
if (b.Contains('a')) { /* ...so this is still true... */ }
if (a.Contains('a')) { /* ...but this is not */ }
Compare this with a slightly modified version of the code you posted:
string a = "a";
string b = a; // at this point, a and b refer to the same string ("a")...
if (b == "a") { /* ...so this is true... */ }
if (a == "a") { /* ...and so is this */ }
// REMEMBER: the below is not simply an append operation like List<T>.Add --
// it is an ASSIGNMENT
a = a + "c"; // now they do not -- b is still "a", but a is "ac"
if (b == "a") { /* ...so this is still true... */ }
if (a == "a") { /* ...but this is not */ }
In .NET, a, b and c are references to objects, not the objects themselves. When you reassign a, you point that reference at a new memory location. The old memory location and any other references to it are unchanged.
I guess the OP thinks string objects to be mutable, so something like var = "content";
would actually store the new character array inside the already existing object.
String is, however, an immutable type, which means that in this case a new string object is created and assigned to var.
See for example:
http://codebetter.com/blogs/patricksmacchia/archive/2008/01/13/immutable-types-understand-them-and-use-them.aspx
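A short sketch contrasting the immutable string with the mutable StringBuilder may help here. ImmutabilityDemo is a hypothetical name; the comparison itself is an illustration added to this page, not something taken from the linked article.

```csharp
using System;
using System.Text;

static class ImmutabilityDemo
{
    // Returns what the second variable of each pair sees afterwards.
    public static (string, string) Run()
    {
        string s1 = "abc";
        string s2 = s1;
        s1 = s1.ToUpper();  // ToUpper returns a NEW string; s2 still refers to "abc"

        StringBuilder sb1 = new StringBuilder("abc");
        StringBuilder sb2 = sb1;
        sb1.Append("def");  // mutates the one shared object

        return (s2, sb2.ToString());
    }

    static void Main()
    {
        var (fromString, fromBuilder) = Run();
        Console.WriteLine(fromString);  // prints abc
        Console.WriteLine(fromBuilder); // prints abcdef
    }
}
```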
It is a misunderstanding because of the builtin string support of c#.
string a = "123"; //The way to write it in C#
string a = new string("123"); //Would be more obvious
The second form makes what happens more obvious, but it is verbose. Since strings have direct support from the compiler, calling the string constructor is unnecessary.
Writing your example verbosely:
string a = new string("a");
string b = a;
a = new string("c");
Here the behavior is as expected: a is assigned a reference to the new string object, while the reference held by b still points to the old string.
