Reference object comparison of type string

Consider the following code:
public static void Main()
{
string str1 = "abc";
string str2 = "abc";
if (str1 == str2)
{
Console.WriteLine("True");
}
else
{
Console.WriteLine("False");
}
Console.ReadLine();
}
The output is "True". string is a reference type in .NET and I am comparing two different objects, but the output is still "True".
Is it because it internally calls the ToString() method on both objects before comparing them?
Or is it because string is an immutable type? Would two completely distinct string objects with the same value point to the same memory location on the heap?
How does string comparison happen?
How does memory allocation work on the heap? Will two different string objects with the same value point to the same memory location, or to different ones?

Strings are compared by value by default.
Objects are compared by reference by default.
Identical string literals in the same assembly are interned to be the same reference.
Identical strings that are not literals can legally be interned to the same reference, but in practice typically are not.
So now you should be able to understand why you get the given output in this program fragment:
string a1 = "a";
string a2 = "a";
string aa1 = a1 + a2;
string aa2 = a1 + a2;
object o1 = a1;
object o2 = a2;
object o3 = aa1;
object o4 = aa2;
Console.WriteLine(a1 == a2); // True
Console.WriteLine(aa1 == aa2); // True
Console.WriteLine(o1 == o2); // True
Console.WriteLine(o3 == o4); // False
Does that make sense?
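A minimal sketch of the interning rules listed above, assuming a plain console program. InternDemo and BuildAtRunTime are illustrative names, not part of the original answer. It shows that a string built at run time is a separate object from an equal literal, and that string.Intern returns the pooled instance:

```csharp
using System;

static class InternDemo
{
    // Builds "aa" at run time, so the result is a new object,
    // not the instance that backs the literal "aa".
    public static string BuildAtRunTime() => new string('a', 2);

    static void Main()
    {
        string literal = "aa";
        string built = BuildAtRunTime();

        Console.WriteLine(literal == built);                        // True: same characters
        Console.WriteLine(object.ReferenceEquals(literal, built));  // False: two objects

        // string.Intern looks the value up in the intern pool and
        // returns the pooled instance, here the one backing the literal.
        string interned = string.Intern(built);
        Console.WriteLine(object.ReferenceEquals(literal, interned)); // True
    }
}
```

Interning on demand like this is rarely needed in application code; the sketch is only meant to make the difference between "equal value" and "same reference" visible.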

For the string type, == compares the values of the strings.
See http://msdn.microsoft.com/en-us/library/53k8ybth.aspx
Regarding your question about addressing, a few lines of code suggest that identical literals have the same address.
static void Main(string[] args)
{
String s1 = "hello";
String s2 = "hello";
String s3 = s2.Clone() as String;
Console.Out.WriteLine(Get(s1));
Console.Out.WriteLine(Get(s2));
Console.Out.WriteLine(Get(s3));
s1 = Console.In.ReadLine();
s1 = Console.In.ReadLine();
s3 = s2.Clone() as String;
Console.Out.WriteLine(Get(s1));
Console.Out.WriteLine(Get(s2));
Console.Out.WriteLine(Get(s3));
}
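public static string Get(object a)
{
// Requires: using System.Runtime.InteropServices;
GCHandle handle = GCHandle.Alloc(a, GCHandleType.Pinned);
// AddrOfPinnedObject returns the address of the pinned data;
// GCHandle.ToIntPtr would only return the value of the handle itself.
IntPtr pointer = handle.AddrOfPinnedObject();
handle.Free();
return "0x" + pointer.ToString("X");
}
In the first set all three addresses match, because s1 and s2 refer to the same interned literal and Clone() returns the same instance. In the second set s1 comes from the console, so it gets its own address even if the text typed is "hello", while s2 and s3 still match.
Get() is based on Memory address of an object in C#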

The String class overloads the == operator with its own logic.
That is why == on strings compares the actual values rather than the references.
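To illustrate what overloading == means, here is a sketch with a hypothetical Word class (not a framework type, and not how String is actually implemented). It assumes that the overload is picked from the compile-time types of the operands, which is why casting to object falls back to reference comparison:

```csharp
using System;

// Hypothetical type, used only to show how an overloaded == is selected.
sealed class Word
{
    public string Text { get; }
    public Word(string text) { Text = text; }

    public static bool operator ==(Word left, Word right)
    {
        if (ReferenceEquals(left, right)) return true;      // same object, or both null
        if (left is null || right is null) return false;    // exactly one is null
        return left.Text == right.Text;                     // compare by value
    }

    public static bool operator !=(Word left, Word right) => !(left == right);

    // Keep Equals and GetHashCode consistent with ==.
    public override bool Equals(object obj) => obj is Word other && this == other;
    public override int GetHashCode() => Text == null ? 0 : Text.GetHashCode();
}

static class OverloadDemo
{
    static void Main()
    {
        Word w1 = new Word("abc");
        Word w2 = new Word("abc");

        Console.WriteLine(w1 == w2);                  // True: Word's operator == runs
        Console.WriteLine((object)w1 == (object)w2);  // False: object's == compares references
    }
}
```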

Please see https://stackoverflow.com/a/1659107/562036 for more info
It's entirely likely that a large portion of the developer base comes
from a Java background where using == to compare strings is wrong and
doesn't work. In C# there's no (practical) difference (for strings).

string is a reference type because it does not have a default allocation size, but it is treated like a value type for sanity reasons. Could you imagine a world where == would not work between two identical string values?

Related

Why Does Assert.AreSame() Consider Two Seperate Strings The Same?

Why is this passing Assert.AreSame()?
[TestMethod]
public void StringSameTest()
{
string a = "Hello";
string b = "Hello";
Assert.AreSame(a, b);
}
I understand it tests for reference equality, and is essentially the same as Assert.IsTrue(object.ReferenceEquals(a, b)) but it's clear that a and b are different string objects, regardless of them having the same values. If I set string b = a; I'd expect true, but that's not the case. Why isn't this test failing?
Thanks

The C# compiler will intern identical literal strings to the same const string reference.
So your code is equivalent to this:
private const String _hello = "Hello";
[TestMethod]
public void StringSameTest()
{
string a = _hello;
string b = _hello;
Assert.AreSame( a, b ); // true
}
To create a separate string instance that's identical to a const string use String.Copy():
string a = "Hello";
string b = String.Copy(a); // Copy is a static method
Assert.AreSame( a, b ); // false
However, do note that:
String.Copy() and String.Clone() are different!
Clone() does not actually clone the string value, it instead returns a reference to itself.
String.ToString() also returns a reference to itself.
String.Copy() is marked obsolete in .NET Core 3.0 and later and may be removed in a future version.
This is because there is no legitimate need to use String.Copy()
See Do string literals get optimised by the compiler?
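The notes above can be checked with a short sketch. CopyDemo and Duplicate are illustrative names; Duplicate uses the char[] constructor as one possible way to get a distinct instance without calling String.Copy():

```csharp
using System;

static class CopyDemo
{
    // Allocates a new string holding the same characters as the source.
    public static string Duplicate(string source) => new string(source.ToCharArray());

    static void Main()
    {
        string a = "Hello";

        Console.WriteLine(object.ReferenceEquals(a, a.Clone()));     // True: Clone returns this
        Console.WriteLine(object.ReferenceEquals(a, a.ToString()));  // True: ToString returns this
        Console.WriteLine(object.ReferenceEquals(a, Duplicate(a)));  // False: new instance
        Console.WriteLine(a == Duplicate(a));                        // True: same value
    }
}
```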

Reference of two String in C# [duplicate]

The code is pretty self explanatory. I expected when I made a1 and b1 that I was creating two different string instances that contain the same text. So I figure a1 == b1 would be true but object.ReferenceEquals(a1,b1) would be false, but it isn't. Why?
//make two seemingly different string instances
string a1 = "test";
string b1 = "test";
Console.WriteLine(object.ReferenceEquals(a1, b1)); // prints True. why?
//explicitly "recreating" b2
string a2 = "test";
string b2 = "tes";
b2 += "t";
Console.WriteLine(object.ReferenceEquals(a2, b2)); // prints False
//explicitly using new string constructor
string a3 = new string("test".ToCharArray());
string b3 = new string("test".ToCharArray());
Console.WriteLine(object.ReferenceEquals(a3, b3)); // prints False

Literal string objects are coalesced into single instances by the compiler. This is actually required by the specification:
Each string literal does not necessarily result in a new string instance. When two or more string literals that are equivalent according to the string equality operator (Section 7.9.7) appear in the same assembly, these string literals refer to the same string instance.
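As a sketch of how far this rule reaches: a concatenation of constants is itself a constant expression, so the compiler folds it into a single literal, whereas a concatenation of ordinary variables is performed at run time. LiteralDemo, Folded and Concat are illustrative names:

```csharp
using System;

static class LiteralDemo
{
    const string Prefix = "tes";

    // Constant expression: the compiler folds it into the literal "test".
    public static string Folded() => Prefix + "t";

    // The operands are only known at run time, so a new string is built on each call.
    public static string Concat(string head, string tail) => head + tail;

    static void Main()
    {
        string literal = "test";

        Console.WriteLine(object.ReferenceEquals(literal, Folded()));            // True
        Console.WriteLine(object.ReferenceEquals(literal, Concat("tes", "t")));  // False
        Console.WriteLine(literal == Concat("tes", "t"));                        // True
    }
}
```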

The compiler is optimized so that string literals that are equal according to the == operator do not each get a new instance; they all refer to the same one. That is why the first part of your question printed True.
Although string is a reference type, the equality operators (== and !=) are defined to compare the values of string objects, not references. This makes testing for string equality more intuitive. For example:
string a = "hello";
string b = "h";
// Append to contents of 'b'
b += "ello";
Console.WriteLine(a == b);
Console.WriteLine((object)a == (object)b);
This displays "True" and then "False" because the content of the strings are equivalent, but a and b do not refer to the same string instance.
The + operator concatenates strings:
string a = "good " + "morning";
This creates a string object that contains "good morning".
Strings are immutable--the contents of a string object cannot be changed after the object is created, although the syntax makes it appear as if you can do this. For example, when you write this code, the compiler actually creates a new string object to hold the new sequence of characters, and that new object is assigned to b. The string "h" is then eligible for garbage collection.
string b = "h";
b += "ello";
For more details, see the string documentation on MSDN.

Compiler optimization. Simple as that.

ReferenceEquals returning false with strings

private class global
{
public static int a = 0;
public static int val = 0;
public static int c = -1;
public static string g = "";
}
private void button8_Click(object sender, EventArgs e)
{
global.a = global.a + 1;
global.c = global.c + 1;
string a = label2.Text;
if (string.ReferenceEquals(a, global.g))
{
MessageBox.Show("a");
//dataGridView1.Rows[global.c].Cells[1].Value = global.a;
//dataGridView1.Rows[global.c].Cells[2].Value = global.val * global.a;
}
else
{
dataGridView1.Rows.Add(label2.Text, global.a, global.val);
}
global.g = label2.Text;
}
If button8 is pressed again with the same label2.Text, it should call MessageBox.Show(), but somehow global.g = label2.Text does not work. I tried with:
string a = "";
string b = "";
if (string.ReferenceEquals(a, b))
{
MessageBox.Show("a");
}
That works fine, but when I change b to global.g it skips the if...

As qqbenq states above... you should use String.Equals instead due to string interning.
You should NOT use reference equality to compare strings... as per Microsoft
you should not use ReferenceEquals to
determine string equality.
And a bit more detail further down in the link...
Constant strings within the same assembly are always interned by the
runtime. That is, only one instance of each unique literal string is
maintained. However, the runtime does not guarantee that strings
created at runtime are interned, nor does it guarantee that two equal
constant strings in different assemblies are interned.
Specifically to answer your question... how should I change my code...
Edited, as @Servy mentioned, to use the static string.Equals for the case where a is null.
string a = "";
string b = "";
if (string.Equals(a, b))
{
MessageBox.Show("a");
}
You should pretty much always use Equals for comparing reference types. Only use ReferenceEquals if you really want to check if they are not only equal but actually point to the same reference.

The problem here is the use of ReferenceEquals instead of Equals.
ReferenceEquals checks for equality of the reference, which is to say whether both variables point to the same object in memory, not whether the values match. It is a static method because it has such a precise use; it should never be overridden or hidden by a derived class.
Equals, on the other hand, compares the objects themselves and thus determines whether their underlying values are the same. Since this is a string, you also have overloads of Equals that let you specify exactly how the strings are compared.
Thus, change
if (string.ReferenceEquals(a, global.g))
{
MessageBox.Show("a");
//dataGridView1.Rows[global.c].Cells[1].Value = global.a;
//dataGridView1.Rows[global.c].Cells[2].Value = global.val * global.a;
}
to
if (string.Equals(a, global.g)) // Static string.Equals prevents a NullReferenceException if 'a' is null
{
MessageBox.Show("a");
//dataGridView1.Rows[global.c].Cells[1].Value = global.a;
//dataGridView1.Rows[global.c].Cells[2].Value = global.val * global.a;
}
Generally, you want to use Equals for comparison. ReferenceEquals has some very very specific use cases, and I've only ever had to use it once.
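A small sketch of the difference, under the assumption that text read from a control is a run-time string and therefore a different object from an equal literal. EqualsDemo and SameText are illustrative names, and the char[] constructor merely stands in for label2.Text:

```csharp
using System;

static class EqualsDemo
{
    // Null-safe ordinal comparison through the static string.Equals overload.
    public static bool SameText(string left, string right) =>
        string.Equals(left, right, StringComparison.Ordinal);

    static void Main()
    {
        string typed = new string("Total".ToCharArray()); // stands in for label2.Text
        string saved = "Total";
        string missing = null;

        Console.WriteLine(object.ReferenceEquals(typed, saved)); // False: different objects
        Console.WriteLine(SameText(typed, saved));               // True: same characters
        Console.WriteLine(SameText(missing, saved));             // False, and no exception

        // The StringComparison overloads state how the strings are compared.
        Console.WriteLine(string.Equals("TOTAL", saved, StringComparison.OrdinalIgnoreCase)); // True
    }
}
```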

String Equal problems in C#

Here is the C# code.
class Program
{
static void Main(string[] args)
{
char [] arry = {'a', 'b', 'c'};
String str1 = 'a' + "bc";
String str2 = "bcd";
String str3 = new String(arry);
if (str1 == str2)
Console.WriteLine("str1 == str2");
if (str1 == str3)
Console.WriteLine("str1 == str3");
if (String.Equals(str1, str3))
Console.WriteLine("String.Equals(str1, str3)");
String str4 = GetStr();
if (str1 == str4)
Console.WriteLine("str1 == str4");
if (String.Equals(str1, str4))
Console.WriteLine("String.Equals(str1, str4)");
if (str3 == str4)
Console.WriteLine("str3 == str4");
if (String.Equals(str3, str4))
Console.WriteLine("String.Equals(str3, str4)");
}
public static String GetStr()
{
String str = "ab" + 'c';
return str;
}
}
And the result is:
str1 == str3
String.Equals(str1, str3)
str1 == str4
String.Equals(str1, str4)
str3 == str4
String.Equals(str3, str4)
Why do all the results say "equal"?
As far as I know, the references are all different from each other.
So the results should have been "different", but they are not.
Why?
It seems that there is no reason to use String.Equals()!

You are confusing string.Equals with object.ReferenceEquals.
string.Equals overrides object.Equals (which has the same semantics as ReferenceEquals) and works by comparing the values of the strings. This is the reason that object.Equals is virtual in the first place.

Equality for a string has been overridden to be based on its value.
The documentation for String.Equals states that it checks the value, and that also happens to be what == does because of the string implementation.
Default equality for reference types is based on the reference itself, but that can easily be overridden... so basically your assertion is flawed, as it doesn't take into account types that override default behaviour.
As Jon has stated, reference equality can be forced via the object.ReferenceEquals static method, but as Jason has stated, this may also fail if the strings have been interned.
According to ILSpy, String.Equals ends up using == at any rate:
public static bool Equals(string a, string b)
{
return a == b || (a != null && b != null && a.Length == b.Length && string.EqualsHelper(a, b));
}

== has been overloaded for String to evaluate equality as values, not as references. From MSDN:
Although string is a reference type, the equality operators (== and !=) are defined to compare the values of string objects, not references.
However, what you need to be aware of is that some of the strings will be evaluated at compile-time, and the compiler will intern them (that is, hold a single reference to a string with a given value). Therefore, these strings might be equal as references too (but that is not guaranteed to be the case).

String.Equals Method
Determines whether two String objects have the same value.
This means that your output is completely normal.

Each string literal does not necessarily result in a new string instance. When two or more string literals that are equivalent according to the string equality operator appear in the same assembly, these string literals refer to the same string instance. For instance, the output produced by
class Test
{
static void Main() {
object a = "hello";
object b = "hello";
System.Console.WriteLine(a == b);
}
}
is True because the two literals refer to the same string instance.
You can read the complete explanation here:
http://msdn.microsoft.com/en-us/library/aa691090%28v=vs.71%29.aspx

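Another way to compare strings is:
var1string.CompareTo(var2string) == 0
CompareTo returns 0 when the two strings sort as equal. Note that it performs a culture-sensitive comparison designed for sorting, so the documentation recommends Equals when you only need to test for equality.
To use it, change: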
if (str1 == str2)
with this:
if (str1.CompareTo(str2) == 0)
And the rest too.

why don't string object refs behave like other object refs?

string a = "a";
string b = a;
string a = "c";
Why does string b still have the value "a" and not "c"?
As string is an object and not a stack value type, what's with this behaviour?
Thanks

You're pointing the variable to something new, it's no different than if you said
Foo a = new Foo();
Foo b = a;
a = new Foo();
// a no longer equal to b
In this example, b is pointing to what a initially referenced. By changing the value of a, a and b are no longer referencing the same object in memory. This is different than working with properties of a and b.
Foo a = new Foo();
Foo b = a;
a.Name = "Bar";
Console.WriteLine(b.Name);
In this case, "Bar" gets written to the screen because a and b still reference the same object.

Let me start by saying that your choices for variables and data are poor. It makes it very difficult for someone to say "the string a in your example..." because "a" could be the content of the string, or the variable containing the reference. (And it is easily confused with the indefinite article 'a'.)
Also, your code doesn't compile because it declares variable "a" twice. You are likely to get better answers if you ask questions in a way that makes them amenable to being answered clearly.
So let's start over.
We have two variables and two string literals.
string x = "hello";
string y = x;
x = "goodbye";
Now the question is "why does y equal 'hello' and not 'goodbye'"?
Let's go back to basics. What is a variable? A variable is a storage location.
What is a value of the string type? A value of the string type is a reference to string data.
What is a variable of type string? Put it together. A variable of type string is a storage location which holds a reference to string data.
So, what is x? a storage location. What is its first value? a reference to the string data "hello".
What is y? a storage location. What is its first value? a reference to the string data "hello", same as x.
Now we change the contents of storage location x to refer to the string data "goodbye". The contents of storage location y do not change; we didn't set y.
Make sense?
why don’t string object refs behave like other object refs?
I deny the premise of the question. String object refs do behave like other object refs. Can you give an example of where they don't?
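The storage-location reasoning above can be traced in a short sketch. AssignmentDemo and ValueOfYAfterReassigningX are illustrative names; the variable names and literals follow the x and y example:

```csharp
using System;

static class AssignmentDemo
{
    // Repeats the three statements from the example and reports what y refers to.
    public static string ValueOfYAfterReassigningX()
    {
        string x = "hello";
        string y = x;      // y now holds a copy of the reference stored in x
        x = "goodbye";     // only the contents of storage location x change
        return y;
    }

    static void Main()
    {
        string x = "hello";
        string y = x;
        Console.WriteLine(object.ReferenceEquals(x, y)); // True: both refer to "hello"

        x = "goodbye";
        Console.WriteLine(y);                            // hello
        Console.WriteLine(object.ReferenceEquals(x, y)); // False: x refers to "goodbye" now
    }
}
```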

Part of what confuses people so much about this is thinking of the following as an append operation:
str1 = str1 + str2;
If string were a mutable type, and the above were shorthand for something like this:
str1.Append(str2);
Then what you're asking would make sense.
But str1 = str1 + str2 is not just some method call on a mutable object; it is an assignment. Realizing this makes it clear that setting a = "c" in your example is no different from assigning any variable (reference type or not) to something new.
The below comparison between code that deals with two List<char> objects and code that deals with two string objects should hopefully make this clearer.
var a = new List<char>();
var b = a; // at this point, a and b refer to the same List<char>
b.Add('a'); // since a and b refer to the same List<char> ...
if (b.Contains('a')) { /* ...this is true... */ }
if (a.Contains('a')) { /* ...and so is this */ }
// HOWEVER...
a = new List<char>(); // now a and b do NOT refer to the same List<char>...
if (b.Contains('a')) { /* ...so this is still true... */ }
if (a.Contains('a')) { /* ...but this is not */ }
Compare this with a slightly modified version of the code you posted:
string a = "a";
string b = a; // at this point, a and b refer to the same string ("a")...
if (b == "a") { /* ...so this is true... */ }
if (a == "a") { /* ...and so is this */ }
// REMEMBER: the below is not simply an append operation like List<T>.Add --
// it is an ASSIGNMENT
a = a + "c"; // now they do not -- b is still "a", but a is "ac"
if (b == "a") { /* ...so this is still true... */ }
if (a == "a") { /* ...but this is not */ }

In .NET, a and b are references to objects, not the objects themselves. When you reassign a, you point that reference at a new memory location. The old memory location and any other references to it are unchanged.

I guess the OP thinks string objects to be mutable, so something like var = "content";
would actually store the new character array inside the already existing object.
String is, however, an immutable type, which means that in this case a new string object is created and assigned to var.
See for example:
http://codebetter.com/blogs/patricksmacchia/archive/2008/01/13/immutable-types-understand-them-and-use-them.aspx
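A brief sketch of what immutability means in practice, contrasting string with the mutable StringBuilder. ImmutabilityDemo and AppendThroughAlias are illustrative names:

```csharp
using System;
using System.Text;

static class ImmutabilityDemo
{
    // Mutates a StringBuilder through one reference and reads it through another.
    public static string AppendThroughAlias()
    {
        StringBuilder builder = new StringBuilder("abc");
        StringBuilder alias = builder;   // same object
        builder.Append("def");           // changes that object in place
        return alias.ToString();
    }

    static void Main()
    {
        string original = "abc";
        string upper = original.ToUpperInvariant(); // returns a new string

        Console.WriteLine(original);  // abc: the original object is untouched
        Console.WriteLine(upper);     // ABC

        Console.WriteLine(AppendThroughAlias()); // abcdef: the change is visible via the alias
    }
}
```

No string method changes the characters of an existing instance, so the only way a string variable shows a different value is by being assigned a different reference.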

It is a misunderstanding caused by the built-in string support of C#.
string a = "123"; //The way to write it in C#
string a = new string("123"); //Would be more obvious
The second form makes what happens more obvious, but it is verbose. Since strings have direct support from the compiler, calling the string constructor is unnecessary.
Writing your example verbosely:
string a = new string("a");
string b = a;
a = new string("c");
Here the behavior is as expected: a is assigned a reference to the new string object, while the reference held by b still points to the old string.
