I am relatively new to C#, and I noticed something interesting today that I guess I have never noticed or perhaps I am missing something. Here is an NUnit test to give an example:
object boolean1 = false;
object boolean2 = false;
Assert.That(boolean1 == boolean2);
This unit test fails, but this one passes:
object string1 = "string";
object string2 = "string";
Assert.That(string1 == string2);
I'm not that surprised in and of itself that the first one fails seeing as boolean1, and boolean2 are different references. But it is troubling to me that the first one fails, and the second one passes. I read (on MSDN somewhere) that some magic was done to the String class to facilitate this. I think my question really is why wasn't this behavior replicated in bool? As a note... if the boolean1 and 2 are declared as bool then there is no problem.
What is the reason for these differences or why it was implemented that way? Is there a situation where you would want to reference a bool object for anything except its value?
It's because the strings are in fact referring to the same instance. String literals are interned, so that unique strings are reused. This means that in your code, the two string variables will refer to the same, interned string instance.
You can read some more about it here: Strings in .NET and C# (by Jon Skeet)
Update
Just for completeness: as Anthony points out, it is string literals that are interned, which can be shown with the following code:
object firstString = "string1";
object secondString = "string1";
Console.WriteLine(firstString == secondString); // prints True
int n = 1;
// Reassign the existing variables; the values are now built at run time.
firstString = "string" + n.ToString();
secondString = "string" + n.ToString();
Console.WriteLine(firstString == secondString); // prints False
Operator Overloading.
The Boolean class does not have an overloaded == operator. The String class does.
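As Fredrik said, you are doing a reference comparison in the boolean test. String overloads == to do a value comparison, but overloads are chosen from the compile-time types of the operands, so with both variables declared as object the string test is a reference comparison too; it passes because both literals refer to the same interned instance. See the System.String page on MSDN.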
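A minimal sketch of how the static type of the operands decides which == runs. The class and method names are made up for illustration; the strings are built at run time so that interning does not hide the difference.

```csharp
using System;

public static class OperatorResolutionDemo
{
    // Builds a fresh string instance at run time (never the pooled literal).
    public static string Build() => new string(new[] { 'a', 'b', 'c' });

    // Operands typed as object: == is a reference comparison.
    public static bool CompareAsObjects(object a, object b) => a == b;

    // Operands typed as string: the String == overload compares values.
    public static bool CompareAsStrings(string a, string b) => a == b;

    public static void Main()
    {
        object left = Build();
        object right = Build();

        Console.WriteLine(CompareAsObjects(left, right));                 // False
        Console.WriteLine(CompareAsStrings((string)left, (string)right)); // True
        // Equals is virtual, so the String override runs even through object.
        Console.WriteLine(left.Equals(right));                            // True
    }
}
```

Calling Equals, or casting back to string before using ==, gives a value comparison regardless of how the variables were declared.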
Related
I recently learned about Stack and Heap and I wanted to ask a question concerning it. I've been "experimenting" with strings and I cannot explain - why is the following true if I am creating two different blocks of memory on the heap?
static void Main()
{
string test = "yes";
string secondTest = "yes";
Console.WriteLine(test == secondTest); //true
string thirdTest = new string("yes");
Console.WriteLine(secondTest == thirdTest); //true
}
The first string named test is the same as secondTest, because they have the same reference value, but when I create the third string thirdTest am I not creating a new block of memory on the heap by using "new"?
Why is it still true?
My guess:
What I wrote is exactly the same and I misunderstood the new operator, since when I watched tutorials, they were in Java language.
String name = "John";
String aThirdName = new String("John");
System.out.println(name == aThirdName); // false
This means that what I thought was different
(string test = "yes") = (string thirdTest = new string("yes"))
is actually the same. (By that I mean that those two lines are analogous.)
If my guess is right, how do I create a new memory block on the heap with the same value?
(I want to know, just for learning purposes, I know that it is ineffective for the memory to have a lot of variables that have the same value that are on different memory blocks inside the heap)
As mentioned in comments, string is a bad example since it has the == operator overloaded and the Equals method overridden. String is a reference type, but because of these overloads and other behavior it effectively behaves (in most cases) like a value type, especially with regard to equality.
That being said, if you were to create a simple class you'll find your test behaves exactly as you'd expect.
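As a sketch of that point, here is a hypothetical class (Label is an invented name) that neither overloads == nor overrides Equals, so both fall back to reference comparison:

```csharp
using System;

// Hypothetical class with no == overload and no Equals override.
public class Label
{
    public string Text { get; }

    public Label(string text)
    {
        Text = text;
    }
}

public static class SimpleClassDemo
{
    public static void Main()
    {
        var first = new Label("yes");
        var second = new Label("yes"); // same content, separate object on the heap
        var alias = first;             // second variable, same object

        Console.WriteLine(first == second);      // False: different objects
        Console.WriteLine(first.Equals(second)); // False: Object.Equals compares references
        Console.WriteLine(first == alias);       // True: one object, two references
    }
}
```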
Snippets of the overridden equality in String to give you some context.
public static bool operator ==(string? a, string? b)
{
return string.Equals(a, b);
}
// Determines whether two Strings match.
public static bool Equals(string? a, string? b)
{
if (object.ReferenceEquals(a,b))
{
return true;
}
if (a is null || b is null || a.Length != b.Length)
{
return false;
}
return EqualsHelper(a, b);
}
It then starts down a rabbit hole of code with EqualsHelper that's not worth chasing in here (if you're interested, you can decompile it or find it online).
string firstTest = new string("test") and string secondTest = "test" hold the same value, but they are not the same instance: the constructor allocates a new object, while the literal refers to the interned one. As for why firstTest == secondTest is true: the String class overrides the Equals method and overloads the == operator, which calls Equals.
I see that String.Intern will actually add a string to the intern-pool and String.IsInterned will return the reference to that corresponding interned string. This makes me wonder:
Why does IsInterned return the referenced interned string and not a bool indicating whether a given string has been interned so far? I feel it's a funny use for an Is notation.
In what case would the code below return true?
bool InternCheck(string s)
{
string internedString = String.IsInterned(s);
return internedString != null && !String.Equals(internedString, s);
}
Why does IsInterned return the referenced interned string and not a bool indicating whether a given string has been interned so far? I feel it's a funny use for an Is notation.
For definitive "why?" you need to ask Microsoft. However, compare IsInterned() with similar (though functionally different of course) HashSet<T>.Add(). I.e. it's convenient to have a method that checks whether something is true, and if it is, provides the value you wanted as part of returning the information you want.
As for why this method doesn't follow the TryXXX() pattern, again, you'd have to ask Microsoft, but we can easily guess. Obviously the method could have returned a bool and provided the string reference as an out parameter. But note that here the returned value is a reference type, so null is an adequate way to indicate non-existence, which is not the case for the various types that implement TryXXX() methods.
In what case would the code below return true?
I don't see how that code would ever return true. If the string is not interned, it will return false, and if it is interned, then the interned string is necessarily always equal to the string that was passed in, and so !string.Equals(...) would also be false.
Is there some reason you think otherwise?
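A small sketch that exercises the method from the question. It interns a value explicitly first so the result does not depend on how literals happen to be pooled; the class name and the sample text are arbitrary.

```csharp
using System;

public static class InternCheckDemo
{
    // The method from the question, logic unchanged.
    public static bool InternCheck(string s)
    {
        string internedString = String.IsInterned(s);
        return internedString != null && !String.Equals(internedString, s);
    }

    public static void Main()
    {
        char[] chars = { 'p', 'o', 'o', 'l' };

        // Make sure the pool holds an entry with this value.
        string pooled = String.Intern(new string(chars));
        // A second instance with the same characters, not the pooled one.
        string built = new string(chars);

        string found = String.IsInterned(built);
        Console.WriteLine(object.ReferenceEquals(found, pooled)); // True: the pooled instance comes back
        Console.WriteLine(object.ReferenceEquals(found, built));  // False: not the instance passed in
        Console.WriteLine(String.Equals(found, built));           // True: same characters
        Console.WriteLine(InternCheck(built));                    // False
    }
}
```

The returned reference can differ from the argument, but its value cannot, which is why the value-based String.Equals check never fails.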
Let's imagine that the String.IsInterned method were to return a bool. Then all you'd know from calling bool whoopie = String.IsInterned(s); is that the value of your string is the same as that of a string that is interned. There is no indication that you hold the same reference as the interned string.
Now the point of interning is to hold memory pressure down. You know you're creating a lot of similar strings and you want to ensure that you're not clogging up memory.
There's a cost to interning and that cost better be less than the cost of using up RAM.
So, back to String.IsInterned hypothetically returning a bool.
Since you don't know if you have the interned reference, which you'd want otherwise there's no point in interning, you'd end up writing this code a lot:
if (String.IsInterned(s))
{
s = String.GetInterned(s);
}
Or:
s = String.IsInterned(s) ? String.GetInterned(s) : s;
String.GetInterned is also a hypothetical method.
With the actual implementation of IsInterned this code becomes slightly simpler:
s = String.IsInterned(s) ?? s;
Let's see if we can improve this design.
If I try to implement a TryGetInterned style of method I might implement it like this:
public static bool TryGetInterned(this string input, out string output)
{
string intermediate = String.IsInterned(input);
output = intermediate ?? input;
return intermediate != null;
}
This code works perfectly fine, but it leads to this kind of code repetition:
string s = "Hello World";
if (s.TryGetInterned(out string s2))
Console.WriteLine(s2); // `s` is interned
else
Console.WriteLine(s2); // `s` is NOT interned
This seems pretty pointless.
Compare this to the current IsInterned method:
string s = "Hello World";
s = String.IsInterned(s) ?? s;
Console.WriteLine(s);
Much simpler.
The only implementation that I could consider an improvement, in some circumstances, is this:
public static string GetIsInternedOrSelf(this string input)
=> String.IsInterned(input) ?? input;
Now I have this:
string s = "Hello World";
s = s.GetIsInternedOrSelf();
Console.WriteLine(s);
It's an improvement, but we've lost the ability to know if the string was interned.
The bottom line is that I think String.IsInterned is probably as well designed as it could be.
There are many questions discussing this topic with ref to Javascript; but I could not get any with ref to C#.
Both the 'String........' statements below return false.
// the following querystring value comes from a JQuery/Ajax call
var thisfieldvalue = Request.QueryString["fieldvalue"];
bool boola = String.IsNullOrWhiteSpace(thisfieldvalue );
bool boolb = String.IsNullOrEmpty(thisfieldvalue );
What is the best way to check for Undefined string variable in C#?
Note:
I get 'Undefined variable' values occasionally, via the JQuery/Ajax calls with the 'querystring'; and it ends up in the C# variable when I use the statement
var thisfieldvalue = Request.QueryString["fieldvalue"];
and the 'thisfieldvalue' variable passes both the 'String.IsNullOrWhiteSpace' as well as the 'String.IsNullOrEmpty' checks....
Note 2: I have edited the question again to make my question clearer... I am sorry that earlier it was not that clear....
You could use either:
string Undefined_var = "[value to test goes here]"; //note that string must be assigned before it is used
bool boola = String.IsNullOrWhiteSpace(Undefined_var);
//or
bool boolb = String.IsNullOrEmpty(Undefined_var);
Difference being that IsNullOrWhiteSpace will check everything that IsNullOrEmpty does, plus the case when Undefined_var consists of only white space. But since a string consisting of only white space characters is not technically undefined, I would go with IsNullOrEmpty of the two.
But do note that since string is a reference type, its default value is null; so if you wanted to narrow it down a step further and skip the test for an empty string, you could do something like this:
string Undefined_var = null;
bool boola = Undefined_var == null;
There are no "undefined" string variables in C#.
String is a reference type, therefore if you don't define a value, its default value is null.
There is no difference between a string not set to a value (default value null) and a string explicitly set to null.
In Visual Studio 2013 your code doesn't even compile. The first check gets flagged as use of unassigned local variable.
As C# is a strongly typed language, use it to your advantage, set the value explicitly:
string Undefined_var = null;
bool boola = String.IsNullOrWhiteSpace(Undefined_var);
bool boolb = String.IsNullOrEmpty(Undefined_var);
Then you will get two true values.
The question is not applicable to C# because C# does not allow undefined local variables. Members of classes are initialized to their type's default value (for reference types, null).
if (Request.QueryString["fieldvalue"] == "undefined")
A query string value is a string, so it comes across literally as text.
If it is 5, it's the string "5".
If it is not there on the JavaScript side, it's the string "undefined".
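A minimal sketch of such a check, assuming the client puts the literal text "undefined" into the URL (for example through string concatenation). IsMissing is a hypothetical helper name, not part of ASP.NET; in the page it would be called as IsMissing(Request.QueryString["fieldvalue"]).

```csharp
using System;

public static class QueryValueDemo
{
    // Hypothetical helper: treats null, blank text, and the literal
    // word "undefined" as a missing value.
    public static bool IsMissing(string value)
    {
        return String.IsNullOrWhiteSpace(value)
            || String.Equals(value.Trim(), "undefined", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(IsMissing(null));        // True
        Console.WriteLine(IsMissing("   "));       // True
        Console.WriteLine(IsMissing("undefined")); // True
        Console.WriteLine(IsMissing("5"));         // False
    }
}
```

The comparison ignores case and surrounding whitespace on purpose; drop that if a real field may legitimately contain the word "undefined".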
I have the Following Code
CASE 1
string string1 = "pankaj";
string string2 = "pankaj";
Console.WriteLine(string1 == string2); // output TRUE
CASE 2
object obj1 = "pankaj";
object obj2 = "pankaj";
Console.WriteLine(obj1==obj2); // Output TRUE
CASE 3
object againObject1 = 2;
object againObject2 = 2;
Console.WriteLine(againObject1==againObject2); // Output FALSE
As string and object are both reference types, and for reference types I learned that the equality operation checks whether they hold the same address, why are the first two cases comparing values instead of references?
What is more confusing is the behavior of the equality operator for the object type in case 2 and case 3: for strings it evaluates to true, and for integers it returns false.
String equality is different. Among many other things...
Examples 1 and 2 will in both cases refer to the exact same object: the INTERNED string ("pankaj" exists only once after interning, and all constant strings are interned).
Example 3 has 2 boxed objects without any optimization, so 2 boxes around a value type.
Strings are objects and integers are too, but the latter are value types. So in example 3 the two variables point to two different places in memory, because each assignment boxes the integer into its own object, and == then compares the addresses of those boxes.
Using:
object1 == object2
doesn't compare the content of the objects; it compares their storage addresses.
If the object is comparable, use object1.Equals(object2).
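The String class has overloaded operator == so as to implement comparison by value, and Int32 has no such overload that applies to boxed values. Note that in cases 2 and 3 both operands are typed as object, so == compares references either way; case 2 is true only because both literals refer to the same interned instance.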
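A short sketch of case 3, showing that the two boxes are distinct objects while the values inside them are still equal. The class and method names are invented for the example.

```csharp
using System;

public static class BoxingDemo
{
    // Operands typed as object: == compares references.
    public static bool SameReference(object a, object b) => a == b;

    public static void Main()
    {
        object first = 2;  // boxes the int into a new object
        object second = 2; // boxes again into another object

        Console.WriteLine(SameReference(first, second));  // False: two separate boxes
        Console.WriteLine(first.Equals(second));          // True: Int32.Equals compares values
        Console.WriteLine(object.Equals(first, second));  // True: null-safe, calls first.Equals
        Console.WriteLine((int)first == (int)second);     // True: unboxed, the int == applies
    }
}
```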
I know there are a lot of ways to compare VALUE and REFERENCES in C#, but I'm still a bit confused about what type performs what when you try to compare either VALUE or REFERENCE.
String examples:
string str = "hello";
string str2 = "hello";
if (str == str2)
{
Console.WriteLine("Something");
} // Is this a comparison of value?
if (str.Equals(str2))
{
Console.WriteLine("Something");
} // Is this a comparison of value?
string.ReferenceEquals(str, str2); // Comparison of reference (True)
Console.WriteLine((object)str == (object)str2); // Comparison of reference (True)
For reference types, Equals and == compare by reference by default if they're not overridden / overloaded in a subclass. ReferenceEquals will always compare by reference.
Strings are a confusing data type to use for experimenting with this, because they overload == to implement value equality; also, since they're immutable, C# will generally reuse the same instance for the same literal string. In your code, str and str2 will be the same object.
@Inerdia is right, but I'd like to point out why the line string.ReferenceEquals(str, str2) returns true in your code example. Because both strings are defined at compile time, the compiler can optimise the code so that both point to the same instance of the string. Since strings are immutable, the compiler knows it can do this even though String is a reference type. But if you change your code to generate one of the strings dynamically (as shown below), the compiler can't perform this optimisation. So if you change your code example to:
string str = "hello";
string str2 = new StringBuilder().Append("he").Append("llo").ToString();
Then the string.ReferenceEquals(str, str2) line will now return false, as this time the compiler can't know to re-use the same instance (reference) of the string.
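A runnable sketch of that change, with one extra step that is my addition rather than part of the answer: passing both strings through String.Intern maps them back to a single pooled instance.

```csharp
using System;
using System.Text;

public static class RuntimeStringDemo
{
    // Builds "hello" at run time, so the result is a fresh instance.
    public static string BuildHello() =>
        new StringBuilder().Append("he").Append("llo").ToString();

    public static void Main()
    {
        string str = "hello";
        string str2 = BuildHello();

        Console.WriteLine(str == str2);                       // True: same characters
        Console.WriteLine(object.ReferenceEquals(str, str2)); // False: separate instances

        // Intern returns the pooled instance for a given value.
        Console.WriteLine(object.ReferenceEquals(
            String.Intern(str), String.Intern(str2)));        // True
    }
}
```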
Equality and comparison of reference types and strings:
Reference types work like this:
System.Object a = new System.Object();
System.Object b = new System.Object();
a == b; //returns false
a.Equals(b); //returns false
b = a;
a == b; //returns true
a.Equals(b); //returns true
Since strings are Reference types they should do the same, shouldn't they? But they don't!
C# Documentation defines string equality like this:
Although string is a reference type, the equality operators (== and
!=) are defined to compare the values of string objects, not
references (7.9.7 String equality operators). This makes testing for
string equality more intuitive.
https://msdn.microsoft.com/en-us/library/362314fe%28v=vs.71%29.aspx
https://msdn.microsoft.com/en-us/library/aa664728%28v=vs.71%29.aspx
This has implications for your test code.
if (str == str2)
{
Console.WriteLine("Something");
} // This is a comparison of value even though string is a reference type
if (str.Equals(str2))
{
Console.WriteLine("Something");
} // This is comparison by value too, because Equals is overridden in the String class.
Keep in mind that you as a programmer (or your tricky coworker) can override .Equals() and change its behaviour; what you see above is what should happen. It's not necessarily in line with the reality of your codebase. When in doubt, check the definition by selecting .Equals() and hitting F12.
Addendum for x.Equals
The behavior of object.Equals() should follow these rules:
x.Equals(x) returns true.
x.Equals(y) returns the same value as y.Equals(x).
if (x.Equals(y) && y.Equals(z)) returns true, then x.Equals(z) returns true.
Successive invocations of x.Equals(y) return the same value as long as the objects referenced by x and y are not modified.
x.Equals(null) returns false.
https://msdn.microsoft.com/ru-ru/library/ms173147%28v=vs.80%29.aspx
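As an illustration of those rules, here is a sketch of a hypothetical Money class that overrides Equals (and GetHashCode along with it). The class is invented for the example; note that == is left alone and therefore still compares references.

```csharp
using System;

// Hypothetical value-like class, used only for illustration.
public sealed class Money : IEquatable<Money>
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    // Value comparison; a null argument is never equal.
    public bool Equals(Money other) =>
        !object.ReferenceEquals(other, null)
        && Amount == other.Amount
        && Currency == other.Currency;

    public override bool Equals(object obj) => Equals(obj as Money);

    // Equal objects must produce equal hash codes.
    public override int GetHashCode() => (Amount, Currency).GetHashCode();
}

public static class EqualsRulesDemo
{
    public static void Main()
    {
        var x = new Money(5m, "EUR");
        var y = new Money(5m, "EUR");
        var z = new Money(5m, "EUR");

        Console.WriteLine(x.Equals(x));                 // True: reflexive
        Console.WriteLine(x.Equals(y) == y.Equals(x));  // True: symmetric
        Console.WriteLine(x.Equals(y) && y.Equals(z) && x.Equals(z)); // True: transitive
        Console.WriteLine(x.Equals(null));              // False
        Console.WriteLine(x == y);                      // False: == was not overloaded
    }
}
```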
Whenever you are in doubt you can call Object.ReferenceEquals, which is documented as follows:
Unlike the Object.Equals(Object) method and the equality operator, the
Object.ReferenceEquals(Object) method cannot be overridden. Because of
this, if you want to test two object references for equality and you
are unsure about the implementation of the Equals method, you can call
the method.
https://msdn.microsoft.com/de-de/library/system.object.referenceequals%28v=vs.110%29.aspx
Thus:
System.Object a = new System.Object();
System.Object b = a;
System.Object.ReferenceEquals(a, b); //returns true
In your example the compiler merges your strings in optimization thus:
string str = "hello";
string str2 = "hello";
string.ReferenceEquals(str, str2); // Comparison of reference (True)
This behaviour is only due to compiler optimization in your example; if str2 is reassigned at run time it will return false:
string str = "hello";
string str2 = "hello";
if(throwCoin)
{
str2 = "bye";
}
string.ReferenceEquals(str, str2); // Comparison of reference (False)
string.ReferenceEquals(str, str2);
It obviously compares references.
str.Equals(str2)
Tries to compare references at first. Then it tries to compare by value.
str == str2
Does the same as Equals.
A good way to compare strings is to use string.Compare. If you want to ignore case, there is a parameter in place for that too.
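A brief sketch of that suggestion, using the StringComparison overloads. SameIgnoringCase is a made-up helper name; string.Equals with a StringComparison argument is shown next to it because it reads more directly when only equality matters.

```csharp
using System;

public static class CompareDemo
{
    // string.Compare returns 0 when the two strings are considered equal.
    public static bool SameIgnoringCase(string a, string b) =>
        string.Compare(a, b, StringComparison.OrdinalIgnoreCase) == 0;

    public static void Main()
    {
        Console.WriteLine(SameIgnoringCase("Hello", "hello")); // True
        Console.WriteLine(SameIgnoringCase("Hello", "help"));  // False

        // Negative result: "apple" sorts before "banana".
        Console.WriteLine(string.Compare("apple", "banana", StringComparison.Ordinal) < 0); // True

        // Equality only, same case-insensitive rule.
        Console.WriteLine(string.Equals("Hello", "hello", StringComparison.OrdinalIgnoreCase)); // True
    }
}
```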
Excerpt from .net sources:
public bool Equals(string value)
{
if (this == null)
throw new NullReferenceException();
else if (value == null)
return false;
else if (object.ReferenceEquals((object) this, (object) value))
return true;
else
return string.EqualsHelper(this, value);
}
So in general it is a comparison of references first and, if they don't match, a comparison of values.