Question about Object Identity and Object Equality and String class exception

This is a Java and C# question.
We all know that object identity (==) tests whether two references refer to the same location, and object equality (the Equals method) tests whether two different (non-identical) objects have the same value. But in the case of string objects, object identity and object equality appear to be the same.
For example, the two boolean expressions in the if statements below both evaluate to true:
string a="123";
string b="123";
if(a==b)
if(a.Equals(b))
Why is that?
What is the rationale behind this design decision?

Java and C# both use a memory-saving technique called string interning. Because strings are immutable in these languages, they can pool frequently used strings (including hard-coded string literals, like in your example) and use multiple references to that one string in memory to save space.
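As a minimal C# sketch of the pooling described above (the class and method names are made up for illustration, and the results assume the runtime's default interning of literals):

```csharp
using System;

public static class InternDemo
{
    // Two identical literals in the same program normally share one pooled instance.
    public static bool LiteralsShareInstance()
    {
        string a = "123";
        string b = "123";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time is a separate object until it is explicitly interned.
    public static bool InternReturnsPooledInstance()
    {
        string literal = "123";
        string built = new string(new[] { '1', '2', '3' });
        return !object.ReferenceEquals(literal, built)
            && object.ReferenceEquals(literal, string.Intern(built));
    }
}
```

Both methods are expected to return true on a typical runtime. Only string.Intern and object.ReferenceEquals are framework calls here.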

As far as I know, in .NET the == operator for strings is overloaded to use Equals() instead of object identity. See this explanation for details: http://www.dotnetperls.com/string-equals
If you need to know whether it's really the same object, use this:
Object.ReferenceEquals(string1, string2)
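To make the three checks concrete, here is a small sketch (StringEqualityDemo and Compare are hypothetical names, not part of any library). It compares a literal with an equal string built at run time:

```csharp
using System;

public static class StringEqualityDemo
{
    public static (bool OperatorEq, bool EqualsEq, bool SameInstance) Compare()
    {
        string a = "123";
        // Built at run time, so it is a separate object from the literal above.
        string b = new string(new[] { '1', '2', '3' });

        // == and Equals compare contents; ReferenceEquals compares identity.
        return (a == b, a.Equals(b), object.ReferenceEquals(a, b));
    }
}
```

Under these assumptions the result is (true, true, false): the values match, but the two variables refer to different objects.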

Actually, at least in Java, there is a caching mechanism for strings. A pitfall is that two strings that are equal will sometimes, but not always, return true when applying the identity operator. The following code prints false:
String a="123";
String b="12";
b=b+"3";
System.out.println(a==b);

If you really want to make sure that a.equals(b) is true but a == b is false for two Strings a and b, then you can use the completely undervalued (^^) String constructor:
String a = new String("abc");
String b = new String("abc");
if (a.equals(b)) {
doTheyAreEqual();
if (a != b) {
doButNotTheSame();
}
}

Related

LINQ C# Dictionary [duplicate]

true.ToString()
false.ToString()
Output:
True
False
Is there a valid reason for it being "True" and not "true"? It breaks when writing XML as XML's boolean type is lower case, and also isn't compatible with C#'s true/false (not sure about CLS though).
Update
Here is my very hacky way of getting around it in C# (for use with XML)
internal static string ToXmlString(this bool b)
{
return b.ToString().ToLower();
}
Of course that adds 1 more method to the stack, but removes ToLowers() everywhere.

Only people from Microsoft can really answer that question. However, I'd like to offer some fun facts about it ;)
First, this is what it says in MSDN about the Boolean.ToString() method:
Return Value
Type: System.String
TrueString if the value of this instance is true, or FalseString if the value of this instance is false.
Remarks
This method returns the constants "True" or "False". Note that XML is case-sensitive, and that the XML specification recognizes "true" and "false" as the valid set of Boolean values. If the String object returned by the ToString() method is to be written to an XML file, its String.ToLower method should be called first to convert it to lowercase.
Here comes the fun fact #1: it doesn't return TrueString or FalseString at all. It uses hardcoded literals "True" and "False". Wouldn't do you any good if it used the fields, because they're marked as readonly, so there's no changing them.
The alternative method, Boolean.ToString(IFormatProvider) is even funnier:
Remarks
The provider parameter is reserved. It does not participate in the execution of this method. This means that the Boolean.ToString(IFormatProvider) method, unlike most methods with a provider parameter, does not reflect culture-specific settings.
What's the solution? Depends on what exactly you're trying to do. Whatever it is, I bet it will require a hack ;)

...because the .NET environment is designed to support many languages.
System.Boolean (in mscorlib.dll) is designed to be used internally by languages to support a boolean datatype. C# uses all lowercase for its keywords, hence 'bool', 'true', and 'false'.
VB.NET however uses standard casing: hence 'Boolean', 'True', and 'False'.
Since the languages have to work together, you couldn't have true.ToString() (C#) giving a different result to True.ToString() (VB.NET). The CLR designers picked the standard CLR casing notation for the ToString() result.
The string representation of the boolean true is defined to be Boolean.TrueString.
(There's a similar case with System.String: C# presents it as the 'string' type).

For Xml you can use XmlConvert.ToString method.
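A short sketch of that suggestion (the wrapper class XmlBoolDemo and its method names are made up; XmlConvert.ToString and XmlConvert.ToBoolean are the framework calls):

```csharp
using System.Xml;

public static class XmlBoolDemo
{
    // Produces the lowercase form that the XML boolean type expects.
    public static string ToXml(bool value) => XmlConvert.ToString(value);

    // Reads an XML boolean back into a bool.
    public static bool FromXml(string text) => XmlConvert.ToBoolean(text);
}
```

XmlBoolDemo.ToXml(true) gives "true", so no ToLower() call is needed.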

It's simple code to convert that to all lower case.
Not so simple to convert "true" back to "True", however.
true.ToString().ToLower()
is what I use for xml output.

How is it not compatible with C#? Boolean.Parse and Boolean.TryParse are case insensitive, and the parsing is done by comparing the value to Boolean.TrueString or Boolean.FalseString, which are "True" and "False".
EDIT: When looking at the Boolean.ToString method in reflector it turns out that the strings are hard coded so the ToString method is as follows:
public override string ToString()
{
if (!this)
{
return "False";
}
return "True";
}
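The case-insensitive parsing mentioned above can be checked with a small sketch (BoolParseDemo is a hypothetical helper name):

```csharp
public static class BoolParseDemo
{
    // bool.Parse ignores case, so every casing of "true" parses to true.
    public static bool ParsesAnyCasing()
    {
        return bool.Parse("true") && bool.Parse("True") && bool.Parse("TRUE")
            && !bool.Parse("false");
    }

    // The output of ToString(), lowercased for XML, still parses back to the same value.
    public static bool RoundTrips(bool value)
    {
        return bool.Parse(value.ToString().ToLowerInvariant()) == value;
    }
}
```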

I know the reason why it is the way it is has already been addressed, but when it comes to "custom" boolean formatting, I've got two extension methods that I can't live without anymore :-)
public static class BoolExtensions
{
public static string ToString(this bool? v, string trueString, string falseString, string nullString="Undefined") {
return v == null ? nullString : v.Value ? trueString : falseString;
}
public static string ToString(this bool v, string trueString, string falseString) {
return ToString(v, trueString, falseString, null);
}
}
Usage is trivial. The following converts various bool values to their Portuguese representations:
string verdadeiro = true.ToString("verdadeiro", "falso");
string falso = false.ToString("verdadeiro", "falso");
bool? v = null;
string nulo = v.ToString("verdadeiro", "falso", "nulo");

This probably harks back to the old pre-.NET VB days, when bool.ToString produced True or False.

Why comparing two strings as object causes unexpected result

Consider the following piece of code.
object str = new string(new char[] { 't', 'e', 's', 't' });
object str1 = new string(new char[] { 't', 'e', 's', 't' });
Console.WriteLine(str==str1); // false
Console.WriteLine(str.Equals(str1)); // true
I understand how the equality operator works here: because we have implicitly cast to object, the equality operator checks whether the two references are equal and returns false.
But I am confused by the second one. Returning true looks like it is calling the Equals override provided by the String type, which checks whether the contents of the strings are equal.
My question is: why doesn't the operator check for content equality as well? Their actual type is string, not object, right?
Meanwhile, the following code outputs true for both:
object str = "test";
object str1 = "test";
Console.WriteLine(str==str1); // true
Console.WriteLine(str.Equals(str1)); // true

With:
Console.WriteLine(str==str1); // false
it is determined at compile-time which C# pre-defined (formal) overload of operator == to use. Since str and str1 are declared as object, the overload operator ==(object, object) is chosen. This is fixed at compile-time. Just because the actual run-time types happen to be more specific, that does not change. If you want binding at run-time, use Console.WriteLine((dynamic)str == (dynamic)str1); /* true */ instead.
With:
Console.WriteLine(str.Equals(str1)); // true
you call a virtual method on object. Virtual means it will go to whatever override is relevant at run-time. The class System.String has an override, and since str will have run-time type System.String, the override will be used by the "virtual dispatch".
Regarding the addition to the bottom of your question: That situation is different because of string interning. String interning is an optimization where the same physical instance is used for formally distinct strings whose values are identical. When you have two strings whose values are given in the source code, string interning will "optimize" and make two references to the same instance. This is usually harmless because strings are guaranteed to be immutable. So normally you do not care if it is the same instance or another instance with identical value. But in your example, we can "reveal" the interning.
Note: String interning was not relevant to your original question. Only after you added a new example to your question, string interning became relevant.
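The compile-time versus run-time binding described in this answer can be sketched as follows (BindingDemo is a made-up name; a cast to string is used instead of dynamic to keep the sketch free of extra dependencies):

```csharp
using System;

public static class BindingDemo
{
    public static (bool AsObject, bool AsString, bool ViaEquals) Compare()
    {
        object x = new string(new[] { 't', 'e', 's', 't' });
        object y = new string(new[] { 't', 'e', 's', 't' });

        bool asObject = x == y;                 // operator ==(object, object): reference comparison
        bool asString = (string)x == (string)y; // operator ==(string, string): value comparison
        bool viaEquals = x.Equals(y);           // virtual call, dispatches to String.Equals

        return (asObject, asString, viaEquals);
    }
}
```

The result is (false, true, true): the operator is chosen from the static types, while Equals is chosen from the run-time type.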

When == is used on expressions of type object, it performs a reference comparison, equivalent to System.Object.ReferenceEquals.
Equals is just a virtual method and behaves as such, so the overridden version will be used (which, for string type compares the contents).

This happens because of string interning; when you write:
object str = "test";
object str1 = "test";
Console.WriteLine(str==str1);
This works as expected because the two identical literals are interned, so both references actually point to the same object.
If you create a string from an array of chars, the result is not interned automatically, so, string being a reference type, the two variables refer to two different objects in memory.
Have a look at this article: https://blogs.msdn.microsoft.com/ericlippert/2009/09/28/string-interning-and-string-empty/
The Equals method is overridden in string, therefore it's comparing the actual content of the string rather than the address as == (ReferenceEquals) does in your case as the type is object.

I believe it is because the String == operator only takes string types as parameters, while the .Equals method takes object types as parameters.
Since the string == only take string types as parameters, the overload resolution selects the object == operator to use for the comparison.

The documentation for the String.Equals method gives this as a remark:
This method performs an ordinal (case-sensitive and culture-insensitive) comparison.
So the comparison is done by checking the strings char by char, thus giving true.

Possible to create case insensitive string class?

What would be required to create a case-insensitive string type that otherwise behaves exactly like a string?
I've never heard of anyone making a case insensitive string type like this and it's obviously not part of the framework, but it seems like it could be very useful. The fact that SQL does case insensitive comparisons by default is a great case in point. So I'm thinking it's either not possible, or else there's a really good reason why no one does it that I'm not aware of.
I know it would require using an implicit operator for assignment, and you would have to override the equals operator. And for overriding GetHashCode(), I'm thinking you could just return ToLower().GetHashCode().
What am I missing?

Comparing strings is rather easy. You can simply use the Equals method or the Compare method.
Example:
string s = "A";
s.Equals("a", StringComparison.InvariantCultureIgnoreCase); // Will return true.
s.Equals("a", StringComparison.InvariantCulture); // Will return false.
You should also look at this. That will explain a little more on comparing strings.

Building on deathismyfriend's answer above, I would extend the string class:
public static class StringExtensions
{
public static int CaseInsensitiveCompare(this string s, string stringToCompare)
{
return String.Compare(s, stringToCompare, StringComparison.InvariantCultureIgnoreCase);
}
}
And the call:
int result = firstString.CaseInsensitiveCompare(secondString);

It wouldn't behave "exactly like a string". The string type is special and is baked into the language spec. C# strings exhibit special behavior, such as
being an immutable reference type whose == operator compares values. Like any reference type, the reference itself is passed by value unless you use ref or out.
having their literals interned by default. That means that there is only ever a single instance of a given string literal. Strings built at run time are not interned automatically, though: the following code creates three separate instances that all contain quick, so Object.ReferenceEquals() is false when comparing any two unless you pass them through String.Intern() first:
string a = "The quick brown dog...".Substring(4,5) ;
string b = new string(new char[]{'q','u','i','c','k'});
string c = new StringBuilder()
.Append('q')
.Append('u')
.Append('i')
.Append('c')
.Append('k')
.ToString()
;
[edited to note: while one might think that this should be possible, a little fiddling around suggests that one can't actually create a custom implementation/subtype of CompareInfo as it has no public constructors and its default constructor is internal. More in the answers to this question: Globally set String.Compare/ CompareInfo.Compare to Ordinal
Grrr...]
What you could do is this:
String comparisons are done using the current culture's collation/comparison rules. Create a custom culture for your app, say, a copy of the US culture that uses the collation/comparison rules you need. Set that as the current culture and Bob's-yer-uncle.
You'll still get compiler/ReSharper whines because you're doing string comparisons without specifying the desired comparison semantics, but your code will be clean.
For more details, see
https://msdn.microsoft.com/en-us/library/kzwcbskc(v=vs.90).aspx
https://msdn.microsoft.com/en-us/library/se513yha(v=vs.100).aspx
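For comparison, the wrapper type the question describes can be sketched roughly like this. It is only an illustration under stated assumptions: CaseInsensitiveString is a hypothetical type, it uses ordinal case-insensitive rules via StringComparer.OrdinalIgnoreCase rather than ToLower(), and it deliberately offers no implicit conversion back to string, since that would make a == "text" ambiguous with string's own == operator.

```csharp
using System;

public readonly struct CaseInsensitiveString : IEquatable<CaseInsensitiveString>
{
    private static readonly StringComparer Comparer = StringComparer.OrdinalIgnoreCase;
    private readonly string _value;

    public CaseInsensitiveString(string value) => _value = value ?? string.Empty;

    // Allows: CaseInsensitiveString s = "Hello";
    public static implicit operator CaseInsensitiveString(string s) => new CaseInsensitiveString(s);

    public bool Equals(CaseInsensitiveString other) => Comparer.Equals(ToString(), other.ToString());
    public override bool Equals(object obj) => obj is CaseInsensitiveString other && Equals(other);

    // Must agree with Equals, so it uses the same comparer.
    public override int GetHashCode() => Comparer.GetHashCode(ToString());

    // default(CaseInsensitiveString) has a null field, so fall back to the empty string.
    public override string ToString() => _value ?? string.Empty;

    public static bool operator ==(CaseInsensitiveString a, CaseInsensitiveString b) => a.Equals(b);
    public static bool operator !=(CaseInsensitiveString a, CaseInsensitiveString b) => !a.Equals(b);
}
```

Such a type compares and hashes without regard to case, but it still is not a string: APIs that take string need ToString(), which is part of why it cannot behave exactly like the built-in type.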

Why strings does not compare references?

I know it is a special case, but why does == between strings return true when their values are equal, and not only when their references are equal? Does it have something to do with overloading operators?

The == operator is overloaded in String to perform value equality instead of reference equality, indeed. The idea is to make strings more friendly to the programmer and to avoid errors that arise when using reference equality to compare them (not too uncommon in Java, especially for beginners).
So far I have never needed to compare strings by reference, to be honest. If you need to do it you can use object.ReferenceEquals().

Because strings are immutable and the runtime may choose to put any two strings with the same content together into the same reference. So reference-comparing strings doesn't really make any sense.

Yes. From .NET Reflector here is the equality operator overloading of String class:
public static bool operator ==(string a, string b)
{
return Equals(a, b);
}

The equality operators (== and !=) are defined to compare the values of string objects, not references.
There was not any situation in which I had to compare the references but if you want to do so then you can use:
object.ReferenceEquals().

on a string, == compares by value
"Although string is a reference type, the equality operators (== and !=) are defined to compare the values of string objects, not references (7.9.7 String equality operators). This makes testing for string equality more intuitive."
In short, == on strings compares the strings by value, not by reference, because the C# specification says it should.

== or .Equals()

Why use one over the other?

== is the identity test. It will return true if the two objects being tested are in fact the same object. Equals() performs an equality test, and will return true if the two objects consider themselves equal.
Identity testing is faster, so you can use it when there's no need for more expensive equality tests. For example, comparing against null or the empty string.
It's possible to overload either of these to provide different behavior -- like identity testing for Equals() --, but for the sake of anybody reading your code, please don't.
Pointed out below: some types like String or DateTime provide overloads for the == operator that give it equality semantics. So the exact behavior will depend on the types of the objects you are comparing.
See also:
http://blogs.msdn.com/csharpfaq/archive/2004/03/29/102224.aspx

@John Millikin:
Pointed out below: some value types like DateTime provide overloads for the == operator that give it equality semantics. So the exact behavior will depend on the types of the objects you are comparing.
To elaborate:
DateTime is implemented as a struct. All structs derive from System.ValueType.
Since a struct variable holds the value itself rather than a reference to an object on the heap, there is no reference to compare, so you can only compare structs by value.
System.ValueType overrides .Equals() with a field-by-field equality check, which may use reflection to compare each field's value. It does not define ==; a struct such as DateTime supports == only because it declares that operator itself.
Because reflection is somewhat slow, if you implement your own struct, it is important to override .Equals() and add your own value checking code, as this will be much faster. Don't just call base.Equals().
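A minimal sketch of that advice, using a made-up Point2D struct (the name, fields, and hash formula are illustrative assumptions):

```csharp
using System;

public readonly struct Point2D : IEquatable<Point2D>
{
    public int X { get; }
    public int Y { get; }

    public Point2D(int x, int y) { X = x; Y = y; }

    // Typed comparison: no boxing and no reflection.
    public bool Equals(Point2D other) => X == other.X && Y == other.Y;

    public override bool Equals(object obj) => obj is Point2D other && Equals(other);

    // Equal values must produce equal hash codes.
    public override int GetHashCode() => unchecked((X * 397) ^ Y);

    // Structs get no == by default, so it is declared explicitly.
    public static bool operator ==(Point2D left, Point2D right) => left.Equals(right);
    public static bool operator !=(Point2D left, Point2D right) => !left.Equals(right);
}
```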

Everyone else pretty much has you covered, but I have one more word of advice. Every now and again, you will get someone who swears on his life (and those of his loved ones) that .Equals is more efficient/better/best-practice or some other dogmatic line. I can't speak to efficiency (well, OK, in certain circumstances I can), but I can speak to a big issue which will crop up: .Equals requires an object to exist. (Sounds stupid, but it throws people off.)
You can't do the following:
StringBuilder sb = null;
if (sb.Equals(null))
{
// whatever
}
It seems obvious to me, and perhaps most people, that you will get a NullReferenceException. However, proponents of .Equals forget about that little factoid. Some are even "thrown" off (sorry, couldn't resist) when they see the NullRefs start to pop up.
(And years before the DailyWTF posting, I did actually work with someone who mandated that all equality checks be .Equals instead of ==. Even proving his inaccuracy didn't help. We just made damn sure to break all his other rules so that no reference returned from a method nor property was ever null, and it worked out in the end.)
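The null pitfall can be illustrated with a short sketch (NullSafeDemo and its method names are hypothetical; the parameter is typed as object to keep overload resolution out of the picture). The static object.Equals(a, b) is one way to get Equals semantics without needing a non-null receiver:

```csharp
using System;

public static class NullSafeDemo
{
    // The == operator never needs a receiver, so null is fine here.
    public static bool IsNullViaOperator(object o) => o == null;

    // The static helper checks for null before calling the instance Equals.
    public static bool IsNullViaStaticEquals(object o) => object.Equals(o, null);

    // Calling the instance method on a null reference throws.
    public static bool ThrowsViaInstanceEquals(object o)
    {
        try
        {
            o.Equals(null);
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```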

== is generally the "identity" equals meaning "object a is in fact the exact same object in memory as object b".
equals() means that the objects logically equal (say, from a business point of view). So if you are comparing instances of a user-defined class, you would generally need to use and define equals() if you want things like a Hashtable to work properly.
If you had the proverbial Person class with properties "Name" and "Address" and you wanted to use this Person as a key into a Hashtable containing more information about them, you would need to implement equals() (and hash) so that you could create an instance of a Person and use it as a key into the Hashtable to get the information.
Using == alone, your new instance would not be the same.
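In C# terms, the Person example might look like the sketch below. The answer mentions a Hashtable; this sketch assumes its generic counterpart Dictionary<TKey, TValue> instead, and the sample data is invented:

```csharp
using System;
using System.Collections.Generic;

public sealed class Person : IEquatable<Person>
{
    public string Name { get; }
    public string Address { get; }

    public Person(string name, string address) { Name = name; Address = address; }

    public bool Equals(Person other) =>
        !ReferenceEquals(other, null) && Name == other.Name && Address == other.Address;

    public override bool Equals(object obj) => Equals(obj as Person);

    // Built from the same members that Equals compares.
    public override int GetHashCode()
    {
        unchecked
        {
            return ((Name?.GetHashCode() ?? 0) * 397) ^ (Address?.GetHashCode() ?? 0);
        }
    }
}

public static class PersonLookupDemo
{
    public static string Find()
    {
        var notes = new Dictionary<Person, string>
        {
            [new Person("Ann", "1 Main St")] = "prefers email"
        };
        // A new, distinct instance with the same values still finds the entry.
        return notes[new Person("Ann", "1 Main St")];
    }
}
```

Without the Equals and GetHashCode overrides, the lookup would throw a KeyNotFoundException, because the second Person is a different object.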

According to MSDN:
In C#, there are two different kinds of equality: reference equality (also known as identity) and value equality. Value equality is the generally understood meaning of equality: it means that two objects contain the same values. For example, two integers with the value of 2 have value equality. Reference equality means that there are not two objects to compare. Instead, there are two object references and both of them refer to the same object.
...
By default, the operator == tests for reference equality by determining whether two references indicate the same object.

Both Equals and == can be overloaded, so the exact results of calling one or the other will vary. Note that == is determined at compile time, so while the actual implementation could change, which == is used is fixed at compile time, unlike Equals which could use a different implementation based on the run time type of the left side.
For instance string performs an equality test for ==.
Also note that the semantics of both can be complex.
Best practice is to implement equality like this example. Note that you can simplify or exclude all of this depending on how you plan on using your class, and that structs get a value-based Equals (though not ==) already.
class ClassName
{
public bool Equals(ClassName other)
{
// ReferenceEquals avoids calling the overloaded == below, which would recurse forever.
if (ReferenceEquals(other, null))
{
return false;
}
else
{
//Do your equality test here.
return true; // placeholder: replace with a comparison of the relevant members
}
}
public override bool Equals(object obj)
{
ClassName other = obj as ClassName; //Null and non-ClassName objects will both become null
if (ReferenceEquals(other, null))
{
return false;
}
else
{
return Equals(other);
}
}
// Operators must be declared public static.
public static bool operator ==(ClassName left, ClassName right)
{
if (ReferenceEquals(left, null))
{
return ReferenceEquals(right, null);
}
else
{
return left.Equals(right);
}
}
public static bool operator !=(ClassName left, ClassName right)
{
if (ReferenceEquals(left, null))
{
return !ReferenceEquals(right, null);
}
else
{
return !left.Equals(right);
}
}
public override int GetHashCode()
{
//Return something useful here, typically all members shifted or XORed together works
return 0; // placeholder: valid but gives poor hash table performance
}
}

Another thing to take into consideration: the == operator may not be callable or may have different meaning if you access the object from another language. Usually, it's better to have an alternative that can be called by name.

The example works because the struct DateTime implements the IEquatable<DateTime> interface, which defines a "type-specific method for determining equality of instances" according to MSDN.

Use Equals if you want to express that the contents of the objects compared should be equal. Use == for primitive values or if you want to check that the objects being compared are one and the same object. For objects, == checks whether the references point to the same object, unless the type overloads the operator.

I have seen Object.ReferenceEquals() used in cases where one wants to know if two references refer to the same object

In most cases, they are the same, so you should use == for clarity. According to the Microsoft Framework Design Guidelines:
"DO ensure that Object.Equals and the equality operators have exactly the same semantics and similar performance characteristics."
https://learn.microsoft.com/en-us/dotnet/standard/design-guidelines/equality-operators
But sometimes, someone will override Object.Equals without providing equality operators. In that case, you should use Equals to test for value equality, and Object.ReferenceEquals to test for reference equality.

If you disassemble Object (with dotPeek, for example), then
public virtual bool Equals(Object obj)
is described as:
// Returns a boolean indicating if the passed in object obj is
// Equal to this. Equality is defined as object equality for reference
// types and bitwise equality for value types using a loader trick to
// replace Equals with EqualsValue for value types).
//
So it depends on the type.
For example:
Object o1 = "vvv";
Object o2 = "vvv";
bool b = o1.Equals(o2);
o1 = 555;
o2 = 555;
b = o1.Equals(o2);
o1 = new List<int> { 1, 2, 3 };
o2 = new List<int> { 1, 2, 3 };
b = o1.Equals(o2);
The first time b is true (String overrides Equals to compare contents), the second time b is true (the boxed ints are compared by value), and the third time b is false (List<int> does not override Equals, so the references are compared).
