Explanation:
Equals() compares the values of two objects.
ReferenceEquals() compares their references.
For reference types operator== by default compares references, while for value types it performs (AFAIK) the equivalent of Equals() using reflection.
So. I have a situation where I need to compare two reference types by their values. I can explicitly call Equals() or I can overload operator== to perform the desired comparison.
However, overloading operator== for value comparison kinda-sorta violates the principle of least astonishment. On the other hand explicitly calling two-object Equals looks like overkill.
What is standard practice here?
I know how to override Equals(). The question was whether it is commonly acceptable to override operator== to test for value equality on reference types or whether it is commonly accepted practice to explicitly call Equals/ReferenceEquals to explicitly specify which comparison you want.
What is standard practice?
The "standard practice" is, if you want to check two elements for equality which isn't reference equality, you need to implement IEquatable<T>, which introduces a Equals(T other) method, and override GetHashCode. That way, you control the way these two objects are compared. Usually, this includes overriding the == and != operators as well.
while for value types it performs (AFAIK) equivalent of Equals() using
reflection.
Only if your value type has a member which is a reference type. If it's all value types, it will do a bit comparison of these two objects:
// if there are no GC references in this object we can avoid reflection
// and do a fast memcmp
if (CanCompareBits(this))
return FastEqualsCheck(thisObj, obj);
The question was whether it is commonly acceptable to override
operator== to test for value equality on reference types
It really depends on what you're doing. It is encouraged to override the == operator once you override Equals because you want a consistent behavior value equality semantics. This depends on your definition of equality between two objects.
Although, If an object is mutable, then doing value comparison might result in odd scenarios where two objects are considered equal but later, one is mutated. This should definitely be analyzed on per case basis. Usually, overriding Equals should suffice.
Equals() performs value comparison of two objects.
This is not true. The default behavior of object.Equals on value types is to compare each of the fields using their definition of equality, the default behavior of reference types is to compare their references. It can be overridden to do whatever you want it to do. It is exactly the same as the == operator in this regard.
The only difference between the == operator and Equals is that Equals will perform a virtual dispatch on the first (but not the second) operand, finding the implementation of the method based on the runtime type of that object. The == operator is entirely statically bound; it considers only the compile time type of both operands. Other than this difference in binding both have the same default behaviors, and both can be overridden to provide whatever implementation you want.
The standard practice is to always ensure that the behavior of Equals and operator == is the same for your type. If you override the Equals method to change the equality semantics, then you should also overload the == operator to provide *identical semantics`, and vice versa.
The question was whether it is commonly acceptable to override
operator== to test for value equality
It depends on the object, if the object is immutable then you can override == operator, otherwise not. (Remember they are just guidelines).
See: Guidelines for Overriding Equals() and Operator == (C# Programming Guide)
By default, the operator == tests for reference equality by
determining whether two references indicate the same object.
Therefore, reference types do not have to implement operator == in
order to gain this functionality. When a type is immutable, that
is, the data that is contained in the instance cannot be changed,
overloading operator == to compare value equality instead of reference equality can be useful because, as immutable objects, they
can be considered the same as long as they have the same value. It
is not a good idea to override operator == in non-immutable types.
It's a good practice to give a semantic meaning to your code. So, if reference comparison is really something you should care about, use default behaviour for your classes; otherwise your application context dependent logic should be used for comparison. (with consistent behaviour of all the equality members like Equals, GetHashCode and operators)
Related
Link:
• Consider overriding Equals on a reference type if the semantics of
the type are based on the fact that the type represents some value(s).
• Most reference types must not overload the equality operator, even
if they override Equals. However, if you are implementing a reference
type that is intended to have value semantics, such as a complex
number type, you must override the equality operator.
a) To my understanding, for different instances of a reference type to be interchangeable, we should override both Equals method and the equality operator and also make the type immutable?
b) Doesn't a reference type having value semantics suggest that different instances ( that represent the same value ) of that type should be interchangeable?
c) But according to above quote, certain reference types with value semantics should only have Equals method overridden, but not also the equality operator. How can we claim such types have value semantics, since instances of that type are obviously not interchangeable?
d) So based on what criteria do we decide whether a reference type with value semantics should only have its Equals method overridden or also its equality operator? Simply based on whether or not we're willing to make the type immutable?
thanx
Regarding point A, yes, the type ought to be immutable. From MSDN:
You should not override Equals on a mutable reference type.
I think that D is the core question here, and the framework design guidelines seem to indicate that this comes down to performance:
AVOID overloading equality operators on reference types if the
implementation would be significantly slower than that of reference
equality.
Eric Lippert has some interesting things to say about this here. My favorite quote from it is:
The long answer is that the whole thing is weird and neither works the
way it ideally ought to.
Personally this lets me breathe a sigh of relief, as I have always been of the opinion that "==" is functionally a readable shorthand for Equals() (even though I know it isn't).
Failing to override GetHashCode and Equals when overloading the equality operator causes the compiler to produce warnings. Why would it be a good idea to change the implementation of either? After reading Eric Lippert's blog post on GetHashCode it's seems like there probably aren't many useful alternatives to GetHashCode's base implementation, why does the compiler I encourage you to change it?
Let's suppose you are implementing a class.
If you are overloading == then you are producing a type that has value equality as opposed to reference equality.
Given that, now the question is "how desirable is it to have a class that implements reference equality in .Equals() and value equality in ==?" and the answer is "not very desirable". That seems like a potential source of confusion. (And in fact, the company that I now work for, Coverity, produces a defect discovery tool that checks to see if you are confusing value equality with reference equality for precisely this reason. Coincidentally I was just reading the spec for it when I saw your question!)
Moreover, if you are going to have a class that implements both value and reference equality, the usual way to do it is to override Equals and leave == alone, not the other way around.
Therefore, given that you have overloaded ==, it is strongly suggested that you also override Equals.
If you are overriding Equals to produce value equality then you are required to override GetHashCode to match, as you know if you've read my article that you linked to.
If you don't override Equals() when you override == you will have some amazingly bad code.
How would you feel about this happening?
if (x == y)
{
if (!x.Equals(y))
throw new InvalidOperationException("Wut?");
}
Here's an example. Given this class:
class Test
{
public int Value;
public string Name;
public static bool operator==(Test lhs, Test rhs)
{
if (ReferenceEquals(lhs, rhs))
return true;
if (ReferenceEquals(lhs, null) || ReferenceEquals(rhs, null))
return false;
return lhs.Value == rhs.Value;
}
public static bool operator!=(Test lhs, Test rhs)
{
return !(lhs == rhs);
}
}
This code will behave oddly:
Test test1 = new Test { Value = 1, Name = "1" };
Test test2 = new Test { Value = 1, Name = "2" };
if (test1 == test2)
Console.WriteLine("test1 == test2"); // This gets printed.
else
Console.WriteLine("test1 != test2");
if (test1.Equals(test2))
Console.WriteLine("test1.Equals(test2)");
else
Console.WriteLine("NOT test1.Equals(test2)"); // This gets printed!
You do NOT want this!
My guess is that the compiler takes its clues from your actions, and decides that since you find it important to provide an alternative implementation of the equality operator, then you probably want the object equality to remain consistent with your new implementation of ==. After all, you do not want the two equality comparisons to mean drastically different things, otherwise your program would be hard to understand even on a very basic level. Therefore, the compiler thinks that you should redefine Equals as well.
Once you provide an alternative implementation Equals, however, you need to modify GetHashCode to stay consistent with the equality implementation. Hence the compiler warns you that your implementation might be incomplete, and suggests overriding both Equals and GetHashCode.
If you don't overload the Equals method too, then using it might give different results from the ones you'd get with the operator. Like, if you overload = for integers...
int i = 1;
(1 == 1) == (i.Equals(1))
Could evaluate to false.
For the same reason, you should reimplement the GetHashCode method so you don't mess up with hashtables and such other structures that rely on hash comparisons.
Notice I'm saying "might" and "could", not "will". The warnings are there just as a reminder that unexpected things might happen if you don't follow its suggestions. Otherwise you'd get errors instead of warnings.
The documentation is pretty clear about this:
The GetHashCode method can be overridden by a derived type. Value
types must override this method to provide a hash function that is
appropriate for that type and to provide a useful distribution in a
hash table. For uniqueness, the hash code must be based on the value
of an instance field or property instead of a static field or
property.
Objects used as a key in a Hashtable object must also override the
GetHashCode method because those objects must generate their own hash
code. If an object used as a key does not provide a useful
implementation of GetHashCode, you can specify a hash code provider
when the Hashtable object is constructed. Prior to the .NET Framework
version 2.0, the hash code provider was based on the
System.Collections.IHashCodeProvider interface. Starting with version
2.0, the hash code provider is based on the System.Collections.IEqualityComparer interface.
according to msdn
IStructuralEquatable
Defines methods to support the comparison of objects for structural
equality. Structural equality means that two objects are equal because
they have equal values. It differs from reference equality, which
indicates that two object references are equal because they reference
the same physical object.
isnt it what Equals should do ? ( when overriding IEquatable) ?
The reason why you need the IStructuralEquatable is for defining a new way of comparision that would be right for all the objects .
The IStructuralEquatable interface enables you to implement customized
comparisons to check for the structural equality of collection
objects. That is, you can create your own definition of structural
equality and specify that this definition be used with a collection
type that accepts the IStructuralEquatable interface.
For example if you want a list that will sort all its elements by a specific definition.
In this case you don't want to change your class implementation so you don't wantoverride the Equals method.
this will define a general way to compare objects in your application.
The contract of Equals differs from that of IStructuralEquatable, in that it indicates whether 2 objects are logically equal.
By default, Equals on a reference type indicates whether two object references reference the same object instance. However, you are able to override Equals according to the logic of your application.
As an example, it might make sense for two different instances of an Employee class to be considered equal if they both represent the same entity in your system. To achieve this, employee objects with matching SSN properties would be treated as logically equal, even if they were not structurally equal.
Do the keys of a Dictionary need to be comparable with equality?
For example
Class mytype
{
public bool equals(mytype other)
{
return ...;
}
}
In my case they won't be equal unless they are the same instance.
If I need to implement equality should I have a large numeric value that increments with every new instance of mytype created?
If your classes are only equal if they are same instance, then you don't need to do anything to use them in a Dictionary. Classes (reference types) are considered equal if and only if the refer to the same object.
From the documentation of GetHashCode
For derived classes of Object, the GetHashCode method can delegate to the Object.GetHashCode implementation, if and only if that derived class defines value equality to be reference equality and the type is not a value type.
Which seems to be true in your case. As a rule of thumb, if you override Equal you need to override GetHashCode as well but this is not necessary in your case as the default is what you are looking for.
By default, equality is based on the instance. Two separate instances are never equal. You can only change that by providing your own Equals method.
Only if they are being used as a key and you don't want to base equivalence on the instance of the object itself. If you only want references to the exact same instance to be equivalent, you are fine and need do nothing, but if you are using your type as a key, and you want "equivalent" instances to be considered equal, your class must implement Equals() and GetHashCode().
If your custom type is being stored as a value, and not used as a key, this is not necessary, of course. For example, in this case MyType does not need to override Equals() or GetHashCode() because it is only used as a value, and not as the storage key.
Dictionary<string, MyType> x;
However in this case:
Dictionary<MyType, string> x;
Your custom type is the key, and thus it would need to override Equals() and GetHashCode(). The GetHashCode() is used to determine which location it hashes to, and the Equals() is used to resolve collisions on the hash code (among other things).
You'd need to override the same two methods when dealing with many LINQ queries as well. Alternatively, you can provide a standalone IEqualityComparer apart from your class to determine if two instances are equivalent.
See the EqualityComparer.Default<T> property. This is how the dictionary obtains an equality comparer if you don't supply it with one.
This returns an equality comparer based on the type & capabilities of T.
For example, if T extends IEquatable, EqualityComparer.Default will return an equality comparer instance that uses the IEquatable interface. Otherwise it will return an equality comparer instance that uses the Object.Equals method.
The Object.Equals method, by default for reference types, uses reference equality (Object.ReferenceEquals) unless you override it with a custom comparison.
The Object.Equals method, by default for value types, uses reflection to compare the fields of the struct for equality*. Reflection being slow, this is why it's always recommended to override Equals in value types.
* unless it's a blittable value type in which case the raw bits are compared.
No, there are no type constraints on Dictionary<TKey, TValue>
I ran into this situation today. I have an object which I'm testing for equality; the Create() method returns a subclass implementation of MyObject.
MyObject a = MyObject.Create();
MyObject b = MyObject.Create();
a == b; // is false
a.Equals(b); // is true
Note I have also over-ridden Equals() in the subclass implementation, which does a very basic check to see whether or not the passed-in object is null and is of the subclass's type. If both those conditions are met, the objects are deemed to be equal.
The other slightly odd thing is that my unit test suite does some tests similar to
Assert.AreEqual(MyObject.Create(), MyObject.Create()); // Green bar
and the expected result is observed. Therefore I guess that NUnit uses a.Equals(b) under the covers, rather than a == b as I had assumed.
Side note: I program in a mixture of .NET and Java, so I might be mixing up my expectations/assumptions here. I thought, however, that a == b worked more consistently in .NET than it did in Java where you often have to use equals() to test equality.
UPDATE Here's the implementation of Equals(), as requested:
public override bool Equals(object obj) {
return obj != null && obj is MyObjectSubclass;
}
The key difference between == and Equals is that == (like all operators) is not polymorphic, while Equals (like any virtual function) is.
By default, reference types will get identical results for == and Equals, because they both compare references. It's also certainly possible to code your operator logic and Equals logic entirely differently, though that seems nonsensical to do. The biggest gotcha comes when using the == (or any) operator at a higher level than the desired logic is declared (in other words, referencing the object as a parent class that either doesn't explicitly define the operator or defines it differently than the true class). In such cases the logic for the class that it's referenced as is used for operators, but the logic for Equals comes from whatever class the object actually is.
I want to state emphatically that, based solely upon the information in your question, there is absolutely no reason to think or assume that Equals compares values versus references. It's trivially easy to create such a class, but this is not a language specification.
Post-question-edit edit
Your implementation of Equals will return true for any non-null instance of your class. Though the syntax makes me think that you aren't, you may be confusing the is C# keyword (which confirms type) with the is keyword in VB.NET (which confirms referential equality). If that is indeed the case, then you can make an explicit reference comparison in C# by using Object.ReferenceEquals(this, obj).
In any case, this is why you are seeing true for Equals, since you're passing in a non-null instance of your class.
Incidentally, your comment about NUnit using Equals is true for the same reason; because operators are not polymorphic, there would be no way for a particular class to define custom equality behavior if the Assert function used ==.
a == b checks if they reference the same object.
a.Equals(b) compares the contents.
This is a link to a Jon Skeet article from 2004 that explains it better.
You pretty much answered your question yourself:
I have also over-ridden Equals() in the subclass implementation, which does a very basic check to see whether or not the passed-in object is null and is of the subclass's type. If both those conditions are met, the objects are deemed to be equal.
The == operator hasn't been overloaded - so it's returning false since a and b are different objects. But a.Equals is calling your override, which is presumably returning true because neither a nor b are null, and they're both of the subclass' type.
So your question was "When can a == b be false and a.Equals(b) true?" Your answer in this case is: when you explicitly code it to be so!
In Java a ==b check if the references of the two objects are equals (rougly, if the two objects are the same object "aliased")
a.equals(b) compare the values represented by the two objects.
They both do the same unless they are specifically overloaded within the object to do something else.
A quote from the Jon Skeet Article mentioned elsewhere.
The Equals method is just a virtual
one defined in System.Object, and
overridden by whichever classes choose
to do so. The == operator is an
operator which can be overloaded by
classes, but which usually has
identity behaviour.
The keyword here is USUALLY. They can be written to do whatever the underlying class wishes and in no way do they have to do the same.
The "==" operate tests absolute equality (unless overloaded); that is, it tests whether two objects are the same object. That's only true if you assigned one to the other, ie.
MyObject a = MyObject.Create();
MyObject b = a;
Just setting all the properties of two objects equal doesn't mean the objects themselves are. Under the hood, what the "==" operator is comparing is the addresses of the objects in memory. A practical effect of this is that if two objects are truly equal, changing a property on one of them will also change it on the other, whereas if they're only similar ("Equals" equal), it won't. This is perfectly consistent once you understand the principle.
I believe that a == b will check if the referenced object is the same.
Usually to see if the value is the same a.Equals(b) is used (this often needs to be overridden in order to work).