Internals of "equals" in .NET - c#

I have a foolish doubt. Generally, "System.Object" implements "Equals". When I implement the
IEquatable interface, I can give a custom definition (I believe so) to my "Equals".
So the Professor class declaration is equivalent to
class Professor : System.Object, IEquatable<Professor>
Since System.Object's Equals and IEquatable's Equals have different definitions, why did C# not report an error? I am neither overriding "Equals" nor hiding it with the new keyword.
class Professor : IEquatable<Professor>
{
public string Name { get; set; }
public bool Equals(Professor cust)
{
if (cust == null) return false;
return cust.Name == this.Name;
}
}

You are neither overriding nor hiding Object.Equals() because your version takes Professor as its parameter type, not object. You are overloading the Equals() method.
C# allows two methods with the same name to differ on the type of the argument(s) that they accept. This is referred to as overloading - it can be viewed as compile-time polymorphism.
Overriding (which you could, and probably should also do) alters the implementation of a method from its version in a base class. It is the basis for runtime type polymorphism.
Hiding is a less common technique that allows a derived class to mask a version of a method in a base class. Based on the type of the reference through which you make the call, you may either get the base class version (if called through a base class reference) or the derived class version (if called through a derived type reference).
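To make the three terms concrete, here is a minimal sketch. The Animal and Dog classes and their members are hypothetical names invented for illustration; they are not part of the question's code.

```csharp
using System;
using System.Linq;

// Hypothetical types, used only to contrast overriding, hiding, and overloading.
class Animal
{
    public virtual string Speak() { return "generic noise"; }
    public string Describe() { return "an animal"; }
}

class Dog : Animal
{
    // Override: replaces the base implementation; selected at run time
    // from the actual type of the object.
    public override string Speak() { return "woof"; }

    // Hide: masks Animal.Describe; selected at compile time
    // from the declared type of the reference.
    public new string Describe() { return "a dog"; }

    // Overload: same name, different parameter list; selected at compile time
    // from the argument types.
    public string Speak(int times)
    {
        return string.Join(" ", Enumerable.Repeat("woof", times));
    }
}
```

Through an Animal reference to a Dog, Speak() runs the Dog version (override) while Describe() runs the Animal version (hiding). Speak(2) is simply a different method that happens to share the name.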
On your second question, you should use IEquatable<T> when the semantics for comparing two instances for 'equality' are separate from reference equality.
You should implement IComparable or IComparable<T> when there are semantics for ordering items. Meaning they can be less than, greater than, or equivalent.

The Object.Equals method accepts an object of type 'object' as its parameter. Your Equals method accepts an object of type 'Professor' as its parameter. Both of these methods can co-exist because it is legal for two identically-named methods to differ by their parameter lists; this is called method overloading.

You don't need to explicitly implement IEquatable if all you want to do is override the default Equals() implementation.
You can just do something like this:
class Professor
{
public string Name { get; set; }
public override bool Equals(object cust)
{
// 'as' yields null when cust is null or is not a Professor.
var other = cust as Professor;
if (other == null) return false;
return other.Name == this.Name;
}
}
Be aware that if you override Equals() you should also override GetHashCode() to ensure proper operation of dictionaries and other collections that use hashing to differentiate between objects. See the MSDN guidelines for overriding Equals().
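Putting the pieces from the answers above together, a Professor that supports both the typed overload and the Object override might look like the following. This is a sketch rather than the one right way to do it. It assumes Name is the only field that matters for equality and that Name does not change while the object sits in a hash-based collection.

```csharp
using System;

class Professor : IEquatable<Professor>
{
    public string Name { get; set; }

    // Overload: the strongly typed comparison required by IEquatable<Professor>.
    public bool Equals(Professor other)
    {
        if (ReferenceEquals(other, null)) return false;
        return Name == other.Name;
    }

    // Override: chain to the typed overload so both paths agree.
    public override bool Equals(object obj)
    {
        return Equals(obj as Professor);
    }

    // Equal objects must produce equal hash codes.
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}
```

With this in place, a call through an object reference and a call through a Professor reference give the same answer, and HashSet<Professor> or Dictionary keys behave as expected.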

Related

Why is equals method on object class virtual?

I was wondering why Object.Equals(Object obj) is virtual. I understand that we can override that method and write our own code for checking equality instead of the base Object.Equals(Object obj) method, which checks only reference equality.
What I am asking is: why override it when I can implement my own new method in my own type? Is there any specific reason?
You would override it for the same reason you would want to override any method rather than hiding it with a new method in a derived class: because of polymorphism. You don't know how your derived class is going to be used by other code which only might know about the base class.
Clients may not even know that you've overridden the class at all, but they do know that they can call Equals on your instances because everything derives from Object. If you have some other, new method, the code using your instances will not know to call that method instead. It's the Liskov Substitution Principle at work.
What I say is why override it when i can implement my own new method in my defined type?
Because that is how the language was designed. As you know, every type in C# inherits from Object, and Object defines methods with default implementations for checking the equality of two objects. As developers creating new types, we may want to change how Equals compares two objects of a specific type.
Here is an example to understand why it is virtual:
int i = 1;
int j = 1;
Object o = i;
o.Equals(j); // what would happen here if Int32 had not overridden Equals?
The int type overrides Equals with an implementation that compares the two integer values, so when we call Equals through a reference of type Object, the call is dispatched to the implementation defined in System.Int32. If System.Int32 had instead defined a new method rather than overriding Object's Equals, we would see unexpected behavior: the call through the Object reference would run the base implementation and ignore the integer comparison defined in System.Int32.
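The dispatch described above can be checked with a small sketch. BoxedEqualsDemo and its method names are made up for this illustration.

```csharp
using System;

// Hypothetical helper class; only the calls inside the methods matter.
static class BoxedEqualsDemo
{
    // Calls Equals through an object reference; virtual dispatch
    // reaches Int32's override, which compares the values.
    public static bool VirtualEquals()
    {
        int i = 1;
        int j = 1;
        object o = i;        // boxes i
        return o.Equals(j);  // boxes j, then runs Int32.Equals(object)
    }

    // Compares the two boxes themselves; each boxing creates a new object.
    public static bool SameReference()
    {
        int i = 1;
        int j = 1;
        object o = i;
        object p = j;
        return ReferenceEquals(o, p);
    }
}
```

VirtualEquals() returns true because the override compares the integer values, while SameReference() returns false because the two boxes are distinct objects.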
Consider this example as well:
public class Person
{
private string _name;
public string Name
{
get
{
return _name;
}
}
public Person(string name)
{
_name = name;
}
public override string ToString()
{
return _name;
}
}
What if we want Equals to compare Person objects by name rather than by reference? In that case we need to override Equals, so that a call through either an Object reference or a Person reference returns the same result: it compares the Name of the two Person objects instead of checking reference equality.
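A sketch of that override, based on the Person class above. The GetHashCode override is an addition the example above does not show; it is included because equal objects are expected to return equal hash codes.

```csharp
using System;

public class Person
{
    private readonly string _name;

    public string Name { get { return _name; } }

    public Person(string name) { _name = name; }

    // Compare by name instead of by reference.
    public override bool Equals(object obj)
    {
        Person other = obj as Person;   // null if obj is null or not a Person
        return other != null && _name == other._name;
    }

    // Keep the hash code consistent with the new definition of equality.
    public override int GetHashCode()
    {
        return _name == null ? 0 : _name.GetHashCode();
    }

    public override string ToString() { return _name; }
}
```

Because Equals is overridden rather than redefined, code that only holds object references (collections, framework helpers) still reaches the name comparison.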

Why does Equals(object) win over Equals(T) when using an inherited object in Hashset or other Collections?

I am aware of the fact that I always have to override Equals(object) and GetHashCode() when implementing IEquatable<T>.Equals(T).
However, I don't understand, why in some situations the Equals(object) wins over the generic Equals(T).
For example, why is the following happening? If I declare IEquatable<T> for an interface and implement a concrete type X for it, the general Equals(object) is called by a HashSet<X> when comparing items of that type against each other. In all other situations, where at least one of the sides is cast to the interface, the correct Equals(T) is called.
Here's a code sample to demonstrate:
public interface IPerson : IEquatable<IPerson> { }
//Simple example implementation of Equals (returns always true)
class Person : IPerson
{
public bool Equals(IPerson other)
{
return true;
}
public override bool Equals(object obj)
{
return true;
}
public override int GetHashCode()
{
return 0;
}
}
private static void doEqualityCompares()
{
var t1 = new Person();
var hst = new HashSet<Person>();
var hsi = new HashSet<IPerson>();
hst.Add(t1);
hsi.Add(t1);
//Direct comparison
t1.Equals(t1); //IEquatable<T>.Equals(T)
hst.Contains(t1); //Equals(object) --> why? both sides inherit of IPerson...
hst.Contains((IPerson)t1); //IEquatable<T>.Equals(T)
hsi.Contains(t1); //IEquatable<T>.Equals(T)
hsi.Contains((IPerson)t1); //IEquatable<T>.Equals(T)
}
HashSet<T> calls EqualityComparer<T>.Default to get the default equality comparer when no comparer is provided.
EqualityComparer<T>.Default determines whether T implements IEquatable<T>. If it does, it uses that; if not, it uses object.Equals and object.GetHashCode.
Your Person object implements IEquatable<IPerson>, not IEquatable<Person>.
When you have a HashSet<Person> it ends up checking whether Person is an IEquatable<Person>, which it is not, so it uses the object methods.
When you have a HashSet<IPerson> it checks if IPerson is an IEquatable<IPerson>, which it is, so it uses those methods.
As for the remaining case, why does the line:
hst.Contains((IPerson)t1);
call the IEquatable Equals method even though it is called on the HashSet<Person>? Here you're calling Contains on a HashSet<Person> and passing in an IPerson. HashSet<Person>.Contains requires the parameter to be a Person; an IPerson is not a valid argument. However, a HashSet<Person> is also an IEnumerable<Person>, and since IEnumerable<T> is covariant, it can be treated as an IEnumerable<IPerson>, which has a Contains extension method (through LINQ) that accepts an IPerson as a parameter.
IEnumerable.Contains also uses EqualityComparer<T>.Default to get its equality comparer when none is provided. In the case of this method call we're actually calling Contains on an IEnumerable<IPerson>, which means EqualityComparer<IPerson>.Default is checking to see if IPerson is an IEquatable<IPerson>, which it is, so that Equals method is called.
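The selection rule can be observed directly by asking EqualityComparer<T>.Default to compare two instances and recording which method ran. This is a sketch: the Calls list and the ComparerDemo helper are hypothetical additions for tracing, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

interface IPerson : IEquatable<IPerson> { }

class Person : IPerson
{
    // Hypothetical trace of which Equals overload was invoked.
    public static readonly List<string> Calls = new List<string>();

    public bool Equals(IPerson other) { Calls.Add("Equals(IPerson)"); return true; }
    public override bool Equals(object obj) { Calls.Add("Equals(object)"); return true; }
    public override int GetHashCode() { return 0; }
}

static class ComparerDemo
{
    // Returns the name of the first Equals method that
    // EqualityComparer<T>.Default invokes for the given T.
    public static string CallMadeBy<T>(T a, T b)
    {
        Person.Calls.Clear();
        EqualityComparer<T>.Default.Equals(a, b);
        return Person.Calls.Count > 0 ? Person.Calls[0] : "none";
    }
}
```

Calling ComparerDemo.CallMadeBy<Person>(new Person(), new Person()) should report Equals(object), because Person is not an IEquatable<Person>. Calling ComparerDemo.CallMadeBy<IPerson>(new Person(), new Person()) should report Equals(IPerson). Declaring IEquatable<Person> on Person as well would make the first case use the typed method.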
Although IComparable<in T> is contravariant with respect to T, such that any type which implements IComparable<IPerson> would automatically be considered an implementation of IComparable<Person>, the type IEquatable<T> is not variant and is intended for use with sealed types, especially structures. The requirement that Object.GetHashCode() be consistent with both IEquatable<T>.Equals(T) and Object.Equals(Object) generally implies that the latter two methods should behave identically, which in turn implies that one of them should chain to the other. While there is a large performance difference between passing a struct directly to an IEquatable<T> implementation of the proper type and constructing an instance of the structure's boxed heap-object type and having an Equals(Object) implementation copy the structure data out of that, no such performance difference exists with reference types. If IEquatable<T>.Equals(T) and Equals(Object) are going to be equivalent and T is an inheritable reference type, there's no meaningful difference between:
public override bool Equals(object obj)
{
MyType other = obj as MyType;
if (other == null || other.GetType() != this.GetType())
return false;
// ... test whether other matches this
}
public bool Equals(MyType other)
{
if (other == null || other.GetType() != this.GetType())
return false;
// ... test whether other matches this
}
The latter could save one typecast, but that's unlikely to make a sufficient performance difference to justify having two methods.

Distinct() not calling equals methods

I've implemented IEqualityComparer and IEquatable (both just to be sure), but when I call the Distinct() method on a collection it does not call the methods that come with it. Here is the code that I execute when calling Distinct().
ObservableCollection<GigViewModel> distinctGigs = new ObservableCollection<GigViewModel>(Gigs.Distinct<GigViewModel>());
return distinctGigs;
I want to return an ObservableCollection that doesn't contain any duplicates of the objects in the 'Gigs' ObservableCollection.
I implement the interfaces like this on the GigViewModel class:
public class GigViewModel : INotifyPropertyChanged, IEqualityComparer<GigViewModel>, IEquatable<GigViewModel>
{
....
}
And override the methods that come with the interfaces like so:
public bool Equals(GigViewModel x, GigViewModel y)
{
if (x.Artiest.Naam == y.Artiest.Naam)
{
return true;
}
else
{
return false;
}
}
public int GetHashCode(GigViewModel obj)
{
return obj.Artiest.Naam.GetHashCode();
}
public bool Equals(GigViewModel other)
{
if (other.Artiest.Naam == this.Artiest.Naam)
{
return true;
}
else
{
return false;
}
}
Thanks for all the help I'm getting. So I've created a separate class that implements IEqualityComparer and passed an instance of it into the Distinct method. But the methods are still not being triggered.
EqualityComparer:
class GigViewModelComparer : IEqualityComparer<GigViewModel>
{
public bool Equals(GigViewModel x, GigViewModel y)
{
if (x.Artiest.Naam == y.Artiest.Naam)
{
return true;
}
else
{
return false;
}
}
public int GetHashCode(GigViewModel obj)
{
return obj.Artiest.Naam.GetHashCode();
}
}
The Distinct() call:
GigViewModelComparer comp = new GigViewModelComparer();
ObservableCollection<GigViewModel> distinctGigs = new ObservableCollection<GigViewModel>(Gigs.Distinct(comp));
return distinctGigs;
EDIT2:
The GetHashCode() method DOES get called! After implementing the new class. But the collection still contains duplicates. I have a list of 'Gigs' that contain an 'Artiest' (or Artist) object. This Artist has a Naam property which is a String (Name).
So you had the object that itself is being compared implement both IEquatable as well as IEqualityComparer. That generally doesn't make sense. IEquatable is a way of saying an object can compare itself to something else. IEqualityComparer is a way of saying it can compare two different things you give it to each other. You generally want to do one or the other, not both.
If you want to implement IEquatable then the object not only needs to have an Equals method of the appropriate signature, but it needs to override GetHashCode to have a sensible implementation for the given definition of equality. You didn't do that. You created GetHashCode method that takes an object as a parameter, but that's the overload used for IEqualityComparer. You need to override the parameter-less version when using IEquatable (the one defined in Object).
If you want to create a class that implements IEqualityComparer, you need to pass the comparer to the Distinct method. Since you've defined the object as its own comparer, you'd need to pass in some instance of this object as the second parameter. Of course, this doesn't make a whole lot of sense; so it would be better, if you go this route, to pull the two methods that go with IEqualityComparer out into a new type, and pass an instance of that type to the Distinct method. If you actually passed an object with those definitions in as a comparer, it'd work just fine.
Following MSDN's advice, you'd be best off creating a separate class for your equality comparisons:
We recommend that you derive from the EqualityComparer<T> class
instead of implementing the IEqualityComparer<T> interface, because
the EqualityComparer<T> class tests for equality using the
IEquatable<T>.Equals method instead of the Object.Equals method. This
is consistent with the Contains, IndexOf, LastIndexOf, and Remove
methods of the Dictionary<TKey, TValue> class and other generic
collections.
So, create a class, GigViewModelComparer, that derives from EqualityComparer<GigViewModel>, and put your Equals and GetHashCode methods there.
Then, pass in an instance of that new comparer class in your call to Gigs.Distinct(new GigViewModelComparer()) and it should work. Follow along in the example in the MSDN link I provided above.
I've never seen somebody implement IEqualityComparer in the same class as the type of objects the collection in question contains, that is probably at least part of your problem.
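A sketch of the separate-comparer approach follows. Gig and Artist are simplified, hypothetical stand-ins for GigViewModel and its Artiest object (the real view model is not shown in full), and the null handling is an assumption about what the data may contain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's view-model types.
class Artist { public string Name { get; set; } }
class Gig { public Artist Artist { get; set; } }

// Two gigs count as equal when their artists have the same name.
class GigByArtistNameComparer : EqualityComparer<Gig>
{
    public override bool Equals(Gig x, Gig y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return GetName(x) == GetName(y);
    }

    // Must agree with Equals: the same name always gives the same hash code.
    public override int GetHashCode(Gig obj)
    {
        string name = obj == null ? null : GetName(obj);
        return name == null ? 0 : name.GetHashCode();
    }

    private static string GetName(Gig g)
    {
        return g.Artist == null ? null : g.Artist.Name;
    }
}

static class DistinctDemo
{
    public static List<Gig> DistinctByArtist(IEnumerable<Gig> gigs)
    {
        // Distinct consults GetHashCode first and calls Equals only on hash matches.
        return gigs.Distinct(new GigByArtistNameComparer()).ToList();
    }
}
```

If duplicates still survive with a comparer like this, it is worth checking that the names really are identical strings (trailing spaces, casing), since Distinct only calls Equals for items whose hash codes collide.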

5 ways for equality check in .net .. why? and which to use?

While learning .net (by c#) i found 5 ways for checking equality between objects.
The ReferenceEquals() method.
The virtual Equals() method. (System.Object)
The static Equals() method.
The Equals method from IEquatable interface.
The comparison operator == .
My questions are:
Why are there so many Equals() methods in addition to the comparison operator?
Which of the virtual Equals() and IEquatable's Equals() should be used (say, if we write our own collection classes)?
1 - ReferenceEquals checks whether two reference-type variables (classes, not structs) refer to the same memory address.
2 - The virtual Equals() method checks if two objects are equivalent. Let us say that you have this class:
class TestClass
{
public int Property1 { get; set; }
public int Property2 { get; set; }
public override bool Equals(object obj)
{
// Guard against null before calling GetType().
if (obj == null || obj.GetType() != typeof(TestClass))
return false;
var convertedObj = (TestClass)obj;
return (convertedObj.Property1 == this.Property1 && convertedObj.Property2 == this.Property2);
}
}
and you instantiate 2 objects from that class:
var o1 = new TestClass { Property1 = 1, Property2 = 2 };
var o2 = new TestClass { Property1 = 1, Property2 = 2 };
although the two objects are not the same instance of TestClass, the call to o1.Equals(o2) will return true.
3 - The static Equals method is used to handle problems when there is a null value in the check.
Imagine this, for instance:
TestClass o1 = null;
var o2 = new TestClass { Property1 = 1, Property2 = 2 };
If you try this:
o1.Equals(o2);
you will get a NullReferenceException, because o1 points to nothing.
To address this issue, you do this:
Object.Equals(o1, o2);
This method is prepared to handle null references.
4 - The IEquatable interface is provided by .NET so you don't need to do casts inside your Equals method.
When the argument's compile-time type is TestClass, overload resolution prefers the more specific Equals(TestClass) over the Object.Equals(Object) override. Generic collections also look for the interface: EqualityComparer<T>.Default uses IEquatable<T>.Equals when T implements it.
For instance:
class TestClass : IEquatable<TestClass>
{
public int Property1 { get; set; }
public int Property2 { get; set; }
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != typeof(TestClass))
return false;
var convertedObj = (TestClass)obj;
return (convertedObj.Property1 == this.Property1 && convertedObj.Property2 == this.Property2);
}
#region IEquatable<TestClass> Members
public bool Equals(TestClass other)
{
return (other != null && other.Property1 == this.Property1 && other.Property2 == this.Property2);
}
#endregion
}
now if we do this:
var o1 = new TestClass { Property1 = 1, Property2 = 2 };
var o2 = new TestClass { Property1 = 1, Property2 = 2 };
o1.Equals(o2);
The method called is Equals(TestClass), in preference to Equals(Object).
5 - For reference types, the == operator by default means the same as ReferenceEquals: it checks whether two variables point to the same memory address.
The gotcha is that this operator can be overloaded to perform other types of checks.
For strings, for instance, it checks whether two different instances are equivalent.
This is a useful link to understand equality in .NET better:
CodeProject
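The five checks from the list above can be run side by side. This is a sketch: Money and FiveWays are hypothetical names, and Money deliberately does not overload ==, so the operator keeps its default reference semantics.

```csharp
using System;

// Hypothetical class with value-style Equals but no operator overloads.
class Money : IEquatable<Money>
{
    public int Cents { get; set; }

    public bool Equals(Money other)
    {
        return !ReferenceEquals(other, null) && Cents == other.Cents;
    }

    public override bool Equals(object obj) { return Equals(obj as Money); }

    public override int GetHashCode() { return Cents; }
}

static class FiveWays
{
    // Returns the result of each check, in the order used by the list above.
    public static bool[] Compare(Money a, Money b)
    {
        return new[]
        {
            ReferenceEquals(a, b),             // 1. same instance?
            a != null && a.Equals((object)b),  // 2. virtual Equals(object)
            object.Equals(a, b),               // 3. static Equals, null-safe
            a != null && a.Equals(b),          // 4. IEquatable<Money>.Equals
            a == b                             // 5. operator ==, reference comparison here
        };
    }
}
```

For two separate Money instances with the same Cents, checks 2 to 4 return true while checks 1 and 5 return false. The null guards on 2 and 4 are there because calling an instance method on a null reference throws, which is the problem the static Equals in check 3 avoids.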
The ReferenceEquals() method.
This is used to test if two given variables point (the symbol references) to the same object. It is literally equivalent to ((object)a) == ((object)b). If you override the comparison operator (==) then ReferenceEquals maintains a way to access the default behaviour.
However, if you are dealing with a value type (e.g. a struct) then this will always return false. This is because the comparison boxes each value type to a new object so naturally the references will not be equal.
The virtual Equals() method. (System.Object)
This is the default way to semantically compare two objects (of any type). Each class overrides this as they choose. By default it is equivalent to a CLR call (InternalEquals) that basically compares memory references.
Note, if two objects return true for Equals() then GetHashCode() on each of them must return the same value. However, if the hash codes for two objects are equal (i.e. obj1.GetHashCode() == obj2.GetHashCode()), this does not mean that Equals() is true.
Your class should typically implement Equals and GetHashCode as a means to distinguish class instances, and must implement this or the == operator (ideally both) if it is a value type.
Note, for value types the default Equals behaviour is that of ValueType.Equals() which if you look in Reflector (or read the MSDN description) uses reflection to compare the members of the two value instances.
The static Equals() method.
This is equivalent to return ((objA == objB) || (((objA != null) && (objB != null)) && objA.Equals(objB))) where each type is converted to Object for testing. My testing shows that overloaded comparison operators are ignored, but your Equals method will be used if the objects are not null and aren't the same reference. As such, a.Equals(b) does not necessarily equal object.Equals(a, b) (for the cases where ((object)a) == ((object)b) or either a or b is null).
The Equals method from IEquatable interface.
IEquatable provides a way for you to treat comparison to instances of the same class specially. Having said that your Equals method should be handling the behaviour the same way:
If you implement Equals, you should
also override the base class
implementations of
Object.Equals(Object) and GetHashCode
so that their behavior is consistent
with that of the IEquatable<T>.Equals
method
Nevertheless you should implement IEquatable:
To handle the possibility that objects
of a class will be stored in an array
or a generic collection object, it is
a good idea to implement IEquatable<T>
so that the object can be easily
identified and manipulated.
The comparison operator ==
The comparison operator by default returns true when both of your objects are the same reference.
It is not recommended to overload the comparison operator unless you are dealing with a value type (in which case it is recommended, along with the Equals method) or an immutable reference type which you would usually compare by value (e.g. string). Always implement != at the same time (in fact, the compiler reports a "requires a matching operator '!=' to also be defined" error if I do not).
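For the value-type case, a minimal sketch of the full set (typed Equals, Object override, GetHashCode, and both operators) could look like this. Point is a hypothetical struct, and the hash combination is just one common choice.

```csharp
using System;

// Hypothetical immutable value type with consistent equality members.
struct Point : IEquatable<Point>
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y) { X = x; Y = y; }

    // Typed comparison: no boxing, no reflection.
    public bool Equals(Point other) { return X == other.X && Y == other.Y; }

    // Replaces the reflection-based ValueType.Equals default.
    public override bool Equals(object obj)
    {
        return obj is Point && Equals((Point)obj);
    }

    public override int GetHashCode() { return unchecked(X * 397 ^ Y); }

    // The two operators must be defined as a pair.
    public static bool operator ==(Point a, Point b) { return a.Equals(b); }
    public static bool operator !=(Point a, Point b) { return !a.Equals(b); }
}
```

Note that ReferenceEquals on two Point values, even on the same variable twice, returns false, because each argument is boxed into a new object.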
Resources:
Link
Where is the implementation of InternalEquals(object objA, object objB)
http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx
http://msdn.microsoft.com/en-us/library/2dts52z7.aspx
http://msdn.microsoft.com/en-us/library/ms131190.aspx
http://msdn.microsoft.com/en-us/library/ms173147(VS.80).aspx
http://msdn.microsoft.com/en-us/library/ms182276(VS.80).aspx
Each version of equality is slightly different.
ReferenceEquals tests for reference equality.
virtual Equals by default checks for reference equality for class types and value equality for struct types. It can be overridden to define equality differently, if desired; and should be overridden for value types.
static Equals just calls virtual Equals, but also allows for null arguments.
IEquatable<T>.Equals is a generic/type-safe equivalent for virtual Equals.
operator== is intended to be like the default virtual Equals, meaning reference equality for class types (unless the class also overrides other operators). It should also be overridden for value types.
If you write your own collection class, use IEqualityComparer<T>, defaulting to EqualityComparer<T>.Default. Don't use any of the equality comparisons directly.
For primitives, stick with the == operator.
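As a sketch of the collection advice above, a home-grown container can accept an optional IEqualityComparer<T> and fall back to EqualityComparer<T>.Default. SimpleBag is a hypothetical, deliberately naive class with a linear search.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal collection that delegates all equality decisions
// to a comparer instead of calling Equals or == itself.
class SimpleBag<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly IEqualityComparer<T> _comparer;

    public SimpleBag(IEqualityComparer<T> comparer = null)
    {
        // Default picks IEquatable<T>.Equals when T implements it,
        // otherwise Object.Equals.
        _comparer = comparer ?? EqualityComparer<T>.Default;
    }

    public void Add(T item) { _items.Add(item); }

    public bool Contains(T item)
    {
        foreach (T candidate in _items)
        {
            if (_comparer.Equals(candidate, item)) return true;
        }
        return false;
    }
}
```

A caller can then change the meaning of equality without touching the element type, for example new SimpleBag<string>(StringComparer.OrdinalIgnoreCase) for case-insensitive lookups.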
For most reference types supplied in the .NET Framework, and for any custom classes you create, the .Equals() method and the == operator will by default only check whether two variables refer to the same object on the heap.
The purpose of the IEquatable<T> interface is to add a strongly typed Equals(T) method, which is normally used together with an override of .Equals(object) to change the behavior from checking for referential equality to checking for value equality. The System.String type is an example of a built-in .NET type which implements this interface.
The .ReferenceEquals() method provides a way for developers who've overridden the standard .Equals() method to still be able to check two objects for referential equality.

Interface constraint for IComparable

When I want to constrain the type T to be comparable, should I use:
where T : IComparable
or
where T : IComparable<T>
I can't get my head around whether #2 makes sense. Can anyone explain what the difference would be?
You may want both constraints, as in:
where T : IComparable, IComparable<T>
This would make your type compatible with more users of the IComparable interfaces. The generic version of IComparable, IComparable<T> will help to avoid boxing when T is a value type and allows for more strongly typed implementations of the interface methods. Supporting both ensures that no matter which interface some other object asks for, your object can comply and therefore inter-operate nicely.
For example, Array.Sort and ArrayList.Sort use IComparable, not IComparable<T>.
The main difference between IComparable and IComparable<> is that the first is pre-generics so allows you to call the compare method with any object, whereas the second enforces that it shares the same type:
IComparable - CompareTo(object other);
IComparable<T> - CompareTo(T other);
I would go with the second option provided that you don't intend to use any old .net 1.0 libraries where the types may not implement the modern, generic solution. You'll gain a performance boost since you'll avoid boxing and the comparisons won't need to check the types match and you'll also get the warm feeling that comes from doing things in the most cutting edge way...
To address Jeff's very good and pertinent point I would argue that it is good practice to place as few constraints on a generic as is required to perform the task. Since you are in complete control of the code inside the generic you know whether you are using any methods that require a basic IComparable type. So, taking his comment into consideration I personally would follow these rules:
If you are not expecting the generic to use any types that only implement IComparable (i.e. legacy 1.0 code) and you are not calling any methods from inside the generic that rely on an IComparable parameter then use the IComparable<> constraint only.
If you are using types that only implement IComparable then use that constraint only
If you are using methods that require an IComparable parameter, but not using types that only implement IComparable then using both constraints as in Jeff's answer will boost performance when you use methods that accept the generic type.
To expand on the third rule - let's assume that the class you are writing is as follows:
public class StrangeExample<T> where ... //to be decided
{
public void SortArray(T[] input)
{
Array.Sort(input);
}
public bool AreEqual(T a, T b)
{
return a.CompareTo(b) == 0;
}
}
And we need to decide what constraints to place on it. The SortArray method calls Array.Sort which requires the array that is passed in to contains objects that implement IComparable. Therefore we must have an IComparable constraint:
public class StrangeExample<T> where T : IComparable
Now the class will compile and work correctly as an array of T is valid for Array.Sort and there is a valid .CompareTo method defined in the interface. However, if you are sure that you will not want to use your class with a type that does not also implement the IComparable<> interface you can extend your constraint to:
public class StrangeExample<T> where T : IComparable, IComparable<T>
This means that when AreEqual is called it will use the faster, generic CompareTo method and you will see a performance benefit at the expense of not being able to use it with old, .NET 1.0 types.
On the other hand if you didn't have the AreEqual method then there is no advantage to the IComparable<> constraint so you may as well drop it - you are only using IComparable implementations anyway.
Those are two different interfaces. Before .NET 2.0 there were no generics, so there was just IComparable. With .NET 2.0 came generics, and it became possible to make IComparable<T>. They serve exactly the same purpose. Basically, IComparable is the legacy form, although most libraries out there recognize both.
To make your code really compatible, implement both, but make one call the other, so you don't have to write the same code twice.
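One way to have one method call the other is sketched below. Release is a hypothetical type, and the handling of null and of wrong types follows the usual CompareTo conventions (any instance compares greater than null; an incompatible type raises ArgumentException).

```csharp
using System;

// Hypothetical type ordered by a single integer.
class Release : IComparable<Release>, IComparable
{
    public int Number { get; set; }

    // The strongly typed method holds the only copy of the comparison logic.
    public int CompareTo(Release other)
    {
        if (ReferenceEquals(other, null)) return 1;
        return Number.CompareTo(other.Number);
    }

    // The non-generic method validates the argument and forwards.
    public int CompareTo(object obj)
    {
        if (obj == null) return 1;
        Release other = obj as Release;
        if (other == null)
            throw new ArgumentException("Object is not a Release", "obj");
        return CompareTo(other);
    }
}
```

With this shape the type satisfies both where T : IComparable and where T : IComparable<T>, and the two entry points cannot drift apart.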
The IComparable<T> allows the comparator to be strongly typed.
You can have
public int CompareTo(MyType other)
{
// logic
}
as opposed to
public int CompareTo(object other)
{
if (other is MyType)
// logic
}
Take, for instance, the next example, which implements both interfaces:
public class MyType : IComparable<MyType>, IComparable
{
public MyType(string name, int id)
{ Name = name; Id = id; }
public string Name { get; set; }
public int Id { get; set; }
public int CompareTo(MyType other)
{
// By convention, any instance compares greater than null.
if (null == other)
return 1;
// Int32.CompareTo returns a negative, zero, or positive value and cannot overflow.
return Id.CompareTo(other.Id);
}
public int CompareTo(object other)
{
if (null == other)
return 1;
if (other is MyType)
return CompareTo((MyType)other);
else
throw new ArgumentException("Object is not a MyType", "other");
}
}
MyType t1 = new MyType("a", 1);
MyType t2 = new MyType("b", 2);
object someObj = new object();
// calls the strongly typed method: CompareTo(MyType other)
t1.CompareTo(t2);
// calls the *weakly* typed method: CompareTo(object other)
t1.CompareTo(someObj);
If MyType only implemented IComparable<MyType>, the second call, CompareTo(someObj), would be a compile-time error. This is one advantage of strongly typed generics.
On the other hand, there are methods in the framework that require the non generic IComparable like Array.Sort. In these cases, you should consider implementing both interfaces like in this example.
I would use the second constraint as that will allow you to reference strongly-typed members of the interface. If you go with your first option then you will have to cast to use the interface type.
