EqualityComparer<T>.Default vs. T.Equals - c#

Suppose I've got a generic MyClass<T> that needs to compare two objects of type <T>. Usually I'd do something like ...
void DoSomething(T o1, T o2)
{
if(o1.Equals(o2))
{
...
}
}
Now suppose my MyClass<T> has a constructor that supports passing a custom IEqualityComparer<T>, similar to Dictionary<TKey, TValue>. In that case I'd need to do ...
private IEqualityComparer<T> _comparer;
public MyClass() {}
public MyClass(IEqualityComparer<T> comparer)
{
_comparer = comparer;
}
void DoSomething(T o1, T o2)
{
if(_comparer != null ? _comparer.Equals(o1, o2) : o1.Equals(o2))
{
...
}
}
To remove this lengthy if statement, it'd be good if I could have _comparer default to a 'default comparer' if the regular constructor is used. I searched for something like typeof(T).GetDefaultComparer() but wasn't able to find anything like it.
I did find EqualityComparer<T>.Default, could I use that? And would then this snippet ...
public MyClass()
{
_comparer = EqualityComparer<T>.Default;
}
void DoSomething(T o1, T o2)
{
if(_comparer.Equals(o1, o2))
{
...
}
}
... provide the same results as using o1.Equals(o2) for all possible cases?
(As a side note, would this mean I'd also need to use any special generic constraint for <T>?)

It should be the same, but it is not guaranteed, because it depends on implementation details of the type T.
Explanation:
Without a constraint on T, o1.Equals(o2) will call Object.Equals(object), even if T implements IEquatable<T>.
EqualityComparer<T>.Default, however, uses Object.Equals only if T doesn't implement IEquatable<T>. If T does implement that interface, it uses IEquatable<T>.Equals.
As long as T's implementation of Object.Equals just calls IEquatable<T>.Equals the result is the same. But in the following example, the result is not the same:
public class MyObject : IEquatable<MyObject>
{
public int ID {get;set;}
public string Name {get;set;}
public override bool Equals(object o)
{
var other = o as MyObject;
return other == null ? false : other.ID == ID;
}
public bool Equals(MyObject o)
{
return o.Name == Name;
}
}
Now, it doesn't make any sense to implement a class like this. But you will have the same problem if the implementer of MyObject simply forgot to override Object.Equals.
Conclusion:
Using EqualityComparer<T>.Default is a good way to go, because you don't need to support buggy objects!
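The divergence described above can be reproduced with a small sketch. The names InconsistentItem and EqualityPaths are made up for illustration; the type is deliberately inconsistent in the same way as MyObject (Equals(object) compares ID, IEquatable<T>.Equals compares Name), with null checks and a GetHashCode override added so it compiles cleanly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, deliberately inconsistent type modelled on MyObject above:
// Equals(object) compares ID, IEquatable<T>.Equals compares Name.
public sealed class InconsistentItem : IEquatable<InconsistentItem>
{
    public int ID { get; set; }
    public string Name { get; set; }

    public override bool Equals(object o)
    {
        var other = o as InconsistentItem;
        return other != null && other.ID == ID;
    }

    public bool Equals(InconsistentItem o)
    {
        return o != null && o.Name == Name;
    }

    public override int GetHashCode()
    {
        return ID;
    }
}

public static class EqualityPaths
{
    // T is unconstrained, so this call binds to Object.Equals(object).
    public static bool ViaObjectEquals<T>(T o1, T o2)
    {
        return o1.Equals(o2);
    }

    // The default comparer calls IEquatable<T>.Equals when T implements it.
    public static bool ViaDefaultComparer<T>(T o1, T o2)
    {
        return EqualityComparer<T>.Default.Equals(o1, o2);
    }
}
```

For two items with the same ID but different Name values, ViaObjectEquals returns true while ViaDefaultComparer returns false, which is exactly the mismatch a well-behaved type avoids by having one method call the other.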

By default, until it is overridden in a class, Object.Equals(a, b)/a.Equals(b) compares reference types by reference.
Which comparer EqualityComparer<T>.Default returns depends on T. For example, if T implements IEquatable<T>, it returns a comparer that calls IEquatable<T>.Equals.

Yes, I think it would be wise to use the EqualityComparer<T>.Default, because it uses the implementation of IEquatable<T> if the type T implements it, or the override of Object.Equals otherwise. You could do it as follows:
private IEqualityComparer<T> _comparer;
public IEqualityComparer<T> Comparer
{
get { return _comparer ?? EqualityComparer<T>.Default; }
set { _comparer = value; }
}
public MyClass(IEqualityComparer<T> comparer)
{
_comparer = comparer;
}
void DoSomething(T o1, T o2)
{
if(Comparer.Equals(o1, o2))
{
...
}
}

You could use the null-coalescing operator ?? to shorten the if statement, if it really matters:
if ((_comparer ?? EqualityComparer<T>.Default).Equals(o1, o2))
{
}

That's exactly what Dictionary<,> and other generic collections in the BCL do if you don't specify a comparer when constructing the object. The benefit of this is that EqualityComparer<T>.Default will return the right comparer for IEquatable<T> types, nullable types, and enums. If T is none of those, it will do a simple Equals comparison like your old code is doing.
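One practical difference worth seeing in code is null handling. The sketch below uses made-up names (Color, DefaultComparerChecks) and only assumes the documented behaviour of EqualityComparer<T>.Default: the comparer checks for null before calling into T, whereas o1.Equals(o2) on a null reference throws.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum, used only to exercise the default comparer.
public enum Color { Red, Green }

public static class DefaultComparerChecks
{
    // Safe even when o1 is null: the default comparer handles null itself.
    public static bool SafeEquals<T>(T o1, T o2)
    {
        return EqualityComparer<T>.Default.Equals(o1, o2);
    }

    // Throws NullReferenceException when o1 is a null reference.
    public static bool DirectEquals<T>(T o1, T o2)
    {
        return o1.Equals(o2);
    }
}
```

SafeEquals<string>(null, "a") returns false and SafeEquals<string>(null, null) returns true, while DirectEquals<string>(null, "a") throws. So the two approaches agree for well-behaved non-null values, but the comparer is the more forgiving choice inside a generic class.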

Related

Why does Equals(object) win over Equals(T) when using an inherited object in Hashset or other Collections?

I am aware of the fact that I always have to override Equals(object) and GetHashCode() when implementing IEquatable<T>.Equals(T).
However, I don't understand, why in some situations the Equals(object) wins over the generic Equals(T).
For example, why is the following happening? If I declare IEquatable<T> for an interface and implement a concrete type X for it, the general Equals(object) is called by a HashSet<X> when comparing items of that type against each other. In all other situations where at least one of the sides is cast to the interface, the correct Equals(T) is called.
Here's a code sample to demonstrate:
public interface IPerson : IEquatable<IPerson> { }
//Simple example implementation of Equals (returns always true)
class Person : IPerson
{
public bool Equals(IPerson other)
{
return true;
}
public override bool Equals(object obj)
{
return true;
}
public override int GetHashCode()
{
return 0;
}
}
private static void doEqualityCompares()
{
var t1 = new Person();
var hst = new HashSet<Person>();
var hsi = new HashSet<IPerson>();
hst.Add(t1);
hsi.Add(t1);
//Direct comparison
t1.Equals(t1); //IEquatable<T>.Equals(T)
hst.Contains(t1); //Equals(object) --> why? both sides inherit of IPerson...
hst.Contains((IPerson)t1); //IEquatable<T>.Equals(T)
hsi.Contains(t1); //IEquatable<T>.Equals(T)
hsi.Contains((IPerson)t1); //IEquatable<T>.Equals(T)
}
HashSet<T> calls EqualityComparer<T>.Default to get the default equality comparer when no comparer is provided.
EqualityComparer<T>.Default determines if T implements IEquatable<T>. If it does, it uses that; if not, it uses object.Equals and object.GetHashCode.
Your Person object implements IEquatable<IPerson>, not IEquatable<Person>.
When you have a HashSet<Person> it ends up checking if Person is an IEquatable<Person>, which it's not, so it uses the object methods.
When you have a HashSet<IPerson> it checks if IPerson is an IEquatable<IPerson>, which it is, so it uses those methods.
As for the remaining case, why does the line:
hst.Contains((IPerson)t1);
call the IEquatable Equals method even though it's called on the HashSet<Person>? Here you're calling Contains on a HashSet<Person> and passing in an IPerson. HashSet<Person>.Contains requires the parameter to be a Person; an IPerson is not a valid argument. However, a HashSet<Person> is also an IEnumerable<Person>, and since IEnumerable<T> is covariant, that means it can be treated as an IEnumerable<IPerson>, which has a Contains extension method (through LINQ) which accepts an IPerson as a parameter.
Enumerable.Contains also uses EqualityComparer<T>.Default to get its equality comparer when none is provided. In the case of this method call we're actually calling Contains on an IEnumerable<IPerson>, which means EqualityComparer<IPerson>.Default is checking to see if IPerson is an IEquatable<IPerson>, which it is, so that Equals method is called.
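The comparer selection described above can be observed directly, without a HashSet. This is a minimal sketch: the LastCall field and the ComparerProbe class are additions for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public interface IPerson : IEquatable<IPerson> { }

public class Person : IPerson
{
    // Records which overload ran last; added for demonstration only.
    public static string LastCall = "";

    public bool Equals(IPerson other)
    {
        LastCall = "Equals(IPerson)";
        return true;
    }

    public override bool Equals(object obj)
    {
        LastCall = "Equals(object)";
        return true;
    }

    public override int GetHashCode()
    {
        return 0;
    }
}

public static class ComparerProbe
{
    // Person is not IEquatable<Person>, so the default comparer
    // falls back to Object.Equals(object).
    public static string ProbePerson()
    {
        var a = new Person();
        var b = new Person();
        EqualityComparer<Person>.Default.Equals(a, b);
        return Person.LastCall;
    }

    // IPerson is IEquatable<IPerson>, so the default comparer
    // calls IEquatable<IPerson>.Equals.
    public static string ProbeIPerson()
    {
        IPerson a = new Person();
        IPerson b = new Person();
        EqualityComparer<IPerson>.Default.Equals(a, b);
        return Person.LastCall;
    }
}
```

ProbePerson() reports "Equals(object)" and ProbeIPerson() reports "Equals(IPerson)". Declaring Person as IEquatable<Person> as well would make both collections take the generic path.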
Although IComparable<in T> is contravariant with respect to T, such that any type which implements IComparable<IPerson> is automatically considered an implementation of IComparable<Person>, IEquatable<T> is invariant and is intended for use with sealed types, especially structures. The requirement that Object.GetHashCode() be consistent with both IEquatable<T>.Equals(T) and Object.Equals(Object) generally implies that the latter two methods should behave identically, which in turn implies that one of them should chain to the other. While there is a large performance difference between passing a struct directly to an IEquatable<T> implementation of the proper type and constructing an instance of the structure's boxed heap-object type and having an Equals(Object) implementation copy the structure data out of that, no such performance difference exists with reference types. If IEquatable<T>.Equals(T) and Equals(Object) are going to be equivalent and T is an inheritable reference type, there's no meaningful difference between:
public override bool Equals(object obj)
{
MyType other = obj as MyType;
if (other == null || other.GetType() != this.GetType())
return false;
... test whether other matches this
}
public bool Equals(MyType other)
{
if (other == null || other.GetType() != this.GetType())
return false;
... test whether other matches this
}
The latter could save one typecast, but that's unlikely to make a sufficient performance difference to justify having two methods.

Why doesn't Contains call CompareTo nor Equals?

While comparing instances of a custom class, I noticed that a call to Contains doesn't work the way I expect it to. Assuming that the default comparison goes by the reference (pointer or whatever it's called), I implemented both CompareTo and Equals. I made sure to be implementing IComparable, of course.
It still doesn't work, and I get no hits when I put breakpoints on those methods.
What can I be missing and is the best option to use extension methods if I'm not?
public override bool Equals(Object input)
{
return Id == ((MyType) input).Id;
}
public int CompareTo(Object input)
{
return Id - ((MyType)input).Id;
}
A better implementation could be:
public bool Equals(MyType other)
{
// if 'other' is a null reference, or if 'other' is more derived or less derived
if ((object)other == (object)null || other.GetType() != GetType())
return false;
// OK, check members (assuming 'Id' has a type that makes '==' a wise choice)
return Id == other.Id;
}
public override bool Equals(object obj)
{
// call to other overload
return Equals(obj as MyType);
}
public override int GetHashCode()
{
return Id.GetHashCode();
}
You can mark the class as implementing IEquatable<MyType> in that case (but it will work even without that).
Regarding GetHashCode: Always remember to override it. You should have seen a compiler warning that it was problematic to override Equals(object) without overriding GetHashCode. Never keep the code return base.GetHashCode() in the override (assuming the base class is System.Object). Either implement something based on the members that participate in Equals or, if you do not think GetHashCode will actually be used in your case, say:
public override int GetHashCode()
{
throw new NotSupportedException("We don't have GetHashCode, sorry");
}
If you absolutely know that you will only be using List<>.Contains, and not e.g. Dictionary<,>, HashSet<> or LINQ's Distinct(), etc., it could work with GetHashCode() simply throwing.
IComparable<MyType> is not needed unless you sort List<MyType> or MyType[], or you use LINQ's OrderBy with MyType, or you use SortedDictionary<,> or SortedSet<>.
Overloading operator == is not needed for these uses.
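Put together, the advice above could look like the following sketch. The constructor and the ContainsDemo helper are assumptions added so the example is self-contained; the Equals and GetHashCode bodies follow the implementation suggested above.

```csharp
using System;
using System.Collections.Generic;

public class MyType : IEquatable<MyType>
{
    public int Id { get; }

    // Hypothetical constructor; the original class definition is not shown.
    public MyType(int id)
    {
        Id = id;
    }

    public bool Equals(MyType other)
    {
        // Reject null and instances of a more or less derived type.
        if (ReferenceEquals(other, null) || other.GetType() != GetType())
            return false;
        return Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        // Chain to the typed overload so both paths agree.
        return Equals(obj as MyType);
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class ContainsDemo
{
    // List<T>.Contains uses EqualityComparer<T>.Default, which picks up
    // IEquatable<MyType>.Equals here.
    public static bool ListHasId(int id)
    {
        var list = new List<MyType> { new MyType(1), new MyType(2) };
        return list.Contains(new MyType(id));
    }
}
```

ContainsDemo.ListHasId(2) returns true even though the probe is a different instance, and ListHasId(3) returns false.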

Difference between string.GetHashCode and IEqualityComparer<string>.Default.GetHashCode

I would like to use Distinct() with my data, declared as IEnumerable<KeyValuePair<IdentType, string>>. In this case, I have to implement my own IEqualityComparer, and here is my question:
Is there any difference between below implementations?
public int GetHashCode(KeyValuePair<IdentType, string> obj) {
return EqualityComparer<string>.Default.GetHashCode(obj.Value);
}
and
public int GetHashCode(KeyValuePair<IdentType, string> obj) {
return obj.Value.GetHashCode();
}
There is only a small difference between your two methods.
EqualityComparer<string>.Default will return a class of type GenericEqualityComparer<T> if the class implements IEquatable<T> (which string does). So the call GetHashCode(obj.Value) resolves to
public override int GetHashCode(T obj) {
if (obj == null) return 0;
return obj.GetHashCode();
}
which is the same as you calling obj.Value.GetHashCode(); directly, except for the fact that if you have a null string the default comparer will return 0 and the direct call version will throw a null reference exception.
Just one: the equality comparer's GetHashCode will return 0 if the string is null, whereas the second implementation will throw an exception.
One difference is that EqualityComparer<string>.Default.GetHashCode would not crash when you pass null to it.
using System;
using System.Collections.Generic;
public class Test
{
public static void Main()
{
var n = EqualityComparer<string>.Default.GetHashCode(null);
Console.WriteLine(n);
}
}
Other than that, the results would be identical by design, because System.String implements IEquatable<System.String>.
The Default property checks whether type T implements the System.IEquatable<T> generic interface and, if so, returns an EqualityComparer<T> that invokes the implementation of the IEquatable<T>.Equals method. Otherwise, it returns an EqualityComparer<T> that uses the overrides of Object.Equals and Object.GetHashCode provided by T.
No, there isn't. The implementation will be the same since they both call GetHashCode() on the actual class, in this case string.
In the end, the CreateComparer method inside the EqualityComparer creates a GenericEqualityComparer<T>, and the implementation of its GetHashCode is:
public override int GetHashCode(T obj) {
if (obj == null) return 0;
return obj.GetHashCode();
}
In this case, obj will be the original string on which you would otherwise call GetHashCode. The only case that will make it behave differently is when your string is null.
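For completeness, here is a sketch of a full comparer for Distinct() built on the first variant. IdentType is not shown in the question, so int stands in for it; ValueOnlyComparer and DistinctDemo are made-up names, and comparing by Value only is an assumption about the intent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Compares pairs by Value only; 'int' is a stand-in for IdentType.
public sealed class ValueOnlyComparer : IEqualityComparer<KeyValuePair<int, string>>
{
    public bool Equals(KeyValuePair<int, string> x, KeyValuePair<int, string> y)
    {
        // Null-safe: two null values compare equal.
        return EqualityComparer<string>.Default.Equals(x.Value, y.Value);
    }

    public int GetHashCode(KeyValuePair<int, string> obj)
    {
        // Returns 0 for a null Value instead of throwing.
        return EqualityComparer<string>.Default.GetHashCode(obj.Value);
    }
}

public static class DistinctDemo
{
    public static int CountDistinctValues()
    {
        var data = new[]
        {
            new KeyValuePair<int, string>(1, "a"),
            new KeyValuePair<int, string>(2, "a"),
            new KeyValuePair<int, string>(3, null),
            new KeyValuePair<int, string>(4, null),
        };
        return data.Distinct(new ValueOnlyComparer()).Count();
    }
}
```

CountDistinctValues() returns 2 (one entry for "a", one for null). With obj.Value.GetHashCode() in place of the comparer call, the same data would throw a NullReferenceException on the third pair.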

Creating an extension method against a generic interface or as a generic constraint?

I'm not really sure if there is any real difference here in the two signatures:
public static class MyCustomExtensions
{
public static bool IsFoo(this IComparable<T> value, T other)
where T : IComparable<T>
{
// ...
}
public static bool IsFoo(this T value, T other)
where T : IComparable<T>
{
// ...
}
}
I think these will essentially operate almost identically, but I'm not quite sure... what am I overlooking here?
Yes there is.
The first signature would match any type that can be compared to T, not just T values. So any type that implements IComparable<int> can be used by the first signature, not just int.
Example:
void Main()
{
10.IsFoo(20).Dump();
new Dummy().IsFoo(20).Dump();
IComparable<int> x = 10;
x.IsFoo(20).Dump();
IComparable<int> y = new Dummy();
y.IsFoo(20).Dump();
}
public class Dummy : IComparable<int>
{
public int CompareTo(int other)
{
return 0;
}
}
public static class Extensions
{
public static bool IsFoo<T>(this IComparable<T> value, T other)
where T : IComparable<T>
{
Debug.WriteLine("1");
return false;
}
public static bool IsFoo<T>(this T value, T other)
where T : IComparable<T>
{
Debug.WriteLine("2");
return false;
}
}
Will output:
2
False
1
False
1
False
1
False
I tested this with LINQPad.
If we rewrite it slightly to use IList instead of IComparable, wouldn't that be the same question?
In that case it is clear to see that IsFoo1 is completely different to IsFoo2.
Because IsFoo1 accepts first argument of essentially IList<IList<T>>
whereas IsFoo2 accepts first argument of just IList<T>
public static class MyCustomExtensions
{
public static bool IsFoo1<T>(IList<T> value, T other)
where T : IList<T>
{
// ...
}
public static bool IsFoo2<T>(T value, T other)
where T : IList<T>
{
// ...
}
}
So no they are not the same at all.
They aren't identical. In the first one, the first parameter is an IComparable<T> rather than a T, so your actual types would be IComparable<IComparable<T>> and IComparable<T>.
EDITED based on Lee's feedback: these below look identical, but while both require value and other to implement IComparable, the second also requires that they are assignable to T.
public static bool IsFoo<T>(IComparable<T> value, IComparable<T> other)
{
// ...
}
public static bool IsFoo<T>(T value, T other)
where T : IComparable<T>
{
// ...
}
The difference is pretty obvious. Note that you have to define T either in the method (generic method) or in a containing class (generic class, not possible with extension methods). Below I call the two methods 1 and 2:
public static bool IsFoo1<T>(this IComparable<T> value, T other)
where T : IComparable<T>
{
return true;
}
public static bool IsFoo2<T>(this T value, T other)
where T : IComparable<T>
{
return true;
}
There are differences depending on whether T is a value type or a reference type. You can restrict to either by using constraint where T : struct, IComparable<T> or where T : class, IComparable<T>.
Generally with any type T:
Some crazy type X might be declared IComparable<Y> where Y is distinct from (and unrelated to) X.
With value types:
With IsFoo1 the first parameter value will be boxed, whereas value in IsFoo2 will not be boxed. Value types are sealed, and contravariance does not apply to value types, so this is the most important difference in this case.
With reference types:
With reference type T, boxing is not an issue. But note that IComparable<> is contravariant ("in") in its type argument. This is important if some non-sealed class implements IComparable<>. I used these two classes:
class C : IComparable<C>
{
public int CompareTo(C other)
{
return 0;
}
}
class D : C
{
}
With them, the following calls are possible, some of them because of inheritance and/or contravariance:
// IsFoo1
new C().IsFoo1<C>(new C());
new C().IsFoo1<C>(new D());
new D().IsFoo1<C>(new C());
new D().IsFoo1<C>(new D());
new C().IsFoo1<D>(new D());
new D().IsFoo1<D>(new D());
// IsFoo2
new C().IsFoo2<C>(new C());
new C().IsFoo2<C>(new D());
new D().IsFoo2<C>(new C());
new D().IsFoo2<C>(new D());
//new C().IsFoo2<D>(new D()); // ILLEGAL
new D().IsFoo2<D>(new D());
Of course, in many cases the generic argument <C> can be left out because it will be inferred, but I included it here for clarity.

Are custom objects equal by value with List<>

I have the following classes:
public class MyDocuments
{
public DateTime registeredDate;
public string version;
public List<Document> registeredDocuments;
}
public class Document
{
public string name;
public List<File> registeredFiles;
}
public class File
{
public string name;
public string content;
}
I have an instance of MyDocuments which has several documents in List<Document> registeredDocuments. I get a new List<Document> from the user.
How can I verify that the new object doesn't exist in the list? I want to compare by value not reference.
I'm thinking of using HashSet instead of List. Is this the proper approach?
How are equality comparisons performed?
Whenever the BCL classes want to perform an equality check between objects of some type T, they do so by calling one or both of the methods in some implementation of IEqualityComparer<T>. To get hold of such an implementation, the framework looks to EqualityComparer<T>.Default.
As mentioned in the documentation, this property produces an IEqualityComparer<T> like this:
The Default property checks whether type T implements the
System.IEquatable<T> interface and, if so, returns an
EqualityComparer<T> that uses that implementation. Otherwise, it
returns an EqualityComparer<T> that uses the overrides of
Object.Equals and Object.GetHashCode provided by T.
What are my options?
So, in general, to dictate how equality comparisons should be performed you can:
1. Explicitly provide an implementation of IEqualityComparer<T> to the class or method that performs equality checks. This option is not very visible with List<T>, but many LINQ methods (such as Contains) do support it.
2. Make your class implement IEquatable<T>. This will make EqualityComparer<T>.Default use this implementation, and is a good choice whenever there is an obvious "natural" way to compare objects of type T.
3. Override object.GetHashCode and object.Equals without implementing IEquatable<T>. However, this is simply an inferior version of #2 and AFAIK should always be avoided.
Which option to pick?
A good rule of thumb is: if there is an obvious and natural way to compare objects of class T, consider having it implement IEquatable<T>; this will make sure your comparison logic is used throughout the framework without any additional involvement. If there is no obvious candidate, or if you want to compare in a manner different than the default, implement your own IEqualityComparer<T> and pass the implementation as an argument to the class or method that needs to perform equality checks.
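As a sketch of that rule of thumb, the example below gives a simplified Document a "natural" equality via IEquatable<T> and adds a separate comparer for a non-default comparison. The simplification to a single Name field, the case-insensitive rule, and the names IgnoreCaseNameComparer and DocumentLookup are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the Document class: equality by Name only.
public sealed class Document : IEquatable<Document>
{
    public string Name;

    public bool Equals(Document other)
    {
        return other != null && string.Equals(Name, other.Name, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Document);
    }

    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

// Alternative comparison, passed explicitly where it is needed.
public sealed class IgnoreCaseNameComparer : IEqualityComparer<Document>
{
    public bool Equals(Document x, Document y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(Document d)
    {
        return d == null || d.Name == null
            ? 0
            : StringComparer.OrdinalIgnoreCase.GetHashCode(d.Name);
    }
}

public static class DocumentLookup
{
    // List<T>.Contains picks up IEquatable<Document> through the default comparer.
    public static bool ContainsNatural(List<Document> docs, Document d)
    {
        return docs.Contains(d);
    }

    // The LINQ overload of Contains accepts an explicit comparer.
    public static bool ContainsIgnoringCase(List<Document> docs, Document d)
    {
        return docs.Contains(d, new IgnoreCaseNameComparer());
    }
}
```

With a list holding a document named "Readme", ContainsNatural finds a new Document named "Readme" but not one named "README", while ContainsIgnoringCase finds both.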
You will need to implement the Equals() method, and probably GetHashCode() as well. See this answer for an example.
You should implement IEquatable<T>.
When you implement this interface on your custom object, any equality checks (e.g. Contains, IndexOf) are automatically done using your object's implementation.
Override the object.Equals method.
Here's an example straight from the documentation:
public class Person
{
private string idNumber;
private string personName;
public Person(string name, string id)
{
this.personName = name;
this.idNumber = id;
}
public override bool Equals(Object obj)
{
Person personObj = obj as Person;
if (personObj == null)
return false;
else
return idNumber.Equals(personObj.idNumber);
}
public override int GetHashCode()
{
return this.idNumber.GetHashCode();
}
}
The Equals method returns a bool indicating whether obj is equal to this.
Something like this at the top level, continued down at the sub-levels:
public class MyDocuments
{
public DateTime registeredDate;
public string version;
public HashSet<Document> registeredDocuments;
public override bool Equals(Object o)
{
if( !(o is MyDocuments) ) return false;
MyDocuments that = (MyDocuments)o;
if( !String.Equals(this.version, that.version) ) return false;
if( this.registeredDocuments.Count != that.registeredDocuments.Count ) return false;
// assuming registeredDate doesn't matter for equality...
foreach( Document d in this.registeredDocuments )
if( !that.registeredDocuments.Contains(d) )
return false;
return true;
}
public override int GetHashCode()
{
int ret = version.GetHashCode();
foreach (Document d in this.registeredDocuments)
ret ^= d.GetHashCode(); // xor isn't great, but better than nothing.
return ret;
}
}
Note: Caching could be useful for the HashCode values if the properties were change-aware.
