I would like to use Distinct() with my data, declared as IEnumerable<KeyValuePair<IdentType, string>>. In this case, I have to implement my own IEqualityComparer, and here is my question:
Is there any difference between the implementations below?
public int GetHashCode(KeyValuePair<IdentType, string> obj) {
return EqualityComparer<string>.Default.GetHashCode(obj.Value);
}
and
public int GetHashCode(KeyValuePair<IdentType, string> obj) {
return obj.Value.GetHashCode();
}
There is only a small difference between your two methods.
EqualityComparer<string>.Default returns an instance of GenericEqualityComparer<T> if the type implements IEquatable<T> (which string does). So the call to GetHashCode(obj.Value) ends up in
public override int GetHashCode(T obj) {
if (obj == null) return 0;
return obj.GetHashCode();
}
which is the same as you calling obj.Value.GetHashCode(); directly, except for the fact that if you have a null string the default comparer will return 0 and the direct call version will throw a null reference exception.
Just one: the equality comparer's GetHashCode will return 0 if the string is null, whereas the second implementation will throw an exception.
One difference is that EqualityComparer<string>.Default.GetHashCode would not crash when you pass null to it.
using System;
using System.Collections.Generic;
public class Test
{
public static void Main()
{
var n = EqualityComparer<string>.Default.GetHashCode(null);
Console.WriteLine(n);
}
}
Other than that, the results would be identical by design, because System.String implements IEquatable<System.String>
The Default property checks whether type T implements the System.IEquatable<T> generic interface and, if so, returns an EqualityComparer<T> that invokes the implementation of the IEquatable<T>.Equals method. Otherwise, it returns an EqualityComparer<T> that uses the overrides of Object.Equals and Object.GetHashCode provided by T.
No. It doesn't. The implementation will be the same since they both call GetHashCode() on the actual class, in this case string.
In the end, the CreateComparer method inside EqualityComparer creates a GenericEqualityComparer, and the implementation of its GetHashCode is:
public override int GetHashCode(T obj) {
if (obj == null) return 0;
return obj.GetHashCode();
}
In this case, obj will be the original string on which you would otherwise call GetHashCode directly. The only case in which the two behave differently is when your string is null.
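To tie the answers above back to the original Distinct() scenario, here is a minimal sketch of a complete comparer. The question does not show IdentType, so the enum below is a hypothetical stand-in, and the class name ValueOnlyComparer is likewise an assumption. The comparer looks only at Value, as the GetHashCode in the question does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in: the question does not show IdentType.
public enum IdentType { A, B }

// Compares pairs by Value only (an assumption based on the question's GetHashCode).
public class ValueOnlyComparer : IEqualityComparer<KeyValuePair<IdentType, string>>
{
    public bool Equals(KeyValuePair<IdentType, string> x, KeyValuePair<IdentType, string> y)
    {
        // Null-safe: two null values compare equal.
        return EqualityComparer<string>.Default.Equals(x.Value, y.Value);
    }

    public int GetHashCode(KeyValuePair<IdentType, string> obj)
    {
        // Null-safe: yields 0 for a null Value instead of throwing.
        return EqualityComparer<string>.Default.GetHashCode(obj.Value);
    }
}

public static class Program
{
    public static void Main()
    {
        var data = new List<KeyValuePair<IdentType, string>>
        {
            new KeyValuePair<IdentType, string>(IdentType.A, "x"),
            new KeyValuePair<IdentType, string>(IdentType.B, "x"),
            new KeyValuePair<IdentType, string>(IdentType.A, null),
            new KeyValuePair<IdentType, string>(IdentType.B, null),
        };

        // With obj.Value.GetHashCode() the null entries would throw here.
        int count = data.Distinct(new ValueOnlyComparer()).Count();
        Console.WriteLine(count); // 2
    }
}
```

If null values can never occur in the data, both GetHashCode variants from the question behave the same inside this comparer.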
While comparing instances of a custom class, I noticed that a call to Contains doesn't work the way I expect it to. Assuming that the default comparison goes by the reference (pointer or whatever it's called), I implemented both CompareTo and Equals. I made sure to be implementing IComparable, of course.
It still doesn't work, and I get no hits when I put breakpoints on those methods.
What can I be missing and is the best option to use extension methods if I'm not?
public override bool Equals(Object input)
{
return Id == ((MyType) input).Id;
}
public int CompareTo(Object input)
{
return Id - ((MyType)input).Id;
}
A better implementation could be:
public bool Equals(MyType other)
{
// if 'other' is a null reference, or if 'other' is more derived or less derived
if ((object)other == (object)null || other.GetType() != GetType())
return false;
// OK, check members (assuming 'Id' has a type that makes '==' a wise choice)
return Id == other.Id;
}
public override bool Equals(object obj)
{
// call to other overload
return Equals(obj as MyType);
}
public override int GetHashCode()
{
return Id.GetHashCode();
}
You can mark the class as implementing IEquatable<MyType> in that case (but it will work even without that).
Regarding GetHashCode: Always remember to override it. You should have seen a compiler warning that it was problematic to override Equals(object) without overriding GetHashCode. Never keep the code return base.GetHashCode() in the override (assuming the base class is System.Object). Either implement something based on the members that participate in Equals or, if you do not think GetHashCode will actually be used in your case, say:
public override int GetHashCode()
{
throw new NotSupportedException("We don't have GetHashCode, sorry");
}
If you absolutely know that you will only be using List<>.Contains, and not e.g. Dictionary<,>, HashSet<> or LINQ's Distinct(), etc., it could work with GetHashCode() simply throwing.
IComparable<MyType> is not needed unless you sort a List<MyType> or MyType[], use LINQ's OrderBy with MyType, or use SortedDictionary<,> or SortedSet<>.
Overloading operator == is not needed for these uses.
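As a hedged illustration of the points above, the following sketch assumes MyType has an int Id and a one-argument constructor (the question shows neither in full). It shows List<T>.Contains finding a distinct but equal instance once Equals and GetHashCode are in place.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: the shape of MyType (an int Id and this constructor) is assumed.
public class MyType : IEquatable<MyType>
{
    public int Id { get; private set; }

    public MyType(int id) { Id = id; }

    public bool Equals(MyType other)
    {
        // Reject null and instances of a more or less derived type.
        if ((object)other == null || other.GetType() != GetType())
            return false;
        return Id == other.Id;
    }

    public override bool Equals(object obj) { return Equals(obj as MyType); }

    public override int GetHashCode() { return Id.GetHashCode(); }
}

public static class Program
{
    public static void Main()
    {
        var list = new List<MyType> { new MyType(1), new MyType(2) };

        // A different instance with the same Id is found via Equals.
        Console.WriteLine(list.Contains(new MyType(2))); // True
        Console.WriteLine(list.Contains(new MyType(3))); // False
    }
}
```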
Consider this struct:
public struct MyNumber
{
private readonly int _value;
public MyNumber(int myNumber)
{
_value = myNumber;
}
public int Value
{
get { return _value; }
}
public override bool Equals(object obj)
{
if (obj is MyNumber)
return this == (MyNumber) obj;
if (obj is int)
return Value == (int)obj;
return false;
}
public override string ToString()
{
return Value.ToString();
}
public static implicit operator int(MyNumber myNumber)
{
return myNumber.Value;
}
public static implicit operator MyNumber(int myNumber)
{
return new MyNumber(myNumber);
}
}
When I do this in a unit test:
Assert.AreEqual(new MyNumber(123), 123);
It's green.
But this test fails:
Assert.AreEqual(123, new MyNumber(123));
Why is this so? I'm guessing it's because the int type determines the equality in the second case, whereas in the first case my struct determines it. But my struct is implicitly convertible to int, so shouldn't that help?
How can I make the Assert.AreEqual work in both ways? I'm using MSTest by the way.
Update
Implementing IEquatable<int> or IComparable<int> doesn't help.
The first assertion will invoke MyNumber.Equals, which you have implemented so that it succeeds if the argument is an Int32 equal to the value of the MyNumber.
The second assertion will invoke Int32.Equals, and it will fail because the argument is a MyNumber, which Int32 does not know about or understand.
You cannot make your second unit test succeed because you assert that an Int32 should be equal to your value. It cannot be because it is different. It is the code in Int32.Equals that decides if the second value is equal. You have implemented some casting operators that you can unit test so these assertions should work:
Assert.AreEqual(123, (int) new MyNumber(123));
Assert.AreEqual((MyNumber) 123, new MyNumber(123));
Even though the conversions are declared implicit, they will not be invoked automatically by Assert.AreEqual, because this method expects two parameters of type Object; you have to invoke them explicitly as I did above.
Because of the special handling in your MyNumber.Equals, you now have a type whose equality is not symmetric: MyNumber(123) equals Int32(123) is true, but Int32(123) equals MyNumber(123) is false. You should avoid that, so I recommend that you remove the special-case handling of ints in MyNumber.Equals and instead rely on the implicit conversions, which will work for you most of the time. When they do not, you will have to make an explicit cast.
To quote from Object.Equals:
The following statements must be true for all implementations of the Equals(Object) method. In the list, x, y, and z represent object references that are not null.
...
x.Equals(y) returns the same value as y.Equals(x).
Your Equals implementation breaks this hard requirement. ((object)x).Equals(123) must return the same value as ((object)123).Equals(x), if x is not null, regardless of what type it has.
There's a whole lot of code out there that, correctly, assumes that it doesn't matter which of the two objects is asked to perform the comparison. Design your code so that that assumption is not invalidated.
Effectively, this means that you must design your class in such a way that it doesn't compare equal to any integer type, no matter how much you might prefer otherwise.
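A minimal sketch of that advice, assuming the int special case is simply dropped from the question's struct. Equality is then symmetric, and comparisons against int go through an explicit cast. The IEquatable<MyNumber> implementation and GetHashCode are additions for illustration; the question's code does not show them.

```csharp
using System;

public struct MyNumber : IEquatable<MyNumber>
{
    private readonly int _value;

    public MyNumber(int value) { _value = value; }

    public int Value { get { return _value; } }

    // Only another MyNumber can be equal; ints are deliberately excluded.
    public bool Equals(MyNumber other) { return _value == other._value; }

    public override bool Equals(object obj)
    {
        return obj is MyNumber && Equals((MyNumber)obj);
    }

    public override int GetHashCode() { return _value.GetHashCode(); }

    public static implicit operator int(MyNumber n) { return n.Value; }
    public static implicit operator MyNumber(int i) { return new MyNumber(i); }
}

public static class Program
{
    public static void Main()
    {
        object boxedNumber = new MyNumber(123);
        object boxedInt = 123;

        // Symmetric: both directions agree.
        Console.WriteLine(boxedNumber.Equals(boxedInt)); // False
        Console.WriteLine(boxedInt.Equals(boxedNumber)); // False

        // Explicit casts make the intended comparison unambiguous.
        Console.WriteLine(boxedInt.Equals((int)new MyNumber(123)));   // True
        Console.WriteLine(boxedNumber.Equals((object)(MyNumber)123)); // True
    }
}
```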
Implement IComparable<int> in your MyNumber struct as follows:
public int CompareTo(int other)
{
return Value.CompareTo(other); // positive when this value is greater than 'other'
}
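This is intended to make assertions such as the following work. Note that Assert.Greater and Assert.Less are NUnit methods rather than MSTest ones, and that the update to the question reports that implementing IComparable<int> did not help under MSTest: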
Assert.AreEqual(new MyNumber(123), 123);
Assert.AreEqual(123, new MyNumber(123));
Assert.Greater(124, new MyNumber(123));
Assert.Less(124, new MyNumber(125));
I have the following classes:
public class MyDocuments
{
public DateTime registeredDate;
public string version;
public List<Document> registeredDocuments;
}
public class Document
{
public string name;
public List<File> registeredFiles;
}
public class File
{
public string name;
public string content;
}
I have an instance of MyDocuments which has several documents in List<Document> registeredDocuments. I get a new List<Document> from the user.
How can I verify that the new object doesn't exist in the list? I want to compare by value not reference.
I'm thinking of using HashSet instead of List. Is this the proper approach?
How are equality comparisons performed?
Whenever the BCL classes want to perform an equality check between objects of some type T, they do so by calling one or both of the methods in some implementation of IEqualityComparer<T>. To get hold of such an implementation, the framework looks to EqualityComparer<T>.Default.
As mentioned in the documentation, this property produces an IEqualityComparer<T> like this:
The Default property checks whether type T implements the
System.IEquatable<T> interface and, if so, returns an
EqualityComparer<T> that uses that implementation. Otherwise, it
returns an EqualityComparer<T> that uses the overrides of
Object.Equals and Object.GetHashCode provided by T.
What are my options?
So, in general, to dictate how equality comparisons should be performed you can:
1. Explicitly provide an implementation of IEqualityComparer<T> to the class or method that performs equality checks. This option is not very visible with List<T>, but many LINQ methods (such as Contains) do support it.
2. Make your class implement IEquatable<T>. This will make EqualityComparer<T>.Default use this implementation, and is a good choice whenever there is an obvious "natural" way to compare objects of type T.
3. Override object.GetHashCode and object.Equals without implementing IEquatable<T>. However, this is simply an inferior version of #2 and AFAIK should always be avoided.
Which option to pick?
A good rule of thumb is: if there is an obvious and natural way to compare objects of class T, consider having it implement IEquatable<T>; this will make sure your comparison logic is used throughout the framework without any additional involvement. If there is no obvious candidate, or if you want to compare in a manner different than the default, implement your own IEqualityComparer<T> and pass the implementation as an argument to the class or method that needs to perform equality checks.
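The explicit-comparer option can be sketched as follows. This is only an illustration: DocumentNameComparer is a hypothetical name, the Document class is reduced to its name field, and treating two documents as equal when their names match is an assumption, since the question does not define what "equal by value" means for the file lists.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced version of the question's Document class (file list omitted).
public class Document
{
    public string name;
}

// Hypothetical comparer: treats two documents as equal when their names match.
public class DocumentNameComparer : IEqualityComparer<Document>
{
    public bool Equals(Document x, Document y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.name, y.name);
    }

    public int GetHashCode(Document obj)
    {
        return (obj == null || obj.name == null) ? 0 : obj.name.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        var registered = new List<Document> { new Document { name = "a.txt" } };
        var candidate = new Document { name = "a.txt" };

        // List<T>.Contains uses the default comparer: reference equality here.
        Console.WriteLine(registered.Contains(candidate)); // False

        // Enumerable.Contains accepts the comparer explicitly.
        Console.WriteLine(registered.Contains(candidate, new DocumentNameComparer())); // True
    }
}
```

The same comparer instance could also be passed to the HashSet<Document> constructor if a set is preferred over a list.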
You will need to implement the Equals() method, and probably GetHashCode() as well. See this answer for an example.
You should implement IEquatable<T>.
When you implement this interface on your custom object, any equality checks (e.g. Contains, IndexOf) are automatically done using your object's implementation.
Override the Object.Equals method.
Here's an example straight from the documentation:
public class Person
{
private string idNumber;
private string personName;
public Person(string name, string id)
{
this.personName = name;
this.idNumber = id;
}
public override bool Equals(Object obj)
{
Person personObj = obj as Person;
if (personObj == null)
return false;
else
return idNumber.Equals(personObj.idNumber);
}
public override int GetHashCode()
{
return this.idNumber.GetHashCode();
}
}
The Equals method returns a bool indicating whether obj is equal to this.
Something like this at the top level, continued down at the sub-levels:
public class MyDocuments
{
public DateTime registeredDate;
public string version;
public HashSet<Document> registeredDocuments;
public override bool Equals(Object o)
{
if( !(o is MyDocuments) ) return false;
MyDocuments that = (MyDocuments)o;
if( !String.Equals(this.version, that.version) ) return false;
if( this.registeredDocuments.Count != that.registeredDocuments.Count ) return false;
// assuming registeredDate doesn't matter for equality...
foreach( Document d in this.registeredDocuments )
if( !that.registeredDocuments.Contains(d) )
return false;
return true;
}
public override int GetHashCode()
{
int ret = version.GetHashCode();
foreach (Document d in this.registeredDocuments)
ret ^= d.GetHashCode(); // xor isn't great, but better than nothing.
return ret;
}
}
Note: Caching could be useful for the HashCode values if the properties were change-aware.
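As a possible shortcut for the count check and Contains loop above, HashSet<T>.SetEquals performs the same order-independent comparison. The sketch below assumes Document overrides Equals and GetHashCode based on its name only, which is a simplification of the question's class.

```csharp
using System;
using System.Collections.Generic;

// Simplified Document: equality is assumed to depend on the name alone.
public class Document
{
    public string name;

    public override bool Equals(object o)
    {
        Document that = o as Document;
        return that != null && string.Equals(name, that.name);
    }

    public override int GetHashCode()
    {
        return name == null ? 0 : name.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new HashSet<Document> { new Document { name = "x" }, new Document { name = "y" } };
        var b = new HashSet<Document> { new Document { name = "y" }, new Document { name = "x" } };
        var c = new HashSet<Document> { new Document { name = "x" } };

        Console.WriteLine(a.SetEquals(b)); // True: same elements, order irrelevant
        Console.WriteLine(a.SetEquals(c)); // False: c is missing "y"
    }
}
```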
Suppose I've got a generic MyClass<T> that needs to compare two objects of type <T>. Usually I'd do something like ...
void DoSomething(T o1, T o2)
{
if(o1.Equals(o2))
{
...
}
}
Now suppose my MyClass<T> has a constructor that supports passing a custom IEqualityComparer<T>, similar to Dictionary<TKey, TValue>. In that case I'd need to do ...
private IEqualityComparer<T> _comparer;
public MyClass() {}
public MyClass(IEqualityComparer<T> comparer)
{
_comparer = comparer;
}
void DoSomething(T o1, T o2)
{
if(_comparer != null ? _comparer.Equals(o1, o2) : o1.Equals(o2)) // use o1.Equals only when no comparer was supplied
{
...
}
}
To remove this lengthy if statement, it'd be good if I could have _comparer default to a 'default comparer' if the regular constructor is used. I searched for something like typeof(T).GetDefaultComparer() but wasn't able to find anything like it.
I did find EqualityComparer<T>.Default, could I use that? And would then this snippet ...
public MyClass()
{
_comparer = EqualityComparer<T>.Default;
}
void DoSomething(T o1, T o2)
{
if(_comparer.Equals(o1, o2))
{
...
}
}
... provide the same results as using o1.Equals(o2) for all possible cases?
(As a side note, would this mean I'd also need to use any special generic constraint for <T>?)
It should be the same, but it is not guaranteed, because it depends on implementation details of the type T.
Explanation:
Without a constraint on T, o1.Equals(o2) will call Object.Equals, even if T implements IEquatable<T>.
EqualityComparer<T>.Default, however, will use Object.Equals only if T doesn't implement IEquatable<T>. If T does implement that interface, it uses IEquatable<T>.Equals.
As long as T's implementation of Object.Equals just calls IEquatable<T>.Equals, the result is the same. But in the following example, the result is not the same:
public class MyObject : IEquatable<MyObject>
{
public int ID {get;set;}
public string Name {get;set;}
public override bool Equals(object o)
{
var other = o as MyObject;
return other == null ? false : other.ID == ID;
}
public bool Equals(MyObject o)
{
return o.Name == Name;
}
}
Now, it doesn't make any sense to implement a class like this. But you will have the same problem if the implementer of MyObject simply forgot to override Object.Equals.
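The divergence can be observed directly. The sketch below reuses the MyObject idea from above (with null checks and a GetHashCode added so that it runs cleanly) and two hypothetical helper methods, ViaObjectEquals and ViaDefaultComparer, that mirror the two styles of comparison from the question.

```csharp
using System;
using System.Collections.Generic;

public class MyObject : IEquatable<MyObject>
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Compares by ID.
    public override bool Equals(object o)
    {
        var other = o as MyObject;
        return other != null && other.ID == ID;
    }

    public override int GetHashCode() { return ID; }

    // Compares by Name: deliberately inconsistent with Equals(object).
    public bool Equals(MyObject o)
    {
        return o != null && o.Name == Name;
    }
}

public static class Program
{
    // T is unconstrained, so o1.Equals(o2) binds to Object.Equals(object).
    public static bool ViaObjectEquals<T>(T o1, T o2) { return o1.Equals(o2); }

    // The default comparer prefers IEquatable<T>.Equals when T implements it.
    public static bool ViaDefaultComparer<T>(T o1, T o2)
    {
        return EqualityComparer<T>.Default.Equals(o1, o2);
    }

    public static void Main()
    {
        var a = new MyObject { ID = 1, Name = "left" };
        var b = new MyObject { ID = 1, Name = "right" };

        Console.WriteLine(ViaObjectEquals(a, b));    // True  (same ID)
        Console.WriteLine(ViaDefaultComparer(a, b)); // False (different Name)
    }
}
```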
Conclusion:
Using EqualityComparer<T>.Default is a good way to go, because you don't need to support buggy objects!
By default, unless overridden in a class, Object.Equals(a, b) and a.Equals(b) compare reference types by reference.
What comparer will be returned by EqualityComparer<T>.Default depends on T. For example, if T : IEquatable<> then the appropriate EqualityComparer<T> will be created.
Yes, I think it would be wise to use the EqualityComparer<T>.Default, because it uses the implementation of IEquatable<T> if the type T implements it, or the override of Object.Equals otherwise. You could do it as follows:
private IEqualityComparer<T> _comparer;
public IEqualityComparer<T> Comparer
{
get { return _comparer ?? EqualityComparer<T>.Default; }
set { _comparer=value;}
}
public MyClass(IEqualityComparer<T> comparer)
{
_comparer = comparer;
}
void DoSomething(T o1, T o2)
{
if(Comparer.Equals(o1, o2))
{
...
}
}
You could use the null-coalescing operator ?? to shorten the if statement, if it really matters:
if ((_comparer ?? EqualityComparer<T>.Default).Equals(o1, o2))
{
}
That's exactly what Dictionary<> and other generic collections in the BCL do if you don't specify a comparer when constructing the object. The benefit of this is that EqualityComparer<T>.Default will return the right comparer for IEquatable<T> types, nullable types, and enums. If T is none of those, it will do a simple Equals comparison like your old code is doing.
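Put together, the pattern might look like the sketch below. AreEqual is a hypothetical public method standing in for the comparison inside DoSomething, and no generic constraint on T is required for it to compile.

```csharp
using System;
using System.Collections.Generic;

public class MyClass<T>
{
    private readonly IEqualityComparer<T> _comparer;

    public MyClass() : this(null) { }

    public MyClass(IEqualityComparer<T> comparer)
    {
        // Fall back to the default comparer when none is supplied.
        _comparer = comparer ?? EqualityComparer<T>.Default;
    }

    // Hypothetical public wrapper around the comparison in DoSomething.
    public bool AreEqual(T o1, T o2)
    {
        return _comparer.Equals(o1, o2);
    }
}

public static class Program
{
    public static void Main()
    {
        var byDefault = new MyClass<string>();
        var ignoreCase = new MyClass<string>(StringComparer.OrdinalIgnoreCase);

        Console.WriteLine(byDefault.AreEqual("abc", "ABC"));  // False
        Console.WriteLine(ignoreCase.AreEqual("abc", "ABC")); // True
        Console.WriteLine(byDefault.AreEqual(null, null));    // True, no NullReferenceException
    }
}
```

Unlike o1.Equals(o2), the comparer-based call also tolerates a null first argument, which is one more behavioural difference to keep in mind.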
So I have a class which overrides Equals(object obj) and GetHashCode() along with implementing IEquatable. To make working with this type a little more natural when checking for equality I thought, heck, I'd overload the equality operator and inequality operator, no worries...
Uh oh, worries... consider the following - where both myType instances are NOT null:
if (myType != container.myType) //NullReferenceException
{
//never get here
}
//never get here either
Now, container is just another class to hold an instance of myType among other things which is used for caching items.
Here's the actual (relevant) code from myType:
public class MyType : IEquatable<MyType>
{
public static bool operator ==(MyType myTypeA, MyType myTypeB)
{
return myTypeA.Equals(myTypeB);
}
public static bool operator !=(MyType myTypeA, MyType myTypeB)
{
return !(myTypeA == myTypeB);
}
public override bool Equals(object obj)
{
if (obj != null && obj is MyType)
{
return Equals((MyType)obj);
}
return false;
}
public bool Equals(MyType other)
{
if (other != null)
{
return other.ToString() == ToString();
}
return false;
}
}
Any experience on this front?
Thanks.
Couple of pointers -
If you've overloaded == and != on a class, make sure to use ReferenceEquals rather than == to check for null inside the overload implementations. Using == there calls your overloaded operator again, which either recurses indefinitely or tries to call Equals on a null reference, which is probably what is happening here.
Don't overload == and != on classes. Those operators are meant for value equality, and classes aren't really designed to have value equality. Either remove the operator overloads, or make MyType a struct.
Tricky one... the problem is that you use your overloaded != operator inside the Equals method, as follows:
public bool Equals(MyType other)
{
if (other != null)
It goes to your overloaded != operator, which in turn goes to your == operator, which tries to call null.Equals...
As the others have stated, you need to be careful when checking for nulls, as the check will call your equality function again, normally resulting in a StackOverflowException.
When I use the IEquatable interface on classes I normally use the following code:
public override bool Equals(object obj)
{
// If obj isn't MyType then 'as' will pass in null
return this.Equals(obj as MyType);
}
public bool Equals(MyType other)
{
if (object.ReferenceEquals(other, null))
{
return false;
}
// Actual comparison code here
return other.ToString() == this.ToString();
}
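For completeness, a sketch of how the operators from the question could be made null-safe along the same lines. The _text field and the constructor are hypothetical; they only give ToString() something to return, since the question does not show the state of MyType.

```csharp
using System;

public class MyType : IEquatable<MyType>
{
    private readonly string _text; // hypothetical state backing ToString()

    public MyType(string text) { _text = text; }

    public override string ToString() { return _text; }

    public static bool operator ==(MyType a, MyType b)
    {
        // ReferenceEquals avoids re-entering this operator.
        if (ReferenceEquals(a, null)) return ReferenceEquals(b, null);
        return a.Equals(b);
    }

    public static bool operator !=(MyType a, MyType b) { return !(a == b); }

    public override bool Equals(object obj) { return Equals(obj as MyType); }

    public bool Equals(MyType other)
    {
        if (ReferenceEquals(other, null)) return false;
        return other.ToString() == ToString();
    }

    public override int GetHashCode()
    {
        string s = ToString();
        return s == null ? 0 : s.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        MyType a = new MyType("x"), b = new MyType("x"), none = null;

        Console.WriteLine(a == b);       // True
        Console.WriteLine(a != none);    // True
        Console.WriteLine(none == null); // True
    }
}
```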