Distinct() not calling equals methods - c#

I've implemented IEqualityComparer and IEquatable (both just to be sure), but when I call the Distinct() method on a collection it does not call the methods that come with it. Here is the code that I execute when calling Distinct().
ObservableCollection<GigViewModel> distinctGigs = new ObservableCollection<GigViewModel>(Gigs.Distinct<GigViewModel>());
return distinctGigs;
I want to return an ObservableCollection that doesn't contain any duplicates of the objects in the 'Gigs' ObservableCollection.
I implement the interfaces like this on the GigViewModel class:
public class GigViewModel : INotifyPropertyChanged, IEqualityComparer<GigViewModel>, IEquatable<GigViewModel>
{
....
}
And override the methods that come with the interfaces like so:
public bool Equals(GigViewModel x, GigViewModel y)
{
if (x.Artiest.Naam == y.Artiest.Naam)
{
return true;
}
else
{
return false;
}
}
public int GetHashCode(GigViewModel obj)
{
return obj.Artiest.Naam.GetHashCode();
}
public bool Equals(GigViewModel other)
{
if (other.Artiest.Naam == this.Artiest.Naam)
{
return true;
}
else
{
return false;
}
}
Thanks for all the help I'm getting. So I've created a separate class that implements IEqualityComparer and passed its instance into the Distinct method. But the methods are still not being triggered.
EqualityComparer:
class GigViewModelComparer : IEqualityComparer<GigViewModel>
{
public bool Equals(GigViewModel x, GigViewModel y)
{
if (x.Artiest.Naam == y.Artiest.Naam)
{
return true;
}
else
{
return false;
}
}
public int GetHashCode(GigViewModel obj)
{
return obj.Artiest.Naam.GetHashCode();
}
}
The Distinct() call:
GigViewModelComparer comp = new GigViewModelComparer();
ObservableCollection<GigViewModel> distinctGigs = new ObservableCollection<GigViewModel>(Gigs.Distinct(comp));
return distinctGigs;
EDIT2:
The GetHashCode() method DOES get called! After implementing the new class. But the collection still contains duplicates. I have a list of 'Gigs' that contain an 'Artiest' (or Artist) object. This Artist has a Naam property which is a String (Name).

So you had the object that itself is being compared implement both IEquatable as well as IEqualityComparer. That generally doesn't make sense. IEquatable is a way of saying an object can compare itself to something else. IEqualityComparer is a way of saying it can compare two different things you give it to each other. You generally want to do one or the other, not both.
If you want to implement IEquatable then the object not only needs to have an Equals method of the appropriate signature, but it needs to override GetHashCode to have a sensible implementation for the given definition of equality. You didn't do that. You created GetHashCode method that takes an object as a parameter, but that's the overload used for IEqualityComparer. You need to override the parameter-less version when using IEquatable (the one defined in Object).
If you want to create a class that implements IEqualityComparer you need to pass the comparer to the Distinct method. Since you've defined the object as its own comparer you'd need to pass in some instance of this object as the second parameter. Of course, this doesn't really make a whole lot of sense this way; so it would be better, if you go this route, to pull out the two methods that go with IEqualityComparer into a new type, and pass an instance of that type to the Distinct method. If you actually passed an object with those definitions in as a comparer, it'd work just fine.
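As a rough sketch of the IEquatable route described above: the `Gig` class and its `ArtistName` property below are hypothetical, simplified stand-ins for the asker's GigViewModel, not the original code. The point is that the parameterless `GetHashCode()` from Object is overridden, so `Distinct()` without a comparer argument can use it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for GigViewModel:
// two gigs are considered equal when their artist names match.
public class Gig : IEquatable<Gig>
{
    public string ArtistName { get; set; }

    // Strongly typed comparison required by IEquatable<Gig>.
    public bool Equals(Gig other)
    {
        return other != null && ArtistName == other.ArtistName;
    }

    // Keep Object.Equals consistent with the typed version.
    public override bool Equals(object obj)
    {
        return Equals(obj as Gig);
    }

    // The parameterless override from Object, not the
    // GetHashCode(T obj) method that belongs to IEqualityComparer<T>.
    public override int GetHashCode()
    {
        return ArtistName == null ? 0 : ArtistName.GetHashCode();
    }
}

public static class GigDemo
{
    public static List<Gig> DistinctGigs()
    {
        var gigs = new List<Gig>
        {
            new Gig { ArtistName = "A" },
            new Gig { ArtistName = "B" },
            new Gig { ArtistName = "A" }
        };
        // No comparer argument: the default comparer finds IEquatable<Gig>.
        return gigs.Distinct().ToList();
    }

    public static void Main()
    {
        Console.WriteLine(DistinctGigs().Count); // prints 2
    }
}
```

If the equality rule should not live on the type itself, the separate-comparer route is the alternative.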

Following MSDN's advice, you'd be best off creating a separate class for your equality comparisons:
We recommend that you derive from the EqualityComparer<T> class
instead of implementing the IEqualityComparer<T> interface, because
the EqualityComparer<T> class tests for equality using the
IEquatable<T>.Equals method instead of the Object.Equals method. This
is consistent with the Contains, IndexOf, LastIndexOf, and Remove
methods of the Dictionary<TKey, TValue> class and other generic
collections.
So, create a class, GigViewModelComparer, that derives from EqualityComparer and put your Equals and GetHashCode methods there.
Then, pass in an instance of that new comparer class in your call to Gigs.Distinct(new GigViewModelComparer()) and it should work. Follow along in the example in the MSDN link I provided above.
I've never seen somebody implement IEqualityComparer in the same class as the type of objects the collection in question contains; that is probably at least part of your problem.
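A minimal sketch of such a comparer class follows. The `Artist`, `Gig`, and `GigComparer` types and their English property names are assumptions made for illustration (the question uses Artiest.Naam); null handling is added because a missing artist would otherwise throw inside the comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical simplified model types.
public class Artist { public string Name { get; set; } }
public class Gig { public Artist Artist { get; set; } }

// Derives from EqualityComparer<T> rather than implementing the interface directly.
public class GigComparer : EqualityComparer<Gig>
{
    public override bool Equals(Gig x, Gig y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return NameOf(x) == NameOf(y);
    }

    public override int GetHashCode(Gig obj)
    {
        string name = obj == null ? null : NameOf(obj);
        return name == null ? 0 : name.GetHashCode();
    }

    // Tolerates a gig without an artist.
    private static string NameOf(Gig gig)
    {
        return gig.Artist == null ? null : gig.Artist.Name;
    }
}

public static class GigComparerDemo
{
    public static int CountDistinct()
    {
        var gigs = new List<Gig>
        {
            new Gig { Artist = new Artist { Name = "A" } },
            new Gig { Artist = new Artist { Name = "A" } },
            new Gig { Artist = new Artist { Name = "B" } }
        };
        // The comparer instance is passed explicitly to Distinct.
        return gigs.Distinct(new GigComparer()).Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // prints 2
    }
}
```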

Related

Why does Equals(object) win over Equals(T) when using an inherited object in Hashset or other Collections?

I am aware of the fact that I always have to override Equals(object) and GetHashCode() when implementing IEquatable<T>.Equals(T).
However, I don't understand, why in some situations the Equals(object) wins over the generic Equals(T).
For example why is the following happening? If I declare IEquatable<T> for an interface and implement a concrete type X for it, the general Equals(object) is called by a Hashset<X> when comparing items of those type against each other. In all other situations where at least one of the sides is cast to the Interface, the correct Equals(T) is called.
Here's a code sample to demonstrate:
public interface IPerson : IEquatable<IPerson> { }
//Simple example implementation of Equals (returns always true)
class Person : IPerson
{
public bool Equals(IPerson other)
{
return true;
}
public override bool Equals(object obj)
{
return true;
}
public override int GetHashCode()
{
return 0;
}
}
private static void doEqualityCompares()
{
var t1 = new Person();
var hst = new HashSet<Person>();
var hsi = new HashSet<IPerson>();
hst.Add(t1);
hsi.Add(t1);
//Direct comparison
t1.Equals(t1); //IEquatable<T>.Equals(T)
hst.Contains(t1); //Equals(object) --> why? both sides inherit of IPerson...
hst.Contains((IPerson)t1); //IEquatable<T>.Equals(T)
hsi.Contains(t1); //IEquatable<T>.Equals(T)
hsi.Contains((IPerson)t1); //IEquatable<T>.Equals(T)
}
HashSet<T> calls EqualityComparer<T>.Default to get the default equality comparer when no comparer is provided.
EqualityComparer<T>.Default determines if T implements IEquatable<T>. If it does, it uses that; if not, it uses object.Equals and object.GetHashCode.
Your Person object implements IEquatable<IPerson> not IEquatable<Person>.
When you have a HashSet<Person> it ends up checking if Person is an IEquatable<Person>, which it's not, so it uses the object methods.
When you have a HashSet<IPerson> it checks if IPerson is an IEquatable<IPerson>, which it is, so it uses those methods.
As for the remaining case, why does the line:
hst.Contains((IPerson)t1);
call the IEquatable Equals method even though it's called on the HashSet<Person>? Here you're calling Contains on a HashSet<Person> and passing in an IPerson. HashSet<Person>.Contains requires the parameter to be a Person; an IPerson is not a valid argument. However, a HashSet<Person> is also an IEnumerable<Person>, and since IEnumerable<T> is covariant, that means it can be treated as an IEnumerable<IPerson>, which has a Contains extension method (through LINQ) which accepts an IPerson as a parameter.
Enumerable.Contains also uses EqualityComparer<T>.Default to get its equality comparer when none is provided. In the case of this method call we're actually calling Contains on an IEnumerable<IPerson>, which means EqualityComparer<IPerson>.Default is checking to see if IPerson is an IEquatable<IPerson>, which it is, so that Equals method is called.
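The selection rule can be observed directly by asking EqualityComparer<T>.Default to compare two instances for different choices of T. This is only an illustrative sketch: the `LastCall` field and the `CallVia` helper are hypothetical additions used to record which method ran.

```csharp
using System;
using System.Collections.Generic;

public interface IPerson : IEquatable<IPerson> { }

public class Person : IPerson
{
    // Hypothetical tracing field: remembers which Equals was invoked last.
    public static string LastCall = "";

    public bool Equals(IPerson other)
    {
        LastCall = "Equals(IPerson)";
        return true;
    }

    public override bool Equals(object obj)
    {
        LastCall = "Equals(object)";
        return true;
    }

    public override int GetHashCode() { return 0; }
}

public static class DefaultComparerDemo
{
    // Compares a and b through the default comparer for T
    // and reports which Equals method that comparer chose.
    public static string CallVia<T>(T a, T b)
    {
        Person.LastCall = "";
        EqualityComparer<T>.Default.Equals(a, b);
        return Person.LastCall;
    }

    public static void Main()
    {
        var p = new Person();
        var q = new Person();
        // Person is not IEquatable<Person>, so the object method is used.
        Console.WriteLine(CallVia<Person>(p, q));  // prints Equals(object)
        // IPerson is IEquatable<IPerson>, so the typed method is used.
        Console.WriteLine(CallVia<IPerson>(p, q)); // prints Equals(IPerson)
    }
}
```

Declaring Person as IEquatable<Person> as well would make the first call use a typed Equals(Person) instead.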
Although IComparable<in T> is contravariant with respect to T, such that any type which implements IComparable<IPerson> would automatically be considered an implementation of IComparable<Person>, the type IEquatable<T> is intended for use with sealed types, especially structures. The requirement that Object.GetHashCode() be consistent with both IEquatable<T>.Equals(T) and Object.Equals(Object) generally implies that the latter two methods should behave identically, which in turn implies that one of them should chain to the other. While there is a large performance difference between passing a struct directly to an IEquatable<T> implementation of the proper type, compared with constructing an instance of the structure's boxed-heap-object type and having an Equals(Object) implementation copy the structure data out of that, no such performance difference exists with reference types. If IEquatable<T>.Equals(T) and Equals(Object) are going to be equivalent and T is an inheritable reference type, there's no meaningful difference between:
public override bool Equals(object obj)
{
MyType other = obj as MyType;
if (other == null || other.GetType() != this.GetType())
return false;
... test whether other matches this
}
public bool Equals(MyType other)
{
if (other == null || other.GetType() != this.GetType())
return false;
... test whether other matches this
}
The latter could save one typecast, but that's unlikely to make a sufficient performance difference to justify having two methods.
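For the sealed-type case mentioned above, the chaining might look like the following sketch. `Money` is a hypothetical example type, and the hash combination is one common idiom rather than anything prescribed by the answer.

```csharp
using System;

// Hypothetical sealed type: no subclass can break the equality contract.
public sealed class Money : IEquatable<Money>
{
    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public decimal Amount { get; private set; }
    public string Currency { get; private set; }

    // The typed method holds the actual comparison logic.
    public bool Equals(Money other)
    {
        return other != null && Amount == other.Amount && Currency == other.Currency;
    }

    // The object method chains to the typed one, so both always agree.
    public override bool Equals(object obj)
    {
        return Equals(obj as Money);
    }

    // Built from the same fields that Equals compares.
    public override int GetHashCode()
    {
        unchecked
        {
            int currencyHash = Currency == null ? 0 : Currency.GetHashCode();
            return (Amount.GetHashCode() * 397) ^ currencyHash;
        }
    }
}

public static class MoneyDemo
{
    public static bool BothPathsAgree()
    {
        var a = new Money(5m, "EUR");
        var b = new Money(5m, "EUR");
        object boxed = b;
        return a.Equals(b) && a.Equals(boxed) && a.GetHashCode() == b.GetHashCode();
    }

    public static void Main()
    {
        Console.WriteLine(BothPathsAgree()); // prints True
    }
}
```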

How do I get Distinct() to work with a collection of custom objects

I have followed the suggestions from this post to try and get Distinct() working in my code but I am still having issues. Here are the two objects I am working with:
public class InvoiceItem : IEqualityComparer<InvoiceItem>
{
public InvoiceItem(string userName, string invoiceNumber, double invoiceAmount)
{
this.UserName = userName;
this.InvoiceNumber= invoiceNumber;
this.InvoiceAmount= invoiceAmount;
}
public string UserName { get; set; }
public string InvoiceNumber { get; set; }
public double InvoiceAmount { get; set; }
public bool Equals(InvoiceItem left, InvoiceItem right)
{
if ((object)left.InvoiceNumber == null && (object)right.InvoiceNumber == null) { return true; }
if ((object)left.InvoiceNumber == null || (object)right.InvoiceNumber == null) { return false; }
return left.InvoiceNumber == right.InvoiceNumber;
}
public int GetHashCode(InvoiceItem item)
{
return item.InvoiceNumber == null ? 0 : item.InvoiceNumber.GetHashCode();
}
}
public class InvoiceItems : List<InvoiceItem>{ }
My goal is to populate an InvoiceItems object (we will call it aBunchOfInvoiceItems) with a couple thousand InvoiceItem objects and then do:
InvoiceItems distinctItems = aBunchOfInvoiceItems.Distinct();
When I set this code up and run it, I get an error that says
Cannot implicitly convert type 'System.Collections.Generic.IEnumerable' to 'InvoiceReader.Form1.InvoiceItems'. An explicit conversion exists (are you missing a cast?)
I don't understand how to fix this. Should I be taking a different approach? Any suggestions are greatly appreciated.
Distinct returns a generic IEnumerable<T>. It does not return an InvoiceItems instance. In fact, behind the curtains it returns a proxy object that implements an iterator that is only accessed on demand (i.e. as you iterate over it).
You can explicitly coerce it into a List<> by calling .ToList(). You still need to convert it to your custom list type, though. The easiest way is probably to have an appropriate constructor, and calling that:
public class InvoiceItems : List<InvoiceItem> {
public InvoiceItems() { }
// Copy constructor
public InvoiceItems(IEnumerable<InvoiceItem> other) : base(other) { }
}
// …
InvoiceItems distinctItems = new InvoiceItems(aBunchOfInvoiceItems.Distinct());
Konrad Rudolph's answer should tackle your compilation problems. There is another important semantic correctness issue here that has been missed: none of your equality logic is actually going to be used.
When a comparer is not provided to Distinct, it uses EqualityComparer<T>.Default. This is going to try to use the IEquatable<T> interface, and if this is missing, falls back on the plain old Equals(object other) method declared on object. For hashing, it will use the GetHashCode() method, also declared on object. Since the interface hasn't been implemented by your type, and none of the aforementioned methods have been overridden, there's a big problem: Distinct will just fall back on reference equality, which is not what you want.
The IEqualityComparer<T> interface is typically used when one wants to write an equality comparer that is decoupled from the type itself. On the other hand, when a type wants to be able to compare an instance of itself with another, it typically implements IEquatable<T>. I suggest one of:
Get InvoiceItem to implement IEquatable<InvoiceItem> instead.
Move the comparison logic to a separate InvoiceItemComparer : IEqualityComparer<InvoiceItem> type, and then call invoiceItems.Distinct(new InvoiceItemComparer());
If you want a quick hack with your existing code, you can do invoiceItems.Distinct(new InvoiceItem());
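The second suggestion could be sketched roughly as follows. The InvoiceItem class here is trimmed down to what the comparison needs, and the sample invoice numbers are made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down version of the question's class; equality logic removed from it.
public class InvoiceItem
{
    public string UserName { get; set; }
    public string InvoiceNumber { get; set; }
    public double InvoiceAmount { get; set; }
}

// The comparison lives in its own type, decoupled from InvoiceItem.
public class InvoiceItemComparer : IEqualityComparer<InvoiceItem>
{
    public bool Equals(InvoiceItem left, InvoiceItem right)
    {
        if (ReferenceEquals(left, right)) return true;
        if (left == null || right == null) return false;
        return left.InvoiceNumber == right.InvoiceNumber;
    }

    public int GetHashCode(InvoiceItem item)
    {
        return item == null || item.InvoiceNumber == null
            ? 0
            : item.InvoiceNumber.GetHashCode();
    }
}

public static class InvoiceDemo
{
    public static List<InvoiceItem> DistinctInvoices()
    {
        // Made-up sample data: two items share invoice number 1001.
        var items = new List<InvoiceItem>
        {
            new InvoiceItem { UserName = "ann", InvoiceNumber = "1001", InvoiceAmount = 10.0 },
            new InvoiceItem { UserName = "bob", InvoiceNumber = "1001", InvoiceAmount = 10.0 },
            new InvoiceItem { UserName = "cat", InvoiceNumber = "1002", InvoiceAmount = 25.5 }
        };
        return items.Distinct(new InvoiceItemComparer()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(DistinctInvoices().Count); // prints 2
    }
}
```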
Quite simply, aBunchOfInvoiceItems.Distinct() returns an IEnumerable<InvoiceItem> and you are trying to assign that to something that is not an IEnumerable<InvoiceItem>.
However, the base class of InvoiceItems has a constructor that takes such an object, so you can use this:
public class InvoiceItems : List<InvoiceItem>
{
public InvoiceItems(IEnumerable<InvoiceItem> items)
: base(items) { }
}
Then you can use:
InvoiceItems distinctItems = new InvoiceItems(aBunchOfInvoiceItems.Distinct());
As is though, I don't see much benefit in deriving from List<InvoiceItem> so I would probably lean more toward:
List<InvoiceItem> distinctItems = aBunchOfInvoiceItems.Distinct().ToList();
The error has everything to do with your class InvoiceItems, which inherits from List<InvoiceItem>.
Distinct returns an IEnumerable<InvoiceItem>: InvoiceItems is a very specific type of IEnumerable<InvoiceItem>, but any IEnumerable<InvoiceItem> is not necessarily an InvoiceItems.
An implicit conversion operator won't work here, because C# does not allow user-defined conversions to or from interfaces (thanks Saed). A constructor that accepts the sequence does the job instead:
public class InvoiceItems : List<InvoiceItem>
{
public InvoiceItems(IEnumerable<InvoiceItem> items) : base(items) { }
}
Other things to note:
Inheriting from List<T> is usually bad. Implement IList<T> instead.
Using a list throws away one of the big benefits of LINQ, which is lazy evaluation. Be sure that prefetching the results is actually what you want to do.
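The difference between the lazy query and a prefetched list can be seen in a small sketch like this one, which uses plain integers instead of invoice items to keep it short:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDistinctDemo
{
    // Returns { count seen by the lazy query, count held by the snapshot }.
    public static int[] Run()
    {
        var source = new List<int> { 1, 1, 2 };

        // Deferred: nothing is enumerated until the query is consumed.
        IEnumerable<int> lazy = source.Distinct();

        // Eager: ToList() walks the source right now and stores the result.
        List<int> snapshot = source.Distinct().ToList();

        // A later change to the source is visible to the lazy query only.
        source.Add(3);

        return new[] { lazy.Count(), snapshot.Count };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine(counts[0] + ", " + counts[1]); // prints 3, 2
    }
}
```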
Aside from the custom class vs IEnumerable issue that the other answers deal with, there is one major problem with your code. Your class implements IEqualityComparer instead of IEquatable. When you use Distinct, the items being filtered must either implement IEquatable themselves, or you must use the overload that takes an IEqualityComparer parameter. As it stands now, your call to Distinct will not filter the items according to the IEqualityComparer Equals and GetHashCode methods you provided.
IEqualityComparer should be implemented by another class than the one being compared. If a class knows how to compare itself, like your InvoiceItem class, it should implement IEquatable.

Internals of "equals" in .NET

I have a basic question. System.Object already implements Equals. When I implement the IEquatable<T> interface, I can (I believe) give my own definition of Equals.
So the Professor class declaration is effectively equivalent to
class Professor : System.Object, IEquatable<Professor>
Since Object.Equals and IEquatable's Equals have different definitions, why doesn't C# report an error? I am neither overriding Equals nor hiding it with the new keyword.
class Professor : IEquatable<Professor>
{
public string Name { get; set; }
public bool Equals(Professor cust)
{
if (cust == null) return false;
return cust.Name == this.Name;
}
}
You are neither overriding nor hiding Object.Equals() because your version takes Professor as a parameter type - not object. You are overloading the Equals() method.
C# allows two methods with the same name to differ on the type of the argument(s) that they accept. This is referred to as overloading - it can be viewed as compile-time polymorphism.
Overriding (which you could, and probably should also do) alters the implementation of a method from its version in a base class. It is the basis for runtime type polymorphism.
Hiding is a less common technique that allows a derived class to mask a version of a method in a base class. Based on the type of the reference through which you make the call, you may either get the base class version (if called through a base class reference) or the derived class version (if called through a derived type reference).
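To illustrate that overloading is resolved at compile time, here is a small sketch built on the question's Professor class. The demo class and the sample name are hypothetical; Equals(object) is intentionally left un-overridden to show the effect.

```csharp
using System;

public class Professor : IEquatable<Professor>
{
    public string Name { get; set; }

    // An overload: same name as Object.Equals, different parameter type.
    public bool Equals(Professor other)
    {
        return other != null && other.Name == Name;
    }
    // Object.Equals(object) is deliberately NOT overridden here.
}

public static class OverloadDemo
{
    // Returns { result via Professor reference, result via object reference }.
    public static bool[] Run()
    {
        var a = new Professor { Name = "Ada" };
        var b = new Professor { Name = "Ada" };
        object asObject = b;

        return new[]
        {
            a.Equals(b),        // static type Professor: picks Equals(Professor)
            a.Equals(asObject)  // static type object: picks inherited Object.Equals
        };
    }

    public static void Main()
    {
        bool[] results = Run();
        Console.WriteLine(results[0]); // prints True
        Console.WriteLine(results[1]); // prints False (reference equality)
    }
}
```

Overriding Equals(object) as well, and having it call the typed overload, makes both calls agree.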
On your second question, you should use IEquatable<T> when there are semantics for comparing 'equality' of two instances that is separate from reference equality.
You should implement IComparable or IComparable<T> when there are semantics for ordering items. Meaning they can be less than, greater than, or equivalent.
The Object.Equals method accepts an object of type 'object' as its parameter. Your Equals method accepts an object of type 'Professor' as its parameter. Both of these methods can co-exist because it is legal to differentiate two identically-named methods by their parameter list; this is called method overloading.
You don't need to explicitly implement IEquatable if all you want to do is override the default Equals() implementation.
You can just do something like this:
class Professor
{
public string Name { get; set; }
public override bool Equals(object cust)
{
// 'as' yields null when cust is null or not a Professor
Professor other = cust as Professor;
if (other == null) return false;
return other.Name == this.Name;
}
}
Be aware that if you override Equals() you also should override GetHashCode() to ensure proper operation of dictionaries and other collections that make use of hashing to differentiate between objects. Here's the MSDN page guidelines for overriding Equals().
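A short sketch of what "also override GetHashCode()" means in practice: the hash is computed from the same property that Equals compares, so hashed collections can locate an equal instance. The demo class and sample name are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Professor
{
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        Professor other = obj as Professor;
        return other != null && other.Name == Name;
    }

    // Must agree with Equals: equal professors yield equal hash codes.
    // Note: changing Name while the object sits in a hashed collection
    // would make it unfindable, so treat Name as fixed once stored.
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class HashingDemo
{
    public static bool FoundByValue()
    {
        var staff = new HashSet<Professor> { new Professor { Name = "Ada" } };
        // A different instance with the same Name is found.
        return staff.Contains(new Professor { Name = "Ada" });
    }

    public static void Main()
    {
        Console.WriteLine(FoundByValue()); // prints True
    }
}
```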

Override .NET Generic List<MyType>.Contains(MyTypeInstance)?

Is it possible, and if so how do I override the Contains method of an otherwise normal List<T>, where T is my own, custom type?
List<T> uses EqualityComparer<T>.Default to do comparisons; this checks first to see if your object implements IEquatable<T>; otherwise it uses object.Equals.
So; the easiest thing to do is to override Equals (always update GetHashCode to match the logic in Equals). Alternatively, use LINQ instead:
bool hasValue = list.Any(x => x.Foo == someValue);
To make your own Contains implementation you could create a class that implements the IList<T> interface. That way your class will look like an IList<T>. You could have a real List<T> internally to do the standard stuff.
class MyTypeList : IList<MyType>
{
private List<MyType> internalList = new ...;
public bool Contains(MyType instance)
{
}
....
}
You need to override Equals and GetHashCode in your class (MyType).
Depending on what specific needs you have in your override you might use Linq expression for doing that:
list.Any(x => x.Name.Equals("asdas", .....)) // whatever comparison you need
You can then wrap it in an extension method for convenience.
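Such an extension method might look like the sketch below. The `ContainsName` method, the `Name` property, and the choice of a case-insensitive ordinal comparison are all assumptions for the sake of the example; substitute whatever comparison you need.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical custom type with a Name property.
public class MyType
{
    public string Name { get; set; }
}

public static class MyTypeListExtensions
{
    // Hypothetical helper: a Contains-like check with custom matching rules.
    public static bool ContainsName(this IEnumerable<MyType> items, string name)
    {
        return items.Any(x => x != null &&
            string.Equals(x.Name, name, StringComparison.OrdinalIgnoreCase));
    }
}

public static class ContainsNameDemo
{
    public static bool Run(string name)
    {
        var list = new List<MyType> { new MyType { Name = "Alpha" }, null };
        return list.ContainsName(name);
    }

    public static void Main()
    {
        Console.WriteLine(Run("ALPHA")); // prints True
        Console.WriteLine(Run("beta"));  // prints False
    }
}
```

This leaves List<T>.Contains itself untouched, which keeps the custom rule explicit at the call site.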
If you implement Equals on your custom type, the Contains method of List will work.

Abstract Class - Am I over-thinking this or doing it right?

So I am utilizing CollectionBase as a base class for custom collections. I am utilizing CollectionBase through an abstract class so that I don't repeat myself (following the DRY principle). The abstract class is also defined as a generic class. Here is how I am implementing my class:
public abstract class GenericCollectionBase<T,C> : CollectionBase
{
//Indexers, virtual methods for Add, Contains, IndexOf, etc
}
I utilize this so I don't have to implement these base methods in 10+ classes.
My question is am I taking this too far when I override the Equals method like this:
public override bool Equals(object obj)
{
if (obj is C)
{
GenericCollectionBase<T, C> collB =
obj as GenericCollectionBase<T, C>;
if (this.Count == collB.Count)
{
for (int i = 0; i < this.Count; ++i)
{
if (!this[i].Equals(collB[i]))
return false;
}
return true;
}
}
return false;
}
Am I trying to accomplish too much with my abstract, or doing this the correct way?
EDIT: This is written for .NET 2.0, and I do not have access to 3.5 to utilize things like LINQ.
I don't believe you are trying to accomplish too much. If abstract classes were meant to have no implementation at all, or no other methods that define functionality, they would be interfaces.
The only thing I would change is to use EqualityComparer<T> instead of equals for the comparison of this[i] and collB[i].
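One practical reason for that change, shown as a small sketch: calling Equals on an element throws when that element is null, whereas the default comparer handles nulls. The `SafeEquals` helper name is hypothetical, and the code avoids LINQ so it fits the .NET 2.0 constraint.

```csharp
using System;
using System.Collections.Generic;

public static class ElementEquality
{
    // Null-safe comparison of two elements via the default comparer for T.
    public static bool SafeEquals<T>(T a, T b)
    {
        return EqualityComparer<T>.Default.Equals(a, b);
    }

    public static void Main()
    {
        string missing = null;
        Console.WriteLine(SafeEquals<string>(missing, null)); // prints True
        Console.WriteLine(SafeEquals<string>(missing, "x"));  // prints False
        Console.WriteLine(SafeEquals<string>("x", "x"));      // prints True
        // By contrast, missing.Equals("x") would throw a NullReferenceException.
    }
}
```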
Well, first, this is weird:
if (obj is C)
{
GenericCollectionBase<T, C> collB = obj as GenericCollectionBase<T, C>;
I'll assume you meant this:
GenericCollectionBase<T, C> collB = obj as GenericCollectionBase<T, C>;
if (collB != null)
{
...
I think you're over-thinking this, except if you really, really need two different collections with the same content to be considered as equal. I'd put this logic in another method to be called explicitly or in an equality comparer.
Making an extension method against IDictionary would be far more useful. There's also methods like Intersect from LINQ that may be useful.
I don't know if you're trying to accomplish too much, but I think you're trying to accomplish the wrong thing. There are cases where you might want that type of equality for collections, but it should be opt-in and obvious from the name of the type. I've created a ListValue<> with the type of equality you're using, but then it's always been immutable as well.
Also, if you're going to do this type of equality check, an initial test using object.ReferenceEquals can save you from having to iterate over a large collection when you're comparing an object to itself.
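Putting the opt-in idea and the ReferenceEquals shortcut together, a content-based comparison could live in its own comparer rather than in Equals. This is a sketch under assumptions: `ListContentComparer<T>` is a hypothetical name, and it works on IList<T> rather than on the asker's GenericCollectionBase.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical opt-in comparer: two lists are equal when they hold
// equal elements in the same order.
public sealed class ListContentComparer<T> : IEqualityComparer<IList<T>>
{
    public bool Equals(IList<T> x, IList<T> y)
    {
        // Cheap shortcut: same instance (or both null), no iteration needed.
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Count != y.Count) return false;

        EqualityComparer<T> elements = EqualityComparer<T>.Default;
        for (int i = 0; i < x.Count; ++i)
        {
            if (!elements.Equals(x[i], y[i])) return false;
        }
        return true;
    }

    public int GetHashCode(IList<T> list)
    {
        if (list == null) return 0;
        EqualityComparer<T> elements = EqualityComparer<T>.Default;
        int hash = 17;
        unchecked
        {
            foreach (T item in list)
            {
                hash = hash * 31 + (item == null ? 0 : elements.GetHashCode(item));
            }
        }
        return hash;
    }
}

public static class ListContentDemo
{
    public static bool SameContent()
    {
        IList<int> a = new int[] { 1, 2, 3 };
        IList<int> b = new List<int>(new int[] { 1, 2, 3 });
        return new ListContentComparer<int>().Equals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(SameContent()); // prints True
    }
}
```

Because the comparer is passed explicitly where it is wanted, the collections keep ordinary reference equality everywhere else.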
