How do I implement IEqualityComparer on an immutable generic Pair struct? - c#

Currently I have this (edited after reading advice):
struct Pair<T, K> : IEqualityComparer<Pair<T, K>>
{
readonly private T _first;
readonly private K _second;
public Pair(T first, K second)
{
_first = first;
_second = second;
}
public T First { get { return _first; } }
public K Second { get { return _second; } }
#region IEqualityComparer<Pair<T,K>> Members
public bool Equals(Pair<T, K> x, Pair<T, K> y)
{
return x.GetHashCode(x) == y.GetHashCode(y);
}
public int GetHashCode(Pair<T, K> obj)
{
int hashCode = obj.First == null ? 0 : obj._first.GetHashCode();
hashCode ^= obj.Second == null ? 0 : obj._second.GetHashCode();
return hashCode;
}
#endregion
public override int GetHashCode()
{
return this.GetHashCode(this);
}
public override bool Equals(object obj)
{
return (obj != null) &&
(obj is Pair<T, K>) &&
this.Equals(this, (Pair<T, K>) obj);
}
}
The problem is that First and Second may not be reference types (VS actually warns me about this), but the code still compiles. Should I cast them (First and Second) to objects before I compare them, or is there a better way to do this?
Edit:
Note that I want this struct to support value and reference types (in other words, constraining by class is not a valid solution)
Edit 2:
As to what I'm trying to achieve, I want this to work in a Dictionary. Secondly, SRP isn't important to me right now because that isn't really the essence of this problem - it can always be refactored later. Thirdly, comparing to default(T) will not work in lieu of comparing to null - try it.

Your IEqualityComparer implementation should be a different class (and definately not a struct as you want to reuse the reference).
Also, your hashcode should never be cached, as the default GetHashcode implementation for a struct (which you do not override) will take that member into account.

It looks like you need IEquatable instead:
internal struct Pair<T, K> : IEquatable<Pair<T, K>>
{
private readonly T _first;
private readonly K _second;
public Pair(T first, K second)
{
_first = first;
_second = second;
}
public T First
{
get { return _first; }
}
public K Second
{
get { return _second; }
}
public bool Equals(Pair<T, K> obj)
{
return Equals(obj._first, _first) && Equals(obj._second, _second);
}
public override bool Equals(object obj)
{
return obj is Pair<T, K> && Equals((Pair<T, K>) obj);
}
public override int GetHashCode()
{
unchecked
{
return (_first != null ? _first.GetHashCode() * 397 : 0) ^ (_second != null ? _second.GetHashCode() : 0);
}
}
}

If you use hashcodes in comparing methods, you should check for "realy value" if the hash codes are same.
bool result = ( x._hashCode == y._hashCode );
if ( result ) { result = ( x._first == y._first && x._second == y._second ); }
// OR?: if ( result ) { result = object.Equals( x._first, y._first ) && object.Equals( x._second, y._second ); }
// OR?: if ( result ) { result = object.ReferenceEquals( x._first, y._first ) && object.Equals( x._second, y._second ); }
return result;
But there is littlebit problem with comparing "_first" and "_second" fields.
By default reference types uses fore equality comparing "object.ReferenceEquals" method, bud they can override them. So the correct solution depends on the "what exactly should do" the your comparing method. Should use "Equals" method of the "_first" & "_second" fields, or object.ReferenceEquals ? Or something more complex?

Regarding the warning, you can use default(T) and default(K) instead of null.
I can't see what you're trying to achieve, but you shouldn't be using the hashcode to compare for equality - there is no guarantee that two different objects won't have the same hashcode. Also even though your struct is immutable, the members _first and _second aren't.

First of all this code violates SRP principle. Pair class used to hold pairs if items, right? It's incorrect to delegate equality comparing functionality to it.
Next let take a look at your code:
Equals method will fail if one of the arguments is null - no good. Equals uses hash code of Pair class, but take a look at the definition of GetHashCode, it just a combination of pair members hash codes - it's has nothing to do with equality of items. I would expect that Equals method will compare actual data. I'm too busy at the moment to provide correct implementation, unfortunately. But from the first look, you code seems to be wrong. It would be better if you provide us description of what you want to achieve. I'm sure SO members will be able to give you some advices.

Might I suggest the use of Lambda expressions as a parameter ?
this would allow you to specify how to compare the internal generic types.

I don't get any warning when compiling about this but I assume you are talking about the == null comparison? A cast seems like it would make this all somewhat cleaner, yes.
PS. You really should use a separate class for the comparer. This class that fills two roles (being a pair and comparing pairs) is plain ugly.

Related

How to use C# LINQ Union to get the Union of Custom list1 with list2

I am using the Enumerable.Union<TSource> method to get the union of the Custom List1 with the Custom List2. But somehow it does not work as it should in my case. I am getting all the items also the duplicate once.
I followed the MSDN Link to get the work done, but still I am not able to achieve the same.
Following is the Code of the custom class:-
public class CustomFormat : IEqualityComparer<CustomFormat>
{
private string mask;
public string Mask
{
get { return mask; }
set { mask = value; }
}
private int type;//0 for Default 1 for userdefined
public int Type
{
get { return type; }
set { type = value; }
}
public CustomFormat(string c_maskin, int c_type)
{
mask = c_maskin;
type = c_type;
}
public bool Equals(CustomFormat x, CustomFormat y)
{
if (ReferenceEquals(x, y)) return true;
//Check whether the products' properties are equal.
return x != null && y != null && x.Mask.Equals(y.Mask) && x.Type.Equals(y.Type);
}
public int GetHashCode(CustomFormat obj)
{
//Get hash code for the Name field if it is not null.
int hashProductName = obj.Mask == null ? 0 : obj.Mask.GetHashCode();
//Get hash code for the Code field.
int hashProductCode = obj.Type.GetHashCode();
//Calculate the hash code for the product.
return hashProductName ^ hashProductCode;
}
}
This I am calling as follows:-
List<CustomFormat> l1 = new List<CustomFormat>();
l1.Add(new CustomFormat("#",1));
l1.Add(new CustomFormat("##",1));
l1.Add(new CustomFormat("###",1));
l1.Add(new CustomFormat("####",1));
List<CustomFormat> l2 = new List<CustomFormat>();
l2.Add(new CustomFormat("#",1));
l2.Add(new CustomFormat("##",1));
l2.Add(new CustomFormat("###",1));
l2.Add(new CustomFormat("####",1));
l2.Add(new CustomFormat("## ###.0",1));
l1 = l1.Union(l2).ToList();
foreach(var l3 in l1)
{
Console.WriteLine(l3.Mask + " " + l3.Type);
}
Please suggest the appropriate way to achieve the same!
The oddity here is that your class implement IEqualityComparer<CustomClass> instead of IEquatable<CustomClass>. You could pass in another instance of CustomClass which would be used as the comparer, but it would be more idiomatic to just make CustomClass implement IEquatable<CustomClass>, and also override Equals(object).
The difference between IEquatable<T> and IEqualityComparer<T> is that IEquatable<T> says "I know how to compare myself with another instance of T" whereas IEqualityComparer<T> says "I know how to compare two instances of T". The latter is normally provided separately - just as it can be provided to Union via another parameter. It's very rare for a type to implement IEqualityComparer<T> for its own type - whereas IEquatable<T> should pretty much only be used to compare values of the same type.
Here's an implementation using automatically implemented properties for simplicity and more idiomatic parameter names. I'd probably change the hash code implementation myself and use expression-bodied members, but that's a different matter.
public class CustomFormat : IEquatable<CustomFormat>
{
public string Mask { get; set; }
public int Type { get; set; }
public CustomFormat(string mask, int type)
{
Mask = mask;
Type = type;
}
public bool Equals(CustomFormat other)
{
if (ReferenceEquals(this, other))
{
return true;
}
return other != null && other.Mask == Mask && other.Type == Type;
}
public override bool Equals(object obj)
{
return Equals(obj as CustomFormat);
}
public override int GetHashCode()
{
// Get hash code for the Name field if it is not null.
int hashProductName = Mask == null ? 0 : Mask.GetHashCode();
//Get hash code for the Code field.
int hashProductCode = Type.GetHashCode();
//Calculate the hash code for the product.
return hashProductName ^ hashProductCode;
}
}
Now it doesn't help that (as noted in comments) the documentation for Enumerable.Union is wrong. It currently states:
The default equality comparer, Default, is used to compare values of the types that implement the IEqualityComparer<T> generic interface.
It should say something like:
The default equality comparer, Default, is used to compare values when a specific IEqualityComparer<T> is not provided. If T implements IEquatable<T>, the default comparer will use that implementation. Otherwise, it will use the implementation of Equals(object).
You need to pass an instance of an IEqualityComparer to the Union method. The method has an overload to pass in your comparer.
The easiest and ugliest solution is
var comparer = new CustomFormat(null,0);
l1 = l1.Union(l2, comparer).ToList();
You have made some mistakes in your implementation. You should not implement the IEqualityComparer method on your type (CustomFormat), but on a separate class, like CustomFormatComparer.
On your type (CustomFormat) you should implemented IEquatable.

Linq Distinct not returning expected values

I am trying to get a list of distinct items from a custom collection, however the comparison seems to be getting ignored as I keep getting duplicates appearing in my list. I have debugged the code and I can clearly see that the values in the list that I am comparing are equal...
NOTE: The Id and Id2 values are strings
Custom Comparer:
public class UpsellSimpleComparer : IEqualityComparer<UpsellProduct>
{
public bool Equals(UpsellProduct x, UpsellProduct y)
{
return x.Id == y.Id && x.Id2 == y.Id2;
}
public int GetHashCode(UpsellProduct obj)
{
return obj.GetHashCode();
}
}
Calling code:
var upsellProducts = (Settings.SelectedSeatingPageGuids.Contains(CurrentItem.ID.ToString())) ?
GetAOSUpsellProducts(selectedProductIds) : GetGeneralUpsellProducts(selectedProductIds);
// we use a special comparer here so that same items are not included
var comparer = new UpsellSimpleComparer();
return upsellProducts.Distinct(comparer);
Most likely UpsellProduct has default implementation of GetHashCode that returns unique value for each instance of reference type.
To fix - either implement one correctly in UpsellProduct or in comparer.
public class UpsellSimpleComparer : IEqualityComparer<UpsellProduct>
{
public bool Equals(UpsellProduct x, UpsellProduct y)
{
return x.Id == y.Id && x.Id2 == y.Id2;
}
// sample, correct GetHashCode is a bit more complex
public int GetHashCode(UpsellProduct obj)
{
return obj.Id.GetHashCode() ^ obj.Id2.GetHashCode();
}
}
Note for better code to compute combined GetHashCode check Concise way to combine field hashcodes? and Is it possible to combine hash codes for private members to generate a new hash code?
Your GetHashCode() doesn't return the same values even two UpsellProduct instances are consider equals by your Equals() method.
Use something like this to reflect the same logic instead.
public int GetHashCode(UpsellProduct obj)
{
return obj.Id.GetHashCode() ^ obj.Id2.GetHashCode();
}

Mono implementation of Dictionary<T,T> using .Equals(obj o) instead of .GetHashCode()

By searching though msdn c# documentation and stack overflow, I get the clear impression that Dictionary<T,T> is supposed to use GetHashCode() for checking key-uniqueness and to do look-up.
The Dictionary generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table.
...
The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey.
I Use mono (in Unity3D), and after getting some weird results in my work, I conducted this experiment:
public class DictionaryTest
{
public static void TestKeyUniqueness()
{
//Test a dictionary of type1
Dictionary<KeyType1, string> dictionaryType1 = new Dictionary<KeyType1, string>();
dictionaryType1[new KeyType1(1)] = "Val1";
if(dictionaryType1.ContainsKey(new KeyType1(1)))
{
Debug.Log ("Key in dicType1 was already present"); //This line does NOT print
}
//Test a dictionary of type1
Dictionary<KeyType2, string> dictionaryType2 = new Dictionary<KeyType2, string>();
dictionaryType2[new KeyType2(1)] = "Val1";
if(dictionaryType2.ContainsKey(new KeyType2(1)))
{
Debug.Log ("Key in dicType2 was already present"); // Only this line prints
}
}
}
//This type implements only GetHashCode()
public class KeyType1
{
private int var1;
public KeyType1(int v1)
{
var1 = v1;
}
public override int GetHashCode ()
{
return var1;
}
}
//This type implements both GetHashCode() and Equals(obj), where Equals uses the hashcode.
public class KeyType2
{
private int var1;
public KeyType2(int v1)
{
var1 = v1;
}
public override int GetHashCode ()
{
return var1;
}
public override bool Equals (object obj)
{
return GetHashCode() == obj.GetHashCode();
}
}
Only the when using type KeyType2 are the keys considered equal. To me this demonstrates that Dictionary uses Equals(obj) - and not GetHashCode().
Can someone reproduce this, and help me interpret the meaning is? Is it an incorrect implementation in mono? Or have I misunderstood something.
i get the clear impression that Dictionary is supposed to use
.GetHashCode() for checking key-uniqueness
What made you think that? GetHashCode doesn't return unique values.
And MSDN clearly says:
Dictionary requires an equality implementation to
determine whether keys are equal. You can specify an implementation of
the IEqualityComparer generic interface by using a constructor that
accepts a comparer parameter; if you do not specify an implementation,
the default generic equality comparer EqualityComparer.Default is
used. If type TKey implements the System.IEquatable generic
interface, the default equality comparer uses that implementation.
Doing this:
public override bool Equals (object obj)
{
return GetHashCode() == obj.GetHashCode();
}
is wrong in the general case because you might end up with KeyType2 instances that are equal to StringBuilder, SomeOtherClass, AnythingYouCanImagine and what not instances.
You should totally do it like so:
public override bool Equals (object obj)
{
if (obj is KeyType2) {
return (obj as KeyType2).var1 == this.var1;
} else
return false;
}
When you are trying to override Equals and inherently GetHashCode you must ensure the following points (given the class MyObject) in this order (you were doing it the other way around):
1) When are 2 instances of MyObject equal ? Say you have:
public class MyObject {
public string Name { get; set; }
public string Address { get; set; }
public int Age { get; set; }
public DateTime TimeWhenIBroughtThisInstanceFromTheDatabase { get; set; }
}
And you have 1 record in some database that you need to be mapped to an instance of this class.
And you make the convention that the time you read the record from the database will be stored
in the TimeWhenIBroughtThisInstanceFromTheDatabase:
MyObject obj1 = DbHelper.ReadFromDatabase( ...some params...);
// you do that at 14:05 and thusly the TimeWhenIBroughtThisInstanceFromTheDatabase
// will be assigned accordingly
// later.. at 14:07 you read the same record into a different instance of MyClass
MyObject obj2 = DbHelper.ReadFromDatabase( ...some params...);
// (the same)
// At 14:09 you ask yourself if the 2 instances are the same
bool theyAre = obj1.Equals(obj2)
Do you want the result to be true ? I would say you do.
Therefore the overriding of Equals should like so:
public class MyObject {
...
public override bool Equals(object obj) {
if (obj is MyObject) {
var that = obj as MyObject;
return (this.Name == that.Name) &&
(this.Address == that.Address) &&
(this.Age == that.Age);
// without the syntactically possible but logically challenged:
// && (this.TimeWhenIBroughtThisInstanceFromTheDatabase ==
// that.TimeWhenIBroughtThisInstanceFromTheDatabase)
} else
return false;
}
...
}
2) ENSURE THAT whenever 2 instances are equal (as indicated by the Equals method you implement)
their GetHashCode results will be identitcal.
int hash1 = obj1.GetHashCode();
int hash2 = obj2.GetHashCode();
bool theseMustBeAlso = hash1 == hash2;
The easiest way to do that is (in the sample scenario):
public class MyObject {
...
public override int GetHashCode() {
int result;
result = ((this.Name != null) ? this.Name.GetHashCode() : 0) ^
((this.Address != null) ? this.Address.GetHashCode() : 0) ^
this.Age.GetHashCode();
// without the syntactically possible but logically challenged:
// ^ this.TimeWhenIBroughtThisInstanceFromTheDatabase.GetHashCode()
}
...
}
Note that:
- Strings can be null and that .GetHashCode() might fail with NullReferenceException.
- I used ^ (XOR). You can use whatever you want as long as the golden rule (number 2) is respected.
- x ^ 0 == x (for whatever x)

Comparing two instances of a class

I have a class like this
public class TestData
{
public string Name {get;set;}
public string type {get;set;}
public List<string> Members = new List<string>();
public void AddMembers(string[] members)
{
Members.AddRange(members);
}
}
I want to know if it is possible to directly compare to instances of this class to eachother and find out they are exactly the same? what is the mechanism? I am looking gor something like if(testData1 == testData2) //Do Something And if not, how to do so?
You should implement the IEquatable<T> interface on your class, which will allow you to define your equality-logic.
Actually, you should override the Equals method as well.
public class TestData : IEquatable<TestData>
{
public string Name {get;set;}
public string type {get;set;}
public List<string> Members = new List<string>();
public void AddMembers(string[] members)
{
Members.AddRange(members);
}
// Overriding Equals member method, which will call the IEquatable implementation
// if appropriate.
public override bool Equals( Object obj )
{
var other = obj as TestData;
if( other == null ) return false;
return Equals (other);
}
public override int GetHashCode()
{
// Provide own implementation
}
// This is the method that must be implemented to conform to the
// IEquatable contract
public bool Equals( TestData other )
{
if( other == null )
{
return false;
}
if( ReferenceEquals (this, other) )
{
return true;
}
// You can also use a specific StringComparer instead of EqualityComparer<string>
// Check out the specific implementations (StringComparer.CurrentCulture, e.a.).
if( EqualityComparer<string>.Default.Compare (Name, other.Name) == false )
{
return false;
}
...
// To compare the members array, you could perhaps use the
// [SequenceEquals][2] method. But, be aware that [] {"a", "b"} will not
// be considerd equal as [] {"b", "a"}
return true;
}
}
One way of doing it is to implement IEquatable<T>
public class TestData : IEquatable<TestData>
{
public string Name {get;set;}
public string type {get;set;}
public List<string> Members = new List<string>();
public void AddMembers(string[] members)
{
Members.AddRange(members);
}
public bool Equals(TestData other)
{
if (this.Name != other.Name) return false;
if (this.type != other.type) return false;
// TODO: Compare Members and return false if not the same
return true;
}
}
if (testData1.Equals(testData2))
// classes are the same
You can also just override the Equals(object) method (from System.Object), if you do this you should also override GetHashCode see here
There are three ways objects of some reference type T can be compared to each other:
With the object.Equals method
With an implementation of IEquatable<T>.Equals (only for types that implement IEquatable<T>)
With the comparison operator ==
Furthermore, there are two possibilities for each of these cases:
The static type of the objects being compared is T (or some other base of T)
The static type of the objects being compared is object
The rules you absolutely need to know are:
The default for both Equals and operator== is to test for reference equality
Implementations of Equals will work correctly no matter what the static type of the objects being compared is
IEquatable<T>.Equals should always behave the same as object.Equals, but if the static type of the objects is T it will offer slightly better performance
So what does all of this mean in practice?
As a rule of thumb you should use Equals to check for equality (overriding object.Equals as necessary) and implement IEquatable<T> as well to provide slightly better performance. In this case object.Equals should be implemented in terms of IEquatable<T>.Equals.
For some specific types (such as System.String) it's also acceptable to use operator==, although you have to be careful not to make "polymorphic comparisons". The Equals methods, on the other hand, will work correctly even if you do make such comparisons.
You can see an example of polymorphic comparison and why it can be a problem here.
Finally, never forget that if you override object.Equals you must also override object.GetHashCode accordingly.
I see many good answers here but just in case you want the comparison to work like
if(testData1 == testData2) // DoSomething
instead of using Equals function you can override == and != operators:
public static bool operator == (TestData left, TestData right)
{
bool comparison = true; //Make the desired comparison
return comparison;
}
public static bool operator != (TestData left, TestData right)
{
return !(left == right);
}
You can override the equals method and inside it manually compare the objects
Also take a look at Guidelines for Overloading Equals() and Operator ==
You will need to define the rules that make object A equal to object B and then override the Equals operator for this type.
http://msdn.microsoft.com/en-us/library/ms173147(v=vs.80).aspx
First of all equality is difficult to define and only you can define as to what equality means for you
Does it means members have same value
Or they are pointing to same location.
Here is a discussion and an answer here
What is "Best Practice" For Comparing Two Instances of a Reference Type?
Implement the IEquatable<T> interface. This defines a generalized method that a value type or class implements to create a type-specific method for determining equality of instances. More information here:
http://msdn.microsoft.com/en-us/library/ms131187.aspx

Distinct() returns duplicates with a user-defined type

I'm trying to write a Linq query which returns an array of objects, with unique values in their constructors. For integer types, Distinct returns only one copy of each value, but when I try creating my list of objects, things fall apart. I suspect it's a problem with the equality operator for my class, but when I set a breakpoint, it's never hit.
Filtering out the duplicate int in a sub-expression solves the problem, and also saves me from constructing objects that will be immediately discarded, but I'm curious why this version doesn't work.
UPDATE: 11:04 PM Several folks have pointed out that MyType doesn't override GetHashCode(). I'm afraid I oversimplified the example. The original MyType does indeed implement it. I've added it below, modified only to put the hash code in a temp variable before returning it.
Running through the debugger, I see that all five invocations of GetHashCode return a different value. And since MyType only inherits from Object, this is presumably the same behavior Object would exhibit.
Would I be correct then to conclude that the hash should instead be based on the contents of Value? This was my first attempt at overriding operators, and at the time, it didn't appear that GetHashCode needed to be particularly fancy. (This is the first time one of my equality checks didn't seem to work properly.)
class Program
{
static void Main(string[] args)
{
int[] list = { 1, 3, 4, 4, 5 };
int[] list2 =
(from value in list
select value).Distinct().ToArray(); // One copy of each value.
MyType[] distinct =
(from value in list
select new MyType(value)).Distinct().ToArray(); // Two objects created with 4.
Array.ForEach(distinct, value => Console.WriteLine(value));
}
}
class MyType
{
public int Value { get; private set; }
public MyType(int arg)
{
Value = arg;
}
public override int GetHashCode()
{
int retval = base.GetHashCode();
return retval;
}
public override bool Equals(object obj)
{
if (obj == null)
return false;
MyType rhs = obj as MyType;
if ((Object)rhs == null)
return false;
return this == rhs;
}
public static bool operator ==(MyType lhs, MyType rhs)
{
bool result;
if ((Object)lhs != null && (Object)rhs != null)
result = lhs.Value == rhs.Value;
else
result = (Object)lhs == (Object)rhs;
return result;
}
public static bool operator !=(MyType lhs, MyType rhs)
{
return !(lhs == rhs);
}
}
You need to override GetHashCode() in your class. GetHashCode must be implemented in tandem with Equals overloads. It is common for code to check for hashcode equality before calling Equals. That's why your Equals implementation is not getting called.
Your suspicion is correct,it is the equality which currently just checks the object references. Even your implementation does not do anything extra, change it to this:
public override bool Equals(object obj)
{
if (obj == null)
return false;
MyType rhs = obj as MyType;
if ((Object)rhs == null)
return false;
return this.Value == rhs.Value;
}
In you equality method you are still testing for reference equality, rather than semantic equality, eg on this line:
result = (Object)lhs == (Object)rhs
you are just comparing two object references which, even if they hold exactly the same data, are still not the same object. Instead, your test for equality needs to compare one or more properties of your object. For instance, if your object had an ID property, and objects with the same ID should be considered semantically equivalent, then you could do this:
result = lhs.ID == rhs.ID
Note that overriding Equals() means you should also override GetHashCode(), which is another kettle of fish, and can be quite difficult to do correctly.
You need to implement GetHashCode().
It seems that a simple Distinct operation can be implemented more elegantly as follows:
var distinct = items.GroupBy(x => x.ID).Select(x => x.First());
where ID is the property that determines if two objects are semantically equivalent. From the confusion here (including that of myself), the default implementation of Distinct() seems to be a little convoluted.
I think MyType needs to implement IEquatable for this to work.
The other answers have pretty much covered the fact that you need to implement Equals and GetHashCode correctly, but as a side note you may be interested to know that anonymous types have these values implemented automatically:
var distinct =
(from value in list
select new {Value = value}).Distinct().ToArray();
So without ever having to define this class, you automatically get the Equals and GetHashCode behavior you're looking for. Cool, eh?

Categories

Resources