I have a class with fields ColDescriptionOne(string), ColDescriptionTwo(string) and ColCodelist(int). I want to get the Intersect of two lists of this class where the desc are equal but the codelist is different.
I can use the Where clause and get what I need. However I can't seem to make it work using a custom Comparer like this:
internal class CodeListComparer: EqualityComparer<SheetRow>
{
public override bool Equals(SheetRow x, SheetRow y)
{
return Equals(x.ColDescriptionOne, y.ColDescriptionOne) &&
Equals(x.ColDescriptionSecond, y.ColDescriptionOne)
&& !Equals(x.ColCodelist, y.ColCodelist);
}
public override int GetHashCode(SheetRow obj)
{
return ((obj.ColDescriptionOne.GetHashCode()*397) + (obj.ColDescriptionSecond.GetHashCode()*397)
+ obj.ColCodelist.GetHashCode());
}
}
And then use it like this:
var onylByCodeList = firstSheet.Entries.Intersect(otherSheet.Entries, new CodeListComparer());
Any ideas what I'm doing wrong here ?
thanks
Sunit
You have a typo in the Equals method. The second line is comparing ColDescriptionOne to ColDescriptionSecond. They should both be ColDescriptionSecond.
return Equals(x.ColDescriptionOne, y.ColDescriptionOne)
&& Equals(x.ColDescriptionSecond, y.ColDescriptionSecond)
&& !Equals(x.ColCodelist, y.ColCodelist);
The second problem you have is you are including the ColCodeList in the GetHashCode method. The GetHashCode method must return the same value for objects that are equal. In this case though ColCodeList is supposed to be different when values are equal. This means that in cases where you want 2 objects to be considered equal they are more likely to have different hash codes which is incorrect.
Take that out of the GetHashCode method and everything should work.
public override int GetHashCode(SheetRow obj)
{
return ((obj.ColDescriptionOne.GetHashCode()*397)
+ (obj.ColDescriptionSecond.GetHashCode()*397));
}
Related
My class implements IEquatable and IComparable. It then gets added to a sortedset.
The goal is to have it be sorted by its "Date" property and be equal if both "ID1" and "ID2" are the same.
The implemented methods:
public int CompareTo(MyClass other)
{
return other.Date.CompareTo(Date);
}
public override int GetHashCode()
{
unchecked
{
var hashCode = ID1;
hashCode = (hashCode * 397) ^ ID2;
return hashCode;
}
}
public bool Equals(MyClass other)
{
if (ReferenceEquals(null, other))
return false;
if (ReferenceEquals(this, other))
return true;
return ID1 == other.ID1
&& ID2 == other.ID2;
}
The resulting sortedset is correctly sorted, but there are still elements that should be equal and therefore not in the set. Using breakpoints it seems neither GetHashCode nor Equals get called.
Any tips on how to solve this?
SortedSet uses CompareTo for both sorting and equality comparison.
There isn't an in-built ordered collection which will allow you to specify a different method to compare for equality in ordering than to compare for equality in maintaining distinctness. I'm not completely sure why this is, but it may be to do with the asssumptions behind the algorithms used for sorting.
If you want this behaviour, probably the simplest way to do it is wrap the underlying collection in its own class which will check for distinctness before adding new items to the collection.
You also need to be careful that items which are equal for sorting but not equal by your Equals method can both be added to the underlying collection. In your case, where the sorting is done by Date and the equality is done by ID1 and ID2 this might look something like:
public int CompareTo(MyClass other)
{
var result = other.Date.CompareTo(Date);
if(result != 0)
return result;
result = other.ID1.CompareTo(ID1);
if(result != 0)
return result;
return other.ID2.CompareTo(ID2);
}
This imposes extra ordering if the dates are the same, but I wouldn't expect that to be a problem.
Alternatively, you can "cheat", by forcing items which are equal to compare as equal in sort position. This would probably best be moved to a custom IComparer, because it's not the normal sorting behaviour you want, outside a SortedSet:
public int CompareTo(MyClass other)
{
if(other.Equals(this))
return 0;
var result = other.Date.CompareTo(Date);
if(result != 0)
return result;
result = other.ID1.CompareTo(ID1);
if(result != 0)
return result;
return other.ID2.CompareTo(ID2);
}
This would allow you to avoid having to create your own wrapper class around the collection, and is safer. It is rather hacky, though.
List1 contains items { A, B } and List2 contains items { A, B, C }.
What I need is to be returned { C } when I use Except Linq extension. Instead I get returned { A, B } and if I flip the lists around in my expression the result is { A, B, C }.
Am I misunderstanding the point of Except? Is there another extension I am not seeing to use?
I have looked through and tried a number of different posts on this matter with no success thus far.
var except = List1.Except(List2); //This is the line I have thus far
EDIT: Yes I was comparing simple objects. I have never used IEqualityComparer, it was interesting to learn about.
Thanks all for the help. The problem was not implementing the comparer. The linked blog post and example below where helpful.
If you are storing reference types in your list, you have to make sure there is a way to compare the objects for equality. Otherwise they will be checked by comparing if they refer to same address.
You can implement IEqualityComparer<T> and send it as a parameter to Except() function. Here's a blog post you may find helpful.
edit: the original blog post link was broken and has been replaced above
So just for completeness...
// Except gives you the items in the first set but not the second
var InList1ButNotList2 = List1.Except(List2);
var InList2ButNotList1 = List2.Except(List1);
// Intersect gives you the items that are common to both lists
var InBothLists = List1.Intersect(List2);
Edit: Since your lists contain objects you need to pass in an IEqualityComparer for your class... Here is what your except will look like with a sample IEqualityComparer based on made up objects... :)
// Except gives you the items in the first set but not the second
var equalityComparer = new MyClassEqualityComparer();
var InList1ButNotList2 = List1.Except(List2, equalityComparer);
var InList2ButNotList1 = List2.Except(List1, equalityComparer);
// Intersect gives you the items that are common to both lists
var InBothLists = List1.Intersect(List2);
public class MyClass
{
public int i;
public int j;
}
class MyClassEqualityComparer : IEqualityComparer<MyClass>
{
public bool Equals(MyClass x, MyClass y)
{
return x.i == y.i &&
x.j == y.j;
}
public int GetHashCode(MyClass obj)
{
unchecked
{
if (obj == null)
return 0;
int hashCode = obj.i.GetHashCode();
hashCode = (hashCode * 397) ^ obj.i.GetHashCode();
return hashCode;
}
}
}
You simply confused the order of arguments. I can see where this confusion arose, because the official documentation isn't as helpful as it could be:
Produces the set difference of two sequences by using the default equality comparer to compare values.
Unless you're versed in set theory, it may not be clear what a set difference actually is—it's not simply what's different between the sets. In reality, Except returns the list of elements in the first set that are not in the second set.
Try this:
var except = List2.Except(List1); // { C }
Writing a custom comparer does seem to solve the problem, but I think https://stackoverflow.com/a/12988312/10042740 is a much more simple and elegant solution.
It overwrites the GetHashCode() and Equals() methods in your object defining class, then the default comparer does its magic without extra code cluttering up the place.
Just for Ref:
I wanted to compare USB Drives connected and available to the system.
So this is the class which implements interface IEqualityComparer
public class DriveInfoEqualityComparer : IEqualityComparer<DriveInfo>
{
public bool Equals(DriveInfo x, DriveInfo y)
{
if (object.ReferenceEquals(x, y))
return true;
if (x == null || y == null)
return false;
// compare with Drive Level
return x.VolumeLabel.Equals(y.VolumeLabel);
}
public int GetHashCode(DriveInfo obj)
{
return obj.VolumeLabel.GetHashCode();
}
}
and you can use it like this
var newDeviceLst = DriveInfo.GetDrives()
.ToList()
.Except(inMemoryDrives, new DriveInfoEqualityComparer())
.ToList();
It seems that this problem has already been encountered by quite a few people:
List not working as expected
Contains always giving false
So I saw the answers and tried to implement the override of Equals and of GetHashCode but there seems that I am coding something wrong.
This is the situation: I have a list of Users(Class), each user has a List and a Name property, the list property contains licenses. I am trying to do a
if (!users.Contains(currentUser))
but it is not working as expected. And this is the code I did to override the Equals and GetHashCode:
public override bool Equals(object obj)
{
return Equals(obj as User);
}
public bool Equals(User otherUser)
{
if (ReferenceEquals(otherUser, null))
return false;
if (ReferenceEquals(this, otherUser))
return true;
return this._userName.Equals(otherUser.UserName) &&
this._licenses.SequenceEqual<string>(otherUser.Licenses);
}
public override int GetHashCode()
{
int hash = 13;
if (!_licenses.Any() && !_userName.Equals(""))
{
unchecked
{
foreach (string str in Licenses)
{
hash *= 7;
if (str != null) hash = hash + str.GetHashCode();
}
hash = (hash * 7) + _userName.GetHashCode();
}
}
return hash;
}
thank you for your suggestions and help in advance!
EDIT 1:
this is the code where I am doing the List.Contains, I am trying to see if the list already contains certain user, if not then add the user that isn't there. The Contains only works the first time, when currentUser changes then the User inside the list changes to the current user maybe this is a problem that is unrelated to the equals, any ideas?
if (isIn)
{
if (!listOfLicenses.Contains(items[3]))
listOfLicenses.Add(items[3]);
if (!users.Contains(currentUser))
{
User user2Add = new User();
user2Add.UserName = currentUser.UserName;
users.Add(user2Add);
userIndexer++;
}
if (users[userIndexer - 1].UserName.Equals(currentUser.UserName))
{
users[userIndexer - 1].Licenses.Add(items[3]);
}
result.Rows.Add();
}
Well, one problem with your hash code - if either there are no licences or the username is empty, you're ignoring the other component. I'd rewrite it as:
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 31 + _userName.GetHashCode();
foreach (string licence in Licences)
{
hash = hash * 31 + licences.GetHashCode();
}
return hash;
}
}
Shorter and simpler. It doesn't matter if you use the hash code of the empty string, or if you iterate over an empty collection.
That said, I'd have expected the previous code to work anyway. Note that it's order sensitive for the licences... oh, and List<T> won't use GetHashCode anyway. (You should absolutely override it appropriately, but it won't be the cause of the error.)
It would really help if you could show a short but complete program demonstrating the problem - I strongly suspect that you'll find it's actually a problem with your test data.
After users[userIndexer - 1].Licenses.Add(items[3]) , users[userIndexer - 1] is not the same user anymore. You have changed the Licences which is used in equality comparison(in User.Equals).
--EDIT
See below code
public class Class
{
static void Main(string[] args)
{
User u1 = new User("1");
User u2 = new User("1");
Console.WriteLine(u1.Equals(u2));
u2.Lic = "2";
Console.WriteLine(u1.Equals(u2));
}
}
public class User
{
public string Lic;
public User(string lic)
{
this.Lic = lic;
}
public override bool Equals(object obj)
{
return (obj as User).Lic == Lic;
}
}
You need you implement Equals and GetHashcode for the License class, otherwise SequenceEqual will not work.
Does your class implement IEquatable<User>? From your equality methods it appears it does but just checking.
The documentation for List.Contains states that:
This method determines equality by using the default equality
comparer, as defined by the object's implementation of the
IEquatable(T).Equals method for T (the type of values in the list)
It is very important to make sure that the value returned by GetHashCode never ever ever changes for a specific instance of an object. If the value changes then lists and dictionaries won't work correctly.
Think of GetHashCode as "GetPrimaryKey". You would not change the primary key of a user record in a database if someone added a new license to the user. Likewise you mustn't change the GetHashCode.
It appears from your code that you are changing the licenses collection and you're using that to calculate your hash code. So that is probably causing your issue.
Now, it is perfectly legitimate to use a constant value for every hash code you produce - you could just return 42 for every instance, for example. This will force calling Equals to determine if two objects are equal or not. All that having distinct hash codes does is short circuits the need to call Equals.
If the _userName field doesn't change then just return its hash code and see it that works.
Take the following:
var x = new Action(() => { Console.Write("") ; });
var y = new Action(() => { });
var a = x.GetHashCode();
var b = y.GetHashCode();
Console.WriteLine(a == b);
Console.WriteLine(x == y);
This will print:
True
False
Why is the hashcode the same?
It is kinda surprising, and will make using delegates in a Dictionary as slow as a List (aka O(n) for lookups).
Update:
The question is why. IOW who made such a (silly) decision?
A better hashcode implementation would have been:
return Method ^ Target == null ? 0 : Target.GetHashcode();
// where Method is IntPtr
Easy! Since here is the implementation of the GetHashCode (sitting on the base class Delegate):
public override int GetHashCode()
{
return base.GetType().GetHashCode();
}
(sitting on the base class MulticastDelegate which will call above):
public sealed override int GetHashCode()
{
if (this.IsUnmanagedFunctionPtr())
{
return ValueType.GetHashCodeOfPtr(base._methodPtr);
}
object[] objArray = this._invocationList as object[];
if (objArray == null)
{
return base.GetHashCode();
}
int num = 0;
for (int i = 0; i < ((int) this._invocationCount); i++)
{
num = (num * 0x21) + objArray[i].GetHashCode();
}
return num;
}
Using tools such as Reflector, we can see the code and it seems like the default implementation is as strange as we see above.
The type value here will be Action. Hence the result above is correct.
UPDATE
My first attempt of a better implementation:
public class DelegateEqualityComparer:IEqualityComparer<Delegate>
{
public bool Equals(Delegate del1,Delegate del2)
{
return (del1 != null) && del1.Equals(del2);
}
public int GetHashCode(Delegate obj)
{
if(obj==null)
return 0;
int result = obj.Method.GetHashCode() ^ obj.GetType().GetHashCode();
if(obj.Target != null)
result ^= RuntimeHelpers.GetHashCode(obj);
return result;
}
}
The quality of this should be good for single cast delegates, but not so much for multicast delegates (If I recall correctly Target/Method return the values of the last element delegate).
But I'm not really sure if it fulfills the contract in all corner cases.
Hmm it looks like quality requires referential equality of the targets.
This smells like some of the cases mentioned in this thread, maybe it will give you some pointers on this behaviour. else, you could log it there :-)
What's the strangest corner case you've seen in C# or .NET?
Rgds GJ
From MSDN :
The default implementation of
GetHashCode does not guarantee
uniqueness or consistency; therefore,
it must not be used as a unique object
identifier for hashing purposes.
Derived classes must override
GetHashCode with an implementation
that returns a unique hash code. For
best results, the hash code must be
based on the value of an instance
field or property, instead of a static
field or property.
So if you have not overwritten the GetHashCode method, it may return the same. I suspect this is because it generates it from the definition, not the instance.
Im working with a large dataset of point of interest (POI) which all have Lat/Long values.
I want to filter out POIs that are in close proximity to each other. I think that to achieve this I can round the Lat/Long down to X decimal places and group by the results (or call Distinct() or whatever)...
I have written a little LINQ statement which doesnt seem to do what I want,
var l1 = (from p in PointsOfInterest where p.IsVisibleOnMap select p).Distinct(new EqualityComparer()).ToList();
where EqualityComparer is
public class EqualityComparer : IEqualityComparer<PointOfInterest>
{
public bool Equals(PointOfInterest x, PointOfInterest y)
{
return Math.Round(x.Latitude.Value, 4) == Math.Round(y.Latitude.Value, 4) &&
Math.Round(x.Longitude.Value, 4) == Math.Round(y.Latitude.Value, 4);
}
public int GetHashCode(PointOfInterest obj)
{
return obj.GetHashCode();
}
}
but the Equals method never seems to get called?!?
Any thoughts on the best way to do this?
Equals() never gets called because GetHashCode() returns different values for any two objects because you are using GetHashCode() defined in System.Object class .
You'll need to implement GetHashCode() a little differently.
try something like
public int GetHashCode(PointOfInterest obj)
{
return obj.Longitude.Value.GetHashCode() ^ obj.Latitude.Value.GetHashCode();
}
This is the problem:
public int GetHashCode(PointOfInterest obj)
{
return obj.GetHashCode();
}
You'll have to override GetHashCode() appropriately as well, probably currently all items are considered different because the hash code doesn't match.
From MSDN for IEqualityComparer.GetHashCode():
Implement this method to provide
customized hash codes for
objects,corresponding to the
customized equality comparison
provided by the Equals method.
Any thoughts on the best way to do this?
Simply:
IEnumerable<PointOfInterest> result =
from p in PointsOfInterest
where p.IsVisibleOnMap
group p by new
{
Latitude = Math.Round(p.Latitude.Value, 4),
Longitude = Math.Round(p.Longitude.Value, 4)
} into g
let winner = g.First()
select winner;