I am storing a 2D array that represents a distance matrix between vectors as a Dictionary<DistanceCell, double>. My implementation of DistanceCell has two string fields representing the vectors being compared.
class DistanceCell
{
public string Group1 { get; private set; }
public string Group2 { get; private set; }
public DistanceCell(string group1, string group2)
{
if (group1 == null)
{
throw new ArgumentNullException("group1");
}
if (group2 == null)
{
throw new ArgumentNullException("group2");
}
this.Group1 = group1;
this.Group2 = group2;
}
}
Since I'm using this class as a key, I overrode Equals() and GetHashCode():
public override bool Equals(object obj)
{
// False if the object is null
if (obj == null)
{
return false;
}
// Try casting to a DistanceCell. If it fails, return false;
DistanceCell cell = obj as DistanceCell;
if (cell == null)
{
return false;
}
return (this.Group1 == cell.Group1 && this.Group2 == cell.Group2)
|| (this.Group1 == cell.Group2 && this.Group2 == cell.Group1);
}
public bool Equals(DistanceCell cell)
{
if (cell == null)
{
return false;
}
return (this.Group1 == cell.Group1 && this.Group2 == cell.Group2)
|| (this.Group1 == cell.Group2 && this.Group2 == cell.Group1);
}
public static bool operator ==(DistanceCell a, DistanceCell b)
{
// If both are null, or both are same instance, return true.
if (System.Object.ReferenceEquals(a, b))
{
return true;
}
// If either is null, return false.
// Cast a and b to objects to check for null to avoid calling this operator method
// and causing an infinite loop.
if ((object)a == null || (object)b == null)
{
return false;
}
return (a.Group1 == b.Group1 && a.Group2 == b.Group2)
|| (a.Group1 == b.Group2 && a.Group2 == b.Group1);
}
public static bool operator !=(DistanceCell a, DistanceCell b)
{
return !(a == b);
}
public override int GetHashCode()
{
int hash;
unchecked
{
hash = Group1.GetHashCode() * Group2.GetHashCode();
}
return hash;
}
As you can see, one of the requirements of DistanceCell is that Group1 and Group2 are interchangeable. So for two strings x and y, DistanceCell("x", "y") must equal DistanceCell("y", "x"). This is why I implemented GetHashCode() with multiplication because DistanceCell("x", "y").GetHashCode() must equal DistanceCell("y", "x").GetHashCode().
The problem that I'm having is that it works correctly roughly 90% of the time, but it throws a KeyNotFoundException or a NullReferenceException the remainder of the time. The former is thrown when getting a key from the dictionary, and the latter when I iterate over the dictionary with a foreach loop and it retrieves a key that is null that it then tries to call Equals() on. I suspect this has something to do with an error in my GetHashCode() implementation, but I'm not positive. Also note that there should never be a case where the key wouldn't exist in the dictionary when I check it because of the nature of my algorithm. The algorithm takes the same path every execution.
Update
I just wanted to update everyone that the problem is fixed. It turns out that it had nothing to do with my Equals() or GetHashCode() implementation. I did some extensive debugging and found that the reason I was getting a KeyNotFoundException was because the key didn't exist in the dictionary in the first place, which was strange because I was positive that it was being added. The problem was that I was adding the keys to the dictionary with multiple threads, and according to this, the c# Dictionary class isn't thread-safe. So the timing must have been perfect that an Add() failed and thus the key was never added to the dictionary. I believe this can also explain how the foreach loop was bringing up a null key on occasion. The Add()'s from multiple threads must have messed up some internal structures in the dictionary and introduced a null key.
Thanks to every one for your help! I'm sorry that it ended up being a fault entirely on my end.
Take a look at this blog post by Eric Lippert
It says that the result of GetHashCode should never change even if you change the content of the object. The reason is that Dictionary is using buckets for faster indexing. Once you change the GetHasCode result, the Dictionary will not be able to find the right bucket for your object and this can lead to "Key not found"
I might be wrong, but it is worth testing.
I believe what is missing for you is this answer
Using an object as a generic Dictionary key
You probably not stating you implement IEquatable interface
To me your code looks correct. (It can possibly be (micro-)optimized, but it looks as if it will be correct in all cases.) Can you provide some code where you create and use the Dictionary<DistanceCell, double>, and things go bad? The problem may lie in code you haven't shown us.
Related
My class implements IEquatable and IComparable. It then gets added to a sortedset.
The goal is to have it be sorted by its "Date" property and be equal if both "ID1" and "ID2" are the same.
The implemented methods:
public int CompareTo(MyClass other)
{
return other.Date.CompareTo(Date);
}
public override int GetHashCode()
{
unchecked
{
var hashCode = ID1;
hashCode = (hashCode * 397) ^ ID2;
return hashCode;
}
}
public bool Equals(MyClass other)
{
if (ReferenceEquals(null, other))
return false;
if (ReferenceEquals(this, other))
return true;
return ID1 == other.ID1
&& ID2 == other.ID2;
}
The resulting sortedset is correctly sorted, but there are still elements that should be equal and therefore not in the set. Using breakpoints it seems neither GetHashCode nor Equals get called.
Any tips on how to solve this?
SortedSet uses CompareTo for both sorting and equality comparison.
There isn't an in-built ordered collection which will allow you to specify a different method to compare for equality in ordering than to compare for equality in maintaining distinctness. I'm not completely sure why this is, but it may be to do with the asssumptions behind the algorithms used for sorting.
If you want this behaviour, probably the simplest way to do it is wrap the underlying collection in its own class which will check for distinctness before adding new items to the collection.
You also need to be careful that items which are equal for sorting but not equal by your Equals method can both be added to the underlying collection. In your case, where the sorting is done by Date and the equality is done by ID1 and ID2 this might look something like:
public int CompareTo(MyClass other)
{
var result = other.Date.CompareTo(Date);
if(result != 0)
return result;
result = other.ID1.CompareTo(ID1);
if(result != 0)
return result;
return other.ID2.CompareTo(ID2);
}
This imposes extra ordering if the dates are the same, but I wouldn't expect that to be a problem.
Alternatively, you can "cheat", by forcing items which are equal to compare as equal in sort position. This would probably best be moved to a custom IComparer, because it's not the normal sorting behaviour you want, outside a SortedSet:
public int CompareTo(MyClass other)
{
if(other.Equals(this))
return 0;
var result = other.Date.CompareTo(Date);
if(result != 0)
return result;
result = other.ID1.CompareTo(ID1);
if(result != 0)
return result;
return other.ID2.CompareTo(ID2);
}
This would allow you to avoid having to create your own wrapper class around the collection, and is safer. It is rather hacky, though.
It seems that this problem has already been encountered by quite a few people:
List not working as expected
Contains always giving false
So I saw the answers and tried to implement the override of Equals and of GetHashCode but there seems that I am coding something wrong.
This is the situation: I have a list of Users(Class), each user has a List and a Name property, the list property contains licenses. I am trying to do a
if (!users.Contains(currentUser))
but it is not working as expected. And this is the code I did to override the Equals and GetHashCode:
public override bool Equals(object obj)
{
return Equals(obj as User);
}
public bool Equals(User otherUser)
{
if (ReferenceEquals(otherUser, null))
return false;
if (ReferenceEquals(this, otherUser))
return true;
return this._userName.Equals(otherUser.UserName) &&
this._licenses.SequenceEqual<string>(otherUser.Licenses);
}
public override int GetHashCode()
{
int hash = 13;
if (!_licenses.Any() && !_userName.Equals(""))
{
unchecked
{
foreach (string str in Licenses)
{
hash *= 7;
if (str != null) hash = hash + str.GetHashCode();
}
hash = (hash * 7) + _userName.GetHashCode();
}
}
return hash;
}
thank you for your suggestions and help in advance!
EDIT 1:
this is the code where I am doing the List.Contains, I am trying to see if the list already contains certain user, if not then add the user that isn't there. The Contains only works the first time, when currentUser changes then the User inside the list changes to the current user maybe this is a problem that is unrelated to the equals, any ideas?
if (isIn)
{
if (!listOfLicenses.Contains(items[3]))
listOfLicenses.Add(items[3]);
if (!users.Contains(currentUser))
{
User user2Add = new User();
user2Add.UserName = currentUser.UserName;
users.Add(user2Add);
userIndexer++;
}
if (users[userIndexer - 1].UserName.Equals(currentUser.UserName))
{
users[userIndexer - 1].Licenses.Add(items[3]);
}
result.Rows.Add();
}
Well, one problem with your hash code - if either there are no licences or the username is empty, you're ignoring the other component. I'd rewrite it as:
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 31 + _userName.GetHashCode();
foreach (string licence in Licences)
{
hash = hash * 31 + licences.GetHashCode();
}
return hash;
}
}
Shorter and simpler. It doesn't matter if you use the hash code of the empty string, or if you iterate over an empty collection.
That said, I'd have expected the previous code to work anyway. Note that it's order sensitive for the licences... oh, and List<T> won't use GetHashCode anyway. (You should absolutely override it appropriately, but it won't be the cause of the error.)
It would really help if you could show a short but complete program demonstrating the problem - I strongly suspect that you'll find it's actually a problem with your test data.
After users[userIndexer - 1].Licenses.Add(items[3]) , users[userIndexer - 1] is not the same user anymore. You have changed the Licences which is used in equality comparison(in User.Equals).
--EDIT
See below code
public class Class
{
static void Main(string[] args)
{
User u1 = new User("1");
User u2 = new User("1");
Console.WriteLine(u1.Equals(u2));
u2.Lic = "2";
Console.WriteLine(u1.Equals(u2));
}
}
public class User
{
public string Lic;
public User(string lic)
{
this.Lic = lic;
}
public override bool Equals(object obj)
{
return (obj as User).Lic == Lic;
}
}
You need you implement Equals and GetHashcode for the License class, otherwise SequenceEqual will not work.
Does your class implement IEquatable<User>? From your equality methods it appears it does but just checking.
The documentation for List.Contains states that:
This method determines equality by using the default equality
comparer, as defined by the object's implementation of the
IEquatable(T).Equals method for T (the type of values in the list)
It is very important to make sure that the value returned by GetHashCode never ever ever changes for a specific instance of an object. If the value changes then lists and dictionaries won't work correctly.
Think of GetHashCode as "GetPrimaryKey". You would not change the primary key of a user record in a database if someone added a new license to the user. Likewise you mustn't change the GetHashCode.
It appears from your code that you are changing the licenses collection and you're using that to calculate your hash code. So that is probably causing your issue.
Now, it is perfectly legitimate to use a constant value for every hash code you produce - you could just return 42 for every instance, for example. This will force calling Equals to determine if two objects are equal or not. All that having distinct hash codes does is short circuits the need to call Equals.
If the _userName field doesn't change then just return its hash code and see it that works.
Take the following:
var x = new Action(() => { Console.Write("") ; });
var y = new Action(() => { });
var a = x.GetHashCode();
var b = y.GetHashCode();
Console.WriteLine(a == b);
Console.WriteLine(x == y);
This will print:
True
False
Why is the hashcode the same?
It is kinda surprising, and will make using delegates in a Dictionary as slow as a List (aka O(n) for lookups).
Update:
The question is why. IOW who made such a (silly) decision?
A better hashcode implementation would have been:
return Method ^ Target == null ? 0 : Target.GetHashcode();
// where Method is IntPtr
Easy! Since here is the implementation of the GetHashCode (sitting on the base class Delegate):
public override int GetHashCode()
{
return base.GetType().GetHashCode();
}
(sitting on the base class MulticastDelegate which will call above):
public sealed override int GetHashCode()
{
if (this.IsUnmanagedFunctionPtr())
{
return ValueType.GetHashCodeOfPtr(base._methodPtr);
}
object[] objArray = this._invocationList as object[];
if (objArray == null)
{
return base.GetHashCode();
}
int num = 0;
for (int i = 0; i < ((int) this._invocationCount); i++)
{
num = (num * 0x21) + objArray[i].GetHashCode();
}
return num;
}
Using tools such as Reflector, we can see the code and it seems like the default implementation is as strange as we see above.
The type value here will be Action. Hence the result above is correct.
UPDATE
My first attempt of a better implementation:
public class DelegateEqualityComparer:IEqualityComparer<Delegate>
{
public bool Equals(Delegate del1,Delegate del2)
{
return (del1 != null) && del1.Equals(del2);
}
public int GetHashCode(Delegate obj)
{
if(obj==null)
return 0;
int result = obj.Method.GetHashCode() ^ obj.GetType().GetHashCode();
if(obj.Target != null)
result ^= RuntimeHelpers.GetHashCode(obj);
return result;
}
}
The quality of this should be good for single cast delegates, but not so much for multicast delegates (If I recall correctly Target/Method return the values of the last element delegate).
But I'm not really sure if it fulfills the contract in all corner cases.
Hmm it looks like quality requires referential equality of the targets.
This smells like some of the cases mentioned in this thread, maybe it will give you some pointers on this behaviour. else, you could log it there :-)
What's the strangest corner case you've seen in C# or .NET?
Rgds GJ
From MSDN :
The default implementation of
GetHashCode does not guarantee
uniqueness or consistency; therefore,
it must not be used as a unique object
identifier for hashing purposes.
Derived classes must override
GetHashCode with an implementation
that returns a unique hash code. For
best results, the hash code must be
based on the value of an instance
field or property, instead of a static
field or property.
So if you have not overwritten the GetHashCode method, it may return the same. I suspect this is because it generates it from the definition, not the instance.
I need to sort in memory lists of strings or numbers in ascending or descending order. However, the list can contain null values and all null values must appear after the numbers or strings.
That is the input data might be:
1, 100, null, 5, 32.3
The ascending result would be
1, 5, 32.3, 100, null
The descending list would be
100, 32.3, 5, 1, null
Any ideas on how to make this work?
I don't have a compiler in front of me to check, but I'm thinking something like:
x.OrderBy(i => i == null).ThenBy(i => i)
You can write your own comparer which delegates to an existing one for non-nulls, but always sorts nulls at the end. Something like this:
public class NullsLastComparer<T> : IComparer<T>
{
private readonly IComparer<T> proxy;
public NullsLastComparer(IComparer<T> proxy)
{
this.proxy = proxy;
}
public override int Compare(T first, T second)
{
if (first == null && second == null)
{
return 0;
}
if (first == null)
{
return 1;
}
if (second == null)
{
return -1;
}
return proxy.Compare(first, second);
}
}
EDIT: A few issues with this approach:
Firstly, it doesn't play nicely with anonymous types; you might need a separate extension method to make that work well. Or use Ken's answer :)
More importantly, it violates the IComparer<T> contract, which specifies that nulls should be first. Now personally, I think this is a mistake in the IComparer<T> specification - it should perhaps define that the comparer should handle nulls, but it should not specify whether they come first or last... it makes requirements like this (which are perfectly reasonable) impossible to fulfil as cleanly as we might want, and has all kinds of awkward ramifications for things like a reversing comparer. You'd expect such a thing to completely reverse the ordering, but according to the spec it should still maintain nulls at the start :(
I don't think I've seen any .NET sorting implementations which actually rely on this, but it's definitely worth being aware of.
As Jon said, you need to define your custom Comparer, implementing IComparer. Here's how your Compare method in the custom comparer would like that can keep null at the end.
public int Compare(Object x, Object y)
{
int retVal = 0;
IComparable valX = x as IComparable;
IComparable valY = y as IComparable;
if (valX == null && valY == null)
{
return 0;
}
if (valX == null)
{
return 1;
}
else if (valY == null)
{
return -1;
}
return valX.CompareTo(valY);
}
I have a class with fields ColDescriptionOne(string), ColDescriptionTwo(string) and ColCodelist(int). I want to get the Intersect of two lists of this class where the desc are equal but the codelist is different.
I can use the Where clause and get what I need. However I can't seem to make it work using a custom Comparer like this:
internal class CodeListComparer: EqualityComparer<SheetRow>
{
public override bool Equals(SheetRow x, SheetRow y)
{
return Equals(x.ColDescriptionOne, y.ColDescriptionOne) &&
Equals(x.ColDescriptionSecond, y.ColDescriptionOne)
&& !Equals(x.ColCodelist, y.ColCodelist);
}
public override int GetHashCode(SheetRow obj)
{
return ((obj.ColDescriptionOne.GetHashCode()*397) + (obj.ColDescriptionSecond.GetHashCode()*397)
+ obj.ColCodelist.GetHashCode());
}
}
And then use it like this:
var onylByCodeList = firstSheet.Entries.Intersect(otherSheet.Entries, new CodeListComparer());
Any ideas what I'm doing wrong here ?
thanks
Sunit
You have a typo in the Equals method. The second line is comparing ColDescriptionOne to ColDescriptionSecond. They should both be ColDescriptionSecond.
return Equals(x.ColDescriptionOne, y.ColDescriptionOne)
&& Equals(x.ColDescriptionSecond, y.ColDescriptionSecond)
&& !Equals(x.ColCodelist, y.ColCodelist);
The second problem you have is you are including the ColCodeList in the GetHashCode method. The GetHashCode method must return the same value for objects that are equal. In this case though ColCodeList is supposed to be different when values are equal. This means that in cases where you want 2 objects to be considered equal they are more likely to have different hash codes which is incorrect.
Take that out of the GetHashCode method and everything should work.
public override int GetHashCode(SheetRow obj)
{
return ((obj.ColDescriptionOne.GetHashCode()*397)
+ (obj.ColDescriptionSecond.GetHashCode()*397));
}