My question is basically the opposite of Dictionary.ContainsKey return False, but a want True and of "the given key was not present in the dictionary" error when using a self-defined class as key:
I want to use a medium-sized class as the dictionary's key, and the dictionary must compare the keys by reference, not by value equality. The problem is, that the class already implements Equals() (which is performing value equality - which is what not what I want here).
Here's a small test class for reproduction:
class CTest
{
public int m_iValue;
public CTest (int i_iValue)
{
m_iValue = i_iValue;
}
public override bool Equals (object i_value)
{
if (ReferenceEquals (null, i_value))
return false;
if (ReferenceEquals (this, i_value))
return true;
if (i_value.GetType () != GetType ())
return false;
return m_iValue == ((CTest)i_value).m_iValue;
}
}
I have NOT yet implemented GetHashCode() (actually I have, but it only returns base.GetHashCode() so far).
Now I created a test program with a dictionary that uses instances of this class as keys. I can add multiple identical instances to the dictionary without problems, but this only works because GetHashCode() returns different values:
private static void Main ()
{
var oTest1 = new CTest (1);
var oTest2 = new CTest (1);
bool bEquals = Equals (oTest1, oTest2); // true
var dict = new Dictionary<CTest, int> ();
dict.Add (oTest1, 1);
dict.Add (oTest2, 2); // works
var iValue1 = dict[oTest1]; // correctly returns 1
var iValue2 = dict[oTest2]; // correctly returns 2
int iH1 = oTest1.GetHashCode (); // values different on each execution
int iH2 = oTest2.GetHashCode (); // values different on each execution, but never equals iH1
}
And the hash values are different every time, maybe because the calculatation in object.GetHashCode() uses some randomization or some numbers that come from the reference handle (which is different for each object).
However, this answer on Why is it important to override GetHashCode when Equals method is overridden? says that GetHashCode() must return the same values for equal objects, so I added
public override int GetHashCode ()
{
return m_iValue;
}
After that, I could not add multiple equal objects to the dictionary any more.
Now, there are two conclusions:
If I removed my own GetHashCode() again, the hash values will be different again and the dictionary can be used. But there may be situations that accidentally give the same hash code for two equal objects, which will cause an exception at runtime, whose cause will for sure never be found. Because of that (little, but not zero) risk, I cannot use a dictionary.
If I correctly implement GetHashCode() like I am supposed to do, I cannot use a dictionary anyway.
What possibilities exist to still use a dictionary?
Like many times before, I had the idea for a solution when writing this question.
You can specify an IEqualityComparer<TKey> in the constructor of the dictionary. There is one in the .net framework, but it's internal sealed, so you need to implement your own:
Is there any kind of "ReferenceComparer" in .NET?
internal class ReferenceComparer<T> : IEqualityComparer<T> where T : class
{
static ReferenceComparer ()
{
Instance = new ReferenceComparer<T> ();
}
public static ReferenceComparer<T> Instance { get; }
public bool Equals (T x, T y)
{
return ReferenceEquals (x, y);
}
public int GetHashCode (T obj)
{
return System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode (obj);
}
}
Related
One of the great advantages is supposed to be value based/structural equality, but how do I get that to work with collection properties?
Concrete simple example:
public record Something(string Id);
public record Sample(List<Something> something);
With the above records I would expect the following test to pass:
[Fact]
public void Test()
{
var x = new Sample(new List<Something>() {
new Something("x1")
});
var y = new Sample(new List<Something>() {
new Something("x1")
});
Assert.Equal(x, y);
}
I understand that it is because of List being a reference type, but does it exist a collection that implements value based comparison? Basically I would like to do a "deep" value based comparison.
Records don't do this automatically, but you can implement the Equals method yourself:
public record Sample(List<Something> something) : IEquatable<Sample>
{
public virtual bool Equals(Sample? other) =>
other != null &&
Enumerable.SequenceEqual(something, other.something);
}
But note that GetHashCode should be overridden to be consistent with Equals. See also implement GetHashCode() for objects that contain collections
I am checking if a total group of edges already contains the connection between 2 points.
I want to use HashSet's that will contain 2 vectors as Dictionary keys. Then I want to be able to call a performant Dictionary.ContainsKey(hashSet). I want the contains/equality check to be dependent on the Vectors in the Set.
Fex. If I add HashSet [V000 V001] to the Dict. I want to get Dictionary.ContainsKey(HashSet [V001 V000]) return true. (HashSet, so the order can vary, just the same Elements)
The Problem seems to be, that the Dictionary.ContainsKey() method does see separately created HashSets as different objects, even though, they contain the same elements.
Dictionary<HashSet<Vector3>, Vector3> d = new Dictionary<HashSet<Vector3>, Vector3>();
HashSet<Vector3> s = new HashSet<Vector3>();
s.Add(Vector3.one);
s.Add(Vector3.zero);
d.Add(s);
HashSet<Vector3> s2 = new HashSet<Vector3>();
s2.Add(Vector3.zero);
s2.Add(Vector3.one);
bool doesContain = d.ContainsKey(s2); // should be true
You also may suggest a better way of doing this 'Contains()' check efficiently.
The HashSet type doesn't do the equality comparison you want out of the box. It only has reference equality.
To get what you want, you'll need a new type to use as the Dictionary key. The new type will have a HashSet property, and overload Equals() and GetHashCode(), and may as well implement IEquatable at this point as well.
I'll get you started:
public class HashKey<T> : IEquatable<HashKey<T>>
{
private HashSet<T> _items;
public HashSet<T> Items
{
get {return _items;}
private set {_items = value;}
}
public HashKey()
{
_items = new HashSet<T>();
}
public HashKey(HashSet<T> initialSet)
{
_items = initialSet ?? new HashSet();
}
public override int GetHashCode()
{
// I'm leaving this for you to do
}
public override bool Equals(Object obj)
{
if (! (obj is HashKey)) return false;
return this.GetHashCode().Equals(obj.GetHashCode());
}
public bool Equals(HashSet<T> obj)
{
if (obj is null) return false;
return this.GetHashCode().Equals(obj.GetHashCode());
}
}
You want to use a hashset as key.
So the keys are references where one key is one hashset reference.
The ContainsKey compare references.
For what you want to do, you can create a class that implements IEqualityComparer to pass it to the dictionary constructor.
https://learn.microsoft.com/dotnet/api/system.collections.generic.iequalitycomparer-1
If you want a full management, you should create a new class embedding the dictionary and implement your own public operations wrapping that of the dictionary : ContainsKey and all others methods you need.
public class MyDictionary : IEnumerable<>
{
private Dictionary<HashSet<Vector3>, Vector3> d
= new Dictionary<HashSet<Vector3>, Vector3>();
public int Count { get; }
public this...
public ContainsKey()
{
// implements your own comparison algorithm
}
public Add();
public Remove();
...
}
So you will have a strongly typed dictionary for your intended usage.
By searching though msdn c# documentation and stack overflow, I get the clear impression that Dictionary<T,T> is supposed to use GetHashCode() for checking key-uniqueness and to do look-up.
The Dictionary generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table.
...
The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey.
I Use mono (in Unity3D), and after getting some weird results in my work, I conducted this experiment:
public class DictionaryTest
{
public static void TestKeyUniqueness()
{
//Test a dictionary of type1
Dictionary<KeyType1, string> dictionaryType1 = new Dictionary<KeyType1, string>();
dictionaryType1[new KeyType1(1)] = "Val1";
if(dictionaryType1.ContainsKey(new KeyType1(1)))
{
Debug.Log ("Key in dicType1 was already present"); //This line does NOT print
}
//Test a dictionary of type1
Dictionary<KeyType2, string> dictionaryType2 = new Dictionary<KeyType2, string>();
dictionaryType2[new KeyType2(1)] = "Val1";
if(dictionaryType2.ContainsKey(new KeyType2(1)))
{
Debug.Log ("Key in dicType2 was already present"); // Only this line prints
}
}
}
//This type implements only GetHashCode()
public class KeyType1
{
private int var1;
public KeyType1(int v1)
{
var1 = v1;
}
public override int GetHashCode ()
{
return var1;
}
}
//This type implements both GetHashCode() and Equals(obj), where Equals uses the hashcode.
public class KeyType2
{
private int var1;
public KeyType2(int v1)
{
var1 = v1;
}
public override int GetHashCode ()
{
return var1;
}
public override bool Equals (object obj)
{
return GetHashCode() == obj.GetHashCode();
}
}
Only the when using type KeyType2 are the keys considered equal. To me this demonstrates that Dictionary uses Equals(obj) - and not GetHashCode().
Can someone reproduce this, and help me interpret the meaning is? Is it an incorrect implementation in mono? Or have I misunderstood something.
i get the clear impression that Dictionary is supposed to use
.GetHashCode() for checking key-uniqueness
What made you think that? GetHashCode doesn't return unique values.
And MSDN clearly says:
Dictionary requires an equality implementation to
determine whether keys are equal. You can specify an implementation of
the IEqualityComparer generic interface by using a constructor that
accepts a comparer parameter; if you do not specify an implementation,
the default generic equality comparer EqualityComparer.Default is
used. If type TKey implements the System.IEquatable generic
interface, the default equality comparer uses that implementation.
Doing this:
public override bool Equals (object obj)
{
return GetHashCode() == obj.GetHashCode();
}
is wrong in the general case because you might end up with KeyType2 instances that are equal to StringBuilder, SomeOtherClass, AnythingYouCanImagine and what not instances.
You should totally do it like so:
public override bool Equals (object obj)
{
if (obj is KeyType2) {
return (obj as KeyType2).var1 == this.var1;
} else
return false;
}
When you are trying to override Equals and inherently GetHashCode you must ensure the following points (given the class MyObject) in this order (you were doing it the other way around):
1) When are 2 instances of MyObject equal ? Say you have:
public class MyObject {
public string Name { get; set; }
public string Address { get; set; }
public int Age { get; set; }
public DateTime TimeWhenIBroughtThisInstanceFromTheDatabase { get; set; }
}
And you have 1 record in some database that you need to be mapped to an instance of this class.
And you make the convention that the time you read the record from the database will be stored
in the TimeWhenIBroughtThisInstanceFromTheDatabase:
MyObject obj1 = DbHelper.ReadFromDatabase( ...some params...);
// you do that at 14:05 and thusly the TimeWhenIBroughtThisInstanceFromTheDatabase
// will be assigned accordingly
// later.. at 14:07 you read the same record into a different instance of MyClass
MyObject obj2 = DbHelper.ReadFromDatabase( ...some params...);
// (the same)
// At 14:09 you ask yourself if the 2 instances are the same
bool theyAre = obj1.Equals(obj2)
Do you want the result to be true ? I would say you do.
Therefore the overriding of Equals should like so:
public class MyObject {
...
public override bool Equals(object obj) {
if (obj is MyObject) {
var that = obj as MyObject;
return (this.Name == that.Name) &&
(this.Address == that.Address) &&
(this.Age == that.Age);
// without the syntactically possible but logically challenged:
// && (this.TimeWhenIBroughtThisInstanceFromTheDatabase ==
// that.TimeWhenIBroughtThisInstanceFromTheDatabase)
} else
return false;
}
...
}
2) ENSURE THAT whenever 2 instances are equal (as indicated by the Equals method you implement)
their GetHashCode results will be identitcal.
int hash1 = obj1.GetHashCode();
int hash2 = obj2.GetHashCode();
bool theseMustBeAlso = hash1 == hash2;
The easiest way to do that is (in the sample scenario):
public class MyObject {
...
public override int GetHashCode() {
int result;
result = ((this.Name != null) ? this.Name.GetHashCode() : 0) ^
((this.Address != null) ? this.Address.GetHashCode() : 0) ^
this.Age.GetHashCode();
// without the syntactically possible but logically challenged:
// ^ this.TimeWhenIBroughtThisInstanceFromTheDatabase.GetHashCode()
}
...
}
Note that:
- Strings can be null and that .GetHashCode() might fail with NullReferenceException.
- I used ^ (XOR). You can use whatever you want as long as the golden rule (number 2) is respected.
- x ^ 0 == x (for whatever x)
I'm trying to get a hash (md5 or sha) of an object.
I've implemented this:
http://alexmg.com/post/2009/04/16/Compute-any-hash-for-any-object-in-C.aspx
I'm using nHibernate to retrieve my POCOs from a database.
When running GetHash on this, it's different each time it's selected and hydrated from the database. I guess this is expected, as the underlying proxies will change.
Anyway,
Is there a way to get a hash of all the properties on an object, consistently each time?
I've toyed with the idea of using a StringBuilder over this.GetType().GetProperties..... and creating a hash on that, but that seems inefficient?
As a side note, this is for change-tracking these entities from one database (RDBMS) to a NoSQL store
(comparing hash values to see if objects changed between rdbms and nosql)
If you're not overriding GetHashCode you just inherit Object.GetHashCode. Object.GetHashCode basically just returns the memory address of the instance, if it's a reference object. Of course, each time an object is loaded it will likely be loaded into a different part of memory and thus result in a different hash code.
It's debatable whether that's the correct thing to do; but that's what was implemented "back in the day" so it can't change now.
If you want something consistent then you have to override GetHashCode and create a code based on the "value" of the object (i.e. the properties and/or fields). This can be as simple as a distributed merging of the hash codes of all the properties/fields. Or, it could be as complicated as you need it to be. If all you're looking for is something to differentiate two different objects, then using a unique key on the object might work for you.If you're looking for change tracking, using the unique key for the hash probably isn't going to work
I simply use all the hash codes of the fields to create a reasonably distributed hash code for the parent object. For example:
public override int GetHashCode()
{
unchecked
{
int result = (Name != null ? Name.GetHashCode() : 0);
result = (result*397) ^ (Street != null ? Street.GetHashCode() : 0);
result = (result*397) ^ Age;
return result;
}
}
The use of the prime number 397 is to generate a unique number for a value to better distribute the hash code. See http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/ for more details on the use of primes in hash code calculations.
You could, of course, use reflection to get at all the properties to do this, but that would be slower. Alternatively you could use the CodeDOM to generate code dynamically to generate the hash based on reflecting on the properties and cache that code (i.e. generate it once and reload it next time). But, this of course, is very complex and might not be worth the effort.
An MD5 or SHA hash or CRC is generally based on a block of data. If you want that, then using the hash code of each property doesn't make sense. Possibly serializing the data to memory and calculating the hash that way would be more applicable, as Henk describes.
If this 'hash' is solely used to determine whether entities have changed then the following algorithm may help (NB it is untested and assumes that the same runtime will be used when generating hashes (otherwise the reliance on GetHashCode for 'simple' types is incorrect)):
public static byte[] Hash<T>(T entity)
{
var seen = new HashSet<object>();
var properties = GetAllSimpleProperties(entity, seen);
return properties.Select(p => BitConverter.GetBytes(p.GetHashCode()).AsEnumerable()).Aggregate((ag, next) => ag.Concat(next)).ToArray();
}
private static IEnumerable<object> GetAllSimpleProperties<T>(T entity, HashSet<object> seen)
{
foreach (var property in PropertiesOf<T>.All(entity))
{
if (property is int || property is long || property is string ...) yield return property;
else if (seen.Add(property)) // Handle cyclic references
{
foreach (var simple in GetAllSimpleProperties(property, seen)) yield return simple;
}
}
}
private static class PropertiesOf<T>
{
private static readonly List<Func<T, dynamic>> Properties = new List<Func<T, dynamic>>();
static PropertiesOf()
{
foreach (var property in typeof(T).GetProperties())
{
var getMethod = property.GetGetMethod();
var function = (Func<T, dynamic>)Delegate.CreateDelegate(typeof(Func<T, dynamic>), getMethod);
Properties.Add(function);
}
}
public static IEnumerable<dynamic> All(T entity)
{
return Properties.Select(p => p(entity)).Where(v => v != null);
}
}
This would then be useable like so:
var entity1 = LoadEntityFromRdbms();
var entity2 = LoadEntityFromNoSql();
var hash1 = Hash(entity1);
var hash2 = Hash(entity2);
Assert.IsTrue(hash1.SequenceEqual(hash2));
GetHashCode() returns an Int32 (not an MD5).
If you create two objects with all the same property values they will not have the same Hash if you use the base or system GetHashCode().
String is an object and an exception.
string s1 = "john";
string s2 = "john";
if (s1 == s2) returns true and will return the same GetHashCode()
If you want to control equality comparison of two objects then you should override the GetHash and Equality.
If two object are the same then they must also have the same GetHash(). But two objects with the same GetHash() are not necessarily the same. A comparison will first test the GetHash() and if it gets a match there it will test the Equals. OK there are some comparisons that go straight to Equals but you should still override both and make sure two identical objects produce the same GetHash.
I use this for syncing a client with the server. You could use all the Properties or you could have any Property change change the VerID. The advantage here is a simpler quicker GetHashCode(). In my case I was resetting the VerID with any Property change already.
public override bool Equals(Object obj)
{
//Check for null and compare run-time types.
if (obj == null || !(obj is FTSdocWord)) return false;
FTSdocWord item = (FTSdocWord)obj;
return (OjbID == item.ObjID && VerID == item.VerID);
}
public override int GetHashCode()
{
return ObjID ^ VerID;
}
I ended up using ObjID alone so I could do the following
if (myClientObj == myServerObj && myClientObj.VerID <> myServerObj.VerID)
{
// need to synch
}
Object.GetHashCode Method
Two objects with the same property values. Are they equal? Do they produce the same GetHashCode()?
personDefault pd1 = new personDefault("John");
personDefault pd2 = new personDefault("John");
System.Diagnostics.Debug.WriteLine(po1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(po2.GetHashCode().ToString());
// different GetHashCode
if (pd1.Equals(pd2)) // returns false
{
System.Diagnostics.Debug.WriteLine("pd1 == pd2");
}
List<personDefault> personsDefault = new List<personDefault>();
personsDefault.Add(pd1);
if (personsDefault.Contains(pd2)) // returns false
{
System.Diagnostics.Debug.WriteLine("Contains(pd2)");
}
personOverRide po1 = new personOverRide("John");
personOverRide po2 = new personOverRide("John");
System.Diagnostics.Debug.WriteLine(po1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(po2.GetHashCode().ToString());
// same hash
if (po1.Equals(po2)) // returns true
{
System.Diagnostics.Debug.WriteLine("po1 == po2");
}
List<personOverRide> personsOverRide = new List<personOverRide>();
personsOverRide.Add(po1);
if (personsOverRide.Contains(po2)) // returns true
{
System.Diagnostics.Debug.WriteLine("Contains(p02)");
}
}
public class personDefault
{
public string Name { get; private set; }
public personDefault(string name) { Name = name; }
}
public class personOverRide: Object
{
public string Name { get; private set; }
public personOverRide(string name) { Name = name; }
public override bool Equals(Object obj)
{
//Check for null and compare run-time types.
if (obj == null || !(obj is personOverRide)) return false;
personOverRide item = (personOverRide)obj;
return (Name == item.Name);
}
public override int GetHashCode()
{
return Name.GetHashCode();
}
}
I have a class, show below, which is used as a key in a Dictionary<ValuesAandB, string>
I'm having issues when trying to find any key within this dictionary, it never finds it at all. As you can see, I have overridden Equals and GetHashCode.
To look for the key I'm using
ValuesAandB key = new ValuesAandB(A,B);
if (DictionaryName.ContainsKey(key)) {
...
}
Is there anything else that I'm missing? Can anyone point out what I'm doing wrong?
private class ValuesAandB {
public string valueA;
public string valueB;
// Constructor
public ValuesAandB (string valueAIn, string valueBIn) {
valueA = valueAIn;
valueB = ValueBIn;
}
public class EqualityComparer : IEqualityComparer<ValuesAandB> {
public bool Equals(ValuesAandB x, ValuesAandB y) {
return ((x.valueA.Equals(y.valueA)) && (x.valueB.Equals(y.valueB)));
}
public int GetHashCode(ValuesAandB x) {
return x.valueA.GetHashCode() ^ x.valueB.GetHashCode();
}
}
}
And before anyone asks, yes the values are in the Dictionary!
You have not overridden Equals and GetHashCode. You have implemented a second class which can serve as an EqualityComparer. If you don't construct the Dictionary with the EqualityComparer, it will not be used.
The simplest fix would be to override GetHashCode and Equals directly rather than implementing a comparer (comparers are generally only interesting when you need to supply multiple different comparison types (case sensitive and case insensitive for example) or when you need to be able to perform comparisons on a class which you don't control.
I had this problem, turns out the dictionary was comparing referances for my key, not the values in the object.
I was using a custom Point class as keys. I overrode the ToString() and the GetHashCode() methods and viola, key lookup worked fine.
It looks like you're comparing two strings. Iirc, when using .Equals(), you are comparing the reference of the strings, not the actual contents. To implement an EqualityComparer that works with strings, you would want to use the String.Compare() method.
public class EqualityComparer : IEqualityComparer<ValuesAandB>
{
public bool Equals(ValuesAandB x, ValuesAandB y)
{
return ((String.Compare(x.valueA,y.valueA) == 0) &&
(String.Compare(x.valueB, y.valueB) == 0));
}
// gethashcode stuff here
}
I could be a little off with the code, that should get you close...