I am trying to understand how the key check on insertion works in a hashtable.
I understand that when I add an object to a hashtable, it checks at runtime that the same key has not already been added.
In my test, I have two hashtables whose keys are:
1- Integers
2- An object whose GetHashCode method I have overridden to always return 1.
My issue: the first test breaks when adding the same int key, but the second test doesn't! How come? The hash codes that should be checked at insertion all return 1.
Thank you in advance!
My code:
class Collections
{
public Collections()
{
// Testing a hashtable with integer keys
Dictionary<int, string> d1 = new Dictionary<int, string>();
d1.Add(1, "one");
d1.Add(2, "two");
d1.Add(3, "three");
// d1.Add(3, "three"); // Cannot add the same key, i.e. same hashcode
foreach (int key in d1.Keys)
Console.WriteLine(key);
// Testing a hashtable with objects returning only 1 as hashcode for its keys
Dictionary<Hashkey, string> d2 = new Dictionary<Hashkey, string>();
d2.Add(new Hashkey(1), "one");
d2.Add(new Hashkey(2), "two");
d2.Add(new Hashkey(3), "three");
d2.Add(new Hashkey(3), "three");
for (int i = 0; i < d2.Count; i++)
Console.WriteLine(d2.Keys.ElementAt(i).ToString());
}
}
/// <summary>
/// Creating a class that serves as the key of a hash table, overriding the GetHashCode() of System.Object
/// </summary>
class Hashkey
{
public int Key { get; set; }
public Hashkey(int key)
{
this.Key = key;
}
// Overriding the Hashcode to return always 1
public override int GetHashCode()
{
return 1;
// return base.GetHashCode();
}
// No override
public override bool Equals(object obj)
{
return base.Equals(obj);
}
// returning the name of the object
public override string ToString()
{
return this.Key.ToString();
}
}
}
Dictionary will check for the hash code and object equality. The hash is just used to come up with a "first approximation" to find possibly equal keys very efficiently.
Your override of Equals just delegates to the base implementation, which uses reference equality. That means any two distinct instances of Hashkey are unequal, even if they have the same value for the Key property.
What are you actually trying to achieve? Or are you just trying to understand how GetHashCode and Equals relate to each other?
The hash code is merely a heuristic for choosing the bucket to store the item in. Just because two items share the same hash code doesn't mean that they are equal. In the case of a collision (as will always occur with your constant hash code of 1), the dictionary falls back to a simple search within the bucket for members that are equal to the searched key. As your equality test checks whether the references are the same, no two distinct items will ever be equal.
Equals compares the references of the Hashkey instances.
Because those are different instances, they are not equal.
Your Equals should look like this:
public override bool Equals(object obj)
{
if (ReferenceEquals(this, obj))
return true;
var other = obj as Hashkey;
return
other != null &&
Key.Equals(other.Key);
}
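To make the interplay concrete, here is a minimal sketch (the names ConstantHashKey and CollisionDemo are made up for illustration and are not part of the question's code). It keeps the constant hash code but compares the Key value in Equals, so a second key with the same value is rejected even though every key collides on the hash:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: constant hash code, but Equals compares the Key value.
sealed class ConstantHashKey
{
    public int Key { get; }
    public ConstantHashKey(int key) { Key = key; }

    // Every instance lands in the same bucket.
    public override int GetHashCode() { return 1; }

    // Value equality decides whether two keys are duplicates.
    public override bool Equals(object obj)
    {
        var other = obj as ConstantHashKey;
        return other != null && Key == other.Key;
    }
}

static class CollisionDemo
{
    // Returns true if adding a second key with the same Key value is rejected.
    public static bool DuplicateIsRejected()
    {
        var d = new Dictionary<ConstantHashKey, string>();
        d.Add(new ConstantHashKey(1), "one");
        d.Add(new ConstantHashKey(2), "two"); // same hash, different key: accepted
        try
        {
            d.Add(new ConstantHashKey(2), "two again");
            return false;
        }
        catch (ArgumentException)
        {
            return true; // same hash and Equals returns true: duplicate key
        }
    }

    static void Main()
    {
        Console.WriteLine(DuplicateIsRejected());
    }
}
```

A constant hash code is only useful for experiments like this one; with real data it turns every lookup into a linear search.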
Related
I am checking if a total group of edges already contains the connection between 2 points.
I want to use HashSets that each contain 2 vectors as Dictionary keys. Then I want to be able to call a performant Dictionary.ContainsKey(hashSet). I want the contains/equality check to depend on the Vectors in the set.
For example, if I add HashSet [V000 V001] to the Dictionary, I want Dictionary.ContainsKey(HashSet [V001 V000]) to return true. (It is a HashSet, so the order can vary; only the elements matter.)
The problem seems to be that the Dictionary.ContainsKey() method sees separately created HashSets as different objects, even though they contain the same elements.
Dictionary<HashSet<Vector3>, Vector3> d = new Dictionary<HashSet<Vector3>, Vector3>();
HashSet<Vector3> s = new HashSet<Vector3>();
s.Add(Vector3.one);
s.Add(Vector3.zero);
d.Add(s, Vector3.zero); // Add needs a key and a value; the value is arbitrary here
HashSet<Vector3> s2 = new HashSet<Vector3>();
s2.Add(Vector3.zero);
s2.Add(Vector3.one);
bool doesContain = d.ContainsKey(s2); // should be true
You also may suggest a better way of doing this 'Contains()' check efficiently.
The HashSet type doesn't do the equality comparison you want out of the box. It only has reference equality.
To get what you want, you'll need a new type to use as the Dictionary key. The new type will have a HashSet property and override Equals() and GetHashCode(), and it may as well implement IEquatable<T> at this point too.
I'll get you started:
public class HashKey<T> : IEquatable<HashKey<T>>
{
    private readonly HashSet<T> _items;
    public HashSet<T> Items
    {
        get { return _items; }
    }
    public HashKey()
    {
        _items = new HashSet<T>();
    }
    public HashKey(HashSet<T> initialSet)
    {
        _items = initialSet ?? new HashSet<T>();
    }
    public override int GetHashCode()
    {
        // One possible implementation: XOR the element hash codes, so the
        // result does not depend on the order in which items were added.
        int hash = 0;
        foreach (T item in _items)
            hash ^= (item == null ? 0 : item.GetHashCode());
        return hash;
    }
    public override bool Equals(object obj)
    {
        return Equals(obj as HashKey<T>);
    }
    public bool Equals(HashKey<T> other)
    {
        // Equal hash codes do not imply equality, so compare the contents.
        if (other is null) return false;
        return _items.SetEquals(other._items);
    }
}
You want to use a HashSet as the key.
So each key is a reference to one HashSet instance.
ContainsKey compares those references.
To get what you want, you can create a class that implements IEqualityComparer<T> and pass it to the dictionary constructor.
https://learn.microsoft.com/dotnet/api/system.collections.generic.iequalitycomparer-1
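As a hedged sketch of the comparer approach: instead of writing a comparer by hand, the built-in HashSet<T>.CreateSetComparer() can be passed to the dictionary constructor. The example below uses strings in place of Vector3 so that it is self-contained; SetKeyDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;

static class SetKeyDemo
{
    // Returns true when a separately built set with the same elements is found as a key.
    public static bool ContainsEquivalentSet()
    {
        // CreateSetComparer() compares sets by content, ignoring insertion order.
        var d = new Dictionary<HashSet<string>, int>(HashSet<string>.CreateSetComparer());

        var s = new HashSet<string> { "V000", "V001" };
        d.Add(s, 42);

        var s2 = new HashSet<string> { "V001", "V000" };
        return d.ContainsKey(s2);
    }

    static void Main()
    {
        Console.WriteLine(ContainsEquivalentSet());
    }
}
```

Keep in mind that a set used as a key should not be modified afterwards, because its content-based hash code would change while it sits in the dictionary.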
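If you want full control, you can create a new class that embeds the dictionary and exposes your own public operations wrapping those of the dictionary: ContainsKey and any other methods you need.
public class MyDictionary : IEnumerable<KeyValuePair<HashSet<Vector3>, Vector3>>
{
    // Assumption: the built-in set comparer is used here; swap in your own comparison algorithm if needed.
    private readonly Dictionary<HashSet<Vector3>, Vector3> d
        = new Dictionary<HashSet<Vector3>, Vector3>(HashSet<Vector3>.CreateSetComparer());
    public int Count { get { return d.Count; } }
    public Vector3 this[HashSet<Vector3> key]
    {
        get { return d[key]; }
        set { d[key] = value; }
    }
    public bool ContainsKey(HashSet<Vector3> key)
    {
        // implement your own comparison algorithm here, or rely on the comparer above
        return d.ContainsKey(key);
    }
    public void Add(HashSet<Vector3> key, Vector3 value) { d.Add(key, value); }
    public bool Remove(HashSet<Vector3> key) { return d.Remove(key); }
    public IEnumerator<KeyValuePair<HashSet<Vector3>, Vector3>> GetEnumerator() { return d.GetEnumerator(); }
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() { return GetEnumerator(); }
    // ... plus any other members you need
}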
So you will have a strongly typed dictionary for your intended usage.
My question is basically the opposite of Dictionary.ContainsKey return False, but a want True and of "the given key was not present in the dictionary" error when using a self-defined class as key:
I want to use a medium-sized class as the dictionary's key, and the dictionary must compare the keys by reference, not by value equality. The problem is that the class already implements Equals() (which performs value equality, which is not what I want here).
Here's a small test class for reproduction:
class CTest
{
public int m_iValue;
public CTest (int i_iValue)
{
m_iValue = i_iValue;
}
public override bool Equals (object i_value)
{
if (ReferenceEquals (null, i_value))
return false;
if (ReferenceEquals (this, i_value))
return true;
if (i_value.GetType () != GetType ())
return false;
return m_iValue == ((CTest)i_value).m_iValue;
}
}
I have NOT yet implemented GetHashCode() (actually I have, but it only returns base.GetHashCode() so far).
Now I created a test program with a dictionary that uses instances of this class as keys. I can add multiple identical instances to the dictionary without problems, but this only works because GetHashCode() returns different values:
private static void Main ()
{
var oTest1 = new CTest (1);
var oTest2 = new CTest (1);
bool bEquals = Equals (oTest1, oTest2); // true
var dict = new Dictionary<CTest, int> ();
dict.Add (oTest1, 1);
dict.Add (oTest2, 2); // works
var iValue1 = dict[oTest1]; // correctly returns 1
var iValue2 = dict[oTest2]; // correctly returns 2
int iH1 = oTest1.GetHashCode (); // values different on each execution
int iH2 = oTest2.GetHashCode (); // values different on each execution, but never equals iH1
}
The hash values are different every time, maybe because the calculation in object.GetHashCode() uses some randomization or some number derived from the reference handle (which is different for each object).
However, this answer on Why is it important to override GetHashCode when Equals method is overridden? says that GetHashCode() must return the same values for equal objects, so I added
public override int GetHashCode ()
{
return m_iValue;
}
After that, I could not add multiple equal objects to the dictionary any more.
Now, there are two conclusions:
If I remove my own GetHashCode() again, the hash values will be different again and the dictionary can be used. But there may be situations that accidentally give the same hash code for two equal objects, which will cause an exception at runtime whose cause will surely never be found. Because of that (small, but not zero) risk, I cannot use a dictionary.
If I correctly implement GetHashCode() like I am supposed to do, I cannot use a dictionary anyway.
What possibilities exist to still use a dictionary?
Like many times before, I had the idea for a solution when writing this question.
You can specify an IEqualityComparer<TKey> in the constructor of the dictionary. There is one in the .net framework, but it's internal sealed, so you need to implement your own:
Is there any kind of "ReferenceComparer" in .NET?
internal class ReferenceComparer<T> : IEqualityComparer<T> where T : class
{
static ReferenceComparer ()
{
Instance = new ReferenceComparer<T> ();
}
public static ReferenceComparer<T> Instance { get; }
public bool Equals (T x, T y)
{
return ReferenceEquals (x, y);
}
public int GetHashCode (T obj)
{
return System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode (obj);
}
}
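A short usage sketch, under the assumption that a comparer like the one above is available. ValueKey, IdentityComparer and IdentityDemo are illustrative names, with ValueKey standing in for CTest including the value-based GetHashCode(). On .NET 5 and later, the public System.Collections.Generic.ReferenceEqualityComparer.Instance should serve the same purpose.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical key type with value equality, standing in for CTest.
sealed class ValueKey
{
    public readonly int Value;
    public ValueKey(int value) { Value = value; }
    public override bool Equals(object obj)
    {
        var other = obj as ValueKey;
        return other != null && other.Value == Value;
    }
    public override int GetHashCode() { return Value; }
}

// Compares keys by identity and ignores any Equals/GetHashCode overrides.
sealed class IdentityComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y) { return ReferenceEquals(x, y); }
    public int GetHashCode(T obj) { return RuntimeHelpers.GetHashCode(obj); }
}

static class IdentityDemo
{
    // Returns the number of entries after adding two equal-but-distinct keys.
    public static int CountAfterAddingEqualKeys()
    {
        var dict = new Dictionary<ValueKey, int>(new IdentityComparer<ValueKey>());
        var a = new ValueKey(1);
        var b = new ValueKey(1);   // a.Equals(b) is true, but a and b are different objects
        dict.Add(a, 1);
        dict.Add(b, 2);            // accepted, because the comparer only looks at references
        return dict.Count;
    }

    static void Main()
    {
        Console.WriteLine(CountAfterAddingEqualKeys());
    }
}
```

Because the comparer is supplied per dictionary, the class keeps its value-based Equals() and GetHashCode() for every other use.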
I'm trying to get a hash (md5 or sha) of an object.
I've implemented this:
http://alexmg.com/post/2009/04/16/Compute-any-hash-for-any-object-in-C.aspx
I'm using nHibernate to retrieve my POCOs from a database.
When running GetHash on this, it's different each time it's selected and hydrated from the database. I guess this is expected, as the underlying proxies will change.
Anyway,
Is there a way to get a hash of all the properties on an object, consistently each time?
I've toyed with the idea of using a StringBuilder over this.GetType().GetProperties..... and creating a hash on that, but that seems inefficient?
As a side note, this is for change-tracking these entities from one database (RDBMS) to a NoSQL store
(comparing hash values to see if objects changed between rdbms and nosql)
If you're not overriding GetHashCode you just inherit Object.GetHashCode. For a reference type, Object.GetHashCode returns a value tied to the identity of the instance, not to its contents (it is often described as the memory address, but in practice it is a number assigned by the runtime). Each time an object is loaded from the database a new instance is created, so it will likely get a different hash code.
It's debatable whether that's the correct thing to do; but that's what was implemented "back in the day" so it can't change now.
If you want something consistent then you have to override GetHashCode and create a code based on the "value" of the object (i.e. the properties and/or fields). This can be as simple as a distributed merging of the hash codes of all the properties/fields. Or, it could be as complicated as you need it to be. If all you're looking for is something to differentiate two different objects, then using a unique key on the object might work for you. If you're looking for change tracking, using the unique key for the hash probably isn't going to work.
I simply use all the hash codes of the fields to create a reasonably distributed hash code for the parent object. For example:
public override int GetHashCode()
{
unchecked
{
int result = (Name != null ? Name.GetHashCode() : 0);
result = (result*397) ^ (Street != null ? Street.GetHashCode() : 0);
result = (result*397) ^ Age;
return result;
}
}
Multiplying by a prime such as 397 before mixing in the next field helps spread the bits of the combined value so that fewer distinct objects collide; it does not make the result unique. See http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/ for more details on the use of primes in hash code calculations.
You could, of course, use reflection to get at all the properties to do this, but that would be slower. Alternatively you could use the CodeDOM to generate code dynamically to generate the hash based on reflecting on the properties and cache that code (i.e. generate it once and reload it next time). But, this of course, is very complex and might not be worth the effort.
An MD5 or SHA hash or CRC is generally based on a block of data. If you want that, then using the hash code of each property doesn't make sense. Possibly serializing the data to memory and calculating the hash that way would be more applicable, as Henk describes.
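A minimal sketch of that idea, assuming a hypothetical Customer entity and a very simple hand-written "serialization" (values joined in a fixed order with a separator character). It is one possible way to obtain a digest that depends only on the property values, not the approach from the linked article:

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

// Hypothetical entity used only for illustration.
sealed class Customer
{
    public string Name { get; set; }
    public string Street { get; set; }
    public int Age { get; set; }
}

static class ContentHash
{
    // Builds a canonical string from the property values and hashes its UTF-8 bytes.
    public static string Compute(Customer c)
    {
        // Fixed property order plus a separator that is unlikely to occur in the data.
        string canonical = string.Join("\u001F", new[]
        {
            c.Name ?? "",
            c.Street ?? "",
            c.Age.ToString(CultureInfo.InvariantCulture)
        });
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(canonical));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    static void Main()
    {
        var a = new Customer { Name = "John", Street = "Main St", Age = 30 };
        var b = new Customer { Name = "John", Street = "Main St", Age = 30 };
        Console.WriteLine(Compute(a) == Compute(b));
    }
}
```

Unlike GetHashCode(), such a digest does not depend on the object instance or on the process that computed it, which is what comparing entities across two stores requires.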
If this 'hash' is solely used to determine whether entities have changed then the following algorithm may help (NB it is untested and assumes that the same runtime will be used when generating hashes (otherwise the reliance on GetHashCode for 'simple' types is incorrect)):
public static byte[] Hash<T>(T entity)
{
var seen = new HashSet<object>();
var properties = GetAllSimpleProperties(entity, seen);
return properties.Select(p => BitConverter.GetBytes(p.GetHashCode()).AsEnumerable()).Aggregate((ag, next) => ag.Concat(next)).ToArray();
}
private static IEnumerable<object> GetAllSimpleProperties<T>(T entity, HashSet<object> seen)
{
foreach (var property in PropertiesOf<T>.All(entity))
{
if (property is int || property is long || property is string ...) yield return property;
else if (seen.Add(property)) // Handle cyclic references
{
foreach (var simple in GetAllSimpleProperties(property, seen)) yield return simple;
}
}
}
private static class PropertiesOf<T>
{
private static readonly List<Func<T, dynamic>> Properties = new List<Func<T, dynamic>>();
static PropertiesOf()
{
foreach (var property in typeof(T).GetProperties())
{
var getMethod = property.GetGetMethod();
var function = (Func<T, dynamic>)Delegate.CreateDelegate(typeof(Func<T, dynamic>), getMethod);
Properties.Add(function);
}
}
public static IEnumerable<dynamic> All(T entity)
{
return Properties.Select(p => p(entity)).Where(v => v != null);
}
}
This would then be useable like so:
var entity1 = LoadEntityFromRdbms();
var entity2 = LoadEntityFromNoSql();
var hash1 = Hash(entity1);
var hash2 = Hash(entity2);
Assert.IsTrue(hash1.SequenceEqual(hash2));
GetHashCode() returns an Int32 (not an MD5).
If you create two objects with all the same property values they will not have the same Hash if you use the base or system GetHashCode().
String is an object and an exception.
string s1 = "john";
string s2 = "john";
if (s1 == s2) returns true and will return the same GetHashCode()
If you want to control equality comparison of two objects then you should override both GetHashCode and Equals.
If two objects are equal then they must also have the same GetHashCode(). But two objects with the same GetHashCode() are not necessarily equal. A hash-based comparison will first test GetHashCode() and, if it gets a match there, it will test Equals. OK, there are some comparisons that go straight to Equals, but you should still override both and make sure two equal objects produce the same hash code.
I use this for syncing a client with the server. You could use all the properties, or you could have any property change update the VerID. The advantage here is a simpler, quicker GetHashCode(). In my case I was already resetting the VerID on any property change.
public override bool Equals(Object obj)
{
//Check for null and compare run-time types.
if (obj == null || !(obj is FTSdocWord)) return false;
FTSdocWord item = (FTSdocWord)obj;
return (ObjID == item.ObjID && VerID == item.VerID);
}
public override int GetHashCode()
{
return ObjID ^ VerID;
}
I ended up using ObjID alone so I could do the following
if (myClientObj.Equals(myServerObj) && myClientObj.VerID != myServerObj.VerID)
{
// need to synch
}
Object.GetHashCode Method
Two objects with the same property values. Are they equal? Do they produce the same GetHashCode()?
personDefault pd1 = new personDefault("John");
personDefault pd2 = new personDefault("John");
System.Diagnostics.Debug.WriteLine(pd1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(pd2.GetHashCode().ToString());
// different GetHashCode
if (pd1.Equals(pd2)) // returns false
{
System.Diagnostics.Debug.WriteLine("pd1 == pd2");
}
List<personDefault> personsDefault = new List<personDefault>();
personsDefault.Add(pd1);
if (personsDefault.Contains(pd2)) // returns false
{
System.Diagnostics.Debug.WriteLine("Contains(pd2)");
}
personOverRide po1 = new personOverRide("John");
personOverRide po2 = new personOverRide("John");
System.Diagnostics.Debug.WriteLine(po1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(po2.GetHashCode().ToString());
// same hash
if (po1.Equals(po2)) // returns true
{
System.Diagnostics.Debug.WriteLine("po1 == po2");
}
List<personOverRide> personsOverRide = new List<personOverRide>();
personsOverRide.Add(po1);
if (personsOverRide.Contains(po2)) // returns true
{
System.Diagnostics.Debug.WriteLine("Contains(po2)");
}
}
public class personDefault
{
public string Name { get; private set; }
public personDefault(string name) { Name = name; }
}
public class personOverRide: Object
{
public string Name { get; private set; }
public personOverRide(string name) { Name = name; }
public override bool Equals(Object obj)
{
//Check for null and compare run-time types.
if (obj == null || !(obj is personOverRide)) return false;
personOverRide item = (personOverRide)obj;
return (Name == item.Name);
}
public override int GetHashCode()
{
return Name.GetHashCode();
}
}
I have read that when you override Equals on a class you need to override GetHashCode.
public class Person : IEquatable<Person>
{
public int PersonId { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public Person(int personId, string firstName, string lastName)
{
PersonId = personId;
FirstName = firstName;
LastName = lastName;
}
public bool Equals(Person obj)
{
Person p = obj as Person;
if (ReferenceEquals(null, p))
return false;
if (ReferenceEquals(this, p))
return true;
return Equals(p.FirstName, FirstName) &&
Equals(p.LastName, LastName);
}
}
Now given the following:
public static Dictionary<Person, Person> ObjDic= new Dictionary<Person, Person>();
public static Dictionary<int, Person> PKDic = new Dictionary<int, Person>();
Will not overriding GetHashCode affect both of the dictionaries above? What I am basically asking is: how is the hash code generated? If I look for an object in PKDic, will I still be able to find it based on the PK alone? If I wanted to override GetHashCode, how would I go about doing that?
You should always override GetHashCode.
A Dictionary<int, Person> will function without GetHashCode, but as soon as you call LINQ methods like Distinct or GroupBy, it will stop working.
Note, by the way, that you haven't actually overridden Equals either.
The IEquatable.Equals method is not the same as the virtual bool Equals(object obj) inherited from Object. Although the default IEqualityComparer<T> will use the IEquatable<T> interface if the class implements it, you should still override Equals, because other code might not.
In your case, you should override Equals and GetHashCode like this:
public override bool Equals(object obj) { return Equals(obj as Person); }
public override int GetHashCode() {
return FirstName.GetHashCode() ^ LastName.GetHashCode();
}
In your scenario, not overriding GetHashCode on your type will affect only the first dictionary, as the key is what's used for hashing, not the value.
When looking for the presence of a key, the Dictionary<TKey,TValue> will use the hash code to find out if any keys could be equal. It's important to note that a hash is a value that can determine if two things could be equal or very likely are equal. A hash, strictly speaking cannot determine if two items are equal.
Two equal objects are required to return the same hash code. However, two non-equal objects are not required to return different hash codes. In other words, if the hash codes don't match, you're guaranteed that the objects are not equal. If the hash codes do match, then the objects could be equal.
Because of this, the Dictionary will only call Equals on two objects if their hash codes match.
As to "how to override GetHashCode", that's a complicated question. Classically, a hashing algorithm should balance an even distribution of the codes over the set of values against a low collision rate (a collision is when two non-equal objects produce the same code). This is a simple thing to describe and a very difficult thing to accomplish. It's easy to do one or the other, but hard to balance them.
From a practical perspective (meaning disregarding performance), you could just XOR all of the characters of the first and last names (or even use their respective hash codes, as Joel suggests) as your hash code. This will give a low degree of collision, but won't result in a terribly even distribution. Unless you're dealing with very large sets or very frequent lookups, it won't be an issue.
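As a hedged alternative to hand-rolled XOR, here is a sketch that assumes a runtime where System.HashCode is available (.NET Core 2.1 or later, or the Microsoft.Bcl.HashCode package). PersonName and NameDemo are made-up names. Note that plain XOR of two string hashes is symmetric, so ("John", "Smith") and ("Smith", "John") would collide, whereas HashCode.Combine takes the order of its arguments into account.

```csharp
using System;

// Hypothetical immutable type whose equality is based on both names.
sealed class PersonName : IEquatable<PersonName>
{
    public string FirstName { get; }
    public string LastName { get; }

    public PersonName(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public bool Equals(PersonName other)
    {
        return other != null
            && FirstName == other.FirstName
            && LastName == other.LastName;
    }

    public override bool Equals(object obj) { return Equals(obj as PersonName); }

    // Uses exactly the fields that Equals compares; null fields are tolerated.
    public override int GetHashCode() { return HashCode.Combine(FirstName, LastName); }
}

static class NameDemo
{
    // Returns true when two equal instances also report the same hash code.
    public static bool EqualNamesShareHash()
    {
        var a = new PersonName("Ada", "Lovelace");
        var b = new PersonName("Ada", "Lovelace");
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    static void Main()
    {
        Console.WriteLine(EqualNamesShareHash());
    }
}
```

The values produced this way are only stable within one process, so they are suitable for dictionaries and sets but not for persisting.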
Your GetHashCode() and Equals() methods should look like this:
public override int GetHashCode()
{
    return (FirstName.GetHashCode()+1) ^ (LastName.GetHashCode()+2);
}
public override bool Equals(Object obj)
{
    Person p = obj as Person;
    if (p == null)
        return false;
    return this.FirstName == p.FirstName && this.LastName == p.LastName;
}
The rule is that GetHashCode() must use exactly the fields used in determining equality for the .Equals() method.
As for the dictionary part of your question, .GetHashCode() is used for locating the key in a dictionary. However, this has a different impact for each of the dictionaries in your question.
The dictionary with the int key (presumably your person ID) will use the GetHashCode() for the integer, while the other dictionary (ObjDic) will use the GetHashCode() from your Person object. Therefore PKDic will always differentiate between two people with different IDs, while ObjDic might treat two people with different IDs but the same first and last names as the same record.
Here is how I would do it. Since it is common for two different people to have exactly the same name it makes more sense to use a unique identifier (which you already have).
public class Person : IEquatable<Person>
{
public override int GetHashCode()
{
return PersonId.GetHashCode();
}
public override bool Equals(object obj)
{
var that = obj as Person;
if (that != null)
{
return Equals(that);
}
return false;
}
public bool Equals(Person that)
{
return !ReferenceEquals(that, null) && this.PersonId == that.PersonId;
}
}
To answer your specific question: This only matters if you are using Person as a key in an IDictionary collection. For example, Dictionary<Person, string> or SortedDictionary<Person, Foo>, but not Dictionary<int, Person>.
This question comes out of the discussion on tuples.
I started thinking about the hash code that a tuple should have.
What if we accept the KeyValuePair class as a tuple? It doesn't override the GetHashCode() method, so it probably won't be aware of the hash codes of its "children"... So the run-time will call Object.GetHashCode(), which is not aware of the real object structure.
Then we can make two instances of some reference type which are actually equal, because of the overridden GetHashCode() and Equals(), and use them as "children" in tuples to "cheat" the dictionary.
But it doesn't work! The run-time somehow figures out the structure of our tuple and calls the overridden GetHashCode of our class!
How does it work? What analysis does Object.GetHashCode() perform?
Can it affect performance in some bad scenario, when we use complicated keys? (Probably an impossible scenario... but still.)
Consider this code as an example:
namespace csharp_tricks
{
class Program
{
class MyClass
{
int keyValue;
int someInfo;
public MyClass(int key, int info)
{
keyValue = key;
someInfo = info;
}
public override bool Equals(object obj)
{
MyClass other = obj as MyClass;
if (other == null) return false;
return keyValue.Equals(other.keyValue);
}
public override int GetHashCode()
{
return keyValue.GetHashCode();
}
}
static void Main(string[] args)
{
Dictionary<object, object> dict = new Dictionary<object, object>();
dict.Add(new KeyValuePair<MyClass,object>(new MyClass(1, 1), 1), 1);
//here we get the exception -- an item with the same key was already added
//but how did it figure out the hash code?
dict.Add(new KeyValuePair<MyClass,object>(new MyClass(1, 2), 1), 1);
return;
}
}
}
Update I think I've found an explanation for this as stated below in my answer. The main outcomes of it are:
Be careful with your keys and their hash codes :-)
For complicated dictionary keys you must override Equals() and GetHashCode() correctly.
Don't override GetHashCode() and Equals() on mutable classes; only override them on immutable classes or structures. Otherwise, if you modify an object that is used as a key, the hash table won't function properly anymore (you won't be able to retrieve the value associated with the key after the key object was modified).
Also, hash tables don't use hash codes to identify objects; they use the key objects themselves as identifiers. It's not required that all keys used to add entries to a hash table return different hash codes, but it is recommended that they do, otherwise performance suffers greatly.
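The warning about mutable keys can be illustrated with a small sketch (MutableKey and MutableKeyDemo are hypothetical names). The key's hash code depends on a settable property; once that property changes after insertion, the entry can no longer be found, either through the mutated key or through a fresh key carrying the old value:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable key whose hash code depends on a settable property.
sealed class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode() { return Id; }
}

static class MutableKeyDemo
{
    // Returns whether the entry can still be found after the key was mutated in place.
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var dict = new Dictionary<MutableKey, string> { { key, "value" } };

        key.Id = 2; // the entry was stored under the hash code for Id == 1

        bool foundByMutatedKey = dict.ContainsKey(key);
        bool foundByOldValue = dict.ContainsKey(new MutableKey { Id = 1 });
        return foundByMutatedKey || foundByOldValue;
    }

    static void Main()
    {
        Console.WriteLine(FoundAfterMutation());
    }
}
```

The entry is still counted in dict.Count; it is just unreachable by key, which is why immutable key types are the safer choice.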
Here are the proper Hash and equality implementations for the Quad tuple (contains 4 tuple components inside). This code ensures proper usage of this specific tuple in HashSets and the dictionaries.
More on the subject (including the source code) here.
Note the use of the unchecked keyword (so that arithmetic overflow wraps around instead of throwing) and that Equals returns false when obj is null: the contract of Object.Equals is that x.Equals(null) returns false and that implementations do not throw.
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj))
return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != typeof (Quad<T1, T2, T3, T4>)) return false;
return Equals((Quad<T1, T2, T3, T4>) obj);
}
public bool Equals(Quad<T1, T2, T3, T4> obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
return Equals(obj.Item1, Item1)
&& Equals(obj.Item2, Item2)
&& Equals(obj.Item3, Item3)
&& Equals(obj.Item4, Item4);
}
public override int GetHashCode()
{
unchecked
{
int result = Item1.GetHashCode();
result = (result*397) ^ Item2.GetHashCode();
result = (result*397) ^ Item3.GetHashCode();
result = (result*397) ^ Item4.GetHashCode();
return result;
}
}
public static bool operator ==(Quad<T1, T2, T3, T4> left, Quad<T1, T2, T3, T4> right)
{
return Equals(left, right);
}
public static bool operator !=(Quad<T1, T2, T3, T4> left, Quad<T1, T2, T3, T4> right)
{
return !Equals(left, right);
}
Check out this post by Brad Abrams and also the comment by Brian Grunkemeyer for some more information on how object.GetHashCode works. Also, take a look at the first comment on Ayende's blog post. I don't know if the current releases of the Framework still follow these rules or if they have actually changed it like Brad implied.
It seems that I have a clue now.
I thought KeyValuePair is a reference type, but it is not, it is a struct. And so it uses ValueType.GetHashCode() method. MSDN for it says: "One or more fields of the derived type is used to calculate the return value".
If you take a real reference type as a "tuple-provider", you'll cheat the dictionary (or yourself...).
using System.Collections.Generic;
namespace csharp_tricks
{
class Program
{
class MyClass
{
int keyValue;
int someInfo;
public MyClass(int key, int info)
{
keyValue = key;
someInfo = info;
}
public override bool Equals(object obj)
{
MyClass other = obj as MyClass;
if (other == null) return false;
return keyValue.Equals(other.keyValue);
}
public override int GetHashCode()
{
return keyValue.GetHashCode();
}
}
class Pair<T, R>
{
public T First { get; set; }
public R Second { get; set; }
}
static void Main(string[] args)
{
var dict = new Dictionary<Pair<int, MyClass>, object>();
dict.Add(new Pair<int, MyClass>() { First = 1, Second = new MyClass(1, 2) }, 1);
//this is a pair of the same values as previous! but... no exception this time...
dict.Add(new Pair<int, MyClass>() { First = 1, Second = new MyClass(1, 3) }, 1);
return;
}
}
}
I don't have the book reference anymore, and I'll have to find it just to confirm, but I thought the default base hash just hashed together all of the members of your object. It got access to them because of the way the CLR worked, so it wasn't something that you could write as well as they had.
That is completely from memory of something I briefly read so take it for what you will.
Edit: The book was Inside C# from MS Press. The one with the saw blade on the cover. The author spent a good deal of time explaining how things were implemented in the CLR, how the language translated down to MSIL, etc. If you can find the book it's not a bad read.
Edit: From the link provided it looks like
Object.GetHashCode() uses an
internal field in the System.Object class to generate the hash value. Each
object created is assigned a unique object key, stored as an integer,when it
is created. These keys start at 1 and increment every time a new object of
any type gets created.
Hmm I guess I need to write a few of my own hash codes, if I expect to use objects as hash keys.
so probably it won't be aware of the hash codes of it's "children".
Your example seems to prove otherwise :-) The hash code for the key MyClass and the value 1 is the same for both KeyValuePairs. The KeyValuePair implementation must be using both its Key and Value for its own hash code.
Moving up, the dictionary class wants unique keys. It is using the hashcode provided by each key to figure things out. Remember that the runtime isn't calling Object.GetHashCode(), but it is calling the GetHashCode() implementation provided by the instance you give it.
Consider a more complex case:
public class HappyClass
{
enum TheUnit
{
Points,
Picas,
Inches
}
class MyDistanceClass
{
int distance;
TheUnit units;
public MyDistanceClass(int theDistance, TheUnit unit)
{
distance = theDistance;
units = unit;
}
public static int ConvertDistance(int oldDistance, TheUnit oldUnit, TheUnit newUnit)
{
// insert real unit conversion code here :-)
return oldDistance * 100;
}
/// <summary>
/// Figure out if we are equal distance, converting into the same units of measurement if we have to
/// </summary>
/// <param name="obj">the other guy</param>
/// <returns>true if we are the same distance</returns>
public override bool Equals(object obj)
{
MyDistanceClass other = obj as MyDistanceClass;
if (other == null) return false;
if (other.units != this.units)
{
int newDistance = MyDistanceClass.ConvertDistance(other.distance, other.units, this.units);
return distance.Equals(newDistance);
}
else
{
return distance.Equals(other.distance);
}
}
public override int GetHashCode()
{
// even if the distance is equal in spite of the different units, the objects are not
return distance.GetHashCode() * units.GetHashCode();
}
}
static void Main(string[] args)
{
// these are the same distance... 72 points = 1 inch
MyDistanceClass distPoint = new MyDistanceClass(72, TheUnit.Points);
MyDistanceClass distInch = new MyDistanceClass(1, TheUnit.Inches);
Debug.Assert(distPoint.Equals(distInch), "these should be true!");
Debug.Assert(distPoint.GetHashCode() != distInch.GetHashCode(), "But yet they are fundamentally different values");
Dictionary<object, object> dict = new Dictionary<object, object>();
dict.Add(new KeyValuePair<MyDistanceClass, object>(distPoint, 1), 1);
//this should not barf
dict.Add(new KeyValuePair<MyDistanceClass, object>(distInch, 1), 1);
return;
}
}
Basically... in the case of my example, you'd want two objects that are the same distance to return "true" for Equals, but yet return different hash codes.