HashSet item can be changed into same item in Set - c#

I have a Node class :
public class Node : INode
{
public object Value { get; set; }
}
And I have EqualityComparer for this Node class like this :
public class INodeEqualityComparer : EqualityComparer<INode>
{
private INodeEqualityComparer()
{
}
private static readonly INodeEqualityComparer _instance =
new INodeEqualityComparer();
public static INodeEqualityComparer Instance
{
get { return _instance; }
}
public override bool Equals(INode x, INode y)
{
return (int)(x.Value) == (int)(y.Value);
}
public override int GetHashCode(INode obj)
{
return ((int)(obj.Value)).GetHashCode();
}
}
I create my HashSet by passing the NodeEqualityComparer.
I have 4 Node instances :
Node n1 = new Node(1);
Node n2 = new Node(2);
Node n3 = new Node(3);
Node n4 = new Node(1);
When I add n1,n2,n3,n4 to my hashset , n4 get ignored.
HashSet<INode> nodes = new HashSet<INode>(INodeEqualityComparer.Instance);
nodes.Add(n1);
nodes.Add(n2);
nodes.Add(n3);
nodes.Add(n4);
BUT after I use this changing :
nodes.Where(n => (int)(n.Value) == 3).FirstOrDefault().Value = 1;
there will be 2 elements that are equal together (value=1) based on NodeEqualityComparer. those are n1 and n3.
WHY the hashset does not prevent updating node or remove it ?

This is by design: hashing collections (whether it's dictionaries or hash sets or any other) assume that an object's hash code does not change after it's been inserted in the collection. And since two objects that are considered equal must also have the same hash code, then it also means that whatever its Equals implementation returns must not change for the same parameter.
This applies to what is hashed: in dictionaries, that's the key. In sets, that's the entire object.
The .NET documentation states:
In general, for mutable reference types, you should override GetHashCode() only if:
You can compute the hash code from fields that are not mutable; or
You can ensure that the hash code of a mutable object does not change while the object is contained in a collection that relies on its hash code.
In your Node class, you use a mutable property (Value) to compute the hash code. That's usually a bad idea, and it's actually something that ReSharper will warn against. Then again, overriding Equals and GetHashCode normally means that you're treating the type as a "value" rather than an "entity", and values should be treated as immutable when possible.
If you can't make your object immutable, don't store it in a hash collection.

Related

How do I make structural equality to work on collection properties in C#?

One of the great advantages is supposed to be value based/structural equality, but how do I get that to work with collection properties?
Concrete simple example:
public record Something(string Id);
public record Sample(List<Something> something);
With the above records I would expect the following test to pass:
[Fact]
public void Test()
{
var x = new Sample(new List<Something>() {
new Something("x1")
});
var y = new Sample(new List<Something>() {
new Something("x1")
});
Assert.Equal(x, y);
}
I understand that it is because of List being a reference type, but does it exist a collection that implements value based comparison? Basically I would like to do a "deep" value based comparison.
Records don't do this automatically, but you can implement the Equals method yourself:
public record Sample(List<Something> something) : IEquatable<Sample>
{
public virtual bool Equals(Sample? other) =>
other != null &&
Enumerable.SequenceEqual(something, other.something);
}
But note that GetHashCode should be overridden to be consistent with Equals. See also implement GetHashCode() for objects that contain collections

Use HashSet as Dictionary Key - Compare all elements

I am checking if a total group of edges already contains the connection between 2 points.
I want to use HashSet's that will contain 2 vectors as Dictionary keys. Then I want to be able to call a performant Dictionary.ContainsKey(hashSet). I want the contains/equality check to be dependent on the Vectors in the Set.
Fex. If I add HashSet [V000 V001] to the Dict. I want to get Dictionary.ContainsKey(HashSet [V001 V000]) return true. (HashSet, so the order can vary, just the same Elements)
The Problem seems to be, that the Dictionary.ContainsKey() method does see separately created HashSets as different objects, even though, they contain the same elements.
Dictionary<HashSet<Vector3>, Vector3> d = new Dictionary<HashSet<Vector3>, Vector3>();
HashSet<Vector3> s = new HashSet<Vector3>();
s.Add(Vector3.one);
s.Add(Vector3.zero);
d.Add(s);
HashSet<Vector3> s2 = new HashSet<Vector3>();
s2.Add(Vector3.zero);
s2.Add(Vector3.one);
bool doesContain = d.ContainsKey(s2); // should be true
You also may suggest a better way of doing this 'Contains()' check efficiently.
The HashSet type doesn't do the equality comparison you want out of the box. It only has reference equality.
To get what you want, you'll need a new type to use as the Dictionary key. The new type will have a HashSet property, and overload Equals() and GetHashCode(), and may as well implement IEquatable at this point as well.
I'll get you started:
public class HashKey<T> : IEquatable<HashKey<T>>
{
private HashSet<T> _items;
public HashSet<T> Items
{
get {return _items;}
private set {_items = value;}
}
public HashKey()
{
_items = new HashSet<T>();
}
public HashKey(HashSet<T> initialSet)
{
_items = initialSet ?? new HashSet();
}
public override int GetHashCode()
{
// I'm leaving this for you to do
}
public override bool Equals(Object obj)
{
if (! (obj is HashKey)) return false;
return this.GetHashCode().Equals(obj.GetHashCode());
}
public bool Equals(HashSet<T> obj)
{
if (obj is null) return false;
return this.GetHashCode().Equals(obj.GetHashCode());
}
}
You want to use a hashset as key.
So the keys are references where one key is one hashset reference.
The ContainsKey compare references.
For what you want to do, you can create a class that implements IEqualityComparer to pass it to the dictionary constructor.
https://learn.microsoft.com/dotnet/api/system.collections.generic.iequalitycomparer-1
If you want a full management, you should create a new class embedding the dictionary and implement your own public operations wrapping that of the dictionary : ContainsKey and all others methods you need.
public class MyDictionary : IEnumerable<>
{
private Dictionary<HashSet<Vector3>, Vector3> d
= new Dictionary<HashSet<Vector3>, Vector3>();
public int Count { get; }
public this...
public ContainsKey()
{
// implements your own comparison algorithm
}
public Add();
public Remove();
...
}
So you will have a strongly typed dictionary for your intended usage.

C# Dictionary with a class as key

My question is basically the opposite of Dictionary.ContainsKey return False, but a want True and of "the given key was not present in the dictionary" error when using a self-defined class as key:
I want to use a medium-sized class as the dictionary's key, and the dictionary must compare the keys by reference, not by value equality. The problem is, that the class already implements Equals() (which is performing value equality - which is what not what I want here).
Here's a small test class for reproduction:
class CTest
{
public int m_iValue;
public CTest (int i_iValue)
{
m_iValue = i_iValue;
}
public override bool Equals (object i_value)
{
if (ReferenceEquals (null, i_value))
return false;
if (ReferenceEquals (this, i_value))
return true;
if (i_value.GetType () != GetType ())
return false;
return m_iValue == ((CTest)i_value).m_iValue;
}
}
I have NOT yet implemented GetHashCode() (actually I have, but it only returns base.GetHashCode() so far).
Now I created a test program with a dictionary that uses instances of this class as keys. I can add multiple identical instances to the dictionary without problems, but this only works because GetHashCode() returns different values:
private static void Main ()
{
var oTest1 = new CTest (1);
var oTest2 = new CTest (1);
bool bEquals = Equals (oTest1, oTest2); // true
var dict = new Dictionary<CTest, int> ();
dict.Add (oTest1, 1);
dict.Add (oTest2, 2); // works
var iValue1 = dict[oTest1]; // correctly returns 1
var iValue2 = dict[oTest2]; // correctly returns 2
int iH1 = oTest1.GetHashCode (); // values different on each execution
int iH2 = oTest2.GetHashCode (); // values different on each execution, but never equals iH1
}
And the hash values are different every time, maybe because the calculatation in object.GetHashCode() uses some randomization or some numbers that come from the reference handle (which is different for each object).
However, this answer on Why is it important to override GetHashCode when Equals method is overridden? says that GetHashCode() must return the same values for equal objects, so I added
public override int GetHashCode ()
{
return m_iValue;
}
After that, I could not add multiple equal objects to the dictionary any more.
Now, there are two conclusions:
If I removed my own GetHashCode() again, the hash values will be different again and the dictionary can be used. But there may be situations that accidentally give the same hash code for two equal objects, which will cause an exception at runtime, whose cause will for sure never be found. Because of that (little, but not zero) risk, I cannot use a dictionary.
If I correctly implement GetHashCode() like I am supposed to do, I cannot use a dictionary anyway.
What possibilities exist to still use a dictionary?
Like many times before, I had the idea for a solution when writing this question.
You can specify an IEqualityComparer<TKey> in the constructor of the dictionary. There is one in the .net framework, but it's internal sealed, so you need to implement your own:
Is there any kind of "ReferenceComparer" in .NET?
internal class ReferenceComparer<T> : IEqualityComparer<T> where T : class
{
static ReferenceComparer ()
{
Instance = new ReferenceComparer<T> ();
}
public static ReferenceComparer<T> Instance { get; }
public bool Equals (T x, T y)
{
return ReferenceEquals (x, y);
}
public int GetHashCode (T obj)
{
return System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode (obj);
}
}

Generate hash of object consistently

I'm trying to get a hash (md5 or sha) of an object.
I've implemented this:
http://alexmg.com/post/2009/04/16/Compute-any-hash-for-any-object-in-C.aspx
I'm using nHibernate to retrieve my POCOs from a database.
When running GetHash on this, it's different each time it's selected and hydrated from the database. I guess this is expected, as the underlying proxies will change.
Anyway,
Is there a way to get a hash of all the properties on an object, consistently each time?
I've toyed with the idea of using a StringBuilder over this.GetType().GetProperties..... and creating a hash on that, but that seems inefficient?
As a side note, this is for change-tracking these entities from one database (RDBMS) to a NoSQL store
(comparing hash values to see if objects changed between rdbms and nosql)
If you're not overriding GetHashCode you just inherit Object.GetHashCode. Object.GetHashCode basically just returns the memory address of the instance, if it's a reference object. Of course, each time an object is loaded it will likely be loaded into a different part of memory and thus result in a different hash code.
It's debatable whether that's the correct thing to do; but that's what was implemented "back in the day" so it can't change now.
If you want something consistent then you have to override GetHashCode and create a code based on the "value" of the object (i.e. the properties and/or fields). This can be as simple as a distributed merging of the hash codes of all the properties/fields. Or, it could be as complicated as you need it to be. If all you're looking for is something to differentiate two different objects, then using a unique key on the object might work for you.If you're looking for change tracking, using the unique key for the hash probably isn't going to work
I simply use all the hash codes of the fields to create a reasonably distributed hash code for the parent object. For example:
public override int GetHashCode()
{
unchecked
{
int result = (Name != null ? Name.GetHashCode() : 0);
result = (result*397) ^ (Street != null ? Street.GetHashCode() : 0);
result = (result*397) ^ Age;
return result;
}
}
The use of the prime number 397 is to generate a unique number for a value to better distribute the hash code. See http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/ for more details on the use of primes in hash code calculations.
You could, of course, use reflection to get at all the properties to do this, but that would be slower. Alternatively you could use the CodeDOM to generate code dynamically to generate the hash based on reflecting on the properties and cache that code (i.e. generate it once and reload it next time). But, this of course, is very complex and might not be worth the effort.
An MD5 or SHA hash or CRC is generally based on a block of data. If you want that, then using the hash code of each property doesn't make sense. Possibly serializing the data to memory and calculating the hash that way would be more applicable, as Henk describes.
If this 'hash' is solely used to determine whether entities have changed then the following algorithm may help (NB it is untested and assumes that the same runtime will be used when generating hashes (otherwise the reliance on GetHashCode for 'simple' types is incorrect)):
public static byte[] Hash<T>(T entity)
{
var seen = new HashSet<object>();
var properties = GetAllSimpleProperties(entity, seen);
return properties.Select(p => BitConverter.GetBytes(p.GetHashCode()).AsEnumerable()).Aggregate((ag, next) => ag.Concat(next)).ToArray();
}
private static IEnumerable<object> GetAllSimpleProperties<T>(T entity, HashSet<object> seen)
{
foreach (var property in PropertiesOf<T>.All(entity))
{
if (property is int || property is long || property is string ...) yield return property;
else if (seen.Add(property)) // Handle cyclic references
{
foreach (var simple in GetAllSimpleProperties(property, seen)) yield return simple;
}
}
}
private static class PropertiesOf<T>
{
private static readonly List<Func<T, dynamic>> Properties = new List<Func<T, dynamic>>();
static PropertiesOf()
{
foreach (var property in typeof(T).GetProperties())
{
var getMethod = property.GetGetMethod();
var function = (Func<T, dynamic>)Delegate.CreateDelegate(typeof(Func<T, dynamic>), getMethod);
Properties.Add(function);
}
}
public static IEnumerable<dynamic> All(T entity)
{
return Properties.Select(p => p(entity)).Where(v => v != null);
}
}
This would then be useable like so:
var entity1 = LoadEntityFromRdbms();
var entity2 = LoadEntityFromNoSql();
var hash1 = Hash(entity1);
var hash2 = Hash(entity2);
Assert.IsTrue(hash1.SequenceEqual(hash2));
GetHashCode() returns an Int32 (not an MD5).
If you create two objects with all the same property values they will not have the same Hash if you use the base or system GetHashCode().
String is an object and an exception.
string s1 = "john";
string s2 = "john";
if (s1 == s2) returns true and will return the same GetHashCode()
If you want to control equality comparison of two objects then you should override the GetHash and Equality.
If two object are the same then they must also have the same GetHash(). But two objects with the same GetHash() are not necessarily the same. A comparison will first test the GetHash() and if it gets a match there it will test the Equals. OK there are some comparisons that go straight to Equals but you should still override both and make sure two identical objects produce the same GetHash.
I use this for syncing a client with the server. You could use all the Properties or you could have any Property change change the VerID. The advantage here is a simpler quicker GetHashCode(). In my case I was resetting the VerID with any Property change already.
public override bool Equals(Object obj)
{
//Check for null and compare run-time types.
if (obj == null || !(obj is FTSdocWord)) return false;
FTSdocWord item = (FTSdocWord)obj;
return (OjbID == item.ObjID && VerID == item.VerID);
}
public override int GetHashCode()
{
return ObjID ^ VerID;
}
I ended up using ObjID alone so I could do the following
if (myClientObj == myServerObj && myClientObj.VerID <> myServerObj.VerID)
{
// need to synch
}
Object.GetHashCode Method
Two objects with the same property values. Are they equal? Do they produce the same GetHashCode()?
personDefault pd1 = new personDefault("John");
personDefault pd2 = new personDefault("John");
System.Diagnostics.Debug.WriteLine(po1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(po2.GetHashCode().ToString());
// different GetHashCode
if (pd1.Equals(pd2)) // returns false
{
System.Diagnostics.Debug.WriteLine("pd1 == pd2");
}
List<personDefault> personsDefault = new List<personDefault>();
personsDefault.Add(pd1);
if (personsDefault.Contains(pd2)) // returns false
{
System.Diagnostics.Debug.WriteLine("Contains(pd2)");
}
personOverRide po1 = new personOverRide("John");
personOverRide po2 = new personOverRide("John");
System.Diagnostics.Debug.WriteLine(po1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(po2.GetHashCode().ToString());
// same hash
if (po1.Equals(po2)) // returns true
{
System.Diagnostics.Debug.WriteLine("po1 == po2");
}
List<personOverRide> personsOverRide = new List<personOverRide>();
personsOverRide.Add(po1);
if (personsOverRide.Contains(po2)) // returns true
{
System.Diagnostics.Debug.WriteLine("Contains(p02)");
}
}
public class personDefault
{
public string Name { get; private set; }
public personDefault(string name) { Name = name; }
}
public class personOverRide: Object
{
public string Name { get; private set; }
public personOverRide(string name) { Name = name; }
public override bool Equals(Object obj)
{
//Check for null and compare run-time types.
if (obj == null || !(obj is personOverRide)) return false;
personOverRide item = (personOverRide)obj;
return (Name == item.Name);
}
public override int GetHashCode()
{
return Name.GetHashCode();
}
}

Custom Class used as key in Dictionary but key not found

I have a class, show below, which is used as a key in a Dictionary<ValuesAandB, string>
I'm having issues when trying to find any key within this dictionary, it never finds it at all. As you can see, I have overridden Equals and GetHashCode.
To look for the key I'm using
ValuesAandB key = new ValuesAandB(A,B);
if (DictionaryName.ContainsKey(key)) {
...
}
Is there anything else that I'm missing? Can anyone point out what I'm doing wrong?
private class ValuesAandB {
public string valueA;
public string valueB;
// Constructor
public ValuesAandB (string valueAIn, string valueBIn) {
valueA = valueAIn;
valueB = ValueBIn;
}
public class EqualityComparer : IEqualityComparer<ValuesAandB> {
public bool Equals(ValuesAandB x, ValuesAandB y) {
return ((x.valueA.Equals(y.valueA)) && (x.valueB.Equals(y.valueB)));
}
public int GetHashCode(ValuesAandB x) {
return x.valueA.GetHashCode() ^ x.valueB.GetHashCode();
}
}
}
And before anyone asks, yes the values are in the Dictionary!
You have not overridden Equals and GetHashCode. You have implemented a second class which can serve as an EqualityComparer. If you don't construct the Dictionary with the EqualityComparer, it will not be used.
The simplest fix would be to override GetHashCode and Equals directly rather than implementing a comparer (comparers are generally only interesting when you need to supply multiple different comparison types (case sensitive and case insensitive for example) or when you need to be able to perform comparisons on a class which you don't control.
I had this problem, turns out the dictionary was comparing referances for my key, not the values in the object.
I was using a custom Point class as keys. I overrode the ToString() and the GetHashCode() methods and viola, key lookup worked fine.
It looks like you're comparing two strings. Iirc, when using .Equals(), you are comparing the reference of the strings, not the actual contents. To implement an EqualityComparer that works with strings, you would want to use the String.Compare() method.
public class EqualityComparer : IEqualityComparer<ValuesAandB>
{
public bool Equals(ValuesAandB x, ValuesAandB y)
{
return ((String.Compare(x.valueA,y.valueA) == 0) &&
(String.Compare(x.valueB, y.valueB) == 0));
}
// gethashcode stuff here
}
I could be a little off with the code, that should get you close...

Categories

Resources