How to compare two objects when you can't override Equals? - c#

I have an object modelNoBend of type CalculationModel and I have serialized it into a JSON and saved it in a .txt file using the below:
private static void GenerateTextFileNoBend(string path, CalculationModel model)
{
if (!File.Exists(path)) {
using (var file = File.CreateText(path + "noBend.txt")) {
var json = JsonConvert.SerializeObject(model);
file.Write(json);
}
}
}
Then, I wanted to deserialize that JSON and check with the original object whether they are the same or not.
static void Main(string[] args)
{
GenerateTextFileNoBend(path, modelNoBend);
var jsonText = File.ReadAllText(@"D:\5113\noBend.txt");
CalculationModel model = JsonConvert.DeserializeObject<CalculationModel>(jsonText);
string one = JsonConvert.SerializeObject(modelNoBend);
string two = JsonConvert.SerializeObject(model);
if (model.Equals(modelNoBend)) {
Console.Write("Yes");
}
if (one.Equals(two)) {
Console.Write("Yes");
}
}
if (model.Equals(modelNoBend)) - False
if (one.Equals(two)) - True
If I compare the two objects, .Equals() returns false. However, if I serialize them both once again and compare the strings, the comparison returns true.
Something I left out of my last post is that I cannot edit the CalculationModel class. This means that I cannot follow the answers to that question, since I can neither override Equals nor use something else like IEqualityComparer, as it requires the class to implement IComparable.
Is there any workaround for this, without me having to touch that class?

Well, since CalculationModel doesn't override Equals and GetHashCode, model and modelNoBend are
compared by reference. model and modelNoBend are two distinct instances, which is
why they are considered unequal.
You can't implement a custom Equals and GetHashCode on the class, but you can implement a comparer (IEqualityComparer<T> does not require the compared class to implement anything):
public class CalculationModelComparer : IEqualityComparer<CalculationModel> {
public bool Equals(CalculationModel x, CalculationModel y) {
if (ReferenceEquals(x, y))
return true;
if (null == x || null == y)
return false;
// I have nothing but serialization data to compare
//TODO: put smarter code: compare properties and fields here
return string.Equals(
JsonConvert.SerializeObject(x),
JsonConvert.SerializeObject(y));
}
public int GetHashCode(CalculationModel obj) {
if (null == obj)
return 0;
// I have nothing but serialization data to compare and
// serialization is a long process... So we can put either 1 or
// JsonConvert.SerializeObject(obj).GetHashCode();
//TODO: put smarter code: compute hash from properties and fields
return 1;
}
public static IEqualityComparer<CalculationModel> Instance { get; } =
new CalculationModelComparer();
}
Then use it:
if (CalculationModelComparer.Instance.Equals(model, modelNoBend)) {
...
}

I hope this answer does not sound flippant; it is not meant to be.
If your use case is just this small case, or the class being checked for equality is reasonably small in terms of fields, it's totally legit to just code a method that compares the fields as needed.
There are plenty of cases that support something like this, especially in tight loops for performance.
That said, @Dmitry's answer is correct. I offer this as an alternative when it makes sense.
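A minimal sketch of that field-by-field approach follows. The real CalculationModel is not shown in the question, so the class below and its members (Name, Length, Loads) are hypothetical stand-ins; substitute whatever the actual class exposes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the class that cannot be modified.
public class CalculationModel
{
    public string Name { get; set; }
    public double Length { get; set; }
    public List<double> Loads { get; set; } = new List<double>();
}

public static class CalculationModelComparison
{
    // Compares two models member by member, without touching the class itself.
    public static bool AreEqual(CalculationModel x, CalculationModel y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;

        // Assumption: a null list is treated the same as an empty one.
        return x.Name == y.Name
            && x.Length.Equals(y.Length)
            && (x.Loads ?? Enumerable.Empty<double>())
                   .SequenceEqual(y.Loads ?? Enumerable.Empty<double>());
    }
}
```

It would be called as CalculationModelComparison.AreEqual(model, modelNoBend). The same body could also back the Equals method of an IEqualityComparer<CalculationModel>.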

Related

How do I make structural equality to work on collection properties in C#?

One of the great advantages of records is supposed to be value-based/structural equality, but how do I get that to work with collection properties?
Concrete simple example:
public record Something(string Id);
public record Sample(List<Something> something);
With the above records I would expect the following test to pass:
[Fact]
public void Test()
{
var x = new Sample(new List<Something>() {
new Something("x1")
});
var y = new Sample(new List<Something>() {
new Something("x1")
});
Assert.Equal(x, y);
}
I understand that it is because List is a reference type, but is there a collection that implements value-based comparison? Basically I would like to do a "deep" value-based comparison.
Records don't do this automatically, but you can implement the Equals method yourself:
public record Sample(List<Something> something) : IEquatable<Sample>
{
public virtual bool Equals(Sample? other) =>
other != null &&
Enumerable.SequenceEqual(something, other.something);
}
But note that GetHashCode should be overridden to be consistent with Equals. See also "implement GetHashCode() for objects that contain collections".
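As a sketch of what a consistent pair could look like for the records above, the version below combines the element hashes in order using System.HashCode. It assumes a runtime where System.HashCode is available (.NET Core 2.1 or later) and that the list is never null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record Something(string Id);

public record Sample(List<Something> something)
{
    // Element-wise comparison instead of comparing the list references.
    public virtual bool Equals(Sample? other) =>
        other is not null && something.SequenceEqual(other.something);

    // Combine the element hashes in order, so that samples which are
    // equal according to Equals also produce the same hash code.
    public override int GetHashCode()
    {
        var hash = new HashCode();
        foreach (var item in something)
            hash.Add(item);
        return hash.ToHashCode();
    }
}
```

With this in place, two Sample instances holding equal elements in the same order compare equal and can be used as dictionary keys or set members, provided the list is not mutated while they are stored.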

Generate hash of object consistently

I'm trying to get a hash (md5 or sha) of an object.
I've implemented this:
http://alexmg.com/post/2009/04/16/Compute-any-hash-for-any-object-in-C.aspx
I'm using nHibernate to retrieve my POCOs from a database.
When running GetHash on this, it's different each time it's selected and hydrated from the database. I guess this is expected, as the underlying proxies will change.
Anyway,
Is there a way to get a hash of all the properties on an object, consistently each time?
I've toyed with the idea of using a StringBuilder over this.GetType().GetProperties..... and creating a hash on that, but that seems inefficient?
As a side note, this is for change-tracking these entities from one database (RDBMS) to a NoSQL store
(comparing hash values to see if objects changed between rdbms and nosql)
If you're not overriding GetHashCode, you just inherit Object.GetHashCode. For a reference type, Object.GetHashCode returns a value tied to the identity of the instance, not to its contents. Each time an object is loaded from the database, a new instance is created, so it will most likely get a different hash code.
It's debatable whether that's the correct thing to do; but that's what was implemented "back in the day" so it can't change now.
If you want something consistent, then you have to override GetHashCode and create a code based on the "value" of the object (i.e. the properties and/or fields). This can be as simple as merging the hash codes of all the properties/fields in a well-distributed way. Or it could be as complicated as you need it to be. If all you're looking for is something to differentiate two different objects, then using a unique key on the object might work for you. If you're looking for change tracking, using the unique key for the hash probably isn't going to work.
I simply use all the hash codes of the fields to create a reasonably distributed hash code for the parent object. For example:
public override int GetHashCode()
{
unchecked
{
int result = (Name != null ? Name.GetHashCode() : 0);
result = (result*397) ^ (Street != null ? Street.GetHashCode() : 0);
result = (result*397) ^ Age;
return result;
}
}
Multiplying by the prime number 397 before combining each value helps spread the resulting hash codes more evenly and reduces collisions; it does not make them unique. See http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/ for more details on the use of primes in hash code calculations.
You could, of course, use reflection to get at all the properties to do this, but that would be slower. Alternatively you could use the CodeDOM to generate code dynamically to generate the hash based on reflecting on the properties and cache that code (i.e. generate it once and reload it next time). But, this of course, is very complex and might not be worth the effort.
An MD5 or SHA hash or CRC is generally based on a block of data. If you want that, then using the hash code of each property doesn't make sense. Possibly serializing the data to memory and calculating the hash that way would be more applicable, as Henk describes.
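A rough sketch of the serialize-then-hash idea is shown below. It uses System.Text.Json and SHA-256 purely as an illustration; the ContentHash class name is made up, and the approach assumes the serializer emits properties in a stable order and that you serialize the plain entity type rather than an NHibernate proxy.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

public static class ContentHash
{
    // Serializes the entity to JSON and hashes the UTF-8 bytes with SHA-256.
    // Returns the digest as an uppercase hexadecimal string.
    public static string Compute<T>(T entity)
    {
        string json = JsonSerializer.Serialize(entity);
        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(json));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }
}
```

Two objects with the same property values then yield the same string regardless of where they live in memory, which is what change tracking between two stores needs. Unlike GetHashCode, the result does not depend on the runtime's per-process string hashing.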
If this 'hash' is solely used to determine whether entities have changed then the following algorithm may help (NB it is untested and assumes that the same runtime will be used when generating hashes (otherwise the reliance on GetHashCode for 'simple' types is incorrect)):
public static byte[] Hash<T>(T entity)
{
var seen = new HashSet<object>();
var properties = GetAllSimpleProperties(entity, seen);
return properties.Select(p => BitConverter.GetBytes(p.GetHashCode()).AsEnumerable()).Aggregate((ag, next) => ag.Concat(next)).ToArray();
}
private static IEnumerable<object> GetAllSimpleProperties<T>(T entity, HashSet<object> seen)
{
foreach (var property in PropertiesOf<T>.All(entity))
{
if (property is int || property is long || property is string ...) yield return property;
else if (seen.Add(property)) // Handle cyclic references
{
foreach (var simple in GetAllSimpleProperties(property, seen)) yield return simple;
}
}
}
private static class PropertiesOf<T>
{
private static readonly List<Func<T, dynamic>> Properties = new List<Func<T, dynamic>>();
static PropertiesOf()
{
foreach (var property in typeof(T).GetProperties())
{
var getMethod = property.GetGetMethod();
var function = (Func<T, dynamic>)Delegate.CreateDelegate(typeof(Func<T, dynamic>), getMethod);
Properties.Add(function);
}
}
public static IEnumerable<dynamic> All(T entity)
{
return Properties.Select(p => p(entity)).Where(v => v != null);
}
}
This would then be useable like so:
var entity1 = LoadEntityFromRdbms();
var entity2 = LoadEntityFromNoSql();
var hash1 = Hash(entity1);
var hash2 = Hash(entity2);
Assert.IsTrue(hash1.SequenceEqual(hash2));
GetHashCode() returns an Int32 (not an MD5).
If you create two objects with all the same property values, they will not have the same hash code if you rely on the inherited Object.GetHashCode().
String is a reference type but is an exception, because it overrides Equals and GetHashCode to compare by value:
string s1 = "john";
string s2 = "john";
// s1 == s2 is true, and s1.GetHashCode() == s2.GetHashCode()
If you want to control the equality comparison of two objects, then you should override both GetHashCode and Equals.
If two objects are equal, then they must also have the same hash code. But two objects with the same hash code are not necessarily equal. A hash-based collection first compares the hash codes and only calls Equals when they match. Some comparisons go straight to Equals, but you should still override both and make sure two equal objects produce the same hash code.
I use this for syncing a client with the server. You could use all the properties, or you could have any property change update the VerID. The advantage here is a simpler, quicker GetHashCode(). In my case I was already resetting the VerID on any property change.
public override bool Equals(Object obj)
{
//Check for null and compare run-time types.
if (obj == null || !(obj is FTSdocWord)) return false;
FTSdocWord item = (FTSdocWord)obj;
return (ObjID == item.ObjID && VerID == item.VerID);
}
public override int GetHashCode()
{
return ObjID ^ VerID;
}
I ended up using ObjID alone so I could do the following
if (myClientObj.Equals(myServerObj) && myClientObj.VerID != myServerObj.VerID)
{
// need to synch
}
Object.GetHashCode Method
Two objects with the same property values. Are they equal? Do they produce the same GetHashCode()?
personDefault pd1 = new personDefault("John");
personDefault pd2 = new personDefault("John");
System.Diagnostics.Debug.WriteLine(pd1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(pd2.GetHashCode().ToString());
// different GetHashCode
if (pd1.Equals(pd2)) // returns false
{
System.Diagnostics.Debug.WriteLine("pd1 == pd2");
}
List<personDefault> personsDefault = new List<personDefault>();
personsDefault.Add(pd1);
if (personsDefault.Contains(pd2)) // returns false
{
System.Diagnostics.Debug.WriteLine("Contains(pd2)");
}
personOverRide po1 = new personOverRide("John");
personOverRide po2 = new personOverRide("John");
System.Diagnostics.Debug.WriteLine(po1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(po2.GetHashCode().ToString());
// same hash
if (po1.Equals(po2)) // returns true
{
System.Diagnostics.Debug.WriteLine("po1 == po2");
}
List<personOverRide> personsOverRide = new List<personOverRide>();
personsOverRide.Add(po1);
if (personsOverRide.Contains(po2)) // returns true
{
System.Diagnostics.Debug.WriteLine("Contains(po2)");
}
}
public class personDefault
{
public string Name { get; private set; }
public personDefault(string name) { Name = name; }
}
public class personOverRide: Object
{
public string Name { get; private set; }
public personOverRide(string name) { Name = name; }
public override bool Equals(Object obj)
{
//Check for null and compare run-time types.
if (obj == null || !(obj is personOverRide)) return false;
personOverRide item = (personOverRide)obj;
return (Name == item.Name);
}
public override int GetHashCode()
{
return Name.GetHashCode();
}
}

C# Extension Method for Object

Is it a good idea to use an extension method on the Object class?
I was wondering if by registering this method if you were incurring a performance penalty as it would be loaded on every object that was loaded in the context.
In addition to the other answers:
there would be no performance penalty, because extension methods are a compiler feature. Consider the following code:
public static class MyExtensions
{
public static void MyMethod(this object obj) { ... }
}
object obj = new object();
obj.MyMethod();
The call to MyMethod will be actually compiled to:
MyExtensions.MyMethod(obj);
There will be no performance penalty as it doesn't attach to every type in the system, it's just available to be called on any type in the system. All that will happen is that the method will show on every single object in intellisense.
The question is: do you really need it to be on object, or can it be more specific? If it needs to be on object, then make it for object.
If you truly intend to extend every object, then doing so is the right thing to do. However, if your extension really only applies to a subset of objects, it should be applied to the highest hierarchical level that is necessary, but no more.
Also, the method will only be available where your namespace is imported.
I have extended Object for a method that attempts to cast to a specified type:
public static T TryCast<T>(this object input)
{
bool success;
return TryCast<T>(input, out success);
}
I also overloaded it to take in a success bool (like TryParse does):
public static T TryCast<T>(this object input, out bool success)
{
success = true;
if(input is T)
return (T)input;
success = false;
return default(T);
}
I have since expanded this to also attempt to parse input (by using ToString and using a converter), but that gets more complicated.
Is it a good idea to use an extension method on the Object class?
Yes, there are cases where it is in fact a great idea. There is no performance penalty whatsoever from defining an extension method on the Object class. As long as you don't call this method, the performance of your application won't be affected at all.
For example consider the following extension method which lists all properties of a given object and converts it to a dictionary:
// Requires: using System.ComponentModel;
public static IDictionary<string, object> ObjectToDictionary(this object instance)
{
var dictionary = new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
if (instance != null)
{
foreach (PropertyDescriptor descriptor in TypeDescriptor.GetProperties(instance))
{
object value = descriptor.GetValue(instance);
dictionary.Add(descriptor.Name, value);
}
}
return dictionary;
}
The following example demonstrates the extension method in use.
namespace NamespaceName
{
public static class CommonUtil
{
public static string ListToString(this IList list)
{
StringBuilder result = new StringBuilder("");
if (list.Count > 0)
{
result.Append(list[0].ToString());
for (int i = 1; i < list.Count; i++)
result.AppendFormat(", {0}", list[i].ToString());
}
return result.ToString();
}
}
}
The following example demonstrates how this method can be used.
var _list = DataContextORM.ExecuteQuery<string>("Select name from products").ToList();
string result = _list.ListToString();
This is an old question but I don't see any answers here that try to reuse the existing find function for objects that are active. Here's a succinct extension method with an optional overload for finding inactive objects.
using System.Linq;
namespace UnityEngine {
public static class GameObjectExtensionMethods {
public static GameObject Find(this GameObject gameObject, string name,
bool inactive = false) {
if (inactive)
return Resources.FindObjectsOfTypeAll<GameObject>().Where(
a => a.name == name).FirstOrDefault();
else
return GameObject.Find(name);
}
}
}
If you use this function within the Update method, you might consider replacing the LINQ statement with a for loop over the array to avoid generating garbage.

Distinct() returns duplicates with a user-defined type

I'm trying to write a Linq query which returns an array of objects, with unique values in their constructors. For integer types, Distinct returns only one copy of each value, but when I try creating my list of objects, things fall apart. I suspect it's a problem with the equality operator for my class, but when I set a breakpoint, it's never hit.
Filtering out the duplicate int in a sub-expression solves the problem, and also saves me from constructing objects that will be immediately discarded, but I'm curious why this version doesn't work.
UPDATE: 11:04 PM Several folks have pointed out that MyType doesn't override GetHashCode(). I'm afraid I oversimplified the example. The original MyType does indeed implement it. I've added it below, modified only to put the hash code in a temp variable before returning it.
Running through the debugger, I see that all five invocations of GetHashCode return a different value. And since MyType only inherits from Object, this is presumably the same behavior Object would exhibit.
Would I be correct then to conclude that the hash should instead be based on the contents of Value? This was my first attempt at overriding operators, and at the time, it didn't appear that GetHashCode needed to be particularly fancy. (This is the first time one of my equality checks didn't seem to work properly.)
class Program
{
static void Main(string[] args)
{
int[] list = { 1, 3, 4, 4, 5 };
int[] list2 =
(from value in list
select value).Distinct().ToArray(); // One copy of each value.
MyType[] distinct =
(from value in list
select new MyType(value)).Distinct().ToArray(); // Two objects created with 4.
Array.ForEach(distinct, value => Console.WriteLine(value));
}
}
class MyType
{
public int Value { get; private set; }
public MyType(int arg)
{
Value = arg;
}
public override int GetHashCode()
{
int retval = base.GetHashCode();
return retval;
}
public override bool Equals(object obj)
{
if (obj == null)
return false;
MyType rhs = obj as MyType;
if ((Object)rhs == null)
return false;
return this == rhs;
}
public static bool operator ==(MyType lhs, MyType rhs)
{
bool result;
if ((Object)lhs != null && (Object)rhs != null)
result = lhs.Value == rhs.Value;
else
result = (Object)lhs == (Object)rhs;
return result;
}
public static bool operator !=(MyType lhs, MyType rhs)
{
return !(lhs == rhs);
}
}
You need to override GetHashCode() in your class. GetHashCode must be implemented in tandem with Equals overrides. It is common for code to check for hash code equality before calling Equals. That's why your Equals implementation is not getting called.
Your suspicion is correct: it is the equality check, which currently just compares the object references. Your implementation does not add anything beyond that, so change it to this:
public override bool Equals(object obj)
{
if (obj == null)
return false;
MyType rhs = obj as MyType;
if ((Object)rhs == null)
return false;
return this.Value == rhs.Value;
}
In your equality method you are still testing for reference equality rather than semantic equality, e.g. on this line:
result = (Object)lhs == (Object)rhs;
you are just comparing two object references which, even if they hold exactly the same data, are still not the same object. Instead, your test for equality needs to compare one or more properties of your object. For instance, if your object had an ID property, and objects with the same ID should be considered semantically equivalent, then you could do this:
result = lhs.ID == rhs.ID
Note that overriding Equals() means you should also override GetHashCode(), which is another kettle of fish, and can be quite difficult to do correctly.
You need to implement GetHashCode().
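As a sketch of what these answers suggest, the simplified MyType below bases both Equals and GetHashCode on Value. The IEquatable<MyType> implementation is optional and added here only for a strongly typed comparison; the == and != operators from the question are left out for brevity.

```csharp
using System;
using System.Linq;

public class MyType : IEquatable<MyType>
{
    public int Value { get; }
    public MyType(int value) { Value = value; }

    // Two instances are equal when they wrap the same Value.
    public bool Equals(MyType other) => !(other is null) && Value == other.Value;

    public override bool Equals(object obj) => Equals(obj as MyType);

    // Equal instances must return equal hash codes, so hash the Value
    // instead of falling back to base.GetHashCode().
    public override int GetHashCode() => Value.GetHashCode();

    public override string ToString() => Value.ToString();
}
```

With this definition, list.Select(v => new MyType(v)).Distinct() on { 1, 3, 4, 4, 5 } keeps a single object for the value 4, because Distinct groups candidates by hash code before it calls Equals.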
It seems that a simple Distinct operation can be implemented more elegantly as follows:
var distinct = items.GroupBy(x => x.ID).Select(x => x.First());
where ID is the property that determines if two objects are semantically equivalent. Judging from the confusion here (including my own), the default implementation of Distinct() seems to be a little convoluted.
I think MyType needs to implement IEquatable<MyType> for this to work.
The other answers have pretty much covered the fact that you need to implement Equals and GetHashCode correctly, but as a side note you may be interested to know that anonymous types have these methods implemented automatically:
var distinct =
(from value in list
select new {Value = value}).Distinct().ToArray();
So without ever having to define this class, you automatically get the Equals and GetHashCode behavior you're looking for. Cool, eh?

C# Queue problem

Suppose I have a class
XYNode
{
protected int mX;
protected int mY;
}
and a queue
Queue<XYNode> testQueue = new Queue<XYNode>();
I want to check if a node with that specific x and y coordinate is already in the queue.
The following obviously doesn't work :
testQueue.Contains(new XYNode(testX, testY))
because even if a node with those coordinates is in the queue, we're testing against a different XYNode object so it will always return false.
What's the right solution ?
The simplest way is to override Equals so that one XYNode knows whether it's equal to another XYNode. You should override GetHashCode() at the same time, and possibly also implement IEquatable<XYNode> to allow a strongly-typed equality comparison.
Alternatively, you could write an IEqualityComparer<XYNode> implementation to compare any two nodes and return whether or not they're the same - and then pass that into the call to the appropriate overload of the Contains extension method defined in Enumerable (assuming you're using .NET 3.5).
Further things to consider:
Could you use private fields instead of protected ones?
Could your class be sealed?
Could your class be immutable?
Should your class perhaps be a struct instead? (Judgement call...)
Should you overload the == and != operators?
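The comparer alternative mentioned above could look roughly like the sketch below. The XYNode shown here is a simplified, hypothetical version with public read-only X and Y properties and a two-argument constructor, since the comparer needs access to the coordinates; the class in the question uses protected fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the node class from the question.
public sealed class XYNode
{
    public int X { get; }
    public int Y { get; }
    public XYNode(int x, int y) { X = x; Y = y; }
}

public sealed class XYNodeComparer : IEqualityComparer<XYNode>
{
    // Nodes are equal when both coordinates match.
    public bool Equals(XYNode a, XYNode b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.X == b.X && a.Y == b.Y;
    }

    // Must agree with Equals: equal coordinates give equal hash codes.
    public int GetHashCode(XYNode node) =>
        node is null ? 0 : (node.X * 397) ^ node.Y;
}
```

The check then becomes testQueue.Contains(new XYNode(testX, testY), new XYNodeComparer()), which binds to the Enumerable.Contains extension method rather than Queue<T>.Contains because of the extra comparer argument.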
To illustrate Jon Skeet's ... original ... answer:
class XYNode {
protected int mX;
protected int mY;
public override bool Equals(Object obj) {
if (obj == null || this.GetType() != obj.GetType()) { return false; }
XYNode otherNode = (XYNode)obj;
return (this.mX == otherNode.mX) && (this.mY == otherNode.mY);
}
}
This is a pretty simplistic solution. There are a lot of additional factors to consider, which Jon has already mentioned.
You could simply iterate with a foreach and check the X and Y of your XYNode on every element.
