How can I compare two generic collections? - c#

How can I compare two generic collections? Here's my attempt with two string arrays, but it doesn't return true.
namespace genericCollections
{
class Program
{
static void Main(string[] args)
{
string[] xx = new string[] { "gfdg", "gfgfd", "fgfgfd" };
string[] yy = new string[] { "gfdg", "gfgfd", "fgfgfd" };
Console.WriteLine(ComparerCollection(xx, yy).ToString());
Console.ReadKey();
}
static bool ComparerCollection<T>(ICollection<T> x, ICollection<T> y)
{
return x.Equals(y);
}
}
}

Call Enumerable.SequenceEqual:
bool arraysAreEqual = xx.SequenceEqual(yy);
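A minimal sketch of how SequenceEqual behaves on the arrays from the question. The class SequenceEqualDemo and the helper SameIgnoringCase are made up for illustration. The comparison runs element by element, in order, and an overload accepts an IEqualityComparer<T> for the elements.

```csharp
using System;
using System.Linq;

static class SequenceEqualDemo
{
    // Hypothetical helper: order-sensitive, case-insensitive comparison.
    public static bool SameIgnoringCase(string[] a, string[] b)
        => a.SequenceEqual(b, StringComparer.OrdinalIgnoreCase);

    static void Main()
    {
        string[] xx = { "gfdg", "gfgfd", "fgfgfd" };
        string[] yy = { "gfdg", "gfgfd", "fgfgfd" };
        string[] reordered = { "gfgfd", "gfdg", "fgfgfd" };
        string[] upper = { "GFDG", "GFGFD", "FGFGFD" };

        Console.WriteLine(xx.Equals(yy));               // False: reference equality
        Console.WriteLine(xx.SequenceEqual(yy));        // True: same elements, same order
        Console.WriteLine(xx.SequenceEqual(reordered)); // False: order matters
        Console.WriteLine(SameIgnoringCase(xx, upper)); // True
    }
}
```

If order should not matter, SequenceEqual is the wrong tool unless both sequences are sorted first.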

From the MSDN documentation:
The default implementation of Equals
supports reference equality only, but
derived classes can override this
method to support value equality.
In your case xx and yy are two different instances of string[] so they are never equal.
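Arrays are a built-in type, so you cannot override Equals() on string[].
But you can simply solve it by looping through both collections and comparing them element by element:
static bool CompareCollection<T>(IList<T> x, IList<T> y)
{
    // ICollection<T> has neither Length nor an indexer, so take IList<T>
    // (arrays implement it) and use Count.
    if (x.Count != y.Count)
        return false;
    for (int i = 0; i < x.Count; i++)
    {
        // != is not defined for an unconstrained T; use the default comparer.
        if (!EqualityComparer<T>.Default.Equals(x[i], y[i]))
            return false;
    }
    return true;
}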

You can get the elements that are in xx but not in yy by using LINQ:
var onlyInXX = xx.Except(yy);
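An empty Except result on its own does not show that two collections are equal, because Except only looks in one direction and treats its inputs as sets. The sketch below, with the hypothetical helper SameDistinctElements, checks both directions. It ignores order and duplicate counts, so it is not a replacement for SequenceEqual.

```csharp
using System;
using System.Linq;

static class ExceptDemo
{
    // Hypothetical helper: true when both sequences contain the same distinct
    // elements, ignoring order and how often each element occurs.
    public static bool SameDistinctElements(string[] a, string[] b)
        => !a.Except(b).Any() && !b.Except(a).Any();

    static void Main()
    {
        string[] xx = { "gfdg", "gfgfd", "fgfgfd" };
        string[] shorter = { "gfdg", "gfgfd" };

        // Nothing in 'shorter' is missing from xx...
        Console.WriteLine(shorter.Except(xx).Any());          // False
        // ...but xx still has an element that 'shorter' lacks.
        Console.WriteLine(xx.Except(shorter).Any());          // True
        Console.WriteLine(SameDistinctElements(xx, shorter)); // False
    }
}
```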

Related

How to get a distinct result for list of array?

I have a list of ulong arrays.
List<ulong[]> TestList = new List<ulong[]>();
The list has the following items.
{1,2,3,4,5,6},
{2,3,4,5,6,7},
{3,4,5,6,7,8},
{1,2,3,4,5,6}
and the expected distinct result is
{1,2,3,4,5,6},
{2,3,4,5,6,7},
{3,4,5,6,7,8}
So I tried the following, but it has no effect.
TestList = TestList.Distinct().ToList();
Do I need a special comparer to get a distinct list?
Distinct() uses the default equality check, which for arrays is reference equality. It does not check the contents of the array for equality.
If you want to do that, you'll need the overload of Distinct() that takes an IEqualityComparer<T>. This allows you to customize the behaviour to determine if two items are equal or not.
For comparing arrays, IStructuralEquatable and friends already do the heavy lifting. You can wrap it simply, like so:
sealed class StructuralComparer<T> : IEqualityComparer<T>
{
public static IEqualityComparer<T> Instance { get; } = new StructuralComparer<T>();
public bool Equals(T x, T y)
=> StructuralComparisons.StructuralEqualityComparer.Equals(x, y);
public int GetHashCode(T obj)
=> StructuralComparisons.StructuralEqualityComparer.GetHashCode(obj);
}
Then, use it in the Distinct() call like this:
TestList = TestList.Distinct(StructuralComparer<ulong[]>.Instance).ToList();
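To see what the wrapper delegates to, here is a small sketch that calls StructuralComparisons.StructuralEqualityComparer directly on arrays. The class StructuralDemo and its two helpers are illustrative names, not part of the answer.

```csharp
using System;
using System.Collections;

static class StructuralDemo
{
    // Compares array contents element by element instead of by reference.
    public static bool ContentEquals(Array x, Array y)
        => StructuralComparisons.StructuralEqualityComparer.Equals(x, y);

    // Hash derived from the array contents; equal contents give equal hashes.
    public static int ContentHash(Array x)
        => StructuralComparisons.StructuralEqualityComparer.GetHashCode(x);

    static void Main()
    {
        ulong[] a = { 1, 2, 3 };
        ulong[] b = { 1, 2, 3 };
        ulong[] c = { 3, 2, 1 };

        Console.WriteLine(a.Equals(b));                      // False: reference equality
        Console.WriteLine(ContentEquals(a, b));              // True
        Console.WriteLine(ContentEquals(a, c));              // False: order matters
        Console.WriteLine(ContentHash(a) == ContentHash(b)); // True
    }
}
```

The structural comparer works through the non-generic object-based interface, so value-type elements are boxed during the comparison. For large arrays a typed comparer based on SequenceEqual may be the cheaper choice.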
You need to provide an equality comparer; the default implementation does not know how to compare the contents of ulong arrays (it uses reference equality):
class ULongArrayComparer : EqualityComparer<ulong[]>
{
    public override bool Equals(ulong[] a1, ulong[] a2)
    {
        if (a1 == null && a2 == null)
            return true;
        else if (a1 == null || a2 == null)
            return false;
        return a1.SequenceEqual(a2);
    }
    public override int GetHashCode(ulong[] arr)
    {
        // Seed with 0UL so the accumulator has the same type as the elements.
        ulong hCode = arr.Aggregate(0UL, (acc, it) => acc ^ it);
        return hCode.GetHashCode();
    }
}
Then use it:
TestList = TestList.Distinct(new ULongArrayComparer()).ToList();
List<ulong[]> TestList = new List<ulong[]>() {
new ulong[]{ 1,2,3,4,5,6},
new ulong[]{ 2,3,4,5,6,7},
new ulong[]{ 3,4,5,6,7,8},
new ulong[]{ 1,2,3,4,5,6}
};
var result = TestList.GroupBy(x => String.Join(",", x))
.Select(x => x.First().ToArray())
.ToList();
You can implement an IEqualityComparer<ulong[]>:
public class ULongSequenceComparer : IEqualityComparer<ulong[]>
{
    public bool Equals(ulong[] x, ulong[] y)
    {
        // Same length and same elements in the same order, which keeps
        // Equals consistent with the order-sensitive hash below.
        return x.Length == y.Length && x.SequenceEqual(y);
    }
    public int GetHashCode(ulong[] obj)
    {
        int hashCode = obj.Length;
        for (int i = 0; i < obj.Length; ++i)
        {
            hashCode = unchecked(hashCode * 314159 + obj[i].GetHashCode());
        }
        return hashCode;
    }
}
Then use it:
TestList = TestList.Distinct(new ULongSequenceComparer()).ToList();

HashSet<T>.CreateSetComparer() cannot specify IEqualityComparer<T>, is there an alternative?

In the internal source there is such a constructor public HashSetEqualityComparer(IEqualityComparer<T> comparer) but it's internal so I can't use it.
By default, HashSet<T>.CreateSetComparer() just uses the parameterless constructor which will apply EqualityComparer<T>.Default.
Is there a way to get a HashSetEqualityComparer<T> with a IEqualityComparer<T> of choice, without copying out the code from the source?
I think the best solution is to use SetEquals. It does the job you need, in exactly the same way that HashSetEqualityComparer does, but it accounts for any custom comparers defined in the sets it's comparing.
So, in your specific scenario where you want to use a HashSet<T> as a key of a dictionary, you need to implement an IEqualityComparer<HashSet<T>> that makes use of SetEquals and "borrows" the reference source of HashSetEqualityComparer.GetHashCode():
public class CustomHashSetEqualityComparer<T>
    : IEqualityComparer<HashSet<T>>
{
    public bool Equals(HashSet<T> x, HashSet<T> y)
    {
        // Two nulls are equal and a null never equals a set.
        // SetEquals(null) would throw, so handle both sides here.
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
            return ReferenceEquals(x, y);
        return x.SetEquals(y);
    }
    public int GetHashCode(HashSet<T> set)
    {
        int hashCode = 0;
        if (set != null)
        {
            foreach (T t in set)
            {
                hashCode = hashCode ^
                    (set.Comparer.GetHashCode(t) & 0x7FFFFFFF);
            }
        }
        return hashCode;
    }
}
But yes, it's a small pain that there is no way to directly create a set equality comparer that leverages custom comparers. This unfortunate behavior is due, IMHO, more to a bug in the existing implementation than to the lack of the needed overload; there is no reason why CreateSetComparer() can't return an IEqualityComparer that actually uses the comparers of the sets it's comparing, as the code above demonstrates.
If I had a voice in it, CreateSetComparer() wouldn't be static method at all. It would then be obvious, or at least predictable, that whatever comparer was returned would be created with the current set's comparer.
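For the dictionary-key scenario mentioned above, a usage sketch might look like the following. SetKeyComparer is a condensed stand-in for the comparer above so that the example compiles on its own. The names SetKeyDemo and Lookup, and the sample data, are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Condensed stand-in for the SetEquals-based comparer shown above.
sealed class SetKeyComparer<T> : IEqualityComparer<HashSet<T>>
{
    public bool Equals(HashSet<T> x, HashSet<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SetEquals(y);
    }

    public int GetHashCode(HashSet<T> set)
    {
        int hash = 0;
        if (set != null)
            foreach (T item in set)
                hash ^= set.Comparer.GetHashCode(item) & 0x7FFFFFFF;
        return hash;
    }
}

static class SetKeyDemo
{
    // Hypothetical helper: stores one set as a key, then looks up a second set.
    public static int Lookup(HashSet<string> stored, HashSet<string> probe)
    {
        var prices = new Dictionary<HashSet<string>, int>(new SetKeyComparer<string>());
        prices[stored] = 3;
        return prices.TryGetValue(probe, out int price) ? price : -1;
    }

    static void Main()
    {
        var ci = StringComparer.OrdinalIgnoreCase;
        var stored = new HashSet<string>(ci) { "apple", "pear" };
        var probe = new HashSet<string>(ci) { "PEAR", "Apple" };

        // Both sets use the same case-insensitive comparer, so the probe
        // hashes and compares equal to the stored key.
        Console.WriteLine(Lookup(stored, probe)); // 3
    }
}
```

Sets used as keys should not be modified while they are in the dictionary, since their hash code would change.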
I agree with @InBetween: using SetEquals is the best way. Even adding the constructor would not achieve what you want.
Please see this code:
http://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs,1360
Here is what I tried:
class HashSetEqualityComparerWrapper<T> : IEqualityComparer<HashSet<T>>
{
static private Type HashSetEqualityComparerType = HashSet<T>.CreateSetComparer().GetType();
private IEqualityComparer<HashSet<T>> _comparer;
public HashSetEqualityComparerWrapper()
{
_comparer = HashSet<T>.CreateSetComparer();
}
public HashSetEqualityComparerWrapper(IEqualityComparer<T> comparer)
{
_comparer = HashSet<T>.CreateSetComparer();
if (comparer != null)
{
FieldInfo m_comparer_field = HashSetEqualityComparerType.GetField("m_comparer", BindingFlags.NonPublic | BindingFlags.Instance);
m_comparer_field.SetValue(_comparer, comparer);
}
}
public bool Equals(HashSet<T> x, HashSet<T> y)
{
return _comparer.Equals(x, y);
}
public int GetHashCode(HashSet<T> obj)
{
return _comparer.GetHashCode(obj);
}
}
UPDATE
I took 5 minutes to implement another version based on the HashSetEqualityComparer<T> source code, rewriting the bool Equals(HashSet<T> x, HashSet<T> y) method. It is not complex. All of the code is copied and pasted from the source; I just revised it a bit.
class CustomHashSetEqualityComparer<T> : IEqualityComparer<HashSet<T>>
{
private IEqualityComparer<T> m_comparer;
public CustomHashSetEqualityComparer()
{
m_comparer = EqualityComparer<T>.Default;
}
public CustomHashSetEqualityComparer(IEqualityComparer<T> comparer)
{
if (comparer == null)
{
m_comparer = EqualityComparer<T>.Default;
}
else
{
m_comparer = comparer;
}
}
// using m_comparer to keep equals properties intact; don't want to choose one of the comparers
public bool Equals(HashSet<T> x, HashSet<T> y)
{
// http://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs,1360
// handle null cases first
if (x == null)
{
return (y == null);
}
else if (y == null)
{
// set1 != null
return false;
}
// all comparers are the same; this is faster
if (AreEqualityComparersEqual(x, y))
{
if (x.Count != y.Count)
{
return false;
}
}
// n^2 search because items are hashed according to their respective ECs
foreach (T set2Item in y)
{
bool found = false;
foreach (T set1Item in x)
{
if (m_comparer.Equals(set2Item, set1Item))
{
found = true;
break;
}
}
if (!found)
{
return false;
}
}
return true;
}
public int GetHashCode(HashSet<T> obj)
{
int hashCode = 0;
if (obj != null)
{
foreach (T t in obj)
{
hashCode = hashCode ^ (m_comparer.GetHashCode(t) & 0x7FFFFFFF);
}
} // else returns hashcode of 0 for null hashsets
return hashCode;
}
// Equals method for the comparer itself.
public override bool Equals(Object obj)
{
CustomHashSetEqualityComparer<T> comparer = obj as CustomHashSetEqualityComparer<T>;
if (comparer == null)
{
return false;
}
return (this.m_comparer == comparer.m_comparer);
}
public override int GetHashCode()
{
return m_comparer.GetHashCode();
}
static private bool AreEqualityComparersEqual(HashSet<T> set1, HashSet<T> set2)
{
return set1.Comparer.Equals(set2.Comparer);
}
}
Avoid this class if you use custom comparers. It uses its own equality comparer to perform GetHashCode, but when performing Equals(Set1, Set2), if Set1 and Set2 have the same equality comparer, the HashSetEqualityComparer will use the comparer of the sets. HashSetEqualityComparer will only use its own comparer for Equals if Set1 and Set2 have different comparers.
It gets worse. It calls HashSet.HashSetEquals, which has a bug in it (see https://referencesource.microsoft.com/#system.core/System/Collections/Generic/HashSet.cs line 1489): it is missing an if (set1.Count != set2.Count) return false check before performing the subset check.
The bug is illustrated by the following program:
class Program
{
private class MyEqualityComparer : EqualityComparer<int>
{
public override bool Equals(int x, int y)
{
return x == y;
}
public override int GetHashCode(int obj)
{
return obj.GetHashCode();
}
}
static void Main(string[] args)
{
var comparer = HashSet<int>.CreateSetComparer();
var set1 = new HashSet<int>(new MyEqualityComparer()) { 1 };
var set2 = new HashSet<int> { 1, 2 };
Console.WriteLine(comparer.Equals(set1, set2));
Console.WriteLine(comparer.Equals(set2, set1)); //True!
Console.ReadKey();
}
}
Regarding other answers to this question (I don't have the rep to comment):
Wilhelm Liao: His answer also contains the bug because it's copied from the reference source
InBetween: The solution is not symmetric. CustomHashSetEqualityComparer.Equals(A, B) does not always equal CustomHashSetEqualityComparer.Equals(B, A). I would be scared of that.
I think a robust implementation should throw an exception if it encounters a set which has a different comparer to its own. It could always use its own comparer and ignore the set comparer, but that would give strange and unintuitive behaviour.
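A rough sketch of that suggestion, not a vetted implementation: the comparer is constructed with one element comparer and rejects any set that was built with a different one. The names StrictSetComparer and Mod10Comparer are invented for this example, and throwing from Equals is a design choice that callers would need to be prepared for.

```csharp
using System;
using System.Collections.Generic;

sealed class StrictSetComparer<T> : IEqualityComparer<HashSet<T>>
{
    private readonly IEqualityComparer<T> _itemComparer;

    public StrictSetComparer(IEqualityComparer<T> itemComparer = null)
        => _itemComparer = itemComparer ?? EqualityComparer<T>.Default;

    // Rejects sets whose element comparer differs from this comparer's own.
    private void Check(HashSet<T> set)
    {
        if (set != null && !set.Comparer.Equals(_itemComparer))
            throw new ArgumentException("Set uses a different element comparer.");
    }

    public bool Equals(HashSet<T> x, HashSet<T> y)
    {
        Check(x);
        Check(y);
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        // Both sets share one comparer here, so SetEquals is symmetric.
        return x.SetEquals(y);
    }

    public int GetHashCode(HashSet<T> set)
    {
        Check(set);
        int hash = 0;
        if (set != null)
            foreach (T item in set)
                hash ^= _itemComparer.GetHashCode(item) & 0x7FFFFFFF;
        return hash;
    }
}

// Illustrative custom comparer: ints are equal when their last digit matches.
sealed class Mod10Comparer : IEqualityComparer<int>
{
    public bool Equals(int x, int y) => x % 10 == y % 10;
    public int GetHashCode(int obj) => (obj % 10).GetHashCode();
}

static class StrictDemo
{
    static void Main()
    {
        var comparer = new StrictSetComparer<int>();
        var a = new HashSet<int> { 1, 2 };
        var b = new HashSet<int> { 2, 1 };
        Console.WriteLine(comparer.Equals(a, b)); // True

        var other = new HashSet<int>(new Mod10Comparer()) { 1 };
        try { comparer.Equals(a, other); }
        catch (ArgumentException e) { Console.WriteLine(e.Message); }
    }
}
```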
In addition to the original solution, GetHashCode can be simplified with the HashCode.Combine function. Note that this hashes each item with its own GetHashCode rather than with set.Comparer:
public int GetHashCode(HashSet<T> set) {
int hashCode = 0;
foreach (var item in set) {
hashCode ^= HashCode.Combine(item);
}
return hashCode;
}

How to keep one of the duplicate entries in a List<T>

I have a list with multiple entries. If the list contains a duplicated entry, I want to keep only one of the duplicates.
I've tried many things, including list.Distinct().ToList(), but this does not remove the duplicate entry. I do not want to override the class's Equals method, so is there a way to do this without that?
I've also written this method, which again does not remove the duplicate entry, as it does not consider object a == object b.
private void removeDupes(List<Bookings> list)
{
int duplicates = 0;
int previousIndex = 0;
for (int i = 0; i < list.Count; i++)
{
bool duplicateFound = false;
for (int x = 0; x < i; x++)
{
if (list[i] == list[x])
{
duplicateFound = true;
duplicates++;
break;
}
}
if (duplicateFound == false)
{
list[previousIndex] = list[i];
previousIndex++;
}
}
}
There is another overload of the Distinct LINQ extension method that also takes an IEqualityComparer<T> as an argument. So you'd need to create a class that implements IEqualityComparer<Bookings> and supply an instance of it to the Distinct method. This way, you do not need to override the Equals method of the type.
The rules on whether two objects are equal to one another are implemented in that comparer class.
As an alternative, you can use a HashSet and supply the EqualityComparer in the constructor.
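A minimal sketch of the HashSet route. The question does not show the fields of its Bookings class, so the Booking type with a BookingId property, the rule that two bookings are duplicates when their ids match, and the helper Dedupe are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape; the real class in the question is not shown.
sealed class Booking
{
    public Booking(int id) { BookingId = id; }
    public int BookingId { get; }
}

// Assumed rule: bookings with the same id are duplicates.
sealed class BookingIdComparer : IEqualityComparer<Booking>
{
    public bool Equals(Booking x, Booking y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.BookingId == y.BookingId;
    }

    public int GetHashCode(Booking obj) => obj == null ? 0 : obj.BookingId.GetHashCode();
}

static class DedupDemo
{
    // Keeps the first occurrence of each booking and preserves list order.
    public static List<Booking> Dedupe(IEnumerable<Booking> source)
    {
        var seen = new HashSet<Booking>(new BookingIdComparer());
        // HashSet.Add returns false when the comparer says the item is already present.
        return source.Where(b => seen.Add(b)).ToList();
    }

    static void Main()
    {
        var bookings = new List<Booking> { new Booking(1), new Booking(2), new Booking(1) };
        Console.WriteLine(Dedupe(bookings).Count); // 2
    }
}
```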
A possible solution for your problem, along the lines of Markus's answer, might look like this:
public class Booking
{
public Booking(int id, float amount)
{
BookingId = id;
BookingAmount = amount;
}
public int BookingId { get; }
public float BookingAmount { get; }
}
public class BookingComparer : IEqualityComparer<Booking>
{
public bool Equals(Booking x, Booking y)
{
return (x.BookingAmount == y.BookingAmount) && (x.BookingId == y.BookingId);
}
public int GetHashCode(Booking obj)
{
return obj.BookingId.GetHashCode()*17 + obj.BookingAmount.GetHashCode()*17;
}
}
internal class Program
{
private static void Main(string[] args)
{
var booking1 = new Booking(1, 12);
var booking2 = new Booking(1, 12);
var bookings = new List<Booking>();
bookings.Add(booking1);
bookings.Add(booking2);
var result = bookings.Distinct(new BookingComparer()).ToList();
}
}

Why don't .Except() and Intersect() work here using LINQ?

I have the following code, which doesn't seem to be working.
Context:
I have two lists of objects:
* listOne has 100 records
* listTwo has 70 records
Many of them have the same Id property (in both lists).
var listOneOnlyItems = listOne.Except(listTwo, new ItemComparer());
Here is the comparer:
public class ItemComparer : IEqualityComparer<Item>
{
public bool Equals(Item x, Item y)
{
if (x.Id == y.Id)
return true;
return false;
}
public int GetHashCode(Item obj)
{
return obj.GetHashCode();
}
}
After I run this code and look into the results,
listOneOnlyItems
still has 100 records (it should only have 30). Can anyone help me?
Also, running
IEnumerable<Item> sharedItems = listOne.Intersect(listTwo, new ItemComparer());
returns zero results in the sharedItems collection.
public int GetHashCode(Item obj)
{
return obj.Id.GetHashCode();
}
Worth a check at least -- IIRC GetHashCode() is tested first before equality, and if they don't have the same hash it won't bother checking equality. I'm not sure what to expect from obj.GetHashCode() -- it depends on what you've implemented on the Item class.
Consider making GetHashCode() return obj.Id.GetHashCode()
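The effect can be reproduced with a small sketch. The Item class below is an assumed minimal version of the asker's type that does not override GetHashCode, and the comparer and helper names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal Item; it does not override Equals or GetHashCode.
sealed class Item
{
    public int Id { get; set; }
}

// The combination from the question: Equals by Id, hash by object identity.
sealed class IdentityHashComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y) => x.Id == y.Id;
    public int GetHashCode(Item obj) => obj.GetHashCode();
}

// The fix: the hash is derived from the same property that Equals compares.
sealed class IdHashComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y) => x.Id == y.Id;
    public int GetHashCode(Item obj) => obj.Id.GetHashCode();
}

static class ExceptHashDemo
{
    // Builds 100 items and 70 items with overlapping Ids, as in the question.
    public static int CountOnlyInFirst(IEqualityComparer<Item> comparer)
    {
        var listOne = Enumerable.Range(1, 100).Select(i => new Item { Id = i }).ToList();
        var listTwo = Enumerable.Range(1, 70).Select(i => new Item { Id = i }).ToList();
        return listOne.Except(listTwo, comparer).Count();
    }

    static void Main()
    {
        Console.WriteLine(CountOnlyInFirst(new IdHashComparer())); // 30
        // With identity-based hashes the items almost never land on the same
        // hash, so Equals is rarely consulted and the count stays near 100.
        Console.WriteLine(CountOnlyInFirst(new IdentityHashComparer()));
    }
}
```

The rule behind this is that two objects a comparer treats as equal must also return the same hash code.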
This code works fine:
static void TestLinqExcept()
{
var seqA = Enumerable.Range(1, 10);
var seqB = Enumerable.Range(1, 7);
var seqAexceptB = seqA.Except(seqB, new IntComparer());
foreach (var x in seqAexceptB)
{
Console.WriteLine(x);
}
}
class IntComparer: EqualityComparer<int>
{
public override bool Equals(int x, int y)
{
return x == y;
}
public override int GetHashCode(int x)
{
return x;
}
}
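If your comparer derives from EqualityComparer<T>, as IntComparer does here, its methods need the 'override' keyword; a class that implements the IEqualityComparer<T> interface directly, like the ItemComparer in the question, does not. (I think not having 'override' as implicit was a mistake on the part of the C# designers.)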

Dictionary<int [], bool> - compare values in the array, not reference?

I am using a dictionary to store ID, otherID and a bool value. Unfortunately it compares the array reference, so I cannot use it.
Is there any way to have an array as the key but compare its values instead of the reference?
Thanks
You can use the Comparer property of the dictionary to set it to a custom comparer created by you.
EDIT: actually the property is read-only, sorry. You should definitely use the proper constructor:
class IntArrayComparer : IEqualityComparer<int[]> {
public bool Equals(int[] x, int[] y) {
if (x.Length != y.Length) {
return false;
}
for (int i = 0; i < x.Length; ++i) {
if (x[i] != y[i]) {
return false;
}
}
return true;
}
public int GetHashCode(int[] obj) {
int ret = 0;
for (int i = 0; i < obj.Length; ++i) {
ret ^= obj[i].GetHashCode();
}
return ret;
}
}
static void Main(string[] args) {
Dictionary<int[], bool> dict = new Dictionary<int[], bool>(new IntArrayComparer());
}
You can try implementing IEqualityComparer<int[]> and then pass an instance of it to the proper constructor.
There are basically two ways of doing that:
* Create a comparer that implements IEqualityComparer<int[]>, which you pass to the constructor of the dictionary.
* Create a key class that encapsulates the integer array and implements IEquatable<T>.
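The second option could be sketched as follows. The class name IntArrayKey and the hash formula are illustrative choices, not taken from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical key type wrapping an int[] and comparing by content.
sealed class IntArrayKey : IEquatable<IntArrayKey>
{
    private readonly int[] _values;

    public IntArrayKey(params int[] values)
    {
        // Copy so later changes to the caller's array cannot alter the key.
        _values = (int[])values.Clone();
    }

    public bool Equals(IntArrayKey other)
        => other != null && _values.SequenceEqual(other._values);

    public override bool Equals(object obj) => Equals(obj as IntArrayKey);

    public override int GetHashCode()
    {
        unchecked
        {
            // Order-sensitive hash, consistent with SequenceEqual.
            int hash = 17;
            foreach (int v in _values)
                hash = hash * 31 + v;
            return hash;
        }
    }
}

static class KeyDemo
{
    static void Main()
    {
        // No comparer argument needed: the dictionary's default comparer
        // picks up IEquatable<IntArrayKey>.
        var dict = new Dictionary<IntArrayKey, bool>();
        dict[new IntArrayKey(1, 2)] = true;

        Console.WriteLine(dict.ContainsKey(new IntArrayKey(1, 2))); // True
        Console.WriteLine(dict.ContainsKey(new IntArrayKey(2, 1))); // False
    }
}
```

Compared with a custom comparer, the key class carries its equality rules with it, at the cost of one wrapper allocation per key.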
There's nothing wrong with orsogufo's answer, but I wanted to point out that if you have .NET 3.5, you can implement an ArrayValueComparer with a lot less code, and at the same time make it generic, so it can compare the values of any type of array, and not just integer arrays. For that matter, you could easily make it work with any IEnumerable, and not just arrays.
using System.Collections.Generic;
using System.Linq;
class ArrayValueComparer<T> : IEqualityComparer<T[]>
{
public bool Equals(T[] x, T[] y)
{
return x.SequenceEqual(y, EqualityComparer<T>.Default);
}
public int GetHashCode(T[] obj)
{
return obj.Aggregate(0, (total, next) => total ^ next.GetHashCode());
}
}
static void Main(string[] args)
{
var dict = new Dictionary<int[], bool>(new ArrayValueComparer<int>());
}
