Is there a built-in method to compare collections? - c#

I would like to compare the contents of a couple of collections in my Equals method. I have a Dictionary and an IList. Is there a built-in method to do this?
Edited:
I want to compare two Dictionaries and two ILists, so I think what equality means is clear - if the two dictionaries contain the same keys mapped to the same values, then they're equal.

Enumerable.SequenceEqual
Determines whether two sequences are equal by comparing their elements by using a specified IEqualityComparer(T).
You can't directly compare the list and the dictionary, but you could compare the list of values from the Dictionary with the list.

As others have noted, SequenceEqual is order-sensitive. To work around that, you can sort the dictionary by key (keys are unique, so the resulting order is fully determined) and then use SequenceEqual. The following expression checks if two dictionaries are equal regardless of their internal order:
dictionary1.OrderBy(kvp => kvp.Key).SequenceEqual(dictionary2.OrderBy(kvp => kvp.Key))
EDIT: As pointed out by Jeppe Stig Nielsen, some objects have an IComparer<T> that is incompatible with their IEqualityComparer<T>, yielding incorrect results. When using keys of such a type, you must specify a correct IComparer<T> for those keys. For example, with string keys (which exhibit this issue), you must do the following in order to get correct results:
dictionary1.OrderBy(kvp => kvp.Key, StringComparer.Ordinal).SequenceEqual(dictionary2.OrderBy(kvp => kvp.Key, StringComparer.Ordinal))
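The ordered comparison above can be wrapped in a small helper. This is only a minimal sketch: the names DictionaryCompare and EqualByOrderedKeys are made up for illustration, and it assumes string keys and value types whose default Equals is adequate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryCompare
{
    // Hypothetical helper: sorts both dictionaries by key with an ordinal
    // comparer, then compares the key/value pairs position by position.
    public static bool EqualByOrderedKeys<TValue>(
        IDictionary<string, TValue> left, IDictionary<string, TValue> right)
    {
        return left.OrderBy(kvp => kvp.Key, StringComparer.Ordinal)
                   .SequenceEqual(right.OrderBy(kvp => kvp.Key, StringComparer.Ordinal));
    }
}
```

Sorting makes this O(n log n); it is convenient rather than fast.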

In addition to the SequenceEqual already mentioned, which is true if two lists are of equal length and their corresponding elements compare equal according to a comparer (which may be the default comparer, i.e. an overridden Equals()), it is worth mentioning that in .NET 4 there is SetEquals on ISet objects, which ignores the order of elements and any duplicate elements.
So if you want to have a list of objects, but they don't need to be in a specific order, consider that an ISet (like a HashSet) may be the right choice.
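As a rough illustration of SetEquals, here is a sketch of a helper that compares two sequences as sets. The class and method names are hypothetical, and it assumes the element type has a sensible Equals/GetHashCode.

```csharp
using System.Collections.Generic;

public static class SetComparison
{
    // Hypothetical helper: true when both sequences contain the same distinct
    // elements, regardless of order or how often each element repeats.
    public static bool SameDistinctElements<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        var set = new HashSet<T>(first);
        return set.SetEquals(second);
    }
}
```

Note that duplicates are ignored, so {1, 2, 3} and {3, 2, 1, 1} count as equal here.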

Take a look at the Enumerable.SequenceEqual method
var dictionary = new Dictionary<int, string>() {{1, "a"}, {2, "b"}};
var intList = new List<int> {1, 2};
var stringList = new List<string> {"a", "b"};
var test1 = dictionary.Keys.SequenceEqual(intList);
var test2 = dictionary.Values.SequenceEqual(stringList);

This does not directly answer your question, but both Microsoft's TestTools and NUnit provide
CollectionAssert.AreEquivalent
which does pretty much what you want.

I didn't know about Enumerable.SequenceEqual method (you learn something every day....), but I was going to suggest using an extension method; something like this:
public static bool IsEqual(this List<int> InternalList, List<int> ExternalList)
{
if (InternalList.Count != ExternalList.Count)
{
return false;
}
else
{
for (int i = 0; i < InternalList.Count; i++)
{
if (InternalList[i] != ExternalList[i])
return false;
}
}
return true;
}
Interestingly enough, after taking 2 seconds to read about SequenceEqual, it looks like Microsoft has built the function I described for you.

.NET lacks any powerful tools for comparing collections. I've developed a simple solution you can find at the link below:
http://robertbouillon.com/2010/04/29/comparing-collections-in-net/
This will perform an equality comparison regardless of order:
var list1 = new[] { "Bill", "Bob", "Sally" };
var list2 = new[] { "Bob", "Bill", "Sally" };
bool isequal = list1.Compare(list2).IsSame;
This will check to see if items were added / removed:
var list1 = new[] { "Billy", "Bob" };
var list2 = new[] { "Bob", "Sally" };
var diff = list1.Compare(list2);
var onlyinlist1 = diff.Removed; //Billy
var onlyinlist2 = diff.Added; //Sally
var inbothlists = diff.Equal; //Bob
This will see what items in the dictionary changed:
var original = new Dictionary<int, string>() { { 1, "a" }, { 2, "b" } };
var changed = new Dictionary<int, string>() { { 1, "aaa" }, { 2, "b" } };
var diff = original.Compare(changed, (x, y) => x.Value == y.Value, (x, y) => x.Value == y.Value);
foreach (var item in diff.Different)
Console.Write("{0} changed to {1}", item.Key.Value, item.Value.Value);
//Will output: a changed to aaa

To compare collections you can also use LINQ. Enumerable.Intersect returns all pairs that are equal. You can compare two dictionaries like this:
(dict1.Count == dict2.Count) && dict1.Intersect(dict2).Count() == dict1.Count
The first comparison is needed because dict2 can contain all the keys from dict1 and more.
You can also think of variations using Enumerable.Except and Enumerable.Union that lead to similar results. These can also be used to determine the exact differences between the sets.
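For instance, Except can report which key/value pairs are in one dictionary but not the other. The sketch below uses hypothetical names (DictionaryDiff, OnlyInLeft) and relies on the default equality of KeyValuePair, so it assumes keys and values that compare by value (such as int and string).

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DictionaryDiff
{
    // Hypothetical helper: pairs present in 'left' that have no identical
    // key/value pair in 'right' (default equality on KeyValuePair).
    public static List<KeyValuePair<TKey, TValue>> OnlyInLeft<TKey, TValue>(
        IDictionary<TKey, TValue> left, IDictionary<TKey, TValue> right)
    {
        return left.Except(right).ToList();
    }
}
```

Calling it both ways round (left against right, then right against left) gives the full set of differences.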

How about this example:
static void Main()
{
// Create a dictionary and add several elements to it.
var dict = new Dictionary<string, int>();
dict.Add("cat", 2);
dict.Add("dog", 3);
dict.Add("x", 4);
// Create another dictionary.
var dict2 = new Dictionary<string, int>();
dict2.Add("cat", 2);
dict2.Add("dog", 3);
dict2.Add("x", 4);
// Test for equality.
bool equal = false;
if (dict.Count == dict2.Count) // Require equal count.
{
equal = true;
foreach (var pair in dict)
{
int value;
if (dict2.TryGetValue(pair.Key, out value))
{
// Require value be equal.
if (value != pair.Value)
{
equal = false;
break;
}
}
else
{
// Require key be present.
equal = false;
break;
}
}
}
Console.WriteLine(equal);
}
Courtesy : https://www.dotnetperls.com/dictionary-equals

For ordered collections (List, Array) use SequenceEqual
for HashSet use SetEquals
for Dictionary you can do:
namespace System.Collections.Generic {
public static class ExtensionMethods {
public static bool DictionaryEquals<TKey, TValue>(this IReadOnlyDictionary<TKey, TValue> d1, IReadOnlyDictionary<TKey, TValue> d2) {
if (object.ReferenceEquals(d1, d2)) return true;
if (d2 is null || d1.Count != d2.Count) return false;
foreach (var (d1key, d1value) in d1) {
if (!d2.TryGetValue(d1key, out TValue d2value)) return false;
// Use the default comparer so that null values do not throw.
if (!EqualityComparer<TValue>.Default.Equals(d1value, d2value)) return false;
}
return true;
}
}
}
(A more optimized solution will use sorting but that will require IComparable<TValue>)

No, because the framework doesn't know how to compare the contents of your lists.
Have a look at this:
http://blogs.msdn.com/abhinaba/archive/2005/10/11/479537.aspx

public bool CompareStringLists(List<string> list1, List<string> list2)
{
if (list1.Count != list2.Count) return false;
foreach(string item in list1)
{
if (!list2.Contains(item)) return false;
}
return true;
}

There wasn't, there isn't, and there might never be; at least that's what I believe. The reason is that collection equality is probably user-defined behavior.
Elements in collections are not supposed to be in a particular order. Though they do have an ordering naturally, it's not what the comparing algorithms should rely on. Say you have two collections of:
{1, 2, 3, 4}
{4, 3, 2, 1}
Are they equal or not? You must know, but I don't know your point of view.
Collections are conceptually unordered by default, until the algorithms provide the sorting rules. SQL Server brings the same thing to your attention when you try to do pagination: it requires you to provide sorting rules:
https://learn.microsoft.com/en-US/sql/t-sql/queries/select-order-by-clause-transact-sql?view=sql-server-2017
Yet another two collections:
{1, 2, 3, 4}
{1, 1, 1, 2, 2, 3, 4}
Again, are they equal or not? You tell me ..
Element repeatability of a collection plays its role in different scenarios and some collections like Dictionary<TKey, TValue> don't even allow repeated elements.
I believe these kinds of equality are application defined and the framework therefore did not provide all of the possible implementations.
Well, in general cases Enumerable.SequenceEqual is good enough but it returns false in the following case:
var a = new Dictionary<String, int> { { "2", 2 }, { "1", 1 }, };
var b = new Dictionary<String, int> { { "1", 1 }, { "2", 2 }, };
Debug.Print("{0}", a.SequenceEqual(b)); // false
I read some answers to questions like this (you may google for them), and this is what I would use in general:
public static class CollectionExtensions {
public static bool Represents<T>(this IEnumerable<T> first, IEnumerable<T> second) {
if(object.ReferenceEquals(first, second)) {
return true;
}
if(first is IOrderedEnumerable<T> && second is IOrderedEnumerable<T>) {
return Enumerable.SequenceEqual(first, second);
}
if(first is ICollection<T> && second is ICollection<T>) {
if(first.Count()!=second.Count()) {
return false;
}
}
first=first.OrderBy(x => x.GetHashCode());
second=second.OrderBy(x => x.GetHashCode());
return CollectionExtensions.Represents(first, second);
}
}
That means one collection represents the other when they contain the same elements the same number of times, without taking the original ordering into account. Some notes on the implementation:
GetHashCode() is used just for the ordering, not for equality; I think it's enough in this case
Count() will not really enumerate the collection; it falls directly into the property implementation of ICollection<T>.Count
If the references are equal, it's just Boris

I've made my own compare method. It returns common, missing, and extra values.
private static void Compare<T>(IEnumerable<T> actual, IEnumerable<T> expected, out IList<T> common, out IList<T> missing, out IList<T> extra) {
common = new List<T>();
missing = new List<T>();
extra = new List<T>();
var expected_ = new LinkedList<T>( expected );
foreach (var item in actual) {
if (expected_.Remove( item )) {
common.Add( item );
} else {
extra.Add( item );
}
}
foreach (var item in expected_) {
missing.Add( item );
}
}

Comparing dictionaries' contents:
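To compare two Dictionary<K, V> objects by their keys, note that keys are unique within a dictionary, so two dictionaries have the same keys exactly when their key sets are equal. This check compares keys only; if the values stored under the same key can differ, the values have to be compared as well.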
Dictionary<K, V> dictionaryA, dictionaryB;
bool areDictionaryContentsEqual = new HashSet<K>(dictionaryA.Keys).SetEquals(dictionaryB.Keys);
Comparing collections' contents:
To compare two ICollection<T> objects, we need to check:
If they are of the same length.
If every T value that appears in the first collection appears an equal number of times in the second.
public static bool AreCollectionContentsEqual<T>(ICollection<T> collectionA, ICollection<T> collectionB)
where T : notnull
{
if (collectionA.Count != collectionB.Count)
{
return false;
}
Dictionary<T, int> countByValueDictionary = new(collectionA.Count);
foreach(T item in collectionA)
{
countByValueDictionary[item] = countByValueDictionary.TryGetValue(item, out int count)
? count + 1
: 1;
}
foreach (T item in collectionB)
{
if (!countByValueDictionary.TryGetValue(item, out int count) || count < 1)
{
return false;
}
countByValueDictionary[item] = count - 1;
}
return true;
}
These solutions should be optimal since their time and memory complexities are O(n), while the solutions that use ordering/sorting have time and memory complexities greater than O(n).

Related

Fastest way to check a Dictionary<> is equal to another [duplicate]

Assuming dictionary keys and values have their equals and hash methods implemented correctly, what is the most succinct and efficient way to test for equality of two dictionaries?
In this context, two dictionaries are said to be equal if they contain the same set of keys (order not important), and for every such key, they agree on the value.
Here are some ways I came up with (there are probably many more):
public bool Compare1<TKey, TValue>(
Dictionary<TKey, TValue> dic1,
Dictionary<TKey,TValue> dic2)
{
return dic1.OrderBy(x => x.Key).
SequenceEqual(dic2.OrderBy(x => x.Key));
}
public bool Compare2<TKey, TValue>(
Dictionary<TKey, TValue> dic1,
Dictionary<TKey, TValue> dic2)
{
return (dic1.Count == dic2.Count &&
dic1.Intersect(dic2).Count().
Equals(dic1.Count));
}
public bool Compare3<TKey, TValue>(
Dictionary<TKey, TValue> dic1,
Dictionary<TKey, TValue> dic2)
{
return (dic1.Intersect(dic2).Count().
Equals(dic1.Union(dic2).Count()));
}
dic1.Count == dic2.Count && !dic1.Except(dic2).Any();
It really depends on what you mean by equality.
This method will test that two dictionaries contain the same keys with the same values (assuming that both dictionaries use the same IEqualityComparer<TKey> implementation).
public bool CompareX<TKey, TValue>(
Dictionary<TKey, TValue> dict1, Dictionary<TKey, TValue> dict2)
{
if (dict1 == dict2) return true;
if ((dict1 == null) || (dict2 == null)) return false;
if (dict1.Count != dict2.Count) return false;
var valueComparer = EqualityComparer<TValue>.Default;
foreach (var kvp in dict1)
{
TValue value2;
if (!dict2.TryGetValue(kvp.Key, out value2)) return false;
if (!valueComparer.Equals(kvp.Value, value2)) return false;
}
return true;
}
You could use linq for the key/value comparisons:
public bool Compare<TKey, TValue>(Dictionary<TKey, TValue> dict1, Dictionary<TKey, TValue> dict2)
{
IEqualityComparer<TValue> valueComparer = EqualityComparer<TValue>.Default;
return dict1.Count == dict2.Count &&
dict1.Keys.All(key => dict2.ContainsKey(key) && valueComparer.Equals(dict1[key], dict2[key]));
}
In addition to @Nick Jones's answer, you're going to need to implement GetHashCode in the same order-agnostic way. I would suggest something like this:
public override int GetHashCode()
{
var hash = 13;
var orderedKVPList = this.DictProp.OrderBy(kvp => kvp.Key);
foreach (var kvp in orderedKVPList)
{
hash = (hash * 7) + kvp.Key.GetHashCode();
hash = (hash * 7) + kvp.Value.GetHashCode();
}
return hash;
}
I thought the accepted answer would be correct based on what I was reading in the smarthelp for the Except method: "Produces the set difference of two sequences by using the default equality comparer to compare values." But I discovered it is not a good answer.
Consider this code:
Dictionary<string, List<string>> oldDict = new Dictionary<string, List<string>>()
{{"001A", new List<string> {"John", "Doe"}},
{"002B", new List<string> {"Frank", "Abignale"}},
{"003C", new List<string> {"Doe", "Jane"}}};
Dictionary<string, List<string>> newDict = new Dictionary<string, List<string>>()
{{"001A", new List<string> {"John", "Doe"}},
{"002B", new List<string> {"Frank", "Abignale"}},
{"003C", new List<string> {"Doe", "Jane"}}};
bool equal = oldDict.Count.Equals(newDict.Count) && !oldDict.Except(newDict).Any();
Console.WriteLine(string.Format("oldDict {0} newDict", equal?"equals":"does not equal"));
equal = oldDict.SequenceEqual(newDict);
Console.WriteLine(string.Format("oldDict {0} newDict", equal ? "equals" : "does not equal"));
Console.WriteLine(string.Format("[{0}]", string.Join(", ",
oldDict.Except(newDict).Select(k =>
string.Format("{0}=[{1}]", k.Key, string.Join(", ", k.Value))))));
This results in the following:
oldDict does not equal newDict
oldDict does not equal newDict
[001A=[John, Doe], 002B=[Frank, Abignale], 003C=[Doe, Jane]]
As you can see, both "oldDict" and "newDict" are set up exactly the same, yet neither the suggested solution nor a call to SequenceEqual works properly. I wondered whether this was a result of Except using lazy loading or of the way the comparer is set up for the Dictionary. The more likely cause is that the values are List<string> instances, and List<T> does not override Equals, so two lists with the same contents compare as different objects under the default equality comparer.
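One way to make Except behave as expected here is to pass a comparer that looks inside the lists. The following is only a sketch: PairWithListComparer is a made-up name, and it assumes non-null keys and lists and that the order of strings within each list matters.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two pairs are equal when the keys match and the
// value lists contain the same strings in the same order.
public sealed class PairWithListComparer
    : IEqualityComparer<KeyValuePair<string, List<string>>>
{
    public bool Equals(KeyValuePair<string, List<string>> x,
                       KeyValuePair<string, List<string>> y)
    {
        return x.Key == y.Key && x.Value.SequenceEqual(y.Value);
    }

    public int GetHashCode(KeyValuePair<string, List<string>> pair)
    {
        // Hash on the key only; equal pairs always share a key,
        // so this stays consistent with Equals.
        return pair.Key.GetHashCode();
    }
}
```

With it, a check such as oldDict.Count == newDict.Count && !oldDict.Except(newDict, new PairWithListComparer()).Any() compares the list contents rather than the list references.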
Here's the solution I came up with. Note that the rule I used is as follows: two dictionaries are equal if both contain the same keys and the values for each key match. Both keys and values must be in the same sequential order. And my solution may not be the most efficient, since it relies on iterating through the entire set of keys.
private static bool DictionaryEqual(
Dictionary<string, List<string>> oldDict,
Dictionary<string, List<string>> newDict)
{
// Simple check, are the counts the same?
if (!oldDict.Count.Equals(newDict.Count)) return false;
// Verify the keys
if (!oldDict.Keys.SequenceEqual(newDict.Keys)) return false;
// Verify the values for each key
foreach (string key in oldDict.Keys)
if (!oldDict[key].SequenceEqual(newDict[key]))
return false;
return true;
}
Also see how the results change if:
Key order is not the same. (returns false)
newDict = new Dictionary<string, List<string>>()
{{"001A", new List<string> {"John", "Doe"}},
{"003C", new List<string> {"Doe", "Jane"}},
{"002B", new List<string> {"Frank", "Abignale"}}};
and
Key order matches, but Value does not match (returns false)
newDict = new Dictionary<string, List<string>>()
{{"001A", new List<string> {"John", "Doe"}},
{"002B", new List<string> {"Frank", "Abignale"}},
{"003C", new List<string> {"Jane", "Doe"}}};
If sequence order does not matter, the function can be changed to the following, but there is likely a performance hit.
private static bool DictionaryEqual_NoSort(
Dictionary<string, List<string>> oldDict,
Dictionary<string, List<string>> newDict)
{
// Simple check, are the counts the same?
if (!oldDict.Count.Equals(newDict.Count)) return false;
// iterate through all the keys in oldDict and
// verify whether the key exists in the newDict
foreach(string key in oldDict.Keys)
{
if (newDict.Keys.Contains(key))
{
// iterate through each value for the current key in oldDict and
// verify whether or not it exists for the current key in the newDict
foreach(string value in oldDict[key])
if (!newDict[key].Contains(value)) return false;
}
else { return false; }
}
return true;
}
Try DictionaryEqual_NoSort with the following newDict (it returns true):
newDict = new Dictionary<string, List<string>>()
{{"001A", new List<string> {"John", "Doe"}},
{"003C", new List<string> {"Jane", "Doe"}},
{"002B", new List<string> {"Frank", "Abignale"}}};
Simple O(N) time, O(1) space solution with null checks
The other solutions using Set operations Intersect, Union or Except are good but these require additional O(N) memory for the final resultant dictionary which is just used for counting elements.
Instead, use Linq Enumerable.All to check this. First validate the count of two dictionaries, next, iterate over all D1's Key Value pairs and check if they are equal to D2's Key Value pairs. Note: Linq does allocate memory for a collection iterator but it's invariant of the collection size - O(1) space. Amortized complexity for TryGetValue is O(1).
// KV is KeyValue pair
var areDictsEqual = d1.Count == d2.Count && d1.All(
(d1KV) => d2.TryGetValue(d1KV.Key, out var d2Value) && (
d1KV.Value == d2Value ||
d1KV.Value?.Equals(d2Value) == true)
);
Why d1KV.Value == d2Value? - this is to check if object references are equal. Also, if both are null, d1KV.Value == d2Value will evaluate to true.
Why d1KV.Value?.Equals(d2Value) == true? - Value?. is the null-safe check, and .Equals tests equality of the two objects based on your type's Equals implementation.
You can tweak the equality checks as you like. I'm assuming the Dict Values are nullable type to make the solution more generic (eg: string, int?, float?). If it's non-nullable type, the checks could be simplified.
Final note: In a C# dictionary, the keys can't be null, but the values can be null.
@Allen's answer:
bool equals = a.Intersect(b).Count() == a.Union(b).Count()
is about arrays, but since only IEnumerable<T> methods are used, it can be used for Dictionary<K,V> too.
If two dictionaries contain the same keys, but in different order, should they be considered equal? If not, then the dictionaries should be compared by running enumerators through both simultaneously. This will probably be faster than enumerating through one dictionary and looking up each element in the other. If you have advance knowledge that equal dictionaries will have their elements in the same order, such a double-enumeration is probably the way to go.
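A sketch of that double-enumeration, with made-up names (OrderedDictionaryCompare, EqualInSameOrder). It assumes the default equality comparers are appropriate for both keys and values, and it only makes sense when equal dictionaries are known to enumerate in the same order.

```csharp
using System.Collections.Generic;

public static class OrderedDictionaryCompare
{
    // Hypothetical helper: walks both dictionaries in lockstep and compares
    // the pairs found at the same position in each enumeration.
    public static bool EqualInSameOrder<TKey, TValue>(
        IDictionary<TKey, TValue> a, IDictionary<TKey, TValue> b)
    {
        if (a.Count != b.Count) return false;
        var keyComparer = EqualityComparer<TKey>.Default;
        var valueComparer = EqualityComparer<TValue>.Default;
        using (var ea = a.GetEnumerator())
        using (var eb = b.GetEnumerator())
        {
            while (ea.MoveNext() && eb.MoveNext())
            {
                if (!keyComparer.Equals(ea.Current.Key, eb.Current.Key)) return false;
                if (!valueComparer.Equals(ea.Current.Value, eb.Current.Value)) return false;
            }
        }
        return true;
    }
}
```

This avoids hash lookups entirely, at the cost of treating differently ordered dictionaries as unequal.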

How to write generic function processing two Lists holding any kind of objects?

I want to write a function that processes two Lists of the same objects. The function always does the same thing:
Find the objects that are only in List2 but not in List1 -> Do something with them
Find the objects that are in both Lists -> Do something different with them.
Now the point is, that I have List pairs holding different kind of objects to which I want to apply this exact process.
Example:
List<Foo1> L11, L12;
List<Foo2> L21, L22;
List<Foo3> L31, L32;
So how do I have to write the code, so that I do not have to repeat the code for each List type ?
Greetings and Thank you
I would prepare a method, like below:
static void Process<T>(IEnumerable<T> list1, IEnumerable<T> list2, Action<T> onlyIn2, Action<T> inBoth)
{
var hash = new HashSet<T>(list1);
foreach (var item2 in list2)
if (hash.Contains(item2))
inBoth(item2);
else
onlyIn2(item2);
}
You can then use it as follows:
var list1 = new List<int> {1, 2, 3, 4, 5};
var list2 = new List<int> {3, 4, 5, 6};
Process(list1, list2, a =>
{
Console.WriteLine("{0} only in 2", a);
}, a =>
{
Console.WriteLine("{0} in both", a);
});
Note that it uses standard comparison rules (for objects, reference equality unless Equals is overridden or IEquatable<T> is implemented).
LINQ already provides two methods which do this:
// get all members of L11 not present in L12
var except = L11.Except(L12).ToList();
// get members present in both lists
var intersect = L11.Intersect(L12).ToList();
These overloads will use the default comparer for the list element type, so since you want to compare custom classes, you will need to use the overload which accepts a custom IEqualityComparer<T>:
var comparer = new CustomComparer();
var except = L11.Except(L12, comparer).ToList();
var intersect = L11.Intersect(L12, comparer).ToList();
which you need to write yourself:
class CustomComparer : IEqualityComparer<SomeClass>
{
public bool Equals(SomeClass x, SomeClass y)
{
// return true if equal
}
public int GetHashCode(SomeClass obj)
{
// return a hash code for obj
}
}
You can use the Except/Intersect LINQ methods as follows:
void Process<T>(IList<T> list1, IList<T> list2, IEqualityComparer<T> comparer = null) {
//Find the objects that are only in List2 but not in List1
foreach(var item in list2.Except(list1, comparer)) {
// -> Do something with them
}
//Find the object that are in both Lists -> Do something different with them.
foreach(var item in list1.Intersect(list2, comparer)) {
// -> Do something different with them.
}
}

Generic list comparison in C#

I have a method that finds differences between two lists of ints using a dictionary. Essentially the code loops the first list, adding each int to the dictionary and setting (to 1 where not already present)/incrementing the value. It then loops the second list setting (to -1 where not already present)/decrementing the value.
Once it has looped both lists you end up with a dictionary where keys with values = 0 indicate a match, keys with values >=1 indicate presence only in the first list and values <=-1 indicate presence only in the second list.
Firstly, is this a sensible implementation?
Secondly, I would like to make it more generic, at the moment it can only handle int based lists. I'd like something that could handle any object where the caller could potentially define the comparison logic...
public static Dictionary<int, int> CompareLists(List<int> listA, List<int> listB)
{
// 0 Match
// <= -1 listB only
// >= 1 listA only
var recTable = new Dictionary<int, int>();
foreach (int value in listA)
{
if (recTable.ContainsKey(value))
recTable[value]++;
else
recTable[value] = 1;
}
foreach (int value in listB)
{
if (recTable.ContainsKey(value))
recTable[value]--;
else
recTable[value] = -1;
}
return recTable;
}
Thanks in advance!
In response to:
"It won't work properly if to example you have same value appears twice in listA and once in listB, result will be positive, which say "listA only" in your comments."
Let me clarify; if a value appears twice in listA it should also appear twice in listB - So if a value is in listA twice and once in listB, I don't care which one from listA it picks to match, as long as the one non-reconciling item is reported correctly.
Imagine the use-case where you are trying to reconcile lots of payment amounts between two files, it's entirely feasible to have repeating amounts but it doesn't really matter which of the duplicates are matched as long as the non-reconciling values are reported.
To answer your second question, here's how to make it more generic:
public static Dictionary<T, int> CompareLists<T>(IEnumerable<T> listA,
IEnumerable<T> listB, IEqualityComparer<T> comp)
{
var recTable = new Dictionary<T, int>(comp);
foreach (var value in listA)
{
if (recTable.ContainsKey(value))
recTable[value]++;
else
recTable[value] = 1;
}
foreach (var value in listB)
{
if (recTable.ContainsKey(value))
recTable[value]--;
else
recTable[value] = -1;
}
return recTable;
}
This is more generic because:
I pass in the type T instead of an int.
I use IEnumerables instead of Lists.
I pass in an IEqualityComparer and pass it to the Dictionary constructor which needs to use it.
I use var in the foreach loops instead of int. You can also use T.
You call this code like this:
static void Main()
{
int[] arr1 = { 1, 2, 3 };
int[] arr2 = { 3, 2, 1 };
var obj = CompareLists(arr1, arr2, EqualityComparer<int>.Default);
Console.ReadLine();
}
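To read the result, the returned table can be split into the three groups the question describes. This is a sketch only; ReconciliationReport and Split are hypothetical names, and the helper accepts any dictionary of signed counts rather than calling CompareLists itself.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ReconciliationReport
{
    // Hypothetical helper: splits a table of signed counts into matched
    // keys (0), keys left over from the first list (> 0) and keys left
    // over from the second list (< 0).
    public static void Split<T>(
        IDictionary<T, int> table,
        out List<T> matched, out List<T> onlyInA, out List<T> onlyInB)
    {
        matched = table.Where(p => p.Value == 0).Select(p => p.Key).ToList();
        onlyInA = table.Where(p => p.Value > 0).Select(p => p.Key).ToList();
        onlyInB = table.Where(p => p.Value < 0).Select(p => p.Key).ToList();
    }
}
```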
Here's an example of implementing IEqualityComparer. This treats all odd ints as equal and all even ints as equal:
public class MyEq : IEqualityComparer<int>
{
public bool Equals(int x, int y)
{
return (x % 2) == (y % 2);
}
public int GetHashCode(int obj)
{
return (obj % 2).GetHashCode();
}
}
FullOuterJoin as found here: LINQ - Full Outer Join
public static Dictionary<int, int> CompareLists(List<int> listA, List<int> listB)
{
return listA.FullOuterJoin(listB,
a=>a, // What to compare from ListA
b=>b, // What to compare from ListB
(a,b,key)=>
new {key=key,value=0}, // What to return if found in both
new {key=key,value=-1},// What to return if found only in A
new {key=key,value=1}) // What to return if found only in B
.ToDictionary(a=>a.key,a=>a.value); // Only because you want a dictionary
}
You can do this using Generics:
public static Dictionary<T, int> CompareLists<T>(List<T> listA, List<T> listB)
{
// 0 Match
// <= -1 listB only
// >= 1 listA only
var recTable = new Dictionary<T, int>();
foreach (T value in listA)
{
if (recTable.ContainsKey(value))
recTable[value]++;
else
recTable[value] = 1;
}
foreach (T value in listB)
{
if (recTable.ContainsKey(value))
recTable[value]--;
else
recTable[value] = -1;
}
return recTable;
}
These are my two cents:
public static Dictionary<T, int> CompareLists<T>(List<T> left, List<T> right, IEqualityComparer<T> comparer)
{
Dictionary<T, int> result = left.ToDictionary(l => l, l => right.Any(r => comparer.Equals(l, r)) ? 0 : -1);
foreach (T r in right.Where(t => result.Keys.All(k => !comparer.Equals(k, t))))
result[r] = 1;
return result;
}
The method takes Lists of any type T and an IEqualityComparer for that type T. It then at first generates a dictionary of those elements contained in the "left" List, thereby checking if they are also in the "right" List and setting the value accordingly.
The second step adds the elements that are only contained in the "right" List with value 1.
Whether this is a sensible implementation depends on what you are trying to achieve with it. I think it's short but still readable, relying on proper implementation of the LINQ methods. There might be faster options to consider, though, if this is for really big lists or a very frequently called method.

Iterate over multiple lists

Given a bunch of lists, I need to iterate over them simultaneously. Suppose I have three of them: list1, list2, and list3.
What I found so far is the following:
foreach (var tuple in list1.Zip(list2, (first, second) => new { object1 = first, object2 = second })
.Zip(list3, (first, second) => new { object1 = first.object1, object2 = first.object2, object3 = second }))
{
//do stuff
}
This works fine and is quite readable, as long as the number of lists is small. I know how to extend it further to 4, 5, ... lists, but if I zip 10 of them, the code would be extremely long. Is there any possibility to refactor it? Or would I need a solution other than the Zip function?
With a help of a bit of code generation (think T4), one could produce up to 6 overloads (because Tuple is limited to 7 generic arguments) of something similar to:
public static class Iterate
{
public static IEnumerable<Tuple<T1, T2, T3>> Over<T1, T2, T3>(IEnumerable<T1> t1s, IEnumerable<T2> t2s, IEnumerable<T3> t3s)
{
using(var it1s = t1s.GetEnumerator())
using(var it2s = t2s.GetEnumerator())
using(var it3s = t3s.GetEnumerator())
{
while(it1s.MoveNext() && it2s.MoveNext() && it3s.MoveNext())
yield return Tuple.Create(it1s.Current, it2s.Current, it3s.Current);
}
}
}
With this Iterate class, iteration becomes very simple:
foreach(var t in Iterate.Over(
new[] { 1, 2, 3 },
new[] { "a", "b", "c" },
new[] { 1f, 2f, 3f }))
{
}
This can be further generalized (with a total loss of type safety) to:
public static IEnumerable<object[]> Over(params IEnumerable[] enumerables)
Why not good old for loop?
int n = new int[] {
list1.Count,
list2.Count,
list3.Count,
// etc.
}.Min(); // if lists have different number of items
for (int i = 0; i < n; ++i) {
var item1 = list1[i]; // if you want an item
...
}
As far as I get it, the real problem is the unknown number of lists to iterate over. Another issue I see is that there is no guarantee that all the lists will have the same length... correct?
If the number of lists is unknown, tuples won't do it because they will go up to 8... and must be set at compile time...
In that case I would suggest that, instead of mapping to a tuple, you use a simple and very old structure: a matrix! The width will be the number of lists (known at runtime) and the depth will be the length of the longest list. You can iterate using a simple and well-known for loop, and have the compiler optimise memory and allocation. The code will be very readable not only to C# folks but to practically anyone who works with any kind of programming language.
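A rough sketch of the matrix idea, under the assumption that all lists share one element type T. ListMatrix and Build are made-up names; cells past the end of a shorter list simply keep default(T).

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListMatrix
{
    // Hypothetical helper: one row per list, one column per position,
    // sized to the longest list. Missing cells keep default(T).
    public static T[,] Build<T>(params IList<T>[] lists)
    {
        int depth = lists.Length == 0 ? 0 : lists.Max(l => l.Count);
        var matrix = new T[lists.Length, depth];
        for (int row = 0; row < lists.Length; row++)
        {
            for (int col = 0; col < lists[row].Count; col++)
            {
                matrix[row, col] = lists[row][col];
            }
        }
        return matrix;
    }
}
```

Iterating "simultaneously" then means looping over the column index and reading matrix[row, col] for each row.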
Adding to @AntonGogolev's answer, on his last remark: if you don't care about type safety and performance (because of boxing/unboxing), you could implement an enumerator using object[]:
public static class Iterator
{
public static IEnumerable<object[]> Enumerate(params IEnumerable[] enumerables)
{
var list = new List<object>();
var enumerators = new List<IEnumerator>();
bool end = false;
foreach(var enu in enumerables)
{
enumerators.Add(enu.GetEnumerator());
}
while(!end)
{
list.Clear();
foreach(var enu in enumerators)
{
if(!enu.MoveNext()) { end = true; break; }
list.Add(enu.Current);
}
if(!end) yield return list.ToArray();
}
}
}
Warning: no effort whatsoever has been made to optimize this code and it has been written as it came through the fingers :-)
You can use it like:
var listA = new[] { 1, 2, 3 };
var listB = new[] { "a", "b", "c" };
var listC = new[] { 5f, 6f, 7f };
foreach(var n in Iterator.Enumerate(listA, listB, listC))
{
foreach(var obj in n)
{
Console.Write(obj.ToString() + ", ");
}
Console.WriteLine();
}
Fiddle here: https://dotnetfiddle.net/irTY8M

Fastest way to Remove Duplicate Value from a list<> by lambda

What is the fastest way to remove duplicate values from a list?
Assume List<long> longs = new List<long> { 1, 2, 3, 4, 3, 2, 5 }; I am interested in using a lambda to remove the duplicates and get back {1, 2, 3, 4, 5}. What is your suggestion?
The easiest way to get a new list would be:
List<long> unique = longs.Distinct().ToList();
Is that good enough for you, or do you need to mutate the existing list? The latter is significantly more long-winded.
Note that Distinct() isn't guaranteed to preserve the original order, but in the current implementation it will - and that's the most natural implementation. See my Edulinq blog post about Distinct() for more information.
If you don't need it to be a List<long>, you could just keep it as:
IEnumerable<long> unique = longs.Distinct();
At this point it will go through the de-duping each time you iterate over unique though. Whether that's good or not will depend on your requirements.
You can use this extension method for enumerables containing more complex types:
IEnumerable<Foo> distinctList = sourceList.DistinctBy(x => x.FooName);
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector)
{
var knownKeys = new HashSet<TKey>();
return source.Where(element => knownKeys.Add(keySelector(element)));
}
There is the Distinct() method; it should work.
List<long> longs = new List<long> { 1, 2, 3, 4, 3, 2, 5 };
var distinctList = longs.Distinct().ToList();
If you want to stick with the original List instead of creating a new one, you can do something similar to what the Distinct() extension method does internally, i.e. use a HashSet to check for uniqueness:
HashSet<long> set = new HashSet<long>(longs.Count);
longs.RemoveAll(x => !set.Add(x));
The List class provides this convenient RemoveAll(predicate) method that removes all elements that satisfy the condition specified by the predicate. The predicate is a delegate taking a parameter of the list's element type and returning a bool value. The HashSet's Add() method returns true only if the set doesn't contain the item yet. Thus, by removing any items from the list that can't be added to the set, you effectively remove all duplicates.
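The same idea can be written as a generic helper. This is a minimal sketch with hypothetical names (ListDedup, RemoveDuplicatesInPlace); it assumes the element type's default equality is what you want and relies on RemoveAll visiting the elements front to back, as the current implementation does.

```csharp
using System.Collections.Generic;

public static class ListDedup
{
    // Hypothetical helper: removes later duplicates in place, keeping the
    // first occurrence of each value and the order of the survivors.
    public static void RemoveDuplicatesInPlace<T>(List<T> list)
    {
        var seen = new HashSet<T>();
        // Add returns false when the item is already in the set,
        // so the predicate matches exactly the repeated items.
        list.RemoveAll(item => !seen.Add(item));
    }
}
```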
List<long> distinctlongs = longs.Distinct().OrderBy(x => x).ToList();
A simple intuitive implementation
public static List<PointF> RemoveDuplicates(List<PointF> listPoints)
{
List<PointF> result = new List<PointF>();
for (int i = 0; i < listPoints.Count; i++)
{
if (!result.Contains(listPoints[i]))
result.Add(listPoints[i]);
}
return result;
}
In-place:
public static void DistinctValues<T>(List<T> list)
{
list.Sort();
int src = 0;
int dst = 0;
while (src < list.Count)
{
var val = list[src];
list[dst] = val;
++dst;
while (++src < list.Count && list[src].Equals(val)) ;
}
if (dst < list.Count)
{
list.RemoveRange(dst, list.Count - dst);
}
}
