How to create a <T> function for list comparison?


public static bool CompareLists(List<Product> lstProduct1, List<Product> lstProduct2, List<DuplicateExpression> DuplicateExpression)
{
string[] Fields = DuplicateExpression.Select(x => x.ExpressionName).ToArray();
//var JoinExp = lstProduct1.Join(lstProduct2, new[] { "ProductName", "ProductCode" });
var JoinExp = lstProduct1.Join(lstProduct2, Fields);
bool IsSuccess = CompareTwoLists(lstProduct1, lstProduct2, (listProductx1, listProductx2) => JoinExp.Any());
return IsSuccess;
}
How can I convert the function above into a generic <T> function? It is essentially a list comparison function.

SequenceEqual solves your problem.
new[] { "A", "B" }.SequenceEqual(new[] { "A", "B" }).Should().BeTrue();
Here is the source code.
public static bool SequenceEqual<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
{
if (comparer == null) comparer = EqualityComparer<TSource>.Default;
if (first == null) throw Error.ArgumentNull("first");
if (second == null) throw Error.ArgumentNull("second");
using (IEnumerator<TSource> e1 = first.GetEnumerator())
using (IEnumerator<TSource> e2 = second.GetEnumerator())
{
while (e1.MoveNext())
{
if (!(e2.MoveNext() && comparer.Equals(e1.Current, e2.Current)))
return false;
}
if (e2.MoveNext())
return false;
}
return true;
}
In your case you could replace IEnumerable<TSource> with IList<TSource> or even List<TSource>, although ideally the highest level of abstraction is preferred.
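Applied to the question, a generic version might look like the sketch below. The name CompareLists<T> and the optional comparer parameter are assumptions made for illustration; the field-based Join from the original code is not reproduced, only the element-by-element comparison that SequenceEqual provides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListComparison
{
    // Hypothetical generic helper: true when both sequences contain equal
    // elements in the same order. A null comparer falls back to
    // EqualityComparer<T>.Default inside SequenceEqual.
    public static bool CompareLists<T>(
        IEnumerable<T> first,
        IEnumerable<T> second,
        IEqualityComparer<T> comparer = null)
    {
        if (first == null) throw new ArgumentNullException(nameof(first));
        if (second == null) throw new ArgumentNullException(nameof(second));
        return first.SequenceEqual(second, comparer);
    }
}
```

For a class such as Product, pass an IEqualityComparer<Product> that compares the fields you care about; without one, instances are compared by reference unless the type overrides Equals.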

Related

Populate a list inside a linq select statement using the query's index

I have a list (assume it has values):
var IdNumber = new List<string>();
and I have a LINQ query:
var output = list.Select(y => new {
Id = IdNumber[index],
Name = y.Name,
}).ToList();
Is there an index that can be used to populate 'IdNumber' in Id? If it is impossible, is there any other way to do it?

Select can give an index.
var output = list.Select((y,index) => new {
Id = IdNumber[index],
Name = y.Name,
}).ToList();
Basic demo:
public static void Main()
{
var list = new[]{"abc","abc","abc"};
foreach (var item in list.Select((x,i)=>x[i]))
Console.WriteLine(item);
}
output
a
b
c
Select implementation:
public static IEnumerable<TResult> Select<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, int, TResult> selector) {
if (source == null) throw Error.ArgumentNull("source");
if (selector == null) throw Error.ArgumentNull("selector");
return SelectIterator<TSource, TResult>(source, selector);
}
static IEnumerable<TResult> SelectIterator<TSource, TResult>(IEnumerable<TSource> source, Func<TSource, int, TResult> selector) {
int index = -1;
foreach (TSource element in source) {
checked { index++; }
yield return selector(element, index);
}
}

Select in LINQ has an overload with a built-in index. You just have to add a second parameter to the lambda.
Replace the beginning of your code with
var output = list.Select((y, index) => new {
...
}).ToList();

StartWith method for arrays

Is there a StartWith method for arrays in .NET? Or something similar to it in LINQ?
var arr1 = new[] { "A", "B", "C" };
var arr2 = new[] { "A", "B", "C", "D" };
var arr3 = new[] { "A", "B", "CD" };
var arr4 = new[] { "E", "A", "B", "C" };
arr2.StartWith(arr1) // true
arr1.StartWith(arr2) // false
arr3.StartWith(arr1) // false
arr4.StartWith(arr1) // false
Or should I just do it the straightforward way:
bool StartWith(string[] arr1, string[] arr2)
{
if (arr1.Length < arr2.Length) return false;
for (var i = 0; i < arr2.Length; i++)
{
if (arr2[i] != arr1[i]) return false;
}
return true;
}
I'm looking for the most efficient way to do that.

bool answer = arr2.Take(arr1.Length).SequenceEqual(arr1);

Your "straightforward" way is how most LINQ methods would do it anyway. There are a few tweaks you could make. For example, make it an extension method and take an equality comparer for the element type so that custom comparers can be used.
public static class ExtensionMethods
{
public static bool StartWith<T>(this T[] arr1, T[] arr2)
{
return StartWith(arr1, arr2, EqualityComparer<T>.Default);
}
public static bool StartWith<T>(this T[] arr1, T[] arr2, IEqualityComparer<T> comparer)
{
if (arr1.Length < arr2.Length) return false;
for (var i = 0; i < arr2.Length; i++)
{
if (!comparer.Equals(arr2[i], arr1[i])) return false;
}
return true;
}
}
UPDATE: For fun I decided to take the time and write a little more "advanced" version that would work with any IEnumerable<T> and not just arrays.
public static class ExtensionMethods
{
public static bool StartsWith<T>(this IEnumerable<T> @this, IEnumerable<T> startsWith)
{
return StartsWith(@this, startsWith, EqualityComparer<T>.Default);
}
public static bool StartsWith<T>(this IEnumerable<T> @this, IEnumerable<T> startsWith, IEqualityComparer<T> comparer)
{
if (@this == null) throw new ArgumentNullException("this");
if (startsWith == null) throw new ArgumentNullException("startsWith");
if (comparer == null) throw new ArgumentNullException("comparer");
//Check to see if both types implement ICollection<T> to get a free Count check.
var thisCollection = @this as ICollection<T>;
var startsWithCollection = startsWith as ICollection<T>;
if (thisCollection != null && startsWithCollection != null && (thisCollection.Count < startsWithCollection.Count))
return false;
using (var thisEnumerator = @this.GetEnumerator())
using (var startsWithEnumerator = startsWith.GetEnumerator())
{
//Keep looping till the startsWithEnumerator runs out of items.
while (startsWithEnumerator.MoveNext())
{
//Check to see if the thisEnumerator ran out of items.
if (!thisEnumerator.MoveNext())
return false;
if (!comparer.Equals(thisEnumerator.Current, startsWithEnumerator.Current))
return false;
}
}
return true;
}
}

You can do:
var result = arr2.Take(arr1.Length).SequenceEqual(arr1);
To optimize it further, you can add the check arr2.Length >= arr1.Length at the start:
var result = arr2.Length >= arr1.Length && arr2.Take(arr1.Length).SequenceEqual(arr1);
The end result is the same.
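As a rough check against the four arrays from the question, the length guard and the Take/SequenceEqual call can be wrapped in a small helper. The class and method names here are made up for the example and are not part of .NET.

```csharp
using System;
using System.Linq;

public static class PrefixDemo
{
    // Hypothetical helper: true when `source` begins with every element of
    // `prefix`, in order. The length check avoids enumerating at all when
    // `source` is too short to contain `prefix`.
    public static bool StartsWith(string[] source, string[] prefix)
    {
        return source.Length >= prefix.Length
            && source.Take(prefix.Length).SequenceEqual(prefix);
    }
}
```

With arr1 = { "A", "B", "C" } as the prefix, only arr2 = { "A", "B", "C", "D" } matches; arr3 fails on its third element, arr4 on its first, and arr1 is too short to start with arr2.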

Try Enumerable.SequenceEqual(a1, a2), but trim the longer array first, i.e.,
var arr1 = new[] { "A", "B", "C" };
var arr2 = new[] { "A", "B", "C", "D" };
if (Enumerable.SequenceEqual(arr1, arr2.Take(arr1.Length))) { /* arr2 starts with arr1 */ }

You don't want to require everything to be an array, and you don't want to call Count() on an IEnumerable<T> that may be a large query, when you only really want to sniff at the first four items or whatever.
public static class Extensions
{
public static void Test()
{
var a = new[] { "a", "b" };
var b = new[] { "a", "b", "c" };
var c = new[] { "a", "b", "c", "d" };
var d = new[] { "x", "y" };
Console.WriteLine("b.StartsWith(a): {0}", b.StartsWith(a));
Console.WriteLine("b.StartsWith(c): {0}", b.StartsWith(c));
Console.WriteLine("b.StartsWith(d, x => x.Length): {0}",
b.StartsWith(d, x => x.Length));
}
public static bool StartsWith<T>(
this IEnumerable<T> sequence,
IEnumerable<T> prefixCandidate,
Func<T, T, bool> compare = null)
{
using (var eseq = sequence.GetEnumerator())
using (var eprefix = prefixCandidate.GetEnumerator())
{
if (compare == null)
{
compare = (x, y) => Object.Equals(x, y);
}
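// An empty prefix matches any sequence; an empty sequence matches only an empty prefix.
if (!eprefix.MoveNext()) return true;
if (!eseq.MoveNext()) return false;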
do
{
if (!compare(eseq.Current, eprefix.Current))
return false;
if (!eprefix.MoveNext())
return true;
}
while (eseq.MoveNext());
return false;
}
}
public static bool StartsWith<T, TProperty>(
this IEnumerable<T> sequence,
IEnumerable<T> prefixCandidate,
Func<T, TProperty> selector)
{
using (var eseq = sequence.GetEnumerator())
using (var eprefix = prefixCandidate.GetEnumerator())
{
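// An empty prefix matches any sequence; an empty sequence matches only an empty prefix.
if (!eprefix.MoveNext()) return true;
if (!eseq.MoveNext()) return false;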
do
{
if (!Object.Equals(
selector(eseq.Current),
selector(eprefix.Current)))
{
return false;
}
if (!eprefix.MoveNext())
return true;
}
while (eseq.MoveNext());
return false;
}
}
}

Here are some different ways of doing that. I didn't optimize or fully validate everything, so there is room for improvement throughout, but this should give you some idea.
The best performance generally comes from going low level: if you grab the enumerators and step through them yourself, you get noticeably faster results.
Methods and performance results:
StartsWith1 00:00:01.9014586
StartsWith2 00:00:02.1227468
StartsWith3 00:00:03.2222109
StartsWith4 00:00:05.5544177
Test method:
var watch = new Stopwatch();
watch.Start();
for (int i = 0; i < 10000000; i++)
{
bool test = action(arr2, arr1);
}
watch.Stop();
return watch.Elapsed;
Methods:
public static class IEnumerableExtender
{
public static bool StartsWith1<T>(this IEnumerable<T> source, IEnumerable<T> compare)
{
if (source.Count() < compare.Count())
{
return false;
}
using (var se = source.GetEnumerator())
{
using (var ce = compare.GetEnumerator())
{
while (ce.MoveNext() && se.MoveNext())
{
if (!ce.Current.Equals(se.Current))
{
return false;
}
}
}
}
return true;
}
public static bool StartsWith2<T>(this IEnumerable<T> source, IEnumerable<T> compare) =>
source.Take(compare.Count()).SequenceEqual(compare);
public static bool StartsWith3<T>(this IEnumerable<T> source, IEnumerable<T> compare)
{
if (source == null)
{
throw new ArgumentNullException(nameof(source));
}
if (compare == null)
{
throw new ArgumentNullException(nameof(compare));
}
if (source.Count() < compare.Count())
{
return false;
}
return compare.SequenceEqual(source.Take(compare.Count()));
}
public static bool StartsWith4<T>(this IEnumerable<T> arr1, IEnumerable<T> arr2)
{
return StartsWith4(arr1, arr2, EqualityComparer<T>.Default);
}
public static bool StartsWith4<T>(this IEnumerable<T> arr1, IEnumerable<T> arr2, IEqualityComparer<T> comparer)
{
if (arr1.Count() < arr2.Count()) return false;
for (var i = 0; i < arr2.Count(); i++)
{
if (!comparer.Equals(arr2.ElementAt(i), arr1.ElementAt(i))) return false;
}
return true;
}
}

Unexpected behavior using Enumerable.Empty<string>()

I would expect Enumerable.Empty<string>() to return an empty array of strings. Instead, it appears to return an array with a single null value. This breaks other LINQ operators like DefaultIfEmpty, since the enumerable is not, in fact, empty. This doesn't seem to be documented anywhere, so I'm wondering if I'm missing something (99% probability).
GameObject Class
public GameObject(string id,IEnumerable<string> keywords) {
if (String.IsNullOrWhiteSpace(id)) {
throw new ArgumentException("invalid", "id");
}
if (keywords==null) {
throw new ArgumentException("invalid", "keywords");
}
if (keywords.DefaultIfEmpty() == null) { //This line doesn't work correctly.
throw new ArgumentException("invalid", "keywords");
}
if (keywords.Any(kw => String.IsNullOrWhiteSpace(kw))) {
throw new ArgumentException("invalid", "keywords");
}
_id = id;
_keywords = new HashSet<string>(keywords);
}
Test
[TestMethod]
[ExpectedException(typeof(ArgumentException))]
public void EmptyKeywords() {
GameObject test = new GameObject("test",System.Linq.Enumerable.Empty<string>());
}

It looks like you expect this condition:
keywords.DefaultIfEmpty() == null
to evaluate to true. However, if the source sequence is empty, DefaultIfEmpty returns a singleton sequence containing the default value for the element type (string in this case). It will therefore return a sequence containing null. That sequence is not itself null, however, so the condition evaluates to false.
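The difference is easy to see in isolation. The sketch below uses made-up helper names to show what DefaultIfEmpty yields, alongside the Any()-based check that tests for emptiness directly.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DefaultIfEmptyDemo
{
    // Number of elements produced once DefaultIfEmpty has been applied.
    // For an empty input this is 1 (the single default value), never 0.
    public static int CountAfterDefault(IEnumerable<string> keywords)
    {
        return keywords.DefaultIfEmpty().Count();
    }

    // A direct emptiness check: Any() stops after looking for one element.
    public static bool IsEmpty(IEnumerable<string> keywords)
    {
        return !keywords.Any();
    }
}
```

For Enumerable.Empty<string>(), CountAfterDefault returns 1 and IsEmpty returns true, so the sequence returned by Enumerable.Empty really is empty; the extra null element comes from DefaultIfEmpty.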

You are misinterpreting what DefaultIfEmpty does. Here is its implementation from the reference source.
public static IEnumerable<TSource> DefaultIfEmpty<TSource>(this IEnumerable<TSource> source) {
return DefaultIfEmpty(source, default(TSource));
}
public static IEnumerable<TSource> DefaultIfEmpty<TSource>(this IEnumerable<TSource> source, TSource defaultValue) {
if (source == null) throw Error.ArgumentNull("source");
return DefaultIfEmptyIterator<TSource>(source, defaultValue);
}
static IEnumerable<TSource> DefaultIfEmptyIterator<TSource>(IEnumerable<TSource> source, TSource defaultValue) {
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (e.MoveNext()) {
do {
yield return e.Current;
} while (e.MoveNext());
}
else {
yield return defaultValue;
}
}
}
So if the IEnumerable<T> is not empty, it simply yields the source's elements; if the IEnumerable<T> is empty, it yields a single element with the value default(T). It never returns null, which is what your code is testing for. If you wanted to test this, you would need to do
if(keywords.DefaultIfEmpty().First() == null)
However this is going to cause the IEnumerable<string> to be evaluated multiple times. I would drop the LINQ and just do like the LINQ method does and do it the long way (this also gets rid of the extra evaluation you had inside new HashSet<string>(keywords)).
public GameObject(string id,IEnumerable<string> keywords)
{
if (String.IsNullOrWhiteSpace(id)) {
throw new ArgumentException("invalid", "id");
}
if (keywords==null) {
throw new ArgumentException("invalid", "keywords");
}
_keywords = new HashSet<string>();
using (var e = keywords.GetEnumerator())
{
if (e.MoveNext())
{
do
{
if(e.Current == null)
throw new ArgumentException("invalid", "keywords");
_keywords.Add(e.Current);
} while (e.MoveNext());
}
else
{
throw new ArgumentException("invalid", "keywords");
}
}
_id = id;
}
This makes it so you only loop once over the IEnumerable<string>.

Does this solve your problem?
public GameObject(string id, IEnumerable<string> keywords) {
if (String.IsNullOrWhiteSpace(id)) {
throw new ArgumentException("invalid", "id");
}
if (keywords == null || !keywords.Any()
|| keywords.Any(k => String.IsNullOrWhiteSpace(k))) {
throw new ArgumentException("invalid", "keywords");
}
_id = id;
_keywords = new HashSet<string>(keywords);
}
*Improved the code with suggestions from @ScottChamberlain & @ginkner

Quick way to get the difference between two List<> objects

How do I get itemsToRemove to only contain "bar one", and itemsToAdd to only contain "bar five"?
I'm trying to use "Except", but obviously I'm using it incorrectly.
var oldList = new List<Foo>();
oldList.Add(new Foo(){ Bar = "bar one"});
oldList.Add(new Foo(){ Bar = "bar two"});
oldList.Add(new Foo(){ Bar = "bar three"});
oldList.Add(new Foo(){ Bar = "bar four"});
var newList = new List<Foo>();
newList.Add(new Foo(){ Bar = "bar two"});
newList.Add(new Foo(){ Bar = "bar three"});
newList.Add(new Foo(){ Bar = "bar four"});
newList.Add(new Foo(){ Bar = "bar five"});
var itemsToRemove = oldList.Except(newList); // should only contain "bar one"
var itemsToAdd = newList.Except(oldList); // should only contain "bar five"
foreach(var item in itemsToRemove){
Console.WriteLine(item.Bar + " removed");
// currently says
// bar one removed
// bar two removed
// bar three removed
// bar four removed
}
foreach(var item in itemsToAdd){
Console.WriteLine(item.Bar + " added");
// currently says
// bar two added
// bar three added
// bar four added
// bar five added
}

Except will use the default Equals and GetHashCode method of the objects in question to define "equality" for the objects, unless you provide a custom comparer (you have not). In this case, that will compare the references of the objects, not their Bar value.
One option would be to create an IEqualityComparer<Foo> that compares the Bar property, rather than references to the object itself.
public class FooComparer : IEqualityComparer<Foo>
{
public bool Equals(Foo x, Foo y)
{
if (x == null ^ y == null)
return false;
if (x == null && y == null)
return true;
return x.Bar == y.Bar;
}
public int GetHashCode(Foo obj)
{
if (obj == null)
return 0;
return obj.Bar.GetHashCode();
}
}
Another option is to create an Except method that accepts a selector to compare the values on. We can create such a method and then use that:
public static IEnumerable<TSource> ExceptBy<TSource, TKey>(
this IEnumerable<TSource> first,
IEnumerable<TSource> second,
Func<TSource, TKey> keySelector,
IEqualityComparer<TKey> comparer = null)
{
comparer = comparer ?? EqualityComparer<TKey>.Default;
var set = new HashSet<TKey>(second.Select(keySelector), comparer);
return first.Where(item => set.Add(keySelector(item)));
}
This allows us to write:
var itemsToRemove = oldList.ExceptBy(newList, foo => foo.Bar);
var itemsToAdd = newList.ExceptBy(oldList, foo => foo.Bar);

Your logic is sound, but Except's default behaviour when comparing two class instances is to go by reference. Since you are effectively creating two lists with 8 different objects (regardless of their content), no two of them will be equal.
You can, however, use the Except overload that takes an IEqualityComparer. For example:
public class FooEqualityComparer : IEqualityComparer<Foo>
{
public bool Equals(Foo left, Foo right)
{
if(left == null && right == null) return true;
return left != null && right != null && left.Bar == right.Bar;
}
public int GetHashCode(Foo item)
{
return item != null ? item.Bar.GetHashCode() : 0;
}
}
// In your code
var comparer = new FooEqualityComparer();
var itemsToRemove = oldList.Except(newList, comparer );
var itemsToAdd = newList.Except(oldList, comparer);

This is mostly a riff on Servy's answer to give a more general approach to this:
public class PropertyEqualityComparer<TItem, TKey> : EqualityComparer<Tuple<TItem, TKey>>
{
readonly Func<TItem, TKey> _getter;
public PropertyEqualityComparer(Func<TItem, TKey> getter)
{
_getter = getter;
}
public Tuple<TItem, TKey> Wrap(TItem item) {
return Tuple.Create(item, _getter(item));
}
public TItem Unwrap(Tuple<TItem, TKey> tuple) {
return tuple.Item1;
}
public override bool Equals(Tuple<TItem, TKey> x, Tuple<TItem, TKey> y)
{
if (x.Item2 == null && y.Item2 == null) return true;
if (x.Item2 == null || y.Item2 == null) return false;
return x.Item2.Equals(y.Item2);
}
public override int GetHashCode(Tuple<TItem, TKey> obj)
{
if (obj.Item2 == null) return 0;
return obj.Item2.GetHashCode();
}
}
public static class ComparerLinqExtensions {
public static IEnumerable<TSource> Except<TSource, TKey>(this IEnumerable<TSource> first, IEnumerable<TSource> second, Func<TSource, TKey> keyGetter)
{
var comparer = new PropertyEqualityComparer<TSource, TKey>(keyGetter);
var firstTuples = first.Select(comparer.Wrap);
var secondTuples = second.Select(comparer.Wrap);
return firstTuples.Except(secondTuples, comparer)
.Select(comparer.Unwrap);
}
}
// ...
var itemsToRemove = oldList.Except(newList, foo => foo.Bar);
var itemsToAdd = newList.Except(oldList, foo => foo.Bar);
This should work fine for any class without unusual equality semantics (that is, any class where calling the object.Equals() override instead of IEquatable<T>.Equals() gives the same answer). Notably, it works fine for anonymous types.

This is because you're comparing objects of type Foo, and not property Bar of type string. Try:
var itemsToRemove = oldList.Select(i => i.Bar).Except(newList.Select(i => i.Bar));
var itemsToAdd = newList.Select(i => i.Bar).Except(oldList.Select(i => i.Bar));

Implement IEquatable<Foo> (or override Equals and GetHashCode) on your data objects; I think you're being bitten by reference comparison. If you change Foo to just string, your code works.
var oldList = new List<string>();
oldList.Add("bar one");
oldList.Add("bar two");
oldList.Add("bar three");
oldList.Add("bar four");
var newList = new List<string>();
newList.Add("bar two");
newList.Add("bar three");
newList.Add("bar four");
newList.Add("bar five");
var itemsToRemove = oldList.Except(newList); // should only contain "bar one"
var itemsToAdd = newList.Except(oldList); // should only contain "bar five"
foreach (var item in itemsToRemove)
{
Console.WriteLine(item + " removed");
}
foreach (var item in itemsToAdd)
{
Console.WriteLine(item + " added");
}
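To keep Foo rather than switching to string, one option is to give Foo value equality on Bar. This is a minimal sketch that assumes Foo has only the Bar property shown in the question; Except then works without a separate comparer because the default equality comparer picks up these overrides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of the question's Foo with equality defined by Bar.
public sealed class Foo : IEquatable<Foo>
{
    public string Bar { get; set; }

    // Two Foo instances are equal when their Bar values are equal.
    public bool Equals(Foo other) => other != null && Bar == other.Bar;

    public override bool Equals(object obj) => Equals(obj as Foo);

    // Must agree with Equals: equal Bar values give equal hash codes.
    public override int GetHashCode() => Bar == null ? 0 : Bar.GetHashCode();
}
```

One caveat: because the hash code depends on a settable property, changing Bar while a Foo sits in a HashSet or is used as a dictionary key will make it hard to find again.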

Using IEqualityComparer for Union

I simply want to remove duplicates from two lists and combine them into one list. I also need to be able to define what a duplicate is. I define a duplicate by the ColumnIndex property, if they are the same, they are duplicates. Here is the approach I took:
I found a nifty example of how to write inline comparers for the rare occasions where you need them only once in a code segment.
public class InlineComparer<T> : IEqualityComparer<T>
{
private readonly Func<T, T, bool> getEquals;
private readonly Func<T, int> getHashCode;
public InlineComparer(Func<T, T, bool> equals, Func<T, int> hashCode)
{
getEquals = equals;
getHashCode = hashCode;
}
public bool Equals(T x, T y)
{
return getEquals(x, y);
}
public int GetHashCode(T obj)
{
return getHashCode(obj);
}
}
Then I just have my two lists, and attempt a union on them with the comparer.
var formatIssues = issues.Where(i => i.IsFormatError == true);
var groupIssues = issues.Where(i => i.IsGroupError == true);
var dupComparer = new InlineComparer<Issue>((i1, i2) => i1.ColumnInfo.ColumnIndex == i2.ColumnInfo.ColumnIndex,
i => i.ColumnInfo.ColumnIndex);
var filteredIssues = groupIssues.Union(formatIssues, dupComparer);
The result set however is null.
Where am I going astray?
I have already confirmed that the two lists have columns with equal ColumnIndex properties.

I've just run your code on a test set.... and it works!
public class InlineComparer<T> : IEqualityComparer<T>
{
private readonly Func<T, T, bool> getEquals;
private readonly Func<T, int> getHashCode;
public InlineComparer(Func<T, T, bool> equals, Func<T, int> hashCode)
{
getEquals = equals;
getHashCode = hashCode;
}
public bool Equals(T x, T y)
{
return getEquals(x, y);
}
public int GetHashCode(T obj)
{
return getHashCode(obj);
}
}
class TestClass
{
public string S { get; set; }
}
[TestMethod]
public void testThis()
{
var l1 = new List<TestClass>()
{
new TestClass() {S = "one"},
new TestClass() {S = "two"},
};
var l2 = new List<TestClass>()
{
new TestClass() {S = "three"},
new TestClass() {S = "two"},
};
var dupComparer = new InlineComparer<TestClass>((i1, i2) => i1.S == i2.S, i => i.S.GetHashCode());
var unionList = l1.Union(l2, dupComparer);
Assert.AreEqual(3, unionList.Count());
}
So... maybe go back and check your test data - or run it with some other test data?
After all - for a Union to be empty - that suggests that both your input lists are also empty?

A slightly simpler way:
it does preserve the original order
it ignores dupes as it finds them
It uses a LINQ extension method:
formatIssues.Union(groupIssues).DistinctBy(x => x.ColumnIndex)
This is the DistinctBy lambda method from MoreLinq
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
HashSet<TKey> knownKeys = new HashSet<TKey>();
foreach (TSource element in source)
{
if (knownKeys.Add(keySelector(element)))
{
yield return element;
}
}
}

Would the Linq Except method not do it for you?
var formatIssues = issues.Where(i => i.IsFormatError == true);
var groupIssues = issues.Where(i => i.IsGroupError == true);
var dupeIssues = issues.Where(i => issues.Except(new List<Issue> {i})
.Any(x => x.ColumnIndex == i.ColumnIndex));
var filteredIssues = formatIssues.Union(groupIssues).Except(dupeIssues);
