Linq get values not shared across multiple lists - c#

What's the most efficient way to write a method that compares n lists and returns all the values that do not appear in every list? Given
var lists = new List<List<int>> {
new List<int> { 1, 2, 3, 4 },
new List<int> { 2, 3, 4, 5, 8 },
new List<int> { 2, 3, 4, 5, 9, 9 },
new List<int> { 2, 3, 3, 4, 9, 10 }
};
public static IEnumerable<T> GetNonShared<T>(this IEnumerable<IEnumerable<T>> lists)
{
//...fast algorithm here
}
so that
lists.GetNonShared();
returns 1, 5, 8, 9, 10
I had
public static IEnumerable<T> GetNonShared<T>(this IEnumerable<IEnumerable<T>> lists)
{
return lists.SelectMany(item => item)
.Except(lists.Aggregate((a, b) => a.Intersect(b)));
}
But I wasn't sure if that was efficient. Order does not matter. Thanks!

public static IEnumerable<T> GetNonShared<T>(this IEnumerable<IEnumerable<T>> list)
{
return list.SelectMany(x => x.Distinct()).GroupBy(x => x).Where(g => g.Count() < list.Count()).Select(group => group.Key);
}

EDIT: I think I'd think of it like this...
You want the union of all the lists, minus the intersection of all the lists. That's effectively what your original does, leaving Except to do the "set" operation of Union despite getting duplicate inputs. In this case I suspect you could do this more efficiently just building up two HashSets and doing all the work in-place:
public static IEnumerable<T> GetNonShared<T>(this IEnumerable<IEnumerable<T>> lists)
{
using (var iterator = lists.GetEnumerator())
{
if (!iterator.MoveNext())
{
return new T[0]; // Empty
}
HashSet<T> union = new HashSet<T>(iterator.Current.ToList());
HashSet<T> intersection = new HashSet<T>(union);
while (iterator.MoveNext())
{
// This avoids iterating over it twice; it may not be necessary,
// it depends on how you use it.
List<T> list = iterator.Current.ToList();
union.UnionWith(list);
intersection.IntersectWith(list); // IntersectWith mutates in place and returns void
}
union.ExceptWith(intersection);
return union;
}
}
Note that this is now eager, not deferred.
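As a quick check of the union-minus-intersection idea, here is a minimal, self-contained sketch run against the sample lists from the question. The names `NonSharedDemo` and `NonShared` are made up for illustration, and this is only one possible way to write it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NonSharedDemo
{
    // Hypothetical helper: union of all lists minus the intersection of all lists.
    public static HashSet<T> NonShared<T>(IEnumerable<IEnumerable<T>> lists)
    {
        var union = new HashSet<T>();
        HashSet<T> intersection = null;
        foreach (var list in lists)
        {
            var items = new HashSet<T>(list); // enumerate each inner list once
            union.UnionWith(items);
            if (intersection == null)
                intersection = items;               // first list seeds the intersection
            else
                intersection.IntersectWith(items);  // keep only values seen in every list so far
        }
        if (intersection != null)
            union.ExceptWith(intersection);
        return union;
    }

    public static void Main()
    {
        var lists = new List<List<int>>
        {
            new List<int> { 1, 2, 3, 4 },
            new List<int> { 2, 3, 4, 5, 8 },
            new List<int> { 2, 3, 4, 5, 9, 9 },
            new List<int> { 2, 3, 3, 4, 9, 10 }
        };
        // Sorted only to make the output stable; HashSet order is not guaranteed.
        Console.WriteLine(string.Join(", ", NonShared(lists).OrderBy(x => x)));
        // prints: 1, 5, 8, 9, 10
    }
}
```

Like the version above, this is eager: the whole result is built before anything is returned.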
Here's an alternative option:
public static IEnumerable<T> GetNonShared<T>(this IEnumerable<IEnumerable<T>> lists)
{
return lists.SelectMany(list => list)
.GroupBy(x => x)
.Where(group => group.Count() < lists.Count())
.Select(group => group.Key);
}
If it's possible for a list to contain the same item more than once, you'd want a Distinct call in there:
public static IEnumerable<T> GetNonShared<T>(this IEnumerable<IEnumerable<T>> lists)
{
return lists.SelectMany(list => list.Distinct())
.GroupBy(x => x)
.Where(group => group.Count() < lists.Count())
.Select(group => group.Key);
}
EDIT: Now I've corrected this, I understand your original code... and I suspect I can find something better... thinking...

I think you need to create an intermediate step, which is finding all the items which are common to all lists. This is easy to do with set logic - it's just the set of items in the first list intersected with the set of items in each succeeding list. I don't think that step's doable in LINQ, though.
class Program
{
static void Main(string[] args)
{
IEnumerable<IEnumerable<int>> lists = new List<IEnumerable<int>> {
new List<int> { 1, 2, 3, 4 },
new List<int> { 2, 3, 4, 5, 8 },
new List<int> { 2, 3, 4, 5, 9, 9 },
new List<int> { 2, 3, 3, 4, 9, 10 }
};
Console.WriteLine(string.Join(", ", GetNonShared(lists)
.Distinct()
.OrderBy(x => x)
.Select(x => x.ToString())
.ToArray()));
Console.ReadKey();
}
public static HashSet<T> GetShared<T>(IEnumerable<IEnumerable<T>> lists)
{
HashSet<T> result = null;
foreach (IEnumerable<T> list in lists)
{
result = (result == null)
? new HashSet<T>(list)
: new HashSet<T>(result.Intersect(list));
}
return result;
}
public static IEnumerable<T> GetNonShared<T>(IEnumerable<IEnumerable<T>> lists)
{
HashSet<T> shared = GetShared(lists);
return lists.SelectMany(x => x).Where(x => !shared.Contains(x));
}
}

public static IEnumerable<T> GetNonShared<T>(this IEnumerable<IEnumerable<T>> list)
{
var lstCnt=list.Count(); //get the total number of inner lists
return list.SelectMany (l => l.Distinct())
.GroupBy (l => l)
.Select (l => new{n=l.Key, c=l.Count()})
.Where (l => l.c<lstCnt)
.Select (l => l.n)
.OrderBy (l => l) //can be commented
;
}
//use HashSet and SymmetricExceptWith for .net >= 4.5

Related

a Method to find Common integers between 2 arrays

I need to write a method to find the common elements of 2 arrays in C#, but I can't convert my old Python logic to C#.
It used to look like this in Python:
def commonfinder(list1, list2):
commonlist = []
for x in list1:
for y in list2:
if x==y:
commonlist.append(x)
return commonlist
but when I tried to convert it to C#:
public int [] Commons(int[] ar1, int[] ar2)
{
int commoncount;
int[] Commonslist = new int[commoncount];
foreach (int x in ar1)
{
foreach (int y in ar2)
{
if (x == y)
{
commoncount++;
// here I should add x to Commonlist
}
}
}
return Commonslist;
}
I couldn't find any method or function that would append x to my Commonslist,
and of course I got a lot of errors I couldn't solve.
Can I get a tip?
Your original algorithm has O(n * m) time complexity, which can be too slow: with two lists of 1 million items each, it performs 1 trillion comparisons. You can implement a better version with only O(n + m) complexity.
Code (let's generalize the problem):
using System.Linq;
...
public static T[] CommonFinder<T>(IEnumerable<T> left,
IEnumerable<T> right,
IEqualityComparer<T> comparer = null) {
if (null == left || null == right)
return new T[0]; // Or throw ArgumentNullException exception
comparer = comparer ?? EqualityComparer<T>.Default;
Dictionary<T, int> dict = right
.GroupBy(item => item, comparer) // pass the comparer so it is actually used
.ToDictionary(group => group.Key, group => group.Count(), comparer);
List<T> result = new List<T>();
foreach (T item in left)
if (dict.TryGetValue(item, out int count)) {
result.Add(item);
if (count <= 1)
dict.Remove(item);
else
dict[item] = count - 1;
}
return result.ToArray();
}
Demo:
int[] left = new int[] { 1, 2, 3, 4, 5 };
int[] right = new int[] { 0, 3, 2, 6, 9};
var common = CommonFinder(left, right);
Console.WriteLine(string.Join(", ", common));
Outcome:
2, 3
Note: As I understand it, you want a method that takes two int arrays and returns one int array containing the unique intersecting values.
You can use a HashSet to speed up insertion and lookup (amortized O(1) each). The running time is O(n + m), because each array is traversed once (separately). Memory use is O(min(n, m)): the smaller array is chosen up front to populate the set, and the intersection can never contain more elements than the smaller array.
The Main method shows how to call the method; CommonIntegers contains the logic you are looking for.
using System;
using System.Collections.Generic;
using System.Linq;
namespace TestCode.StackOverflow
{
public class So66935672
{
public static void Main(string[] args)
{
int[] intArray1 = new int[] { 9, 9, 1, 3, 5, 6, 10, 9 };
int[] intArray2 = new int[] { 19, 17, 16, 5, 1, 6 };
Console.Write(
CommonIntegers(intArray1, intArray2)
.Select(i => $"{i}, ")
.Aggregate(string.Empty, string.Concat));
}
private static int[] CommonIntegers(int[] intArray1, int[] intArray2)
{
if (intArray1 == null || intArray1.Length == 0
|| intArray2 == null || intArray2.Length == 0)
{
return Array.Empty<int>();
}
var primaryArraySet = new HashSet<int>(); // Contains the unique values from the shorter array
var intersectSet = new HashSet<int>(); // Contains unique values found in both arrays
int[] secondarySet;
// Fill primary set
if (intArray1.Length > intArray2.Length)
{
foreach (var i in intArray2)
primaryArraySet.Add(i);
secondarySet = intArray1;
}
else
{
foreach (var i in intArray1)
primaryArraySet.Add(i);
secondarySet = intArray2;
}
// Fill intersect array
foreach (var i in secondarySet)
if (primaryArraySet.Contains(i))
intersectSet.Add(i);
return intersectSet.ToArray();
}
}
}
You can try this one:
static List<int> CommonFinder(List<int> list1, List<int> list2)
{
List<int> commonList = new List<int>();
foreach (int x in list1)
foreach (int y in list2)
if (x == y)
commonList.Add(x);
return commonList;
}
static void Main()
{
List<int> list1 = new List<int> { 1, 2, 3 };
List<int> list2 = new List<int> { 2, 3, 4};
var common = CommonFinder(list1, list2);
Console.WriteLine(string.Join(", ", common));
}
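If duplicates in the result are not needed, LINQ's built-in `Enumerable.Intersect` may be the shortest route. The following is a minimal sketch; the class and method names are placeholders, and unlike the nested-loop version above it returns each common value only once:

```csharp
using System;
using System.Linq;

public static class IntersectDemo
{
    // Hypothetical wrapper: set intersection of two arrays via LINQ.
    public static int[] Commons(int[] ar1, int[] ar2)
    {
        return ar1.Intersect(ar2).ToArray();
    }

    public static void Main()
    {
        int[] common = Commons(new[] { 1, 2, 3 }, new[] { 2, 3, 4 });
        Console.WriteLine(string.Join(", ", common)); // prints: 2, 3
    }
}
```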

How to filter members of more than one list with LINQ? [duplicate]

How do I select the unique elements from the list {0, 1, 2, 2, 2, 3, 4, 4, 5} so that I get {0, 1, 3, 5}, effectively removing all instances of the repeated elements {2, 4}?
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
var uniqueNumbers =
from n in numbers
group n by n into nGroup
where nGroup.Count() == 1
select nGroup.Key;
// { 0, 1, 3, 5 }
var nums = new int{ 0...4,4,5};
var distinct = nums.Distinct();
Make sure you're using LINQ and .NET Framework 3.5.
With lambdas:
var all = new[] {0,1,1,2,3,4,4,4,5,6,7,8,8}.ToList();
var unique = all.GroupBy(i => i).Where(i => i.Count() == 1).Select(i=>i.Key);
C# 2.0 solution:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, int> counts = new Dictionary<T, int>();
foreach (T item in things)
{
int count;
if (counts.TryGetValue(item, out count))
counts[item] = ++count;
else
counts.Add(item, 1);
}
foreach (KeyValuePair<T, int> kvp in counts)
{
if (kvp.Value == 1)
yield return kvp.Key;
}
}
Here is another way that works if you have complex type objects in your List and want to get the unique values of a property:
var uniqueValues= myItems.Select(k => k.MyProperty)
.GroupBy(g => g)
.Where(c => c.Count() == 1)
.Select(k => k.Key)
.ToList();
Or to get distinct values:
var distinctValues = myItems.Select(p => p.MyProperty)
.Distinct()
.ToList();
If your property is also a complex type you can create a custom comparer for the Distinct(), such as Distinct(OrderComparer), where OrderComparer could look like:
public class OrderComparer : IEqualityComparer<Order>
{
public bool Equals(Order o1, Order o2)
{
return o1.OrderID == o2.OrderID;
}
public int GetHashCode(Order obj)
{
return obj.OrderID.GetHashCode();
}
}
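To show how such a comparer plugs into `Distinct`, here is a small self-contained sketch. The `Order` class below is a hypothetical stand-in (the original only mentions its `OrderID` property), and `DistinctDemo` is an illustrative name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Order type; only OrderID matters for this sketch.
public class Order
{
    public int OrderID { get; set; }
    public string Customer { get; set; }
}

public class OrderComparer : IEqualityComparer<Order>
{
    public bool Equals(Order o1, Order o2)
    {
        if (ReferenceEquals(o1, o2)) return true;
        if (o1 == null || o2 == null) return false;
        return o1.OrderID == o2.OrderID;
    }

    public int GetHashCode(Order obj)
    {
        return obj.OrderID.GetHashCode();
    }
}

public static class DistinctDemo
{
    // Returns one Order per distinct OrderID.
    public static List<Order> DistinctOrders(IEnumerable<Order> orders)
    {
        return orders.Distinct(new OrderComparer()).ToList();
    }

    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { OrderID = 1, Customer = "A" },
            new Order { OrderID = 2, Customer = "B" },
            new Order { OrderID = 1, Customer = "C" }
        };
        Console.WriteLine(DistinctOrders(orders).Count); // prints: 2
    }
}
```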
If LINQ isn't available to you because you have to support legacy code that can't be upgraded, then declare a Dictionary<int, int>, where the first int is the number and the second int is the number of occurrences. Loop through your List, loading up your Dictionary. When you're done, loop through your Dictionary, selecting only those elements where the number of occurrences is 1.
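The counting approach described above could look roughly like this for plain ints. This is a sketch without LINQ; `CountDemo` and `OccurringOnce` are illustrative names, not part of any library:

```csharp
using System;
using System.Collections.Generic;

public static class CountDemo
{
    // Returns the values that occur exactly once, in ascending order.
    public static List<int> OccurringOnce(IEnumerable<int> numbers)
    {
        var counts = new Dictionary<int, int>();
        foreach (int n in numbers)
        {
            int count;
            counts.TryGetValue(n, out count); // count stays 0 when n is not yet present
            counts[n] = count + 1;
        }

        var result = new List<int>();
        foreach (KeyValuePair<int, int> pair in counts)
        {
            if (pair.Value == 1)
                result.Add(pair.Key);
        }
        result.Sort(); // Dictionary enumeration order is not guaranteed
        return result;
    }

    public static void Main()
    {
        var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
        Console.WriteLine(string.Join(", ", OccurringOnce(numbers)));
        // prints: 0, 1, 3, 5
    }
}
```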
I believe Matt meant to say:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, bool> uniques = new Dictionary<T, bool>();
foreach (T item in things)
{
if (!(uniques.ContainsKey(item)))
{
uniques.Add(item, true);
}
}
return uniques.Keys;
}
There are many ways to skin a cat, but HashSet seems made for the task here.
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
HashSet<int> r = new HashSet<int>(numbers);
foreach( int i in r ) {
Console.Write( "{0} ", i );
}
The output:
0 1 2 3 4 5
Here's a solution with no LINQ:
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
// This assumes the numbers are sorted
var noRepeats = new List<int>();
int temp = numbers[0]; // Or .First() if using IEnumerable
var count = 1;
for(int i = 1; i < numbers.Length; i++) // Or foreach (var n in numbers.Skip(1)) if using IEnumerable
{
if (numbers[i] == temp) count++;
else
{
if(count == 1) noRepeats.Add(temp);
temp = numbers[i];
count = 1;
}
}
if(count == 1) noRepeats.Add(temp);
Console.WriteLine($"[{string.Join(separator: ",", values: numbers)}] -> [{string.Join(separator: ",", values: noRepeats)}]");
This prints:
[0,1,2,2,2,3,4,4,5] -> [0,1,3,5]
In .NET 2.0, I'm pretty sure about this solution:
public IEnumerable<T> Distinct<T>(IEnumerable<T> source)
{
List<T> uniques = new List<T>();
foreach (T item in source)
{
if (!uniques.Contains(item)) uniques.Add(item);
}
return uniques;
}

Combine entries from two lists by position using LINQ

Say I have two lists with following entries
List<int> a = new List<int> { 1, 2, 5, 10 };
List<int> b = new List<int> { 6, 20, 3 };
I want to create another List c where its entries are items inserted by position from two lists. So List c would contain the following entries:
List<int> c = {1, 6, 2, 20, 5, 3, 10}
Is there a way to do it in .NET using LINQ? I was looking at .Zip() LINQ extension, but wasn't sure how to use it in this case.
Thanks in advance!
To do it using LINQ, you can use this piece of LINQPad example code:
void Main()
{
List<int> a = new List<int> { 1, 2, 5, 10 };
List<int> b = new List<int> { 6, 20, 3 };
var result = Enumerable.Zip(a, b, (aElement, bElement) => new[] { aElement, bElement })
.SelectMany(ab => ab)
.Concat(a.Skip(Math.Min(a.Count, b.Count)))
.Concat(b.Skip(Math.Min(a.Count, b.Count)));
result.Dump();
}
Output: 1, 6, 2, 20, 5, 3, 10
This will:
Zip the two lists together (which will stop when either runs out of elements)
Producing an array containing the two elements (one from a, another from b)
Using SelectMany to "flatten" this out to one sequence of values
Concatenate in the remainder from either list (only one or neither of the two calls to Concat should add any elements)
Now, having said that, personally I would've used this:
public static IEnumerable<T> Intertwine<T>(this IEnumerable<T> a, IEnumerable<T> b)
{
using (var enumerator1 = a.GetEnumerator())
using (var enumerator2 = b.GetEnumerator())
{
bool more1 = enumerator1.MoveNext();
bool more2 = enumerator2.MoveNext();
while (more1 && more2)
{
yield return enumerator1.Current;
yield return enumerator2.Current;
more1 = enumerator1.MoveNext();
more2 = enumerator2.MoveNext();
}
while (more1)
{
yield return enumerator1.Current;
more1 = enumerator1.MoveNext();
}
while (more2)
{
yield return enumerator2.Current;
more2 = enumerator2.MoveNext();
}
}
}
Reasons:
It doesn't enumerate a or b more than once
I'm skeptical about the performance of Skip
It can work with any IEnumerable<T> and not just List<T>
I'd create an extension method to do it.
public static List<T> MergeAll<T>(this List<T> first, List<T> second)
{
int maxCount = Math.Max(first.Count, second.Count);
var ret = new List<T>(first.Count + second.Count);
for (int i = 0; i < maxCount; i++)
{
if (i < first.Count) // first still has an element at this position
ret.Add(first[i]);
if (i < second.Count) // second still has an element at this position
ret.Add(second[i]);
}
return ret;
}
This would iterate through both lists once. If one list is bigger than the other it will continue to add until it's done.
You could try this code:
List<int> c = a.Select((i, index) => new Tuple<int, int>(i, index * 2))
.Union(b.Select((i, index) => new Tuple<int, int>(i, index * 2 + 1)))
.OrderBy(t => t.Item2)
.Select(t => t.Item1).ToList();
It makes a union of two collections and then sorts that union using index. Elements from the first collection have even indices, from the second - odd ones.
Just wrote a little extension for this:
public static class MyEnumerable
{
public static IEnumerable<T> Smash<T>(this IEnumerable<T> one, IEnumerable<T> two)
{
using (IEnumerator<T> enumeratorOne = one.GetEnumerator(),
enumeratorTwo = two.GetEnumerator())
{
bool twoFinished = false;
while (enumeratorOne.MoveNext())
{
yield return enumeratorOne.Current;
if (!twoFinished)
{
if (enumeratorTwo.MoveNext())
yield return enumeratorTwo.Current;
else
twoFinished = true; // stop polling the exhausted second sequence
}
}
if (!twoFinished)
{
while (enumeratorTwo.MoveNext())
{
yield return enumeratorTwo.Current;
}
}
}
}
}
Usage:
var a = new List<int> { 1, 2, 5, 10 };
var b = new List<int> { 6, 20, 3 };
var c = a.Smash(b); // 1, 6, 2, 20, 5, 3, 10
var d = b.Smash(a); // 6, 1, 20, 2, 3, 5, 10
This will work for any IEnumerable so you can also do:
var a = new List<string> { "the", "brown", "jumped", "the", "lazy", "dog" };
var b = new List<string> { "quick", "fox", "over" };
var c = a.Smash(b); // the, quick, brown, fox, jumped, over, the, lazy, dog
You could use Concat and an anonymous type which you order by the index:
List<int> c = a
.Select((val, index) => new { val, index })
.Concat(b.Select((val, index) => new { val, index }))
.OrderBy(x => x.index)
.Select(x => x.val)
.ToList();
However, since that's not really elegant and also less efficient than:
c = new List<int>(a.Count + b.Count);
int max = Math.Max(a.Count, b.Count);
int aMax = a.Count;
int bMax = b.Count;
for (int i = 0; i < max; i++)
{
if(i < aMax)
c.Add(a[i]);
if(i < bMax)
c.Add(b[i]);
}
I wouldn't use LINQ at all.
Sorry for adding a third extension method inspired by the other two, but I like it shorter:
static IEnumerable<T> Intertwine<T>(this IEnumerable<T> a, IEnumerable<T> b)
{
using (var enumerator1 = a.GetEnumerator())
using (var enumerator2 = b.GetEnumerator()) {
bool more1 = true, more2 = true;
do {
if (more1 && (more1 = enumerator1.MoveNext()))
yield return enumerator1.Current;
if (more2 && (more2 = enumerator2.MoveNext()))
yield return enumerator2.Current;
} while (more1 || more2);
}
}

how to check if the list with the same values exists

In the following code segment:
static void Main(string[] args)
{
List<List<int>> bigList = new List<List<int>> { };
bigList.Add(new List<int> { 1, 2 });
bigList.Add(new List<int> { 2, 3 });
bigList.Add(new List<int> { 3, 4 });
List<int> subList = new List<int> { 1, 2 };
Console.WriteLine(bigList.Contains(subList));
}
the output is 'False'.
What is the right way to check this? In other words, how can I make the output 'True'?
If you don't care about duplicate entries in the lists you can use:
bigList.Any(b => new HashSet<int>(b).SetEquals(subList))
If you want both lists to contain exactly the same elements you can use this:
bigList.Any(b => b.OrderBy(x => x).SequenceEqual(subList.OrderBy(x => x)))
If you want both lists to have the same elements in the same order you can use this:
bigList.Any(x => x.SequenceEqual(subList))
If the order doesn't matter you can use Any+All:
bool anyContains = bigList
.Any(l => subList.Count == l.Count && l.All(i => subList.Contains(i)));
Otherwise you can use Any + SequenceEqual:
bool anySequenceEquals = bigList.Any(l => l.SequenceEqual(subList));
Use the All linq statement
var result = bigList.Where(x => x.All(y => subList.Contains(y)));
You can use SequenceEqual method to check with Any:
bigList.Any(x => x.SequenceEqual(subList))
The reason your code returns "false" is that you are testing whether bigList contains subList itself, which it does not. bigList contains a list that looks the same as subList but isn't the same subList instance.
Try this
bigList.Add(subList);
Complete Code
List<List<int>> bigList = new List<List<int>> { };
List<int> subList = new List<int> { 1, 2 };
bigList.Add(subList); // now bigList contains the subList instance itself
bigList.Add(new List<int> { 2, 3 });
bigList.Add(new List<int> { 3, 4 });
Console.WriteLine(bigList.Contains(subList));// true
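The difference between "the same instance" and "the same contents" can be seen side by side in the sketch below. `EqualityDemo` and `ContainsSequence` are names invented for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EqualityDemo
{
    // True if any inner list has the same elements in the same order as subList.
    public static bool ContainsSequence(List<List<int>> bigList, List<int> subList)
    {
        return bigList.Any(l => l.SequenceEqual(subList));
    }

    public static void Main()
    {
        var bigList = new List<List<int>>
        {
            new List<int> { 1, 2 },
            new List<int> { 2, 3 }
        };
        var subList = new List<int> { 1, 2 };

        // List<T> does not override Equals, so Contains compares references.
        Console.WriteLine(bigList.Contains(subList));            // prints: False
        Console.WriteLine(ReferenceEquals(bigList[0], subList)); // prints: False
        // Comparing element by element finds the match.
        Console.WriteLine(ContainsSequence(bigList, subList));   // prints: True
    }
}
```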
Try using SequenceEqual and Any:
bigList.Any(c => c.SequenceEqual(subList));
Or, if you want to use the other way, with Contains, you'll need to make a custom EqualityComparer:
public class CollectionEqualityComparer<T> : IEqualityComparer<IEnumerable<T>>
{
public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
{
return x.SequenceEqual(y);
}
public int GetHashCode(IEnumerable<T> obj)
{
unchecked
{
// Combine element hashes; assumes the elements are non-null.
return obj.Select(x => x.GetHashCode())
.Aggregate(17, (a, b) => a * 31 + b);
}
}
}
And then just use Contains like this:
bigList.Contains(subList, new CollectionEqualityComparer<int>());

Interleaving multiple (more than 2) irregular lists using LINQ

Say I have the following data
IEnumerable<IEnumerable<int>> items = new IEnumerable<int>[] {
new int[] { 1, 2, 3, 4 },
new int[] { 5, 6 },
new int[] { 7, 8, 9 }
};
What would be the easiest way to return a flat list with the items interleaved so I'd get the result:
1, 5, 7, 2, 6, 8, 3, 9, 4
Note: The number of inner lists is not known until runtime.
What you're describing is essentially a Transpose Method where overhanging items are included and the result is flattened. Here's my attempt:
static IEnumerable<IEnumerable<T>> TransposeOverhanging<T>(
this IEnumerable<IEnumerable<T>> source)
{
var enumerators = source.Select(e => e.GetEnumerator()).ToArray();
try
{
T[] g;
do
{
yield return g = enumerators
.Where(e => e.MoveNext()).Select(e => e.Current).ToArray();
}
while (g.Any());
}
finally
{
Array.ForEach(enumerators, e => e.Dispose());
}
}
Example:
var result = items.TransposeOverhanging().SelectMany(g => g).ToList();
// result == { 1, 5, 7, 2, 6, 8, 3, 9, 4 }
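When the inner collections are already in memory and indexable, the same "transpose, keep the overhang, flatten" idea can be sketched with plain indexing. This assumes `IReadOnlyList<T>` inputs rather than arbitrary sequences, and the names `InterleaveDemo` and `InterleaveByIndex` are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InterleaveDemo
{
    // Walks position 0 of every list, then position 1, and so on,
    // skipping lists that are too short for the current position.
    public static List<T> InterleaveByIndex<T>(IReadOnlyList<IReadOnlyList<T>> lists)
    {
        var result = new List<T>();
        int longest = lists.Count == 0 ? 0 : lists.Max(l => l.Count);
        for (int i = 0; i < longest; i++)
        {
            foreach (var list in lists)
            {
                if (i < list.Count)
                    result.Add(list[i]);
            }
        }
        return result;
    }

    public static void Main()
    {
        IReadOnlyList<IReadOnlyList<int>> items = new IReadOnlyList<int>[]
        {
            new[] { 1, 2, 3, 4 },
            new[] { 5, 6 },
            new[] { 7, 8, 9 }
        };
        Console.WriteLine(string.Join(", ", InterleaveByIndex(items)));
        // prints: 1, 5, 7, 2, 6, 8, 3, 9, 4
    }
}
```

The enumerator-based version above remains the better fit for lazy or non-indexable sequences.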
The solution below is very straightforward. As it turns out, it is also nearly twice as fast as the solution proposed by dtb.
private static IEnumerable<T> Interleave<T>(this IEnumerable<IEnumerable<T>> source )
{
var queues = source.Select(x => new Queue<T>(x)).ToList();
while (queues.Any(x => x.Any())) {
foreach (var queue in queues.Where(x => x.Any())) {
yield return queue.Dequeue();
}
}
}
Here's my attempt, based on dtb's answer. It avoids the external SelectMany and internal ToArray calls.
public static IEnumerable<T> Interleave<T>(this IEnumerable<IEnumerable<T>> source)
{
var enumerators = source.Select(e => e.GetEnumerator()).ToArray();
try
{
bool itemsRemaining;
do
{
itemsRemaining = false;
foreach (var item in
enumerators.Where(e => e.MoveNext()).Select(e => e.Current))
{
yield return item;
itemsRemaining = true;
}
}
while (itemsRemaining);
}
finally
{
Array.ForEach(enumerators, e => e.Dispose());
}
}
Disposes all enumerators, even when exceptions are thrown.
Evaluates the outer sequence eagerly, but uses lazy evaluation for the inner sequences.
public static IEnumerable<T> Interleave<T>(IEnumerable<IEnumerable<T>> sequences)
{
var enumerators = new List<IEnumerator<T>>();
try
{
// Using foreach here ensures that `enumerators` contains every enumerator obtained so far if an exception is thrown here.
// This ensures proper disposal in the end.
foreach(var enumerable in sequences)
{
enumerators.Add(enumerable.GetEnumerator());
}
var queue = new Queue<IEnumerator<T>>(enumerators);
while (queue.Any())
{
var enumerator = queue.Dequeue();
if (enumerator.MoveNext())
{
queue.Enqueue(enumerator);
yield return enumerator.Current;
}
}
}
finally
{
foreach(var enumerator in enumerators)
{
enumerator.Dispose();
}
}
}
Though it's not as elegant as dtb's answer, this also works, and it's a single statement :)
Enumerable.Range(0, items.Max(x => x.Count()))
.ToList()
.ForEach(x =>
{
items
.Where(lstChosen => lstChosen.Count()-1 >= x)
.Select(lstElm => lstElm.ElementAt(x))
.ToList().ForEach(z => Console.WriteLine(z));
});
