Cluster items by condition and order using linq


I want to cluster some numbers by condition and their order in the list.
int delta = 3;
var numbers = new List<int>() { 2, 4, 9, 6, 3, 2, 7, 7, 4, 1, 9, 1, 2 };
var g = numbers.GroupBy(n => n <= delta);
This gives two groups based on the condition. What I want is:
g1: 2
g2: 4, 9, 6
g3: 3, 2
g4: 7, 7, 4
g5: 1
g6: 9
g7: 1, 2
Edit:
The numbers should be grouped by a condition (here it is number <= delta), but every group should only contain numbers that are next to each other in the original list.

If I understand the logic, you want to start a new group whenever item n passes the condition but item n - 1 fails, or vice versa.
Normally you wouldn't use LINQ for such a thing. You'd iterate over the list one item at a time and build the result set yourself. For example:
List<int> list = null;
var result = new List<IEnumerable<int>>();
bool? prev = null;
foreach (var n in numbers)
{
bool cur = n <= delta;
if (prev != cur)
{
list = new List<int>();
result.Add(list);
prev = cur;
}
list.Add(n);
}
But here is a workable solution in LINQ. It depends on side effects, which you should normally avoid:
var prev = numbers.First() <= delta;
var counter = 0;
var result = numbers.GroupBy(n => (prev != (prev = n <= delta)) ? ++counter : counter)
.ToList();

So what we are conceptually doing here is going through the list and grouping while a condition is met. We can write a corresponding operation just for that without too much difficulty:
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
this IEnumerable<T> source, Func<T, T, bool> predicate)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
yield break;
List<T> list = new List<T>() { iterator.Current };
T previous = iterator.Current;
while (iterator.MoveNext())
{
if (!predicate(previous, iterator.Current))
{
yield return list;
list = new List<T>();
}
list.Add(iterator.Current);
previous = iterator.Current;
}
yield return list;
}
}
We can now write:
var groups = numbers.GroupWhile((prev,next) =>
(prev <= delta) == (next <= delta));
Here the predicate decides whether the current item stays in the same group: it returns true while the previous item's comparison result equals the current item's, and a new group starts as soon as they differ.
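Because the predicate above compares the same key for both items, the idea can also be sketched with a key selector instead of a pairwise predicate. The following is only a minimal sketch: the name GroupAdjacentBy and the demo class are hypothetical and not part of LINQ or of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AdjacentGroupingExtensions
{
    // Hypothetical helper (not part of LINQ): starts a new group whenever
    // the key of the current item differs from the key of the previous item.
    public static IEnumerable<List<T>> GroupAdjacentBy<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        List<T> group = null;
        TKey previousKey = default(TKey);
        foreach (var item in source)
        {
            TKey key = keySelector(item);
            // Close the running group when the key changes.
            if (group != null && !comparer.Equals(previousKey, key))
            {
                yield return group;
                group = null;
            }
            if (group == null)
                group = new List<T>();
            group.Add(item);
            previousKey = key;
        }
        // Emit the last group, if the source was not empty.
        if (group != null)
            yield return group;
    }
}

public static class AdjacentGroupingDemo
{
    public static void Main()
    {
        int delta = 3;
        var numbers = new List<int> { 2, 4, 9, 6, 3, 2, 7, 7, 4, 1, 9, 1, 2 };
        // Prints one group per line: "2", "4, 9, 6", "3, 2", "7, 7, 4", "1", "9", "1, 2"
        foreach (var g in numbers.GroupAdjacentBy(n => n <= delta))
            Console.WriteLine(string.Join(", ", g));
    }
}
```

Unlike the pairwise version, this variant evaluates the condition once per item, at the cost of only supporting "same key as the previous item" as the grouping rule.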

If you're a fan of fold you can write it this way:
var groups = numbers.Skip(1).Aggregate(new List<List<int>>(new[] { new List<int> { numbers[0] } }), (acc, b) =>
{
if ((acc.Last().LastOrDefault() <= delta) == (b <= delta))
{
acc.Last().Add(b);
}
else
{
acc.Add(new List<int>() { b });
}
return acc;
});
Here groups is of type List<List<int>>. Note that numbers[0] throws if the list is empty.

Related

Linq splitting group if there is missing number

This one has totally stumped me.
Let's say we have a list of integers
var list = new List<int> {
1,
2,
3,
5,
6,
7,
9,
10
};
How can I group this so that the result is 1-3, 5-7, 9-10, i.e. a group is split wherever the next integer is missing?

See if this works. No for loop, just linq
List<int> list = new List<int> { 1, 2, 3, 5, 6, 7, 9, 10};
List<int> splitIndex = list.Skip(1).Select((x,i) => new { x = x, i = i}).Where(x => list[x.i] + 1 != x.x).Select(x => x.i).ToList();
//add last index
splitIndex.Add(list.Count - 1);
var results = splitIndex.Select((x,i) => (i == 0) ? list.Take(x + 1).ToList() : list.Skip(splitIndex[i - 1] + 1).Take(splitIndex[i] - splitIndex[i - 1]).ToList()).ToList();

You won't achieve it with simple LINQ, but you can write your own extension method that can deal with such grouping.
You have to place it in static class, and call it like normal LINQ.
public static class LinqExtensions
{
public static IEnumerable<IEnumerable<int>> GroupSequential (
this IEnumerable<int> source)
{
var previous = source.First();
var list = new List<int>() { previous };
foreach (var item in source.Skip(1))
{
if (item - previous != 1)
{
yield return list;
list = new List<int>();
}
list.Add(item);
previous = item;
}
yield return list;
}
}
and call it like list.GroupSequential()
I think this should work for your needs.

I agree with @Arion that it probably isn't possible with a readable LINQ method chain. @jdweng proved me wrong though :-)
I'd like to offer my alternative solution. It's an extension method, and it utilizes a custom Interval type.
Interval:
public struct Interval
{
public Interval(int from, int to)
{
From = from;
To = to;
}
public int From { get; }
public int To { get; }
public IEnumerable<int> Members() => Enumerable.Range(From, To - From + 1);
}
To get the numbers within an Interval, use the Members() method. The numbers are generated lazily, which saves space unless you need them all.
The extension:
public static class EnumerableExtensions
{
public static IEnumerable<Interval> GetIntervals(this IEnumerable<int> numbers)
{
var array = numbers.OrderBy(x => x).ToArray();
var fromIndex = 0;
var toIndex = fromIndex;
for (var i = 1; i < array.Length; i++)
{
var current = array[i];
if (current == array[toIndex] + 1)
{
toIndex++;
}
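else
{
// Emit the finished run only if it spans more than one number,
// then always start a new run at the current position.
if (fromIndex != toIndex)
{
yield return new Interval(array[fromIndex], array[toIndex]);
}
fromIndex = i;
toIndex = fromIndex;
}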
}
if (toIndex != fromIndex)
{
yield return new Interval(array[fromIndex], array[toIndex]);
}
}
}
The usage:
public void Demo()
{
var list = new List<int> {1, 2, 3, 5, 6, 7, 9, 10};
// 1-3, 5-7, 9-10, lazily generated
var intervals = list.GetIntervals();
foreach (var interval in intervals)
{
// [1, 2, 3], then [5, 6, 7], then [9, 10], lazily generated
var members = interval.Members();
foreach (var numberInRange in members)
{
// do something with numberInRange
}
}
}

Can be simplified a bit :
var list = new List<int> { 1,2,3, 5,6,7, 9,10 };
List<List<int>> result = list.Aggregate(new List<List<int>>(), (L, n) => {
if (L.Count < 1 || L.Last().Last() < n - 1) L.Add(new List<int>());
L.Last().Add(n);
return L;
});
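To get from such groups to the "1-3,5-7,9-10" form asked for in the question, the grouping can be followed by a formatting step. This is a minimal sketch assuming the input is already sorted and free of duplicates; ToRangeString and the demo class are hypothetical names, and printing a lone number without a dash is my own choice rather than something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeFormattingDemo
{
    // Hypothetical helper: groups consecutive integers with Aggregate,
    // then renders each group as "first-last" (or just "n" for a single number).
    public static string ToRangeString(IEnumerable<int> sortedNumbers)
    {
        var groups = sortedNumbers.Aggregate(new List<List<int>>(), (acc, n) =>
        {
            // Start a new group when the list is empty or there is a gap.
            if (acc.Count == 0 || acc.Last().Last() < n - 1)
                acc.Add(new List<int>());
            acc.Last().Add(n);
            return acc;
        });

        return string.Join(",", groups.Select(g =>
            g.Count == 1 ? g[0].ToString() : g[0] + "-" + g[g.Count - 1]));
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3, 5, 6, 7, 9, 10 };
        Console.WriteLine(ToRangeString(list)); // prints "1-3,5-7,9-10"
    }
}
```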

It's unclear what format you want the result in, but this extension finds the gaps in a list and returns the missing numbers:
public static IEnumerable<int> SequenceFindMissings(this IList<int> sequence)
{
var missing = new List<int>();
if ((sequence != null) && (sequence.Any()))
{
sequence.Aggregate((seed, aggr) =>
{
var diff = (aggr - seed) - 1;
if (diff > 0)
missing.AddRange(Enumerable.Range((aggr - diff), diff));
return aggr;
});
}
return missing;
}
Usage
var list = new List<int> {1,2,3,5,6,7,9,10};
var missings = list.SequenceFindMissings(); // { 4, 8 }
It's a topic covered on my blog:
C# Linq: Find Missing Values in a Sequence of Numbers and Other Sequence Related lists as IEnumerable Extensions

IMHO, this LINQ solution is easier to understand and debug than the accepted one:
var list = new List<int>() { 1, 2, 3, 5, 6, 7, 9, 10 };
int groupIndex = 0;
int previousNumber = list.First();
var groups = list.Select((n, i) =>
{
previousNumber = (i == 0) ? list[i] : list[i - 1];
return new
{
Number = n,
GroupIndex = (i == 0) || (previousNumber + 1 == list[i]) ? groupIndex : ++groupIndex
};
}).GroupBy(x => x.GroupIndex).Select(g => g.Select(x => x.Number).ToList()).ToList();

If you want to go crazy, you can try the one-line LINQ expression below:
var desiredOutput = string.Join(',', string.Join(',', Enumerable.Range(list.First(), list.Last() - list.First() + 1)).Split(Enumerable.Range(list.First(), list.Last() - list.First() + 1)
.Except(list).Select(y => y.ToString()).ToArray(), StringSplitOptions.None)
.Select(x => string.Concat(x.Trim(',').Split(',').First(), "-", x.Trim(',').Split(',').Last())));
Or, in a more readable form:
// Enumerable.Range takes (start, count), so the count is last - first + 1.
var sequenceOrder = Enumerable.Range(list.First(), list.Last() - list.First() + 1).ToList();
// Caveat: splitting on the missing numbers as text breaks when a missing number
// is a substring of another number (e.g. "4" inside "14") or when two missing numbers are adjacent.
var splitters = sequenceOrder.Except(list).Select(y => y.ToString()).ToArray();
var joinedSequence = string.Join(',', sequenceOrder);
var result = joinedSequence.Split(splitters, StringSplitOptions.None).Select(x => string.Concat(x.Trim(',').Split(',').First(), "-", x.Trim(',').Split(',').Last()));
var desiredOutput = string.Join(',', result);

Using LINQ to remove duplicate content in list?

Suppose I have this list:
int[] list = { 1, 2, 3, 3, 1 };
What I would like is to remove duplicates that immediately follow the same number. So in this case I want to remove the second 3, but not the second 1.
New list should therefore be: {1, 2, 3, 1}
Another example is this list: {2, 7, 7, 7, 2, 6, 4} which will become {2, 7, 2, 6, 4}.
Can I do this with LINQ?

You could use Aggregate if you want to use an existing LINQ method but such an approach would lose laziness. You can write your own extension method:
public static IEnumerable<T> RemoveConsecutiveDuplicates<T>(this IEnumerable<T> source, IEqualityComparer<T> comp = null)
{
comp = comp ?? EqualityComparer<T>.Default;
using (var e = source.GetEnumerator())
{
if (e.MoveNext())
{
T last = e.Current;
yield return e.Current;
while (e.MoveNext())
{
if (!comp.Equals(e.Current, last))
{
yield return e.Current;
last = e.Current;
}
}
}
}
}

If you insist on doing this with LINQ, you could use Aggregate:
var result = array.Aggregate(new List<int>(), (a, b) =>
{
if (!a.Any() || a.Last() != b)
a.Add(b);
return a;
});
But this isn't necessarily the most efficient solution because of Any and Last in each iteration. A simple foreach comparing the previous and current iteration value will perform much better.
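The simple foreach mentioned above could look roughly like the following. It is a minimal sketch for int sequences only; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ConsecutiveDuplicatesDemo
{
    // Hypothetical helper: copies items into a new list, skipping any item
    // that equals the one immediately before it.
    public static List<int> RemoveConsecutiveDuplicates(IEnumerable<int> source)
    {
        var result = new List<int>();
        bool hasPrevious = false;
        int previous = 0;
        foreach (int current in source)
        {
            // The first item is always kept; later items only if they differ.
            if (!hasPrevious || current != previous)
                result.Add(current);
            previous = current;
            hasPrevious = true;
        }
        return result;
    }

    public static void Main()
    {
        int[] list = { 2, 7, 7, 7, 2, 6, 4 };
        Console.WriteLine(string.Join(", ", RemoveConsecutiveDuplicates(list)));
        // prints "2, 7, 2, 6, 4"
    }
}
```

Tracking the previous value in a local variable avoids the Any and Last calls on the accumulator in each iteration.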

You can use Pairwise from MoreLINQ like this:
var result =
new[] {list[0]}
.Concat(
list
.Pairwise((x, y) => new {Item = y, Same = x == y})
.Where(x => !x.Same)
.Select(x => x.Item))
.ToArray();
Pairwise gives you a sequence that results from applying a function to each item in the original sequence together with the item before it (except for the first item, which has no predecessor).
For each item (except the first), I get the item itself and a boolean (Same) indicating whether it equals the item before it. Then I filter the sequence to keep only the items that do not equal their predecessor. Because Pairwise never emits the first item, I prepend it to the filtered sequence.
Note: don't forget to handle the case where list is empty.
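If you would rather not depend on MoreLINQ, the same pairing can be approximated with the built-in Zip, and Take(1) covers the empty-list case without an index access. This is a sketch under that assumption, not the MoreLINQ implementation; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class ZipPairsDemo
{
    // Hypothetical helper: pairs every item with its predecessor via Zip
    // and keeps only the items that differ from the one before them.
    public static int[] RemoveConsecutiveDuplicates(int[] list)
    {
        return list.Take(1) // first item, or nothing if the list is empty
            .Concat(list.Zip(list.Skip(1), (previous, current) => new { previous, current })
                .Where(pair => pair.previous != pair.current)
                .Select(pair => pair.current))
            .ToArray();
    }

    public static void Main()
    {
        int[] list = { 1, 2, 3, 3, 1 };
        Console.WriteLine(string.Join(", ", RemoveConsecutiveDuplicates(list)));
        // prints "1, 2, 3, 1"
    }
}
```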

You could do the following (without linq).
var collection = new [] { 2, 7, 7, 7, 2, 6, 4 }.ToList();
for (int i = 0; i < collection.Count - 1; i++)
{
if (collection[i] == collection[i + 1])
{
collection.RemoveAt(i);
i--;
}
}
Another yield based solution would be this.
public static IEnumerable<T> RemoveConsecutiveDuplicates<T>(this IEnumerable<T> collection)
{
using (var enumerator = collection.GetEnumerator())
{
bool wasNotLast = enumerator.MoveNext(),
hasEntry = wasNotLast;
T last = hasEntry ? enumerator.Current : default(T);
while(wasNotLast)
{
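// EqualityComparer avoids a NullReferenceException when the sequence contains null.
if (!EqualityComparer<T>.Default.Equals(last, enumerator.Current))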
yield return last;
last = enumerator.Current;
wasNotLast = enumerator.MoveNext();
}
if (hasEntry)
yield return last;
}
}

Combine entries from two lists by position using LINQ

Say I have two lists with following entries
List<int> a = new List<int> { 1, 2, 5, 10 };
List<int> b = new List<int> { 6, 20, 3 };
I want to create another List c where its entries are items inserted by position from two lists. So List c would contain the following entries:
List<int> c = {1, 6, 2, 20, 5, 3, 10}
Is there a way to do it in .NET using LINQ? I was looking at .Zip() LINQ extension, but wasn't sure how to use it in this case.
Thanks in advance!

To do it using LINQ, you can use this piece of LINQPad example code:
void Main()
{
List<int> a = new List<int> { 1, 2, 5, 10 };
List<int> b = new List<int> { 6, 20, 3 };
var result = Enumerable.Zip(a, b, (aElement, bElement) => new[] { aElement, bElement })
.SelectMany(ab => ab)
.Concat(a.Skip(Math.Min(a.Count, b.Count)))
.Concat(b.Skip(Math.Min(a.Count, b.Count)));
result.Dump();
}
Output (the resulting sequence): 1, 6, 2, 20, 5, 3, 10
This will:
Zip the two lists together (which will stop when either runs out of elements)
Producing an array containing the two elements (one from a, another from b)
Using SelectMany to "flatten" this out to one sequence of values
Concatenate in the remainder from either list (only one or neither of the two calls to Concat should add any elements)
Now, having said that, personally I would've used this:
public static IEnumerable<T> Intertwine<T>(this IEnumerable<T> a, IEnumerable<T> b)
{
using (var enumerator1 = a.GetEnumerator())
using (var enumerator2 = b.GetEnumerator())
{
bool more1 = enumerator1.MoveNext();
bool more2 = enumerator2.MoveNext();
while (more1 && more2)
{
yield return enumerator1.Current;
yield return enumerator2.Current;
more1 = enumerator1.MoveNext();
more2 = enumerator2.MoveNext();
}
while (more1)
{
yield return enumerator1.Current;
more1 = enumerator1.MoveNext();
}
while (more2)
{
yield return enumerator2.Current;
more2 = enumerator2.MoveNext();
}
}
}
Reasons:
It enumerates neither a nor b more than once
I'm skeptical about the performance of Skip
It can work with any IEnumerable<T> and not just List<T>

I'd create an extension method to do it.
public static List<T> MergeAll<T>(this List<T> first, List<T> second)
{
int maxCount = Math.Max(first.Count, second.Count);
var ret = new List<T>();
for (int i = 0; i < maxCount; i++)
{
// Add an element only while index i is still inside that list.
if (i < first.Count)
ret.Add(first[i]);
if (i < second.Count)
ret.Add(second[i]);
}
return ret;
}
This would iterate through both lists once. If one list is bigger than the other it will continue to add until it's done.

You could try this code:
List<int> c = a.Select((i, index) => new Tuple<int, int>(i, index * 2))
.Union(b.Select((i, index) => new Tuple<int, int>(i, index * 2 + 1)))
.OrderBy(t => t.Item2)
.Select(t => t.Item1).ToList();
It makes a union of two collections and then sorts that union using index. Elements from the first collection have even indices, from the second - odd ones.

Just wrote a little extension for this:
public static class MyEnumerable
{
public static IEnumerable<T> Smash<T>(this IEnumerable<T> one, IEnumerable<T> two)
{
using (IEnumerator<T> enumeratorOne = one.GetEnumerator(),
enumeratorTwo = two.GetEnumerator())
{
bool twoFinished = false;
while (enumeratorOne.MoveNext())
{
yield return enumeratorOne.Current;
if (!twoFinished && enumeratorTwo.MoveNext())
{
yield return enumeratorTwo.Current;
}
}
if (!twoFinished)
{
while (enumeratorTwo.MoveNext())
{
yield return enumeratorTwo.Current;
}
}
}
}
}
Usage:
var a = new List<int> { 1, 2, 5, 10 };
var b = new List<int> { 6, 20, 3 };
var c = a.Smash(b); // 1, 6, 2, 20, 5, 3, 10
var d = b.Smash(a); // 6, 1, 20, 2, 3, 5, 10
This will work for any IEnumerable so you can also do:
var a = new List<string> { "the", "brown", "jumped", "the", "lazy", "dog" };
var b = new List<string> { "quick", "fox", "over" };
var c = a.Smash(b); // the, quick, brown, fox, jumped, over, the, lazy, dog

You could use Concat and an anonymous type which you order by the index:
List<int> c = a
.Select((val, index) => new { val, index })
.Concat(b.Select((val, index) => new { val, index }))
.OrderBy(x => x.index)
.Select(x => x.val)
.ToList();
However, since that's not really elegant and also less efficient than:
c = new List<int>(a.Count + b.Count);
int max = Math.Max(a.Count, b.Count);
int aMax = a.Count;
int bMax = b.Count;
for (int i = 0; i < max; i++)
{
if(i < aMax)
c.Add(a[i]);
if(i < bMax)
c.Add(b[i]);
}
I wouldn't use LINQ at all.

Sorry for adding a third extension method inspired by the other two, but I like it shorter:
static IEnumerable<T> Intertwine<T>(this IEnumerable<T> a, IEnumerable<T> b)
{
using (var enumerator1 = a.GetEnumerator())
using (var enumerator2 = b.GetEnumerator()) {
bool more1 = true, more2 = true;
do {
if (more1 && (more1 = enumerator1.MoveNext()))
yield return enumerator1.Current;
if (more2 && (more2 = enumerator2.MoveNext()))
yield return enumerator2.Current;
} while (more1 || more2);
}
}

LINQ to count Continues repeated items(int) in an int Array?

Here is a scenario for my question: I have an array, say:
{ 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 }
The result should be something like this (array element => its count):
4 => 1
1 => 2
3 => 2
2 => 1
5 => 1
3 => 1
2 => 2
I know this can be achieved with a for loop.
But I have googled a lot for a way to do this in fewer lines of code using LINQ, without success.

I believe the best way to do this is to create a "LINQ-like" extension method using an iterator block. This lets you perform the calculation in a single pass over your data. Note that performance isn't important at all if you just want to run the calculation on a small array of numbers. Of course, this is really your for loop in disguise.
static class Extensions {
public static IEnumerable<Tuple<T, Int32>> ToRunLengths<T>(this IEnumerable<T> source) {
using (var enumerator = source.GetEnumerator()) {
// Empty input leads to empty output.
if (!enumerator.MoveNext())
yield break;
// Retrieve first item of the sequence.
var currentValue = enumerator.Current;
var runLength = 1;
// Iterate the remaining items in the sequence.
while (enumerator.MoveNext()) {
var value = enumerator.Current;
if (!Equals(value, currentValue)) {
// A new run is starting. Return the previous run.
yield return Tuple.Create(currentValue, runLength);
currentValue = value;
runLength = 0;
}
runLength += 1;
}
// Return the last run.
yield return Tuple.Create(currentValue, runLength);
}
}
}
Note that the extension method is generic and you can use it on any type. Values are compared for equality using Object.Equals. However, if you want to you could pass an IEqualityComparer<T> to allow for customization of how values are compared.
You can use the method like this:
var numbers = new[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var runLengths = numbers.ToRunLengths();
For your input data the result will be these tuples:
4 1
1 2
3 2
2 1
5 1
3 1
2 2
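The IEqualityComparer<T> customization mentioned above could be sketched as follows. This is one possible shape, not the original answer's code: the optional comparer parameter and the demo class are assumptions, and the case-insensitive example is only there to show the comparer having an effect.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunLengthExtensions
{
    // Same single-pass idea, with an optional comparer (an assumption of this
    // sketch) that decides when two neighbouring values belong to the same run.
    public static IEnumerable<Tuple<T, int>> ToRunLengths<T>(
        this IEnumerable<T> source, IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        using (var enumerator = source.GetEnumerator())
        {
            // Empty input leads to empty output.
            if (!enumerator.MoveNext())
                yield break;
            var currentValue = enumerator.Current;
            var runLength = 1;
            while (enumerator.MoveNext())
            {
                if (comparer.Equals(enumerator.Current, currentValue))
                {
                    runLength++;
                }
                else
                {
                    // A new run starts: report the finished one first.
                    yield return Tuple.Create(currentValue, runLength);
                    currentValue = enumerator.Current;
                    runLength = 1;
                }
            }
            yield return Tuple.Create(currentValue, runLength);
        }
    }
}

public static class RunLengthDemo
{
    public static void Main()
    {
        var words = new[] { "a", "A", "b", "B", "b", "c" };
        var runs = words.ToRunLengths(StringComparer.OrdinalIgnoreCase);
        Console.WriteLine(string.Join(", ", runs.Select(r => r.Item1 + " => " + r.Item2)));
        // prints "a => 2, b => 3, c => 1"
    }
}
```

Each run is reported with the first value that started it, which is why the output shows "a" and "b" rather than their upper-case variants.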

(Adding another answer to avoid the two upvotes for my deleted one counting towards this...)
I've had a little think about this (now I've understood the question) and it's really not clear how you'd do this nicely in LINQ. There are definitely ways that it could be done, potentially using Zip or Aggregate, but they'd be relatively unclear. Using foreach is pretty simple:
// Simplest way of building an empty list of an anonymous type...
var results = new[] { new { Value = 0, Count = 0 } }.Take(0).ToList();
// TODO: Handle empty arrays
int currentValue = array[0];
int currentCount = 1;
foreach (var value in array.Skip(1))
{
if (currentValue != value)
{
results.Add(new { Value = currentValue, Count = currentCount });
currentCount = 0;
currentValue = value;
}
currentCount++;
}
// Handle tail, which we won't have emitted yet
results.Add(new { Value = currentValue, Count = currentCount });

Here's a LINQ expression that works (edit: tightened up code just a little more):
var data = new int[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var result = data.Select ((item, index) =>
new
{
Key = item,
Count = (index == 0 || data.ElementAt(index - 1) != item)
? data.Skip(index).TakeWhile (d => d == item).Count ()
: -1
}
)
.Where (d => d.Count != -1);
And here's a proof that shows it working.

This not short enough?
public static IEnumerable<KeyValuePair<T, int>> Repeats<T>(
this IEnumerable<T> source)
{
int count = 0;
T lastItem = source.First();
foreach (var item in source)
{
if (Equals(item, lastItem))
{
count++;
}
else
{
yield return new KeyValuePair<T, int>(lastItem, count);
lastItem = item;
count = 1;
}
}
yield return new KeyValuePair<T, int>(lastItem, count);
}
I'll be interested to see a linq way.

I already wrote the method you need over there. Here's how to call it.
foreach(var g in numbers.GroupContiguous(i => i))
{
Console.WriteLine("{0} => {1}", g.Key, g.Count);
}

Behold (you can run this directly in LINQPad -- rle is where the magic happens):
var xs = new[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var rle = Enumerable.Range(0, xs.Length)
.Where(i => i == 0 || xs[i - 1] != xs[i])
.Select(i => new { Key = xs[i], Count = xs.Skip(i).TakeWhile(x => x == xs[i]).Count() });
Console.WriteLine(rle);
Of course, this is O(n^2), but you didn't request linear efficiency in the spec.

var array = new int[] {1,1,2,3,5,6,6 };
foreach (var g in array.GroupBy(i => i))
{
Console.WriteLine("{0} => {1}", g.Key, g.Count());
}

var array = new int[] { }; // whatever your array is
var counts = array.Select(s => array.Count(s2 => s == s2));
The only problem with this is that if a value such as 1 occurs twice, you will get its result twice.

var array = new int[] {1,1,2,3,5,6,6 };
var arrayd = array.Distinct();
var arrayl= arrayd.Select(s => { return array.Where(s2 => s2 == s).Count(); }).ToArray();
Output
arrayl=[0]2 [1]1 [2]1 [3]1 [4]2

Try GroupBy through List<int>
List<int> list = new List<int>() { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var res = list.GroupBy(val => val);
foreach (var v in res)
{
MessageBox.Show(v.Key.ToString() + "=>" + v.Count().ToString());
}

Take every 2nd object in list

I have an IEnumerable and I want to get a new IEnumerable containing every nth element.
Can this be done in Linq?

Just figured it out myself...
The IEnumerable<T>.Where() method has an overload that takes the index of the current element - just what the doctor ordered.
(new []{1,2,3,4,5}).Where((elem, idx) => idx % 2 == 0);
This would return
{1, 3, 5}
Update: In order to cover both my use case and Dan Tao's suggestion, let's also specify what the first returned element should be:
var firstIdx = 1;
var takeEvery = 2;
var list = new []{1,2,3,4,5};
var newList = list
.Skip(firstIdx)
.Where((elem, idx) => idx % takeEvery == 0);
...would return
{2, 4}

To implement Cristi's suggestion:
public static IEnumerable<T> Sample<T>(this IEnumerable<T> source, int interval)
{
// null check, out of range check go here
return source.Where((value, index) => (index + 1) % interval == 0);
}
Usage:
var upToTen = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var evens = upToTen.Sample(2);
var multiplesOfThree = upToTen.Sample(3);
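The null and range checks left as a comment above might be filled in like this. It is a minimal sketch; the exception types and message are my own choice, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SamplingExtensions
{
    // Returns every interval-th element (the 2nd, 4th, ... for interval == 2).
    public static IEnumerable<T> Sample<T>(this IEnumerable<T> source, int interval)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));
        // An interval of 0 would divide by zero; negative values make no sense here.
        if (interval < 1)
            throw new ArgumentOutOfRangeException(nameof(interval), "interval must be at least 1");

        return source.Where((value, index) => (index + 1) % interval == 0);
    }
}

public static class SamplingDemo
{
    public static void Main()
    {
        var upToTen = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        Console.WriteLine(string.Join(", ", upToTen.Sample(3))); // prints "3, 6, 9"
    }
}
```

Because the method is not an iterator block, the argument checks run as soon as Sample is called rather than when the result is first enumerated.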

While not LINQ you may also create an extension method with yield.
public static IEnumerable<T> EverySecondObject<T>(this IEnumerable<T> list)
{
using (var enumerator = list.GetEnumerator())
{
while (true)
{
if (!enumerator.MoveNext())
yield break;
if (enumerator.MoveNext())
yield return enumerator.Current;
else
yield break;
}
}
}
