Getting the index of a sequence of items - c#

I was trying to get the index of a sequence of items inside an IEnumerable<T>
var collection = new[] { 1, 2, 3, 4, 5 };
var sequence = new[] { 2, 3 };
// IndexOf is an extension method.
collection.IndexOf(sequence); // Should return 1
I wrote an IndexOf extension method for this. It works fine unless the first item of the sequence appears more than once, consecutively, in the collection:
// There are two items that are 2, consecutively in the collection,
// which is the first item of the sequence.
var collection = new[] { 1, 2, 2, 3, 4, 5 };
var sequence = new[] { 2, 3 };
collection.IndexOf(sequence); // Should return 2 but returns -1
Here is the IndexOf method:
public static int IndexOf<T>(this IEnumerable<T> collection,
IEnumerable<T> sequence)
{
var comparer = EqualityComparer<T>.Default;
var counter = 0;
var index = 0;
var seqEnumerator = sequence.GetEnumerator();
foreach (var item in collection)
if (seqEnumerator.MoveNext())
{
if (!comparer.Equals(item, seqEnumerator.Current))
{
seqEnumerator.Dispose();
seqEnumerator = sequence.GetEnumerator();
counter = 0;
// UPDATED AFTER MICHAEL'S ANSWER,
// IT WORKS WITH THIS ADDED PART:
seqEnumerator.MoveNext();
if (comparer.Equals(item, seqEnumerator.Current))
counter++;
}
else counter++;
index++;
}
else break;
var done = !seqEnumerator.MoveNext();
seqEnumerator.Dispose();
return done ? index - counter : -1;
}
I couldn't figure out how to fix this.

public static int IndexOf<T>(this IEnumerable<T> collection,
IEnumerable<T> sequence)
{
var ccount = collection.Count();
var scount = sequence.Count();
if (scount > ccount) return -1;
if (collection.Take(scount).SequenceEqual(sequence)) return 0;
int index = Enumerable.Range(1, ccount - scount + 1)
.FirstOrDefault(i => collection.Skip(i).Take(scount).SequenceEqual(sequence));
if (index == 0) return -1;
return index;
}

When you hit a mismatch at any position other than the first, you restart the sequence enumerator but never check whether the current item matches the first item of the sequence. As a result, the second 2 in the collection is never compared with the 2 in the sequence.
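Restarting the enumerator on a mismatch can still miss matches that overlap a failed partial match (for example, looking for { 1, 1, 2 } in { 1, 1, 1, 2 }). A minimal sketch that sidesteps this by buffering both inputs and trying every start position is shown here. The name IndexOfSequence and the optional comparer parameter are illustrative assumptions, not part of the original code, and treating an empty sequence as found at index 0 is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceSearchSketch
{
    // Hypothetical helper: index of the first occurrence of `sequence`
    // inside `collection`, or -1 when there is none.
    public static int IndexOfSequence<T>(
        this IEnumerable<T> collection,
        IEnumerable<T> sequence,
        IEqualityComparer<T> comparer = null)
    {
        if (collection == null) throw new ArgumentNullException(nameof(collection));
        if (sequence == null) throw new ArgumentNullException(nameof(sequence));
        comparer = comparer ?? EqualityComparer<T>.Default;

        // Materialize both inputs once so they can be indexed.
        var haystack = collection.ToList();
        var needle = sequence.ToList();

        // Assumption: an empty sequence is considered found at index 0.
        if (needle.Count == 0) return 0;

        // Try every start position that leaves room for the whole needle.
        for (int start = 0; start <= haystack.Count - needle.Count; start++)
        {
            int matched = 0;
            while (matched < needle.Count &&
                   comparer.Equals(haystack[start + matched], needle[matched]))
            {
                matched++;
            }
            if (matched == needle.Count) return start;
        }
        return -1;
    }
}
```

This is O(n * m) in the worst case and gives up laziness, which is probably acceptable for small inputs. With it, new[] { 1, 2, 2, 3, 4, 5 }.IndexOfSequence(new[] { 2, 3 }) returns 2.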

Related

Using LINQ to remove duplicate content in list?

Suppose I have this list:
int[] list = { 1, 2, 3, 3, 1 };
What I would like is to remove duplicates that immediately follow same number. So in this case I want to remove 3, but not 1.
New list should therefore be: {1, 2, 3, 1}
Another example is this list: {2, 7, 7, 7, 2, 6, 4} which will become {2, 7, 2, 6, 4}.
Can I do this with LINQ?
You could use Aggregate if you want to use an existing LINQ method but such an approach would lose laziness. You can write your own extension method:
public static IEnumerable<T> RemoveConsecutiveDuplicates<T>(this IEnumerable<T> source, IEqualityComparer<T> comp = null)
{
comp = comp ?? EqualityComparer<T>.Default;
using (var e = source.GetEnumerator())
{
if (e.MoveNext())
{
T last = e.Current;
yield return e.Current;
while (e.MoveNext())
{
if (!comp.Equals(e.Current, last))
{
yield return e.Current;
last = e.Current;
}
}
}
}
}
If you insist on doing this with LINQ, you could use Aggregate:
var result = array.Aggregate(new List<int>(), (a, b) =>
{
if (!a.Any() || a.Last() != b)
a.Add(b);
return a;
});
But this isn't necessarily the most efficient solution because of Any and Last in each iteration. A simple foreach comparing the previous and current iteration value will perform much better.
You can use Pairwise from MoreLINQ like this:
var result =
new[] {list[0]}
.Concat(
list
.Pairwise((x, y) => new {Item = y, Same = x == y})
.Where(x => !x.Same)
.Select(x => x.Item))
.ToArray();
Pairwise lets you get a sequence that results from applying a function to each item in the original sequence along with the item before it (except for the first item).
What I am doing here is, for each item (except the first), getting the item itself and a boolean (Same) indicating whether this item equals the item before it. Then I filter the sequence to keep only the items that do not equal the item before them. Finally, I prepend the first item of the original list to the new sequence.
Note: don't forget to handle the case where list is empty.
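One way to cover the empty-list case without MoreLINQ is the indexed overload of Where, which never touches list[0] when there is nothing to enumerate. This is only a sketch: the class and method names are made up for illustration, and it assumes the input supports indexing (an array or List<T>).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConsecutiveDuplicatesSketch
{
    // Hypothetical helper: keeps an item only if it is the first one
    // or differs from the item directly before it.
    public static List<T> WithoutConsecutiveDuplicates<T>(IList<T> list)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        var comparer = EqualityComparer<T>.Default;
        return list
            .Where((item, i) => i == 0 || !comparer.Equals(item, list[i - 1]))
            .ToList();
    }
}
```

An empty input simply yields an empty list, because the predicate is never invoked.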
You could do the following (without LINQ):
var collection = new [] { 2, 7, 7, 7, 2, 6, 4 }.ToList();
for (int i = 0; i < collection.Count - 1; i++)
{
if (collection[i] == collection[i + 1])
{
collection.RemoveAt(i);
i--;
}
}
Another yield based solution would be this.
public static IEnumerable<T> RemoveConsecutiveDuplicates<T>(this IEnumerable<T> collection)
{
using (var enumerator = collection.GetEnumerator())
{
bool wasNotLast = enumerator.MoveNext(),
hasEntry = wasNotLast;
T last = hasEntry ? enumerator.Current : default(T);
while(wasNotLast)
{
if (!last.Equals(enumerator.Current))
yield return last;
last = enumerator.Current;
wasNotLast = enumerator.MoveNext();
}
if (hasEntry)
yield return last;
}
}

Cluster items by condition and order using linq

I want to cluster some numbers by condition and their order in the list.
int delta = 3;
var numbers = new List<int>() { 2, 4, 9, 6, 3, 2, 7, 7, 4, 1, 9, 1, 2 };
var g = numbers.GroupBy(n => n <= delta);
This gives two groups based on the condition. What I want is:
g1: 2
g2: 4, 9, 6
g3: 3, 2
g4: 7, 7, 4
g5: 1
g6: 9
g7: 1, 2
edit
The condition is to group them based on a condition (here it is number <= delta), but every group should only contain numbers that are next to each other in the first list.
If I understand the logic, you want to create a new group whenever item n passes the condition but item n - 1 fails, or vice-versa.
Well, normally you wouldn't use LINQ for such a thing. You'd have to iterate over the list one item at a time and build the result set yourself. For example:
List<int> list = null;
var result = new List<IEnumerable<int>>();
bool? prev = null;
foreach (var n in numbers)
{
bool cur = n <= delta;
if (prev != cur)
{
list = new List<int>();
result.Add(list);
prev = cur;
}
list.Add(n);
}
But here is a workable solution in LINQ. It depends on side effects, which you should normally avoid:
var prev = numbers.First() <= delta;
var counter = 0;
var result = numbers.GroupBy(n => (prev != (prev = n <= delta)) ? ++counter : counter)
.ToList();
So what we are conceptually doing here is going through the list and grouping while a condition is met. We can write a corresponding operation just for that without too much difficulty:
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
this IEnumerable<T> source, Func<T, T, bool> predicate)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
yield break;
List<T> list = new List<T>() { iterator.Current };
T previous = iterator.Current;
while (iterator.MoveNext())
{
if (!predicate(previous, iterator.Current))
{
yield return list;
list = new List<T>();
}
list.Add(iterator.Current);
previous = iterator.Current;
}
yield return list;
}
}
We can now write:
var groups = numbers.GroupWhile((prev,next) =>
(prev <= delta) == (next <= delta));
Here the predicate says when to stay in the current group: while the previous item's comparison against delta gives the same result as the current item's. A new group starts as soon as the two results differ.
If you're a fan of fold you can write it this way:
var groups = numbers.Skip(1).Aggregate(new List<List<int>>(new[] { new List<int> { numbers[0] } }), (acc, b) =>
{
if ((acc.Last().LastOrDefault() <= delta) == (b <= delta))
{
acc.Last().Add(b);
}
else
{
acc.Add(new List<int>() { b });
}
return acc;
});
Here groups is of type List<List<int>>.

LINQ to count continuous repeated items (int) in an int array?

Here is a scenario for my question: I have an array, say:
{ 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 }
The result should be something like this (array element => its count):
4 => 1
1 => 2
3 => 2
2 => 1
5 => 1
3 => 1
2 => 2
I know this can be achieved with a for loop.
But I googled a lot for a way to do this in fewer lines of code using LINQ, without success.
I believe the best way to do this is to create a "LINQ-like" extension method using an iterator block. This lets you perform the calculation in a single pass over your data. Note that performance isn't important at all if you just want to perform the calculation on a small array of numbers. Of course, this is really your for loop in disguise.
static class Extensions {
public static IEnumerable<Tuple<T, Int32>> ToRunLengths<T>(this IEnumerable<T> source) {
using (var enumerator = source.GetEnumerator()) {
// Empty input leads to empty output.
if (!enumerator.MoveNext())
yield break;
// Retrieve first item of the sequence.
var currentValue = enumerator.Current;
var runLength = 1;
// Iterate the remaining items in the sequence.
while (enumerator.MoveNext()) {
var value = enumerator.Current;
if (!Equals(value, currentValue)) {
// A new run is starting. Return the previous run.
yield return Tuple.Create(currentValue, runLength);
currentValue = value;
runLength = 0;
}
runLength += 1;
}
// Return the last run.
yield return Tuple.Create(currentValue, runLength);
}
}
}
Note that the extension method is generic and you can use it on any type. Values are compared for equality using Object.Equals. However, if you want to you could pass an IEqualityComparer<T> to allow for customization of how values are compared.
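The comparer-based variant mentioned above might look roughly like the following. This is a sketch rather than the answerer's code: the class name, the private Iterator helper, and the fallback to EqualityComparer<T>.Default when the comparer is null are all assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class RunLengthSketch
{
    // Overload taking a caller-supplied comparer (assumed signature).
    public static IEnumerable<Tuple<T, int>> ToRunLengths<T>(
        this IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return Iterator(source, comparer ?? EqualityComparer<T>.Default);
    }

    private static IEnumerable<Tuple<T, int>> Iterator<T>(
        IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        using (var enumerator = source.GetEnumerator())
        {
            // Empty input leads to empty output.
            if (!enumerator.MoveNext())
                yield break;

            var currentValue = enumerator.Current;
            var runLength = 1;
            while (enumerator.MoveNext())
            {
                if (comparer.Equals(enumerator.Current, currentValue))
                {
                    runLength++;
                }
                else
                {
                    // A new run is starting. Return the previous run.
                    yield return Tuple.Create(currentValue, runLength);
                    currentValue = enumerator.Current;
                    runLength = 1;
                }
            }
            // Return the last run.
            yield return Tuple.Create(currentValue, runLength);
        }
    }
}
```

For example, new[] { "a", "A", "b" }.ToRunLengths(StringComparer.OrdinalIgnoreCase) treats "a" and "A" as one run of length 2, followed by a run of "b" with length 1.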
You can use the method like this:
var numbers = new[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var runLengths = numbers.ToRunLengths();
For your input data the result will be these tuples:
4 1
1 2
3 2
2 1
5 1
3 1
2 2
(Adding another answer to avoid the two upvotes for my deleted one counting towards this...)
I've had a little think about this (now I've understood the question) and it's really not clear how you'd do this nicely in LINQ. There are definitely ways that it could be done, potentially using Zip or Aggregate, but they'd be relatively unclear. Using foreach is pretty simple:
// Simplest way of building an empty list of an anonymous type...
var results = new[] { new { Value = 0, Count = 0 } }.Take(0).ToList();
// TODO: Handle empty arrays
int currentValue = array[0];
int currentCount = 1;
foreach (var value in array.Skip(1))
{
if (currentValue != value)
{
results.Add(new { Value = currentValue, Count = currentCount });
currentCount = 0;
currentValue = value;
}
currentCount++;
}
// Handle tail, which we won't have emitted yet
results.Add(new { Value = currentValue, Count = currentCount });
Here's a LINQ expression that works (edit: tightened up code just a little more):
var data = new int[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var result = data.Select ((item, index) =>
new
{
Key = item,
Count = (index == 0 || data.ElementAt(index - 1) != item)
? data.Skip(index).TakeWhile (d => d == item).Count ()
: -1
}
)
.Where (d => d.Count != -1);
And here's a proof that shows it working.
This not short enough?
public static IEnumerable<KeyValuePair<T, int>> Repeats<T>(
this IEnumerable<T> source)
{
int count = 0;
T lastItem = source.First();
foreach (var item in source)
{
if (Equals(item, lastItem))
{
count++;
}
else
{
yield return new KeyValuePair<T, int>(lastItem, count);
lastItem = item;
count = 1;
}
}
yield return new KeyValuePair<T, int>(lastItem, count);
}
I'll be interested to see a linq way.
I already wrote the method you need over there. Here's how to call it.
foreach(var g in numbers.GroupContiguous(i => i))
{
Console.WriteLine("{0} => {1}", g.Key, g.Count);
}
Behold (you can run this directly in LINQPad -- rle is where the magic happens):
var xs = new[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var rle = Enumerable.Range(0, xs.Length)
.Where(i => i == 0 || xs[i - 1] != xs[i])
.Select(i => new { Key = xs[i], Count = xs.Skip(i).TakeWhile(x => x == xs[i]).Count() });
Console.WriteLine(rle);
Of course, this is O(n^2), but you didn't request linear efficiency in the spec.
var array = new int[] {1,1,2,3,5,6,6 };
foreach (var g in array.GroupBy(i => i))
{
Console.WriteLine("{0} => {1}", g.Key, g.Count());
}
var array = new int[] { }; // whatever your array is
var counts = array.Select(s => array.Where(s2 => s == s2).Count());
The only problem with this is that a value that occurs twice gets its count reported twice.
var array = new int[] {1,1,2,3,5,6,6 };
var arrayd = array.Distinct();
var arrayl= arrayd.Select(s => { return array.Where(s2 => s2 == s).Count(); }).ToArray();
Output
arrayl=[0]2 [1]1 [2]1 [3]1 [4]2
Try GroupBy through List<int>
List<int> list = new List<int>() { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var res = list.GroupBy(val => val);
foreach (var v in res)
{
MessageBox.Show(v.Key.ToString() + "=>" + v.Count().ToString());
}

Take every 2nd object in list

I have an IEnumerable and I want to get a new IEnumerable containing every nth element.
Can this be done in Linq?
Just figured it out myself...
The IEnumerable<T>.Where() method has an overload that takes the index of the current element - just what the doctor ordered.
(new []{1,2,3,4,5}).Where((elem, idx) => idx % 2 == 0);
This would return
{1, 3, 5}
Update: In order to cover both my use case and Dan Tao's suggestion, let's also specify what the first returned element should be:
var firstIdx = 1;
var takeEvery = 2;
var list = new []{1,2,3,4,5};
var newList = list
.Skip(firstIdx)
.Where((elem, idx) => idx % takeEvery == 0);
...would return
{2, 4}
To implement Cristi's suggestion:
public static IEnumerable<T> Sample<T>(this IEnumerable<T> source, int interval)
{
// null check, out of range check go here
return source.Where((value, index) => (index + 1) % interval == 0);
}
Usage:
var upToTen = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var evens = upToTen.Sample(2);
var multiplesOfThree = upToTen.Sample(3);
While not LINQ you may also create an extension method with yield.
public static IEnumerable<T> EverySecondObject<T>(this IEnumerable<T> list)
{
using (var enumerator = list.GetEnumerator())
{
while (true)
{
if (!enumerator.MoveNext())
yield break;
if (enumerator.MoveNext())
yield return enumerator.Current;
else
yield break;
}
}
}
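Since the question asks for every nth element, the same yield-based idea can be generalized with a counter. The following is a sketch under that assumption: EveryNth and its parameter n are hypothetical names, and it returns the nth, 2nth, ... items (so n = 2 behaves like EverySecondObject above).

```csharp
using System;
using System.Collections.Generic;

public static class EveryNthSketch
{
    // Hypothetical generalization: yields items number n, 2n, 3n, ...
    public static IEnumerable<T> EveryNth<T>(this IEnumerable<T> source, int n)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        return Iterator(source, n);
    }

    private static IEnumerable<T> Iterator<T>(IEnumerable<T> source, int n)
    {
        int position = 0;
        foreach (var item in source)
        {
            position++;
            if (position == n)
            {
                yield return item;
                position = 0; // start counting towards the next nth item
            }
        }
    }
}
```

Splitting the argument checks from the iterator means invalid arguments throw at the call site instead of on first enumeration.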

C# - elegant way of partitioning a list?

I'd like to partition a list into a list of lists, by specifying the number of elements in each partition.
For instance, suppose I have the list {1, 2, ... 11}, and would like to partition it such that each set has 4 elements, with the last set filling as many elements as it can. The resulting partition would look like {{1..4}, {5..8}, {9..11}}
What would be an elegant way of writing this?
Here is an extension method that will do what you want:
public static IEnumerable<List<T>> Partition<T>(this IList<T> source, Int32 size)
{
for (int i = 0; i < (source.Count / size) + (source.Count % size > 0 ? 1 : 0); i++)
yield return new List<T>(source.Skip(size * i).Take(size));
}
Edit: Here is a much cleaner version of the function:
public static IEnumerable<List<T>> Partition<T>(this IList<T> source, Int32 size)
{
for (int i = 0; i < Math.Ceiling(source.Count / (Double)size); i++)
yield return new List<T>(source.Skip(size * i).Take(size));
}
Using LINQ you could cut your groups up in a single line of code like this...
var x = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
var groups = x.Select((i, index) => new
{
i,
index
}).GroupBy(group => group.index / 4, element => element.i);
You could then iterate over the groups like the following...
foreach (var group in groups)
{
Console.WriteLine("Group: {0}", group.Key);
foreach (var item in group)
{
Console.WriteLine("\tValue: {0}", item);
}
}
and you'll get an output that looks like this...
Group: 0
Value: 1
Value: 2
Value: 3
Value: 4
Group: 1
Value: 5
Value: 6
Value: 7
Value: 8
Group: 2
Value: 9
Value: 10
Value: 11
Something like (untested air code):
IEnumerable<IList<T>> PartitionList<T>(IList<T> list, int maxCount)
{
List<T> partialList = new List<T>(maxCount);
foreach(T item in list)
{
if (partialList.Count == maxCount)
{
yield return partialList;
partialList = new List<T>(maxCount);
}
partialList.Add(item);
}
if (partialList.Count > 0) yield return partialList;
}
This returns an enumeration of lists rather than a list of lists, but you can easily wrap the result in a list:
IList<IList<T>> listOfLists = new List<IList<T>>(PartitionList<T>(list, maxCount));
To avoid grouping, mathematics and reiteration.
The method avoids unnecessary calculations, comparisons and allocations. Parameter validation is included.
Here is a working demonstration on fiddle.
public static IEnumerable<IList<T>> Partition<T>(
this IEnumerable<T> source,
int size)
{
if (size < 2)
{
throw new ArgumentOutOfRangeException(
nameof(size),
size,
"Must be greater than or equal to 2.");
}
T[] partition;
int count;
using (var e = source.GetEnumerator())
{
if (e.MoveNext())
{
partition = new T[size];
partition[0] = e.Current;
count = 1;
}
else
{
yield break;
}
while(e.MoveNext())
{
partition[count] = e.Current;
count++;
if (count == size)
{
yield return partition;
count = 0;
partition = new T[size];
}
}
}
if (count > 0)
{
Array.Resize(ref partition, count);
yield return partition;
}
}
var yourList = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
var groupSize = 4;
// here's the actual query that does the grouping...
var query = yourList
.Select((x, i) => new { x, i })
.GroupBy(i => i.i / groupSize, x => x.x);
// and here's a quick test to ensure that it worked properly...
foreach (var group in query)
{
foreach (var item in group)
{
Console.Write(item + ",");
}
Console.WriteLine();
}
If you need an actual List<List<T>> rather than an IEnumerable<IEnumerable<T>> then change the query as follows:
var query = yourList
.Select((x, i) => new { x, i })
.GroupBy(i => i.i / groupSize, x => x.x)
.Select(g => g.ToList())
.ToList();
Or in .Net 2.0 you would do this:
static void Main(string[] args)
{
int[] values = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
List<int[]> items = new List<int[]>(SplitArray(values, 4));
}
static IEnumerable<T[]> SplitArray<T>(T[] items, int size)
{
for (int index = 0; index < items.Length; index += size)
{
int remains = Math.Min(size, items.Length-index);
T[] segment = new T[remains];
Array.Copy(items, index, segment, 0, remains);
yield return segment;
}
}
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> list, int size)
{
while (list.Any()) { yield return list.Take(size); list = list.Skip(size); }
}
and for the special case of String
public static IEnumerable<string> Partition(this string str, int size)
{
return str.Partition<char>(size).Select(AsString);
}
public static string AsString(this IEnumerable<char> charList)
{
return new string(charList.ToArray());
}
Using ArraySegments might be a readable and short solution (casting your list to array is required):
var list = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 }; //Added 0 in front on purpose in order to enhance simplicity.
int[] array = list.ToArray();
int step = 4;
List<int[]> listSegments = new List<int[]>();
for(int i = 0; i < array.Length; i+=step)
{
int[] segment = new ArraySegment<int>(array, i, step).ToArray();
listSegments.Add(segment);
}
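Note that the loop above relies on the array length being a multiple of step: the ArraySegment constructor throws an ArgumentException when offset plus count runs past the end of the array. A sketch that clamps the last segment instead is shown here. The class and method names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ArraySegmentSketch
{
    // Hypothetical helper: splits an array into segments of at most `step` items.
    public static List<int[]> Split(int[] array, int step)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        if (step < 1) throw new ArgumentOutOfRangeException(nameof(step));

        var segments = new List<int[]>();
        for (int i = 0; i < array.Length; i += step)
        {
            // Clamp the length so the final, shorter segment stays in range.
            int length = Math.Min(step, array.Length - i);
            segments.Add(new ArraySegment<int>(array, i, length).ToArray());
        }
        return segments;
    }
}
```

With the eleven values 1 to 11 and a step of 4, this gives segments of 4, 4, and 3 items.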
I'm not sure why Jochem's answer using ArraySegment was voted down. It could be really useful as long as you are not going to need to extend the segments (cast to IList). For example, imagine that what you are trying to do is pass segments into a TPL Dataflow pipeline for concurrent processing. Passing the segments in as IList instances allows the same code to deal with arrays and lists agnostically.
Of course, that begs the question: Why not just derive a ListSegment class that does not require wasting memory by calling ToArray()? The answer is that arrays can actually be processed marginally faster in some situations (slightly faster indexing). But you would have to be doing some fairly hardcore processing to notice much of a difference. More importantly, there is no good way to protect against random insert and remove operations by other code holding a reference to the list.
Calling ToArray() on a million value numeric list takes about 3 milliseconds on my workstation. That's usually not too great a price to pay when you're using it to gain the benefits of more robust thread safety in concurrent operations, without incurring the heavy cost of locking.
You could use an extension method:
public static IList<HashSet<T>> Partition<T>(this IEnumerable<T> input, Func<T, object> partitionFunc)
{
Dictionary<object, HashSet<T>> partitions = new Dictionary<object, HashSet<T>>();
object currentKey = null;
foreach (T item in input ?? Enumerable.Empty<T>())
{
currentKey = partitionFunc(item);
if (!partitions.ContainsKey(currentKey))
{
partitions[currentKey] = new HashSet<T>();
}
partitions[currentKey].Add(item);
}
return partitions.Values.ToList();
}
To avoid multiple checks, unnecessary instantiations, and repetitive iterations, you could use the code:
namespace System.Collections.Generic
{
using Linq;
using Runtime.CompilerServices;
public static class EnumerableExtender
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool IsEmpty<T>(this IEnumerable<T> enumerable) => !enumerable?.GetEnumerator()?.MoveNext() ?? true;
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> source, int size)
{
if (source == null)
throw new ArgumentNullException(nameof(source));
if (size < 2)
throw new ArgumentOutOfRangeException(nameof(size));
IEnumerable<T> items = source;
IEnumerable<T> partition;
while (true)
{
partition = items.Take(size);
if (partition.IsEmpty())
yield break;
else
yield return partition;
items = items.Skip(size);
}
}
}
}
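For readers on .NET 6 or later, the framework's own Enumerable.Chunk covers this case: it splits a sequence into arrays of at most the given size, with a shorter final array. The wrapper below is only a sketch to show the call. PartitionWithChunk is a made-up name, and the code assumes a .NET 6+ target.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkSketch
{
    // Thin, hypothetical wrapper around Enumerable.Chunk (requires .NET 6+).
    public static List<T[]> PartitionWithChunk<T>(IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return source.Chunk(size).ToList();
    }
}
```

Calling ChunkSketch.PartitionWithChunk(Enumerable.Range(1, 11), 4) yields three arrays: { 1, 2, 3, 4 }, { 5, 6, 7, 8 }, and { 9, 10, 11 }.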
