So I was writing a mergesort in C# as an exercise and although it worked, looking back at the code, there was room for improvement.
Basically, the second part of the algorithm requires a routine to merge two sorted lists.
Here is my way too long implementation that could use some refactoring:
private static List<int> MergeSortedLists(List<int> sLeft, List<int> sRight)
{
if (sLeft.Count == 0 || sRight.Count == 0)
{
sLeft.AddRange(sRight);
return sLeft;
}
else if (sLeft.Count == 1 && sRight.Count == 1)
{
if (sLeft[0] <= sRight[0])
sLeft.Add(sRight[0]);
else
sLeft.Insert(0, sRight[0]);
return sLeft;
}
else if (sLeft.Count == 1 && sRight.Count > 1)
{
for (int i=0; i<sRight.Count; i++)
{
if (sLeft[0] <= sRight[i])
{
sRight.Insert(i, sLeft[0]);
return sRight;
}
}
sRight.Add(sLeft[0]);
return sRight;
}
else if (sLeft.Count > 1 && sRight.Count == 1)
{
for (int i=0; i<sLeft.Count; i++)
{
if (sRight[0] <= sLeft[i])
{
sLeft.Insert(i, sRight[0]);
return sLeft;
}
}
sLeft.Add(sRight[0]);
return sLeft;
}
else
{
List<int> list = new List<int>();
if (sLeft[0] <= sRight[0])
{
list.Add(sLeft[0]);
sLeft.RemoveAt(0);
}
else
{
list.Add(sRight[0]);
sRight.RemoveAt(0);
}
list.AddRange(MergeSortedLists(sLeft, sRight));
return list;
}
}
Surely this routine can be improved/shortened by removing recursion, etc. There are even other ways to merge 2 sorted lists. So any refactoring is welcome.
Although I do have an answer, I'm curious how other programmers would go about improving this routine.
Thank you!
Merging two sorted lists can be done in O(n).
List<int> lList, rList; // the two sorted input lists
List<int> resultList = new List<int>();
int r = 0, l = 0;
while(l < lList.Count && r < rList.Count)
{
if(lList[l] < rList[r])
resultList.Add(lList[l++]);
else
resultList.Add(rList[r++]);
}
//And add the missing parts.
while(l < lList.Count)
resultList.Add(lList[l++]);
while(r < rList.Count)
resultList.Add(rList[r++]);
My take on this would be:
private static List<int> MergeSortedLists(List<int> sLeft, List<int> sRight)
{
List<int> result = new List<int>();
int indexLeft = 0;
int indexRight = 0;
while (indexLeft < sLeft.Count || indexRight < sRight.Count)
{
if (indexRight == sRight.Count ||
(indexLeft < sLeft.Count && sLeft[indexLeft] < sRight[indexRight]))
{
result.Add(sLeft[indexLeft]);
indexLeft++;
}
else
{
result.Add(sRight[indexRight]);
indexRight++;
}
}
return result;
}
Exactly what I'd do if I had to do it by hand. =)
Are you really sure your code works at all? Without testing it, I see the following:
...
else if (sLeft.Count > 1 && sRight.Count == 0) //<-- sRight is empty
{
for (int i=0; i<sLeft.Count; i++)
{
if (sRight[0] <= sLeft[i]) //<-- IndexError?
{
sLeft.Insert(i, sRight[0]);
return sLeft;
}
}
sLeft.Add(sRight[0]);
return sLeft;
}
...
As a starting point, I would remove your special cases for when either of the lists has Count == 1 - they can be handled by your more general (currently recursing) case.
The if (sLeft.Count > 1 && sRight.Count == 0) will never be true because you've checked for sRight.Count == 0 at the start - so this code will never be reached and is redundant.
Finally, instead of recursing (which is very costly in this case due to the number of new Lists you create - one per element!), I'd do something like this in your else (actually, this could replace your entire method):
List<int> list = new List<int>();
while (sLeft.Count > 0 && sRight.Count > 0)
{
if (sLeft[0] <= sRight[0])
{
list.Add(sLeft[0]);
sLeft.RemoveAt(0);
}
else
{
list.Add(sRight[0]);
sRight.RemoveAt(0);
}
}
// one of these two is already empty; the other is in sorted order...
list.AddRange(sLeft);
list.AddRange(sRight);
return list;
(Ideally I'd refactor this to use integer indexes against each list, instead of using .RemoveAt, because it's more performant to loop through the list than destroy it, and because it might be useful to leave the original lists intact. This is still more efficient code than the original, though!)
You were asking for different approaches as well. I might do it as below, depending on the usage. The code below is lazy, so it does not merge the entire list at once but only as elements are requested.
class MergeEnumerable<T> : IEnumerable<T>
{
public IEnumerator<T> GetEnumerator()
{
var left = _left.GetEnumerator();
var right = _right.GetEnumerator();
var leftHasSome = left.MoveNext();
var rightHasSome = right.MoveNext();
while (leftHasSome || rightHasSome)
{
if (leftHasSome && rightHasSome)
{
// both sides still have items: yield the smaller one and advance that side
if(_comparer.Compare(left.Current,right.Current) < 0)
{
yield return left.Current;
leftHasSome = left.MoveNext();
} else {
yield return right.Current;
rightHasSome = right.MoveNext();
}
}
else if (rightHasSome)
{
// left is exhausted: drain the right side
yield return right.Current;
rightHasSome = right.MoveNext();
}
else
{
// right is exhausted: drain the left side
yield return left.Current;
leftHasSome = left.MoveNext();
}
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return ((IEnumerable<T>)this).GetEnumerator();
}
private IEnumerable<T> _left;
private IEnumerable<T> _right;
private IComparer<T> _comparer;
public MergeEnumerable(IEnumerable<T> left, IEnumerable<T> right, IComparer<T> comparer)
{
_left = left;
_right = right;
_comparer = comparer;
}
}
EDIT: It's basically the same implementation as Sergey Osypchuk's. Looking only at the sorting, his will be fastest from start to finish, but the latency will be higher as well, because the entire list is sorted upfront. So, as I said, depending on the usage I might go with this approach; an alternative would be something similar to Sergey Osypchuk's.
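The latency trade-off of a lazy merge can be illustrated with a shorter iterator-method sketch. This is not the class above; the names `LazyMergeSketch`, `Merge` and `EveryOther` are made up for illustration, and the sketch is fixed to `int` with `<=` comparison for brevity.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LazyMergeSketch
{
    // Lazily merges two sorted sequences; nothing is read until the result is enumerated.
    public static IEnumerable<int> Merge(IEnumerable<int> left, IEnumerable<int> right)
    {
        using (var l = left.GetEnumerator())
        using (var r = right.GetEnumerator())
        {
            bool hasLeft = l.MoveNext();
            bool hasRight = r.MoveNext();
            while (hasLeft || hasRight)
            {
                // Take from the left when the right is exhausted or the left value is not larger.
                if (hasLeft && (!hasRight || l.Current <= r.Current))
                {
                    yield return l.Current;
                    hasLeft = l.MoveNext();
                }
                else
                {
                    yield return r.Current;
                    hasRight = r.MoveNext();
                }
            }
        }
    }

    // Endless sorted sequence (hypothetical test input): start, start + 2, start + 4, ...
    public static IEnumerable<int> EveryOther(int start)
    {
        for (int value = start; ; value += 2)
            yield return value;
    }
}
```

Because the merge is lazy, `LazyMergeSketch.Merge(LazyMergeSketch.EveryOther(0), LazyMergeSketch.EveryOther(1)).Take(5)` terminates even though both inputs are endless; an eager merge into a List could never return here.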
Often you can use a stack instead of recursion.
Merging two lists (which, by assumption, are already sorted) could be implemented in the following way:
List<int> MergeSorting(List<int> a, List<int> b)
{
int apos = 0;
int bpos = 0;
List<int> result = new List<int>();
while (apos < a.Count && bpos < b.Count)
{
if (a[apos] <= b[bpos])
{
result.Add(a[apos]);
apos++;
}
else
{
result.Add(b[bpos]);
bpos++;
}
}
// One list is exhausted; append whatever remains of the other.
while (apos < a.Count)
result.Add(a[apos++]);
while (bpos < b.Count)
result.Add(b[bpos++]);
return result;
}
If you start with an unsorted list, you need to split it into sorted subsequences and then merge them using the function above.
I never use recursion for merge sort. You can make iterative passes over the input, taking advantage of the fact that the sorted block size doubles with every merge pass. Keep track of the block size and the count of items you've processed from each input list; when they're equal, the list is exhausted. When both lists are exhausted you can move on to the next pair of blocks. When the block size is greater than or equal to your input size, you're done.
Edit: Some of the information I had left previously was incorrect, due to my misunderstanding - a List in C# is similar to an array and not a linked list. My apologies.
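The iterative, block-doubling passes described in this answer could look roughly like the sketch below. It is only one possible reading of that description: the names `BottomUpMergeSort`, `Sort` and `MergeBlocks` are made up, and it works on arrays with a second buffer rather than on the lists from the question.

```csharp
using System;
using System.Collections.Generic;

public static class BottomUpMergeSort
{
    // Returns a sorted copy of the input by merging blocks of width 1, 2, 4, ... without recursion.
    // Assumes the input is small enough that 2 * width does not overflow an int.
    public static List<int> Sort(IReadOnlyList<int> input)
    {
        int n = input.Count;
        var source = new int[n];
        var target = new int[n];
        for (int i = 0; i < n; i++) source[i] = input[i];

        // Each pass merges neighbouring sorted blocks of size `width` into blocks of size 2 * width.
        for (int width = 1; width < n; width *= 2)
        {
            for (int start = 0; start < n; start += 2 * width)
            {
                int mid = Math.Min(start + width, n);
                int end = Math.Min(start + 2 * width, n);
                MergeBlocks(source, target, start, mid, end);
            }
            // The merged pass becomes the input of the next pass.
            (source, target) = (target, source);
        }
        return new List<int>(source);
    }

    // Merges src[start..mid) and src[mid..end) into dst[start..end).
    private static void MergeBlocks(int[] src, int[] dst, int start, int mid, int end)
    {
        int left = start, right = mid;
        for (int k = start; k < end; k++)
        {
            if (left < mid && (right >= end || src[left] <= src[right]))
                dst[k] = src[left++];
            else
                dst[k] = src[right++];
        }
    }
}
```

The outer loop stops once the block width reaches the input size, which matches the "when the block size is greater than or equal to your input size, you're done" condition.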
I'm out of ideas on this one. Tried originally myself and then copied from SO and google, which worked on all cases except one, however still didn't find a recursive algorithm that is fast enough for that particular test case in my assignment :/
In any case, why this:
public static int FindMaximum(int[] array)
{
if (array is null)
{
throw new ArgumentNullException(nameof(array));
}
if (array.Length == 0)
{
throw new ArgumentException(null);
}
return FindMaxRec(array, array.Length);
}
public static int FindMaxRec(int[] arr, int n)
{
if (n == 1)
{
return arr[0];
}
return Math.Max(arr[n - 1], FindMaxRec(arr, n - 1));
}
doesn't work with this TestCase?:
[Test]
[Order(0)]
[Timeout(5_000)]
public void FindMaximum_TestForLargeArray()
{
int expected = this.max;
int actual = FindMaximum(this.array);
Assert.AreEqual(expected, actual);
}
EDIT 1:
This works fine though, but I need recursive:
public static int FindMaximum(int[] array)
{
if (array is null)
{
throw new ArgumentNullException(nameof(array));
}
if (array.Length == 0)
{
throw new ArgumentException(null);
}
int maxValue = int.MinValue;
for (int i = 0; i < array.Length; i++)
{
if (array[i] > maxValue)
{
maxValue = array[i];
}
}
return maxValue;
}
You can try splitting array in two:
public static int FindMaximum(int[] array) {
if (null == array)
throw new ArgumentNullException(nameof(array));
if (array.Length <= 0)
throw new ArgumentException("Empty array is not allowed.", nameof(array));
return FindMaxRec(array, 0, array.Length - 1);
}
private static int FindMaxRec(int[] array, int from, int to) {
if (to < from)
throw new ArgumentOutOfRangeException(nameof(to));
if (to <= from + 1)
return Math.Max(array[from], array[to]);
return Math.Max(FindMaxRec(array, from, (from + to) / 2),
FindMaxRec(array, (from + to) / 2 + 1, to));
}
Demo:
Random random = new Random(123);
int[] data = Enumerable
.Range(0, 10_000_000)
.Select(_ => random.Next(1_000_000_000))
.ToArray();
Stopwatch sw = new Stopwatch();
sw.Start();
int max = FindMaximum(data);
sw.Stop();
Console.WriteLine($"max = {max}");
Console.WriteLine($"time = {sw.ElapsedMilliseconds}");
Outcome:
max = 999999635
time = 100
An easy way to turn a simple linear algorithm into a recursive one is to make use of the enumerator of the array.
public static int FindMax(int[] values)
{
using var enumerator = ((IEnumerable<int>)values).GetEnumerator(); // int[].GetEnumerator() itself returns the non-generic IEnumerator
return FindMaxRecursively(enumerator, int.MinValue);
}
private static T FindMaxRecursively<T>(IEnumerator<T> enumerator, T currentMax) where T : IComparable
{
if (!enumerator.MoveNext()) return currentMax;
var currentValue = enumerator.Current;
if (currentValue.CompareTo(currentMax) > 0) currentMax = currentValue;
return FindMaxRecursively(enumerator, currentMax);
}
This passes your test case and uses recursion.
Edit: Here is a more beginner friendly version of the above, with comments to explain what it is doing:
public static int FindMax(IEnumerable<int> values)
{
using var enumerator = values.GetEnumerator();//the using statement disposes the enumerator when we are done
//disposing the enumerator releases anything it holds; every call to GetEnumerator() returns a fresh enumerator positioned before the first item
return FindMaxRecursively(enumerator, int.MinValue);
}
private static int FindMaxRecursively(IEnumerator<int> enumerator, int currentMax)
{
if (!enumerator.MoveNext()) //move to the next item in the array. If there are no more items in the array MoveNext() returns false
return currentMax; //if there are no more items in the array return the current maximum value
var currentValue = enumerator.Current;//this is the value in the array at the current index
if (currentValue > currentMax) currentMax = currentValue;//if it's larger than the current maximum update the maximum
return FindMaxRecursively(enumerator, currentMax);//continue on to the next value, making sure to pass the current maximum
}
Something that might help understand this is that the IEnumerator is what enables foreach loops. Under the hood, foreach loops are just repeatedly calling MoveNext on an item that has an IEnumerator. Here is some more info on that topic.
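As a rough illustration of the foreach remark above, the two methods in this sketch do the same work; the second is approximately what the compiler produces for the first. The class and method names are made up for the example.

```csharp
using System.Collections.Generic;

public static class ForeachSketch
{
    // Sums the values with an ordinary foreach loop.
    public static int SumWithForeach(IEnumerable<int> values)
    {
        int sum = 0;
        foreach (int value in values)
            sum += value;
        return sum;
    }

    // Approximately the same loop written by hand against the enumerator.
    public static int SumWithEnumerator(IEnumerable<int> values)
    {
        int sum = 0;
        using (IEnumerator<int> enumerator = values.GetEnumerator())
        {
            // MoveNext() returns false once there are no more items.
            while (enumerator.MoveNext())
                sum += enumerator.Current;
        }
        return sum;
    }
}
```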
public static int findMax(int[] a, int index) {
if (index > 0) {
return Math.Max(a[index], findMax(a, index-1));
} else {
return a[0];
}
}
I have a collection of numbers (Collection) that can be any size and can contain negative and positive numbers. I am trying to split it up based on some criteria. Starting at the first number in the collection, I want to build a collection while the numbers are above -180 and below 180. Any numbers above 180 go in a new collection, and any numbers below -180 go in a new collection. If the numbers fall within the acceptable range again, those go in a new collection as well. The problem is that the collections need to stay in order.
For example.
Take a collection of 100:
the first 50 is between 180 and -180.
the next 20 are below -180
the next 20 are above 180
the last 10 are between 180 and -180
From the collection above I should now have 4 separate collections, in the same order as the original collection.
First collection numbers in original order between 180 and -180
second collection numbers in original order below -180
third collection numbers in original order above 180
fourth collection numbers in original order between 180 and -180
I have made an attempt, but what I have doesn't work and is a nasty mess of if statements. I don't know LINQ very well, but I think there may be a more elegant solution using it. Can anyone help me out here, either by showing me how to write a LINQ statement or by suggesting how to get my if statements to work, if that is the best way?
Collection<Tuple<Collection<double>, int>> collectionOfDataSets = new Collection<Tuple<Collection<double>, int>>();
Collection<double> newDataSet = new Collection<double>();
for (int i = 0; i < dataSet.Count; i++) {
if (dataSet[i] < 180 && dataSet[i] > -180) {
newDataSet.Add(dataSet[i]);
} else {
Tuple<Collection<double>, int> lastEntry = collectionOfDataSets.LastOrDefault(b => b.Item2 == i--);
if (lastEntry != null){
lastEntry.Item1.Add(dataSet[i]);
}
double lastInLastCollection = collectionOfDataSets.ElementAtOrDefault(collectionOfDataSets.Count).Item1.Last();
if (newDataSet.Count > 0 && lastInLastCollection!= dataSet[i]){
collectionOfDataSets.Add(new Tuple<Collection<double>, int>(newDataSet, i));
}
newDataSet = new Collection<double>();
}
}
Thank you in advance for any assistance.
Your example is complicated. I'll first state and solve a simpler problem, then use the same method to solve your original problem.
I want to split a list of numbers into contiguous groups of even and odd numbers. For example, given the list 2,2,4,3,6,2 I would split it into three groups [2,2,4], [3], [6,2]
This can be done concisely with a GroupAdjacentBy method
> var numbers = new List<int>{2,2,4,3,6,2};
> numbers.GroupAdjacentBy(x => x % 2)
[[2,2,4], [3], [6,2]]
To solve your problem, simply replace the even-odd classifying function above with your classification function:
> var points = new List<int>{-180,180};
> var f = new Func<int,int>(x => points.BinarySearch(x));
> var numbers = new List<int>{6,-50,100,190,200,20};
> numbers.GroupAdjacentBy(f)
[[6,-50,100], [190,200], [20]]
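GroupAdjacentBy is not part of the standard LINQ operators, so it has to come from somewhere. A minimal sketch of such an extension method, written here as an assumption about what the answer relies on, might be:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupAdjacentExtensions
{
    // Hypothetical helper: splits a sequence into runs of neighbouring items that share a key.
    public static IEnumerable<List<T>> GroupAdjacentBy<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        List<T> current = null;
        TKey currentKey = default(TKey);

        foreach (T item in source)
        {
            TKey key = keySelector(item);
            // A key change closes the current run.
            if (current != null && !comparer.Equals(key, currentKey))
            {
                yield return current;
                current = null;
            }
            if (current == null)
            {
                current = new List<T>();
                currentKey = key;
            }
            current.Add(item);
        }
        // Emit the final run, if any.
        if (current != null)
            yield return current;
    }
}
```

For the ranges in the question, a classifier such as `x => x < -180 ? -1 : (x > 180 ? 1 : 0)` could be passed as the key selector instead of the BinarySearch trick; both map each number to one of three categories.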
If you need the collections to be updated as soon as the values change, why don't you use properties? Something like
// your original collection
public IList<double> OriginalValues; //= new List<double> { -1000, 5, 7, 1000 };
public IList<double> BelowMinus180
{
get { return OriginalValues.Where(x => x < -180).ToList().AsReadOnly(); }
}
public IList<double> BetweenMinus180And180
{
get { return OriginalValues.Where(x => x >= -180 && x <= 180).ToList().AsReadOnly(); }
}
public IList<double> Above180
{
get { return OriginalValues.Where(x => x > 180).ToList().AsReadOnly(); }
}
public static List<List<T>> PartitionBy<T>(this IEnumerable<T> seq, Func<T, bool> predicate)
{
bool lastPass = true;
return seq.Aggregate(new List<List<T>>(), (partitions, item) =>
{
bool inc = predicate(item);
if (inc == lastPass)
{
if (partitions.Count == 0)
{
partitions.Add(new List<T>());
}
partitions.Last().Add(item);
}
else
{
partitions.Add(new List<T> { item });
}
lastPass = inc;
return partitions;
});
}
You can then use:
List<List<double>> segments = newDataSet.PartitionBy(d => d > -180 && d < 180);
How about this possible solution using two passes? In the first pass we find the indices where a change occurs, and in the second pass we do the actual partitioning.
First an auxiliary method to determine the category:
protected int DetermineCategory(double number)
{
if (number < 180 && number > -180)
return 0;
else if (number < -180)
return 1;
else
return 2;
}
And then the actual algorithm:
List<int> indices = new List<int>();
int currentCategory = -1;
for (int i = 0; i < numbers.Count; i++)
{
int newCat = DetermineCategory(numbers[i]);
if (newCat != currentCategory)
{
indices.Add(i);
currentCategory = newCat;
}
}
indices.Add(numbers.Count); // sentinel so that the final segment is emitted too
List<List<double>> collections = new List<List<double>>(indices.Count);
for (int i = 1; i < indices.Count; ++i)
collections.Add(new List<double>(
numbers.Skip(indices[i - 1]).Take(indices[i] - indices[i - 1])));
Here is a new answer based on the new info you provided. I hope this time I will be closer to what you need
public IEnumerable<IList<double>> GetCollectionOfCollections(IList<double> values, IList<double> boundries)
{
var ordered = values.OrderBy(x => x).ToList();
for (int i = 0; i < boundries.Count; i++)
{
var collection = ordered.Where(x => x < boundries[i]).ToList();
if (collection.Count > 0)
{
ordered = ordered.Except(collection).ToList();
yield return collection.ToList();
}
}
if (ordered.Count() > 0)
{
yield return ordered;
}
}
One method with linq. Untested but should work
var firstSet = dataSet.TakeWhile(x=>x>-180&&x<180);
var totalCount = firstSet.Count();
var secondSet = dataSet.Skip(totalCount).TakeWhile(x=>x<-180);
totalCount+=secondSet.Count();
var thirdSet = dataSet.Skip(totalCount).TakeWhile(x=>x>180);
totalCount += thirdSet.Count();
var fourthSet = dataSet.Skip(totalCount);
There are two lists of string
List<string> A;
List<string> B;
What is the shortest code you would suggest to check that A.Count == B.Count and that each element of A is in B and vice versa: every element of B is in A (the items of A and B may be in a different order)?
If you don't need to worry about duplicates:
bool equal = new HashSet<string>(A).SetEquals(B);
If you are concerned about duplicates, that becomes slightly more awkward. This will work, but it's relatively slow:
bool equal = A.OrderBy(x => x).SequenceEqual(B.OrderBy(x => x));
Of course you can make both options more efficient by checking the count first, which is a simple expression. For example:
bool equal = (A.Count == B.Count) && new HashSet<string>(A).SetEquals(B);
... but you asked for the shortest code :)
A.Count == B.Count && new HashSet<string>(A).SetEquals(B);
If different frequencies of duplicates are an issue, check out this question.
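If you call Enumerable.Except() on the two lists, it returns an IEnumerable<string> containing the elements of the first list that are not in the second. If you call it in both directions and both results are empty, then you know that the two lists contain the same distinct elements. Note that Except is a set operation, so it ignores how often each element occurs.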
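A small sketch of the Except idea, with a made-up class and method name. Since Except(B) only reports elements of the first sequence that are missing from the second, the check is run in both directions here:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ExceptSketch
{
    // True when the counts match and every element of a occurs in b and vice versa.
    // Except is a set operation, so how often each element occurs is not compared.
    public static bool SameDistinctElements(List<string> a, List<string> b)
    {
        return a.Count == b.Count
            && !a.Except(b).Any()
            && !b.Except(a).Any();
    }
}
```

With duplicates this can still report a match for lists such as { "x", "x", "y" } and { "x", "y", "y" }, so it only fits the case where duplicates are not a concern.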
var result = A.Count == B.Count && A.Where(y => B.Contains(y)).Count() == A.Count;
Maybe?
How about a simple loop?
private bool IsEqualLists(List<string> A, List<string> B)
{
// Compares position by position, so both lists must already be in the same order.
if(A.Count != B.Count) {
return false; }
for(int i = 0; i < A.Count; i++)
{
if(!String.Equals(A[i], B[i])) {
return false;
}
}
return true;
}
If you aren't concerned about duplicates, or you're concerned about duplicates but not overly concerned about performance micro-optimisations, then the various techniques in Jon's answer are definitely the way to go.
If you're concerned about duplicates and performance then something like this extension method should do the trick, although it really doesn't meet your "shortest code" criteria!
bool hasSameElements = A.HasSameElements(B);
// ...
public static bool HasSameElements<T>(this IList<T> a, IList<T> b)
{
if (a == b) return true;
if ((a == null) || (b == null)) return false;
if (a.Count != b.Count) return false;
var dict = new Dictionary<T, int>(a.Count);
foreach (T s in a)
{
int count;
dict.TryGetValue(s, out count);
dict[s] = count + 1;
}
foreach (T s in b)
{
int count;
dict.TryGetValue(s, out count);
if (count < 1) return false;
dict[s] = count - 1;
}
return dict.All(kvp => kvp.Value == 0);
}
(Note that this method will return true if both sequences are null. If that's not the desired behaviour then it's easy enough to add in the extra null checks.)
Given a collection, is there a way to get the last N elements of that collection? If there isn't a method in the framework, what would be the best way to write an extension method to do this?
collection.Skip(Math.Max(0, collection.Count() - N));
This approach preserves item order without a dependency on any sorting, and has broad compatibility across several LINQ providers.
It is important to take care not to call Skip with a negative number. Some providers, such as the Entity Framework, will produce an ArgumentException when presented with a negative argument. The call to Math.Max avoids this neatly.
The class below has all of the essentials for extension methods, which are: a static class, a static method, and use of the this keyword.
public static class MiscExtensions
{
// Ex: collection.TakeLast(5);
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int N)
{
return source.Skip(Math.Max(0, source.Count() - N));
}
}
A brief note on performance:
Because the call to Count() can cause enumeration of certain data structures, this approach has the risk of causing two passes over the data. This isn't really a problem with most enumerables; in fact, optimizations exist already for Lists, Arrays, and even EF queries to evaluate the Count() operation in O(1) time.
If, however, you must use a forward-only enumerable and would like to avoid making two passes, consider a one-pass algorithm like Lasse V. Karlsen or Mark Byers describe. Both of these approaches use a temporary buffer to hold items while enumerating, which are yielded once the end of the collection is found.
coll.Reverse().Take(N).Reverse().ToList();
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> coll, int N)
{
return coll.Reverse().Take(N).Reverse();
}
UPDATE: To address clintp's problem: using the TakeLast() method I defined above solves it, but if you really want to do it without the extra method, you just have to recognize that while Enumerable.Reverse() can be used as an extension method, you aren't required to use it that way:
List<string> mystring = new List<string>() { "one", "two", "three" };
mystring = Enumerable.Reverse(mystring).Take(2).Reverse().ToList();
.NET Core 2.0+ provides the LINQ method TakeLast():
https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.takelast
example:
Enumerable
.Range(1, 10)
.TakeLast(3) // <--- takes last 3 items
.ToList()
.ForEach(i => System.Console.WriteLine(i));
// outputs:
// 8
// 9
// 10
Note: I missed your question title which said Using Linq, so my answer does not in fact use Linq.
If you want to avoid caching a non-lazy copy of the entire collection, you could write a simple method that does it using a linked list.
The following method will add each value it finds in the original collection into a linked list, and trim the linked list down to the number of items required. Since it keeps the linked list trimmed to this number of items the entire time through iterating through the collection, it will only keep a copy of at most N items from the original collection.
It does not require you to know the number of items in the original collection, nor iterate over it more than once.
Usage:
IEnumerable<int> sequence = Enumerable.Range(1, 10000);
IEnumerable<int> last10 = sequence.TakeLast(10);
...
Extension method:
public static class Extensions
{
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> collection,
int n)
{
if (collection == null)
throw new ArgumentNullException(nameof(collection));
if (n < 0)
throw new ArgumentOutOfRangeException(nameof(n), $"{nameof(n)} must be 0 or greater");
LinkedList<T> temp = new LinkedList<T>();
foreach (var value in collection)
{
temp.AddLast(value);
if (temp.Count > n)
temp.RemoveFirst();
}
return temp;
}
}
Here's a method that works on any enumerable but uses only O(N) temporary storage:
public static class TakeLastExtension
{
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int takeCount)
{
if (source == null) { throw new ArgumentNullException("source"); }
if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
if (takeCount == 0) { yield break; }
T[] result = new T[takeCount];
int i = 0;
int sourceCount = 0;
foreach (T element in source)
{
result[i] = element;
i = (i + 1) % takeCount;
sourceCount++;
}
if (sourceCount < takeCount)
{
takeCount = sourceCount;
i = 0;
}
for (int j = 0; j < takeCount; ++j)
{
yield return result[(i + j) % takeCount];
}
}
}
Usage:
List<int> l = new List<int> {4, 6, 3, 6, 2, 5, 7};
List<int> lastElements = l.TakeLast(3).ToList();
It works by using a ring buffer of size N to store the elements as it sees them, overwriting old elements with new ones. When the end of the enumerable is reached the ring buffer contains the last N elements.
I am surprised that no one has mentioned it, but SkipWhile does have a method that uses the element's index.
public static IEnumerable<T> TakeLastN<T>(this IEnumerable<T> source, int n)
{
if (source == null)
throw new ArgumentNullException(nameof(source), "Source cannot be null");
int goldenIndex = source.Count() - n;
return source.SkipWhile((val, index) => index < goldenIndex);
}
//Or if you like them one-liners (in the spirit of the current accepted answer);
//However, this is most likely impractical due to the repeated calculations
collection.SkipWhile((val, index) => index < collection.Count() - N)
The only perceivable benefit that this solution presents over others is that you can have the option to add in a predicate to make a more powerful and efficient LINQ query, instead of having two separate operations that traverse the IEnumerable twice.
public static IEnumerable<T> FilterLastN<T>(this IEnumerable<T> source, int n, Predicate<T> pred)
{
int goldenIndex = source.Count() - n;
return source.SkipWhile((val, index) => index < goldenIndex && pred(val));
}
Use EnumerableEx.TakeLast in Rx's System.Interactive assembly. It's an O(N) implementation like @Mark's, but it uses a queue rather than a ring-buffer construct (and dequeues items when it reaches buffer capacity).
(NB: This is the IEnumerable version - not the IObservable version, though the implementation of the two is pretty much identical)
If you are dealing with a collection with a key (e.g. entries from a database) a quick (i.e. faster than the selected answer) solution would be
collection.OrderByDescending(c => c.Key).Take(3).OrderBy(c => c.Key);
If you don't mind dipping into Rx as part of the monad, you can use TakeLast:
IEnumerable<int> source = Enumerable.Range(1, 10000);
IEnumerable<int> lastThree = source.ToObservable().TakeLast(3).ToEnumerable();
I tried to combine efficiency and simplicity and ended up with this:
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int count)
{
if (source == null) { throw new ArgumentNullException("source"); }
Queue<T> lastElements = new Queue<T>();
foreach (T element in source)
{
lastElements.Enqueue(element);
if (lastElements.Count > count)
{
lastElements.Dequeue();
}
}
return lastElements;
}
About performance: In C#, Queue<T> is implemented using a circular buffer, so no object is instantiated on each loop iteration (only when the queue grows). I did not set the queue capacity (using the dedicated constructor) because someone might call this extension with count = int.MaxValue. For extra performance you might check whether source implements IList<T> and, if so, directly extract the last values using array indexes.
If using a third-party library is an option, MoreLinq defines TakeLast() which does exactly this.
It is a little inefficient to take the last N of a collection using LINQ as all the above solutions require iterating across the collection. TakeLast(int n) in System.Interactive also has this problem.
If you have a list, a more efficient approach is to slice it using the following method:
/// Select from start to end exclusive of end using the same semantics
/// as python slice.
/// <param name="list"> the list to slice</param>
/// <param name="start">The starting index</param>
/// <param name="end">The ending index. The result does not include this index</param>
public static List<T> Slice<T>
(this IReadOnlyList<T> list, int start, int? end = null)
{
if (end == null)
{
end = list.Count();
}
if (start < 0)
{
start = list.Count + start;
}
if (start >= 0 && end.Value > 0 && end.Value > start)
{
return list.GetRange(start, end.Value - start);
}
if (end < 0)
{
return list.GetRange(start, (list.Count() + end.Value) - start);
}
if (end == start)
{
return new List<T>();
}
throw new IndexOutOfRangeException(
"count = " + list.Count() +
" start = " + start +
" end = " + end);
}
with
public static List<T> GetRange<T>( this IReadOnlyList<T> list, int index, int count )
{
List<T> r = new List<T>(count);
for ( int i = 0; i < count; i++ )
{
int j=i + index;
if ( j >= list.Count )
{
break;
}
r.Add(list[j]);
}
return r;
}
and some test cases
[Fact]
public void GetRange()
{
IReadOnlyList<int> l = new List<int>() { 0, 10, 20, 30, 40, 50, 60 };
l
.GetRange(2, 3)
.ShouldAllBeEquivalentTo(new[] { 20, 30, 40 });
l
.GetRange(5, 10)
.ShouldAllBeEquivalentTo(new[] { 50, 60 });
}
[Fact]
void SliceMethodShouldWork()
{
var list = new List<int>() { 1, 3, 5, 7, 9, 11 };
list.Slice(1, 4).ShouldBeEquivalentTo(new[] { 3, 5, 7 });
list.Slice(1, -2).ShouldBeEquivalentTo(new[] { 3, 5, 7 });
list.Slice(1, null).ShouldBeEquivalentTo(new[] { 3, 5, 7, 9, 11 });
list.Slice(-2)
.Should()
.BeEquivalentTo(new[] {9, 11});
list.Slice(-2,-1 )
.Should()
.BeEquivalentTo(new[] {9});
}
I know it's too late to answer this question. But if you are working with a collection of type IList<> and you don't care about the order of the returned collection, then this method is faster. I've taken Mark Byers' answer and made a few changes. So now the TakeLast method is:
public static IEnumerable<T> TakeLast<T>(IList<T> source, int takeCount)
{
if (source == null) { throw new ArgumentNullException("source"); }
if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
if (takeCount == 0) { yield break; }
if (source.Count > takeCount)
{
for (int z = source.Count - 1; takeCount > 0; z--)
{
takeCount--;
yield return source[z];
}
}
else
{
for(int i = 0; i < source.Count; i++)
{
yield return source[i];
}
}
}
For the test I used Mark Byers' method and kbrimington's answer. This is the test:
IList<int> test = new List<int>();
for(int i = 0; i<1000000; i++)
{
test.Add(i);
}
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
IList<int> result = TakeLast(test, 10).ToList();
stopwatch.Stop();
Stopwatch stopwatch1 = new Stopwatch();
stopwatch1.Start();
IList<int> result1 = TakeLast2(test, 10).ToList();
stopwatch1.Stop();
Stopwatch stopwatch2 = new Stopwatch();
stopwatch2.Start();
IList<int> result2 = test.Skip(Math.Max(0, test.Count - 10)).Take(10).ToList();
stopwatch2.Stop();
And here are results for taking 10 elements:
and for taking 1000001 elements results are:
Here's my solution:
public static class EnumerationExtensions
{
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> input, int count)
{
if (count <= 0)
yield break;
var inputList = input as IList<T>;
if (inputList != null)
{
int last = inputList.Count;
int first = last - count;
if (first < 0)
first = 0;
for (int i = first; i < last; i++)
yield return inputList[i];
}
else
{
// Use a ring buffer. We have to enumerate the input, and we don't know in advance how many elements it will contain.
T[] buffer = new T[count];
int index = 0;
count = 0;
foreach (T item in input)
{
buffer[index] = item;
index = (index + 1) % buffer.Length;
count++;
}
// The index variable now points at the next buffer entry that would be filled. If the buffer isn't completely
// full, then there are 'count' elements preceding index. If the buffer *is* full, then index is pointing at
// the oldest entry, which is the first one to return.
//
// If the buffer isn't full, which means that the enumeration has fewer than 'count' elements, we'll fix up
// 'index' to point at the first entry to return. That's easy to do; if the buffer isn't full, then the oldest
// entry is the first one. :-)
//
// We'll also set 'count' to the number of elements to be returned. It only needs adjustment if we've wrapped
// past the end of the buffer and have enumerated more than the original count value.
if (count < buffer.Length)
index = 0;
else
count = buffer.Length;
// Return the values in the correct order.
while (count > 0)
{
yield return buffer[index];
index = (index + 1) % buffer.Length;
count--;
}
}
}
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> input, int count)
{
if (count <= 0)
return input;
else
return input.SkipLastIter(count);
}
private static IEnumerable<T> SkipLastIter<T>(this IEnumerable<T> input, int count)
{
var inputList = input as IList<T>;
if (inputList != null)
{
int first = 0;
int last = inputList.Count - count;
if (last < 0)
last = 0;
for (int i = first; i < last; i++)
yield return inputList[i];
}
else
{
// Aim to leave 'count' items in the queue. If the input has fewer than 'count'
// items, then the queue won't ever fill and we return nothing.
Queue<T> elements = new Queue<T>();
foreach (T item in input)
{
elements.Enqueue(item);
if (elements.Count > count)
yield return elements.Dequeue();
}
}
}
}
The code is a bit chunky, but as a drop-in reusable component, it should perform as well as it can in most scenarios, and it'll keep the code that's using it nice and concise. :-)
My TakeLast for non-IList`1 is based on the same ring buffer algorithm as the one in the answers by @Mark Byers and @MackieChan further up. It's interesting how similar they are; I wrote mine completely independently. I guess there's really just one way to do a ring buffer properly. :-)
Looking at @kbrimington's answer, an additional check for IQueryable<T> could be added to this, falling back to the approach that works well with Entity Framework, assuming that what I have at this point does not.
Honestly I'm not super proud of the answer, but for small collections you could use the following:
var lastN = collection.Reverse().Take(n).Reverse();
A bit hacky but it does the job ;)
My solution is based on ranges, introduced in C# version 8.
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int N)
{
return source.ToArray()[(source.Count()-N)..];
}
I ran a benchmark of the top-rated solutions (and my humbly proposed one):
public static class TakeLastExtension
{
public static IEnumerable<T> TakeLastMarkByers<T>(this IEnumerable<T> source, int takeCount)
{
if (source == null) { throw new ArgumentNullException("source"); }
if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
if (takeCount == 0) { yield break; }
T[] result = new T[takeCount];
int i = 0;
int sourceCount = 0;
foreach (T element in source)
{
result[i] = element;
i = (i + 1) % takeCount;
sourceCount++;
}
if (sourceCount < takeCount)
{
takeCount = sourceCount;
i = 0;
}
for (int j = 0; j < takeCount; ++j)
{
yield return result[(i + j) % takeCount];
}
}
public static IEnumerable<T> TakeLastKbrimington<T>(this IEnumerable<T> source, int N)
{
return source.Skip(Math.Max(0, source.Count() - N));
}
public static IEnumerable<T> TakeLastJamesCurran<T>(this IEnumerable<T> source, int N)
{
return source.Reverse().Take(N).Reverse();
}
public static IEnumerable<T> TakeLastAlex<T>(this IEnumerable<T> source, int N)
{
return source.ToArray()[(source.Count()-N)..];
}
}
Test
[MemoryDiagnoser]
public class TakeLastBenchmark
{
[Params(10000)]
public int N;
private readonly List<string> l = new();
[GlobalSetup]
public void Setup()
{
for (var i = 0; i < this.N; i++)
{
this.l.Add($"{i}");
}
}
[Benchmark]
public void Benchmark1_MarkByers()
{
var lastElements = l.TakeLastMarkByers(3).ToList();
}
[Benchmark]
public void Benchmark2_Kbrimington()
{
var lastElements = l.TakeLastKbrimington(3).ToList();
}
[Benchmark]
public void Benchmark3_JamesCurran()
{
var lastElements = l.TakeLastJamesCurran(3).ToList();
}
[Benchmark]
public void Benchmark4_Alex()
{
var lastElements = l.TakeLastAlex(3).ToList();
}
}
Program.cs:
var summary = BenchmarkRunner.Run(typeof(TakeLastBenchmark).Assembly);
Command: dotnet run --project .\TestsConsole2.csproj -c Release --logBuildOutput
The results were as follows:
// * Summary *
BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19044.1889/21H2/November2021Update)
AMD Ryzen 5 5600X, 1 CPU, 12 logical and 6 physical cores
.NET SDK=6.0.401
[Host] : .NET 6.0.9 (6.0.922.41905), X64 RyuJIT AVX2
DefaultJob : .NET 6.0.9 (6.0.922.41905), X64 RyuJIT AVX2
| Method | N | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Benchmark1_MarkByers | 10000 | 89,390.53 ns | 1,735.464 ns | 1,704.457 ns | - | - | 248 B |
| Benchmark2_Kbrimington | 10000 | 46.15 ns | 0.410 ns | 0.363 ns | 0.0076 | - | 128 B |
| Benchmark3_JamesCurran | 10000 | 2,703.15 ns | 46.298 ns | 67.862 ns | 4.7836 | 0.0038 | 80264 B |
| Benchmark4_Alex | 10000 | 2,513.48 ns | 48.661 ns | 45.517 ns | 4.7607 | - | 80152 B |
It turns out that the solution proposed by @Kbrimington is the most efficient, in terms of both memory allocation and raw performance.
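One caveat about the range-based variant benchmarked above: it enumerates the source twice (ToArray and Count), and the index `source.Count()-N` goes negative when N is larger than the number of elements, which throws an ArgumentOutOfRangeException. A minimal sketch of a clamped version follows; the names RangeTakeLast and TakeLastSafe are made up for illustration, and this version was not part of the benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeTakeLast
{
    // Hypothetical variant of the range-based approach:
    // materialize the source once and clamp the start index to 0.
    public static IEnumerable<T> TakeLastSafe<T>(this IEnumerable<T> source, int n)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        T[] items = source.ToArray();

        // A non-positive n yields an empty result; an n larger than
        // the array length yields the whole array.
        int start = Math.Max(0, items.Length - Math.Max(0, n));
        return items[start..];
    }
}
```

It still copies the whole source into an array, so the allocation profile should stay close to the Benchmark4_Alex row rather than the Skip-based one.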
Below is a real example of how to take the last 3 elements from a collection (array):
// split address by spaces into array
string[] adrParts = adr.Split(new string[] { " " },StringSplitOptions.RemoveEmptyEntries);
// take only 3 last items in array
adrParts = adrParts.SkipWhile((value, index) => { return adrParts.Length - index > 3; }).ToArray();
Use this method to get the whole range without an error:
public List<T> GetTsRate( List<T> AllT,int Index,int Count)
{
List<T> Ts = null;
try
{
Ts = AllT.ToList().GetRange(Index, Count);
}
catch (Exception ex)
{
Ts = AllT.Skip(Index).ToList();
}
return Ts ;
}
A slightly different implementation that uses a circular buffer. My benchmarks show that it is roughly twice as fast as the Queue-based approach (the implementation of TakeLast in System.Linq), but not without a cost: the buffer is sized by the requested number of elements, so even a small collection can cause a huge memory allocation if count is large.
public IEnumerable<T> TakeLast<T>(IEnumerable<T> source, int count)
{
int i = 0;
if (count < 1)
yield break;
if (source is IList<T> listSource)
{
if (listSource.Count < 1)
yield break;
for (i = listSource.Count < count ? 0 : listSource.Count - count; i < listSource.Count; i++)
yield return listSource[i];
}
else
{
bool move = true;
bool filled = false;
T[] result = new T[count];
using (var enumerator = source.GetEnumerator())
while (move)
{
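// Test i < count first, so that MoveNext() is not called (and an element lost) once the buffer is full.
for (i = 0; i < count && (move = enumerator.MoveNext()); i++)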
result[i] = enumerator.Current;
filled |= move;
}
if (filled)
for (int j = i; j < count; j++)
yield return result[j];
for (int j = 0; j < i; j++)
yield return result[j];
}
}
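For comparison, a Queue-based version might look roughly like the sketch below. This is not the actual System.Linq source; it only illustrates the trade-off described above, and the names QueueTakeLast and TakeLastWithQueue are made up. The queue never holds more than min(count, source length) items, so a large count on a small collection stays cheap in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueueTakeLast
{
    // Hypothetical Queue-based counterpart to the circular-buffer method.
    public static IEnumerable<T> TakeLastWithQueue<T>(IEnumerable<T> source, int count)
    {
        if (count < 1)
            yield break;

        var queue = new Queue<T>();
        foreach (T item in source)
        {
            // Drop the oldest item once the queue already holds 'count' items.
            if (queue.Count == count)
                queue.Dequeue();
            queue.Enqueue(item);
        }

        // The queue now holds the last items in their original order.
        foreach (T item in queue)
            yield return item;
    }
}
```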
// Detailed code for the problem.
// Suppose we have a List<T> named 'collection' that contains at least N elements.
var lastIndexOfCollection = collection.Count - 1;
// Index of the first of the last N elements (the +1 avoids dropping the final element).
var nthIndexFromLast = lastIndexOfCollection - N + 1;
var desiredCollection = collection.GetRange(nthIndexFromLast, N);
---------------------------------------------------------------------
// Or use this one-liner:
var desiredCollection = collection.GetRange(collection.Count - N, N);
I'm using .NET 3.5 and would like to be able to obtain every *n*th item from a List. I'm not bothered as to whether it's achieved using a lambda expression or LINQ.
Edit
Looks like this question provoked quite a lot of debate (which is a good thing, right?). The main thing I've learnt is that when you think you know every way to do something (even as simple as this), think again!
return list.Where((x, i) => i % nStep == 0);
I know it's "old school," but why not just use a for loop with stepping = n?
Sounds like
IEnumerator<T> GetNth<T>(List<T> list, int n) {
for (int i=0; i<list.Count; i+=n)
yield return list[i];
}
would do the trick. I do not see the need to use LINQ or lambda expressions.
EDIT:
Make it
public static class MyListExtensions {
public static IEnumerable<T> GetNth<T>(this List<T> list, int n) {
for (int i=0; i<list.Count; i+=n)
yield return list[i];
}
}
and you write in a LINQish way
from element in MyList.GetNth(10) select element;
2nd Edit:
To make it even more LINQish
from i in Enumerable.Range(0, (myList.Count + n - 1) / n) select myList[n * i];
You can use the Where overload which passes the index along with the element
var everyFourth = list.Where((x,i) => i % 4 == 0);
For Loop
for(int i = 0; i < list.Count; i += n)
//Nth Item..
I think that if you provide a LINQ extension, it should operate on the least specific interface, that is, IEnumerable<T>. Of course, if you are after speed, especially for large n, you can also provide an overload that uses indexed access. That avoids iterating over large amounts of unneeded data and is much faster than the Where clause. With both overloads available, the compiler selects the most suitable variant.
public static class LinqExtensions
{
public static IEnumerable<T> GetNth<T>(this IEnumerable<T> list, int n)
{
if (n < 0)
throw new ArgumentOutOfRangeException("n");
if (n > 0)
{
int c = 0;
foreach (var e in list)
{
if (c % n == 0)
yield return e;
c++;
}
}
}
public static IEnumerable<T> GetNth<T>(this IList<T> list, int n)
{
if (n < 0)
throw new ArgumentOutOfRangeException("n");
if (n > 0)
for (int c = 0; c < list.Count; c += n)
yield return list[c];
}
}
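One thing to keep in mind is that the compiler picks the overload from the static (compile-time) type of the argument, not from its runtime type. The sketch below makes that visible with two stand-in methods that only report which overload was chosen; OverloadDemo and Which are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class OverloadDemo
{
    // Stand-ins for the two GetNth overloads: each reports which one was bound.
    public static string Which<T>(this IEnumerable<T> source)
    {
        return "IEnumerable<T>";
    }

    public static string Which<T>(this IList<T> source)
    {
        return "IList<T>";
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        IEnumerable<int> sameListAsSequence = list;

        // Static type List<int>: the more specific IList<T> overload wins.
        Console.WriteLine(list.Which());

        // Static type IEnumerable<int>: the indexed overload is not considered,
        // even though the object is still a List<int> at runtime.
        Console.WriteLine(sameListAsSequence.Which());
    }
}
```

So a list that has been passed through a variable or parameter typed as IEnumerable<T> falls back to the slower overload unless the IEnumerable<T> version also checks for IList<T> at runtime.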
I'm not sure if it's possible to do with a LINQ expression, but I know that you can use the Where extension method to do it. For example to get every fifth item:
List<T> list = originalList.Where((t,i) => (i % 5) == 0).ToList();
This will get the first item and every fifth from there. If you want to start at the fifth item instead of the first, you compare with 4 instead of comparing with 0.
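As a small sketch of the difference between the two remainders (the class and method names here are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EveryFifthDemo
{
    // Indexes 0, 5, 10, ...: the first item and every fifth one after it.
    public static List<int> FromFirst(IEnumerable<int> items)
    {
        return items.Where((t, i) => (i % 5) == 0).ToList();
    }

    // Indexes 4, 9, 14, ...: starts at the fifth item instead.
    public static List<int> FromFifth(IEnumerable<int> items)
    {
        return items.Where((t, i) => (i % 5) == 4).ToList();
    }
}
```

For the numbers 1 through 12, FromFirst gives 1, 6, 11 and FromFifth gives 5, 10.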
IMHO none of the answers is right. All the solutions start from index 0, but I want the real nth elements:
public static IEnumerable<T> GetNth<T>(this IList<T> list, int n)
{
for (int i = n - 1; i < list.Count; i += n)
yield return list[i];
}
@belucha I like this, because the client code is very readable and the compiler chooses the most efficient implementation. I would build on it by relaxing the requirement to IReadOnlyList<T> and by avoiding the division for high-performance LINQ:
public static IEnumerable<T> GetNth<T>(this IEnumerable<T> list, int n) {
if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n), n, null);
int i = n;
foreach (var e in list) {
if (++i < n) { //save Division
continue;
}
i = 0;
yield return e;
}
}
public static IEnumerable<T> GetNth<T>(this IReadOnlyList<T> list, int n
, int offset = 0) { //use IReadOnlyList<T>
if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n), n, null);
for (var i = offset; i < list.Count; i += n) {
yield return list[i];
}
}
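A quick sketch of how the offset parameter could be used to switch between "start at index 0" and "the real nth element". The indexed overload is repeated so the sketch compiles on its own, and the class name OffsetDemo is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OffsetDemo
{
    // Same shape as the IReadOnlyList<T> overload above.
    public static IEnumerable<T> GetNth<T>(this IReadOnlyList<T> list, int n, int offset = 0)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n), n, null);
        for (var i = offset; i < list.Count; i += n)
        {
            yield return list[i];
        }
    }

    public static void Main()
    {
        List<int> numbers = Enumerable.Range(1, 10).ToList();

        // Default offset 0: items at indexes 0, 3, 6, 9.
        Console.WriteLine(string.Join(",", numbers.GetNth(3)));

        // Offset n - 1: the 3rd, 6th and 9th items.
        Console.WriteLine(string.Join(",", numbers.GetNth(3, offset: 2)));
    }
}
```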
private static readonly string[] sequence = "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15".Split(',');
static void Main(string[] args)
{
var every4thElement = sequence
.Where((p, index) => index % 4 == 0);
foreach (string p in every4thElement)
{
Console.WriteLine("{0}", p);
}
Console.ReadKey();
}
output