Why might System.Random.Sample throw IndexOutOfRangeException?

I'm calling Next() on a .NET System.Random instance inside Unity (2017.1.1f1). It's throwing an IndexOutOfRangeException from inside the protected function Sample(), which accepts no arguments and returns a double between 0 and 1. What might cause this behavior?
Here is the exception in detail:
System.IndexOutOfRangeException: Array index is out of range.
at System.Random.Sample () [0x0003e] in
/Users/builduser/buildslave/mono/build/mcs/class/corlib/System/Random.cs:91
at System.Random.Next (Int32 maxValue) [0x00017] in
/Users/builduser/buildslave/mono/build/mcs/class/corlib/System/Random.cs:112
at Quicksilver.SysIRand.RandInt (Int32 max_exclusive) [0x00008] in
F:\SVNHome\gemrush\GemRush\Assets\Source\Shared\Utility\IRand.cs:38
at Quicksilver.IEnumerableExt.SelectRandom[Skill] (IEnumerable`1
collection, IRand rand, Int32 count) [0x00070] in
F:\SVNHome\gemrush\GemRush\Assets\Source\Shared\Utility\IEnumerableExt.cs:61
(there's another 13 layers of callstack above this)
This is a multi-threaded environment, but each thread has its own dedicated instance of System.Random. As you can see from the code below the parameter passed to Next() must have been 1 or higher.
This error was thrown about 1 hour into a complex automated test script, so running multiple times to test theories is expensive, and any modifications that change how the RNG gets invoked will prevent a reproduction. (If the error somehow involves unintended interaction between threads, then it probably won't be reproducible at all.)
Since it made it an hour into the test script, the overwhelming majority of invocations of this method must NOT be throwing an error.
The function making direct use of the random numbers is here:
// Chooses count items at random from the enumeration and returns them in an array
// The order of selected items within the array is also random
// If the collection is smaller than count, the entire collection is returned (in random order)
public static T[] SelectRandom<T>(this IEnumerable<T> collection, IRand rand, int count = 1)
{
if (count <= 0) return new T[0]; // Optimization for trivial case
T[] keep = new T[count];
int found = 0;
foreach (T item in collection)
{
if (found < count)
{
// Take the first #count items, in case that's all there are
// Move a random item of those found so far (including the new one)
// to the end of the array, and insert the new one in its place
int r = rand.RandInt(found + 1);
keep[found++] = keep[r];
keep[r] = item;
}
else
{
// Random chance to replace one of our previously-selected elements
++found;
if (rand.RandInt(found) < count) // probability desired/found
{
// Replace a random previously-selected element
int r = rand.RandInt(count);
keep[r] = item;
}
}
}
if (found < count)
{
// The collection was too small to get everything requested;
// Make a new, smaller array containing everything in the collection
T[] all = new T[found];
Array.Copy(keep, all, found);
return all;
}
return keep;
}
The error is being thrown from this line:
if (rand.RandInt(found) < count) // probability desired/found
IRand is the interface for a very thin wrapper around System.Random; IRand.RandInt(max_exclusive) simply returns Random.Next(max_exclusive).
EDIT
Here's how the Random instances are created and distributed:
private void Start()
{
SysIRand[] rngs = new SysIRand[parallelTesters];
if (parallelTesters > 0) rngs[0] = new SysIRand(new System.Random(548913));
if (parallelTesters > 1) rngs[1] = new SysIRand(new System.Random(138498));
if (parallelTesters > 2) rngs[2] = new SysIRand(new System.Random(976336));
if (parallelTesters > 3) rngs[3] = new SysIRand(new System.Random(793461));
if (parallelTesters > 4) rngs[4] = new SysIRand(new System.Random(648791));
if (parallelTesters > 5) rngs[5] = new SysIRand(new System.Random(348916));
if (parallelTesters > 6) rngs[6] = new SysIRand(new System.Random(8467168));
if (parallelTesters > 7) rngs[7] = new SysIRand(new System.Random(6183569));
for (int i = 8; i < parallelTesters; ++i)
{
rngs[i] = new SysIRand(new System.Random(7 * i * i + 8961 * i + 129786));
}
for (int t = 0; t < parallelTesters; ++t)
{
SysIRand rand = rngs[t];
IBot bot = BotFactory.DrawBot(rand);
BotTester tester = new BotTester(rand, bot);
tester.testerID = t + 1;
tester.OnMessage += (str) => UponMessage(tester.testerID + " ~ " + str);
tester.OnError += (str) => UponError(tester.testerID + " ~ " + str);
tester.OnGameAborted += UponGameAborted;
tester.OnDebugMoment += UponDebugMoment;
testers.Add(tester);
}
}
(in this run, parallelTesters had a value of 3)
Each BotTester exclusively uses the Random instance passed to its constructor. Each thread is private to a single BotTester, started from BotTester.RunGame():
public bool RunGame(GameMode mode, int maxGames = 1, long maxMilliSeconds = 100000000, int maxRetries = 5000)
{
if (threadRunning) return false;
thread = new Thread(() => ThreadedRunGame(mode, maxGames, maxMilliSeconds, maxRetries));
thread.Start();
return true;
}

The only explanation that makes sense is that the Random instance is not actually being accessed in a thread-safe way. In your own words, each thread has its own Random instance, but it looks like the same instance is being entered by a second thread while a call is still in the middle of calculating. Here is the implementation in Unity. Sample() simply calls InternalSample():
private int InternalSample()
{
int inext = this.inext;
int inextp = this.inextp;
int index1;
if ((index1 = inext + 1) >= 56)
index1 = 1;
int index2;
if ((index2 = inextp + 1) >= 56)
index2 = 1;
int num = this.SeedArray[index1] - this.SeedArray[index2];
if (num < 0)
num += int.MaxValue;
this.SeedArray[index1] = num;
this.inext = index1;
this.inextp = index2;
return num;
}
As you can see, the places where you can get an IndexOutOfRangeException are limited, i.e. the accesses to this.SeedArray. Here is the definition of SeedArray:
public class Random
{
private int[] SeedArray = new int[56];
....
}
The array is already allocated with 56 elements, and in the implementation of InternalSample you can see that on each call index1 and index2 are limited to at most 55, unless InternalSample is being executed by more than one thread at the same time.
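Since the failure is expensive to reproduce, one way to test this theory is to make overlapping calls fail loudly. The wrapper below is only a sketch: the class name ConcurrencyCheckedRandom and its members are hypothetical, and it assumes that every call to the shared System.Random can be routed through it. It forwards to the inner Random without changing the generated sequence, and throws if a second thread enters while a call is still in progress.

```csharp
using System;
using System.Threading;

// Hypothetical diagnostic wrapper (not part of the original code base).
public sealed class ConcurrencyCheckedRandom
{
    private readonly Random _inner;
    private int _busy; // 0 = idle, 1 = a call is currently in progress

    public ConcurrencyCheckedRandom(int seed)
    {
        _inner = new Random(seed);
    }

    public int Next(int maxExclusive)
    {
        // Atomically mark the instance as busy; a previous value of 1
        // means another thread is inside Next() right now.
        if (Interlocked.Exchange(ref _busy, 1) != 0)
        {
            throw new InvalidOperationException(
                "Random instance entered by two threads at once");
        }
        try
        {
            return _inner.Next(maxExclusive);
        }
        finally
        {
            // Mark the instance as idle again.
            Interlocked.Exchange(ref _busy, 0);
        }
    }
}
```

Sequential use from different threads (for example, drawing a bot on the main thread and then handing the instance to a worker) does not trip the check; only overlapping calls do.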

Related

c# binary search strings from array (including json)

So I have a JSON implementation that reads characters; the names go into arrays, and then I use Array.BinarySearch to get the position of the element.
I'm researching how to implement the binary search on my own. I'm having trouble seeing logically what to do with the string name that is entered for the search.
Instead of using Array.BinarySearch, I need a separate method with the algorithm.
Any advice / strategy? :)
Example:
/* JSON array implemented, menu printed etc... before this point, */
static void FindCharacters(Characters[] characters)
{
Characters result = new Characters();
string userInput = Console.ReadLine();
string[] name = new string[10000];
Console.Write("Search Name : ");
string searchKeyword = Console.ReadLine();
if (userInput.ToLower() == "name")
{
name = characters.Select(m => m.Name).ToArray();
Array.Sort(name);
Sorting.Sort(characters, searchKeyword);
var tmp = BinarySearch(name, searchKeyword);
if (tmp < 0)
{
Console.WriteLine("No data found!");
return;
}
else
{
result = characters[tmp];
CharacterPrint(result);
}
//result = characters[tmp]; //Convert.ToInt32(tmp)
//CharacterPrint(result);
}
public static int BinarySearch(int[] name, int item)
{
int min = 0;
int N = name.Length;
int max = N - 1;
do
{
int mid = (min + max) / 2;
if (item > name[mid])
min = mid + 1;
else
max = mid - 1;
if (name[mid] == item)
return mid;
//if (min > max)
// break;
} while (min <= max);
return -1;
}

Your int solution will work perfectly fine for strings. In fact, by just tweaking a couple of lines, it would work for any data type that implements IComparable<T>:
public static int BinarySearch<T>(T[] name, T item)
where T : IComparable<T>
{
int min = 0;
int N = name.Length;
int max = N - 1;
do
{
int mid = (min + max) / 2;
int t = item.CompareTo(name[mid]); // Temp value holder
if (t > 0) // item > name[mid]
min = mid + 1;
else if (t < 0) // item < name[mid]
max = mid - 1;
else // item == name[mid]
return mid;
//if (min > max)
// break;
} while (min <= max);
return -1;
}
You can call it like this:
string[] names = // ...
string name = //...
// Explicit calling
int idx = BinarySearch<string>(names, name);
// Implicit calling
// The following option works because the C# compiler can tell you are
// using two values of type string and inserts the correct generic
// type for you
int idx2 = BinarySearch(names, name);
You can see the changes made above reflect how to replace the default comparison operators (i.e. "<", ">", "==") with their CompareTo equivalents. The extra variable t is there to avoid calling CompareTo on the objects twice.
The way that CompareTo works is that it takes the calling object and compares it with the passed object. If the calling object would appear before the passed object in a sorted list, the method returns a negative value. If it would appear after, it returns a positive value. If they are the same, it returns 0. This matches the code above, where t > 0 means item > name[mid].
See the following example for an illustration of this:
// The following values are compared in the default string sort order
"a".CompareTo("b"); // "a" would appear before "b", so this returns a negative value
"c".CompareTo("b"); // "c" would appear after "b", so this returns a positive value
"b".CompareTo("b"); // "b" and "b" are the same value, so this returns 0
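As a rough sketch of the same idea with the ordering made explicit, the version below takes an IComparer<T> (for example StringComparer.Ordinal), so the search can use the same ordering that was used to sort the array. The class name SearchSketch and the comparer parameter are illustrative assumptions, not part of the code above. It also uses a while loop rather than do/while, so an empty array returns -1 instead of reading element 0.

```csharp
using System.Collections.Generic;

// Hypothetical helper class; names are for illustration only.
public static class SearchSketch
{
    // Returns the index of item in the sorted array, or -1 if it is absent.
    // The array must already be sorted with the same comparer.
    public static int BinarySearch<T>(T[] sorted, T item, IComparer<T> comparer)
    {
        int min = 0;
        int max = sorted.Length - 1;
        while (min <= max)
        {
            int mid = min + (max - min) / 2; // avoids overflow of (min + max)
            int c = comparer.Compare(item, sorted[mid]);
            if (c > 0)          // item sorts after sorted[mid]
                min = mid + 1;
            else if (c < 0)     // item sorts before sorted[mid]
                max = mid - 1;
            else                // item equals sorted[mid]
                return mid;
        }
        return -1;
    }
}
```

A call might look like SearchSketch.BinarySearch(names, searchKeyword, StringComparer.Ordinal), after sorting names with Array.Sort(names, StringComparer.Ordinal).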

Garbage Collector generations not incrementing

I have a lot of questions about the garbage collection procedure, mainly about when it runs, when objects are promoted to an older generation, and so on.
static void Main(string[] args)
{
int i = 0, j = 0;
int a = 0;
Holder prev = new Holder(null);
while(GC.CollectionCount(1) == 0)
{
int aux = GC.CollectionCount(0);
if(aux > a){
a = aux;
++j;
Console.WriteLine((i+1));
}
++i;
Holder h = new Holder(prev);
Console.WriteLine(GC.GetGeneration(prev));
prev = h;
}
}
I'm trying to get the number of objects in gen1.
Why is j = 1? The GC only runs once on gen0; to leave the while loop, shouldn't it run at least 2 times?
[EDIT]
After adding this after the while loop exits, I got very confused:
Console.WriteLine("#gc0 = "+GC.CollectionCount(0)); --> 2
Console.WriteLine("#gc1 = "+GC.CollectionCount(1)); --> 1
Console.WriteLine("#objs = "+ i);
Console.ReadLine();
How come GC.CollectionCount(0) is only 2? I've been reading Richter's CLR via C#, and he says this:
"the objects in generation 1 are examined only when generation 1 reaches its budget, which usually requires several garbage collections of generation 0."
[EDIT]
But at the same time, if the GC sees that all the objects survived, it grows the gen0 limit; maybe that is the reason for only 2 GCs on gen0?

How about we spice things up with a bit of randomness like this:
class Program
{
static void Main(string[] args)
{
Random random = new Random();
int i = 0, j = 0;
int a = 0;
Holder prev = new Holder(null);
Holder prev2 = new Holder(null);
while (GC.CollectionCount(1) == 0)
{
int aux = GC.CollectionCount(0);
if (aux > a)
{
a = aux;
++j;
Console.WriteLine((i + 1));
}
++i;
var flag = random.Next(2) == 1; // Next(2) returns 0 or 1; Next(1) would always return 0
Holder h = new Holder(flag ? prev : prev2);
Console.WriteLine("Prev: " + GC.GetGeneration(prev));
Console.WriteLine("Prev2: " + GC.GetGeneration(prev2));
if (flag)
{
prev = h;
}
else
{
prev2 = h;
}
}
}
}
internal class Holder
{
private Holder holder;
public Holder(Holder o)
{
holder = o;
}
}
The code sample you've provided was so simple that the CLR knew there was no point in moving your prev item to another generation.
Its usage was simple, and I think the runtime had optimized it to live in G0 only.
Adding more complex logic breaks the runtime's optimizations, and now one of prev or prev2 will go to G1, depending on which of the objects was used less frequently (I don't know the exact mechanics here).
Instead of prev and prev2, you can try adding a prevs array and picking a random index into it; then you can better see how the array elements advance through the generations.
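One further observation that may help when reading the counters in the question, offered as a sketch rather than a full explanation: on Microsoft's CLR, a generation 1 collection also collects generation 0, and it is counted in both GC.CollectionCount(0) and GC.CollectionCount(1). A result of 2 and 1 would then be consistent with one gen0-only collection followed by one gen0+gen1 collection. The class and method names below are hypothetical; other runtimes (such as Mono) may count differently.

```csharp
using System;

// Hypothetical helper; names are for illustration only.
public static class GcCountSketch
{
    // Forces one generation-1 collection and returns how much the
    // gen0 and gen1 collection counters grew, in that order.
    public static int[] CountsAfterGen1Collect()
    {
        int gen0Before = GC.CollectionCount(0);
        int gen1Before = GC.CollectionCount(1);

        GC.Collect(1); // collects generations 0 and 1

        return new[]
        {
            GC.CollectionCount(0) - gen0Before,
            GC.CollectionCount(1) - gen1Before
        };
    }
}
```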

Thread safety Parallel.For c#

I'm French, so apologies in advance for my English.
I get an error in Visual Studio (index out of range). I have this problem only with Parallel.For, not with a classic for loop.
I think one thread wants to access my array[i] while another thread wants to as well.
The code computes k-means clustering to build links between documents (using cosine similarity).
More information:
The IndexOutOfRangeException is on similarityMeasure[i] = ...
I have a computer with 2 processors (12 logical cores).
With the classic for, CPU usage is 9-14% and one iteration takes 9 min.
With Parallel.For, CPU usage is 70-90% and one iteration takes about 1 min 30 s.
Sometimes it runs longer before generating an error.
My function is:
private static int FindClosestClusterCenter(List<Centroid> clustercenter, DocumentVector obj)
{
float[] similarityMeasure = new float[clustercenter.Count()];
float[] copy = similarityMeasure;
object sync = new Object();
Parallel.For(0, clustercenter.Count(), (i) => // previously: for (int i = 0; i < clustercenter.Count(); i++)
{
similarityMeasure[i] = SimilarityMatrics.FindCosineSimilarity(clustercenter[i].GroupedDocument[0].VectorSpace, obj.VectorSpace);
});
int index = 0;
float maxValue = similarityMeasure[0];
for (int i = 0; i < similarityMeasure.Count(); i++)
{
if (similarityMeasure[i] > maxValue)
{
maxValue = similarityMeasure[i];
index = i;
}
}
return index;
}
My function is called here:
do
{
prevClusterCenter = centroidCollection;
DateTime starttime = DateTime.Now;
foreach (DocumentVector obj in documentCollection) // also tried: Parallel.ForEach(documentCollection, parallelOptions, obj =>
{
int ind = FindClosestClusterCenter(centroidCollection, obj);
resultSet[ind].GroupedDocument.Add(obj);
}
TimeSpan tempsecoule = DateTime.Now.Subtract(starttime);
Console.WriteLine(tempsecoule);
//Console.ReadKey();
InitializeClusterCentroid(out centroidCollection, centroidCollection.Count());
centroidCollection = CalculMeanPoints(resultSet);
stoppingCriteria = CheckStoppingCriteria(prevClusterCenter, centroidCollection);
if (!stoppingCriteria)
{
// initialize the result for the next iteration
InitializeClusterCentroid(out resultSet, centroidCollection.Count);
}
} while (stoppingCriteria == false);
_counter = counter;
return resultSet;
FindCosineSimilarity:
public static float FindCosineSimilarity(float[] vecA, float[] vecB)
{
var dotProduct = DotProduct(vecA, vecB);
var magnitudeOfA = Magnitude(vecA);
var magnitudeOfB = Magnitude(vecB);
float result = dotProduct / (float)Math.Pow((magnitudeOfA * magnitudeOfB),2);
//when 0 is divided by 0 it shows result NaN so return 0 in such case.
if (float.IsNaN(result))
return 0;
else
return (float)result;
}
CalculMeanPoints:
private static List<Centroid> CalculMeanPoints(List<Centroid> _clust)
{
for (int i = 0; i < _clust.Count(); i++)
{
if (_clust[i].GroupedDocument.Count() > 0)
{
for (int j = 0; j < _clust[i].GroupedDocument[0].VectorSpace.Count(); j++)
{
float total = 0;
foreach (DocumentVector vspace in _clust[i].GroupedDocument)
{
total += vspace.VectorSpace[j];
}
_clust[i].GroupedDocument[0].VectorSpace[j] = total / _clust[i].GroupedDocument.Count();
}
}
}
return _clust;
}

You may have some side effects in FindCosineSimilarity; make sure it does not modify any field or input parameter. An example of a side effect is resultSet[ind].GroupedDocument.Add(obj): if resultSet is not a reference to a locally instantiated array, then that call is a side effect.
Removing such side effects may fix it. FYI, you could also use AsParallel for this rather than Parallel.For:
similarityMeasure = clustercenter
.AsParallel().AsOrdered()
.Select(c=> SimilarityMatrics.FindCosineSimilarity(c.GroupedDocument[0].VectorSpace, obj.VectorSpace))
.ToArray();
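To illustrate the kind of side effect described above: List<T>.Add is not thread-safe, so if the outer loop were switched to Parallel.ForEach, concurrent calls such as resultSet[ind].GroupedDocument.Add(obj) could corrupt the list and may surface as an IndexOutOfRangeException. The following is a minimal sketch of guarding each shared list with a lock. The names ParallelAssignSketch and Assign are hypothetical, and the modulo expression is only a stand-in for FindClosestClusterCenter.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical example; not the asker's actual clustering code.
public static class ParallelAssignSketch
{
    // Distributes values into bucketCount lists from multiple threads.
    public static List<int>[] Assign(IEnumerable<int> values, int bucketCount)
    {
        var buckets = new List<int>[bucketCount];
        for (int b = 0; b < bucketCount; b++)
        {
            buckets[b] = new List<int>();
        }

        Parallel.ForEach(values, value =>
        {
            // Stand-in for the (read-only) closest-cluster computation.
            int index = Math.Abs(value % bucketCount);

            // Only one thread at a time may mutate a given list.
            lock (buckets[index])
            {
                buckets[index].Add(value);
            }
        });

        return buckets;
    }
}
```

Writing to distinct array slots, as similarityMeasure[i] does in the question, needs no lock; it is the shared, growing collections that need protection.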

You realize that if you synchronize the whole content of the Parallel.For, it's just the same as having a normal synchronous for loop, right? That means the code as it is doesn't do anything in parallel, so I don't think you'll have any problems with concurrency. My guess, from what I can tell, is that clustercenter[i].GroupedDocument is probably an empty array.

Combination Algorithm

Length = input Long(can be 2550, 2880, 2568, etc)
List<long> = {618, 350, 308, 300, 250, 232, 200, 128}
The program takes a long value. For that value we have to find the possible combinations from the above list which, when added, give the input value (the same value can be used twice). There can be a difference of +/- 30.
The largest numbers have to be used most.
Ex:Length = 868
For this combinations can be
Combination 1 = 618 + 250
Combination 2 = 308 + 232 + 200 +128
Correct Combination would be Combination 1
But there should also be different combinations.
public static void Main(string[] args)
{
//subtotal list
List<int> totals = new List<int>(new int[] { 618, 350, 308, 300, 250, 232, 200, 128 });
// get matches
List<int[]> results = KnapSack.MatchTotal(2682, totals);
// print results
foreach (var result in results)
{
Console.WriteLine(string.Join(",", result));
}
Console.WriteLine("Done.");
}
internal static List<int[]> MatchTotal(int theTotal, List<int> subTotals)
{
List<int[]> results = new List<int[]>();
while (subTotals.Contains(theTotal))
{
results.Add(new int[1] { theTotal });
subTotals.Remove(theTotal);
}
if (subTotals.Count == 0)
return results;
subTotals.Sort();
double mostNegativeNumber = subTotals[0];
if (mostNegativeNumber > 0)
mostNegativeNumber = 0;
if (mostNegativeNumber == 0)
subTotals.RemoveAll(d => d > theTotal);
for (int choose = 0; choose <= subTotals.Count; choose++)
{
IEnumerable<IEnumerable<int>> combos = Combination.Combinations(subTotals.AsEnumerable(), choose);
results.AddRange(from combo in combos where combo.Sum() == theTotal select combo.ToArray());
}
return results;
}
public static class Combination
{
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int choose)
{
return choose == 0 ?
new[] { new T[0] } :
elements.SelectMany((element, i) =>
elements.Skip(i + 1).Combinations(choose - 1).Select(combo => (new[] { element }).Concat(combo)));
}
}
I have used the above code. Can it be simplified further? Here again I only get unique values, but a value can be used any number of times, and the largest number has to be given the highest priority.
I have a validation to check whether the sum is greater than the input value. The logic fails even there.

The algorithm you have shown assumes that the list is sorted in ascending order. If it is not, you will first have to sort the list in O(n log n) time and then execute the algorithm.
Also, it assumes that you are only considering combinations of pairs, and it exits on the first match.
If you want to find all combinations, then instead of "break", just output the combination and increment startIndex or decrement endIndex.
Moreover, you should check for a range (targetSum - 30 to targetSum + 30) rather than just the exact value, because the problem says that a margin of error is allowed.
In my opinion this is the best solution, because its complexity is O(n log n + n) including the sorting.
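The algorithm referred to above is not reproduced here, so the following is only a sketch of a two-pointer pair search over a sorted list with the tolerance applied, under the assumption that this is the approach being described. The names PairSumSketch and FindPairs are hypothetical. Because it advances startIndex after each hit, it reports matching pairs but is not guaranteed to list every pair that falls inside the tolerance window.

```csharp
using System.Collections.Generic;

// Hypothetical example of the two-pointer idea; pairs only.
public static class PairSumSketch
{
    public static List<long[]> FindPairs(IEnumerable<long> values, long targetSum, long tolerance)
    {
        var sorted = new List<long>(values);
        sorted.Sort(); // ascending order, O(n log n)

        var matches = new List<long[]>();
        int startIndex = 0;
        int endIndex = sorted.Count - 1;

        while (startIndex < endIndex)
        {
            long sum = sorted[startIndex] + sorted[endIndex];
            if (sum < targetSum - tolerance)
            {
                startIndex++;   // sum too small: take a larger low value
            }
            else if (sum > targetSum + tolerance)
            {
                endIndex--;     // sum too large: take a smaller high value
            }
            else
            {
                // Within [targetSum - tolerance, targetSum + tolerance]
                matches.Add(new[] { sorted[startIndex], sorted[endIndex] });
                startIndex++;   // keep searching instead of breaking
            }
        }
        return matches;
    }
}
```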

V4 - Recursive Method, using Stack structure instead of stack frames on thread
It works (tested in VS), but there could be some bugs remaining.
static int Threshold = 30;
private static Stack<long> RecursiveMethod(long target)
{
Stack<long> Combination = new Stack<long>(establishedValues.Count); //Can grow bigger, as big as (target / min(establishedValues)) values
Stack<int> Index = new Stack<int>(establishedValues.Count); //Can grow bigger
int lowerBound = 0;
int dimensionIndex = lowerBound;
long fail = -1 * Threshold;
while (true)
{
long thisVal = establishedValues[dimensionIndex];
dimensionIndex++;
long afterApplied = target - thisVal;
if (afterApplied < fail)
lowerBound = dimensionIndex;
else
{
target = afterApplied;
Combination.Push(thisVal);
if (target <= Threshold)
return Combination;
Index.Push(dimensionIndex);
dimensionIndex = lowerBound;
}
while (dimensionIndex >= establishedValues.Count) // loop, because a popped index can itself be past the end of the list
{
if (Index.Count == 0)
return null; //No possible combinations
dimensionIndex = Index.Pop();
lowerBound = dimensionIndex;
target += Combination.Pop();
}
}
}
Maybe V3 - Suggestion for Ordered solution trying every combination
Although this isn't chosen as the answer for the related question, I believe this is a good approach: https://stackoverflow.com/a/17258033/887092. It will enumerate every option, including multiples of the same value. Otherwise you could try the chosen answer, although its output is only 2 items in the set being summed, rather than up to n items. V2 works but would be slightly less efficient than an ordered solution, as the same failing attempt will likely be tried multiple times.
V2 - Random Selection - Will be able to reuse the same number twice
I'm a fan of using random for "intelligence", allowing the computer to brute force the solution. It's also easy to distribute - as there is no state dependence between two threads trying at the same time for example.
static int Threshold = 30;
public static List<long> RandomMethod(long Target)
{
List<long> Combinations = new List<long>();
Random rnd = new Random();
//Assuming establishedValues is sorted
int LowerBound = 0;
long runningSum = Target;
while (true)
{
int newLowerBound = FindLowerBound(LowerBound, runningSum);
if (newLowerBound == -1)
{
//No more beneficial values to work with, reset
runningSum = Target;
Combinations.Clear();
LowerBound = 0;
continue;
}
LowerBound = newLowerBound;
int rIndex = rnd.Next(LowerBound, establishedValues.Count);
long val = establishedValues[rIndex];
runningSum -= val;
Combinations.Add(val);
if (Math.Abs(runningSum) <= 30)
return Combinations;
}
}
static int FindLowerBound(int currentLowerBound, long runningSum)
{
//Adjust lower bound, so we're not randomly trying a number that's too high
for (int i = currentLowerBound; i < establishedValues.Count; i++)
{
//Factor in the threshold, because an end aggregate which exceeds by 20 is better than underperforming by 21.
if ((establishedValues[i] - Threshold) < runningSum)
{
return i;
}
}
return -1;
}
V1 - Ordered selection - Will not be able to reuse the same number twice
Add this very handy extension function (uses a binary algorithm to find all combinations):
//Make sure you put this in a static class inside System namespace
public static IEnumerable<List<T>> EachCombination<T>(this List<T> allValues)
{
var collection = new List<List<T>>();
for (int counter = 0; counter < (1 << allValues.Count); ++counter)
{
List<T> combination = new List<T>();
for (int i = 0; i < allValues.Count; ++i)
{
if ((counter & (1 << i)) == 0)
combination.Add(allValues[i]);
}
if (combination.Count == 0)
continue;
yield return combination;
}
}
Use the function
static List<long> establishedValues = new List<long>() {618, 350, 308, 300, 250, 232, 200, 128, 180, 118, 155};
//Return is a list of the values which sum to equal the target. Null if not found.
List<long> FindFirstCombination(long target)
{
foreach (var combination in establishedValues.EachCombination())
{
//if (combination.Sum() == target)
if (Math.Abs(combination.Sum() - target) <= 30) //Plus or minus tolerance for difference
return combination;
}
return null; //Or you could throw an exception
}
Test the solution
var target = 858;
var result = FindFirstCombination(target);
bool success = (result != null && Math.Abs(result.Sum() - target) <= 30); // same tolerance as FindFirstCombination
//TODO: for loop with random selection of numbers from the establishedValues, Sum and test through FindFirstCombination

Using Linq to get the last N elements of a collection?

Given a collection, is there a way to get the last N elements of that collection? If there isn't a method in the framework, what would be the best way to write an extension method to do this?

collection.Skip(Math.Max(0, collection.Count() - N));
This approach preserves item order without a dependency on any sorting, and has broad compatibility across several LINQ providers.
It is important to take care not to call Skip with a negative number. Some providers, such as the Entity Framework, will produce an ArgumentException when presented with a negative argument. The call to Math.Max avoids this neatly.
The class below has all of the essentials for extension methods, which are: a static class, a static method, and use of the this keyword.
public static class MiscExtensions
{
// Ex: collection.TakeLast(5);
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int N)
{
return source.Skip(Math.Max(0, source.Count() - N));
}
}
A brief note on performance:
Because the call to Count() can cause enumeration of certain data structures, this approach has the risk of causing two passes over the data. This isn't really a problem with most enumerables; in fact, optimizations exist already for Lists, Arrays, and even EF queries to evaluate the Count() operation in O(1) time.
If, however, you must use a forward-only enumerable and would like to avoid making two passes, consider a one-pass algorithm like Lasse V. Karlsen or Mark Byers describe. Both of these approaches use a temporary buffer to hold items while enumerating, which are yielded once the end of the collection is found.

coll.Reverse().Take(N).Reverse().ToList();
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> coll, int N)
{
return coll.Reverse().Take(N).Reverse();
}
UPDATE: To address clintp's problem: using the TakeLast() method I defined above solves the problem, but if you really want to do it without the extra method, then you just have to recognize that while Enumerable.Reverse() can be used as an extension method, you aren't required to use it that way:
List<string> mystring = new List<string>() { "one", "two", "three" };
mystring = Enumerable.Reverse(mystring).Take(2).Reverse().ToList();

.NET Core 2.0+ provides the LINQ method TakeLast():
https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.takelast
example:
Enumerable
.Range(1, 10)
.TakeLast(3) // <--- takes last 3 items
.ToList()
.ForEach(i => System.Console.WriteLine(i));
// outputs:
// 8
// 9
// 10

Note: I missed your question title which said Using Linq, so my answer does not in fact use Linq.
If you want to avoid caching a non-lazy copy of the entire collection, you could write a simple method that does it using a linked list.
The following method will add each value it finds in the original collection into a linked list, and trim the linked list down to the number of items required. Since it keeps the linked list trimmed to this number of items the entire time through iterating through the collection, it will only keep a copy of at most N items from the original collection.
It does not require you to know the number of items in the original collection, nor iterate over it more than once.
Usage:
IEnumerable<int> sequence = Enumerable.Range(1, 10000);
IEnumerable<int> last10 = sequence.TakeLast(10);
...
Extension method:
public static class Extensions
{
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> collection,
int n)
{
if (collection == null)
throw new ArgumentNullException(nameof(collection));
if (n < 0)
throw new ArgumentOutOfRangeException(nameof(n), $"{nameof(n)} must be 0 or greater");
LinkedList<T> temp = new LinkedList<T>();
foreach (var value in collection)
{
temp.AddLast(value);
if (temp.Count > n)
temp.RemoveFirst();
}
return temp;
}
}

Here's a method that works on any enumerable but uses only O(N) temporary storage:
public static class TakeLastExtension
{
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int takeCount)
{
if (source == null) { throw new ArgumentNullException("source"); }
if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
if (takeCount == 0) { yield break; }
T[] result = new T[takeCount];
int i = 0;
int sourceCount = 0;
foreach (T element in source)
{
result[i] = element;
i = (i + 1) % takeCount;
sourceCount++;
}
if (sourceCount < takeCount)
{
takeCount = sourceCount;
i = 0;
}
for (int j = 0; j < takeCount; ++j)
{
yield return result[(i + j) % takeCount];
}
}
}
Usage:
List<int> l = new List<int> {4, 6, 3, 6, 2, 5, 7};
List<int> lastElements = l.TakeLast(3).ToList();
It works by using a ring buffer of size N to store the elements as it sees them, overwriting old elements with new ones. When the end of the enumerable is reached the ring buffer contains the last N elements.

I am surprised that no one has mentioned it, but SkipWhile does have an overload that uses the element's index.
public static IEnumerable<T> TakeLastN<T>(this IEnumerable<T> source, int n)
{
if (source == null)
throw new ArgumentNullException("source", "Source cannot be null"); // first argument is the parameter name
int goldenIndex = source.Count() - n;
return source.SkipWhile((val, index) => index < goldenIndex);
}
//Or if you like them one-liners (in the spirit of the current accepted answer);
//However, this is most likely impractical due to the repeated calculations
collection.SkipWhile((val, index) => index < collection.Count() - N)
The only perceivable benefit that this solution presents over others is that you can have the option to add in a predicate to make a more powerful and efficient LINQ query, instead of having two separate operations that traverse the IEnumerable twice.
public static IEnumerable<T> FilterLastN<T>(this IEnumerable<T> source, int n, Predicate<T> pred)
{
int goldenIndex = source.Count() - n;
return source.SkipWhile((val, index) => index < goldenIndex && pred(val));
}

Use EnumerableEx.TakeLast in Rx's System.Interactive assembly. It's an O(N) implementation like @Mark's, but it uses a queue rather than a ring-buffer construct (and dequeues items when it reaches buffer capacity).
(NB: This is the IEnumerable version - not the IObservable version, though the implementation of the two is pretty much identical)

If you are dealing with a collection with a key (e.g. entries from a database) a quick (i.e. faster than the selected answer) solution would be
collection.OrderByDescending(c => c.Key).Take(3).OrderBy(c => c.Key);

If you don't mind dipping into Rx as part of the monad, you can use TakeLast:
IEnumerable<int> source = Enumerable.Range(1, 10000);
// ToObservable/ToEnumerable are the Rx conversions between IEnumerable<T> and IObservable<T>
IEnumerable<int> lastThree = source.ToObservable().TakeLast(3).ToEnumerable();

I tried to combine efficiency and simplicity and ended up with this:
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int count)
{
if (source == null) { throw new ArgumentNullException("source"); }
Queue<T> lastElements = new Queue<T>();
foreach (T element in source)
{
lastElements.Enqueue(element);
if (lastElements.Count > count)
{
lastElements.Dequeue();
}
}
return lastElements;
}
About performance: In C#, Queue<T> is implemented using a circular buffer, so no object is instantiated on each iteration (only when the queue grows). I did not set the queue capacity (using the dedicated constructor) because someone might call this extension with count = int.MaxValue. For extra performance you might check whether source implements IList<T> and, if so, directly extract the last values using indexes.
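The IList<T> shortcut mentioned above could look roughly like the following sketch. The names TakeLastSketch and TakeLastFast are hypothetical (chosen to avoid clashing with the other TakeLast methods on this page), and the buffered fallback simply reuses the queue approach from this answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the implementation from the answer above.
public static class TakeLastSketch
{
    public static IEnumerable<T> TakeLastFast<T>(this IEnumerable<T> source, int count)
    {
        if (source == null) { throw new ArgumentNullException(nameof(source)); }
        if (count <= 0) { return new T[0]; }

        // Fast path: indexable sources need no buffering and no full pass.
        IList<T> list = source as IList<T>;
        if (list != null) { return TakeLastFromList(list, count); }

        return TakeLastBuffered(source, count);
    }

    // Yields the last items directly by index, in their original order.
    private static IEnumerable<T> TakeLastFromList<T>(IList<T> list, int count)
    {
        int start = Math.Max(0, list.Count - count);
        for (int i = start; i < list.Count; i++)
        {
            yield return list[i];
        }
    }

    // Single pass over the source, keeping at most count items in a queue.
    private static IEnumerable<T> TakeLastBuffered<T>(IEnumerable<T> source, int count)
    {
        var lastElements = new Queue<T>();
        foreach (T element in source)
        {
            lastElements.Enqueue(element);
            if (lastElements.Count > count)
            {
                lastElements.Dequeue();
            }
        }
        return lastElements;
    }
}
```

Arrays and List<T> both implement IList<T>, so they take the indexed path; anything else falls back to the single-pass queue.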

If using a third-party library is an option, MoreLinq defines TakeLast() which does exactly this.

It is a little inefficient to take the last N of a collection using LINQ as all the above solutions require iterating across the collection. TakeLast(int n) in System.Interactive also has this problem.
If you have a list, a more efficient thing to do is to slice it using the following method:
/// Select from start to end exclusive of end using the same semantics
/// as python slice.
/// <param name="list"> the list to slice</param>
/// <param name="start">The starting index</param>
/// <param name="end">The ending index. The result does not include this index</param>
public static List<T> Slice<T>
(this IReadOnlyList<T> list, int start, int? end = null)
{
if (end == null)
{
end = list.Count();
}
if (start < 0)
{
start = list.Count + start;
}
if (start >= 0 && end.Value > 0 && end.Value > start)
{
return list.GetRange(start, end.Value - start);
}
if (end < 0)
{
return list.GetRange(start, (list.Count() + end.Value) - start);
}
if (end == start)
{
return new List<T>();
}
throw new IndexOutOfRangeException(
"count = " + list.Count() +
" start = " + start +
" end = " + end);
}
with
public static List<T> GetRange<T>( this IReadOnlyList<T> list, int index, int count )
{
List<T> r = new List<T>(count);
for ( int i = 0; i < count; i++ )
{
int j=i + index;
if ( j >= list.Count )
{
break;
}
r.Add(list[j]);
}
return r;
}
and some test cases
[Fact]
public void GetRange()
{
IReadOnlyList<int> l = new List<int>() { 0, 10, 20, 30, 40, 50, 60 };
l
.GetRange(2, 3)
.ShouldAllBeEquivalentTo(new[] { 20, 30, 40 });
l
.GetRange(5, 10)
.ShouldAllBeEquivalentTo(new[] { 50, 60 });
}
[Fact]
void SliceMethodShouldWork()
{
var list = new List<int>() { 1, 3, 5, 7, 9, 11 };
list.Slice(1, 4).ShouldBeEquivalentTo(new[] { 3, 5, 7 });
list.Slice(1, -2).ShouldBeEquivalentTo(new[] { 3, 5, 7 });
list.Slice(1, null).ShouldBeEquivalentTo(new[] { 3, 5, 7, 9, 11 });
list.Slice(-2)
.Should()
.BeEquivalentTo(new[] {9, 11});
list.Slice(-2,-1 )
.Should()
.BeEquivalentTo(new[] {9});
}

I know it's too late to answer this question, but if you are working with a collection of type IList<> and you don't care about the order of the returned elements, then this method is faster (when the list is longer than takeCount, the elements come back last-to-first). I took Mark Byers' answer and made a few changes. The TakeLast method now looks like this:
public static IEnumerable<T> TakeLast<T>(IList<T> source, int takeCount)
{
if (source == null) { throw new ArgumentNullException("source"); }
if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
if (takeCount == 0) { yield break; }
if (source.Count > takeCount)
{
for (int z = source.Count - 1; takeCount > 0; z--)
{
takeCount--;
yield return source[z];
}
}
else
{
for(int i = 0; i < source.Count; i++)
{
yield return source[i];
}
}
}
For testing I compared it against Mark Byers' method and kbrimington's answer. This is the test:
IList<int> test = new List<int>();
for(int i = 0; i<1000000; i++)
{
test.Add(i);
}
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
IList<int> result = TakeLast(test, 10).ToList();
stopwatch.Stop();
Stopwatch stopwatch1 = new Stopwatch();
stopwatch1.Start();
IList<int> result1 = TakeLast2(test, 10).ToList();
stopwatch1.Stop();
Stopwatch stopwatch2 = new Stopwatch();
stopwatch2.Start();
IList<int> result2 = test.Skip(Math.Max(0, test.Count - 10)).Take(10).ToList();
stopwatch2.Stop();
The results for taking 10 elements and for taking 1000001 elements were shown as images, which are not preserved here.

Here's my solution:
public static class EnumerationExtensions
{
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> input, int count)
{
if (count <= 0)
yield break;
var inputList = input as IList<T>;
if (inputList != null)
{
int last = inputList.Count;
int first = last - count;
if (first < 0)
first = 0;
for (int i = first; i < last; i++)
yield return inputList[i];
}
else
{
// Use a ring buffer. We have to enumerate the input, and we don't know in advance how many elements it will contain.
T[] buffer = new T[count];
int index = 0;
count = 0;
foreach (T item in input)
{
buffer[index] = item;
index = (index + 1) % buffer.Length;
count++;
}
// The index variable now points at the next buffer entry that would be filled. If the buffer isn't completely
// full, then there are 'count' elements preceding index. If the buffer *is* full, then index is pointing at
// the oldest entry, which is the first one to return.
//
// If the buffer isn't full, which means that the enumeration has fewer than 'count' elements, we'll fix up
// 'index' to point at the first entry to return. That's easy to do; if the buffer isn't full, then the oldest
// entry is the first one. :-)
//
// We'll also set 'count' to the number of elements to be returned. It only needs adjustment if we've wrapped
// past the end of the buffer and have enumerated more than the original count value.
if (count < buffer.Length)
index = 0;
else
count = buffer.Length;
// Return the values in the correct order.
while (count > 0)
{
yield return buffer[index];
index = (index + 1) % buffer.Length;
count--;
}
}
}
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> input, int count)
{
if (count <= 0)
return input;
else
return input.SkipLastIter(count);
}
private static IEnumerable<T> SkipLastIter<T>(this IEnumerable<T> input, int count)
{
var inputList = input as IList<T>;
if (inputList != null)
{
int first = 0;
int last = inputList.Count - count;
if (last < 0)
last = 0;
for (int i = first; i < last; i++)
yield return inputList[i];
}
else
{
// Aim to leave 'count' items in the queue. If the input has fewer than 'count'
// items, then the queue won't ever fill and we return nothing.
Queue<T> elements = new Queue<T>();
foreach (T item in input)
{
elements.Enqueue(item);
if (elements.Count > count)
yield return elements.Dequeue();
}
}
}
}
The code is a bit chunky, but as a drop-in reusable component, it should perform as well as it can in most scenarios, and it'll keep the code that's using it nice and concise. :-)
My TakeLast for non-IList`1 is based on the same ring buffer algorithm as that in the answers by @Mark Byers and @MackieChan further up. It's interesting how similar they are; I wrote mine completely independently. I guess there's really just one way to do a ring buffer properly. :-)
Looking at @kbrimington's answer, an additional check for IQueryable<T> could be added to fall back to the approach that works well with Entity Framework, assuming that what I have at this point does not.

Honestly I'm not super proud of the answer, but for small collections you could use the following:
var lastN = collection.Reverse().Take(n).Reverse();
A bit hacky but it does the job ;)

My solution is based on ranges, introduced in C# version 8.
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int N)
{
return source.ToArray()[(source.Count()-N)..];
}
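One caveat with the one-liner above: it enumerates source twice (once in ToArray() and once in Count()), and it throws when N is larger than the number of elements, because the start index goes negative. A sketch of a variant that avoids both; the name TakeLastClamped is hypothetical, chosen only to avoid clashing with other TakeLast methods:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeTakeLast
{
    // Hypothetical variant of the range-based approach.
    public static T[] TakeLastClamped<T>(this IEnumerable<T> source, int n)
    {
        if (source == null) { throw new ArgumentNullException(nameof(source)); }
        if (n <= 0) { return Array.Empty<T>(); }

        T[] array = source.ToArray();              // enumerate the source exactly once
        int start = Math.Max(0, array.Length - n); // clamp so that n > Length returns everything
        return array[start..];                     // C# 8 range: copies the tail into a new array
    }
}
```

This still copies the whole input into an array first, so it has the same allocation profile as the original one-liner.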
I ran a benchmark comparing the top-rated solutions with my humbly proposed one:
public static class TakeLastExtension
{
public static IEnumerable<T> TakeLastMarkByers<T>(this IEnumerable<T> source, int takeCount)
{
if (source == null) { throw new ArgumentNullException("source"); }
if (takeCount < 0) { throw new ArgumentOutOfRangeException("takeCount", "must not be negative"); }
if (takeCount == 0) { yield break; }
T[] result = new T[takeCount];
int i = 0;
int sourceCount = 0;
foreach (T element in source)
{
result[i] = element;
i = (i + 1) % takeCount;
sourceCount++;
}
if (sourceCount < takeCount)
{
takeCount = sourceCount;
i = 0;
}
for (int j = 0; j < takeCount; ++j)
{
yield return result[(i + j) % takeCount];
}
}
public static IEnumerable<T> TakeLastKbrimington<T>(this IEnumerable<T> source, int N)
{
return source.Skip(Math.Max(0, source.Count() - N));
}
public static IEnumerable<T> TakeLastJamesCurran<T>(this IEnumerable<T> source, int N)
{
return source.Reverse().Take(N).Reverse();
}
public static IEnumerable<T> TakeLastAlex<T>(this IEnumerable<T> source, int N)
{
return source.ToArray()[(source.Count()-N)..];
}
}
Test
[MemoryDiagnoser]
public class TakeLastBenchmark
{
[Params(10000)]
public int N;
private readonly List<string> l = new();
[GlobalSetup]
public void Setup()
{
for (var i = 0; i < this.N; i++)
{
this.l.Add($"{i}");
}
}
[Benchmark]
public void Benchmark1_MarkByers()
{
var lastElements = l.TakeLastMarkByers(3).ToList();
}
[Benchmark]
public void Benchmark2_Kbrimington()
{
var lastElements = l.TakeLastKbrimington(3).ToList();
}
[Benchmark]
public void Benchmark3_JamesCurran()
{
var lastElements = l.TakeLastJamesCurran(3).ToList();
}
[Benchmark]
public void Benchmark4_Alex()
{
var lastElements = l.TakeLastAlex(3).ToList();
}
}
Program.cs:
var summary = BenchmarkRunner.Run(typeof(TakeLastBenchmark).Assembly);
Command: dotnet run --project .\TestsConsole2.csproj -c Release --logBuildOutput
The results were as follows:
// * Summary *
BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19044.1889/21H2/November2021Update)
AMD Ryzen 5 5600X, 1 CPU, 12 logical and 6 physical cores
.NET SDK=6.0.401
[Host] : .NET 6.0.9 (6.0.922.41905), X64 RyuJIT AVX2
DefaultJob : .NET 6.0.9 (6.0.922.41905), X64 RyuJIT AVX2
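| Method                 | N     | Mean         | Error        | StdDev       | Gen0   | Gen1   | Allocated |
|------------------------|-------|--------------|--------------|--------------|--------|--------|-----------|
| Benchmark1_MarkByers   | 10000 | 89,390.53 ns | 1,735.464 ns | 1,704.457 ns | -      | -      | 248 B     |
| Benchmark2_Kbrimington | 10000 | 46.15 ns     | 0.410 ns     | 0.363 ns     | 0.0076 | -      | 128 B     |
| Benchmark3_JamesCurran | 10000 | 2,703.15 ns  | 46.298 ns    | 67.862 ns    | 4.7836 | 0.0038 | 80264 B   |
| Benchmark4_Alex        | 10000 | 2,513.48 ns  | 48.661 ns    | 45.517 ns    | 4.7607 | -      | 80152 B   |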
It turns out that the solution proposed by @Kbrimington is the most efficient in this benchmark, both in memory allocation and in raw performance.

Below is a real example of how to take the last 3 elements from a collection (array):
// split address by spaces into array
string[] adrParts = adr.Split(new string[] { " " },StringSplitOptions.RemoveEmptyEntries);
// take only 3 last items in array
adrParts = adrParts.SkipWhile((value, index) => { return adrParts.Length - index > 3; }).ToArray();

Use this method to get the whole range without an error:
public List<T> GetTsRate( List<T> AllT,int Index,int Count)
{
List<T> Ts = null;
try
{
Ts = AllT.ToList().GetRange(Index, Count);
}
catch (Exception ex)
{
Ts = AllT.Skip(Index).ToList();
}
return Ts ;
}

A slightly different implementation using a circular buffer. Benchmarks show that this method is roughly twice as fast as those using a Queue (the implementation of TakeLast in System.Linq), but not without a cost: it needs a buffer that grows with the requested number of elements, so even for a small collection you can get a huge memory allocation.
public IEnumerable<T> TakeLast<T>(IEnumerable<T> source, int count)
{
int i = 0;
if (count < 1)
yield break;
if (source is IList<T> listSource)
{
if (listSource.Count < 1)
yield break;
for (i = listSource.Count < count ? 0 : listSource.Count - count; i < listSource.Count; i++)
yield return listSource[i];
}
else
{
bool move = true;
bool filled = false;
T[] result = new T[count];
using (var enumerator = source.GetEnumerator())
while (move)
{
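// Test i < count first so MoveNext() is not called (and an element dropped) when the buffer is full.
for (i = 0; i < count && (move = enumerator.MoveNext()); i++)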
result[i] = enumerator.Current;
filled |= move;
}
if (filled)
for (int j = i; j < count; j++)
yield return result[j];
for (int j = 0; j < i; j++)
yield return result[j];
}
}

// Detailed code for the problem.
// Suppose we have a List<T> named 'collection' with at least N elements.
var firstIndex = collection.Count - N; // index of the first of the last N elements
var desiredCollection = collection.GetRange(firstIndex, N);
---------------------------------------------------------------------
// Or use this one-liner
var desiredCollection = collection.GetRange(collection.Count - N, N);
