So I was taking an online test where I had to implement a piece of code that simply checks whether a value is in an array. I wrote the following code:
using System;
using System.IO;
using System.Linq;
public class Check
{
public static bool ExistsInArray(int[] ints, int val)
{
if (ints.Contains(val)) return true;
else return false;
}
}
Now I didn't see any problems here, because the code works fine, but somehow I still failed the test because it is "not fast enough" once the array contains a million values.
The only code I wrote myself is:
if (ints.Contains(val)) return true;
else return false;
The other code I was given to work with.
Is there a way to speed up this process?
Thanks in advance.
EDIT:
I came across a page where someone apparently took the same test as I did, and it seems to come down to saving CPU cycles.
Reference: How to save CPU cycles when searching for a value in a sorted list?
Now his solution within the method is:
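// Note: k here is the value being searched for (val in the original method signature).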
var lower = 0;
var upper = ints.Length - 1;
if ( k < ints[lower] || k > ints[upper] ) return false;
if ( k == ints[lower] ) return true;
if ( k == ints[upper] ) return true;
do
{
var middle = lower + ( upper - lower ) / 2;
if ( ints[middle] == k ) return true;
if ( lower == upper ) return false;
if ( k < ints[middle] )
upper = Math.Max( lower, middle - 1 );
else
lower = Math.Min( upper, middle + 1 );
} while ( true );
Now I see how this code works, but it's not clear to me why it is supposed to be faster. It would be nice if someone could elaborate.
If it's a sorted array, you can use Array.BinarySearch to speed up the process:
public static bool ExistsInArray(int[] ints, int val)
{
return Array.BinarySearch(ints, val) >= 0;
}
You can use the Parallel class, something like the code below:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
namespace ParallelDemo
{
    class Program
    {
        static void Main()
        {
            var options = new ParallelOptions()
            {
                MaxDegreeOfParallelism = 2
            };
            List<int> integerList = Enumerable.Range(0, 10).ToList();
            Parallel.ForEach(integerList, options, i =>
            {
                Console.WriteLine("value of i = {0}, thread = {1}",
                    i, Thread.CurrentThread.ManagedThreadId);
            });
            Console.WriteLine("Press any key to exit");
            Console.ReadLine();
        }
    }
}
Note: it will speed things up, but you're going to use more memory.
The correct answer is: it depends.
Is the list sorted?
How big is the list?
How many cores can you throw at the problem?
The simplest answer is that LINQ, for all its wonder, is actually quite slow. It adds a lot of indirection and generally performs a lot of work under the covers. It's great when readability is your main goal. But for performance? No.
In a single-threaded search of an unsorted list, the old-fashioned for loop will give you the best results. If the list is sorted, then a binary search or some variation of it will work best.
As for parallelism, C# has the Parallel class. But beware: if the list is small enough, the overhead of creating threads can outweigh your search time.
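For illustration, a minimal parallel membership test using PLINQ (an alternative to Parallel.ForEach) might look like this. This is a sketch, and the method name is mine; whether it actually wins depends on the array size and the number of cores:
using System.Linq;

public static bool ExistsInArrayParallel(int[] ints, int val)
{
    // PLINQ partitions the array across cores; Any() can stop early
    // once any partition finds a match.
    return ints.AsParallel().Any(i => i == val);
}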
Simple, single-threaded, unsorted answer:
public static bool ExistsInArray(int[] ints, int val)
{
for (int index = 0, count = ints.Length; index < count; ++index)
{
if (ints[index] == val) return true;
}
return false;
}
It's possible that the site you're looking at wants this instead. But this only works if the array is sorted:
public static bool ExistsInArray(int[] ints, int val)
{
return Array.BinarySearch(ints, val) >= 0;
}
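If the array might not be sorted, one option is to sort a copy once and reuse it across lookups. This is a sketch; the O(n log n) sort only pays off if you search the same data repeatedly:
int[] sorted = (int[])ints.Clone(); // keep the original order intact
Array.Sort(sorted);
bool exists = Array.BinarySearch(sorted, val) >= 0;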
Supporting posts that demonstrate that LINQ is not so fast:
For vs. Linq - Performance vs. Future
https://www.anujvarma.com/linq-versus-loopingperformance/
If the input array is already sorted, then using BinarySearch is the best approach.
.NET has built-in support for binary search via the Array.BinarySearch method.
I did a quick experiment comparing Contains and BinarySearch on a sorted array of one million integer values, as follows.
using System;
using System.Diagnostics;
using System.Linq;

public static void Main()
{
var collection = Enumerable.Range(0, 1000000).ToArray();
var st = new Stopwatch();
var val = 999999;
st.Start();
var isExist = collection.Contains(val);
st.Stop();
Console.WriteLine("Time taken for Contains : {0}", st.Elapsed.TotalMilliseconds);
st.Restart();
var p = BinarySearchArray(collection, 0, collection.Length - 1, val);
st.Stop();
if(p == -1)
{
Console.WriteLine("Not Found");
}
else
{
Console.WriteLine("Item found at position : {0}", p);
}
Console.WriteLine("Time taken for binary search {0}", st.Elapsed.TotalMilliseconds);
}
private static int BinarySearchArray(int[] inputArray, int lower, int upper, int val)
{
if(lower > upper)
return -1;
var midpoint = lower + (upper - lower) / 2; // avoids overflow of (lower + upper) for very large bounds
if(inputArray[midpoint] == val)
{
return midpoint;
}
else if(inputArray[midpoint] > val)
{
upper = midpoint - 1;
}
else if(inputArray[midpoint] < val)
{
lower = midpoint+1;
}
return BinarySearchArray(inputArray, lower, upper, val);
}
Following is the output.
Time taken for Contains : 1.0518
Item found at position : 999999
Time taken for binary search 0.1522
It is clear that BinarySearch has the upper hand here.
.NET's Contains method does not use BinarySearch internally. Contains is fine for small collections, but for bigger sorted arrays BinarySearch is the better approach.
Related
I have data that consists of about 2 million records. I am trying to find the single record closest to a given timeframe. The list of data is ordered, and the data is represented by the following class:
public class DataPoint
{
public long OpenTimeTs;
}
I have implemented 3 methods that do the same job and produce the same results, and I have some questions about why one of the approaches performs faster.
Method 1
Uses binary search within the list of longs.
private DataPoint BinaryFindClosest(List<DataPoint> candles, List<long> times, long dateToFindMs)
{
int index = times.BinarySearch(dateToFindMs);
if (index >= 0)
return candles[index];
// If not found, List.BinarySearch returns the complement
// of the index where the element should have been.
index = ~index;
// The date searched for is larger than any in the list.
if (index == times.Count)
return candles[index - 1];
// The date searched is smaller than any in the list.
if (index == 0)
return candles[0];
if (Math.Abs(dateToFindMs - times[index - 1]) < Math.Abs(dateToFindMs - times[index]))
return candles[index - 1];
else
return candles[index];
}
Method 2
Almost the same as method 1, except that it uses a custom object comparer.
private DataPoint BinaryFindClosest2(List<DataPoint> candles, DataPoint toFind)
{
var comparer = Comparer<DataPoint>.Create((x, y) => x.OpenTimeTs > y.OpenTimeTs ? 1 : x.OpenTimeTs < y.OpenTimeTs ? -1 : 0);
int index = candles.BinarySearch(toFind, comparer);
if (index >= 0)
return candles[index];
// If not found, List.BinarySearch returns the complement
// of the index where the element should have been.
index = ~index;
// The date searched for is larger than any in the list.
if (index == candles.Count)
return candles[index - 1];
// The date searched is smaller than any in the list.
if (index == 0)
return candles[0];
if (Math.Abs(toFind.OpenTimeTs - candles[index - 1].OpenTimeTs) < Math.Abs(toFind.OpenTimeTs - candles[index].OpenTimeTs))
return candles[index - 1];
else
return candles[index];
}
Method 3
Finally, this is the method I had been using before discovering the BinarySearch approach on Stack Overflow in some other topic.
private DataPoint FindClosest(List<DataPoint> candles, DataPoint toFind)
{
long timeToFind = toFind.OpenTimeTs;
int smallestDistanceIdx = -1;
long smallestDistance = long.MaxValue;
for (int i = 0; i < candles.Count(); i++)
{
var candle = candles[i];
var distance = Math.Abs(candle.OpenTimeTs - timeToFind);
if (distance <= smallestDistance)
{
smallestDistance = distance;
smallestDistanceIdx = i;
}
else
{
break;
}
}
return candles[smallestDistanceIdx];
}
Question
Now here comes the problem. After running some benchmarks, it came to my attention that the second method (which uses the custom comparer) is the fastest among them.
I would like to know why the approach with the custom comparer is performing faster than the approach that binary-searches within the list of longs.
I am using the following code to test the methods:
var candles = AppState.GetLoadSymbolData();
var times = candles.Select(s => s.OpenTimeTs).ToList();
var dateToFindMs = candles[candles.Count / 2].OpenTimeTs;
var candleToFind = new DataPoint() { OpenTimeTs = dateToFindMs };
var numberOfFinds = 100_000;
var sw = Stopwatch.StartNew();
for (int i = 0; i < numberOfFinds; i++)
{
var foundCandle = BinaryFindClosest(candles, times, dateToFindMs);
}
sw.Stop();
var elapsed1 = sw.ElapsedMilliseconds;
sw.Restart();
for (int i = 0; i < numberOfFinds; i++)
{
var foundCandle = BinaryFindClosest2(candles, candleToFind);
}
sw.Stop();
var elapsed2 = sw.ElapsedMilliseconds;
sw.Restart();
for (int i = 0; i < numberOfFinds; i++)
{
var foundCandle = FindClosest(candles, candleToFind);
}
sw.Stop();
var elapsed3 = sw.ElapsedMilliseconds;
Console.WriteLine($"Elapsed 1: {elapsed1} ms");
Console.WriteLine($"Elapsed 2: {elapsed2} ms");
Console.WriteLine($"Elapsed 3: {elapsed3} ms");
In release mode, the results are as follows:
Elapsed 1: 19 ms
Elapsed 2: 1 ms
Elapsed 3: 60678 ms
Logically I would assume that it should be faster to compare the list of longs, but this is not the case. I tried profiling the code, but it only points to the BinarySearch method's slow execution, nothing else. So there must be some internal processes that slow things down for longs.
Edit: After following the advice, I implemented a proper benchmark using BenchmarkDotNet, and here are the results:
| Method             | N     | Mean            | Error         | StdDev        | Gen0   | Allocated |
|------------------- |------ |---------------- |-------------- |-------------- |------- |---------- |
| BinaryFindClosest  | 10000 |        28.31 ns |      0.409 ns |      0.362 ns |      - |         - |
| BinaryFindClosest2 | 10000 |        75.85 ns |      0.865 ns |      0.722 ns | 0.0014 |      24 B |
| FindClosest        | 10000 | 3,363,223.68 ns | 63,300.072 ns | 52,858.427 ns |      - |       2 B |
It does look like the order in which the methods were executed messed up my initial results. Now it looks like the first method works faster (as it should). The slowest is of course my own implementation. I have tuned it a bit, but it is still the slowest method:
public static DataPoint FindClosest(List<DataPoint> candles, List<long> times, DataPoint toFind)
{
long timeToFind = toFind.OpenTimeTs;
int smallestDistanceIdx = -1;
long smallestDistance = long.MaxValue;
var count = candles.Count();
for (int i = 0; i < count; i++)
{
var diff = times[i] - timeToFind;
var distance = diff < 0 ? -diff : diff;
if (distance < smallestDistance)
{
smallestDistance = distance;
smallestDistanceIdx = i;
}
else
{
break;
}
}
return candles[smallestDistanceIdx];
}
To make a long story short: use a proper benchmarking tool.
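For reference, a minimal BenchmarkDotNet skeleton looks something like the following. This is a sketch (the type and method names are mine, benchmarking the Contains/BinarySearch comparison from earlier in this thread); each [Benchmark] method gets its own warm-up and isolated runs, which avoids the ordering effect seen above:
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class SearchBenchmarks
{
    private int[] data = Array.Empty<int>();

    [GlobalSetup]
    public void Setup() => data = Enumerable.Range(0, 1_000_000).ToArray();

    [Benchmark]
    public bool Contains() => data.Contains(999_999);

    [Benchmark]
    public bool BinarySearch() => Array.BinarySearch(data, 999_999) >= 0;
}

public class Program
{
    // Run in Release mode; BenchmarkDotNet handles iteration counts itself.
    public static void Main() => BenchmarkRunner.Run<SearchBenchmarks>();
}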
Take a look at the IL that Methods 1 and 2 generate. It is likely an invalid test; they should compile to almost the same machine code.
First: I don't see where you guarantee the ordering, but suppose it is there somehow. Binary search will find the most hidden number in roughly 20 to 25 steps (log2(2,000,000)). This test smells weird.
Second: where is the definition of BinaryFindClosestCandle(candles, times, dateToFindMs)? Why is it receiving both the class instances and the list of longs? Why not return the index from the binary search on the long list and use it to index the original list of candles? (If you create the list of longs with Select, the 1:1 relation between the lists is preserved.)
Third: the data you are using is a class, so all elements live on the heap. You are boxing an array of 2 million long numbers in method 2; it is almost a crime. Dereferencing data from the heap will cost much more than the comparison itself. I still think that the lists are not ordered.
Create a swap list to apply the search algorithm on, as you did with times, but convert it to an array with .ToArray() instead, so the data sits in one contiguous block. I don't think there is anything better on the market than the default comparer of long value types.
EDIT FOR SOLUTION HINT:
Depending on how many insertions you do per lookup of the minimum value, I would go for the following:
if (insertions/lookups > 300,000)
{
a. Store the index of the minimum (and the minimum value) in a dedicated field; I would also store an IsUpdated flag that becomes false at the first deletion from the list.
b. Spawn a parallel thread to refresh that index and the minimum value every now and then (depending on how often you do the lookups) if IsUpdated is false, or refresh lazily when you start a lookup with IsUpdated = false.
}
else
{
Use a dictionary with the long as a key (I suppose that two entities with the same long value are likely to be considered equal).
}
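As a minimal sketch of that dictionary branch, assuming the OpenTimeTs values are unique and reusing the DataPoint class and the candles/dateToFindMs variables from the question:
using System;
using System.Collections.Generic;

// Sketch: O(1) insert and O(1) exact lookup by timestamp.
// A closest-match query would still need a sorted structure.
var byTime = new Dictionary<long, DataPoint>();
foreach (var candle in candles)
    byTime[candle.OpenTimeTs] = candle;

if (byTime.TryGetValue(dateToFindMs, out var exact))
    Console.WriteLine(exact.OpenTimeTs);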
I have a List containing these values: {1, 2, 3, 4, 5, 6, 7}, and I want to be able to retrieve unique combinations of three. The result should be like this:
{1,2,3}
{1,2,4}
{1,2,5}
{1,2,6}
{1,2,7}
{2,3,4}
{2,3,5}
{2,3,6}
{2,3,7}
{3,4,5}
{3,4,6}
{3,4,7}
{3,4,1}
{4,5,6}
{4,5,7}
{4,5,1}
{4,5,2}
{5,6,7}
{5,6,1}
{5,6,2}
{5,6,3}
I already have 2 for loops that are able to do this:
for (int first = 0; first < test.Count - 2; first++)
{
int second = first + 1;
for (int offset = 1; offset < test.Count; offset++)
{
int third = (second + offset)%test.Count;
if(Math.Abs(first - third) < 2)
continue;
List<int> temp = new List<int>();
temp.Add(test[first]);
temp.Add(test[second]);
temp.Add(test[third]);
result.Add(temp);
}
}
But since I'm learning LINQ, I wonder if there is a smarter way to do this?
UPDATE: I used this question as the subject of a series of articles starting here; I'll go through two slightly different algorithms in that series. Thanks for the great question!
The two solutions posted so far are correct but inefficient for the cases where the numbers get large. They use this algorithm: first, enumerate all the possibilities:
{1, 1, 1 }
{1, 1, 2 },
{1, 1, 3 },
...
{7, 7, 7}
And while doing so, filter out any result where the second element is not larger than the first, or the third is not larger than the second. This performs 7 x 7 x 7 filtering operations, which is not that many, but if you were trying to get, say, combinations of ten elements from thirty, that's 30 x 30 x 30 x 30 x 30 x 30 x 30 x 30 x 30 x 30, which is rather a lot. You can do better than that.
I would solve this problem as follows. First, produce a data structure which is an efficient immutable set. Let me be very clear what an immutable set is, because you are likely not familiar with them. You normally think of a set as something you add items and remove items from. An immutable set has an Add operation but it does not change the set; it gives you back a new set which has the added item. The same for removal.
Here is an implementation of an immutable set where the elements are integers from 0 to 31:
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System;
// A super-cheap immutable set of integers from 0 to 31;
// just a convenient wrapper around bit operations on an int.
internal struct BitSet : IEnumerable<int>
{
public static BitSet Empty { get { return default(BitSet); } }
private readonly int bits;
private BitSet(int bits) { this.bits = bits; }
public bool Contains(int item)
{
Debug.Assert(0 <= item && item <= 31);
return (bits & (1 << item)) != 0;
}
public BitSet Add(int item)
{
Debug.Assert(0 <= item && item <= 31);
return new BitSet(this.bits | (1 << item));
}
public BitSet Remove(int item)
{
Debug.Assert(0 <= item && item <= 31);
return new BitSet(this.bits & ~(1 << item));
}
IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(); }
public IEnumerator<int> GetEnumerator()
{
for(int item = 0; item < 32; ++item)
if (this.Contains(item))
yield return item;
}
public override string ToString()
{
return string.Join(",", this);
}
}
Read this code carefully to understand how it works. Again, always remember that adding an element to this set does not change the set. It produces a new set that has the added item.
OK, now that we've got that, let's consider a more efficient algorithm for producing your combinations.
We will solve the problem recursively. A recursive solution always has the same structure:
Can we solve a trivial problem? If so, solve it.
If not, break the problem down into a number of smaller problems and solve each one.
Let's start with the trivial problems.
Suppose you have a set and you wish to choose zero items from it. The answer is clear: there is only one possible combination with zero elements, and that is the empty set.
Suppose you have a set with n elements in it and you want to choose more than n elements. Clearly there is no solution, not even the empty set.
We have now taken care of the cases where the set is empty or the number of elements chosen is more than the number of elements total, so we must be choosing at least one thing from a set that has at least one thing.
Of the possible combinations, some of them have the first element in them and some of them do not. Find all the ones that have the first element in them and yield them. We do this by recursing to choose one fewer element from the set that is missing the first element.
The ones that do not have the first element in them we find by enumerating the combinations of the set without the first element.
static class Extensions
{
public static IEnumerable<BitSet> Choose(this BitSet b, int choose)
{
if (choose < 0) throw new InvalidOperationException();
if (choose == 0)
{
// Choosing zero elements from any set gives the empty set.
yield return BitSet.Empty;
}
else if (b.Count() >= choose)
{
// We are choosing at least one element from a set that has
// a first element. Get the first element, and the set
// lacking the first element.
int first = b.First();
BitSet rest = b.Remove(first);
// These are the combinations that contain the first element:
foreach(BitSet r in rest.Choose(choose-1))
yield return r.Add(first);
// These are the combinations that do not contain the first element:
foreach(BitSet r in rest.Choose(choose))
yield return r;
}
}
}
Now we can ask the question that you need the answer to:
class Program
{
static void Main()
{
BitSet b = BitSet.Empty.Add(1).Add(2).Add(3).Add(4).Add(5).Add(6).Add(7);
foreach(BitSet result in b.Choose(3))
Console.WriteLine(result);
}
}
And we're done. We have generated only as many sequences as we actually need. (We have done a lot of set operations to get there, but set operations are cheap.) The point here is that understanding how this algorithm works is extremely instructive. Recursive programming on immutable structures is a powerful tool that many professional programmers do not have in their toolbox.
You can do it like this:
var data = Enumerable.Range(1, 7);
var r = from a in data
from b in data
from c in data
where a < b && b < c
select new {a, b, c};
foreach (var x in r) {
Console.WriteLine("{0} {1} {2}", x.a, x.b, x.c);
}
Demo.
Edit: Thanks Eric Lippert for simplifying the answer!
var ints = new int[] { 1, 2, 3, 4, 5, 6, 7 };
var permutations = ints.SelectMany(a => ints.Where(b => (b > a)).
SelectMany(b => ints.Where(c => (c > b)).
Select(c => new { a = a, b = b, c = c })));
As a first attempt to get into merge sort, I produced the following code, which works on strings because they are easier to deal with than lists.
class Program
{
static int iterations = 0;
static void Main(string[] args)
{
string test = "zvutsrqponmlihgfedcba";
test = MergeSort(test);
// test is sorted after 41 iterations
}
static string MergeSort(string input)
{
iterations++;
if (input.Length < 2)
return input;
int pivot = 0;
foreach (char c in input)
pivot += c;
pivot /= input.Length;
string left = "";
string right = "";
foreach (char c in input)
if (c <= (char)pivot)
left += c;
else
right += c;
return string.Concat(new string[] { MergeSort(left), MergeSort(right) });
}
}
Reading on Wikipedia about possible optimizations, I found the following hint: "To make sure at most O(log N) space is used, recurse first into the smaller half of the array, and use a tail call to recurse into the other." But honestly I have no idea how to apply this to my case.
I have some vague memories of tail calls from my IT class, when we were taught about recursion and factorials, but I really can't understand how to apply Wikipedia's advice to my piece of code.
Any help would be much appreciated.
There are numerous problems with this question, starting with the fact that you've implemented a very slow version of QuickSort but asked a question about MergeSort. MergeSort is not typically implemented as a tail recursive algorithm.
Let me ask a better question on your behalf:
How do I turn a recursive algorithm into a tail-recursive algorithm?
Let me sketch out a simpler tail-recursive transformation and then you can work out how to apply that to your sort, should you decide that doing so is a good idea.
Suppose you have the following recursive algorithm:
static int Count(Tree tree)
{
if (tree.IsEmpty)
return 0;
return 1 + Count(tree.Left) + Count(tree.Right);
}
Let's break that down into more steps using the following somewhat bizarre transformation:
static int Count(Tree tree)
{
int total = 0;
Tree current = tree;
if (current.IsEmpty)
return 0;
total += 1;
int countLeft = Count(current.Left);
total += countLeft;
current = current.Right;
int countRight = Count(current);
total += countRight;
return total;
}
Notice that this is exactly the same program as before, just more verbose. Of course you would not write the program in such a verbose manner, but it will help us make it tail recursive.
The point of tail recursion is to turn a recursive call into a goto. We can do that like this:
static int Count(Tree tree)
{
int total = 0;
Tree current = tree;
Restart:
if (current.IsEmpty)
return total;
int countLeft = Count(current.Left);
total += 1;
total += countLeft;
current = current.Right;
goto Restart;
}
See what we've done there? Instead of recursing, we reset the current reference to the thing that would have been recursed on, and go back to the start, while maintaining the state of the accumulator.
Now is it clear how to do the same thing to the QuickSort algorithm?
This looks like a less-than-optimal variant of QuickSort, not a MergeSort. You are missing a C# equivalent of this part:
function merge(left, right)
var list result
while length(left) > 0 or length(right) > 0
if length(left) > 0 and length(right) > 0
if first(left) <= first(right)
append first(left) to result
left = rest(left)
else
append first(right) to result
right = rest(right)
else if length(left) > 0
append first(left) to result
left = rest(left)
else if length(right) > 0
append first(right) to result
right = rest(right)
end while
return result
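For reference, here is what that merge step might look like in C#, kept on strings to match the question's representation. This is a sketch (the method name is mine), and a real MergeSort would also need to split the input into halves by position rather than by pivot value:
static string Merge(string left, string right)
{
    var result = new System.Text.StringBuilder(left.Length + right.Length);
    int l = 0, r = 0;
    // Repeatedly take the smaller head of the two sorted halves.
    while (l < left.Length && r < right.Length)
    {
        if (left[l] <= right[r])
            result.Append(left[l++]);
        else
            result.Append(right[r++]);
    }
    // One side is exhausted; the remainder of the other is already sorted.
    result.Append(left, l, left.Length - l);
    result.Append(right, r, right.Length - r);
    return result.ToString();
}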
I want to do some performance measuring of a particular method, but I'd like to average the time it takes to complete. (This is a C# Winforms application, but this question could well apply to other frameworks.)
I have a Stopwatch which I reset at the start of the method and stop at the end. I'd like to store the last 10 values in a list or array. Each new value added should push the oldest value off the list.
Periodically I will call another method which will average all stored values.
Am I correct in thinking that this construct is a circular buffer?
How can I create such a buffer with optimal performance? Right now I have the following:
List<long> PerfTimes = new List<long>(10);
// ...
private void DoStuff()
{
MyStopWatch.Restart();
// ...
MyStopWatch.Stop();
PerfTimes.Add(MyStopWatch.ElapsedMilliseconds);
if (PerfTimes.Count > 10) PerfTimes.RemoveAt(0);
}
This seems inefficient somehow, but perhaps it's not.
Suggestions?
You could create a custom collection:
using System.Collections;
using System.Collections.Generic;

class SlidingBuffer<T> : IEnumerable<T>
{
private readonly Queue<T> _queue;
private readonly int _maxCount;
public SlidingBuffer(int maxCount)
{
_maxCount = maxCount;
_queue = new Queue<T>(maxCount);
}
public void Add(T item)
{
if (_queue.Count == _maxCount)
_queue.Dequeue();
_queue.Enqueue(item);
}
public IEnumerator<T> GetEnumerator()
{
return _queue.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
Your current solution works, but it's inefficient, because removing the first item of a List<T> is expensive.
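For the question's timing scenario, usage could look like this (a sketch; Average() comes from LINQ, which works because the buffer is enumerable):
var perfTimes = new SlidingBuffer<long>(10);
// ... after stopping the stopwatch in DoStuff():
perfTimes.Add(MyStopWatch.ElapsedMilliseconds);
// Periodically, elsewhere:
double average = perfTimes.Average(); // needs using System.Linq;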
private int ct = 0;
private long[] times = new long[10];
void DoStuff ()
{
...
times[ct] = MyStopWatch.ElapsedMilliseconds;
ct = (ct + 1) % times.Length; // Wrap back around to 0 when we reach the end.
}
Here is a simple circular structure.
This requires none of the array copying or garbage collection of linked list nodes that the other solutions have.
For optimal performance, you can probably just use an array of longs rather than a list.
We had a similar requirement at one point to implement a download time estimator, and we used a circular buffer to store the speed over each of the last N seconds.
We weren't interested in how fast the download was over the entire time, just roughly how long it was expected to take based on recent activity but not so recent that the figures would be jumping all over the place (such as if we just used the last second to calculate it).
The reason we weren't interested in the entire time frame was that a download could do 1M/s for half an hour and then switch up to 10M/s for the next ten minutes. That first half hour will drag down the average speed quite severely, despite the fact that you're now downloading quite fast.
We created a circular buffer with each cell holding the amount downloaded in a 1-second period. The circular buffer size was 300, allowing for 5 minutes of historical data, and every cell was initialised to zero. In your case, you would only need ten cells.
We also maintained a total (the sum of all entries in the buffer, so also initially zero) and the count (initially zero, obviously).
Every second, we would figure out how much data had been downloaded since the last second and then:
subtract the current cell from the total.
put the current figure into that cell and advance the cell pointer.
add that current figure to the total.
increase the count if it wasn't already 300.
update the figure displayed to the user, based on total / count.
Basically, in pseudo-code:
def init (sz):
buffer = new int[sz]
for i = 0 to sz - 1:
buffer[i] = 0
total = 0
count = 0
index = 0
maxsz = sz
def update (kbps):
total = total - buffer[index] + kbps # Adjust sum based on deleted/inserted values.
buffer[index] = kbps # Insert new value.
index = (index + 1) % maxsz # Update pointer.
if count < maxsz: # Update count.
count = count + 1
return total / count # Return average.
That should be easily adaptable to your own requirements. The running total is a nice way to "cache" information that can make your code even faster: if you need the sum or average, you work it out only when the data changes, with the minimal necessary calculations.
The alternative would be a function that adds up all ten numbers when requested, which would be slower than the single subtract/add performed when loading another value into the buffer.
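Translated to C# for the question's case of ten timings, a sketch might look like this (the class name is mine):
class RunningAverage
{
    private readonly long[] buffer;   // one slot per sample
    private long total;               // cached sum of all live samples
    private int index;                // next slot to overwrite
    private int count;                // how many slots are filled so far

    public RunningAverage(int size) { buffer = new long[size]; }

    public void Add(long value)
    {
        total -= buffer[index];             // drop the value being overwritten
        buffer[index] = value;              // insert the new value
        total += value;
        index = (index + 1) % buffer.Length;
        if (count < buffer.Length) count++;
    }

    public double Average => count == 0 ? 0.0 : (double)total / count;
}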
You may want to look at using the Queue data structure instead. You could use a simple linear List, but it is wholly inefficient. A circular array could be used, but then you must manage the wrap-around index yourself. Therefore, I suggest you go with the Queue.
I needed to keep the last 5 scores in an array, and I came up with this simple solution.
Hope it will help someone.
void UpdateScoreRecords(int _latestScore){
latestScore = _latestScore;
for (int cnt = 0; cnt < scoreRecords.Length; cnt++) {
if (cnt == scoreRecords.Length - 1) {
scoreRecords [cnt] = latestScore;
} else {
scoreRecords [cnt] = scoreRecords [cnt+1];
}
}
}
Seems okay to me. What about using a LinkedList instead? When using a List, if you remove the first item, all of the other items have to be bumped back one item. With a LinkedList, you can add or remove items anywhere in the list at very little cost. However, I don't know how much difference this would make, since we're only talking about ten items.
The trade-off of a linked list is that you can't efficiently access random elements of the list, because the linked list must essentially "walk" along the list, passing each item, until it gets to the one you need. But for sequential access, linked lists are fine.
For Java, it could be done this way:
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Queue;
public class SlidingBuffer<T> implements Iterable<T>{
private Queue<T> _queue;
private int _maxCount;
public SlidingBuffer(int maxCount) {
_maxCount = maxCount;
_queue = new LinkedList<T>();
}
public void Add(T item) {
if (_queue.size() == _maxCount)
_queue.remove();
_queue.add(item);
}
public Queue<T> getQueue() {
return _queue;
}
public Iterator<T> iterator() {
return _queue.iterator();
}
}
It could be used like this:
public class ListT {
public static void main(String[] args) {
start();
}
private static void start() {
SlidingBuffer<String> sb = new SlidingBuffer<>(5);
sb.Add("Array1");
sb.Add("Array2");
sb.Add("Array3");
sb.Add("Array4");
sb.Add("Array5");
sb.Add("Array6");
sb.Add("Array7");
sb.Add("Array8");
sb.Add("Array9");
//Test printout
for (String s: sb) {
System.out.println(s);
}
}
}
The result is
Array5
Array6
Array7
Array8
Array9
Years after the latest answer, I stumbled on this question while looking for the same solution. I ended up with a combination of the above answers, especially the cycling index from agent-j and the use of a queue from Thomas Levesque.
using System.Collections;
using System.Collections.Generic;

public class SlidingBuffer<T> : IEnumerable<T>
{
protected T[] items;
protected int index = -1;
protected bool hasCycled = false;
public SlidingBuffer(int windowSize)
{
items = new T[windowSize];
}
public void Add(T item)
{
index++;
if (index >= items.Length) {
hasCycled = true;
index %= items.Length;
}
items[index] = item;
}
public IEnumerator<T> GetEnumerator()
{
if (index == -1)
yield break;
for (int i = index; i > -1; i--)
{
yield return items[i];
}
if (hasCycled)
{
for (int i = items.Length-1; i > index; i--)
{
yield return items[i];
}
}
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
I had to forgo the very elegant one-liner from agent-j (ct = (ct + 1) % times.Length;)
because I needed to detect when we circled back (through hasCycled) to get a well-behaved enumerator. Note that the enumerator returns values from most recent to oldest.
I have a list of constant numbers, and I need to find the number closest to x in that list. Any ideas on how to implement this algorithm?
Well, you cannot do this faster than O(N) because you have to check all numbers to be sure you have the closest one. That said, why not use a simple variation on finding the minimum, looking for the one with the minimum absolute difference with x?
If you can say the list is ordered from the beginning (and it allows random-access, like an array), then a better approach is to use a binary search. When you end the search at index i (without finding x), just pick the best out of that element and its neighbors.
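As a sketch of that sorted case, assuming an int array (the method name is mine; Array.BinarySearch returns the bitwise complement of the insertion point when the value is absent):
static int FindClosestSorted(int[] sorted, int x)
{
    int i = Array.BinarySearch(sorted, x);
    if (i >= 0) return sorted[i];                             // exact match
    i = ~i;                                                   // insertion point
    if (i == 0) return sorted[0];                             // x is below all elements
    if (i == sorted.Length) return sorted[sorted.Length - 1]; // x is above all elements
    int left = sorted[i - 1], right = sorted[i];
    return (x - left) <= (right - x) ? left : right;          // pick the nearer neighbor
}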
I suppose that the array is unordered. (If it's ordered, it can be faster.)
I think the simplest and fastest method is the linear algorithm for finding the minimum or maximum, except that instead of comparing values you compare the absolute value of the difference between each element and the needle.
In C++ (I can't write C#, but it will be similar) the code can look like this:
// array of numbers is haystack
// length is length of array
// needle is number which you are looking for ( or compare with )
int closest = haystack[0];
for ( int i = 0; i < length; ++i ) {
if ( abs( haystack[ i ] - needle ) < abs( closest - needle ) ) closest = haystack[i];
}
return closest;
In general people on this site won't do your homework for you. Since you didn't post code I won't post code either. However, here's one possible approach.
Loop through the list, subtracting the number in the list from x. Take the absolute value of this difference and compare it to the best previous result you've gotten and, if the current difference is less than the best previous result, save the current number from the list. At the end of the loop you'll have your answer.
private int? FindClosest(IEnumerable<int> numbers, int x)
{
return
(from number in numbers
let difference = Math.Abs(number - x)
orderby difference, Math.Abs(number), number descending
select (int?) number)
.FirstOrDefault();
}
Null means there was no closest number. If there are two numbers with the same difference, it will choose the one closest to zero. If two numbers are the same distance from zero, the positive number will be chosen.
Edit in response to Eric's comment:
Here is a version which has the same semantics, but uses the Min operator. It requires an implementation of IComparable<> so we can use Min while preserving the number that goes with each distance. I also made it an extension method for ease-of-use:
public static int? FindClosestTo(this IEnumerable<int> numbers, int targetNumber)
{
var minimumDistance = numbers
.Select(number => new NumberDistance(targetNumber, number))
.Min();
return minimumDistance == null ? (int?) null : minimumDistance.Number;
}
private class NumberDistance : IComparable<NumberDistance>
{
internal NumberDistance(int targetNumber, int number)
{
this.Number = number;
this.Distance = Math.Abs(targetNumber - number);
}
internal int Number { get; private set; }
internal int Distance { get; private set; }
public int CompareTo(NumberDistance other)
{
var comparison = this.Distance.CompareTo(other.Distance);
if(comparison == 0)
{
// When they have the same distance, pick the number closest to zero
comparison = Math.Abs(this.Number).CompareTo(Math.Abs(other.Number));
if(comparison == 0)
{
// When they are the same distance from zero, pick the positive number
comparison = this.Number.CompareTo(other.Number);
}
}
return comparison;
}
}
It can be done using a SortedList:
Blog post on finding closest number
If the complexity you're looking for counts only the search, it is O(log(n)); building the list will cost O(n*log(n)).
If you're going to insert items into the list far more often than you query it for the closest number, the better choice is a plain List with the naive algorithm for the closest-number query. Each search will cost O(n), but the time to insert drops to O(1).
General complexity: if the collection has n numbers and is searched q times:
List: O(n + q*n)
SortedList: O(n*log(n) + q*log(n))
Meaning, beyond some q the SortedList provides better complexity.
Being lazy, I have not checked this, but shouldn't this work?
private int FindClosest(IEnumerable<int> numbers, int x)
{
return
numbers.Aggregate((r,n) => Math.Abs(r-x) > Math.Abs(n-x) ? n
: Math.Abs(r-x) < Math.Abs(n-x) ? r
: r < x ? n : r);
}
Haskell:
import Data.List (minimumBy)
import Data.Ord (comparing)
findClosest :: (Num a, Ord a) => a -> [a] -> Maybe a
findClosest _ [] = Nothing
findClosest n xs = Just $ minimumBy (comparing $ abs . subtract n) xs -- minimize |x - n|
Performance-wise, custom code will be more useful.
List<int> results; // assumed to be initialized and populated elsewhere
int targetNumber = 0;
int nearestValue=0;
if (results.Any(ab => ab == targetNumber ))
{
nearestValue= results.FirstOrDefault<int>(i => i == targetNumber );
}
else
{
int greaterThanTarget = 0;
int lessThanTarget = 0;
if (results.Any(ab => ab > targetNumber ))
{
greaterThanTarget = results.Where<int>(i => i > targetNumber ).Min();
}
if (results.Any(ab => ab < targetNumber ))
{
lessThanTarget = results.Where<int>(i => i < targetNumber ).Max();
}
if (lessThanTarget == 0 )
{
nearestValue= greaterThanTarget;
}
else if (greaterThanTarget == 0)
{
nearestValue= lessThanTarget;
}
else if (targetNumber - lessThanTarget < greaterThanTarget - targetNumber )
{
nearestValue= lessThanTarget;
}
else
{
nearestValue= greaterThanTarget;
}
}