Sorting numbers using maximum spread - c#

I want to sort a list of integers in such a way that they end up being spread out as much as possible. Assuming base 8, the order of items between 1 and 7 ought to be: {4, 6, 2, 7, 1, 5, 3}.
There is a fair amount of ambiguity of course, as both 6 and 2 are equally far away from 4, 0 and 8, so the specific ordering of 6 and 2 is irrelevant. What I'm trying to achieve is to first pick the number furthest away from 0 and base, then pick the number furthest away from 0, base and first number, etc. Any multiple of the base will never occur so I don't care how that is handled.
I can manually design the sort order for any given base, but I need this to work for any base >= 2. Is there a clever/fast way to compute this or do I need to lazily build the sorting mapping tables and cache them for future use?
int SortOrder(int radix, int value)
{
    int offset = value % radix;
    int[] table = {int.MinValue, 4, 2, 6, 0, 5, 1, 3}; // Hand-crafted for base-8
    return table[offset];
}

This is specifically not an answer to the question since it doesn't attempt to find the answer quickly. Rather it builds a dictionary of cached sorting values for each radix.
#region sorting logic
/// <summary>
/// Maintains a collection of sorting maps for all used number bases.
/// </summary>
private static readonly Dictionary<int, int[]> _sortingTable = new Dictionary<int, int[]>();
private static readonly object _sortingLock = new object();

/// <summary>
/// Compute the sorting key for a given multiple.
/// </summary>
/// <param name="radix">Radix or base.</param>
/// <param name="multiple">Multiple.</param>
/// <returns>Sorting key.</returns>
public static int ComputeSortingKey(int radix, long multiple)
{
    if (radix < 2)
        throw new ArgumentException("Radix may not be less than 2.");
    if (multiple == 0)
        return int.MinValue; // multiple=0 always needs to be sorted first, so pick the smallest possible key.

    int[] map;
    // Dictionary is not safe for concurrent reads and writes, so the lookup happens inside the lock as well.
    lock (_sortingLock)
    {
        if (!_sortingTable.TryGetValue(radix, out map))
        {
            map = new int[radix];
            map[0] = -1; // Multiples of the radix are sorted first.
            int key = 0;
            HashSet<int> occupancy = new HashSet<int> { 0, radix };
            HashSet<int> collection = new HashSet<int>(1.ArrayTo(radix)); // (ArrayTo is an extension method in this project)
            while (collection.Count > 0)
            {
                // Pick the remaining value whose nearest occupied neighbour is furthest away.
                int maxValue = 0;
                int maxDistance = 0;
                foreach (int value in collection)
                {
                    int distance = int.MaxValue;
                    foreach (int existingValue in occupancy)
                        distance = Math.Min(distance, Math.Abs(existingValue - value));
                    if (distance > maxDistance)
                    {
                        maxDistance = distance;
                        maxValue = value;
                    }
                }
                collection.Remove(maxValue);
                occupancy.Add(maxValue);
                map[maxValue] = key++;
            }
            _sortingTable.Add(radix, map);
        }
    }

    // C# yields a negative remainder for negative operands, so shift it into [0, radix).
    long offset = multiple % radix;
    if (offset < 0)
        offset += radix;
    return map[(int)offset];
}
#endregion
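As an alternative sketch (not the method used in the answer above), the same kind of greedy order can be produced without comparing every remaining value against every occupied one. The value furthest from all of its chosen neighbours is always the midpoint of the widest remaining gap, so it is enough to keep the gaps ordered by width and bisect the widest one each time. The names `SpreadOrder` and `Compute` are made up for illustration, and ties are broken arbitrarily, so equally distant values may come out in a different order than in the hand-crafted table.

```csharp
using System;
using System.Collections.Generic;

public static class SpreadOrder
{
    // Returns the values 1..radix-1, each chosen as far as possible
    // from 0, radix and all previously chosen values.
    public static List<int> Compute(int radix)
    {
        if (radix < 2)
            throw new ArgumentOutOfRangeException(nameof(radix));

        var order = new List<int>(radix - 1);
        // Each entry is an open gap (Start, Start + Length); the set keeps them sorted by length.
        var gaps = new SortedSet<(int Length, int Start)> { (radix, 0) };
        while (gaps.Count > 0)
        {
            var gap = gaps.Max;            // widest remaining gap
            gaps.Remove(gap);
            if (gap.Length < 2)
                continue;                  // no integer lies strictly inside this gap

            int mid = gap.Start + gap.Length / 2;
            order.Add(mid);
            gaps.Add((mid - gap.Start, gap.Start));           // left half
            gaps.Add((gap.Start + gap.Length - mid, mid));    // right half
        }
        return order;
    }
}
```

For base 8 this yields 4, 6, 2, 7, 5, 3, 1. A sort-key table like the cached one can then be filled by setting map[order[i]] = i.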

My original answer was to find the maximum delta. To work your way from out to in, use the same comparison but different selects:
List<double> answer = new List<double>();
List<double> doub = new List<double>() { 0, -1, 2, 3, 4, -5, 7 }; // SORT this list for sorted results!
List<double> lowerHalf = new List<double>();
List<double> upperHalf = new List<double>();
for (int i = 0; i < doub.Count; i++)
{
    if (i <= (int)Math.Floor((double)doub.Count / 2))
        lowerHalf.Add(doub[i]);
    else
        upperHalf.Add(doub[i]);
}
if (upperHalf.Count < lowerHalf.Count)
{
    upperHalf.Insert(0, lowerHalf[lowerHalf.Count - 1]);
}
// Optional: replace the shared middle item with the "actual" median (use Math.Round or Math.Floor/Ceiling if necessary):
// if (upperHalf[0] == lowerHalf[lowerHalf.Count - 1]) { double median = (lowerHalf[lowerHalf.Count - 1] + upperHalf[1]) / 2; lowerHalf[lowerHalf.Count - 1] = median; upperHalf[0] = median; }
for (int i = 0; i < lowerHalf.Count; i++)
{
    double deltas = Math.Sqrt(Math.Pow(upperHalf[upperHalf.Count - (i + 1)] - lowerHalf[i], 2));
    answer.Add(deltas);
    Console.WriteLine("The answer for {1}, {2} is: {0}", deltas, lowerHalf[i], upperHalf[upperHalf.Count - (i + 1)]);
}
Console.ReadLine();
This will provide:
The answer for 0, 7 is: 7
The answer for -1, -5 is: 4
The answer for 2, 4 is: 2
The answer for 3, 3 is: 0
NOTE that in the event of an odd number of items in the original range, this method uses the same item for both upper and lower lists. I've added a (commented-out) line to use the "actual" median for your benefit.
To get rid of duplicates, use a HashSet, Union, or Distinct().
Original answer (to find the maximum delta):
You can use math in your Linq, like:
List<double> doub = new List<double>() { 0, 1, 2, 3, 4, 5, 7 };
double deltas = doub.Select(p => p - doub.First()).OrderBy(p => p).Last();
Console.WriteLine("The answer is: {0}",deltas);
Console.ReadLine();
If your values go negative, you'd need to use squares:
double deltas = Math.Sqrt( doub.Select(p => Math.Pow(p - doub.First(), 2)).OrderBy(p => p).Last());
or Math.Abs or a test to see which is larger - but this should give you an idea on how to get started. If the numbers aren't in order in the original list, you can call an orderby before the select, as well.
Literally:
List<double> doub = new List<double>() { 0, 1, 2, 3, 4, 5, 7 };
double deltas = Math.Sqrt( doub.Select(p => Math.Pow(p - doub.First(), 2)).OrderBy(p => p).Last());
Console.WriteLine("The answer is: {0}",deltas);
Console.ReadLine();
produces
The answer is: 7
Moving forward to sorting the list, use:
List<double> answer = new List<double>();
List<double> doub = new List<double>() { 0, 1, 2, 3, 4, 5, 7 };
//sort doub if necessary
foreach (double num in doub)
{
double deltas = Math.Sqrt(Math.Pow(doub.Select(p => p - num).OrderBy(p => p).Last(), 2));
answer.Add(deltas);
Console.WriteLine("The answer for {1} is: {0}", deltas,num);
}
Console.ReadLine();
(Again, use another orderby if the list is not in order).
Produces:
The answer for 0 is: 7
The answer for 1 is: 6
The answer for 2 is: 5
The answer for 3 is: 4
The answer for 4 is: 3
The answer for 5 is: 2
The answer for 7 is: 0
The squares/square root help us change signs and deal with negatives - so
List<double> doub = new List<double>() { 0, -1, 2, 3, 4, -5, 7 };
Produces:
The answer for 0 is: 7
The answer for -1 is: 8
The answer for 2 is: 5
The answer for 3 is: 4
The answer for 4 is: 3
The answer for -5 is: 12
The answer for 7 is: 0
(which aren't in order because I failed to sort the list on either the inbound or outbound sides).
After running, the list "answer" will contain the results. Once it is sorted, the lowest delta can be accessed by answer.First() and the highest by answer.Last(). Similar deltas will exist for different numbers the same number of units apart; you could use a HashSet conversion in the formula if you want to eradicate duplicate deltas. If you need help with that, please let me know.
If you need to store the numbers that created the delta as well as the delta itself, they are available to you in the For/Each loop as the Console.WriteLine() indicates.
If you want to work the range-to-median on both sides of the median, it's probably a good idea to split the list and work in pairs. Make 0 to median in one list, and median to endRange in the second list, work out in the first and in in the second. This should get you there, but if you need help getting over that final hump, let me know.
Hope this helps!!

Related

Group (previous) elements of array by number of subsequent negative ones

I'm trying to process (in C#) a large data file with some numeric data. Given an array of integers, how can it be split/grouped, so that previous n elements are grouped if next n (two or more) are negative numbers. For example, in the array below, two or more consecutive negative numbers should be used as a hint to group the same number of previous elements:
0
1
4
-99
-99
-99
1
2
7
9
-99
-99
3
6
8
-99
-99
5
Output:
Group 1:
0
1
4
[3 negative numbers were next, so group previous 3]
Group 2:
1
Group 3:
2
Group 4:
7
9
[2 negative numbers were next, so group previous 2, but keep any prior positive numbers (1, 2) as separate groups]
Group 5:
3
Group 6:
6
8
[2 negative numbers were next, so group previous 2, but keep any prior positive numbers (3) as separate groups]
Group 7:
5
What will be the fastest, most efficient way to process such an array of data into a new grouped collection?
I'd probably try to stream the whole thing, maintaining a buffer of "current non-negative numbers" and a count of negative numbers. Here's some code which appears to work... I'd expect it to be at least pretty efficient. I'd start with that, and if it's not efficient enough for you, look into why.
(If you do want to have the whole array in memory, you don't need to build up the numbersToEmit list, as you can just maintain indexes instead. But given that a single list is reused, I'd expect this to be okay.)
using System;
using System.Linq;
using System.Collections.Generic;
class Program
{
static void Main(string[] args)
{
int[] input = { 0, 1, 4, -99, -99, -99, 1,
2, 7, 9, -99, -99, 3, 6, 8, -99, -99, 5 };
foreach (var group in GroupByNegativeCounts(input))
{
Console.WriteLine(string.Join(", ", group));
}
}
static IEnumerable<int[]> GroupByNegativeCounts(IEnumerable<int> input)
{
List<int> numbersToEmit = new List<int>();
int negativeCount = 0;
foreach (var value in input)
{
// We never emit anything when we see a negative number
if (value < 0)
{
negativeCount++;
}
else
{
// We emit numbers if we've previously seen negative
// numbers, and then we see a non-negative one.
if (negativeCount > 0)
{
int singles = Math.Max(numbersToEmit.Count - negativeCount, 0);
foreach (var single in numbersToEmit.Take(singles))
{
yield return new[] { single };
}
if (singles != numbersToEmit.Count)
{
yield return numbersToEmit.Skip(singles).ToArray();
}
negativeCount = 0;
numbersToEmit.Clear();
}
numbersToEmit.Add(value);
}
}
// Emit anything we've got left at the end.
foreach (var single in numbersToEmit)
{
yield return new[] { single };
}
}
}
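For the case where the whole array is already in memory, the index-based variant mentioned above could look roughly like the sketch below. It is not the answerer's code: the names `NegativeRunGrouping` and `Group` are invented for illustration, and it makes one assumption the streaming version does not, namely that a run of negatives at the very end of the input still groups the numbers before it.

```csharp
using System;
using System.Collections.Generic;

public static class NegativeRunGrouping
{
    public static List<int[]> Group(int[] input)
    {
        var groups = new List<int[]>();
        int i = 0;
        while (i < input.Length)
        {
            // Measure the run of non-negative numbers starting at i.
            int runStart = i;
            while (i < input.Length && input[i] >= 0) i++;
            int positives = i - runStart;

            // Measure the run of negative numbers that follows it.
            int negativeStart = i;
            while (i < input.Length && input[i] < 0) i++;
            int negatives = i - negativeStart;

            // The last `grouped` non-negative numbers form one group;
            // everything before them is emitted one by one.
            int grouped = Math.Min(positives, negatives);
            int singles = positives - grouped;
            for (int k = 0; k < singles; k++)
                groups.Add(new[] { input[runStart + k] });
            if (grouped > 0)
            {
                var group = new int[grouped];
                Array.Copy(input, runStart + singles, group, 0, grouped);
                groups.Add(group);
            }
        }
        return groups;
    }
}
```

Each element is visited once, so the work stays linear in the length of the input, and the only allocations are the output groups themselves.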
Here's my solution, I tried to make it as simple as possible :)
List<int> input = new List<int>() { 0, 1, 4, -99, -99, -99, 1, 2, 7, 9, -99, -99, 3, 6, 8, -99, -99, 5 };
List<int> reverse_input = input.Reverse<int>().ToList(); //reverse list for easier reading
List<List<int>> grouped_input = new List<List<int>>(); //output variable
List<int> leading_positives;
int leading_negative_count;
while (reverse_input.Any())
{
//Get the amount of leading negatives and remove them from the reversed list
leading_negative_count = reverse_input.TakeWhile(num => num < 0).Count();
reverse_input = reverse_input.Skip(leading_negative_count).ToList();
//get and store leading positives and remove them from the reversed list
leading_positives = reverse_input.TakeWhile(num => num >= 0).ToList();
reverse_input = reverse_input.Skip(leading_positives.Count).ToList();
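//take an amount of positives equal to the amount of previously found negatives and add them as a separate list to the output
//(skipped when there is nothing to group, which would otherwise add an empty list)
List<int> grouped = leading_positives.Take(leading_negative_count).Reverse().ToList();
if (grouped.Any()) grouped_input.Add(grouped);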
//for each remaining positive add it as an individual into the output
leading_positives.Skip(leading_negative_count).ToList().ForEach(num => grouped_input.Add(new List<int>() { num }));
}
//output display
grouped_input.Reverse<List<int>>().ToList().ForEach(lst => Console.WriteLine(string.Join(",", lst)));

How to find intersection of two arrays (Optimal solution)

I am writing an algorithm for the intersection of two arrays A and B, and I want a solution that is optimized for both space and time complexity.
I have written an algorithm and it works fine, but I want to know whether a more optimal solution exists.
What I do is:
(1) Find the smaller of the two arrays.
(2) Allocate a new array whose size equals that of the smaller array.
(3) For each element of the smaller array, compare it with each element of the bigger array. If it exists there, I put it in the third array "C" and break right there (we only need the intersection, so even if the element repeats 100 times afterwards, one occurrence is enough to put it in the third array). At the same time, we also have to check whether the element of the smaller array that is about to be compared against the bigger array already exists in the third array. Example: A=[0,1,1], B=[0,1,2,3].
We start with A's first element; it is present in B, so we save it in C[0]. Then we go to the second element, and C is now [0,1]. In the next step we again have 1 to compare, which we have already compared. So for this situation we check whether the element to be compared already exists in C, and if so we skip it.
(4) We store the found element in C (third array) and print it.
My full working code for it is:
My full working code for it is :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
int[] aar1 = { 0, 1, 1, 7, 2, 6, 3, 9, 11, 2, 2,3,3,3,3,3,1 };
int[] aar2 = { 0, 1, 2, 3, 4, 5, 6, 11, 11, 1, 1, 1, 1 };
int[] arr3 = findIntersection(aar1, aar2);
Console.WriteLine("the array is : " + string.Join(", ", arr3)); // printing arr3 directly would only show "System.Int32[]"
Console.ReadKey();
}
private static int[] findIntersection(int[] aar1, int[] aar2)
{
int[] arr3 = { 0 };
if (aar1.Count() < aar2.Count())
{
int counter = 0;
arr3 = new int[aar1.Count()];
foreach (int var1 in aar1)
{
if (!checkifInThirdArray(var1, arr3))
{
foreach (int var2 in aar2)
{
if (var1 == var2)
{
arr3[counter++] = var1;
break;
}
}
}
}
}
else
{
int counter = 0;
arr3 = new int[aar2.Count()]; // aar2 is the smaller (or equal-sized) array in this branch
foreach (int var2 in aar2)
{
if (!checkifInThirdArray(var2, arr3))
{
foreach (int var1 in aar1)
{
if (var2 == var1)
{
arr3[counter++] = var2;
break;
}
}
}
}
}
return arr3;
}
private static bool checkifInThirdArray(int var1, int[] arr3)
{
bool flag = false;
if (arr3 != null)
{
foreach (int arr in arr3)
{
if (arr == var1) { flag = true; break; }
}
}
return flag;
}
}
}
One space complexity issue I found is the following (I would really appreciate it if you let me know of any others, with a solution):
(1) When I allocate the third array, I size it to the smaller of the two arrays being compared. If the intersection contains far fewer elements, the extra memory is allocated unnecessarily. How can I solve this?
Please note that i don't have to use any inbuilt function like intersection() or any other.
It sounds like your solution is an O(n^2) one in that, for every single element in one array, you may need to process every single element in the other (in the case where the intersection is the null set). You should be aware that C# actually has facilities for finding the intersection of arrays but, should you wish to implement your own, read on.
You would probably be better off sorting both arrays (in-place if allowed, otherwise to a separate collection) and then doing a merge-check of the two to construct another. The sort could be O(n log n) and the merge check would be O(n).
If you're wondering what I mean by merge check, it's simply processing both (sorted) arrays side by side.
If the first element in both matches, you have an intersect point and you should store that value and advance both lists until the next value is different.
If they're different, there's no intersect point and you can advance the array with the lowest value until it changes.
By way of example, here's some code in Python (the ideal pseudo-code language) that implements such a solution. Array a contains all the multiples of three between 0 and 18 inclusive (in arbitrary order and including duplicates), while array b has all the even numbers in that range (again, with some duplicates and ordered "randomly").
a = [0,3,15,3,9,6,12,15,18,6]
b = [10,0,2,12,4,6,18,8,16,10,12,6,14,16]

# Copy and sort (sorted() returns a new list, leaving a and b untouched).
a2 = sorted(a)
b2 = sorted(b)

# Initial pointers and results for merge check.
ap = 0
bp = 0
c = []

# Continue until either array is exhausted.
while ap < len(a2) and bp < len(b2):
    # Check for intersect or which list has lowest value.
    if a2[ap] == b2[bp]:
        # Intersect, save, advance both lists to next number.
        val = a2[ap]
        c.append(val)
        while ap < len(a2) and a2[ap] == val:
            ap += 1
        while bp < len(b2) and b2[bp] == val:
            bp += 1
    elif a2[ap] < b2[bp]:
        # A has smallest, advance it to next number.
        val = a2[ap]
        while ap < len(a2) and a2[ap] == val:
            ap += 1
    else:
        # B has smallest, advance it to next number.
        val = b2[bp]
        while bp < len(b2) and b2[bp] == val:
            bp += 1

print(c)
If you run that, you'll see the intersect list that's formed between the two arrays:
[0, 6, 12, 18]
Maybe I am not understanding you right but why don't you use the following;
int[] aar1 = { 0, 1, 1, 7, 2, 6, 3, 9, 11, 2, 2,3,3,3,3,3,1 };
int[] aar2 = { 0, 1, 2, 3, 4, 5, 6, 11, 11, 1, 1, 1, 1 };
aarResult = aar1.Intersect(aar2).ToArray();
This will result in an array with only the space needed and intersects the arrays. You can also initialize the aarResult as follows to get the minimum array size:
int[] aarResult = new int[Math.Min(aar1.Count(), aar2.Count())];
You can use the LINQ Intersect method. It uses hashing and runs in linear O(N+M) time, which is faster than your algorithm:
int[] aar1 = { 0, 1, 1, 7, 2, 6, 3, 9, 11, 2, 2, 3, 3, 3, 3, 3, 1 };
int[] aar2 = { 0, 1, 2, 3, 4, 5, 6, 11, 11, 1, 1, 1, 1 };
int[] result = aar1.Intersect(aar2).ToArray();
It will also solve your unnecessarily allocated items problem, because it will create an array of the exact answer size.
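If the built-in Intersect is off the table, the same hashing idea can be written by hand. The following is only a sketch under that assumption: `ArrayIntersection` and its `Intersect` method are illustrative names, and it still relies on the framework's HashSet and List types rather than raw arrays.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayIntersection
{
    // Returns the distinct values present in both arrays,
    // in the order they first appear in `first`.
    public static int[] Intersect(int[] first, int[] second)
    {
        // Values of `second` that have not been emitted yet.
        var candidates = new HashSet<int>(second);
        var result = new List<int>();
        foreach (int value in first)
        {
            // Remove succeeds only the first time a shared value is seen,
            // so repeated values in `first` are skipped automatically.
            if (candidates.Remove(value))
                result.Add(value);
        }
        // ToArray allocates exactly as many slots as there are matches.
        return result.ToArray();
    }
}
```

Building the set costs O(M) and the scan costs O(N), so the total is O(N+M) expected time, at the price of O(M) extra memory for the set.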

Packing item from set of available packs

Suppose there is an Item that a customer is ordering - in this case it turns out they are ordering 176 (totalNeeded) of this Item.
The database has 5 records associated with this item that this item can be stored in:
{5 pack, 8 pack, 10 pack, 25 pack, 50 pack}.
A rough way of packing this would be:
Sort the array from biggest to smallest.
While (totalPacked < totalNeeded) // 176
{
1. Maintain an <int, int> dictionary which contains Keys of pack id's,
and values of how many needed
2. Add the largest pack, which is not larger than the amount remaining to pack,
increment totalPacked by the pack size
3. If any remainder is left over after the above, add the smallest pack to reduce
waste
e.g., 4 needed, smallest size is 5, so add one 5; one extra item packed
}
Based on the above logic, the outcome would be:
You need: 3 x 50 packs, 1 x 25 pack, 1 x 5 pack
Total Items: 180
Excess = 4 items; 180 - 176
The above is not too difficult to code, I have it working locally. However, it is not truly the best way to pack this item. Note: "best" means, smallest amount of excess.
Thus ... we have an 8 pack available, we need 176. 176 / 8 = 22. Send the customer 22 x 8 packs, they will get exactly what they need. Again, this is even simpler than the pseudo-code I wrote ... see if the total needed is evenly divisible by any of the packs in the array - if so, "at the very least" we know that we can fall back on 22 x 8 packs being exact.
In the case that the number is not divisible by an array value, I am attempting to determine possible way that the array values can be combined to reach at least the number we need (176), and then score the different combinations by # of Packs needed total.
If anyone has some reading that can be done on this topic, or advice of any kind to get me started it would be greatly appreciated.
Thank you
This is a variant of the Subset Sum Problem (Optimization version)
While the problem is NP-Complete, there is a pretty efficient pseudo-polynomial time Dynamic Programming solution to it, by following the recursive formulas:
D(x,i) = false                              if x < 0
D(0,i) = true
D(x,0) = false                              if x != 0
D(x,i) = D(x,i-1) OR D(x-arr[i],i)
The Dynamic Programming Solution will build up a table, where an element D[x][i]==true iff you can use the first i kinds of packs to establish sum x.
Needless to say that D[x][n] == true iff there is a solution with all available packs that sums to x. (where n is the total number of packs you have).
To get the "closest higher number", you just need to create a table of size W+pack[0]-1 (pack[0] being the smallest available pack, W being the sum you are looking for), and choose the value that yields true which is closest to W.
If you wish to give different values to the different pack types, this becomes Knapsack Problem, which is very similar - but uses values instead a simple true/false.
Getting the actual "items" (packs) chosen after is done by going back the table and retracing your steps. This thread and this thread elaborate how to achieve it with more details.
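A compact sketch of that table-based approach is shown below. It departs from the plain true/false table in one respect: each cell stores the fewest packs needed to reach that exact total, which also makes it easy to retrace which packs were used. `PackPlanner`, `Plan` and the array names are hypothetical, and the code assumes every pack size is positive and may be used any number of times.

```csharp
using System;
using System.Collections.Generic;

public static class PackPlanner
{
    // Returns pack size -> number of packs for the smallest reachable
    // total >= totalNeeded, using as few packs as possible for that total.
    public static Dictionary<int, int> Plan(int[] packSizes, int totalNeeded)
    {
        if (packSizes.Length == 0 || totalNeeded < 0)
            throw new ArgumentException("Need at least one pack size and a non-negative total.");
        int smallest = int.MaxValue;
        foreach (int size in packSizes)
        {
            if (size <= 0) throw new ArgumentException("Pack sizes must be positive.");
            smallest = Math.Min(smallest, size);
        }

        // Some multiple of the smallest pack always lands in this range.
        int limit = totalNeeded + smallest - 1;
        const int Unreachable = int.MaxValue;
        var minPacks = new int[limit + 1]; // fewest packs summing exactly to x
        var lastPack = new int[limit + 1]; // pack size added last to reach x
        for (int x = 1; x <= limit; x++)
        {
            minPacks[x] = Unreachable;
            foreach (int size in packSizes)
            {
                if (size <= x && minPacks[x - size] != Unreachable
                    && minPacks[x - size] + 1 < minPacks[x])
                {
                    minPacks[x] = minPacks[x - size] + 1;
                    lastPack[x] = size;
                }
            }
        }

        // Closest reachable total at or above the requested amount.
        int total = totalNeeded;
        while (minPacks[total] == Unreachable) total++;

        // Retrace the steps to recover the packs.
        var plan = new Dictionary<int, int>();
        for (int x = total; x > 0; x -= lastPack[x])
        {
            plan.TryGetValue(lastPack[x], out int count);
            plan[lastPack[x]] = count + 1;
        }
        return plan;
    }
}
```

For the pack sizes { 5, 8, 10, 25, 50 } and 176 items this returns three 50-packs, two 8-packs and one 10-pack, with no excess. The table has only totalNeeded + smallest entries, so the running time is proportional to that number times the number of pack sizes.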
If this example problem is truly representative of the actual problem you are solving, it is small enough to try every combination with brute force using recursion. For example, I found exactly 6,681 unique packings that are locally maximized, with a total of 205 that have exactly 176 total items. The (unique) solution with minimum number of packs is 6, and that is { 2-8, 1-10, 3-50 }. Total runtime for the algorithm was 8 ms.
public static List<int[]> GeneratePackings(int[] packSizes, int totalNeeded)
{
var packings = GeneratePackingsInternal(packSizes, 0, new int[packSizes.Length], totalNeeded);
return packings;
}
private static List<int[]> GeneratePackingsInternal(int[] packSizes, int packSizeIndex, int[] packCounts, int totalNeeded)
{
if (packSizeIndex >= packSizes.Length) return new List<int[]>();
var currentPackSize = packSizes[packSizeIndex];
var currentPacks = new List<int[]>();
if (packSizeIndex + 1 == packSizes.Length) {
var lastOptimal = totalNeeded / currentPackSize;
packCounts[packSizeIndex] = lastOptimal;
return new List<int[]> { packCounts };
}
for (var i = 0; i * currentPackSize <= totalNeeded; i++) {
packCounts[packSizeIndex] = i;
currentPacks.AddRange(GeneratePackingsInternal(packSizes, packSizeIndex + 1, (int[])packCounts.Clone(), totalNeeded - i * currentPackSize));
}
return currentPacks;
}
The algorithm is pretty straightforward
Loop through every combination of number of 5-packs.
Loop through every combination of number of 8-packs, from remaining amount after deducting specified number of 5-packs.
etc to 50-packs. For 50-pack counts, directly divide the remainder.
Collect all combinations together recursively (so it dynamically handles any set of pack sizes).
Finally, once all the combinations are found, it is pretty easy to find all packs with least waste and least number of packages:
var packSizes = new int[] { 5, 8, 10, 25, 50 };
var totalNeeded = 176;
var result = GeneratePackings(packSizes, totalNeeded);
Console.WriteLine(result.Count());
var maximal = result.Where (r => r.Zip(packSizes, (a, b) => a * b).Sum() == totalNeeded).ToList();
var min = maximal.Min(m => m.Sum());
var minPacks = maximal.Where (m => m.Sum() == min).ToList();
foreach (var m in minPacks) {
Console.WriteLine("{ " + string.Join(", ", m) + " }");
}
Here is a working example: https://ideone.com/zkCUYZ
This partial solution is specifically for your pack sizes of 5, 8, 10, 25, 50. And only for order sizes at least 40 large. There are a few gaps at smaller sizes that you'll have to fill another way (specifically at values like 6, 7, 22, 27 etc).
Clearly, the only way to get any number that isn't a multiple of 5 is to use the 8 packs.
Determine the number of 8-packs needed with modular arithmetic. Since 8 % 5 == 3, each additional 8-pack shifts the remainder modulo 5 by 3, so orders with a remainder of 0, 1, 2, 3, 4 need 0, 2, 4, 1, 3 eight-packs respectively. Something like
public static int GetNumberOf8Packs(int orderCount) {
int remainder = (orderCount % 5);
return ((remainder % 3) * 5 + remainder) / 3;
}
In your example of 176. 176 % 5 == 1 which means you'll need 2 8-packs.
Subtract the value of the 8-packs to get the number of multiples of 5 you need to fill. At this point you still need to deliver 176 - 16 == 160.
Fill all the 50-packs you can by integer dividing. Keep track of the leftovers.
Now just fit the 5, 10, 25 packs as needed. Obviously use the larger values first.
All together your code might look like this:
public static Order MakeOrder(int orderSize)
{
if (orderSize < 40)
{
throw new NotImplementedException("You'll have to write this part, since the modular arithmetic for 8-packs starts working at 40.");
}
var order = new Order();
order.num8 = GetNumberOf8Packs(orderSize);
int multipleOf5 = orderSize - (order.num8 * 8);
order.num50 = multipleOf5 / 50;
int remainderFrom50 = multipleOf5 % 50;
while (remainderFrom50 > 0)
{
if (remainderFrom50 >= 25)
{
order.num25++;
remainderFrom50 -= 25;
}
else if (remainderFrom50 >= 10)
{
order.num10++;
remainderFrom50 -= 10;
}
else if (remainderFrom50 >= 5)
{
order.num5++;
remainderFrom50 -= 5;
}
}
return order;
}
A DotNetFiddle

Top 5 values from three given arrays

Recently I faced a question in C#. The question is:
There are three int arrays:
Array1={88,65,09,888,87}
Array2={1,49,921,13,33}
Array3={22,44,66,88,110}
Now I have to get an array of the highest 5 values from all three arrays. What is the most optimized way of doing this in C#?
The way I can think of is to take an array of size 15, add the elements of all three arrays, sort it, and take the last 5.
An easy way with LINQ:
int[] top5 = array1.Concat(array2).Concat(array3).OrderByDescending(i => i).Take(5).ToArray();
An optimal way:
List<int> highests = new List<int>(); // Keep the current top 5 sorted
// Traverse each array. No need to put them together in an int[][]..it's just for simplicity
foreach (int[] array in new int[][] { array1, array2, array3 }) {
foreach (int i in array) {
int index = highests.BinarySearch(i); // where should i be?
if (highests.Count < 5) { // if not 5 yet, add anyway
if (index < 0) {
highests.Insert(~index, i);
} else { //add (duplicate)
highests.Insert(index, i);
}
}
else if (index < 0) { // not in top-5 yet, add
highests.Insert(~index, i);
highests.RemoveAt(0);
} else if (index > 0) { // already in top-5, add (duplicate)
highests.Insert(index, i);
highests.RemoveAt(0);
}
}
}
Keep a sorted list of the top-5 and traverse each array just once.
You may even check the lowest of the top-5 each time, avoiding the BinarySearch:
List<int> highests = new List<int>();
foreach (int[] array in new int[][] { array1, array2, array3 }) {
    foreach (int i in array) {
        if (highests.Count == 5) {
            if (i <= highests[0]) continue; // not larger than the lowest top-5: skip without searching
            highests.RemoveAt(0);           // make room by dropping the lowest top-5
        }
        int index = highests.BinarySearch(i);
        highests.Insert(index < 0 ? ~index : index, i); // ~index is the insertion point when i is not present
    }
}
The most optimized way for a fixed K=5 is going through all arrays five times, picking the highest element not taken so far on each pass. You need to mark the element that you take in order to skip it on subsequent passes. This has the complexity of O(N1+N2+N3) (you go through all N1+N2+N3 elements five times), which is as fast as it can get.
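A minimal sketch of this multi-pass idea follows. The names `TopKByPasses` and `Largest` are made up for illustration, and the "mark" is kept in a separate bool array per input so the inputs themselves stay unmodified.

```csharp
using System;
using System.Collections.Generic;

public static class TopKByPasses
{
    // Returns the k largest values across all arrays, largest first.
    public static int[] Largest(int k, params int[][] arrays)
    {
        // taken[a][i] is true once arrays[a][i] has been picked.
        var taken = new bool[arrays.Length][];
        for (int a = 0; a < arrays.Length; a++)
            taken[a] = new bool[arrays[a].Length];

        var result = new List<int>();
        for (int pass = 0; pass < k; pass++)
        {
            int bestArray = -1, bestIndex = -1;
            for (int a = 0; a < arrays.Length; a++)
            {
                for (int i = 0; i < arrays[a].Length; i++)
                {
                    bool better = bestArray < 0 || arrays[a][i] > arrays[bestArray][bestIndex];
                    if (!taken[a][i] && better)
                    {
                        bestArray = a;
                        bestIndex = i;
                    }
                }
            }
            if (bestArray < 0)
                break; // fewer than k elements in total
            taken[bestArray][bestIndex] = true;
            result.Add(arrays[bestArray][bestIndex]);
        }
        return result.ToArray();
    }
}
```

With the three arrays from the question, Largest(5, array1, array2, array3) gives 921, 888, 110, 88, 88.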
You can combine the arrays using LINQ, sort them, then reverse.
int[] a1 = new int[] { 1, 10, 2, 9 };
int[] a2 = new int[] { 3, 8, 4, 7 };
int[] a3 = new int[] { 2, 9, 8, 4 };
int[] a4 = a1.Concat(a2).Concat(a3).ToArray();
Array.Sort(a4);
Array.Reverse(a4);
for (int i = 0; i < 5; i++)
{
Console.WriteLine(a4[i].ToString());
}
Console.ReadLine();
Prints: 10, 9, 9, 8, 8 from the sample I provided as input for the arrays.
Maybe you could have an array of 5 elements which would be the "max values" array.
Initially fill it with the first 5 values, which in your case would just be the first array. Then loop through the rest of the values. For each value, check it against the 5 max values from least to greatest. If you find the current value from the main list is greater than the value in the max values array, insert it above that element in the array, which would push the last element out. At the end you should have an array of the 5 max values.
For three arrays of length N1,N2,N3, the fastest way should be combining the 3 arrays, and then finding the (N1+N2+N3-4)th order statistic using modified quick sort.
In the resultant array, the elements with indices (N1+N2+N3-5) to the maximum (N1+N2+N3-1) should be your 5 largest. You can also sort them later.
The time complexity of this approach is O(N1+N2+N3) on average.
Here are the two ways for doing this task. The first one is using only basic types. This is the most efficient way, with no extra loop, no extra comparison, and no extra memory consumption. You just pass the index of elements that need to be matched with another one and calculate which is the next index to be matched for each given array.
First Way -
http://www.dotnetbull.com/2013/09/find-max-top-5-number-from-3-sorted-array.html
Second Way -
int[] Array1 = { 09, 65, 87, 89, 888 };
int[] Array2 = { 1, 13, 33, 49, 921 };
int[] Array3 = { 22, 44, 66, 88, 110 };
int [] MergeArr = Array1.Concat(Array2).Concat(Array3).ToArray();
Array.Sort(MergeArr);
int [] Top5Number = MergeArr.Reverse().Take(5).ToArray();
Taken From -
Find max top 5 number from three given sorted array
Short answer: Use a SortedList from Sorted Collection Types in .NET as a min-heap.
Explanation:
From the first array, add 5 elements to this SortedList/min-heap;
Now iterate through all the rest of the elements of arrays:
If an array element is bigger than the smallest element in min-heap then remove the min element and push this array element in the heap;
Else, continue to next array element;
In the end, your min-heap has the 5 biggest elements of all arrays.
Complexity: Takes Log k time to find the minimum when you have a SortedList of k elements. Multiply that by total elements in all arrays because you are going to perform this 'find minimum operation' that many times.
Brings us to overall complexity of O(n * Log k) where n is the total number of elements in all your arrays and k is the number of highest numbers you want.
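The steps above can be sketched with a real binary heap. Note that this swaps in PriorityQueue<TElement, TPriority> for the SortedList named in the answer, because SortedList<TKey, TValue> rejects duplicate keys and the sample arrays contain 88 twice. It therefore assumes .NET 6 or later; `TopKByHeap` and `Largest` are illustrative names.

```csharp
using System;
using System.Collections.Generic;

public static class TopKByHeap
{
    // Returns the k largest values, largest first.
    public static int[] Largest(int k, IEnumerable<int> values)
    {
        // Min-heap: the element with the smallest priority is dequeued first.
        var heap = new PriorityQueue<int, int>();
        foreach (int value in values)
        {
            if (heap.Count < k)
            {
                heap.Enqueue(value, value);
            }
            else if (k > 0 && value > heap.Peek())
            {
                // Replace the smallest of the current top k.
                heap.Dequeue();
                heap.Enqueue(value, value);
            }
        }

        // Dequeue yields ascending order, so fill the result from the back.
        var result = new int[heap.Count];
        for (int i = result.Length - 1; i >= 0; i--)
            result[i] = heap.Dequeue();
        return result;
    }
}
```

Each element costs at most one removal and one insertion on a heap of k items, which matches the O(n * Log k) bound given above.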

Efficient way of finding item at index in array with joined count array

I have an object that contains two arrays, the first is a slope array:
double[] Slopes = new double[capacity];
The next is an array containing the counts of various slopes:
int[] Counts = new int[capacity];
The arrays are related, in that when I add a slope to the object, if the last element entered in the slope array is the same slope as the new item, instead of adding it as a new element the count gets incremented.
i.e. If I have slopes 15 15 15 12 4 15 15, I get:
Slopes = { 15, 12, 4, 15 }
Counts = { 3, 1, 1, 2 }
Is there a better way of finding the i_th item in slopes than iterating over the Counts with the index and finding the corresponding index in Slopes?
edit: Not sure if maybe my question wasn't clear. I need to be able to access the i_th Slope that occurred, so from the example the zero indexed i = 3 slope that occurs is 12, the question is whether a more efficient solution exists for finding the corresponding slope in the new structure.
Maybe this will help better understand the question: here is how I get the i_th element now:
public double GetSlope(int index)
{
    int countIndex = 0;
    int countAccum = 0;
    foreach (int count in Counts)
    {
        countAccum += count;
        if (index - countAccum < 0)
        {
            return Slopes[countIndex];
        }
        else
        {
            countIndex++;
        }
    }
    // index lies beyond the last recorded slope
    throw new ArgumentOutOfRangeException("index");
}
I am wondering if there is a more efficient way?
You could use a third array in order to store the first index of a repeated slope
double[] Slopes = new double[capacity];
int[] Counts = new int[capacity];
int[] Indexes = new int[capacity];
With
Slopes = { 15, 12, 4, 15 }
Counts = { 3, 1, 1, 2 }
Indexes = { 0, 3, 4, 5 }
Now you can apply a binary search in Indexes to search for the largest start index which is less than or equal to the one you are looking for.
Instead of having O(n) search performance, you now have O(log(n)).
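A possible shape for that lookup is sketched below, assuming the three arrays are filled as in the example and that the number of buckets actually in use is tracked separately (the arrays are allocated with spare capacity). `SlopeLookup`, `startIndexes` and `bucketCount` are names invented for this illustration.

```csharp
using System;

public static class SlopeLookup
{
    // startIndexes[b] is the position, in the original sequence,
    // of the first slope stored in bucket b.
    public static double GetSlope(double[] slopes, int[] counts, int[] startIndexes,
                                  int bucketCount, int index)
    {
        if (bucketCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(bucketCount));
        int total = startIndexes[bucketCount - 1] + counts[bucketCount - 1];
        if (index < 0 || index >= total)
            throw new ArgumentOutOfRangeException(nameof(index));

        // Search only the used part of the array.
        int pos = Array.BinarySearch(startIndexes, 0, bucketCount, index);
        // Exact hit: index is the first element of bucket pos.
        // Otherwise ~pos is the first bucket starting after index,
        // so the bucket containing index is the one before it.
        int bucket = pos >= 0 ? pos : ~pos - 1;
        return slopes[bucket];
    }
}
```

With the example data, index 3 maps to bucket 1 (slope 12) and index 6 maps to bucket 3 (slope 15).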
If you are loading the slopes at one time and doing many of these "i-th item" lookups, it may help to have a third (or instead of Counts, depending on what that is used for) array with the totals. This would be { 0, 3, 4, 5 } for your example. Then you don't need to add them up for each look up, it's just a matter of "is i between Totals[x] and Totals[x + 1]". If you expect to have few slope buckets, or if slopes are added throughout processing, or if you don't do many of these look-ups, it probably will buy you nothing, though. Essentially, this is just doing all those additions at one time up front.
You can always wrap your existing arrays, and another array (call it OriginalSlopes), into a class. When you add to Slopes, you also add to OriginalSlopes like you would a normal array (i.e. always append). If you need the i_th slope, look it up in OriginalSlopes. O(1) operations all around, at the cost of storing every slope again.
edit adding your example data:
Slopes = { 15, 12, 4, 15 }
Counts = { 3, 1, 1, 2 }
OriginalSlopes = { 15, 15, 15, 12, 4, 15, 15 }
In the counts object (or array in your case), add a variable that holds the cumulative count found so far.
Using a binary search with a comparator that compares the cumulative count, you would be able to find the slope in O(log N) time.
edit
`Data = 15 15 15 12 4 15 15`
Slopes = { 15, 12, 4, 15 }
Counts = { 3, 1, 1, 2 }
Cumulative count = { 3, 4, 5, 7}
For instance, if you are looking for the element at the 6th position, search the cumulative counts for the first value that is at least 6. The value 5 is too small and the next value is 7, so the 6th element falls in the bucket whose cumulative count is 7, and its slope (15) is the answer.
Use binary search to find element in log(N) time.
Why not a Dictionary<double, double> with the key being Slopes and the value being counts?
Hmm, double double? Now I need a coffee...
EDIT: You could use a dictionary where the key is the slope and each key's value is a list of corresponding indexes and counts. Something like:
class IndexCount
{
public int Index { get; set; }
public int Count { get; set; }
}
Your collection declaration would look something like:
var slopes = new Dictionary<double, List<IndexCount>>();
You could then look up the dictionary by value and see from the associated collection what the count is at each index. This might make your code pretty interesting though. I would go with the list approach below if performance is not a primary concern.
You could use a single List<> of a type that associates the Slopes and Counts, something like:
class SlopeCount
{
public double Slope { get; set; }
public int Count { get; set; }
}
then:
var slopeCounts = new List<SlopeCount>();
// fill the list
