Algorithm to find first sum of an array within a range - c#

I have a fairly complicated (to me) algorithm that I'm trying to write. The idea is to determine which elements in an array are the first ones to sum up to a value that falls within a range.
For example:
I have an array [1, 15, 25, 22, 25] that is in a prioritized order.
I want to find the first set of values with the most elements that sum within a minimum and maximum range, not necessarily the set that gets me closest to my max.
So, if the min is 1 and max is 25, I would select [0(1), 1(15)] even though the third element [2(25)] is closer to my max of 25 because those come first.
If the min is 25 and max is 40, I would select [0(1), 1(15), 3(22)], skipping the third element since that would breach the max.
If the min is 50 and max is 50, I would select [2(25), 4(25)] since those are the only two that can meet the min and max requirements.
Are there any common CS algorithms that match this pattern?

This is a dynamic programming problem.
You want to build a data structure to answer the following question:
    by next to last position available in the array:
        by target sum:
            (elements in sum, last position used)
When it finds a target_sum in range, you just read back through it to get the answer.
Here is pseudocode for that. I used slightly Pythonish syntax and JSON to represent the data structure. Your code will be longer:
Initialize the lookup to [{0: (0, null)}]
for i in 1..(length of array):
    # Build up our dynamic programming data structure
    Add empty mapping {} to end of lookup
    best_sum = null
    best_elements = null
    for prev_sum, (prev_elements, prev_position) in lookup[i-1]:
        # Try not using this element
        if prev_sum not in lookup[i] or lookup[i][prev_sum][0] < prev_elements:
            lookup[i][prev_sum] = (prev_elements, prev_position)
        # Try using this element
        next_sum = prev_sum + array[i-1]
        next_elements = prev_elements + 1
        next_position = i-1
        if next_sum not in lookup[i] or lookup[i][next_sum][0] < next_elements:
            lookup[i][next_sum] = (next_elements, next_position)
            if next_sum in desired range:
                if best_elements is null or best_elements < next_elements:
                    best_elements = next_elements
                    best_sum = next_sum
    if best_elements is not null:
        # Read out the answer!
        answer = []
        remaining = best_sum
        j = lookup[i][remaining][1]
        while j is not null:
            answer.append(array[j])
            remaining = remaining - array[j]
            # lookup[j] only knows about the elements before position j
            j = lookup[j][remaining][1]
        return reversed(answer)
This will return the desired values rather than the indexes. To return the indexes instead, append j rather than array[j] to the answer.
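For reference, the pseudocode above can be turned into a short runnable sketch. It is written in Python to stay close to the pseudocode (a C# port would use a List of Dictionary objects); the name first_sum_in_range and the lo/hi parameters are made up for illustration. Note that it stops at the first array position where some subset ending there sums into the range, so it will not necessarily keep extending the selection with later elements.

```python
def first_sum_in_range(array, lo, hi):
    """Sketch of the DP: lookup[i] maps each reachable sum of array[0:i]
    to (number of elements used, index of the last element used)."""
    lookup = [{0: (0, None)}]
    for i in range(1, len(array) + 1):
        current = {}
        lookup.append(current)
        best_sum = None
        best_count = None
        for prev_sum, (prev_count, prev_pos) in lookup[i - 1].items():
            # Option 1: skip array[i - 1], keeping the entry with more elements.
            if prev_sum not in current or current[prev_sum][0] < prev_count:
                current[prev_sum] = (prev_count, prev_pos)
            # Option 2: use array[i - 1].
            next_sum = prev_sum + array[i - 1]
            next_count = prev_count + 1
            if next_sum not in current or current[next_sum][0] < next_count:
                current[next_sum] = (next_count, i - 1)
                if lo <= next_sum <= hi and (best_count is None or best_count < next_count):
                    best_sum, best_count = next_sum, next_count
        if best_count is not None:
            # Walk the "last element used" links back to the empty sum.
            answer = []
            remaining = best_sum
            j = current[remaining][1]
            while j is not None:
                answer.append(array[j])
                remaining -= array[j]
                j = lookup[j][remaining][1]
            return answer[::-1]
    return None  # no subset sums into [lo, hi]
```

With the array from the question, first_sum_in_range([1, 15, 25, 22, 25], 50, 50) gives [25, 25]. Memory grows with the number of distinct reachable sums, so this is only practical when that number stays moderate.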

Related

Print all partitions into disjoint combinations of fixed size

I have an array of numbers from 1 to n, and I need to find all possible partitions into disjoint combinations of 3 numbers.
That is, for n = 9 the situation is as follows:
Array: 1, 2, 3, 4, 5, 6, 7, 8, 9;
Possible combinations of 3: 123, 124 ... 245, 246 ... 478, 479, etc .;
Possible partitions into 3 disjoint combinations: 123 456 789, 123 457 689 ... 123 468 579 ... 127 458 369, etc.
I've found an algorithm for finding combinations of 3 numbers from a set, here it is: https://www.geeksforgeeks.org/print-all-possible-combinations-of-r-elements-in-a-given-array-of-size-n/ (there are actually two of them, but I used the first one). Now the question is how to find combinations of the combinations themselves, and this is where I get stuck: it seems to me that I need recursion again, but I don't fully understand how and where exactly to use it (or perhaps the solution is something else entirely). I've also seen a non-recursive algorithm that finds all the combinations of the given numbers, https://rosettacode.org/wiki/Combinations#C.23, but I couldn't get anywhere with it (my attempt is below). Could you please help me?
public static IEnumerable<int[]> Combinations(int[] a, int n, int m)
{
    int[] result = new int[m];
    Stack<int> stack = new Stack<int>();
    stack.Push(0);
    while (stack.Count > 0)
    {
        int index = stack.Count - 1;
        int value = stack.Pop();
        while (value < n)
        {
            result[index++] = ++value;
            stack.Push(value);
            if (index == m)
            {
                for (int i = 0; i < 3; i++)
                {
                    a = a.Where(val => val != result[i]).ToArray();
                }
                return Combinations (a, n-3, m);
                break;
            }
        }
    }
}
Assuming n is a multiple of 3, there is a simple and intuitive recursive algorithm. (Writing it efficiently is a bit more of a challenge :-) ).
In pseudocode, generalising 3 to k:
# A must have a multiple of k elements
# I write V \ C to mean "V without the values in C". Since producing
# copies is expensive, you should find a more efficient way of doing
# this.
Partition(A, k):
    If A has k elements, produce the partition consisting only of A
    Otherwise:
        Let m be the smallest element of A.
        For each combination C of k-1 elements from A \ [m]:
            Add m to C
            For each partition P generated by Partition(A \ C, k):
                produce P with the addition of C
Of course, that depends on you having access to an algorithm which can enumerate the k-combinations of a list. (Even better would be a function which produced successive shuffles of the list with different k-combinations at the beginning, while maintaining the list in order. Sadly, few standard libraries provide that.)
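As a rough illustration of the recursive scheme, here is a Python sketch that leans on itertools.combinations for the k-combinations. The function name partitions is invented for this example, and the code assumes the input values are distinct, already in increasing order, and that their count is a positive multiple of k. It copies lists freely, so it is the simple version, not the efficient one.

```python
from itertools import combinations


def partitions(items, k):
    """Yield every partition of `items` into disjoint groups of size k."""
    items = list(items)
    if len(items) == k:
        yield [tuple(items)]
        return
    # The smallest element always goes into the first group, which avoids
    # generating the same partition with its groups in a different order.
    first, rest = items[0], items[1:]
    for combo in combinations(rest, k - 1):
        group = (first,) + combo
        chosen = set(combo)
        remaining = [x for x in rest if x not in chosen]
        for tail in partitions(remaining, k):
            yield [group] + tail
```

For example, iterating over partitions(range(1, 10), 3) produces the partitions one at a time, starting with [(1, 2, 3), (4, 5, 6), (7, 8, 9)].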
There's another recursive algorithm, which can easily be made into an iterative algorithm by maintaining an explicit stack. It's possibly not quite as intuitive, although once you see it, how it works is pretty obvious, but it's a lot easier to implement efficiently. It requires us to maintain the invariants that each set in the partition is stored in increasing order, and that the sets themselves are sorted in increasing order by their first element. (The order itself is irrelevant, and it's totally reasonable to just assume that the original order of the elements is the desired sortation, as long as the elements are kept in a data structure whose ordering is constant.)
Once you establish that rule, you can start by making all the partition's sets empty, and then place each successive element in order, using each of the possible locations which obey the following simple constraints:
Once a set contains the correct number of elements, no more elements can be added to it.
Each element is placed at the end of a set (because all the already-placed elements are smaller and all the elements yet to be placed are bigger);
An element can only be added to an empty set if it is the first empty set in the partition (to guarantee that the sets themselves will be sorted).
To avoid constantly copying the sets in the partition, you can implement this by using a fixed-size two-dimensional array of n ⁄ k rows and k columns, where each row represents one set in the partition; it's then necessary to keep another array of n ⁄ k integers holding the current length of each set.
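The placement rules can be sketched like this in Python. It is only an illustration: the name partitions_by_placement is made up, the elements are taken to be 1..n, n is assumed to be a positive multiple of k, and recursion is used where an explicit stack would also work.

```python
def partitions_by_placement(n, k):
    """Yield partitions of 1..n into n // k sets of size k by placing the
    elements one at a time into a fixed grid (one row per set)."""
    d = n // k
    sets = [[0] * k for _ in range(d)]  # d rows, k columns
    lengths = [0] * d                   # current length of each row

    def place(element):
        if element > n:
            # Every row is full; hand out a snapshot of the grid.
            yield [tuple(row) for row in sets]
            return
        for s in range(d):
            if lengths[s] == k:
                continue  # rule 1: a full set takes no more elements
            sets[s][lengths[s]] = element  # rule 2: append at the end
            lengths[s] += 1
            yield from place(element + 1)
            lengths[s] -= 1
            if lengths[s] == 0:
                break  # rule 3: only the first empty set may be opened

    yield from place(1)
```

Because the grid is reused, each yielded partition is copied into tuples before it is handed to the caller.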
One advantage of the first algorithm is that it makes it reasonably obvious how many possible partitions there are, because the number of partitions generated by the inner loop is independent of the combination chosen in the outer loop. Consequently, if we write P(n, k) for the number of k-partitions of n objects, we can see that
P(n, k) = C(n−1, k−1) × P(n−k, k) for n>0   (where C(n, k) is the binomial coefficient)
That's simply a product of binomial coefficients:
P(n, k) = C(n−1, k−1) × C(n−k−1, k−1) × C(n−2k−1, k−1) × … × C(k−1, k−1)
Since C(n, k) is n! ⁄ (k! (n−k)!), that can be simplified to n! ⁄ (d! × (k!)^d) where d is the number of sets in each partition, i.e. d = n ⁄ k. That number is obviously a lot smaller than n! but it still grows extremely rapidly, making large arguments to the partition function impractical. For k=3, the first few counts are:
P( 3, 3) = 1
P( 6, 3) = 10
P( 9, 3) = 280
P(12, 3) = 15,400
P(15, 3) = 1,401,400
P(18, 3) = 190,590,400
P(21, 3) = 36,212,176,000
For this reason, it's usually advisable to generate and use the possible values one at a time rather than attempting to stash them all into a massive vector, which would take up a lot of memory.
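If you want to check the counts for other arguments, both the closed form and the product of binomial coefficients are easy to evaluate. A small Python sketch (the function names are invented here, and n is assumed to be a positive multiple of k):

```python
from math import comb, factorial


def count_partitions(n, k):
    """Closed form: n! / (d! * (k!)^d) with d = n / k."""
    d = n // k
    return factorial(n) // (factorial(d) * factorial(k) ** d)


def count_partitions_by_recurrence(n, k):
    """Product form: P(n, k) = C(n-1, k-1) * P(n-k, k)."""
    count = 1
    while n > 0:
        count *= comb(n - 1, k - 1)
        n -= k
    return count
```

Both functions agree with the table above, e.g. count_partitions(12, 3) is 15400.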

Checking against a changing set of integer ranges using C#

When filling in a form, the user needs to specify an amount. This amount is then checked against approximately 4 to 6 ranges. The selected range is then saved in the database. The original amount will not be stored (for non-technical reasons). There will be no overlap between the ranges, e.g.:
0-999
1000-1999
2000-4999
5000-9999
10000-higher
The tricky part is that these ranges are not fixed in stone. There can be alterations and additional ranges can be added to further specify the '10000 and higher' range. These changes will occur a couple of times and can't be prevented. The old ranges will need to be stored since the specific amount can not be saved to the database.
What would be the most efficient C# data structure for checking against a changing set of ranges?
For my research I included:
One of the answers here suggests that a fixed set of integer ranges in a switch statement is possible with C#7. However, it is not possible to dynamically add cases to and/or remove cases from a switch statement.
This question suggests that using Enumerable.Range is not the most efficient way.
A simple approach here is to store the lower band values in an array, and pass it to a FindBand() method which returns an integer representing the index of the band containing the value.
For example:
public static int FindBand(double value, double[] bandLowerValues)
{
    for (int i = 0; i < bandLowerValues.Length; ++i)
        if (value < bandLowerValues[i])
            return Math.Max(0, i-1);

    // value is at or above the last lower bound, so it is in the last band
    return bandLowerValues.Length - 1;
}
Test code:
double[] bandLowerValues = {0, 1, 2, 5, 10};
Console.WriteLine(FindBand(-1, bandLowerValues));
Console.WriteLine(FindBand(0, bandLowerValues));
Console.WriteLine(FindBand(0.5, bandLowerValues));
Console.WriteLine(FindBand(1, bandLowerValues));
Console.WriteLine(FindBand(1.5, bandLowerValues));
Console.WriteLine(FindBand(2.5, bandLowerValues));
Console.WriteLine(FindBand(5, bandLowerValues));
Console.WriteLine(FindBand(8, bandLowerValues));
Console.WriteLine(FindBand(9.9, bandLowerValues));
Console.WriteLine(FindBand(10, bandLowerValues));
Console.WriteLine(FindBand(11, bandLowerValues));
This isn't the fastest approach if there are a LOT of bands, but if there are just a few bands this is likely to be sufficiently fast.
(If there were a lot of bands, you could use a binary search to find the appropriate band, but that would be overkill for this in my opinion.)
You can sort low bounds, e.g.
// or decimal instead of double if values are money
double[] lowBounds = new double[] {
    0,     // 0th group: (-Inf .. 0)
    1000,  // 1st group: [0 .. 1000)
    2000,  // 2nd group: [1000 .. 2000)
    5000,  // 3rd group: [2000 .. 5000)
    10000, // 4th group: [5000 .. 10000)
           // 5th group: [10000 .. +Inf)
};
and then find the correct group (0-based)
int index = Array.BinarySearch(lowBounds, value);
index = index < 0 ? -index - 1 : index + 1;
Demo:
double[] tests = new double[] {
    -10,
    0,
    45,
    999,
    1000,
    1997,
    5123,
    10000,
    20000,
};
var result = tests
    .Select(value => {
        int index = Array.BinarySearch(lowBounds, value);
        index = index < 0 ? -index - 1 : index + 1;
        return $"{value,6} : {index}";
    });
Console.Write(string.Join(Environment.NewLine, result));
Outcome:
-10 : 0
0 : 1
45 : 1
999 : 1
1000 : 2
1997 : 2
5123 : 4
10000 : 5
20000 : 5
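For comparison, the same lookup in Python is a single call to bisect_right on the sorted lower bounds. This is only a sketch to show the idea outside C#; find_group and LOW_BOUNDS are names invented for the example.

```python
from bisect import bisect_right

# Sorted lower bounds, same values as lowBounds above.
LOW_BOUNDS = [0, 1000, 2000, 5000, 10000]


def find_group(value, low_bounds=LOW_BOUNDS):
    """Return the 0-based group: 0 means below the first bound,
    len(low_bounds) means at or above the last bound."""
    return bisect_right(low_bounds, value)
```

Adding a new range is then just a matter of inserting another lower bound into the sorted list.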
Since there are already great answers regarding how to find the correct range, I'd like to address the persistence issue.
What do we have here?
You cannot persist the exact value. ( Not allowed )
Values will be "blurred" by fitting them into a range.
Those ranges can (and will) change over time in bounds and number.
So, what I would probably do would be to persist lower and upper bound explicitly in the db.
That way, if ranges change, old data is still correct. You cannot "transform" to the new ranges, because you cannot know if it would be correct. So you need to keep the old values. Any new entries after the change will reflect the new ranges.
One could think of normalization, but honestly, I think that would be overcomplicating the problem. I'd only consider that if the benefit (less storage space) would greatly outweigh the complexity issues.
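A minimal sketch of what "persist the bounds explicitly" could look like, in Python for brevity. Everything here is an assumption for illustration: the function name, integer amounts with inclusive bounds, and None standing for an open-ended upper bound. The returned pair is what would be written to the database instead of the amount.

```python
from bisect import bisect_right


def range_to_store(amount, low_bounds):
    """Map an amount to the (lower, upper) pair to persist.
    `low_bounds` is the sorted list of lower bounds currently in force."""
    i = bisect_right(low_bounds, amount) - 1
    if i < 0:
        raise ValueError("amount is below the lowest range")
    lower = low_bounds[i]
    # The upper bound is one less than the next lower bound, if there is one.
    upper = low_bounds[i + 1] - 1 if i + 1 < len(low_bounds) else None
    return lower, upper
```

With the bounds [0, 1000, 2000, 5000, 10000], an amount of 25000 is stored as (10000, None). If a 20000 bound is added later, new entries get (20000, None) while the rows stored earlier stay as they were.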

Recommended Combination Algorithms for Numbers with Ranges

I am currently trying to write C# code that finds multiple arrays of integers that equal a specified total when they are summed up. I would like to find these combinations while each integer in the array is given a range it can be.
For example, if our total is 10 and we have an int array of size 3 where the first number can be between 1 and 4, the second 2 and 4, and the third 3 and 6, some possible combinations are [1, 3, 6], [2, 2, 6], and [4, 2, 4].
What sort of algorithm would help with solving a problem like this that can run in the most efficient amount of time? Also, what other things should I keep in mind when transitioning this problem into C# code?
I would do this using recursion. You can simply iterate over all possible values and see if they give a required sum.
Input
Let's suppose we have the following input pattern:
N S
min1 min2 min3 ... minN
max1 max2 max3 ... maxN
For your example
if our total is 10 and we have an int array of size 3 where the first
number can be between 1 and 4, the second 2 and 4, and the third 3 and
6
it will be:
3 10
1 2 3
4 4 6
Solution
We have read our input values. Now, we just try to use each possible number for our solution.
We will have a List which will store the current path:
static List<int> current = new List<int>();
The recursive function is pretty simple:
private static void Do(int index, int currentSum)
{
    if (index == length) // Termination
    {
        if (currentSum == sum) // If result is a required sum - just output it
            Output();
        return;
    }

    // try all possible solutions for current index
    for (int i = minValues[index]; i <= maxValues[index]; i++)
    {
        current.Add(i);
        Do(index + 1, currentSum + i); // pass new index and new sum
        current.RemoveAt(current.Count - 1);
    }
}
For non-negative values we can also add the following condition. It improves the recursion by cutting off a huge number of useless branches: if currentSum is already greater than sum, there is no point in continuing down this branch:
if (currentSum > sum) return;
Actually, this algorithm is a simple "find combinations that give a sum S" problem solution with one difference: inner loop indices within minValue[index] and maxValue[index].
Demo
Here is the working IDEOne demo of my solution.
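For readers who want to try the idea without the linked demo, here is a compact Python sketch of the same recursion. The function and parameter names are invented for this example, and the early-exit check is left out so that it also works when the ranges contain negative values.

```python
def combinations_with_sum(min_values, max_values, total):
    """Return every choice of one value per range whose sum equals `total`."""
    results = []
    current = []  # the path currently being explored

    def walk(index, current_sum):
        if index == len(min_values):  # one value picked for every range
            if current_sum == total:
                results.append(list(current))
            return
        # Try every allowed value for this position.
        for value in range(min_values[index], max_values[index] + 1):
            current.append(value)
            walk(index + 1, current_sum + value)
            current.pop()

    walk(0, 0)
    return results
```

For the input from the question, combinations_with_sum([1, 2, 3], [4, 4, 6], 10) returns ten combinations, including [1, 3, 6], [2, 2, 6] and [4, 2, 4].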
You cannot do much better than nested for loops/recursion. Though if you are familiar with the 3SUM problem you will know a little trick to reduce the time complexity of this sort of algorithm! If you have n ranges then you know what number you have to pick from the nth range after you make your first n-1 choices!
I will use an example to walk through my suggestion.
if our total is 10 and we have an int array of size 3 where the first number can be between 1 and 4, the second 2 and 4, and the third 5 and 6
First of all let's process the data to be a bit nicer to deal with. I personally like the idea of working with ranges that start at 0 instead of arbitrary numbers! So we subtract the lower bounds from the upper bounds:
(1 to 4) -> (0 to 3)
(2 to 4) -> (0 to 2)
(5 to 6) -> (0 to 1)
Of course now we need to adjust our target sum to reflect the new ranges. So we subtract our original lower bounds from our target sum as well!
TargetSum = 10-1-2-5 = 2
Now we can represent our ranges with just the upper bound since they share a lower bound! So a range array would look something like:
RangeArray = [3,2,1]
Let's sort this (it will become more obvious why later). So we have:
RangeArray = [1,2,3]
Great! Now onto the beef of the algorithm... the summing! For now I will use for loops as it is easier to use for example purposes. You will have to use recursion. Yeldar's code should give you a good starting place.
result = []
for i from 0 to RangeArray[0]:
    SumList = [i]
    newSum = TargetSum - i
    for j from 0 to RangeArray[1]:
        if (newSum-j)>=0 and (newSum-j)<=RangeArray[2] then
            finalList = SumList + [j, newSum-j]
            result.append(finalList)
Note the inner loop. This is what was inspired by the 3SUM algorithm. We take advantage of the fact that we know what value we have to pick from the third range (since it is defined by our first 2 choices).
From here you of course have to re-map the results back to the original ranges by adding the original lower bounds to the values that came from the corresponding ranges.
Notice that we now understand why it may be a good idea to sort RangeArray. The last range gets absorbed into the second-to-last range's loop. We want the largest range to be the one that does not loop.
I hope this helps to get you started! If you need any help translating my pseudocode into c# just ask :)
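Here is one way the whole procedure might look for any number of ranges, sketched in Python with recursion in place of the nested loops. The function name and the (low, high) tuple format are assumptions made for the example; the code remembers the sort order so the values can be mapped back to their original positions, and it expects at least one range.

```python
def combos_with_last_computed(ranges, total):
    """ranges is a list of (low, high) pairs; returns all combinations that
    sum to `total`, computing the value for the widest range directly."""
    # Sort positions by range width so the widest range ends up last.
    order = sorted(range(len(ranges)), key=lambda i: ranges[i][1] - ranges[i][0])
    lows = [ranges[i][0] for i in order]
    widths = [ranges[i][1] - ranges[i][0] for i in order]
    target = total - sum(lows)  # shift every range to start at 0
    results = []

    def walk(pos, remaining, picked):
        if pos == len(widths) - 1:
            # The last value is forced; just check that it fits its range.
            if 0 <= remaining <= widths[pos]:
                combo = [0] * len(ranges)
                for slot, value, low in zip(order, picked + [remaining], lows):
                    combo[slot] = value + low  # map back to the original range
                results.append(combo)
            return
        for value in range(min(widths[pos], remaining) + 1):
            walk(pos + 1, remaining - value, picked + [value])

    walk(0, target, [])
    return results
```

For the ranges (1, 4), (2, 4), (5, 6) and a total of 10 this finds five combinations, for example [1, 3, 6] and [3, 2, 5].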

Pick up two numbers from an array so that the sum is a constant

I came across an algorithm problem. Suppose I receive a credit and would like to buy two items from a local store. I would like to buy two items that add up to the entire value of the credit. The input data has three lines.
The first line is the credit, the second line is the total number of items and the third line lists all the item prices.
Sample data 1:
200
7
150 24 79 50 88 345 3
Which means I have $200 to buy two items, there are 7 items. I should buy item 1 and item 4 as 200=150+50
Sample data 2:
8
8
2 1 9 4 4 56 90 3
Which indicates that I have $8 to pick two items from total 8 articles. The answer is item 4 and item 5 because 8=4+4
My thought is first to create the array of course, then pick up any item say item x. Creating another array say "remain" which removes x from the original array.
Subtract the price of x from the credit to get the remnant and check whether the "remain" contains remnant.
Here is my code in C#.
// Read lines from input file and create array price
foreach (string s in price)
{
    int x = Int32.Parse(s);
    string y = (credit - x).ToString();
    index1 = Array.IndexOf(price, s) ;
    index2 = Array.IndexOf(price, y) ;
    remain = price.ToList();
    remain.RemoveAt(index1);//remove an element
    if (remain.Contains(y))
    {
        break;
    }
}
// return something....
My two questions:
How is the complexity? I think it is O(n^2).
Any improvement to the algorithm? When I use sample 2, I have trouble getting the correct indices. Because there are two "4"s in the array, it always returns the first index, since IndexOf(String) reports the zero-based index of the first occurrence of the specified string in this instance.
You can simply sort the array in O(nlogn) time. Then for each element A[i] conduct a binary search for S-A[i] again in O(nlogn) time.
EDIT: As pointed out by Heuster, you can solve the 2-SUM problem on the sorted array in linear time by using two pointers (one from the beginning and other from the end).
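A sketch of the two-pointer variant in Python, in case it helps to see it spelled out. The function name is made up; item numbers are 1-based as in the question, and each price is sorted together with its item number so the original positions are not lost (which also takes care of the duplicate "4" case).

```python
def two_sum_sorted(prices, credit):
    """Return the 1-based item numbers of two prices adding up to `credit`,
    or None if no such pair exists."""
    indexed = sorted((price, item) for item, price in enumerate(prices, start=1))
    lo, hi = 0, len(indexed) - 1
    while lo < hi:
        total = indexed[lo][0] + indexed[hi][0]
        if total == credit:
            return tuple(sorted((indexed[lo][1], indexed[hi][1])))
        if total < credit:
            lo += 1  # need a larger sum: move the low pointer up
        else:
            hi -= 1  # need a smaller sum: move the high pointer down
    return None
```

On the two samples from the question this returns (1, 4) and (4, 5) respectively. Sorting dominates the cost, so the whole thing is O(n log n).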
Create a HashSet<int> of the prices. Then go through it sequentially. Something like:
HashSet<int> items = new HashSet<int>(itemsList);
int price1 = -1;
int price2 = -1;
foreach (int price in items)
{
    int otherPrice = 200 - price;
    // note: when price == otherPrice this matches the item with itself
    if (items.Contains(otherPrice))
    {
        // found a match.
        price1 = price;
        price2 = otherPrice;
        break;
    }
}
if (price2 != -1)
{
    // found a match.
    // price1 and price2 contain the values that add up to your target.
    // now remove the items from the HashSet
    items.Remove(price1);
    items.Remove(price2);
}
This is O(n) to create the HashSet. Because lookups in the HashSet are O(1), the foreach loop is O(n).
This problem is called 2-sum. See, for example, http://coderevisited.com/2-sum-problem/
Here is an algorithm in O(N) time complexity and O(N) space : -
1. Put all numbers in hash table.
2. for each number Arr[i] find Sum - Arr[i] in hash table in O(1)
3. If found then (Arr[i],Sum-Arr[i]) are your pair that add up to Sum
Note: the only failing case is when Arr[i] = Sum/2, where you can get a false positive. You can always check in O(N) whether there are two occurrences of Sum/2 in the array.
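One way to sidestep the Sum/2 false positive is to fill the hash table while scanning, so each price is only matched against items seen earlier. A Python sketch of that variant (the function name is invented and item numbers are 1-based as in the question):

```python
def two_sum_hashed(prices, credit):
    """Single pass with a dict from price to the item number where it was
    first seen. Returns a pair of 1-based item numbers, or None."""
    seen = {}
    for item, price in enumerate(prices, start=1):
        other = credit - price
        if other in seen:
            # `other` belongs to an earlier item, never to this one.
            return seen[other], item
        seen.setdefault(price, item)
    return None
```

This stays O(N) in time and space, and for sample 2 it reports items 4 and 5 because the second 4 is matched against the first.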
I know I am posting this a year and a half later, but I just happened to come across this problem and wanted to add input.
If there exists a solution, then you know that both values in the solution must both be less than the target sum.
Perform a binary search in the array of values, searching for the target sum (which may or may not be there).
The binary search will end with either finding the sum, or the closest value less than sum. That is your starting high value while searching through the array using the previously mentioned solutions. Any value above your new starting high value cannot be in the solution, as it is more than the target value.
At this point, you have eliminated a chunk of data in O(log n) time that would otherwise be eliminated in O(n) time.
Again, this is an optimization that may only be worth implementing if the data set calls for it.
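As a small illustration of that trimming step, assuming the prices are already sorted and non-negative (the function name is made up for the example):

```python
from bisect import bisect_right


def trim_candidates(sorted_prices, credit):
    """Drop every price above the credit with one binary search;
    such prices cannot be part of a pair that sums to the credit."""
    cutoff = bisect_right(sorted_prices, credit)
    return sorted_prices[:cutoff]
```

The trimmed list can then be fed to whichever pair search you prefer.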

Building a non sequential list of numbers (From a large range)

I need to create a non-sequential list of numbers that fit within a range. For instance, I need to generate a list of numbers from 1 to 1 million and make sure that none of the numbers are in sequential order, that they are completely shuffled. I guess my first question is, are there any good algorithms out there that could help, and how best to implement this?
I currently am not sure the best way to implement, either via a c# console app that will spit out the numbers in an XML file or in a database that will spit out the numbers into a table or a set of tables, but that is really secondary to actually working out the best way of "shuffling" the set of numbers.
Any advice guys?
Rob
First off, if none of the numbers are in sequential order then every number in the sequence must be less than its predecessor. A sequence which has that property is sorted from biggest to smallest! Clearly that is not what you want. (Or perhaps you simply do not want any subsequence of the form 5, 6, 7 ? But 6, 8, 20 would be OK?)
To answer your question properly we need to know more information about the problem space. Things I would want to know:
1) Is the size of the range equal to, larger than, or smaller than the size of the sequence? That is, are you going to ask for ten numbers between 1 and 10, five numbers between 1 and 10 or fifty numbers between 1 and 10?
2) Is it acceptable for the sequence to contain duplicates? (If the number of items in the sequence is larger than the range, then clearly yes.)
3) What is the randomness being used for? Most random number generators are only pseudo-random; a clever attacker can deduce the next "random" number by knowing the previous ones. If for example you are generating a series of five cards out of a deck of 52 to make a poker hand, you want really strong randomness; you don't want players to be able to deduce what their opponents have in their hands.
How "non-sequential" do you want it?
You could easily generate a list of random numbers from a range with the Random class:
Random rnd1 = new Random();
List<int> largeList = new List<int>();
for (int i = 0; i < largeNumber; i++)
{
    largeList.Add(rnd1.Next(1, 1000001));
}
Edit to add
Admittedly the Durstenfeld algorithm (modern version of the Fisher–Yates shuffle apparently) is much faster:
var rnd = new Random();
var fisherYates = new List<int>(upperBound);
for (int i = 0; i < upperBound; i++)
{
    fisherYates.Add(i);
}
int n = upperBound;
while (n > 1)
{
    n--;
    // pick a position in 0..n and swap it with position n
    int k = rnd.Next(n + 1);
    int temp = fisherYates[k];
    fisherYates[k] = fisherYates[n];
    fisherYates[n] = temp;
}
For the range 1 to 10000 doing a brute force "find a random number I've not yet used" takes around 4-5 seconds, while this takes around 0.001.
Props to Greg Hewgill for the links.
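The same shuffle as a self-contained Python sketch, which can be handy for checking the logic quickly. The function name and the optional rng parameter are additions for this example; like the C# code above, it shuffles the numbers 0 to upper_bound - 1.

```python
import random


def shuffled_range(upper_bound, rng=None):
    """Durstenfeld / Fisher-Yates shuffle of 0 .. upper_bound - 1."""
    rng = rng or random.Random()
    numbers = list(range(upper_bound))
    n = upper_bound
    while n > 1:
        n -= 1
        k = rng.randrange(n + 1)  # 0 <= k <= n
        numbers[k], numbers[n] = numbers[n], numbers[k]
    return numbers
```

Every number appears exactly once in the result, and the shuffle runs in linear time.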
I understand that you want to get a random array of length 1 million containing all numbers from 1 to 1 million. No duplicates, is that right?
You should build up an array with your numbers ranging from 1 to 1 million. Then start shuffling. But it can happen (that is true randomness) that two or even more numbers are sequential.
Have a look here
Here's a C# function to get you started:
public IEnumerable<int> GetRandomSequence(int max)
{
    var r = new Random();
    while (true)
    {
        yield return r.Next(max);
    }
}
Call it like this to get a million numbers in the range 0-9999998 (the upper bound passed to Next is exclusive):
var numbers = GetRandomSequence(9999999).Take(1000000);
As for sorting, or if you don't want to allow repeats, look at Enumerable.Range() (which will give you a consecutive ordered sequence) and use a Fisher-Yates (or Knuth) shuffle algorithm (which you can find all over the place).
"Completely shuffled" is a very misunderstood term. One trick fraud experts use when examining what should be "random" data is to watch for cases where there are no duplicate values (like the 88 in 374388123), because in a truly random sequence the chance of not having such a pair is very low... Exactly what are you trying to do? What, exactly, do you mean by "completely shuffled"? If all you mean is a random sequence of digits, then just use the Random class in the CLR to generate random numbers between 0 and 1M, as many as you need.
Well, you could go with something like this (assuming that you want every number exactly once):
DECLARE @intFrom int
DECLARE @intTo int
DECLARE @tblList table (_id uniqueidentifier, _number int)
SET @intFrom = 0
SET @intTo = 1000000
WHILE (@intFrom < @intTo)
BEGIN
    INSERT INTO @tblList
    SELECT NewID(), @intFrom
    SET @intFrom = @intFrom + 1
END
SELECT *
FROM @tblList
ORDER BY _id
DISCLAIMER: I didn't test this, since I don't have an SQL Server at my disposal at the moment.
This may get you what you need:
1) Populate a list of numbers in order. If your range is 1 - x, it'll look like this:
[1, 2, 3, 4, 5, 6, 7, 8, 9, ... , x]
2) Loop over the list x times, each time choosing a random number between 0 and the length of your list - 1.
3) Use this chosen number to select the corresponding element from your list, and add this number to your output list.
4) Delete the element you just selected from your list. Rinse, repeat.
This will work for any range of numbers, not just lists that start with 1 or 0. The pseudocode looks like this:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
shuffled_nums = []
for i in range(0, len(nums)):
    # random index between 0 and len(nums) - 1, both inclusive
    random_index = rand(0, len(nums) - 1)
    shuffled_nums.add(nums[random_index])
    del(nums[random_index])
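A runnable Python version of these steps might look like the following. The function name and the optional rng parameter are assumptions for the example, and the input list is copied so the caller's list is left alone.

```python
import random


def shuffle_by_removal(nums, rng=None):
    """Repeatedly pick a random remaining element and move it to the output."""
    rng = rng or random.Random()
    pool = list(nums)  # work on a copy
    shuffled = []
    while pool:
        random_index = rng.randrange(len(pool))  # 0 .. len(pool) - 1
        shuffled.append(pool.pop(random_index))
    return shuffled
```

Deleting from the middle of a list costs time proportional to its length, so this approach is quadratic overall; for a million numbers the swap-based Fisher-Yates shuffle is the better fit.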
