Recommended Combination Algorithms for Numbers with Ranges - C#

I am trying to write C# code that finds every array of integers whose elements sum to a specified total, where each position in the array has its own allowed range.
For example, if our total is 10 and we have an int array of size 3 where the first number can be between 1 and 4, the second between 2 and 4, and the third between 3 and 6, some possible combinations are [1, 3, 6], [2, 2, 6], and [4, 2, 4].
What sort of algorithm would solve a problem like this in the most efficient amount of time? Also, what other things should I keep in mind when translating this problem into C# code?

I would do this using recursion. You can simply iterate over all possible values and see if they give a required sum.
Input
Let's suppose we have the following input pattern:
N S
min1 min2 min3 ... minN
max1 max2 max3 ... maxN
For your example
if our total is 10 and we have an int array of size 3 where the first
number can be between 1 and 4, the second 2 and 4, and the third 3 and
6
it will be:
3 10
1 2 3
4 4 6
Solution
We have read our input values. Now, we just try to use each possible number for our solution.
We will have a List which will store the current path:
static List<int> current = new List<int>();
The recursive function is pretty simple:
private static void Do(int index, int currentSum)
{
    if (index == length) // Termination
    {
        if (currentSum == sum) // If result is a required sum - just output it
            Output();
        return;
    }
    // try all possible solutions for current index
    for (int i = minValues[index]; i <= maxValues[index]; i++)
    {
        current.Add(i);
        Do(index + 1, currentSum + i); // pass new index and new sum
        current.RemoveAt(current.Count - 1); // backtrack: drop the value we just tried
    }
}
For non-negative values we can also add a pruning condition at the top of the function, which cuts off a huge number of useless iterations. If currentSum is already greater than sum, nothing further down this branch of the recursion can succeed:
if (currentSum > sum) return;
Actually, this algorithm is the standard "find combinations that give a sum S" solution with one difference: the inner loop runs from minValues[index] to maxValues[index].
Demo
Here is the working IDEOne demo of my solution.

You cannot do much better than nested for loops/recursion. Though if you are familiar with the 3SUM problem you will know a little trick to reduce the time complexity of this sort of algorithm! If you have n ranges then you know what number you have to pick from the nth range after you make your first n-1 choices!
I will use an example to walk through my suggestion.
if our total is 10 and we have an int array of size 3 where the first number can be between 1 and 4, the second 2 and 4, and the third 5 and 6
First of all, let's process the data to be a bit nicer to deal with. I personally like the idea of working with ranges that start at 0 instead of arbitrary numbers! So we subtract the lower bounds from the upper bounds:
(1 to 4) -> (0 to 3)
(2 to 4) -> (0 to 2)
(5 to 6) -> (0 to 1)
Of course now we need to adjust our target sum to reflect the new ranges. So we subtract our original lower bounds from our target sum as well!
TargetSum = 10-1-2-5 = 2
Now we can represent our ranges with just the upper bound since they share a lower bound! So a range array would look something like:
RangeArray = [3,2,1]
Let's sort this (it will become more obvious why later). So we have:
RangeArray = [1,2,3]
Great! Now onto the beef of the algorithm... the summing! For now I will use for loops as it is easier to use for example purposes. You will have to use recursion. Yeldar's code should give you a good starting place.
result = []
for i from 0 to RangeArray[0]:
    SumList = [i]
    newSum = TargetSum - i
    for j from 0 to RangeArray[1]:
        if (newSum-j)>=0 and (newSum-j)<=RangeArray[2] then
            finalList = SumList + [j, newSum-j]
            result.append(finalList)
Note the inner loop. This is what was inspired by the 3SUM algorithm. We take advantage of the fact that we know what value we have to pick from the third range (since it is defined by our first 2 choices).
From here you of course have to re-map the results back to the original ranges by adding the original lower bounds to the values that came from the corresponding ranges.
Notice that we now understand why it may be a good idea to sort RangeArray. The last range gets absorbed into the second-to-last range's loop. We want the largest range to be the one that does not loop.
I hope this helps to get you started! If you need any help translating my pseudocode into c# just ask :)
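A possible C# translation of the idea, generalised to any number of ranges, is sketched below. It is not the sorted-range pseudocode above verbatim: instead of sorting, it clamps each loop to the values that still allow the remaining positions to reach the total, which forces the last position to at most one candidate. The names RangeSums, Find and Recurse are made up for this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSums
{
    // Lazily yields every array v with min[i] <= v[i] <= max[i] whose elements sum to target.
    public static IEnumerable<int[]> Find(int[] min, int[] max, int target)
    {
        int n = min.Length;
        // suffixMin[i] / suffixMax[i]: smallest / largest sum reachable from positions i..n-1.
        var suffixMin = new long[n + 1];
        var suffixMax = new long[n + 1];
        for (int i = n - 1; i >= 0; i--)
        {
            suffixMin[i] = suffixMin[i + 1] + min[i];
            suffixMax[i] = suffixMax[i + 1] + max[i];
        }
        return Recurse(0, target, new int[n], min, max, suffixMin, suffixMax);
    }

    private static IEnumerable<int[]> Recurse(int index, long remaining, int[] current,
        int[] min, int[] max, long[] suffixMin, long[] suffixMax)
    {
        if (index == min.Length)
        {
            if (remaining == 0) yield return (int[])current.Clone();
            yield break;
        }
        // Only try values that leave a remainder the later positions can still produce.
        // For the last position both bounds collapse to 'remaining', so it never loops.
        long lo = Math.Max(min[index], remaining - suffixMax[index + 1]);
        long hi = Math.Min(max[index], remaining - suffixMin[index + 1]);
        for (long v = lo; v <= hi; v++)
        {
            current[index] = (int)v;
            foreach (var result in Recurse(index + 1, remaining - v, current,
                                           min, max, suffixMin, suffixMax))
                yield return result;
        }
    }
}
```

For the question's example, RangeSums.Find(new[] { 1, 2, 3 }, new[] { 4, 4, 6 }, 10) enumerates ten arrays, starting with [1, 3, 6]. Because the results are yielded lazily, the caller can stop early without generating the rest.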

Related

Print all partitions into disjoint combinations of fixed size

I have an array of numbers from 1 to n, and I need to find all possible partitions into disjoint combinations of 3 numbers.
That is, for n = 9 the situation is as follows:
Array: 1, 2, 3, 4, 5, 6, 7, 8, 9;
Possible combinations of 3: 123, 124 ... 245, 246 ... 478, 479, etc .;
Possible partitions into 3 disjoint combinations: 123 456 789, 123 457 689 ... 123 468 579 ... 127 458 369, etc.
I've found an algorithm for finding combinations of 3 numbers from a set, here it is: https://www.geeksforgeeks.org/print-all-possible-combinations-of-r-elements-in-a-given-array-of-size-n/ (there are even 2 of them, but I used the first one). Now the question is how to find combinations of the combinations themselves, and this is where I get stuck: it seems to me that I need recursion again, but I don't fully understand how and where exactly to use it (and perhaps the answer lies elsewhere). I've also seen a non-recursive algorithm that finds all the combinations of given numbers, https://rosettacode.org/wiki/Combinations#C.23, but I couldn't get anywhere with it (my attempt is below). Could you please help me?
public static IEnumerable<int[]> Combinations(int[] a, int n, int m)
{
    int[] result = new int[m];
    Stack<int> stack = new Stack<int>();
    stack.Push(0);
    while (stack.Count > 0)
    {
        int index = stack.Count - 1;
        int value = stack.Pop();
        while (value < n)
        {
            result[index++] = ++value;
            stack.Push(value);
            if (index == m)
            {
                for (int i = 0; i < 3; i++)
                {
                    a = a.Where(val => val != result[i]).ToArray();
                }
                return Combinations (a, n-3, m);
                break;
            }
        }
    }
}
Assuming n is a multiple of 3, there is a simple and intuitive recursive algorithm. (Writing it efficiently is a bit more of a challenge :-) ).
In pseudocode, generalising 3 to k:
# A must have a multiple of k elements
# I write V \ C to mean "V without the values in C". Since producing
# copies is expensive, you should find a more efficient way of doing
# this.
Partition(A, k):
    If A has k elements, produce the partition consisting only of A
    Otherwise:
        Let m be the smallest element of A.
        For each combination C of k-1 elements from A \ [m]:
            Add m to C
            For each partition P generated by Partition(A \ C, k):
                produce P with the addition of C
Of course, that depends on you having access to an algorithm which can enumerate the k-combinations of a list. (Even better would be a function which produced successive shuffles of the list with different k-combinations at the beginning, while maintaining the list in order. Sadly, few standard libraries provide that.)
There's another recursive algorithm, which can easily be made into an iterative algorithm by maintaining an explicit stack. It's possibly not quite as intuitive, although once you see it, how it works is pretty obvious, but it's a lot easier to implement efficiently. It requires us to maintain the invariants that each set in the partition is stored in increasing order, and that the sets themselves are sorted in increasing order by their first element. (The order itself is irrelevant, and it's totally reasonable to just assume that the original order of the elements is the desired ordering, as long as the elements are kept in a data structure whose ordering is constant.)
Once you establish that rule, you can start by making all the partition's sets empty, and then place each successive element in order, using each of the possible locations which obey the following simple constraints:
Once a set contains the correct number of elements, no more elements can be added to it.
Each element is placed at the end of a set (because all the already-placed elements are smaller and all the elements yet to be placed are bigger);
An element can only be added to an empty set if it is the first empty set in the partition (to guarantee that the sets themselves will be sorted).
To avoid constantly copying the sets in the partition, you can implement this by using a fixed-size two-dimensional array of n ⁄ k rows and k columns, where each row represents one set in the partition; it's then necessary to keep another array of n ⁄ k integers holding the current length of each set.
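The placement rules above can be sketched in C# roughly as follows. This is one reading of the description, written recursively rather than with an explicit stack; Partitions, Enumerate and Place are hypothetical names, and a jagged array stands in for the two-dimensional array.

```csharp
using System;
using System.Collections.Generic;

public static class Partitions
{
    // Lazily yields every partition of items into items.Length / k sets of size k.
    // The same jagged array is reused for every result: copy it if you need to keep it.
    public static IEnumerable<int[][]> Enumerate(int[] items, int k)
    {
        if (k <= 0 || items.Length % k != 0)
            throw new ArgumentException("items.Length must be a multiple of k");
        int setCount = items.Length / k;
        var sets = new int[setCount][];
        for (int s = 0; s < setCount; s++) sets[s] = new int[k];
        var lengths = new int[setCount]; // current number of elements in each set
        return Place(0, items, k, sets, lengths);
    }

    private static IEnumerable<int[][]> Place(int next, int[] items, int k,
                                              int[][] sets, int[] lengths)
    {
        if (next == items.Length)
        {
            yield return sets; // every element has been placed
            yield break;
        }
        for (int s = 0; s < sets.Length; s++)
        {
            if (lengths[s] == k) continue;        // a full set accepts no more elements
            sets[s][lengths[s]++] = items[next];  // always append at the end of the set
            foreach (var p in Place(next + 1, items, k, sets, lengths))
                yield return p;
            lengths[s]--;                         // undo the placement
            if (lengths[s] == 0) break;           // only the first empty set may be opened
        }
    }
}
```

Enumerating Partitions.Enumerate(new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 }, 3) visits 280 partitions, the first being 123 456 789.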
One advantage of the first algorithm is that it makes it reasonably obvious how many possible partitions there are, because the number of partitions generated by the inner loop is independent of the combination chosen in the outer loop. Consequently, if we write P(n, k) for the number of k-partitions of n objects, we can see that
P(n, k) = C(n−1, k−1) × P(n−k, k) for n>0   (where C(n, k) is the binomial coefficient)
That's simply a product of binomial coefficients:
P(n, k) = C(n−1, k−1) × C(n−k−1, k−1) × C(n−2k−1, k−1) × … × C(k−1, k−1)
Since C(n, k) is n! ⁄ (k! × (n−k)!), that can be simplified to n! ⁄ (d! × (k!)^d) where d is the number of sets in each partition, i.e. d = n ⁄ k. That number is obviously a lot smaller than n! but it still grows extremely rapidly, making large arguments to the partition function impractical. For k=3, the first few counts are:
P( 3, 3) = 1
P( 6, 3) = 10
P( 9, 3) = 280
P(12, 3) = 15,400
P(15, 3) = 1,401,400
P(18, 3) = 190,590,400
P(21, 3) = 36,212,176,000
For this reason, it's usually advisable to generate and use the possible values one at a time rather than attempting to stash them all into a massive vector, which would take up a lot of memory.
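To check a count before committing to enumerating everything, the product of binomial coefficients can be evaluated directly. The sketch below uses System.Numerics.BigInteger so larger arguments do not overflow; PartitionCount, Count and Binomial are names invented for the example.

```csharp
using System;
using System.Numerics;

public static class PartitionCount
{
    // P(n, k) = C(n-1, k-1) * C(n-k-1, k-1) * ... * C(k-1, k-1)
    public static BigInteger Count(int n, int k)
    {
        if (k <= 0 || n < 0 || n % k != 0)
            throw new ArgumentException("n must be a non-negative multiple of k");
        BigInteger result = 1;
        for (int m = n; m > 0; m -= k)
            result *= Binomial(m - 1, k - 1);
        return result;
    }

    // Multiplicative formula; every intermediate value is itself a binomial
    // coefficient, so the division is always exact.
    private static BigInteger Binomial(int n, int r)
    {
        BigInteger c = 1;
        for (int i = 1; i <= r; i++)
            c = c * (n - r + i) / i;
        return c;
    }
}
```

PartitionCount.Count(9, 3) gives 280 and PartitionCount.Count(21, 3) gives 36,212,176,000, matching the table above.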

Algorithm to find first sum of an array within a range

I have a fairly complicated (to me) algorithm that I'm trying to write. The idea is to determine which elements in an array are the first ones to sum up to a value that falls within a range.
For example:
I have an array [1, 15, 25, 22, 25] that is in a prioritized order.
I want to find the first set of values with the most elements that sum within a minimum and maximum range, not necessarily the set that gets me closest to my max.
So, if the min is 1 and max is 25, I would select [0(1), 1(15)] even though the third element [2(25)] is closer to my max of 25 because those come first.
If the min is 25 and max is 40, I would select [0(1), 1(15), 3(22)], skipping the third element since that would breach the max.
If the min is 50 and max is 50, I would select [2(25), 4(25)] since those are the only two that can meet the min and max requirements.
Are there any common CS algorithms that match this pattern?
This is a dynamic programming problem.
You want to build a data structure to answer the following question.
by next to last position available in the array:
    by target sum:
        (elements in sum, last position used)
When it finds a target_sum in range, you just read back through it to get the answer.
Here is pseudocode for that. I used slightly Pythonish syntax and JSON to represent the data structure. Your code will be longer:
Initialize the lookup to [{0: (0, null)}]
for i in 1..(length of array):
    # Build up our dynamic programming data structure
    Add empty mapping {} to end of lookup
    best_sum = null
    best_elements = null
    for prev_sum, (prev_elements, prev_position) in lookup[i-1]:
        # Try not using this element
        if prev_sum not in lookup[i] or lookup[i][prev_sum][0] < prev_elements:
            lookup[i][prev_sum] = (prev_elements, prev_position)
        # Try using this element
        next_sum = prev_sum + array[i-1]
        next_elements = prev_elements + 1
        next_position = i-1
        if next_sum not in lookup[i] or lookup[i][next_sum][0] < next_elements:
            lookup[i][next_sum] = (next_elements, next_position)
            if next_sum in desired range:
                if best_elements is null or best_elements < next_elements:
                    best_elements = next_elements
                    best_sum = next_sum
    if best_elements is not null:
        # Read out the answer by following the "last position used" links
        answer = []
        j = lookup[i][best_sum][1]
        while j is not null:
            answer.append(array[j])
            best_sum = best_sum - array[j]
            j = lookup[j][best_sum][1]
        return reversed(answer)
This will return the desired values rather than the indexes. To get the indexes instead, append j rather than array[j] to the answer.
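One way the pseudocode might look in C# is sketched here, returning indexes and using -1 in place of null for "no previous position". RangeSubset and Find are hypothetical names. Like the pseudocode, it stops at the earliest prefix of the array that contains any qualifying sum, so for min 1 and max 25 it reports index 0 alone rather than the [0, 1] from the question; treat it as a starting point rather than a finished answer to the prioritisation rule.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSubset
{
    // Returns indexes of a subset whose sum lies in [min, max], or null if there is none.
    public static List<int> Find(int[] array, int min, int max)
    {
        // lookup[i][sum] = (elements used, index of the last element used or -1),
        // considering only the first i elements of the array.
        var lookup = new List<Dictionary<int, (int Count, int Last)>>
        {
            new Dictionary<int, (int Count, int Last)> { [0] = (0, -1) }
        };
        for (int i = 1; i <= array.Length; i++)
        {
            var prev = lookup[i - 1];
            var cur = new Dictionary<int, (int Count, int Last)>();
            lookup.Add(cur);
            int? bestSum = null;
            int bestCount = -1;
            foreach (var entry in prev)
            {
                // Option 1: skip element i-1 and carry the entry forward.
                if (!cur.TryGetValue(entry.Key, out var kept) || kept.Count < entry.Value.Count)
                    cur[entry.Key] = entry.Value;
                // Option 2: use element i-1.
                int nextSum = entry.Key + array[i - 1];
                int nextCount = entry.Value.Count + 1;
                if (!cur.TryGetValue(nextSum, out var used) || used.Count < nextCount)
                {
                    cur[nextSum] = (nextCount, i - 1);
                    if (nextSum >= min && nextSum <= max && nextCount > bestCount)
                    {
                        bestCount = nextCount;
                        bestSum = nextSum;
                    }
                }
            }
            if (bestSum.HasValue)
            {
                // Walk the "last element used" links back to the empty sum.
                var answer = new List<int>();
                int sum = bestSum.Value;
                int j = cur[sum].Last;
                while (j >= 0)
                {
                    answer.Add(j);
                    sum -= array[j];
                    j = lookup[j][sum].Last;
                }
                answer.Reverse();
                return answer;
            }
        }
        return null;
    }
}
```

With the array [1, 15, 25, 22, 25] and min = max = 50, this returns the indexes [2, 4].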

Comparing 1 million integers in an array without sorting it first

I have a task to find the difference between every pair of integers in an array of random numbers and return the lowest difference. A requirement is that the integers can be between 0 and int.MaxValue and that the array will contain 1 million integers.
I put some code together which works fine for a small number of integers, but it takes a long time (so long that most of the time I give up waiting) to do a million. My code is below, but I'm looking for some insight on how I can improve performance.
for(int i = 0; i < _RandomIntegerArray.Count(); i++) {
    for(int ii = i + 1; ii < _RandomIntegerArray.Count(); ii++) {
        if (_RandomIntegerArray[i] == _RandomIntegerArray[ii]) continue;
        int currentDiff = Math.Abs(_RandomIntegerArray[i] - _RandomIntegerArray[ii]);
        if (currentDiff < lowestDiff) {
            Pairs.Clear();
        }
        if (currentDiff <= lowestDiff) {
            Pairs.Add(new NumberPair(_RandomIntegerArray[i], _RandomIntegerArray[ii]));
            lowestDiff = currentDiff;
        }
    }
}
Apologies to everyone that has pointed out that I don't sort; unfortunately sorting is not allowed.
Imagine that you have already found a pair of integers a and b from your random array such that a > b and a-b is the lowest among all possible pairs of integers in the array.
Does an integer c exist in the array such that a > c > b, i.e. c goes between a and b? Clearly, the answer is "no", because otherwise you'd pick the pair {a, c} or {c, b}.
This gives an answer to your problem: a and b must be next to each other in a sorted array. Sorting can be done in O(N*log N), and the search can be done in O(N), an improvement over the O(N^2) algorithm that you have.
As per @JonSkeet, try sorting the array first and then only compare consecutive array items, which means that you only need to iterate the array once:
Array.Sort(_RandomIntegerArray);
for (int i = 1; i < _RandomIntegerArray.Count(); i++)
{
    int currentDiff = _RandomIntegerArray[i] - _RandomIntegerArray[i-1];
    if (currentDiff < lowestDiff)
    {
        Pairs.Clear();
    }
    if (currentDiff <= lowestDiff)
    {
        Pairs.Add(new NumberPair(_RandomIntegerArray[i], _RandomIntegerArray[i-1]));
        lowestDiff = currentDiff;
    }
}
In my testing this results in < 200 ms elapsed for 1 million items.
You've got a million integers out of a possible 2.15 or 4.3 billion (signed or unsigned). That means the largest possible min distance is either about 2150 or 4300. Let's say that the max possible min distance is D.
Divide the legal integers into groups of length D. Create a hash h keyed on integers with arrays of ints as values. Process your array by taking each element x, and adding it to h[x/D].
The point of doing this is that the closest pair of points is either contained in h[k] for some k, or split between h[k] and h[k+1].
Find your pair of points by going through the keys of the hash and checking the points associated with adjacent keys. You can sort if you like, or use a bitvector, or any other method but now you're dealing with small arrays (on average 1 element per array).
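A rough C# sketch of this bucketing idea follows. It assumes every value lies between 0 and int.MaxValue as stated in the question, it counts duplicates as a difference of 0 (unlike the question's loop, which skips equal values), and it brute-forces the pairs inside each small bucket rather than sorting. MinGap and MinDifference are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class MinGap
{
    // Smallest difference between any two elements, without sorting the array.
    // Assumes every value is in the range 0..int.MaxValue.
    public static int MinDifference(int[] values)
    {
        if (values.Length < 2)
            throw new ArgumentException("need at least two values");
        // Pigeonhole: the smallest gap is at most int.MaxValue / (n - 1), so with
        // buckets at least that wide the closest pair shares a bucket or sits in
        // two adjacent buckets.
        long width = (long)int.MaxValue / (values.Length - 1) + 1;
        var buckets = new Dictionary<long, List<int>>();
        foreach (int x in values)
        {
            long key = x / width;
            if (!buckets.TryGetValue(key, out var list))
                buckets[key] = list = new List<int>();
            list.Add(x);
        }
        long best = long.MaxValue;
        foreach (var pair in buckets)
        {
            List<int> here = pair.Value;
            buckets.TryGetValue(pair.Key + 1, out List<int> next); // null when absent
            for (int i = 0; i < here.Count; i++)
            {
                // Pairs inside the same bucket.
                for (int j = i + 1; j < here.Count; j++)
                    best = Math.Min(best, Math.Abs((long)here[i] - here[j]));
                // Pairs that straddle this bucket and the next one up.
                if (next != null)
                    foreach (int y in next)
                        best = Math.Min(best, (long)y - here[i]);
            }
        }
        return (int)best;
    }
}
```

For evenly spread random data each bucket holds about one element, so the nested loops stay cheap; heavily clustered data can still push many values into one bucket and degrade towards the quadratic behaviour of the original code.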
As the elements of the array are between 0 and int.maxvalue, I suppose maxvalue will be less than 1 million. If so, you just need to initialise array[maxvalue] to 0 and then, as you read the 1 million values, increment the matching entry in your array.
Now read this array and find the lowest difference as described by others, as if all the values were sorted. If any element is present more than once, its count will be >1, so you can immediately say that the minimum difference is 0.
NOTE: This method is efficient only if you are not allowed to sort and, more importantly, int.maxvalue is much less than 10^6 (1 million).
It helps a little if you do not count on each iteration:
int countIntegers = _RandomIntegerArray.Count();
for(int i = 0; i < countIntegers; i++) {
    //...
    for(int ii = i + 1; ii < countIntegers; ii++) {
        //...
This assumes that Count() recomputes the number of ints in the array on every call, rather than caching the result until the array is modified.
How about splitting up the array into arraysize/number of processors sized chunks and running each chunk in a different thread. (Neil)
Assume three parts A, B and C of size as close as possible.
For each part, find the minimum "in-part" difference and that of pairs with the first component from the current part and the second from the next part (A being the next from C).
With a method taking O(n²) time, n/3 should take one ninth, done 2*3 times, this amounts to two thirds plus change for combining the results.
This calls to be applied recursively - remember Карацу́ба/Karatsuba multiplication?
Wait - maybe use two parts after all, for three fourth of the effort - very close to "Karatsuba". (When not seeing how to use an even number of parts, I was thinking multiprocessing with every processor doing "the same".)

Pick up two numbers from an array so that the sum is a constant

I came across an algorithm problem. Suppose I receive a credit and would like to buy two items from a local store. I would like to buy two items that add up to the entire value of the credit. The input data has three lines.
The first line is the credit, the second line is the total number of items and the third line lists all the item prices.
Sample data 1:
200
7
150 24 79 50 88 345 3
Which means I have $200 to buy two items, there are 7 items. I should buy item 1 and item 4 as 200=150+50
Sample data 2:
8
8
2 1 9 4 4 56 90 3
Which indicates that I have $8 to pick two items from total 8 articles. The answer is item 4 and item 5 because 8=4+4
My thought is first to create the array of course, then pick up any item say item x. Creating another array say "remain" which removes x from the original array.
Subtract the price of x from the credit to get the remnant and check whether the "remain" contains remnant.
Here is my code in C#.
// Read lines from input file and create array price
foreach (string s in price)
{
    int x = Int32.Parse(s);
    string y = (credit - x).ToString();
    index1 = Array.IndexOf(price, s) ;
    index2 = Array.IndexOf(price, y) ;
    remain = price.ToList();
    remain.RemoveAt(index1);//remove an element
    if (remain.Contains(y))
    {
        break;
    }
}
// return something....
My two questions:
What is the complexity? I think it is O(n^2).
Can the algorithm be improved? When I use sample 2, I have trouble getting the correct indices. Because there are two "4" entries in the array, it always returns the first index, since IndexOf(String) reports the zero-based index of the first occurrence of the specified string in this instance.
You can simply sort the array in O(n log n) time. Then for each element A[i] conduct a binary search for S-A[i], which is O(log n) per element and so again O(n log n) in total.
EDIT: As pointed out by Heuster, you can solve the 2-SUM problem on the sorted array in linear time by using two pointers (one from the beginning and other from the end).
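A minimal sketch of the sort-plus-two-pointers variant that also reports the original item numbers, which is what the question needs. It relies on the Array.Sort(keys, items) overload to carry the positions along with the prices; TwoPointerPairs and FindPair are illustrative names only.

```csharp
using System;

public static class TwoPointerPairs
{
    // Returns the 1-based item numbers of two items whose prices add up to credit,
    // or null if no such pair exists.
    public static int[] FindPair(int[] prices, int credit)
    {
        int[] sorted = (int[])prices.Clone();
        int[] positions = new int[prices.Length];
        for (int i = 0; i < positions.Length; i++)
            positions[i] = i + 1;
        Array.Sort(sorted, positions); // sorts the prices and moves the positions with them

        int lo = 0, hi = sorted.Length - 1;
        while (lo < hi)
        {
            long sum = (long)sorted[lo] + sorted[hi];
            if (sum == credit)
                return new[]
                {
                    Math.Min(positions[lo], positions[hi]),
                    Math.Max(positions[lo], positions[hi])
                };
            if (sum < credit) lo++; // need a larger total: advance the low pointer
            else hi--;              // need a smaller total: retreat the high pointer
        }
        return null;
    }
}
```

Because lo and hi are always different positions in the sorted array, two items with the same price (the two 4s in sample 2) are handled without any special case.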
Create a HashSet<int> of the prices. Then go through it sequentially. Something like:
HashSet<int> items = new HashSet<int>(itemsList);
int price1 = -1;
int price2 = -1;
foreach (int price in items)
{
    int otherPrice = 200 - price;
    if (items.Contains(otherPrice))
    {
        // found a match.
        price1 = price;
        price2 = otherPrice;
        break;
    }
}
if (price2 != -1)
{
    // found a match.
    // price1 and price2 contain the values that add up to your target.
    // now remove the items from the HashSet
    items.Remove(price1);
    items.Remove(price2);
}
This is O(n) to create the HashSet. Because lookups in the HashSet are O(1), the foreach loop is O(n).
This problem is called 2-sum. See, for example, http://coderevisited.com/2-sum-problem/
Here is an algorithm in O(N) time complexity and O(N) space : -
1. Put all numbers in hash table.
2. for each number Arr[i] find Sum - Arr[i] in hash table in O(1)
3. If found then (Arr[i],Sum-Arr[i]) are your pair that add up to Sum
Note:- Only failing case can be when Arr[i] = Sum/2 then you can get false positive but you can always check if there are two Sum/2 in the array in O(N)
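One way to sidestep the Sum/2 case is to fill the hash table while scanning, so a price is only ever matched against items seen earlier. The sketch below does that with a Dictionary mapping each price to the index where it first appeared; HashPairs and FindPair are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class HashPairs
{
    // Returns the 1-based item numbers of two items whose prices add up to credit,
    // or null if no such pair exists. One pass, O(n) time and O(n) extra space.
    public static int[] FindPair(int[] prices, int credit)
    {
        var firstSeenAt = new Dictionary<int, int>(); // price -> index of first occurrence
        for (int i = 0; i < prices.Length; i++)
        {
            int needed = credit - prices[i];
            // Only earlier items are in the table, so an item never pairs with itself.
            if (firstSeenAt.TryGetValue(needed, out int j))
                return new[] { j + 1, i + 1 };
            if (!firstSeenAt.ContainsKey(prices[i]))
                firstSeenAt[prices[i]] = i;
        }
        return null;
    }
}
```

For sample 2 the first 4 is stored at index 3, and the second 4 then finds it, giving items 4 and 5.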
I know I am posting this a year and a half later, but I just happened to come across this problem and wanted to add input.
If there exists a solution, then you know that both values in the solution must both be less than the target sum.
Perform a binary search in the sorted array of values, searching for the target sum (which may or may not be there).
The binary search will end with either finding the sum, or the closest value less than sum. That is your starting high value while searching through the array using the previously mentioned solutions. Any value above your new starting high value cannot be in the solution, as it is more than the target value.
At this point, you have eliminated a chunk of data in log(n) time, that would otherwise be eliminated in O(n) time.
Again, this is an optimization that may only be worth implementing if the data set calls for it.

Building a non sequential list of numbers (From a large range)

I need to create a non-sequential list of numbers that fit within a range. For instance, I need to generate a list of numbers from 1 to 1 million and make sure that none of the numbers are in sequential order, i.e. that they are completely shuffled. I guess my first question is: are there any good algorithms out there that could help, and how best to implement this?
I currently am not sure the best way to implement, either via a c# console app that will spit out the numbers in an XML file or in a database that will spit out the numbers into a table or a set of tables, but that is really secondary to actually working out the best way of "shuffling" the set of numbers.
Any advice guys?
Rob
First off, if none of the numbers are in sequential order then every number in the sequence must be less than its predecessor. A sequence which has that property is sorted from biggest to smallest! Clearly that is not what you want. (Or perhaps you simply do not want any subsequence of the form 5, 6, 7 ? But 6, 8, 20 would be OK?)
To answer your question properly we need to know more information about the problem space. Things I would want to know:
1) Is the size of the range equal to, larger than, or smaller than the size of the sequence? That is, are you going to ask for ten numbers between 1 and 10, five numbers between 1 and 10 or fifty numbers between 1 and 10?
2) Is it acceptable for the sequence to contain duplicates? (If the number of items in the sequence is larger than the range, then clearly yes.)
3) What is the randomness being used for? Most random number generators are only pseudo-random; a clever attacker can deduce the next "random" number by knowing the previous ones. If for example you are generating a series of five cards out of a deck of 52 to make a poker hand, you want really strong randomness; you don't want players to be able to deduce what their opponents have in their hands.
How "non-sequential" do you want it?
You could easily generate a list of random numbers from a range with the Random class:
Random rnd1 = new Random();
List<int> largeList = new List<int>();
for (int i = 0; i < largeNumber; i++)
{
    largeList.Add(rnd1.Next(1, 1000001));
}
Edit to add
Admittedly the Durstenfeld algorithm (modern version of the Fisher–Yates shuffle apparently) is much faster:
var fisherYates = new List<int>(upperBound);
for (int i = 0; i < upperBound; i++)
{
    fisherYates.Add(i);
}
int n = upperBound;
while (n > 1)
{
    n--;
    int k = rnd.Next(n + 1);
    int temp = fisherYates[k];
    fisherYates[k] = fisherYates[n];
    fisherYates[n] = temp;
}
For the range 1 to 10000 doing a brute force "find a random number I've not yet used" takes around 4-5 seconds, while this takes around 0.001.
Props to Greg Hewgill for the links.
I understand that you want to get a random array of length 1 million containing all numbers from 1 to 1 million. No duplicates, is that right?
You should build up an array with your numbers ranging from 1 to 1 million. Then start shuffling. But it can happen (that is true randomness) that two or even more numbers end up in sequence.
Have a look here
Here's a C# function to get you started:
public IEnumerable<int> GetRandomSequence(int max)
{
    var r = new Random();
    while (true)
    {
        yield return r.Next(max); // Random.Next(max) returns a value in 0..max-1
    }
}
Call it like this to get a million numbers from 0 up to (but not including) 9999999:
var numbers = GetRandomSequence(9999999).Take(1000000);
As for sorting, or if you don't want to allow repeats, look at Enumerable.Range() (which will give you a consecutive ordered sequence) and use a Fisher-Yates (or Knuth) shuffle algorithm (which you can find all over the place).
"Completely shuffled" is a very misunderstood term. One trick fraud experts use when examining what should be "random" data is to watch for cases where there are no duplicate values (like the 88 in 374388123), because in a truly random sequence the chance of not having such a pair is very low. Exactly what are you trying to do? What exactly do you mean by "completely shuffled"? If all you mean is a random sequence of digits, then just use the Random class in the CLR to generate random numbers between 0 and 1M, as many as you need.
Well, you could go with something like this (assuming that you want every number exactly once):
DECLARE @intFrom int
DECLARE @intTo int
DECLARE @tblList table (_id uniqueidentifier, _number int)
SET @intFrom = 0
SET @intTo = 1000000
WHILE (@intFrom < @intTo)
BEGIN
    INSERT INTO @tblList
    SELECT NewID(), @intFrom
    SET @intFrom = @intFrom + 1
END
SELECT *
FROM @tblList
ORDER BY _id
DISCLAIMER: I didn't test this, since I don't have an SQL Server at my disposal at the moment.
This may get you what you need:
1) Populate a list of numbers in order. If your range is 1 - x, it'll look like this:
[1, 2, 3, 4, 5, 6, 7, 8, 9, ... , x]
2) Loop over the list x times, each time choosing a random number between 0 and the length of your list - 1.
3) Use this chosen number to select the corresponding element from your list, and add this number to your output list.
4) Delete the element you just selected from your list. Rinse, repeat.
This will work for any range of numbers, not just lists that start with 1 or 0. The pseudocode looks like this:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
shuffled_nums = []
for i in range(0, len(nums)):
    # pick an index between 0 and the current length - 1, inclusive
    random_index = rand(0, len(nums) - 1)
    shuffled_nums.add(nums[random_index])
    del(nums[random_index])
