Combination Algorithm - c#

Length = input Long(can be 2550, 2880, 2568, etc)
List<long> = {618, 350, 308, 300, 250, 232, 200, 128}
The program takes a long value, for that particular long value we have to find the possible combination from the above list which when added give me a input result(same value can be used twice). There can be a difference of +/- 30.
Largest numbers have to be used most.
Ex:Length = 868
For this combinations can be
Combination 1 = 618 + 250
Combination 2 = 308 + 232 + 200 +128
Correct Combination would be Combination 1
But there should also be different combinations.
public static void Main(string[] args)
{
//subtotal list
List<int> totals = new List<int>(new int[] { 618, 350, 308, 300, 250, 232, 200, 128 });
// get matches
List<int[]> results = KnapSack.MatchTotal(2682, totals);
// print results
foreach (var result in results)
{
Console.WriteLine(string.Join(",", result));
}
Console.WriteLine("Done.");
}
internal static List<int[]> MatchTotal(int theTotal, List<int> subTotals)
{
List<int[]> results = new List<int[]>();
while (subTotals.Contains(theTotal))
{
results.Add(new int[1] { theTotal });
subTotals.Remove(theTotal);
}
if (subTotals.Count == 0)
return results;
subTotals.Sort();
double mostNegativeNumber = subTotals[0];
if (mostNegativeNumber > 0)
mostNegativeNumber = 0;
if (mostNegativeNumber == 0)
subTotals.RemoveAll(d => d > theTotal);
for (int choose = 0; choose <= subTotals.Count; choose++)
{
IEnumerable<IEnumerable<int>> combos = Combination.Combinations(subTotals.AsEnumerable(), choose);
results.AddRange(from combo in combos where combo.Sum() == theTotal select combo.ToArray());
}
return results;
}
public static class Combination
{
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int choose)
{
return choose == 0 ?
new[] { new T[0] } :
elements.SelectMany((element, i) =>
elements.Skip(i + 1).Combinations(choose - 1).Select(combo => (new[] { element }).Concat(combo)));
}
}
I Have used the above code, can it be more simplified, Again here also i get unique values. A value can be used any number of times. But the largest number has to be given the most priority.
I have a validation to check whether the total of the sum is greater than the input value. The logic fails even there..

The algorithm you have shown assumes that the list is sorted in ascending order. If not, then you shall first have to sort the list in O(nlogn) time and then execute the algorithm.
Also, it assumes that you are only considering combinations of pairs and you exit on the first match.
If you want to find all combinations, then instead of "break", just output the combination and increment startIndex or decrement endIndex.
Moreover, you should check for ranges (targetSum - 30 to targetSum + 30) rather than just the exact value because the problem says that a margin of error is allowed.
This is the best solution according to me because its complexity is O(nlogn + n) including the sorting.

V4 - Recursive Method, using Stack structure instead of stack frames on thread
It works (tested in VS), but there could be some bugs remaining.
static int Threshold = 30;
private static Stack<long> RecursiveMethod(long target)
{
Stack<long> Combination = new Stack<long>(establishedValues.Count); //Can grow bigger, as big as (target / min(establishedValues)) values
Stack<int> Index = new Stack<int>(establishedValues.Count); //Can grow bigger
int lowerBound = 0;
int dimensionIndex = lowerBound;
long fail = -1 * Threshold;
while (true)
{
long thisVal = establishedValues[dimensionIndex];
dimensionIndex++;
long afterApplied = target - thisVal;
if (afterApplied < fail)
lowerBound = dimensionIndex;
else
{
target = afterApplied;
Combination.Push(thisVal);
if (target <= Threshold)
return Combination;
Index.Push(dimensionIndex);
dimensionIndex = lowerBound;
}
if (dimensionIndex >= establishedValues.Count)
{
if (Index.Count == 0)
return null; //No possible combinations
dimensionIndex = Index.Pop();
lowerBound = dimensionIndex;
target += Combination.Pop();
}
}
}
Maybe V3 - Suggestion for Ordered solution trying every combination
Although this isn't chosen as the answer for the related question, I believe this is a good approach - https://stackoverflow.com/a/17258033/887092(, otherwise you could try the chosen answer (although the output for that is only 2 items in set being summed, rather than up to n items)) - it will enumerate every option including multiples of the same value. V2 works but would be slightly less efficient than an ordered solution, as the same failing-attempt will likely be attempted multiple times.
V2 - Random Selection - Will be able to reuse the same number twice
I'm a fan of using random for "intelligence", allowing the computer to brute force the solution. It's also easy to distribute - as there is no state dependence between two threads trying at the same time for example.
static int Threshold = 30;
public static List<long> RandomMethod(long Target)
{
List<long> Combinations = new List<long>();
Random rnd = new Random();
//Assuming establishedValues is sorted
int LowerBound = 0;
long runningSum = Target;
while (true)
{
int newLowerBound = FindLowerBound(LowerBound, runningSum);
if (newLowerBound == -1)
{
//No more beneficial values to work with, reset
runningSum = Target;
Combinations.Clear();
LowerBound = 0;
continue;
}
LowerBound = newLowerBound;
int rIndex = rnd.Next(LowerBound, establishedValues.Count);
long val = establishedValues[rIndex];
runningSum -= val;
Combinations.Add(val);
if (Math.Abs(runningSum) <= 30)
return Combinations;
}
}
static int FindLowerBound(int currentLowerBound, long runningSum)
{
//Adjust lower bound, so we're not randomly trying a number that's too high
for (int i = currentLowerBound; i < establishedValues.Count; i++)
{
//Factor in the threshold, because an end aggregate which exceeds by 20 is better than underperforming by 21.
if ((establishedValues[i] - Threshold) < runningSum)
{
return i;
}
}
return -1;
}
V1 - Ordered selection - Will not be able to reuse the same number twice
Add this very handy extension function (uses a binary algorithm to find all combinations):
//Make sure you put this in a static class inside System namespace
public static IEnumerable<List<T>> EachCombination<T>(this List<T> allValues)
{
var collection = new List<List<T>>();
for (int counter = 0; counter < (1 << allValues.Count); ++counter)
{
List<T> combination = new List<T>();
for (int i = 0; i < allValues.Count; ++i)
{
if ((counter & (1 << i)) == 0)
combination.Add(allValues[i]);
}
if (combination.Count == 0)
continue;
yield return combination;
}
}
Use the function
static List<long> establishedValues = new List<long>() {618, 350, 308, 300, 250, 232, 200, 128, 180, 118, 155};
//Return is a list of the values which sum to equal the target. Null if not found.
List<long> FindFirstCombination(long target)
{
foreach (var combination in establishedValues.EachCombination())
{
//if (combination.Sum() == target)
if (Math.Abs(combination.Sum() - target) <= 30) //Plus or minus tolerance for difference
return combination;
}
return null; //Or you could throw an exception
}
Test the solution
var target = 858;
var result = FindFirstCombination(target);
bool success = (result != null && result.Sum() == target);
//TODO: for loop with random selection of numbers from the establishedValues, Sum and test through FindFirstCombination

Related

Speed up processing 32 bit numbers in combinations (k from n)

I have a list of 128 32 bit numbers, and I want to know, is there any combination of 12 numbers, so that all numbers XORed give the 32 bit number with all bits set to 1.
So I have started with naive approach and took combinations generator like that:
private static IEnumerable<int[]> Combinations(int k, int n)
{
var state = new int[k];
var stack = new Stack<int>();
stack.Push(0);
while (stack.Count > 0)
{
var index = stack.Count - 1;
var value = stack.Pop();
while (value < n)
{
state[index++] = value++;
if (value < n)
{
stack.Push(value);
}
if (index == k)
{
yield return state;
break;
}
}
}
}
and used it like that (data32 is an array of given 32bit numbers)
foreach (var probe in Combinations(12, 128))
{
int p = 0;
foreach (var index in probe)
{
p = p ^ data32[index];
}
if (p == -1)
{
//print out found combination
}
}
Of course it takes forever to check all 23726045489546400 combinations...
So my question(s) are - am I missing something in options how to speedup the check process?
Even if I do the calculation of combinations in partitions (e.g. I could start like 8 threads each will check combination started with numbers 0..8), or speed up the XORing by storing the perviously calculated combination - it is still slow.
P.S. I'd like it to run in reasonable time - minutes, hours not years.
Adding a list of numbers as was requested in one of the comments:
1571089837
2107702069
466053875
226802789
506212087
484103496
1826565655
944897655
1370004928
748118360
1000006005
952591039
2072497930
2115635395
966264796
1229014633
827262231
1276114545
1480412665
2041893083
512565106
1737382276
1045554806
172937528
1746275907
1376570954
1122801782
2013209036
1650561071
1595622894
425898265
770953281
422056706
477352958
1295095933
1783223223
842809023
1939751129
1444043041
1560819338
1810926532
353960897
1128003064
1933682525
1979092040
1987208467
1523445101
174223141
79066913
985640026
798869234
151300097
770795939
1489060367
823126463
1240588773
490645418
832012849
188524191
1034384571
1802169877
150139833
1762370591
1425112310
2121257460
205136626
706737928
265841960
517939268
2070634717
1703052170
1536225470
1511643524
1220003866
714424500
49991283
688093717
1815765740
41049469
529293552
1432086255
1001031015
1792304327
1533146564
399287468
1520421007
153855202
1969342940
742525121
1326187406
1268489176
729430821
1785462100
1180954683
422085275
1578687761
2096405952
1267903266
2105330329
471048135
764314242
459028205
1313062337
1995689086
1786352917
2072560816
282249055
1711434199
1463257872
1497178274
472287065
246628231
1928555152
1908869676
1629894534
885445498
1710706530
1250732374
107768432
524848610
2791827620
1607140095
1820646148
774737399
1808462165
194589252
1051374116
1802033814
I don't know C#, I did something in Python, maybe interesting anyway. Takes about 0.8 seconds to find a solution for your sample set:
solution = {422056706, 2791827620, 506212087, 1571089837, 827262231, 1650561071, 1595622894, 512565106, 205136626, 944897655, 966264796, 477352958}
len(solution) = 12
solution.issubset(nums) = True
hex(xor(solution)) = '0xffffffff'
There are 128C12 combinations, that's 5.5 million times as many as the 232 possible XOR values. So I tried being optimistic and only tried a subset of the possible combinations. I split the 128 numbers into two blocks of 28 and 100 numbers and try combinations with six numbers from each of the two blocks. I put all possible XORs of the first block into a hash set A, then go through all XORs of the second block to find one whose bitwise inversion is in that set. Then I reconstruct the individual numbers.
This way I cover (28C6)2 × (100C6)2 = 4.5e14 combinations, still over 100000 times as many as there are possible XOR values. So probably still a very good chance to find a valid combination.
Code (Try it online!):
from itertools import combinations
from functools import reduce
from operator import xor as xor_
nums = list(map(int, '1571089837 2107702069 466053875 226802789 506212087 484103496 1826565655 944897655 1370004928 748118360 1000006005 952591039 2072497930 2115635395 966264796 1229014633 827262231 1276114545 1480412665 2041893083 512565106 1737382276 1045554806 172937528 1746275907 1376570954 1122801782 2013209036 1650561071 1595622894 425898265 770953281 422056706 477352958 1295095933 1783223223 842809023 1939751129 1444043041 1560819338 1810926532 353960897 1128003064 1933682525 1979092040 1987208467 1523445101 174223141 79066913 985640026 798869234 151300097 770795939 1489060367 823126463 1240588773 490645418 832012849 188524191 1034384571 1802169877 150139833 1762370591 1425112310 2121257460 205136626 706737928 265841960 517939268 2070634717 1703052170 1536225470 1511643524 1220003866 714424500 49991283 688093717 1815765740 41049469 529293552 1432086255 1001031015 1792304327 1533146564 399287468 1520421007 153855202 1969342940 742525121 1326187406 1268489176 729430821 1785462100 1180954683 422085275 1578687761 2096405952 1267903266 2105330329 471048135 764314242 459028205 1313062337 1995689086 1786352917 2072560816 282249055 1711434199 1463257872 1497178274 472287065 246628231 1928555152 1908869676 1629894534 885445498 1710706530 1250732374 107768432 524848610 2791827620 1607140095 1820646148 774737399 1808462165 194589252 1051374116 1802033814'.split()))
def xor(vals):
return reduce(xor_, vals)
A = {xor(a)^0xffffffff: a
for a in combinations(nums[:28], 6)}
for b in combinations(nums[28:], 6):
if a := A.get(xor(b)):
break
solution = {*a, *b}
print(f'{solution = }')
print(f'{len(solution) = }')
print(f'{solution.issubset(nums) = }')
print(f'{hex(xor(solution)) = }')
Arrange your numbers into buckets based on the position of the first 1 bit.
To set the first bit to 1, you will have to use an odd number of the items in the corresponding bucket....
As you recurse, try to maintain an invariant that the number of leading 1 bits is increasing and then select the bucket that will change the next 0 to a 1, this will greatly reduce the number of combinations that you have to try.
I have found a possible solution, which could work for my particular task.
As main issue to straitforward approach I see a number of 2E16 combinations.
But, if I want to check if combination of 12 elements equal to 0xFFFFFFFF, I could check if 2 different combinations of 6 elements with opposit values exists.
That will reduce number of combinations to "just" 5E9, which is achievable.
On first attempt I think to store all combinations and then find opposites in the big list. But, in .NET I could not find quick way of storing more then Int32.MaxValue elements.
Taking in account idea with bits from comments and answer, I decided to store at first only xor sums with leftmost bit set to 1, and then by definition I need to check only sums with leftmost bit set to 0 => reducing storage by 2.
In the end it appears that many collisions could appear, so there are many combinations with the same xor sum.
Current version which could find such combinations, need to be compiled in x64 mode and use (any impovements welcomed):
static uint print32(int[] comb, uint[] data)
{
uint p = 0;
for (int i = 0; i < comb.Length; i++)
{
Console.Write("{0} ", comb[i]);
p = p ^ data[comb[i]];
}
Console.WriteLine(" #[{0:X}]", p);
return p;
}
static uint[] data32;
static void Main(string[] args)
{
int n = 128;
int k = 6;
uint p = 0;
uint inv = 0;
long t = 0;
//load n numbers from a file
init(n);
var lookup1x = new Dictionary<uint, List<byte>>();
var lookup0x = new Dictionary<uint, List<byte>>();
Stopwatch watch = new Stopwatch();
watch.Start();
//do not use IEnumerable generator, use function directly to reuse xor value
var hash = new uint[k];
var comb = new int[k];
var stack = new Stack<int>();
stack.Push(0);
while (stack.Count > 0)
{
var index = stack.Count - 1;
var value = stack.Pop();
if (index == 0)
{
p = 0;
Console.WriteLine("Start {0} sequence, combinations found: {1}",value,t);
}
else
{
//restore previous xor value
p = hash[index - 1];
}
while (value < n)
{
//xor and store
p = p ^ data32[value];
hash[index] = p;
//remember current state (combination)
comb[index++] = value++;
if (value < n)
{
stack.Push(value);
}
//combination filled to end
if (index == k)
{
//if xor have MSB set, put it to lookup table 1x
if ((p & 0x8000000) == 0x8000000)
{
lookup1x[p] = comb.Select(i => (byte)i).ToList();
inv = p ^ 0xFFFFFFFF;
if (lookup0x.ContainsKey(inv))
{
var full = lookup0x[inv].Union(lookup1x[p]).OrderBy(x=>x).ToArray();
if (full.Length == 12)
{
print32(full, data32);
}
}
}
else
{
//otherwise put it to lookup table 2, but skip all combinations which are started with 0
if (comb[0] != 0)
{
lookup0x[p] = comb.Select(i => (byte)i).ToList();
inv = p ^ 0xFFFFFFFF;
if (lookup1x.ContainsKey(inv))
{
var full = lookup0x[p].Union(lookup1x[inv]).OrderBy(x=>x).ToArray();
if (full.Length == 12)
{
print32(full, data32);
}
}
}
}
t++;
break;
}
}
}
Console.WriteLine("Check was done in {0} ms ", watch.ElapsedMilliseconds);
//end
}

c# finding even or odd numbers

Question: You are given an array (which will have a length of at least 3, but could be very large) containing integers. The array is either entirely comprised of odd integers or entirely comprised of even integers except for a single integer N. Write a method that takes the array as an argument and returns this "outlier" N.
Examples
[2, 4, 0, 100, 4, 11, 2602, 36]
Should return: 11 (the only odd number)
[160, 3, 1719, 19, 11, 13, -21]
Should return: 160 (the only even number)
Problem: I can find the target number. It is "6" in this instance. But when i try to write it, it is written 3 times. I did not understand why it is not written once.
namespace sayitoplamları
{
class Program
{
static void Main(string[] args)
{
int[] integers;
integers = new int[] {1,3,5,6};
int even = 0;
int odd = 0;
foreach (var num in integers)
{
if (num%2==0)
{
even += 1;
}
else
{
odd += 1;
}
if (even>1)
{
foreach (var item in integers)
{
if (item%2!=0)
{
Console.WriteLine(item);
}
}
}
else if (odd>1)
{
foreach (var item in integers)
{
if (item%2==0)
{
Console.WriteLine(item);
}
}
}
}
}
}
}```
Since we are doing this assignment for a good grade :)
We really don't need to count both odd and even elements and can just count odd once as it more convenient - just add up all remainders ( odd % 2 = 1, even % 2 = 0). If we would need number of even elements it would be array.Length - countOdd.
int countOdd = 0;
for (int i = 0; i < array.Length; i++) // or foreach
{
countOdd += array[i] % 2;
}
If you feel that trick with counting is too complicated and only needed for A+ mark on the assignment than variant of your original code is fine:
int countOdd = 0;
foreach (int element in array)
{
if (element % 2 == 1)
countOdd ++; // which is exactly countOdd += 1, just shorter.
}
Now for the second part - finding the outlier. We don't even need to know if it is Odd or Even but rather just reminder from "% 2". We now know how many odd element - which could be either 1 (than desired remainder is 1 - we are looking for odd element) or more than one (then desired remainder is 0 as we are looking for even one).
int desiredRemainder = countOdd == 1 ? 1 : 0;
Now all its left is to check which element has desired remainder and return from our method:
foreach (int element in array)
{
if (element % 2 == desiredRemainder)
{
return element;
}
}
Please note that assignment asks for "return" and not "print". Printing the value does not allow it to be used by the caller of the method unlike returning it. Using early return also saves some time to iterate the rest of array.
Complete code:
int FindOutlier(int[] array)
{
int desiredReminder = array.Sum(x => x % 2) == 1 ? 1 : 0;
return array.First(x => x % 2 == desiredReminder);
}
Alternative solution would be to iterate array only once - since we need to return first odd or first even number as soon as we figure out which one repeats similar how your tried with nested foreach. Notes since we are guaranteed that outlier present and is only one we don't need to try to remember first number - just current is ok.
int FindOutlier(int[] array)
{
int? odd = null; // you can use separate int + bool instead
int? even = null;
int countOdd = 0;
int countEven = 0;
foreach (int element in array)
{
bool isEven = element % 2 == 0;
if (isEven)
{
countEven++;
even = element;
}
else
{
countOdd++;
odd = element;
}
if (countEven > 1 && odd.HasValue)
return odd.Value;
if (countOdd > 1 && even.HasValue)
return even.Value;
}
throw new Exception("Value must be found before");
}
And if you ok to always iterate whole list code will be simple as again we'd need to only keep track of count of odd numbers (or even, just one kind) and return would not need to check if we found the value yet as we know that both type of numbers are present in the list:
int FindOutlier(int[] array)
{
int odd = 0;
int even = 0;
int countEven = 0;
foreach (int element in array)
{
if (element % 2 == 0)
{
countEven++;
even= element;
}
else
{
odd = element;
}
}
return countEven > 1 ? odd : even;
}
You are mixing everything into a single algorithm. In most cases it's a better idea to have separate steps. In your case, the steps could be:
Count odds and evens
Find the outlier
Print the result
That way you can make sure that the output does not interfere with the loops.
class Program
{
static void Main(string[] args)
{
int[] integers;
integers = new int[] {1,3,5,6};
int even = 0;
int odd = 0;
// Count odds and evens
foreach (var num in integers)
{
if (num%2==0)
{
even += 1;
}
else
{
odd += 1;
}
}
// Find the outlier
var outlier = 0;
foreach (var item in integers)
{
if (even == 1 && item%2==0)
{
outlier = item;
break;
}
if (odd == 1 && item%2!=0)
{
outlier = item;
break;
}
}
// Print the result
Console.WriteLine(outlier);
}
}
You could even make this separate methods. It also makes sense due to SoC (separation of concerns) and if you want to achieve testability (unit tests).
You want a method anyway and use the return statement, as mentioned in the assignment
Write a method that takes the array as an argument and returns this "outlier" N.
Maybe you want to check the following code, which uses separate methods:
class Program
{
static void Main(string[] args)
{
int[] integers;
integers = new int[] { 1, 3, 5, 6 };
var outlier = FindOutlier(integers);
Console.WriteLine(outlier);
}
private static int FindOutlier(int[] integers)
{
if (integers.Length < 3) throw new ArgumentException("Need at least 3 elements.");
bool isEvenOutlier = IsEvenOutlier(integers);
var outlier = FindOutlier(integers, isEvenOutlier ? 0:1);
return outlier;
}
static bool IsEvenOutlier(int[] integers)
{
// Count odds and evens
int even = 0;
int odd = 0;
foreach (var num in integers)
{
if (num % 2 == 0)
{
even += 1;
}
else
{
odd += 1;
}
}
// Check which one occurs only once
if (even == 1) return true;
if (odd == 1) return false;
throw new ArgumentException("No outlier in data.", nameof(integers));
}
private static int FindOutlier(int[] integers, int remainder)
{
foreach (var item in integers)
{
if (item % 2 == remainder)
{
return item;
}
}
throw new ArgumentException("No outlier in argument.", nameof(integers));
}
}
The reason it is printing '6' 3 times is because you iterate over the inetegers inside a loop where you iterate over the integers. Once with num and again with item.
Consider what happens when you run it against [1,3,5,6].
num=1, you get odd=1 and even=0. Neither of the if conditions are true so we move to the next iteration.
num=3, you get odd=2 and even=0. odd > 1 so we iterate over the integers (with item) again and print '6' when item=6.
num=5, you get odd=3 and even=0. Again, this will print '6' when item=6.
num=6, you get odd=3 and even=1. Even though we incremented the even count, we still have odd > 3 so will print '6' when item=6.
It is best to iterate over the values once and store the last even and last odd numbers along with the count (it is more performant as well).
You can then print the last odd number if even > 0 (as there will only be one odd number, which will be the last one) or the last even number if odd > 0.
I.e.
namespace sayitoplamları
{
class Program
{
static void Main(string[] args)
{
int[] integers;
integers = new int[] { 1, 3, 5, 6 };
int even = 0;
int? lastEven = null;
int odd = 0;
int? lastOdd = null;
foreach (var num in integers)
{
if (num % 2 == 0)
{
even += 1;
lastEven = num;
}
else
{
odd += 1;
lastOdd = num;
}
}
if (even > 1)
{
Console.WriteLine(lastOdd);
}
else if (odd > 1)
{
Console.WriteLine(lastEven);
}
}
}
}
because you didn't close the the first foreach loop before printing...
namespace sayitoplamları
{
class Program
{
static void Main(string[] args)
{
int[] integers;
integers = new int[] {1,3,5,6};
int even = 0;
int odd = 0;
foreach (var num in integers)
{
if (num%2==0)
{
even += 1;
}
else
{
odd += 1;
}//you need to close it here
if (even>1)
{
foreach (var item in integers)
{
if (item%2!=0)
{
Console.WriteLine(item);
}
}
}
else if (odd>1)
{
foreach (var item in integers)
{
if (item%2==0)
{
Console.WriteLine(item);
}
}
}
}
}//not here
}
}

Nearest value from user input in an array C#

so in my application , I read some files into it and ask the user for a number , in these files there a lot of numbers and I am trying to find the nearest value when the number they enter is not in the file. So far I have as following
static int nearest(int close_num, int[] a)
{
foreach (int bob in a)
{
if ((close_num -= bob) <= 0)
return bob;
}
return -1;
}
Console.WriteLine("Enter a number to find out if is in the selected Net File: ");
int i3 = Convert.ToInt32(Console.ReadLine());
bool checker = false;
//Single nearest = 0;
//linear search#1
for (int i = 0; i < a.Length; i++)//looping through array
{
if(a[i] == i3)//checking to see the value is found in the array
{
Console.WriteLine("Value found and the position of it in the descending value of the selected Net File is: " + a[i]);
checker = true;
}
else
{
int found = nearest(i3,a);
Console.WriteLine("Cannot find this number in the Net File however here the closest number to that: " + found );
//Console.WriteLine("Cannot find this number in the Net File however here the closest number to that : " + nearest);
}
}
When a value that is in the file is entered the output is fine , but when it comes to the nearest value I cannot figure a way. I can't use this such as BinarySearchArray for this. a = the array whilst i3 is the value the user has entered. Would a binary search algorithm just be simpler for this?
Any help would be appreciated.
You need to make a pass over all the elements of the array, comparing each one in turn to find the smallest difference. At the same time, keep a note of the current nearest value.
There are many ways to do this; here's a fairly simple one:
static int nearest(int close_num, int[] a)
{
int result = -1;
long smallestDelta = long.MaxValue;
foreach (int bob in a)
{
long delta = (bob > close_num) ? (bob - close_num) : (close_num - bob);
if (delta < smallestDelta)
{
smallestDelta = delta;
result = bob;
}
}
return result;
}
Note that delta is calculated so that it is the absolute value of the difference.
Well, first we should define, what is nearest. Assuming that,
int nearest for given int number is the item of int[] a such that Math.Abs(nearest - number) is the smallest possible value
we can put it as
static int nearest(int number, int[] a)
{
long diff = -1;
int result = 0;
foreach (int item in a)
{
// actual = Math.Abs((long)item - number);
long actual = (long)item - number;
if (actual < 0)
actual = -actual;
// if item is the very first value or better than result
if (diff < 0 || actual < diff) {
result = item;
diff = actual;
}
}
return result;
}
The only tricky part is long for diff: it may appear that item - number exceeds int range (and will either have IntegerOverflow exceprion thrown or *invalid answer), e.g.
int[] a = new int[] {int.MaxValue, int.MaxValue - 1};
Console.Write(nearest(int.MinValue, a));
Note, that expected result is 2147483646, not 2147483647
what about LINQ ?
var nearestNumber = a.OrderBy(x => Math.Abs(x - i3)).First();
Just iterate through massive and find the minimal delta between close_num and array members
static int nearest(int close_num, int[] a)
{
// initialize as big number, 1000 just an example
int min_delta=1000;
int result=-1;
foreach (int bob in a)
{
if (Math.Abs(bob-close_num) <= min_delta)
{
min_delta = bob-close_num;
result = bob;
}
}
return result;
}

How can I obtain all the possible combination of a subset?

Consider this List<string>
List<string> data = new List<string>();
data.Add("Text1");
data.Add("Text2");
data.Add("Text3");
data.Add("Text4");
The problem I had was: how can I get every combination of a subset of the list?
Kinda like this:
#Subset Dimension 4
Text1;Text2;Text3;Text4
#Subset Dimension 3
Text1;Text2;Text3;
Text1;Text2;Text4;
Text1;Text3;Text4;
Text2;Text3;Text4;
#Subset Dimension 2
Text1;Text2;
Text1;Text3;
Text1;Text4;
Text2;Text3;
Text2;Text4;
#Subset Dimension 1
Text1;
Text2;
Text3;
Text4;
I came up with a decent solution which a think is worth to share here.
Similar logic as Abaco's answer, different implementation....
foreach (var ss in data.SubSets_LB())
{
Console.WriteLine(String.Join("; ",ss));
}
public static class SO_EXTENSIONS
{
public static IEnumerable<IEnumerable<T>> SubSets_LB<T>(
this IEnumerable<T> enumerable)
{
List<T> list = enumerable.ToList();
ulong upper = (ulong)1 << list.Count;
for (ulong i = 0; i < upper; i++)
{
List<T> l = new List<T>(list.Count);
for (int j = 0; j < sizeof(ulong) * 8; j++)
{
if (((ulong)1 << j) >= upper) break;
if (((i >> j) & 1) == 1)
{
l.Add(list[j]);
}
}
yield return l;
}
}
}
I think, the answers in this question need some performance tests. I'll give it a go. It is community wiki, feel free to update it.
void PerfTest()
{
var list = Enumerable.Range(0, 21).ToList();
var t1 = GetDurationInMs(list.SubSets_LB);
var t2 = GetDurationInMs(list.SubSets_Jodrell2);
var t3 = GetDurationInMs(() => list.CalcCombinations(20));
Console.WriteLine("{0}\n{1}\n{2}", t1, t2, t3);
}
long GetDurationInMs(Func<IEnumerable<IEnumerable<int>>> fxn)
{
fxn(); //JIT???
var count = 0;
var sw = Stopwatch.StartNew();
foreach (var ss in fxn())
{
count = ss.Sum();
}
return sw.ElapsedMilliseconds;
}
OUTPUT:
1281
1604 (_Jodrell not _Jodrell2)
6817
Jodrell's Update
I've built in release mode, i.e. optimizations on. When I run via Visual Studio I don't get a consistent bias between 1 or 2, but after repeated runs LB's answer wins, I get answers approaching something like,
1190
1260
more
but if I run the test harness from the command line, not via Visual Studio, I get results more like this
987
879
still more
EDIT
I've accepted the performance gauntlet, what follows is my amalgamation that takes the best of all answers. In my testing, it seems to have the best performance yet.
public static IEnumerable<IEnumerable<T>> SubSets_Jodrell2<T>(
this IEnumerable<T> source)
{
var list = source.ToList();
var limit = (ulong)(1 << list.Count);
for (var i = limit; i > 0; i--)
{
yield return list.SubSet(i);
}
}
private static IEnumerable<T> SubSet<T>(
this IList<T> source, ulong bits)
{
for (var i = 0; i < source.Count; i++)
{
if (((bits >> i) & 1) == 1)
{
yield return source[i];
}
}
}
Same idea again, almost the same as L.B's answer but my own interpretation.
I avoid the use of an internal List and Math.Pow.
public static IEnumerable<IEnumerable<T>> SubSets_Jodrell(
this IEnumerable<T> source)
{
var count = source.Count();
if (count > 64)
{
throw new OverflowException("Not Supported ...");
}
var limit = (ulong)(1 << count) - 2;
for (var i = limit; i > 0; i--)
{
yield return source.SubSet(i);
}
}
private static IEnumerable<T> SubSet<T>(
this IEnumerable<T> source,
ulong bits)
{
var check = (ulong)1;
foreach (var t in source)
{
if ((bits & check) > 0)
{
yield return t;
}
check <<= 1;
}
}
You'll note that these methods don't work with more than 64 elements in the intial set but it starts to take a while then anyhow.
I developed a simple ExtensionMethod for lists:
/// <summary>
/// Obtain all the combinations of the elements contained in a list
/// </summary>
/// <param name="subsetDimension">Subset Dimension</param>
/// <returns>IEnumerable containing all the differents subsets</returns>
public static IEnumerable<List<T>> CalcCombinations<T>(this List<T> list, int subsetDimension)
{
//First of all we will create a binary matrix. The dimension of a single row
//must be the dimension of list
//on which we are working (we need a 0 or a 1 for every single element) so row
//dimension is to obtain a row-length = list.count we have to
//populate the matrix with the first 2^list.Count binary numbers
int rowDimension = Convert.ToInt32(Math.Pow(2, list.Count));
//Now we start counting! We will fill our matrix with every number from 1
//(0 is meaningless) to rowDimension
//we are creating binary mask, hence the name
List<int[]> combinationMasks = new List<int[]>();
for (int i = 1; i < rowDimension; i++)
{
//I'll grab the binary rapresentation of the number
string binaryString = Convert.ToString(i, 2);
//I'll initialize an array of the apropriate dimension
int[] mask = new int[list.Count];
//Now, we have to convert our string in a array of 0 and 1, so first we
//obtain an array of int then we have to copy it inside our mask
//(which have the appropriate dimension), the Reverse()
//is used because of the behaviour of CopyTo()
binaryString.Select(x => x == '0' ? 0 : 1).Reverse().ToArray().CopyTo(mask, 0);
//Why should we keep masks of a dimension which isn't the one of the subset?
// We have to filter it then!
if (mask.Sum() == subsetDimension) combinationMasks.Add(mask);
}
//And now we apply the matrix to our list
foreach (int[] mask in combinationMasks)
{
List<T> temporaryList = new List<T>(list);
//Executes the cycle in reverse order to avoid index out of bound
for (int iter = mask.Length - 1; iter >= 0; iter--)
{
//Whenever a 0 is found the correspondent item is removed from the list
if (mask[iter] == 0)
temporaryList.RemoveAt(iter);
}
yield return temporaryList;
}
}
}
So considering the example in the question:
# Row Dimension of 4 (list.Count)
Binary Numbers to 2^4
# Binary Matrix
0 0 0 1 => skip
0 0 1 0 => skip
[...]
0 1 1 1 => added // Text2;Text3;Text4
[...]
1 0 1 1 => added // Text1;Text3;Text4
1 1 0 0 => skip
1 1 0 1 => added // Text1;Text2;Text4
1 1 1 0 => added // Text1;Text2;Text3
1 1 1 1 => skip
Hope this can help someone :)
If you need clarification or you want to contribute feel free to add answers or comments (which one is more appropriate).

selection based on percentage weighting

I have a set of values, and an associated percentage for each:
a: 70% chance
b: 20% chance
c: 10% chance
I want to select a value (a, b, c) based on the percentage chance given.
how do I approach this?
my attempt so far looks like this:
r = random.random()
if r <= .7:
return a
elif r <= .9:
return b
else:
return c
I'm stuck coming up with an algorithm to handle this. How should I approach this so it can handle larger sets of values without just chaining together if-else flows.
(any explanation or answers in pseudo-code are fine. a python or C# implementation would be especially helpful)
Here is a complete solution in C#:
public class ProportionValue<T>
{
public double Proportion { get; set; }
public T Value { get; set; }
}
public static class ProportionValue
{
public static ProportionValue<T> Create<T>(double proportion, T value)
{
return new ProportionValue<T> { Proportion = proportion, Value = value };
}
static Random random = new Random();
public static T ChooseByRandom<T>(
this IEnumerable<ProportionValue<T>> collection)
{
var rnd = random.NextDouble();
foreach (var item in collection)
{
if (rnd < item.Proportion)
return item.Value;
rnd -= item.Proportion;
}
throw new InvalidOperationException(
"The proportions in the collection do not add up to 1.");
}
}
Usage:
var list = new[] {
ProportionValue.Create(0.7, "a"),
ProportionValue.Create(0.2, "b"),
ProportionValue.Create(0.1, "c")
};
// Outputs "a" with probability 0.7, etc.
Console.WriteLine(list.ChooseByRandom());
For Python:
>>> import random
>>> dst = 70, 20, 10
>>> vls = 'a', 'b', 'c'
>>> picks = [v for v, d in zip(vls, dst) for _ in range(d)]
>>> for _ in range(12): print random.choice(picks),
...
a c c b a a a a a a a a
>>> for _ in range(12): print random.choice(picks),
...
a c a c a b b b a a a a
>>> for _ in range(12): print random.choice(picks),
...
a a a a c c a c a a c a
>>>
General idea: make a list where each item is repeated a number of times proportional to the probability it should have; use random.choice to pick one at random (uniformly), this will match your required probability distribution. Can be a bit wasteful of memory if your probabilities are expressed in peculiar ways (e.g., 70, 20, 10 makes a 100-items list where 7, 2, 1 would make a list of just 10 items with exactly the same behavior), but you could divide all the counts in the probabilities list by their greatest common factor if you think that's likely to be a big deal in your specific application scenario.
Apart from memory consumption issues, this should be the fastest solution -- just one random number generation per required output result, and the fastest possible lookup from that random number, no comparisons &c. If your likely probabilities are very weird (e.g., floating point numbers that need to be matched to many, many significant digits), other approaches may be preferable;-).
Knuth references Walker's method of aliases. Searching on this, I find http://code.activestate.com/recipes/576564-walkers-alias-method-for-random-objects-with-diffe/ and http://prxq.wordpress.com/2006/04/17/the-alias-method/. This gives the exact probabilities required in constant time per number generated with linear time for setup (curiously, n log n time for setup if you use exactly the method Knuth describes, which does a preparatory sort you can avoid).
Take the list of and find the cumulative total of the weights: 70, 70+20, 70+20+10. Pick a random number greater than or equal to zero and less than the total. Iterate over the items and return the first value for which the cumulative sum of the weights is greater than this random number:
def select( values ):
variate = random.random() * sum( values.values() )
cumulative = 0.0
for item, weight in values.items():
cumulative += weight
if variate < cumulative:
return item
return item # Shouldn't get here, but just in case of rounding...
print select( { "a": 70, "b": 20, "c": 10 } )
This solution, as implemented, should also be able to handle fractional weights and weights that add up to any number so long as they're all non-negative.
Let T = the sum of all item weights
Let R = a random number between 0 and T
Iterate the item list subtracting each item weight from R and return the item that causes the result to become <= 0.
def weighted_choice(probabilities):
random_position = random.random() * sum(probabilities)
current_position = 0.0
for i, p in enumerate(probabilities):
current_position += p
if random_position < current_position:
return i
return None
Because random.random will always return < 1.0, the final return should never be reached.
import random
def selector(weights):
i=random.random()*sum(x for x,y in weights)
for w,v in weights:
if w>=i:
break
i-=w
return v
weights = ((70,'a'),(20,'b'),(10,'c'))
print [selector(weights) for x in range(10)]
it works equally well for fractional weights
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
print [selector(weights) for x in range(10)]
If you have a lot of weights, you can use bisect to reduce the number of iterations required
import random
import bisect
def make_acc_weights(weights):
acc=0
acc_weights = []
for w,v in weights:
acc+=w
acc_weights.append((acc,v))
return acc_weights
def selector(acc_weights):
i=random.random()*sum(x for x,y in weights)
return weights[bisect.bisect(acc_weights, (i,))][1]
weights = ((70,'a'),(20,'b'),(10,'c'))
acc_weights = make_acc_weights(weights)
print [selector(acc_weights) for x in range(100)]
Also works fine for fractional weights
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
acc_weights = make_acc_weights(weights)
print [selector(acc_weights) for x in range(100)]
today, the update of python document give an example to make a random.choice() with weighted probabilities:
If the weights are small integer ratios, a simple technique is to build a sample population with repeats:
>>> weighted_choices = [('Red', 3), ('Blue', 2), ('Yellow', 1), ('Green', 4)]
>>> population = [val for val, cnt in weighted_choices for i in range(cnt)]
>>> random.choice(population)
'Green'
A more general approach is to arrange the weights in a cumulative distribution with itertools.accumulate(), and then locate the random value with bisect.bisect():
>>> choices, weights = zip(*weighted_choices)
>>> cumdist = list(itertools.accumulate(weights))
>>> x = random.random() * cumdist[-1]
>>> choices[bisect.bisect(cumdist, x)]
'Blue'
one note: itertools.accumulate() needs python 3.2 or define it with the Equivalent.
I think you can have an array of small objects (I implemented in Java although I know a little bit C# but I am afraid can write wrong code), so you may need to port it yourself. The code in C# will be much smaller with struct, var but I hope you get the idea
class PercentString {
double percent;
String value;
// Constructor for 2 values
}
ArrayList<PercentString> list = new ArrayList<PercentString();
list.add(new PercentString(70, "a");
list.add(new PercentString(20, "b");
list.add(new PercentString(10, "c");
double percent = 0;
for (int i = 0; i < list.size(); i++) {
PercentString p = list.get(i);
percent += p.percent;
if (random < percent) {
return p.value;
}
}
If you are really up to speed and want to generate the random values quickly, the Walker's algorithm mcdowella mentioned in https://stackoverflow.com/a/3655773/1212517 is pretty much the best way to go (O(1) time for random(), and O(N) time for preprocess()).
For anyone who is interested, here is my own PHP implementation of the algorithm:
/**
* Pre-process the samples (Walker's alias method).
* #param array key represents the sample, value is the weight
*/
protected function preprocess($weights){
$N = count($weights);
$sum = array_sum($weights);
$avg = $sum / (double)$N;
//divide the array of weights to values smaller and geq than sum/N
$smaller = array_filter($weights, function($itm) use ($avg){ return $avg > $itm;}); $sN = count($smaller);
$greater_eq = array_filter($weights, function($itm) use ($avg){ return $avg <= $itm;}); $gN = count($greater_eq);
$bin = array(); //bins
//we want to fill N bins
for($i = 0;$i<$N;$i++){
//At first, decide for a first value in this bin
//if there are small intervals left, we choose one
if($sN > 0){
$choice1 = each($smaller);
unset($smaller[$choice1['key']]);
$sN--;
} else{ //otherwise, we split a large interval
$choice1 = each($greater_eq);
unset($greater_eq[$choice1['key']]);
}
//splitting happens here - the unused part of interval is thrown back to the array
if($choice1['value'] >= $avg){
if($choice1['value'] - $avg >= $avg){
$greater_eq[$choice1['key']] = $choice1['value'] - $avg;
}else if($choice1['value'] - $avg > 0){
$smaller[$choice1['key']] = $choice1['value'] - $avg;
$sN++;
}
//this bin comprises of only one value
$bin[] = array(1=>$choice1['key'], 2=>null, 'p1'=>1, 'p2'=>0);
}else{
//make the second choice for the current bin
$choice2 = each($greater_eq);
unset($greater_eq[$choice2['key']]);
//splitting on the second interval
if($choice2['value'] - $avg + $choice1['value'] >= $avg){
$greater_eq[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
}else{
$smaller[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
$sN++;
}
//this bin comprises of two values
$choice2['value'] = $avg - $choice1['value'];
$bin[] = array(1=>$choice1['key'], 2=>$choice2['key'],
'p1'=>$choice1['value'] / $avg,
'p2'=>$choice2['value'] / $avg);
}
}
$this->bins = $bin;
}
/**
* Choose a random sample according to the weights.
*/
public function random(){
$bin = $this->bins[array_rand($this->bins)];
$randValue = (lcg_value() < $bin['p1'])?$bin[1]:$bin[2];
}
Here is my version that can apply to any IList and normalize the weight. It is based on Timwi's solution : selection based on percentage weighting
/// <summary>
/// return a random element of the list or default if list is empty
/// </summary>
/// <param name="e"></param>
/// <param name="weightSelector">
/// return chances to be picked for the element. A weigh of 0 or less means 0 chance to be picked.
/// If all elements have weight of 0 or less they all have equal chances to be picked.
/// </param>
/// <returns></returns>
public static T AnyOrDefault<T>(this IList<T> e, Func<T, double> weightSelector)
{
if (e.Count < 1)
return default(T);
if (e.Count == 1)
return e[0];
var weights = e.Select(o => Math.Max(weightSelector(o), 0)).ToArray();
var sum = weights.Sum(d => d);
var rnd = new Random().NextDouble();
for (int i = 0; i < weights.Length; i++)
{
//Normalize weight
var w = sum == 0
? 1 / (double)e.Count
: weights[i] / sum;
if (rnd < w)
return e[i];
rnd -= w;
}
throw new Exception("Should not happen");
}
I've my own solution for this:
public class Randomizator3000
{
public class Item<T>
{
public T value;
public float weight;
public static float GetTotalWeight<T>(Item<T>[] p_itens)
{
float __toReturn = 0;
foreach(var item in p_itens)
{
__toReturn += item.weight;
}
return __toReturn;
}
}
private static System.Random _randHolder;
private static System.Random _random
{
get
{
if(_randHolder == null)
_randHolder = new System.Random();
return _randHolder;
}
}
public static T PickOne<T>(Item<T>[] p_itens)
{
if(p_itens == null || p_itens.Length == 0)
{
return default(T);
}
float __randomizedValue = (float)_random.NextDouble() * (Item<T>.GetTotalWeight(p_itens));
float __adding = 0;
for(int i = 0; i < p_itens.Length; i ++)
{
float __cacheValue = p_itens[i].weight + __adding;
if(__randomizedValue <= __cacheValue)
{
return p_itens[i].value;
}
__adding = __cacheValue;
}
return p_itens[p_itens.Length - 1].value;
}
}
And using it should be something like that (thats in Unity3d)
using UnityEngine;
using System.Collections;
public class teste : MonoBehaviour
{
Randomizator3000.Item<string>[] lista;
void Start()
{
lista = new Randomizator3000.Item<string>[10];
lista[0] = new Randomizator3000.Item<string>();
lista[0].weight = 10;
lista[0].value = "a";
lista[1] = new Randomizator3000.Item<string>();
lista[1].weight = 10;
lista[1].value = "b";
lista[2] = new Randomizator3000.Item<string>();
lista[2].weight = 10;
lista[2].value = "c";
lista[3] = new Randomizator3000.Item<string>();
lista[3].weight = 10;
lista[3].value = "d";
lista[4] = new Randomizator3000.Item<string>();
lista[4].weight = 10;
lista[4].value = "e";
lista[5] = new Randomizator3000.Item<string>();
lista[5].weight = 10;
lista[5].value = "f";
lista[6] = new Randomizator3000.Item<string>();
lista[6].weight = 10;
lista[6].value = "g";
lista[7] = new Randomizator3000.Item<string>();
lista[7].weight = 10;
lista[7].value = "h";
lista[8] = new Randomizator3000.Item<string>();
lista[8].weight = 10;
lista[8].value = "i";
lista[9] = new Randomizator3000.Item<string>();
lista[9].weight = 10;
lista[9].value = "j";
}
void Update ()
{
Debug.Log(Randomizator3000.PickOne<string>(lista));
}
}
In this example each value has a 10% chance do be displayed as a debug =3
Based loosely on python's numpy.random.choice(a=items, p=probs), which takes an array and a probability array of the same size.
public T RandomChoice<T>(IEnumerable<T> a, IEnumerable<double> p)
{
IEnumerator<T> ae = a.GetEnumerator();
Random random = new Random();
double target = random.NextDouble();
double accumulator = 0;
foreach (var prob in p)
{
ae.MoveNext();
accumulator += prob;
if (accumulator > target)
{
break;
}
}
return ae.Current;
}
The probability array p must sum to (approx.) 1. This is to keep it consistent with the numpy interface (and mathematics), but you could easily change that if you wanted.

Categories

Resources