I have a list of 128 32-bit numbers, and I want to know whether there is any combination of 12 of them whose XOR is the 32-bit number with all bits set to 1.
So I started with a naive approach and used a combinations generator like this:
private static IEnumerable<int[]> Combinations(int k, int n)
{
var state = new int[k];
var stack = new Stack<int>();
stack.Push(0);
while (stack.Count > 0)
{
var index = stack.Count - 1;
var value = stack.Pop();
while (value < n)
{
state[index++] = value++;
if (value < n)
{
stack.Push(value);
}
if (index == k)
{
yield return state;
break;
}
}
}
}
and used it like this (data32 is an array of the given 32-bit numbers):
foreach (var probe in Combinations(12, 128))
{
int p = 0;
foreach (var index in probe)
{
p = p ^ data32[index];
}
if (p == -1)
{
//print out found combination
}
}
Of course it takes forever to check all 23726045489546400 combinations...
So my question is: am I missing some option for speeding up the check?
Even if I split the combinations into partitions (e.g. I could start 8 threads, each checking the combinations that start with one of the numbers 0..8), or speed up the XORing by reusing the previously calculated partial result, it is still too slow.
P.S. I'd like it to run in a reasonable time: minutes or hours, not years.
Adding a list of numbers as was requested in one of the comments:
1571089837
2107702069
466053875
226802789
506212087
484103496
1826565655
944897655
1370004928
748118360
1000006005
952591039
2072497930
2115635395
966264796
1229014633
827262231
1276114545
1480412665
2041893083
512565106
1737382276
1045554806
172937528
1746275907
1376570954
1122801782
2013209036
1650561071
1595622894
425898265
770953281
422056706
477352958
1295095933
1783223223
842809023
1939751129
1444043041
1560819338
1810926532
353960897
1128003064
1933682525
1979092040
1987208467
1523445101
174223141
79066913
985640026
798869234
151300097
770795939
1489060367
823126463
1240588773
490645418
832012849
188524191
1034384571
1802169877
150139833
1762370591
1425112310
2121257460
205136626
706737928
265841960
517939268
2070634717
1703052170
1536225470
1511643524
1220003866
714424500
49991283
688093717
1815765740
41049469
529293552
1432086255
1001031015
1792304327
1533146564
399287468
1520421007
153855202
1969342940
742525121
1326187406
1268489176
729430821
1785462100
1180954683
422085275
1578687761
2096405952
1267903266
2105330329
471048135
764314242
459028205
1313062337
1995689086
1786352917
2072560816
282249055
1711434199
1463257872
1497178274
472287065
246628231
1928555152
1908869676
1629894534
885445498
1710706530
1250732374
107768432
524848610
2791827620
1607140095
1820646148
774737399
1808462165
194589252
1051374116
1802033814
I don't know C#, I did something in Python, maybe interesting anyway. Takes about 0.8 seconds to find a solution for your sample set:
solution = {422056706, 2791827620, 506212087, 1571089837, 827262231, 1650561071, 1595622894, 512565106, 205136626, 944897655, 966264796, 477352958}
len(solution) = 12
solution.issubset(nums) = True
hex(xor(solution)) = '0xffffffff'
There are 128C12 combinations, which is about 5.5 million times as many as the 2^32 possible XOR values. So I was optimistic and tried only a subset of the possible combinations. I split the 128 numbers into two blocks of 28 and 100 numbers and try combinations with six numbers from each of the two blocks. I put all possible XORs of the first block into a hash set A, then go through all XORs of the second block to find one whose bitwise inversion is in that set. Then I reconstruct the individual numbers.
This way I cover 28C6 × 100C6 ≈ 4.5e14 combinations, still over 100000 times as many as there are possible XOR values. So there is probably still a very good chance of finding a valid combination.
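The counts quoted above can be double-checked with a few lines of Python. This is only a sanity check of the arithmetic, assuming Python 3.8+ for math.comb; the variable names are illustrative and not part of the answer's code.

```python
from math import comb

xor_values = 2 ** 32                        # number of distinct 32-bit XOR results
all_combos = comb(128, 12)                  # every 12-element subset of 128 numbers
split_combos = comb(28, 6) * comb(100, 6)   # 6 from the first 28, 6 from the other 100

print(all_combos)                           # total number of 12-element subsets
print(all_combos / xor_values)              # subsets per possible XOR value
print(split_combos / xor_values)            # same ratio for the 28/100 split
```

Both ratios are far above 1, which is the informal reason for expecting the restricted search to still contain a solution.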
Code (Try it online!):
from itertools import combinations
from functools import reduce
from operator import xor as xor_
nums = list(map(int, '1571089837 2107702069 466053875 226802789 506212087 484103496 1826565655 944897655 1370004928 748118360 1000006005 952591039 2072497930 2115635395 966264796 1229014633 827262231 1276114545 1480412665 2041893083 512565106 1737382276 1045554806 172937528 1746275907 1376570954 1122801782 2013209036 1650561071 1595622894 425898265 770953281 422056706 477352958 1295095933 1783223223 842809023 1939751129 1444043041 1560819338 1810926532 353960897 1128003064 1933682525 1979092040 1987208467 1523445101 174223141 79066913 985640026 798869234 151300097 770795939 1489060367 823126463 1240588773 490645418 832012849 188524191 1034384571 1802169877 150139833 1762370591 1425112310 2121257460 205136626 706737928 265841960 517939268 2070634717 1703052170 1536225470 1511643524 1220003866 714424500 49991283 688093717 1815765740 41049469 529293552 1432086255 1001031015 1792304327 1533146564 399287468 1520421007 153855202 1969342940 742525121 1326187406 1268489176 729430821 1785462100 1180954683 422085275 1578687761 2096405952 1267903266 2105330329 471048135 764314242 459028205 1313062337 1995689086 1786352917 2072560816 282249055 1711434199 1463257872 1497178274 472287065 246628231 1928555152 1908869676 1629894534 885445498 1710706530 1250732374 107768432 524848610 2791827620 1607140095 1820646148 774737399 1808462165 194589252 1051374116 1802033814'.split()))
def xor(vals):
    return reduce(xor_, vals)
A = {xor(a)^0xffffffff: a
     for a in combinations(nums[:28], 6)}
for b in combinations(nums[28:], 6):
    if a := A.get(xor(b)):
        break
solution = {*a, *b}
print(f'{solution = }')
print(f'{len(solution) = }')
print(f'{solution.issubset(nums) = }')
print(f'{hex(xor(solution)) = }')
Arrange your numbers into buckets based on the position of the first 1 bit.
To set the first bit to 1, you will have to use an odd number of the items in the corresponding bucket....
As you recurse, try to maintain an invariant that the number of leading 1 bits is increasing and then select the bucket that will change the next 0 to a 1, this will greatly reduce the number of combinations that you have to try.
I have found a possible solution, which could work for my particular task.
The main issue with the straightforward approach is the number of combinations, about 2E16.
But if I want to check whether a combination of 12 elements XORs to 0xFFFFFFFF, I can instead check whether two different combinations of 6 elements with opposite (bitwise-inverted) XOR values exist.
That reduces the number of combinations to "just" 5E9, which is achievable.
My first idea was to store all combinations and then find opposites in the big list. But in .NET I could not find a quick way of storing more than Int32.MaxValue elements.
Taking into account the idea with bits from the comments and the answer, I decided to store at first only XOR sums with the leftmost bit set to 1; then by definition I need to check only sums with the leftmost bit set to 0, which halves the storage.
In the end it turned out that there are many collisions, i.e. many combinations with the same XOR sum.
The current version, which can find such combinations, needs to be compiled in x64 mode (any improvements welcome):
//comb holds indices into data; they are stored as bytes in the lookup tables
static uint print32(byte[] comb, uint[] data)
{
uint p = 0;
for (int i = 0; i < comb.Length; i++)
{
Console.Write("{0} ", comb[i]);
p = p ^ data[comb[i]];
}
Console.WriteLine(" #[{0:X}]", p);
return p;
}
static uint[] data32;
static void Main(string[] args)
{
int n = 128;
int k = 6;
uint p = 0;
uint inv = 0;
long t = 0;
//load n numbers from a file
init(n);
var lookup1x = new Dictionary<uint, List<byte>>();
var lookup0x = new Dictionary<uint, List<byte>>();
Stopwatch watch = new Stopwatch();
watch.Start();
//do not use IEnumerable generator, use function directly to reuse xor value
var hash = new uint[k];
var comb = new int[k];
var stack = new Stack<int>();
stack.Push(0);
while (stack.Count > 0)
{
var index = stack.Count - 1;
var value = stack.Pop();
if (index == 0)
{
p = 0;
Console.WriteLine("Start {0} sequence, combinations found: {1}",value,t);
}
else
{
//restore previous xor value
p = hash[index - 1];
}
while (value < n)
{
//xor and store
p = p ^ data32[value];
hash[index] = p;
//remember current state (combination)
comb[index++] = value++;
if (value < n)
{
stack.Push(value);
}
//combination filled to end
if (index == k)
{
//if the xor has its MSB (bit 31) set, put it into lookup table 1x
if ((p & 0x80000000) == 0x80000000)
{
lookup1x[p] = comb.Select(i => (byte)i).ToList();
inv = p ^ 0xFFFFFFFF;
if (lookup0x.ContainsKey(inv))
{
var full = lookup0x[inv].Union(lookup1x[p]).OrderBy(x=>x).ToArray();
if (full.Length == 12)
{
print32(full, data32);
}
}
}
else
{
//otherwise put it to lookup table 2, but skip all combinations which are started with 0
if (comb[0] != 0)
{
lookup0x[p] = comb.Select(i => (byte)i).ToList();
inv = p ^ 0xFFFFFFFF;
if (lookup1x.ContainsKey(inv))
{
var full = lookup0x[p].Union(lookup1x[inv]).OrderBy(x=>x).ToArray();
if (full.Length == 12)
{
print32(full, data32);
}
}
}
}
t++;
break;
}
}
}
Console.WriteLine("Check was done in {0} ms ", watch.ElapsedMilliseconds);
//end
}
I'm at a loss as to why I can't get this seemingly simple problem solved using Microsoft Solver Foundation.
All I need is to modify the weights (numbers) of certain observations to ensure that no 1 observation's weight AS A PERCENTAGE exceeds 25%. This is for the purposes of later calculating a constrained weighted average with the results of this algorithm.
For example, given the 5 weights of { 45, 100, 33, 500, 28 }, I would expect the result of this algorithm to be { 45, 53, 33, 53, 28 }, where 2 of the numbers had to be reduced such that they're within the 25% threshold of the new total (212 = 45+53+33+53+28) while the others remained untouched. Note that even though initially, the 2nd weight of 100 was only 14% of the total (706), as a result of decreasing the 4th weight of 500, it subsequently pushed up the % of the other observations and therein lies the only challenge with this.
I tried to recreate this using Solver, only for it to tell me that the solution is "Infeasible", and it just returns all 1s. Update: the solution need not use Solver; any alternative is welcome so long as it is fast when dealing with a decent number of weights.
var solver = SolverContext.GetContext();
var model = solver.CreateModel();
var decisionList = new List<Decision>();
decisionList.Add(new Decision(Domain.IntegerRange(1, 45), "Dec1"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 100), "Dec2"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 33), "Dec3"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 500), "Dec4"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 28), "Dec5"));
model.AddDecisions(decisionList.ToArray());
int weightLimit = 25;
foreach (var decision in model.Decisions)
{
model.AddConstraint(decision.Name + "weightLimit", 100 * (decision / Model.Sum(model.Decisions.ToArray())) <= weightLimit);
}
model.AddGoal("calcGoal", GoalKind.Maximize, Model.Sum(model.Decisions.ToArray()));
var solution = solver.Solve();
foreach (var decision in model.Decisions)
{
Debug.Print(decision.GetDouble().ToString());
}
Debug.Print("Solution Quality: " + solution.Quality.ToString());
Any help with this would be very much appreciated, thanks in advance.
I ditched Solver b/c it didn't live up to its name imo (or I didn't live up to its standards :)). Below is where I landed. Because this function gets used many times and on large lists of input weights, efficiency and performance are key so this function attempts to do the least # of iterations possible (let me know if anyone has any suggested improvements though). The results get used for a weighted average so I use "AttributeWeightPair" to store the value (attribute) and its weight and the function below is what modifies the weights to be within the constraint when given a list of these AWPs. The function assumes that weightLimit is passed in as a %, e.g. 25% gets passed in as 25, not 0.25 --- ok I'll stop stating what'll be obvious from the code - so here it is:
public static List<AttributeWeightPair<decimal>> WeightLimiter(List<AttributeWeightPair<decimal>> source, decimal weightLimit)
{
weightLimit /= 100; //convert from a percentage (e.g. 25) to a fraction (0.25)
var zeroWeights = source.Where(w => w.Weight == 0).ToList();
var nonZeroWeights = source.Where(w => w.Weight > 0).ToList();
if (nonZeroWeights.Count == 0)
return source;
//return equal weights if given infeasible constraint
if ((1m / nonZeroWeights.Count()) > weightLimit)
{
nonZeroWeights.ForEach(w => w.Weight = 1);
return nonZeroWeights.Concat(zeroWeights).ToList();
}
//return original list if weight-limiting is unnecessary
if ((nonZeroWeights.Max(w => w.Weight) / nonZeroWeights.Sum(w => w.Weight)) <= weightLimit)
{
return source;
}
//sort (ascending) and store original weights
nonZeroWeights = nonZeroWeights.OrderBy(w => w.Weight).ToList();
var originalWeights = nonZeroWeights.Select(w => w.Weight).ToList();
//set starting point and determine direction from there
var initialSumWeights = nonZeroWeights.Sum(w => w.Weight);
var initialLimit = weightLimit * initialSumWeights;
var initialSuspects = nonZeroWeights.Where(w => w.Weight > initialLimit).ToList();
var initialTarget = weightLimit * (initialSumWeights - (initialSuspects.Sum(w => w.Weight) - initialLimit * initialSuspects.Count()));
var antepenultimateIndex = Math.Max(nonZeroWeights.FindLastIndex(w => w.Weight <= initialTarget), 1); //needs to be at least 1
for (int i = antepenultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[antepenultimateIndex - 1]; //set cap equal to the preceding weight
}
bool goingUp = (nonZeroWeights[antepenultimateIndex].Weight / nonZeroWeights.Sum(w => w.Weight)) > weightLimit ? false : true;
//Procedure 1 - find the weight # at which a cap would result in a weight % just UNDER the weight limit
int penultimateIndex = antepenultimateIndex;
bool justUnderTarget = false;
while (!justUnderTarget)
{
for (int i = penultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[penultimateIndex - 1]; //set cap equal to the preceding weight
}
var currentMaxPcntWeight = nonZeroWeights[penultimateIndex].Weight / nonZeroWeights.Sum(w => w.Weight);
if (currentMaxPcntWeight == weightLimit)
{
return nonZeroWeights.Concat(zeroWeights).ToList();
}
else if (goingUp && currentMaxPcntWeight < weightLimit)
{
nonZeroWeights[penultimateIndex].Weight = originalWeights[penultimateIndex]; //reset
if (penultimateIndex < nonZeroWeights.Count() - 1)
penultimateIndex++; //move up
else break;
}
else if (!goingUp && currentMaxPcntWeight > weightLimit)
{
if (penultimateIndex > 1)
penultimateIndex--; //move down
else break;
}
else
{
justUnderTarget = true;
}
}
if (goingUp) //then need to back up a step
{
penultimateIndex = (penultimateIndex > 1 ? penultimateIndex - 1 : 1);
for (int i = penultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[penultimateIndex - 1];
}
}
//Procedure 2 - increment the modified weights (subject to a cap equal to their original values) until the weight limit is hit (allowing a very slight overage for the last term in some cases)
int ultimateIndex = penultimateIndex;
var sumWeights = nonZeroWeights.Sum(w => w.Weight); //use this counter instead of summing every time for condition check within loop
bool justOverTarget = false;
while (!justOverTarget)
{
for (int i = ultimateIndex; i < nonZeroWeights.Count(); i++)
{
if (nonZeroWeights[i].Weight + 1 > originalWeights[i])
{
if (ultimateIndex < nonZeroWeights.Count() - 1)
ultimateIndex++;
else justOverTarget = true;
}
else
{
nonZeroWeights[i].Weight++;
sumWeights++;
}
}
if ((nonZeroWeights.Last().Weight / sumWeights) >= weightLimit)
{
justOverTarget = true;
}
}
return nonZeroWeights.Concat(zeroWeights).ToList();
}
public class AttributeWeightPair<T>
{
public T Attribute { get; set; }
public decimal? Weight { get; set; }
public AttributeWeightPair(T attribute, decimal? count)
{
this.Attribute = attribute;
this.Weight = count;
}
}
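For comparison, the same constraint can also be solved in closed form. The following is a minimal Python sketch, not the function above: it assumes positive weights, takes the limit as a fraction (0.25 rather than 25), and returns real-valued caps instead of incrementing integer weights. The name cap_weights is hypothetical. If the k largest weights are capped at a common value c and the rest sum to S, then c = limit * (S + k * c), so c = limit * S / (1 - k * limit); the loop looks for the smallest k for which the largest uncapped weight already fits under c.

```python
def cap_weights(weights, limit):
    """Cap the largest weights so that no weight exceeds `limit` (a fraction,
    e.g. 0.25) of the new total; smaller weights are left untouched."""
    ordered = sorted(weights, reverse=True)
    n = len(ordered)
    for k in range(n):                      # k = number of weights that get capped
        headroom = 1 - k * limit
        if headroom <= 0:
            break                           # too few weights to satisfy the limit
        uncapped_sum = sum(ordered[k:])
        # Solve cap = limit * (uncapped_sum + k * cap) for cap.
        cap = limit * uncapped_sum / headroom
        if ordered[k] <= cap:               # largest uncapped weight already fits
            return [min(w, cap) for w in weights]
    raise ValueError("limit is infeasible for this many weights")

print(cap_weights([45, 100, 33, 500, 28], 0.25))
```

For the example in the question this caps the two largest weights at 53 and leaves the others unchanged. Unlike the function above, the sketch raises an error for an infeasible limit rather than returning equal weights.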
Length = input Long(can be 2550, 2880, 2568, etc)
List<long> = {618, 350, 308, 300, 250, 232, 200, 128}
The program takes a long value, and for that value we have to find the possible combinations of numbers from the above list which, when added, give the input value (the same value can be used twice). A difference of +/- 30 is allowed.
The largest numbers should be used the most.
Ex: Length = 868
For this, the combinations can be
Combination 1 = 618 + 250
Combination 2 = 308 + 232 + 200 +128
Correct Combination would be Combination 1
But there should also be different combinations.
public static void Main(string[] args)
{
//subtotal list
List<int> totals = new List<int>(new int[] { 618, 350, 308, 300, 250, 232, 200, 128 });
// get matches
List<int[]> results = KnapSack.MatchTotal(2682, totals);
// print results
foreach (var result in results)
{
Console.WriteLine(string.Join(",", result));
}
Console.WriteLine("Done.");
}
internal static List<int[]> MatchTotal(int theTotal, List<int> subTotals)
{
List<int[]> results = new List<int[]>();
while (subTotals.Contains(theTotal))
{
results.Add(new int[1] { theTotal });
subTotals.Remove(theTotal);
}
if (subTotals.Count == 0)
return results;
subTotals.Sort();
double mostNegativeNumber = subTotals[0];
if (mostNegativeNumber > 0)
mostNegativeNumber = 0;
if (mostNegativeNumber == 0)
subTotals.RemoveAll(d => d > theTotal);
for (int choose = 0; choose <= subTotals.Count; choose++)
{
IEnumerable<IEnumerable<int>> combos = Combination.Combinations(subTotals.AsEnumerable(), choose);
results.AddRange(from combo in combos where combo.Sum() == theTotal select combo.ToArray());
}
return results;
}
public static class Combination
{
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int choose)
{
return choose == 0 ?
new[] { new T[0] } :
elements.SelectMany((element, i) =>
elements.Skip(i + 1).Combinations(choose - 1).Select(combo => (new[] { element }).Concat(combo)));
}
}
I have used the above code. Can it be simplified? Here, too, I only get unique values, but a value may be used any number of times, and the largest number has to be given the highest priority.
I also have a validation that checks whether the sum is greater than the input value, and the logic fails even there.
The algorithm you have shown assumes that the list is sorted in ascending order. If not, then you shall first have to sort the list in O(nlogn) time and then execute the algorithm.
Also, it assumes that you are only considering combinations of pairs and you exit on the first match.
If you want to find all combinations, then instead of "break", just output the combination and increment startIndex or decrement endIndex.
Moreover, you should check for ranges (targetSum - 30 to targetSum + 30) rather than just the exact value because the problem says that a margin of error is allowed.
In my opinion this is the best solution, because its complexity is O(n log n + n) including the sorting.
V4 - Recursive Method, using Stack structure instead of stack frames on thread
It works (tested in VS), but there could be some bugs remaining.
static int Threshold = 30;
private static Stack<long> RecursiveMethod(long target)
{
Stack<long> Combination = new Stack<long>(establishedValues.Count); //Can grow bigger, as big as (target / min(establishedValues)) values
Stack<int> Index = new Stack<int>(establishedValues.Count); //Can grow bigger
int lowerBound = 0;
int dimensionIndex = lowerBound;
long fail = -1 * Threshold;
while (true)
{
long thisVal = establishedValues[dimensionIndex];
dimensionIndex++;
long afterApplied = target - thisVal;
if (afterApplied < fail)
lowerBound = dimensionIndex;
else
{
target = afterApplied;
Combination.Push(thisVal);
if (target <= Threshold)
return Combination;
Index.Push(dimensionIndex);
dimensionIndex = lowerBound;
}
if (dimensionIndex >= establishedValues.Count)
{
if (Index.Count == 0)
return null; //No possible combinations
dimensionIndex = Index.Pop();
lowerBound = dimensionIndex;
target += Combination.Pop();
}
}
}
Maybe V3 - Suggestion for Ordered solution trying every combination
Although it wasn't chosen as the answer to the related question, I believe this is a good approach: https://stackoverflow.com/a/17258033/887092. It will enumerate every option, including multiples of the same value. Otherwise you could try the chosen answer (although its output only sums 2 items from the set, rather than up to n items). V2 works but would be slightly less efficient than an ordered solution, as the same failing attempt will likely be tried multiple times.
V2 - Random Selection - Will be able to reuse the same number twice
I'm a fan of using random for "intelligence", allowing the computer to brute force the solution. It's also easy to distribute - as there is no state dependence between two threads trying at the same time for example.
static int Threshold = 30;
public static List<long> RandomMethod(long Target)
{
List<long> Combinations = new List<long>();
Random rnd = new Random();
//Assuming establishedValues is sorted
int LowerBound = 0;
long runningSum = Target;
while (true)
{
int newLowerBound = FindLowerBound(LowerBound, runningSum);
if (newLowerBound == -1)
{
//No more beneficial values to work with, reset
runningSum = Target;
Combinations.Clear();
LowerBound = 0;
continue;
}
LowerBound = newLowerBound;
int rIndex = rnd.Next(LowerBound, establishedValues.Count);
long val = establishedValues[rIndex];
runningSum -= val;
Combinations.Add(val);
if (Math.Abs(runningSum) <= Threshold)
return Combinations;
}
}
static int FindLowerBound(int currentLowerBound, long runningSum)
{
//Adjust lower bound, so we're not randomly trying a number that's too high
for (int i = currentLowerBound; i < establishedValues.Count; i++)
{
//Factor in the threshold, because an end aggregate which exceeds by 20 is better than underperforming by 21.
if ((establishedValues[i] - Threshold) < runningSum)
{
return i;
}
}
return -1;
}
V1 - Ordered selection - Will not be able to reuse the same number twice
Add this very handy extension function (uses a binary algorithm to find all combinations):
//Make sure you put this in a static class inside System namespace
public static IEnumerable<List<T>> EachCombination<T>(this List<T> allValues)
{
var collection = new List<List<T>>();
for (int counter = 0; counter < (1 << allValues.Count); ++counter)
{
List<T> combination = new List<T>();
for (int i = 0; i < allValues.Count; ++i)
{
if ((counter & (1 << i)) == 0)
combination.Add(allValues[i]);
}
if (combination.Count == 0)
continue;
yield return combination;
}
}
Use the function
static List<long> establishedValues = new List<long>() {618, 350, 308, 300, 250, 232, 200, 128, 180, 118, 155};
//Return is a list of the values which sum to equal the target. Null if not found.
List<long> FindFirstCombination(long target)
{
foreach (var combination in establishedValues.EachCombination())
{
//if (combination.Sum() == target)
if (Math.Abs(combination.Sum() - target) <= 30) //Plus or minus tolerance for difference
return combination;
}
return null; //Or you could throw an exception
}
Test the solution
var target = 858;
var result = FindFirstCombination(target);
bool success = (result != null && Math.Abs(result.Sum() - target) <= 30); //same tolerance as FindFirstCombination
//TODO: for loop with random selection of numbers from the establishedValues, Sum and test through FindFirstCombination
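The ordered search with reuse of values and a preference for the largest numbers can be sketched compactly in Python. This is an illustrative sketch rather than a port of any version above; the function name find_combination and its parameters are assumptions. It tries values from largest to smallest, allows the same value to be picked again, and returns the first combination whose sum is within the tolerance of the target.

```python
def find_combination(target, values, tolerance=30):
    """Return a list of values (repeats allowed, largest tried first) whose
    sum is within `tolerance` of `target`, or None if there is none."""
    values = sorted(values, reverse=True)

    def search(remaining, start):
        if abs(remaining) <= tolerance:
            return []                        # close enough: nothing more to add
        for i in range(start, len(values)):
            v = values[i]
            if remaining - v < -tolerance:
                continue                     # would overshoot beyond the tolerance
            rest = search(remaining - v, i)  # i, not i + 1: the value may be reused
            if rest is not None:
                return [v] + rest
        return None

    return search(target, 0)

sizes = [618, 350, 308, 300, 250, 232, 200, 128]
print(find_combination(868, sizes))
print(find_combination(2550, sizes))
```

Passing the current index down (instead of restarting at 0) keeps each result in non-increasing order, so the same multiset is never explored twice.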
I have a set of values, and an associated percentage for each:
a: 70% chance
b: 20% chance
c: 10% chance
I want to select a value (a, b, c) based on the percentage chance given.
how do I approach this?
my attempt so far looks like this:
r = random.random()
if r <= .7:
    return a
elif r <= .9:
    return b
else:
    return c
I'm stuck coming up with an algorithm to handle this. How should I approach this so it can handle larger sets of values without just chaining together if-else flows.
(any explanation or answers in pseudo-code are fine. a python or C# implementation would be especially helpful)
Here is a complete solution in C#:
public class ProportionValue<T>
{
public double Proportion { get; set; }
public T Value { get; set; }
}
public static class ProportionValue
{
public static ProportionValue<T> Create<T>(double proportion, T value)
{
return new ProportionValue<T> { Proportion = proportion, Value = value };
}
static Random random = new Random();
public static T ChooseByRandom<T>(
this IEnumerable<ProportionValue<T>> collection)
{
var rnd = random.NextDouble();
foreach (var item in collection)
{
if (rnd < item.Proportion)
return item.Value;
rnd -= item.Proportion;
}
throw new InvalidOperationException(
"The proportions in the collection do not add up to 1.");
}
}
Usage:
var list = new[] {
ProportionValue.Create(0.7, "a"),
ProportionValue.Create(0.2, "b"),
ProportionValue.Create(0.1, "c")
};
// Outputs "a" with probability 0.7, etc.
Console.WriteLine(list.ChooseByRandom());
For Python:
>>> import random
>>> dst = 70, 20, 10
>>> vls = 'a', 'b', 'c'
>>> picks = [v for v, d in zip(vls, dst) for _ in range(d)]
>>> for _ in range(12): print random.choice(picks),
...
a c c b a a a a a a a a
>>> for _ in range(12): print random.choice(picks),
...
a c a c a b b b a a a a
>>> for _ in range(12): print random.choice(picks),
...
a a a a c c a c a a c a
>>>
General idea: make a list where each item is repeated a number of times proportional to the probability it should have; use random.choice to pick one at random (uniformly), this will match your required probability distribution. Can be a bit wasteful of memory if your probabilities are expressed in peculiar ways (e.g., 70, 20, 10 makes a 100-items list where 7, 2, 1 would make a list of just 10 items with exactly the same behavior), but you could divide all the counts in the probabilities list by their greatest common factor if you think that's likely to be a big deal in your specific application scenario.
Apart from memory consumption issues, this should be the fastest solution -- just one random number generation per required output result, and the fastest possible lookup from that random number, no comparisons &c. If your likely probabilities are very weird (e.g., floating point numbers that need to be matched to many, many significant digits), other approaches may be preferable;-).
Knuth references Walker's method of aliases. Searching on this, I find http://code.activestate.com/recipes/576564-walkers-alias-method-for-random-objects-with-diffe/ and http://prxq.wordpress.com/2006/04/17/the-alias-method/. This gives the exact probabilities required in constant time per number generated with linear time for setup (curiously, n log n time for setup if you use exactly the method Knuth describes, which does a preparatory sort you can avoid).
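As a rough illustration of the alias method mentioned above, here is a minimal Python sketch using the linear-time setup usually attributed to Vose. It is not taken from either linked page; the function names build_alias_table and alias_sample are hypothetical, and weights are assumed to be non-negative with a positive sum.

```python
import random

def build_alias_table(weights):
    """Linear-time setup: returns (prob, alias) lists of length n."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]    # average bucket height is 1.0
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]                      # bucket s keeps its own share...
        alias[s] = l                             # ...and is topped up from l
        scaled[l] = scaled[l] + scaled[s] - 1.0  # what is left of l afterwards
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:                      # leftovers are full buckets
        prob[i] = 1.0
        alias[i] = i
    return prob, alias

def alias_sample(prob, alias):
    """Constant-time draw: pick a bucket, then its own item or its alias."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]

items = ['a', 'b', 'c']
prob, alias = build_alias_table([70, 20, 10])
print(items[alias_sample(prob, alias)])
```

Each draw costs one random bucket index and one comparison, regardless of how many items there are.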
Take the list of weights and find their cumulative totals: 70, 70+20, 70+20+10. Pick a random number greater than or equal to zero and less than the total. Iterate over the items and return the first value for which the cumulative sum of the weights is greater than this random number:
import random
def select( values ):
    variate = random.random() * sum( values.values() )
    cumulative = 0.0
    for item, weight in values.items():
        cumulative += weight
        if variate < cumulative:
            return item
    return item # Shouldn't get here, but just in case of rounding...
print(select( { "a": 70, "b": 20, "c": 10 } ))
This solution, as implemented, should also be able to handle fractional weights and weights that add up to any number so long as they're all non-negative.
Let T = the sum of all item weights
Let R = a random number between 0 and T
Iterate the item list subtracting each item weight from R and return the item that causes the result to become <= 0.
import random
def weighted_choice(probabilities):
    random_position = random.random() * sum(probabilities)
    current_position = 0.0
    for i, p in enumerate(probabilities):
        current_position += p
        if random_position < current_position:
            return i
    return None
Because random.random will always return < 1.0, the final return should never be reached.
import random
def selector(weights):
    i=random.random()*sum(x for x,y in weights)
    for w,v in weights:
        if w>=i:
            break
        i-=w
    return v
weights = ((70,'a'),(20,'b'),(10,'c'))
print([selector(weights) for x in range(10)])
it works equally well for fractional weights
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
print([selector(weights) for x in range(10)])
If you have a lot of weights, you can use bisect to reduce the number of iterations required
import random
import bisect
def make_acc_weights(weights):
    acc=0
    acc_weights = []
    for w,v in weights:
        acc+=w
        acc_weights.append((acc,v))
    return acc_weights
def selector(acc_weights):
    # the total weight is the last cumulative value
    i=random.random()*acc_weights[-1][0]
    return acc_weights[bisect.bisect(acc_weights, (i,))][1]
weights = ((70,'a'),(20,'b'),(10,'c'))
acc_weights = make_acc_weights(weights)
print([selector(acc_weights) for x in range(100)])
Also works fine for fractional weights
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
acc_weights = make_acc_weights(weights)
print([selector(acc_weights) for x in range(100)])
The Python documentation now gives an example of how to make a random.choice() with weighted probabilities:
If the weights are small integer ratios, a simple technique is to build a sample population with repeats:
>>> weighted_choices = [('Red', 3), ('Blue', 2), ('Yellow', 1), ('Green', 4)]
>>> population = [val for val, cnt in weighted_choices for i in range(cnt)]
>>> random.choice(population)
'Green'
A more general approach is to arrange the weights in a cumulative distribution with itertools.accumulate(), and then locate the random value with bisect.bisect():
>>> choices, weights = zip(*weighted_choices)
>>> cumdist = list(itertools.accumulate(weights))
>>> x = random.random() * cumdist[-1]
>>> choices[bisect.bisect(cumdist, x)]
'Blue'
One note: itertools.accumulate() needs Python 3.2; on older versions, define it yourself using the equivalent code given in the documentation.
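On Python 3.6 and later, the standard library also offers random.choices(), which accepts relative weights directly. A short sketch, assuming the values and weights from the question:

```python
import random

values = ['a', 'b', 'c']
weights = [70, 20, 10]          # relative weights; they need not sum to 100

# Draw 12 values with replacement, 'a' being the most likely each time.
picks = random.choices(values, weights=weights, k=12)
print(picks)
```

When many draws are needed from the same distribution, passing precomputed cum_weights instead of weights avoids recomputing the cumulative sums on every call.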
I think you can have an array of small objects (I implemented it in Java; I know a little C#, but I am afraid I would write wrong code, so you may need to port it yourself). The code in C# will be much smaller with struct and var, but I hope you get the idea.
class PercentString {
    double percent;
    String value;
    // Constructor for 2 values
}
ArrayList<PercentString> list = new ArrayList<PercentString>();
list.add(new PercentString(70, "a"));
list.add(new PercentString(20, "b"));
list.add(new PercentString(10, "c"));
// random number in [0, 100), since the percentages sum to 100
double random = Math.random() * 100;
double percent = 0;
for (int i = 0; i < list.size(); i++) {
    PercentString p = list.get(i);
    percent += p.percent;
    if (random < percent) {
        return p.value;
    }
}
If you are really up to speed and want to generate the random values quickly, the Walker's algorithm mcdowella mentioned in https://stackoverflow.com/a/3655773/1212517 is pretty much the best way to go (O(1) time for random(), and O(N) time for preprocess()).
For anyone who is interested, here is my own PHP implementation of the algorithm:
/**
* Pre-process the samples (Walker's alias method).
* @param array key represents the sample, value is the weight
*/
protected function preprocess($weights){
$N = count($weights);
$sum = array_sum($weights);
$avg = $sum / (double)$N;
//divide the array of weights to values smaller and geq than sum/N
$smaller = array_filter($weights, function($itm) use ($avg){ return $avg > $itm;}); $sN = count($smaller);
$greater_eq = array_filter($weights, function($itm) use ($avg){ return $avg <= $itm;}); $gN = count($greater_eq);
$bin = array(); //bins
//we want to fill N bins
for($i = 0;$i<$N;$i++){
//At first, decide for a first value in this bin
//if there are small intervals left, we choose one
if($sN > 0){
$choice1 = each($smaller);
unset($smaller[$choice1['key']]);
$sN--;
} else{ //otherwise, we split a large interval
$choice1 = each($greater_eq);
unset($greater_eq[$choice1['key']]);
}
//splitting happens here - the unused part of interval is thrown back to the array
if($choice1['value'] >= $avg){
if($choice1['value'] - $avg >= $avg){
$greater_eq[$choice1['key']] = $choice1['value'] - $avg;
}else if($choice1['value'] - $avg > 0){
$smaller[$choice1['key']] = $choice1['value'] - $avg;
$sN++;
}
//this bin comprises of only one value
$bin[] = array(1=>$choice1['key'], 2=>null, 'p1'=>1, 'p2'=>0);
}else{
//make the second choice for the current bin
$choice2 = each($greater_eq);
unset($greater_eq[$choice2['key']]);
//splitting on the second interval
if($choice2['value'] - $avg + $choice1['value'] >= $avg){
$greater_eq[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
}else{
$smaller[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
$sN++;
}
//this bin comprises of two values
$choice2['value'] = $avg - $choice1['value'];
$bin[] = array(1=>$choice1['key'], 2=>$choice2['key'],
'p1'=>$choice1['value'] / $avg,
'p2'=>$choice2['value'] / $avg);
}
}
$this->bins = $bin;
}
/**
* Choose a random sample according to the weights.
*/
public function random(){
$bin = $this->bins[array_rand($this->bins)];
//p1 is the chance of the bin's first sample; otherwise take the second one
return (lcg_value() < $bin['p1'])?$bin[1]:$bin[2];
}
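For comparison, here is a minimal Java sketch of the same idea, using the common two-stack ("Vose") formulation of the alias method rather than a port of the PHP above. The class name AliasSampler and its members are made up for illustration, and it assumes non-negative weights with a positive sum.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

/** Illustrative sketch of the alias method; not a port of the PHP code above. */
final class AliasSampler {
    private final double[] prob; // chance that bin i yields index i itself
    private final int[] alias;   // index that bin i yields otherwise
    private final Random rng;

    AliasSampler(double[] weights, Random rng) {
        int n = weights.length;
        double sum = 0;
        for (double w : weights) sum += w;
        if (n == 0 || sum <= 0) throw new IllegalArgumentException("need a positive total weight");
        this.prob = new double[n];
        this.alias = new int[n];
        this.rng = rng;

        // Scale the weights so that the average is exactly 1 (one full bin each).
        double[] scaled = new double[n];
        Deque<Integer> small = new ArrayDeque<>();
        Deque<Integer> large = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            scaled[i] = weights[i] * n / sum;
            if (scaled[i] < 1.0) small.push(i); else large.push(i);
        }

        // Top up each under-full bin from an over-full donor.
        while (!small.isEmpty() && !large.isEmpty()) {
            int s = small.pop();
            int l = large.pop();
            prob[s] = scaled[s];
            alias[s] = l;
            scaled[l] -= 1.0 - scaled[s];
            if (scaled[l] < 1.0) small.push(l); else large.push(l);
        }
        // Whatever is left is a full bin, up to floating-point round-off.
        while (!large.isEmpty()) { int i = large.pop(); prob[i] = 1.0; alias[i] = i; }
        while (!small.isEmpty()) { int i = small.pop(); prob[i] = 1.0; alias[i] = i; }
    }

    /** O(1) per draw: pick a bin uniformly, then one of its (at most) two entries. */
    int next() {
        int i = rng.nextInt(prob.length);
        return rng.nextDouble() < prob[i] ? i : alias[i];
    }
}
```

With weights {70, 20, 10} the three bins end up as "always 0", "1 with chance 0.6, else 0" and "2 with chance 0.3, else 0", which gives back the 70/20/10 split when a bin is picked uniformly.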
Here is my version, which can be applied to any IList and normalizes the weights. It is based on Timwi's solution (selection based on percentage weighting):
/// <summary>
/// return a random element of the list or default if list is empty
/// </summary>
/// <param name="e"></param>
/// <param name="weightSelector">
/// Returns the chance of the element being picked. A weight of 0 or less means no chance of being picked.
/// If all elements have a weight of 0 or less, they all have an equal chance of being picked.
/// </param>
/// <returns></returns>
public static T AnyOrDefault<T>(this IList<T> e, Func<T, double> weightSelector)
{
if (e.Count < 1)
return default(T);
if (e.Count == 1)
return e[0];
var weights = e.Select(o => Math.Max(weightSelector(o), 0)).ToArray();
var sum = weights.Sum(d => d);
var rnd = new Random().NextDouble();
for (int i = 0; i < weights.Length; i++)
{
//Normalize weight
var w = sum == 0
? 1 / (double)e.Count
: weights[i] / sum;
if (rnd < w)
return e[i];
rnd -= w;
}
throw new Exception("Should not happen");
}
I've my own solution for this:
public class Randomizator3000
{
public class Item<T>
{
public T value;
public float weight;
// No extra <T> here: the method reuses the class's own type parameter instead of shadowing it.
public static float GetTotalWeight(Item<T>[] p_itens)
{
float __toReturn = 0;
foreach(var item in p_itens)
{
__toReturn += item.weight;
}
return __toReturn;
}
}
private static System.Random _randHolder;
private static System.Random _random
{
get
{
if(_randHolder == null)
_randHolder = new System.Random();
return _randHolder;
}
}
public static T PickOne<T>(Item<T>[] p_itens)
{
if(p_itens == null || p_itens.Length == 0)
{
return default(T);
}
float __randomizedValue = (float)_random.NextDouble() * (Item<T>.GetTotalWeight(p_itens));
float __adding = 0;
for(int i = 0; i < p_itens.Length; i ++)
{
float __cacheValue = p_itens[i].weight + __adding;
if(__randomizedValue <= __cacheValue)
{
return p_itens[i].value;
}
__adding = __cacheValue;
}
return p_itens[p_itens.Length - 1].value;
}
}
Using it should look something like this (this is in Unity3D):
using UnityEngine;
using System.Collections;
public class teste : MonoBehaviour
{
Randomizator3000.Item<string>[] lista;
void Start()
{
lista = new Randomizator3000.Item<string>[10];
lista[0] = new Randomizator3000.Item<string>();
lista[0].weight = 10;
lista[0].value = "a";
lista[1] = new Randomizator3000.Item<string>();
lista[1].weight = 10;
lista[1].value = "b";
lista[2] = new Randomizator3000.Item<string>();
lista[2].weight = 10;
lista[2].value = "c";
lista[3] = new Randomizator3000.Item<string>();
lista[3].weight = 10;
lista[3].value = "d";
lista[4] = new Randomizator3000.Item<string>();
lista[4].weight = 10;
lista[4].value = "e";
lista[5] = new Randomizator3000.Item<string>();
lista[5].weight = 10;
lista[5].value = "f";
lista[6] = new Randomizator3000.Item<string>();
lista[6].weight = 10;
lista[6].value = "g";
lista[7] = new Randomizator3000.Item<string>();
lista[7].weight = 10;
lista[7].value = "h";
lista[8] = new Randomizator3000.Item<string>();
lista[8].weight = 10;
lista[8].value = "i";
lista[9] = new Randomizator3000.Item<string>();
lista[9].weight = 10;
lista[9].value = "j";
}
void Update ()
{
Debug.Log(Randomizator3000.PickOne<string>(lista));
}
}
In this example, each value has a 10% chance of being displayed in the debug log =3
Based loosely on Python's numpy.random.choice(a=items, p=probs), which takes an array and a probability array of the same size.
public T RandomChoice<T>(IEnumerable<T> a, IEnumerable<double> p)
{
IEnumerator<T> ae = a.GetEnumerator();
Random random = new Random();
double target = random.NextDouble();
double accumulator = 0;
foreach (var prob in p)
{
ae.MoveNext();
accumulator += prob;
if (accumulator > target)
{
break;
}
}
return ae.Current;
}
The probability array p must sum to (approx.) 1. This is to keep it consistent with the numpy interface (and mathematics), but you could easily change that if you wanted.
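If you do want to drop the sum-to-1 requirement, one way is to scale the random draw by the total weight instead of normalizing every entry. A rough Java sketch follows; WeightedChoice, indexFor and choose are illustrative names, and non-negative weights are assumed.

```java
import java.util.List;
import java.util.Random;

/** Illustrative sketch: cumulative-weight selection that accepts unnormalized weights. */
final class WeightedChoice {
    /** Maps a uniform draw u in [0, 1) to an index, proportionally to the weights. */
    static int indexFor(double[] weights, double u) {
        double total = 0;
        for (double w : weights) total += w;
        if (weights.length == 0 || total <= 0) {
            throw new IllegalArgumentException("need a positive total weight");
        }
        double target = u * total; // scale the draw rather than each weight
        double accumulator = 0;
        for (int i = 0; i < weights.length; i++) {
            accumulator += weights[i];
            if (target < accumulator) return i;
        }
        return weights.length - 1; // only reached through floating-point round-off
    }

    /** Picks one item; items and weights must have the same length. */
    static <T> T choose(List<T> items, double[] weights, Random rng) {
        if (items.size() != weights.length) {
            throw new IllegalArgumentException("items and weights differ in length");
        }
        return items.get(indexFor(weights, rng.nextDouble()));
    }
}
```

Separating indexFor from the random source keeps the selection logic deterministic and easy to check by hand: with weights {7, 2, 1}, a draw of 0.75 scales to 7.5 and lands in the second item's range.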
Consider this:
[Flags]
enum Colors
{
Red=1,
Green=2,
Blue=4
}
Colors myColor=Colors.Red|Colors.Blue;
Currently, I'm doing it as follows:
int length=myColor.ToString().Split(new char[]{','}).Length;
But I hope there is a more efficient way of finding the length, maybe based on bitset operations.
Please, if possible, provide explanation why and how your solution works.
Also, if this is a duplicate, please point to it and I'll delete this question. The only similar questions on SO I've been able to find were concerned with finding the length of all possible combinations of the Colors enum, not of the myColor variable.
UPDATE: I carefully benchmarked every solution (1 000 000 iterations each) and here are the results:
Stevo3000 - 8ms
MattEvans - 10ms
Silky - 34ms
Luke - 1757ms
Guffa - 4226ms
Tomas Levesque - 32810ms
Stevo3000 is the clear winner (with MattEvans taking the silver medal).
Thank you very much for your help.
UPDATE 2:
This solution runs even faster: 41 ms for 100 000 000 iterations (roughly 40 times faster than Stevo3000 on a 32-bit OS):
UInt32 v = (UInt32)co;
v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
UInt32 count = (((v + (v >> 4)) & 0xF0F0F0F) * 0x1010101) >> 24;
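Since the question asks why such a solution works, here is the same parallel bit count written out step by step as a Java sketch (the class and method names are made up). Java has no unsigned int, so it uses the unsigned shift >>>, and the last step is split in two to make the grouping explicit.

```java
/** Illustrative sketch of the parallel ("SWAR") bit count, one step per line. */
final class ParallelBitCount {
    static int bitCount(int x) {
        int v = x;
        // Step 1: every 2-bit field now holds the number of set bits it contained (0..2).
        v = v - ((v >>> 1) & 0x55555555);
        // Step 2: add neighbouring 2-bit counts into 4-bit fields (0..4).
        v = (v & 0x33333333) + ((v >>> 2) & 0x33333333);
        // Step 3: add neighbouring 4-bit counts into bytes (0..8 each).
        v = (v + (v >>> 4)) & 0x0F0F0F0F;
        // Step 4: multiplying by 0x01010101 adds all four bytes into the top byte.
        return (v * 0x01010101) >>> 24;
    }
}
```

The number of operations is fixed no matter how many bits are set, which is why it beats the loop-based answers in the benchmark. In Java the built-in Integer.bitCount gives the same result.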
The following code will give you the number of bits that are set for a given number of any type varying in size from byte up to long.
public static int GetSetBitCount(long lValue)
{
int iCount = 0;
//Loop the value while there are still bits
while (lValue != 0)
{
//Remove the end bit
lValue = lValue & (lValue - 1);
//Increment the count
iCount++;
}
//Return the count
return iCount;
}
This code is very efficient, as it iterates only once for each set bit rather than once for every possible bit as in the other examples.
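The key step is value & (value - 1): subtracting 1 flips the lowest set bit to 0 and turns every bit below it into 1, so the AND clears exactly that one bit. A small Java sketch of the same loop (names are illustrative):

```java
/** Illustrative sketch: count set bits by repeatedly clearing the lowest one. */
final class ClearLowestBit {
    static int countSetBits(long value) {
        int count = 0;
        while (value != 0) {
            value &= value - 1; // e.g. 101100 -> 101000 -> 100000 -> 0
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countSetBits(0b101100L)); // prints 3
    }
}
```

The loop body runs once per set bit, so a value with only two flags set needs just two iterations, even though a long has 64 bit positions.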
Here are a few extension methods to manipulate Flags enumerations :
public static class EnumExtensions
{
private static void CheckEnumWithFlags<T>()
{
if (!typeof(T).IsEnum)
throw new ArgumentException(string.Format("Type '{0}' is not an enum", typeof(T).FullName));
if (!Attribute.IsDefined(typeof(T), typeof(FlagsAttribute)))
throw new ArgumentException(string.Format("Type '{0}' doesn't have the 'Flags' attribute", typeof(T).FullName));
}
public static bool IsFlagSet<T>(this T value, T flag) where T : struct
{
CheckEnumWithFlags<T>();
long lValue = Convert.ToInt64(value);
long lFlag = Convert.ToInt64(flag);
return (lValue & lFlag) != 0;
}
public static IEnumerable<T> GetFlags<T>(this T value) where T : struct
{
CheckEnumWithFlags<T>();
foreach (T flag in Enum.GetValues(typeof(T)).Cast<T>())
{
if (value.IsFlagSet(flag))
yield return flag;
}
}
public static T SetFlags<T>(this T value, T flags, bool on) where T : struct
{
CheckEnumWithFlags<T>();
long lValue = Convert.ToInt64(value);
long lFlag = Convert.ToInt64(flags);
if (on)
{
lValue |= lFlag;
}
else
{
lValue &= (~lFlag);
}
return (T)Enum.ToObject(typeof(T), lValue);
}
public static T SetFlags<T>(this T value, T flags) where T : struct
{
return value.SetFlags(flags, true);
}
public static T ClearFlags<T>(this T value, T flags) where T : struct
{
return value.SetFlags(flags, false);
}
public static T CombineFlags<T>(this IEnumerable<T> flags) where T : struct
{
CheckEnumWithFlags<T>();
long lValue = 0;
foreach (T flag in flags)
{
long lFlag = Convert.ToInt64(flag);
lValue |= lFlag;
}
return (T)Enum.ToObject(typeof(T), lValue);
}
}
In your case you can use the GetFlags method :
int count = myColor.GetFlags().Count();
It's probably not as efficient as Luke's answer, but it's easier to use...
Here's my take on this... it counts the number of set bits in the value
uint val = (uint)myColor; // unsigned, so a flag in the top bit is counted too
int count = 0;
while (val != 0)
{
if((val & 1) != 0)
{
count++;
}
val = val >> 1;
}
Here's a reasonably easy way of counting the bits. Each bit is shifted in turn to the LSB of an Int64, which is AND-ed with 1 (to mask out all the other bits) and then added to the running total. Sum returns a long here, so the result is cast back to int.
int length = (int)Enumerable.Range(0, 64).Sum(x => ((long)myColor >> x) & 1);
A rough approximation is to count the number of bits set in myColor, but that only works if every enumeration member's value is a power of 2.
Assuming they are flags, you can just use one of the methods here, to count the number of bits set.
It works because, as long as they are flags, each one sets exactly one bit when it is OR'd in.
-- Edit
Sample code using one of the methods on that link:
[Flags]
enum Test
{
F1 = 1,
F2 = 2,
F3 = 4
}
class Program
{
static void Main(string[] args)
{
int v = (int) (Test.F1 | Test.F2 | Test.F3); // count bits set in this (32-bit value)
int c = 0; // store the total here
int[] S = {1, 2, 4, 8, 16}; // Magic Binary Numbers
int[] B = {0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF};
c = v - ((v >> 1) & B[0]);
c = ((c >> S[1]) & B[1]) + (c & B[1]);
c = ((c >> S[2]) + c) & B[2];
c = ((c >> S[3]) + c) & B[3];
c = ((c >> S[4]) + c) & B[4];
Console.WriteLine(c);
Console.Read();
}
}
I've made a helper method for myself. Maybe it'll be useful for others.
public static class EnumHelper
{
public static UInt32 NumFlags(this Enum e)
{
UInt32 v = Convert.ToUInt32(e);
v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
UInt32 count = (((v + (v >> 4)) & 0xF0F0F0F) * 0x1010101) >> 24;
return count;
}
}
The solution that is most reliable is to test for each value in the enumeration:
int len = 0;
foreach (Colors color in Enum.GetValues(typeof(Colors))) {
if ((myColor & color) == color) {
len++;
}
}
This will work even if the value has bits set for which no value is defined in the enumeration, for example:
Colors myColor = (Colors)65535;
This will also work for enumerations with values that use more than a single bit:
[Flags]
enum Colors {
Red = 0xFF0000,
Green = 0x00FF00,
Blue = 0x0000FF
}
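To show how this differs from a plain bit count, here is a small Java sketch of the same test using int constants in place of the enum (all names are illustrative). It also skips a flag whose value is 0, an assumption on my part, since such a member would otherwise match every value.

```java
/** Illustrative sketch: count defined flags rather than individual bits. */
final class DefinedFlagCount {
    static final int RED = 0xFF0000;
    static final int GREEN = 0x00FF00;
    static final int BLUE = 0x0000FF;
    static final int[] DEFINED = {RED, GREEN, BLUE};

    static int countDefinedFlags(int value, int[] definedFlags) {
        int count = 0;
        for (int flag : definedFlags) {
            // A flag counts only when all of its bits are present in the value.
            if (flag != 0 && (value & flag) == flag) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int myColor = RED | BLUE;
        System.out.println(countDefinedFlags(myColor, DEFINED)); // prints 2
        System.out.println(Integer.bitCount(myColor));           // prints 16
    }
}
```

So for multi-bit flags the two approaches disagree: counting bits reports 16 for RED | BLUE, while testing the defined values reports 2.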
int value = Enum.GetNames(typeof(Colors)).Length;
public static int NumberOfOptions(int value)
{
int result = (int)Math.Pow(2, value-1);
return result;
}
Try this...
Enum.GetValues(typeof(Colors)).Length;
...or is that too obvious?
EDIT:
OK, I just read the question again, and realised that you need the length of 'myColor', not 'Colors' - let me think about that.
FURTHER EDIT:
Now I'm confused - the OP's posted solution would never work, as myColor.ToString() returns '5' and applying Split(new char[]{','}) to this would result in an array with a length of 1.
Did the OP actually get this to work?