I have a method called "GetValue()" which is supposed to return the value "A", "B", "C" or "D" on each method call.
I want this method to return "A" in 30% of the calls, "B" in 14%, "C" in 31%, and so on.
What is the best way to distribute these values smoothly? I do not want the method to return "A" xxx times in a row just because "A" is furthest from its requested percentage.
All answers are appreciated.
You can use the Random class to achieve this:
private static Random Generator = new Random();
public string GetValue()
{
var next = Generator.Next(100);
if (next < 30) return "A";
if (next < 44) return "B";
if (next < 75) return "C";
return "D";
}
Update
For a more generic random weighted value store, the following may be a good starting point:
public class WeightedValueStore<T> : IDisposable
{
private static readonly Random Generator = new Random();
private readonly List<Tuple<int, T>> _values = new List<Tuple<int, T>>();
private readonly ReaderWriterLockSlim _valueLock = new ReaderWriterLockSlim();
public void AddValue(int weight, T value)
{
_valueLock.EnterWriteLock();
try
{
_values.Add(Tuple.Create(weight, value));
}
finally
{
_valueLock.ExitWriteLock();
}
}
public T GetValue()
{
_valueLock.EnterReadLock();
try
{
var totalWeight = _values.Sum(t => t.Item1);
var next = Generator.Next(totalWeight);
foreach (var tuple in _values)
{
next -= tuple.Item1;
if (next < 0) return tuple.Item2;
}
return default(T); // Or throw exception here - only reachable if _values has no elements.
}
finally
{
_valueLock.ExitReadLock();
}
}
public void Dispose()
{
_valueLock.Dispose();
}
}
Which would then be useable like so:
public string GetValue()
{
using (var valueStore = new WeightedValueStore<string>())
{
valueStore.AddValue(30, "A");
valueStore.AddValue(14, "B");
valueStore.AddValue(31, "C");
valueStore.AddValue(25, "D");
return valueStore.GetValue();
}
}
Use Random.
Take care of the seed. See this link.
Example:
// You can provide a seed as a parameter of the Random() class.
private static Random RandomGenerator = new Random();
private static string Generate()
{
int value = RandomGenerator.Next(100);
if (value < 30)
{
return "A";
}
else if (value < 44)
{
return "B";
}
else if (value < 75)
{
return "C";
}
else
{
return "D";
}
}
If you want that distribution by average, you can just pick a random number and check it.
Random rnd = new Random();
int value = rnd.Next(100); // get a number in the range 0 - 99
if (value < 30) return "A";
if (value < 30+14) return "B";
if (value < 30+14+31) return "C";
return "D";
Note that you should create the random generator once, and reuse it for subsequent calls. If you create a new one each time, they will be initialised with the same random sequence if two method calls come too close in time.
If you want exactly that distribution for 100 items, then you would create an array with 100 items, where 30 are "A", 14 are "B", and so on. Shuffle the array (look up Fisher-Yates), and return one item from the array for each method call.
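The exact-distribution idea above can be sketched as a "shuffle bag". This is a minimal Python sketch, not code from the answer: the class name `ShuffleBag` and its `counts` parameter are assumptions made for illustration, and it refills and reshuffles whenever a pass of 100 items is used up.

```python
import random


class ShuffleBag:
    """Hands out items so that every full pass matches the configured counts."""

    def __init__(self, counts, rng=None):
        # counts maps each value to how often it appears per pass,
        # e.g. {"A": 30, "B": 14, "C": 31, "D": 25} for a 100-item pass.
        self._items = [value for value, n in counts.items() for _ in range(n)]
        self._rng = rng or random.Random()
        self._pos = len(self._items)  # forces a shuffle on first use

    def next(self):
        if self._pos >= len(self._items):
            # Start a new pass: shuffle the whole array in place.
            self._rng.shuffle(self._items)
            self._pos = 0
        value = self._items[self._pos]
        self._pos += 1
        return value


bag = ShuffleBag({"A": 30, "B": 14, "C": 31, "D": 25})
first_pass = [bag.next() for _ in range(100)]
```

Within one pass the counts are exact, so a long run of a single value is bounded by that value's count; across passes the order is reshuffled.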
Let's say you have the arrays
String[] possibleOutcomes = new String[] { "A", "B", "C", "D" };
and
int[] possibleOutcomeProbabilities = new int[] { 30, 14, 31, 25 };
You can use the following strategy whenever you are required to output one of the outcomes:
Find the sum of all elements in possibleOutcomeProbabilities. Let's call this sum totalProbability.
Generate a random number between 1 and totalProbability. Let's call this randomly generated number outcomeBucket.
Iterate over possibleOutcomeProbabilities to determine which outcome outcomeBucket corresponds to. You then pick the corresponding outcome from possibleOutcomes.
This strategy will certainly not give you first 30% outcomes as A, next 14% as B, etc. However, as probability works, over a sufficiently large number of outcomes, this strategy will ensure that your possible outcomes are distributed as per their expected probabilities. This strategy gives you the advantage that outcome probabilities are not required to add up to 100%. You can even specify relative probabilities, such as, 1:2:3:4, etc.
If you are really worried about the fastest possible implementation for the strategy, you can tweak it as follows:
a. Calculate totalProbability only once, or when the probabilities are changed.
b. Before calculating totalProbability, see if the elements in possibleOutcomeProbabilities have any common divisors and eliminate those. This will give you a smaller probability space to traverse each time.
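The strategy and tweak (b) above can be sketched in Python as follows. This is an illustrative sketch that assumes positive integer weights; the function names `reduce_weights` and `pick` are made up here and do not come from the answer.

```python
import random
from functools import reduce
from math import gcd


def reduce_weights(weights):
    """Tweak (b): divide all integer weights by their greatest common divisor."""
    divisor = reduce(gcd, weights)
    return [w // divisor for w in weights]


def pick(outcomes, weights, rng=random):
    total = sum(weights)              # totalProbability
    bucket = rng.randint(1, total)    # outcomeBucket, inclusive range 1..total
    for outcome, weight in zip(outcomes, weights):
        if bucket <= weight:
            return outcome
        bucket -= weight              # move on to the next outcome's range
    raise ValueError("weights must be positive integers")


outcomes = ["A", "B", "C", "D"]
weights = reduce_weights([30, 14, 31, 25])   # no common divisor, unchanged
print(pick(outcomes, weights))
```

For relative weights such as 10:20:30:40, `reduce_weights` returns 1:2:3:4, which leaves the distribution unchanged while shrinking the range of the random number.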
Try this:
Random r = new Random();
private string GetValue()
{
double d = r.NextDouble(); // NextDouble() returns a value in [0.0, 1.0); Next() returns an int
if(d < 0.3)
return "A";
else if(d < 0.44) // 0.30 + 0.14
return "B";
...etc.
}
EDIT: just make sure that the Random variable is created outside the function or you'll get the same value each time.
I would not recommend any hard-coded approach (it is hard to maintain and it's bad practice). I'd prefer a more generic solution instead.
enum PossibleOutcome { A, B, C, D, Undefined }
// sample data: possible outcome vs its probability
static readonly Dictionary<PossibleOutcome, double> probabilities = new Dictionary<PossibleOutcome, double>()
{
{PossibleOutcome.A, 0.30},
{PossibleOutcome.B, 0.14},
{PossibleOutcome.C, 0.31},
{PossibleOutcome.D, 0.25}
};
static Random random = new Random();
static PossibleOutcome GetValue()
{
var result = random.NextDouble();
var sum = 0.0;
foreach (var probability in probabilities)
{
sum += probability.Value;
if (result <= sum)
{
return probability.Key;
}
}
return PossibleOutcome.Undefined; // it shouldn't happen
}
static void Main(string[] args)
{
if (Math.Abs(probabilities.Sum(pair => pair.Value) - 1.0) > 1e-9) // allow for floating-point rounding
{
throw new ApplicationException("Probabilities must add up to 100%!");
}
for (var i = 0; i < 100; i++)
{
Console.WriteLine(GetValue().ToString());
}
Console.ReadLine();
}
I'm attempting the following coding challenge in C#:
Manage robot factory settings.
When a robot comes off the factory floor, it has no name.
The first time you turn on a robot, a random name is generated in the
format of two uppercase letters followed by three digits, such as
RX837 or BC811.
Every once in a while we need to reset a robot to its factory
settings, which means that its name gets wiped. The next time you ask,
that robot will respond with a new random name.
The names must be random: they should not follow a predictable
sequence. Using random names means a risk of collisions. Your solution
must ensure that every existing robot has a unique name.
I've created a Robot class which passes 7 of my 8 unit tests. The one failing is:
[Fact]
public void Robot_names_are_unique()
{
const int robotsCount = 10_000;
var robots = new List<Robot>(robotsCount); // Needed to keep a reference to the robots as IDs of recycled robots may be re-issued
var names = new HashSet<string>(robotsCount);
for (int i = 0; i < robotsCount; i++) {
var robot = new Robot();
robots.Add(robot);
Assert.True(names.Add(robot.Name));
Assert.Matches(@"^[A-Z]{2}\d{3}$", robot.Name);
}
}
I walked through my code and I believe the issue is because I'm generating random values but I'm not ensuring the values are unique when creating many names. Here's my class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
public class Robot
{
Random random = new Random();
Dictionary<string, bool> usedNames = new Dictionary<string, bool>();
public Robot()
{
Name = RandomName();
}
private string _name;
public string Name
{
get { return _name; }
set { _name = value; }
}
public void Reset()
{
Name = RandomName();
}
private string RandomName()
{
Random rand = new Random();
int nums = random.Next(000, 1000);
var val = nums.ToString("000");
const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
string letters = new string(Enumerable.Repeat(chars, 2)
.Select(s => s[random.Next(s.Length)]).ToArray());
string name = $"{letters}{val}";
if (usedNames.ContainsKey(name))
{
// Implement here or refactor with loop?
}
return name;
}
}
However, after reviewing my code, I feel like there is a better approach. I was thinking the approach would involve iterating through the possible numbers and letters in the name sequentially from start to finish to ensure that each name is unique. Am I on the right track? What could I do better?
We have only
26 * 26 * 1000 == 676000
possible names. Let's generate them all and shuffle. Then we can take next robot name from names one after one:
// A Fisher-Yates shuffle would be faster than ordering by a random key (here I've used Guid)
static string[] Names = Enumerable
.Range(0, 26 * 26)
.SelectMany(letters => Enumerable
.Range(0, 1000)
.Select(i => $"{(char)('A' + letters / 26)}{(char)('A' + letters % 26)}{i:000}"))
.OrderBy(item => Guid.NewGuid())
.ToArray();
static int currentIndex = -1;
// Interlocked: let's implement thread safe method
static string NextName() =>
Names[Interlocked.Increment(ref currentIndex) % Names.Length];
Demo:
for (int i = 0; i < 10; ++i)
Console.WriteLine(NextName());
Outcome: (may vary from workstation to workstation)
JQ393
GQ249
JZ370
OC621
GD309
CP822
DK698
AD610
XY300
WV698
Edit: If we want to reuse names (which are dropped when robot is set to factory default settings) we can use Queue instead of array:
static ConcurrentQueue<string> Names = new ConcurrentQueue<string>(Enumerable
.Range(0, 26 * 26)
.SelectMany(letters => Enumerable
.Range(0, 1000)
.Select(i => $"{(char)('A' + letters / 26)}{(char)('A' + letters % 26)}{i:000}"))
.OrderBy(item => Guid.NewGuid()));
static string NextName() => Names.TryDequeue(out string result) ? result : "???";
static void ScrapName(string name) => Names.Enqueue(name);
static string ResetName(string oldName) {
string newName = Names.TryDequeue(out string result)
? result
: "???";
if (!string.IsNullOrEmpty(oldName))
Names.Enqueue(oldName);
return newName;
}
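For comparison, the same pool-with-recycling idea can be sketched in Python. This is a single-threaded illustration, not a port of the C# above: the names `NamePool`, `acquire` and `release` are assumptions, and it raises an error where the C# version returns "???".

```python
import random
from collections import deque
from string import ascii_uppercase


class NamePool:
    """Pre-generates every name AA000..ZZ999 and hands them out in random order."""

    def __init__(self, rng=None):
        rng = rng or random.Random()
        names = [f"{a}{b}{n:03d}"
                 for a in ascii_uppercase
                 for b in ascii_uppercase
                 for n in range(1000)]
        rng.shuffle(names)
        self._free = deque(names)

    def acquire(self):
        if not self._free:
            raise RuntimeError("all 676,000 names are in use")
        return self._free.popleft()

    def release(self, name):
        # A released name goes to the back of the queue, so it is only
        # reissued after the names ahead of it have been handed out.
        self._free.append(name)


pool = NamePool()
name = pool.acquire()   # a robot is switched on
pool.release(name)      # the robot is reset to factory settings
```

Because names are removed from the pool while in use, uniqueness holds by construction and no retry loop is needed.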
One option is to create a class to generate the names. That class should keep track of already created names. This method works better if the number of robots is not huge.
public class NameGenerator
{
static HashSet<string> created = new HashSet<string>();
static Random rand = new Random();
const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
public static string GetName()
{
if (created.Count == 676000) {
// Throw an exception?
}
string name;
do {
name = $"{chars[rand.Next(chars.Length)]}{chars[rand.Next(chars.Length)]}{rand.Next(0, 1000):D3}";
} while (!created.Add(name));
return name;
}
public static void Reset() {
created = new HashSet<string>();
}
}
Some quick profiling:
| Number of IDs generated | Time (s) | Time (ms) to create last | Approx. mem used (MB) |
|---|---|---|---|
| 1,000 | ~0 | <1 | 0.05 |
| 10,000 | 0.005 | <1 | 0.52 |
| 50,000 | 0.032 | <1 | 2.4 |
| 100,000 | 0.078 | <1 | 4.9 |
| 250,000 | 0.229 | <1 | 11.1 |
| 500,000 | 0.626 | <1 | 22.8 |
| 600,000 | 0.961 | <1 | 25.1 |
| 625,000 | 1.143 | <1 | 25.8 |
| 650,000 | 1.390 | <1 | 26.3 |
| 676,000 | 5.386 | 293 | 38.5 |
Obviously there is a large increase once you approach the 676,000 limit.
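The shape of that increase follows from the retry loop. Assuming each attempt draws uniformly and independently from all N = 676,000 names, an attempt succeeds with probability (N - k) / N when k names are already taken, so the expected number of attempts is N / (N - k). A small Python sketch of that arithmetic (the function name is hypothetical):

```python
TOTAL_NAMES = 26 * 26 * 1000  # 676,000 possible names


def expected_attempts(names_taken, total=TOTAL_NAMES):
    """Expected number of random draws needed to hit an unused name."""
    free = total - names_taken
    if free <= 0:
        raise ValueError("no free names left")
    return total / free


for taken in (0, 338_000, 650_000, 675_999):
    print(taken, expected_attempts(taken))
```

Half-way through the name space a new name costs two attempts on average; the very last free name costs 676,000 on average, which is consistent with the jump in the final row of the profiling.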
There are a lot of possible names. Unless you plan on having almost half a million robots, a good solution would be to create a custom, reusable generator that keeps track of all generated names.
public class UniqueNameGenerator
{
private readonly HashSet<string> generatedNames;
private readonly Random generator;
public UniqueNameGenerator(Random random = null)
{
this.generatedNames = new HashSet<string>();
this.generator = random ?? new Random();
}
public string GenerateName()
{
string name;
do
{
name = this.TryGenerateName();
}
while(this.generatedNames.Contains(name));
this.generatedNames.Add(name);
return name;
}
private string TryGenerateName()
{
var nameBuilder = new StringBuilder();
nameBuilder.Append(this.PickRandomLetter('A', 'Z'));
nameBuilder.Append(this.PickRandomLetter('A', 'Z'));
nameBuilder.Append(this.PickRandomNumber(0, 999).ToString("D3")); // three digits, zero-padded
return nameBuilder.ToString();
}
private int PickRandomNumber(int min, int max)
{
return this.generator.Next(min, max + 1);
}
private char PickRandomLetter(char from, char to)
{
var letterIndex = this.generator.Next((int)from, (int)to + 1); // upper bound is exclusive, so add 1 to include 'to'
return (char)letterIndex;
}
}
Keep a static instance of this inside the Robot class, or better, create a RobotFactory that creates robots with a single instance of the UniqueNameGenerator.
use random.choice to select 2 random characters and 3 random numbers
import random
def generate_license():
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    numbers = "0123456789"
    license = ""
    for i in range(2):
        license += random.choice(letters)
    for i in range(3):
        license += random.choice(numbers)
    return license

for i in range(30):
    print(generate_license())
output:
FD508
FI820
TY975
NR415
GD041
IK313
GR103
WR994
PL631
WT808
UV119
KO727
LK584
GM629
BM545
VX728
UN773
AM000
UW267
KE949
KW182
TL030
YW536
AF038
PQ493
TT153
NP626
JK151
WA536
OU825
Length = input Long(can be 2550, 2880, 2568, etc)
List<long> = {618, 350, 308, 300, 250, 232, 200, 128}
The program takes a long value. For that value we have to find the combinations of numbers from the above list which, when added, give the input value (the same number can be used twice). A difference of +/- 30 is allowed.
Largest numbers have to be used most.
Ex:Length = 868
For this combinations can be
Combination 1 = 618 + 250
Combination 2 = 308 + 232 + 200 +128
Correct Combination would be Combination 1
But there should also be different combinations.
public static void Main(string[] args)
{
//subtotal list
List<int> totals = new List<int>(new int[] { 618, 350, 308, 300, 250, 232, 200, 128 });
// get matches
List<int[]> results = KnapSack.MatchTotal(2682, totals);
// print results
foreach (var result in results)
{
Console.WriteLine(string.Join(",", result));
}
Console.WriteLine("Done.");
}
internal static List<int[]> MatchTotal(int theTotal, List<int> subTotals)
{
List<int[]> results = new List<int[]>();
while (subTotals.Contains(theTotal))
{
results.Add(new int[1] { theTotal });
subTotals.Remove(theTotal);
}
if (subTotals.Count == 0)
return results;
subTotals.Sort();
double mostNegativeNumber = subTotals[0];
if (mostNegativeNumber > 0)
mostNegativeNumber = 0;
if (mostNegativeNumber == 0)
subTotals.RemoveAll(d => d > theTotal);
for (int choose = 0; choose <= subTotals.Count; choose++)
{
IEnumerable<IEnumerable<int>> combos = Combination.Combinations(subTotals.AsEnumerable(), choose);
results.AddRange(from combo in combos where combo.Sum() == theTotal select combo.ToArray());
}
return results;
}
public static class Combination
{
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int choose)
{
return choose == 0 ?
new[] { new T[0] } :
elements.SelectMany((element, i) =>
elements.Skip(i + 1).Combinations(choose - 1).Select(combo => (new[] { element }).Concat(combo)));
}
}
I have used the above code. Can it be simplified? Here, too, I only get combinations of unique values, but a value may be used any number of times, and the largest numbers must be given the most priority.
I have a validation to check whether the total of the sum is greater than the input value. The logic fails even there.
The algorithm you have shown assumes that the list is sorted in ascending order. If not, then you shall first have to sort the list in O(nlogn) time and then execute the algorithm.
Also, it assumes that you are only considering combinations of pairs and you exit on the first match.
If you want to find all combinations, then instead of "break", just output the combination and increment startIndex or decrement endIndex.
Moreover, you should check for ranges (targetSum - 30 to targetSum + 30) rather than just the exact value because the problem says that a margin of error is allowed.
This is the best solution according to me because its complexity is O(nlogn + n) including the sorting.
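The algorithm this answer comments on is not reproduced here. A two-pointer scan that matches its description (sorted list, pairs only, a start index and an end index, a tolerance window instead of an exact match) might look like the following Python sketch. The function name and parameters are assumptions.

```python
def pairs_within_tolerance(values, target, tolerance=30):
    """Return pairs from values whose sum lies within tolerance of target."""
    ordered = sorted(values)                 # O(n log n)
    start, end = 0, len(ordered) - 1
    found = []
    while start < end:                       # O(n) scan
        total = ordered[start] + ordered[end]
        if total < target - tolerance:
            start += 1                       # sum too small: raise the low end
        elif total > target + tolerance:
            end -= 1                         # sum too large: lower the high end
        else:
            found.append((ordered[start], ordered[end]))
            start += 1                       # keep scanning instead of breaking
    return found


print(pairs_within_tolerance([618, 350, 308, 300, 250, 232, 200, 128], 868))
```

Note that with a tolerance window this single pass reports at most one partner per start value, so it can miss some qualifying pairs, and it never considers combinations of more than two numbers.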
V4 - Recursive Method, using Stack structure instead of stack frames on thread
It works (tested in VS), but there could be some bugs remaining.
static int Threshold = 30;
private static Stack<long> RecursiveMethod(long target)
{
Stack<long> Combination = new Stack<long>(establishedValues.Count); //Can grow bigger, as big as (target / min(establishedValues)) values
Stack<int> Index = new Stack<int>(establishedValues.Count); //Can grow bigger
int lowerBound = 0;
int dimensionIndex = lowerBound;
long fail = -1 * Threshold;
while (true)
{
long thisVal = establishedValues[dimensionIndex];
dimensionIndex++;
long afterApplied = target - thisVal;
if (afterApplied < fail)
lowerBound = dimensionIndex;
else
{
target = afterApplied;
Combination.Push(thisVal);
if (target <= Threshold)
return Combination;
Index.Push(dimensionIndex);
dimensionIndex = lowerBound;
}
if (dimensionIndex >= establishedValues.Count)
{
if (Index.Count == 0)
return null; //No possible combinations
dimensionIndex = Index.Pop();
lowerBound = dimensionIndex;
target += Combination.Pop();
}
}
}
Maybe V3 - Suggestion for Ordered solution trying every combination
Although this isn't chosen as the answer for the related question, I believe this is a good approach: https://stackoverflow.com/a/17258033/887092. It will enumerate every option, including multiples of the same value. Otherwise you could try the chosen answer, although its output is only 2 items in the set being summed, rather than up to n items. V2 works but would be slightly less efficient than an ordered solution, as the same failing attempt will likely be tried multiple times.
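One way to picture an ordered search that allows repeated values and tries the largest numbers first is the recursive generator below. This is a Python sketch written for illustration, not the code from the linked answer; it assumes all values are positive, and the function name is made up.

```python
def combinations_within_tolerance(values, target, tolerance=30):
    """Yield combinations (values may repeat) summing to target +/- tolerance.

    Values are tried largest first, so combinations built from large
    values are produced before those built from small ones.
    """
    ordered = sorted(set(values), reverse=True)

    def search(start, remaining, chosen):
        if chosen and abs(remaining) <= tolerance:
            yield tuple(chosen)
        for i in range(start, len(ordered)):
            value = ordered[i]
            if remaining - value < -tolerance:
                continue                     # would overshoot the window
            chosen.append(value)
            # Passing i (not i + 1) allows the same value to be used again.
            yield from search(i, remaining - value, chosen)
            chosen.pop()

    yield from search(0, target, [])


values = [618, 350, 308, 300, 250, 232, 200, 128]
print(next(combinations_within_tolerance(values, 868)))
```

Because each branch only moves forward through the descending list, no combination is attempted twice, which is the inefficiency of the random approach mentioned above.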
V2 - Random Selection - Will be able to reuse the same number twice
I'm a fan of using random for "intelligence", allowing the computer to brute force the solution. It's also easy to distribute - as there is no state dependence between two threads trying at the same time for example.
static int Threshold = 30;
public static List<long> RandomMethod(long Target)
{
List<long> Combinations = new List<long>();
Random rnd = new Random();
//Assuming establishedValues is sorted
int LowerBound = 0;
long runningSum = Target;
while (true)
{
int newLowerBound = FindLowerBound(LowerBound, runningSum);
if (newLowerBound == -1)
{
//No more beneficial values to work with, reset
runningSum = Target;
Combinations.Clear();
LowerBound = 0;
continue;
}
LowerBound = newLowerBound;
int rIndex = rnd.Next(LowerBound, establishedValues.Count);
long val = establishedValues[rIndex];
runningSum -= val;
Combinations.Add(val);
if (Math.Abs(runningSum) <= Threshold)
return Combinations;
}
}
static int FindLowerBound(int currentLowerBound, long runningSum)
{
//Adjust lower bound, so we're not randomly trying a number that's too high
for (int i = currentLowerBound; i < establishedValues.Count; i++)
{
//Factor in the threshold, because an end aggregate which exceeds by 20 is better than underperforming by 21.
if ((establishedValues[i] - Threshold) < runningSum)
{
return i;
}
}
return -1;
}
V1 - Ordered selection - Will not be able to reuse the same number twice
Add this very handy extension function (uses a binary algorithm to find all combinations):
//Make sure you put this in a static class inside System namespace
public static IEnumerable<List<T>> EachCombination<T>(this List<T> allValues)
{
var collection = new List<List<T>>();
for (int counter = 0; counter < (1 << allValues.Count); ++counter)
{
List<T> combination = new List<T>();
for (int i = 0; i < allValues.Count; ++i)
{
if ((counter & (1 << i)) == 0)
combination.Add(allValues[i]);
}
if (combination.Count == 0)
continue;
yield return combination;
}
}
Use the function
static List<long> establishedValues = new List<long>() {618, 350, 308, 300, 250, 232, 200, 128, 180, 118, 155};
//Return is a list of the values which sum to equal the target. Null if not found.
List<long> FindFirstCombination(long target)
{
foreach (var combination in establishedValues.EachCombination())
{
//if (combination.Sum() == target)
if (Math.Abs(combination.Sum() - target) <= 30) //Plus or minus tolerance for difference
return combination;
}
return null; //Or you could throw an exception
}
Test the solution
var target = 858;
var result = FindFirstCombination(target);
bool success = (result != null && Math.Abs(result.Sum() - target) <= 30); // same tolerance as FindFirstCombination
//TODO: for loop with random selection of numbers from the establishedValues, Sum and test through FindFirstCombination
I have a set of values, and an associated percentage for each:
a: 70% chance
b: 20% chance
c: 10% chance
I want to select a value (a, b, c) based on the percentage chance given.
how do I approach this?
my attempt so far looks like this:
r = random.random()
if r <= .7:
    return a
elif r <= .9:
    return b
else:
    return c
I'm stuck coming up with an algorithm to handle this. How should I approach this so it can handle larger sets of values without just chaining together if-else flows.
(any explanation or answers in pseudo-code are fine. a python or C# implementation would be especially helpful)
Here is a complete solution in C#:
public class ProportionValue<T>
{
public double Proportion { get; set; }
public T Value { get; set; }
}
public static class ProportionValue
{
public static ProportionValue<T> Create<T>(double proportion, T value)
{
return new ProportionValue<T> { Proportion = proportion, Value = value };
}
static Random random = new Random();
public static T ChooseByRandom<T>(
this IEnumerable<ProportionValue<T>> collection)
{
var rnd = random.NextDouble();
foreach (var item in collection)
{
if (rnd < item.Proportion)
return item.Value;
rnd -= item.Proportion;
}
throw new InvalidOperationException(
"The proportions in the collection do not add up to 1.");
}
}
Usage:
var list = new[] {
ProportionValue.Create(0.7, "a"),
ProportionValue.Create(0.2, "b"),
ProportionValue.Create(0.1, "c")
};
// Outputs "a" with probability 0.7, etc.
Console.WriteLine(list.ChooseByRandom());
For Python:
>>> import random
>>> dst = 70, 20, 10
>>> vls = 'a', 'b', 'c'
>>> picks = [v for v, d in zip(vls, dst) for _ in range(d)]
>>> for _ in range(12): print random.choice(picks),
...
a c c b a a a a a a a a
>>> for _ in range(12): print random.choice(picks),
...
a c a c a b b b a a a a
>>> for _ in range(12): print random.choice(picks),
...
a a a a c c a c a a c a
>>>
General idea: make a list where each item is repeated a number of times proportional to the probability it should have; use random.choice to pick one at random (uniformly), this will match your required probability distribution. Can be a bit wasteful of memory if your probabilities are expressed in peculiar ways (e.g., 70, 20, 10 makes a 100-items list where 7, 2, 1 would make a list of just 10 items with exactly the same behavior), but you could divide all the counts in the probabilities list by their greatest common factor if you think that's likely to be a big deal in your specific application scenario.
Apart from memory consumption issues, this should be the fastest solution -- just one random number generation per required output result, and the fastest possible lookup from that random number, no comparisons &c. If your likely probabilities are very weird (e.g., floating point numbers that need to be matched to many, many significant digits), other approaches may be preferable;-).
Knuth references Walker's method of aliases. Searching on this, I find http://code.activestate.com/recipes/576564-walkers-alias-method-for-random-objects-with-diffe/ and http://prxq.wordpress.com/2006/04/17/the-alias-method/. This gives the exact probabilities required in constant time per number generated with linear time for setup (curiously, n log n time for setup if you use exactly the method Knuth describes, which does a preparatory sort you can avoid).
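To make the idea concrete, here is a compact Python sketch of the alias method using the worklist-style setup (often attributed to Vose) that avoids the preparatory sort. It is an illustration with made-up function names, not code from either linked page.

```python
import random


def build_alias_table(weights):
    """Preprocess non-negative weights into (prob, alias) tables in O(n)."""
    n = len(weights)
    total = float(sum(weights))
    scaled = [w * n / total for w in weights]   # average bucket height is 1.0
    prob = [1.0] * n
    alias = list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        lo = small.pop()
        hi = large.pop()
        prob[lo] = scaled[lo]          # bucket lo keeps its own share...
        alias[lo] = hi                 # ...and is topped up by item hi
        scaled[hi] -= 1.0 - scaled[lo]
        (small if scaled[hi] < 1.0 else large).append(hi)
    # Anything left over keeps prob 1.0, which absorbs rounding error.
    return prob, alias


def alias_sample(prob, alias, rng=random):
    """Draw one index in O(1): pick a bucket, then its own item or its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


prob, alias = build_alias_table([70, 20, 10])
items = ["a", "b", "c"]
print(items[alias_sample(prob, alias)])
```

Each draw costs one bucket choice and one comparison regardless of how many items there are, which is the constant-time property described above.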
Take the list of weights and find their cumulative totals: 70, 70+20, 70+20+10. Pick a random number greater than or equal to zero and less than the overall total. Iterate over the items and return the first value for which the cumulative sum of the weights is greater than this random number:
import random

def select( values ):
    variate = random.random() * sum( values.values() )
    cumulative = 0.0
    for item, weight in values.items():
        cumulative += weight
        if variate < cumulative:
            return item
    return item # Shouldn't get here, but just in case of rounding...

print( select( { "a": 70, "b": 20, "c": 10 } ) )
This solution, as implemented, should also be able to handle fractional weights and weights that add up to any number so long as they're all non-negative.
Let T = the sum of all item weights
Let R = a random number between 0 and T
Iterate the item list subtracting each item weight from R and return the item that causes the result to become <= 0.
def weighted_choice(probabilities):
    random_position = random.random() * sum(probabilities)
    current_position = 0.0
    for i, p in enumerate(probabilities):
        current_position += p
        if random_position < current_position:
            return i
    return None
Because random.random will always return < 1.0, the final return should never be reached.
import random
def selector(weights):
    i=random.random()*sum(x for x,y in weights)
    for w,v in weights:
        if w>=i:
            break
        i-=w
    return v

weights = ((70,'a'),(20,'b'),(10,'c'))
print([selector(weights) for x in range(10)])
It works equally well for fractional weights:
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
print([selector(weights) for x in range(10)])
If you have a lot of weights, you can use bisect to reduce the number of iterations required
import random
import bisect
def make_acc_weights(weights):
    acc=0
    acc_weights = []
    for w,v in weights:
        acc+=w
        acc_weights.append((acc,v))
    return acc_weights

def selector(acc_weights):
    # The last accumulated weight is the total, so no global lookup is needed.
    i=random.random()*acc_weights[-1][0]
    return acc_weights[bisect.bisect(acc_weights, (i,))][1]

weights = ((70,'a'),(20,'b'),(10,'c'))
acc_weights = make_acc_weights(weights)
print([selector(acc_weights) for x in range(100)])
This also works fine for fractional weights:
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
acc_weights = make_acc_weights(weights)
print([selector(acc_weights) for x in range(100)])
As of today's update, the Python documentation gives an example of how to do a random.choice() with weighted probabilities:
If the weights are small integer ratios, a simple technique is to build a sample population with repeats:
>>> weighted_choices = [('Red', 3), ('Blue', 2), ('Yellow', 1), ('Green', 4)]
>>> population = [val for val, cnt in weighted_choices for i in range(cnt)]
>>> random.choice(population)
'Green'
A more general approach is to arrange the weights in a cumulative distribution with itertools.accumulate(), and then locate the random value with bisect.bisect():
>>> choices, weights = zip(*weighted_choices)
>>> cumdist = list(itertools.accumulate(weights))
>>> x = random.random() * cumdist[-1]
>>> choices[bisect.bisect(cumdist, x)]
'Blue'
One note: itertools.accumulate() needs Python 3.2 or later; on older versions, define it yourself using the equivalent code given in the documentation.
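On Python 3.6 and later, the standard library also provides random.choices(), which accepts a weights (or cum_weights) argument directly. A minimal sketch, with the wrapper name `weighted_pick` chosen here for illustration:

```python
import random


def weighted_pick(choices, weights, rng=random):
    # random.choices returns a list of k samples; take the single element.
    return rng.choices(choices, weights=weights, k=1)[0]


print(weighted_pick(["a", "b", "c"], [70, 20, 10]))
```

The weights are relative, so they do not need to add up to 100 or to 1.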
I think you can use an array of small objects. I implemented this in Java (I know a little C#, but I am afraid of writing wrong code), so you may need to port it yourself. The C# code will be much shorter with struct and var, but I hope you get the idea.
class PercentString {
double percent;
String value;
// Constructor for 2 values
}
ArrayList<PercentString> list = new ArrayList<PercentString>();
list.add(new PercentString(70, "a"));
list.add(new PercentString(20, "b"));
list.add(new PercentString(10, "c"));
double random = Math.random() * 100; // uniform in [0, 100), same scale as the percentages
double percent = 0;
for (int i = 0; i < list.size(); i++) {
PercentString p = list.get(i);
percent += p.percent;
if (random < percent) {
return p.value;
}
}
If you are really up to speed and want to generate the random values quickly, the Walker's algorithm mcdowella mentioned in https://stackoverflow.com/a/3655773/1212517 is pretty much the best way to go (O(1) time for random(), and O(N) time for preprocess()).
For anyone who is interested, here is my own PHP implementation of the algorithm:
/**
* Pre-process the samples (Walker's alias method).
* @param array key represents the sample, value is the weight
*/
protected function preprocess($weights){
$N = count($weights);
$sum = array_sum($weights);
$avg = $sum / (double)$N;
//divide the array of weights to values smaller and geq than sum/N
$smaller = array_filter($weights, function($itm) use ($avg){ return $avg > $itm;}); $sN = count($smaller);
$greater_eq = array_filter($weights, function($itm) use ($avg){ return $avg <= $itm;}); $gN = count($greater_eq);
$bin = array(); //bins
//we want to fill N bins
for($i = 0;$i<$N;$i++){
//At first, decide for a first value in this bin
//if there are small intervals left, we choose one
if($sN > 0){
$choice1 = each($smaller);
unset($smaller[$choice1['key']]);
$sN--;
} else{ //otherwise, we split a large interval
$choice1 = each($greater_eq);
unset($greater_eq[$choice1['key']]);
}
//splitting happens here - the unused part of interval is thrown back to the array
if($choice1['value'] >= $avg){
if($choice1['value'] - $avg >= $avg){
$greater_eq[$choice1['key']] = $choice1['value'] - $avg;
}else if($choice1['value'] - $avg > 0){
$smaller[$choice1['key']] = $choice1['value'] - $avg;
$sN++;
}
//this bin comprises of only one value
$bin[] = array(1=>$choice1['key'], 2=>null, 'p1'=>1, 'p2'=>0);
}else{
//make the second choice for the current bin
$choice2 = each($greater_eq);
unset($greater_eq[$choice2['key']]);
//splitting on the second interval
if($choice2['value'] - $avg + $choice1['value'] >= $avg){
$greater_eq[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
}else{
$smaller[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
$sN++;
}
//this bin comprises of two values
$choice2['value'] = $avg - $choice1['value'];
$bin[] = array(1=>$choice1['key'], 2=>$choice2['key'],
'p1'=>$choice1['value'] / $avg,
'p2'=>$choice2['value'] / $avg);
}
}
$this->bins = $bin;
}
/**
* Choose a random sample according to the weights.
*/
public function random(){
$bin = $this->bins[array_rand($this->bins)];
return (lcg_value() < $bin['p1'])?$bin[1]:$bin[2];
}
Here is my version that can apply to any IList and normalize the weight. It is based on Timwi's solution : selection based on percentage weighting
/// <summary>
/// return a random element of the list or default if list is empty
/// </summary>
/// <param name="e"></param>
/// <param name="weightSelector">
/// return chances to be picked for the element. A weight of 0 or less means 0 chance to be picked.
/// If all elements have weight of 0 or less they all have equal chances to be picked.
/// </param>
/// <returns></returns>
public static T AnyOrDefault<T>(this IList<T> e, Func<T, double> weightSelector)
{
if (e.Count < 1)
return default(T);
if (e.Count == 1)
return e[0];
var weights = e.Select(o => Math.Max(weightSelector(o), 0)).ToArray();
var sum = weights.Sum(d => d);
var rnd = new Random().NextDouble();
for (int i = 0; i < weights.Length; i++)
{
//Normalize weight
var w = sum == 0
? 1 / (double)e.Count
: weights[i] / sum;
if (rnd < w)
return e[i];
rnd -= w;
}
throw new Exception("Should not happen");
}
I've my own solution for this:
public class Randomizator3000
{
public class Item<T>
{
public T value;
public float weight;
public static float GetTotalWeight<T>(Item<T>[] p_itens)
{
float __toReturn = 0;
foreach(var item in p_itens)
{
__toReturn += item.weight;
}
return __toReturn;
}
}
private static System.Random _randHolder;
private static System.Random _random
{
get
{
if(_randHolder == null)
_randHolder = new System.Random();
return _randHolder;
}
}
public static T PickOne<T>(Item<T>[] p_itens)
{
if(p_itens == null || p_itens.Length == 0)
{
return default(T);
}
float __randomizedValue = (float)_random.NextDouble() * (Item<T>.GetTotalWeight(p_itens));
float __adding = 0;
for(int i = 0; i < p_itens.Length; i ++)
{
float __cacheValue = p_itens[i].weight + __adding;
if(__randomizedValue <= __cacheValue)
{
return p_itens[i].value;
}
__adding = __cacheValue;
}
return p_itens[p_itens.Length - 1].value;
}
}
Using it should look something like this (that's in Unity3D):
using UnityEngine;
using System.Collections;
public class teste : MonoBehaviour
{
Randomizator3000.Item<string>[] lista;
void Start()
{
lista = new Randomizator3000.Item<string>[10];
lista[0] = new Randomizator3000.Item<string>();
lista[0].weight = 10;
lista[0].value = "a";
lista[1] = new Randomizator3000.Item<string>();
lista[1].weight = 10;
lista[1].value = "b";
lista[2] = new Randomizator3000.Item<string>();
lista[2].weight = 10;
lista[2].value = "c";
lista[3] = new Randomizator3000.Item<string>();
lista[3].weight = 10;
lista[3].value = "d";
lista[4] = new Randomizator3000.Item<string>();
lista[4].weight = 10;
lista[4].value = "e";
lista[5] = new Randomizator3000.Item<string>();
lista[5].weight = 10;
lista[5].value = "f";
lista[6] = new Randomizator3000.Item<string>();
lista[6].weight = 10;
lista[6].value = "g";
lista[7] = new Randomizator3000.Item<string>();
lista[7].weight = 10;
lista[7].value = "h";
lista[8] = new Randomizator3000.Item<string>();
lista[8].weight = 10;
lista[8].value = "i";
lista[9] = new Randomizator3000.Item<string>();
lista[9].weight = 10;
lista[9].value = "j";
}
void Update ()
{
Debug.Log(Randomizator3000.PickOne<string>(lista));
}
}
In this example each value has a 10% chance of being written to the debug log.
Based loosely on python's numpy.random.choice(a=items, p=probs), which takes an array and a probability array of the same size.
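public T RandomChoice<T>(IEnumerable<T> a, IEnumerable<double> p)
{
// NOTE: in real code, reuse one Random instance instead of creating one per call.
Random random = new Random();
double target = random.NextDouble();
double accumulator = 0;
// Dispose the item enumerator when done.
using (IEnumerator<T> ae = a.GetEnumerator())
{
foreach (var prob in p)
{
// Advance the item enumerator in step with the probabilities.
if (!ae.MoveNext())
throw new ArgumentException("a has fewer elements than p");
accumulator += prob;
if (accumulator > target)
{
break;
}
}
return ae.Current;
}
}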
The probability array p must sum to (approx.) 1. This is to keep it consistent with the numpy interface (and mathematics), but you could easily change that if you wanted.
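If you would rather not require the probabilities to sum to 1, one option is to scale the random target by the total weight instead. The following is a minimal sketch under that assumption; the class name WeightedChoice and the method Pick are hypothetical names chosen for illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeightedChoice
{
    // One shared generator, so rapid successive calls do not restart the same sequence.
    // Random is not thread-safe; guard this if it is used from several threads.
    private static readonly Random Rng = new Random();

    // Picks items[i] with probability weights[i] / sum(weights).
    public static T Pick<T>(IList<T> items, IList<double> weights)
    {
        return Pick(items, weights, Rng);
    }

    // Overload taking the generator, so callers (and tests) can pass a seeded one.
    public static T Pick<T>(IList<T> items, IList<double> weights, Random random)
    {
        if (items.Count == 0 || items.Count != weights.Count)
            throw new ArgumentException("items and weights must be non-empty and of equal length");
        if (weights.Any(w => w < 0))
            throw new ArgumentException("weights must not be negative");

        double total = weights.Sum();
        if (!(total > 0)) // also rejects NaN
            throw new ArgumentException("weights must sum to a positive value");

        // Scale the target to [0, total) instead of normalizing every weight.
        double target = random.NextDouble() * total;
        double accumulator = 0;
        int lastPositive = -1;
        for (int i = 0; i < items.Count; i++)
        {
            if (weights[i] <= 0) continue; // zero-weight items are never picked
            lastPositive = i;
            accumulator += weights[i];
            if (target < accumulator) return items[i];
        }
        // Only reachable through floating-point rounding: use the last item with positive weight.
        return items[lastPositive];
    }
}
```

With this, the weights from the original question can be passed as they are, for example WeightedChoice.Pick(new[] { "A", "B", "C", "D" }, new[] { 30.0, 14.0, 31.0, 25.0 }).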
I've made a class (code below) that handles the creation of a "matching" quiz item on a test, this is the output:
It works fine.
However, in order to get it completely random, I have to put the thread to sleep for at least 300 counts between the random shuffling of the two columns, anything lower than 300 returns both columns sorted in the same order, as if it is using the same seed for randomness:
LeftDisplayIndexes.Shuffle();
Thread.Sleep(300);
RightDisplayIndexes.Shuffle();
What do I have to do to make the shuffling of the two columns completely random without this time wait?
full code:
using System.Collections.Generic;
using System;
using System.Threading;
namespace TestSort727272
{
class Program
{
static void Main(string[] args)
{
MatchingItems matchingItems = new MatchingItems();
matchingItems.Add("one", "111");
matchingItems.Add("two", "222");
matchingItems.Add("three", "333");
matchingItems.Add("four", "444");
matchingItems.Setup();
matchingItems.DisplayTest();
matchingItems.DisplayAnswers();
Console.ReadLine();
}
}
public class MatchingItems
{
public List<MatchingItem> Collection { get; set; }
public List<int> LeftDisplayIndexes { get; set; }
public List<int> RightDisplayIndexes { get; set; }
private char[] _numbers = { '1', '2', '3', '4', '5', '6', '7', '8' };
private char[] _letters = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h' };
public MatchingItems()
{
Collection = new List<MatchingItem>();
LeftDisplayIndexes = new List<int>();
RightDisplayIndexes = new List<int>();
}
public void Add(string leftText, string rightText)
{
MatchingItem matchingItem = new MatchingItem(leftText, rightText);
Collection.Add(matchingItem);
LeftDisplayIndexes.Add(Collection.Count - 1);
RightDisplayIndexes.Add(Collection.Count - 1);
}
public void DisplayTest()
{
Console.WriteLine("");
Console.WriteLine("--TEST:-------------------------");
for (int i = 0; i < Collection.Count; i++)
{
int leftIndex = LeftDisplayIndexes[i];
int rightIndex = RightDisplayIndexes[i];
Console.WriteLine("{0}. {1,-12}{2}. {3}", _numbers[i], Collection[leftIndex].LeftText, _letters[i], Collection[rightIndex].RightText);
}
}
public void DisplayAnswers()
{
Console.WriteLine("");
Console.WriteLine("--ANSWERS:-------------------------");
for (int i = 0; i < Collection.Count; i++)
{
string leftLabel = _numbers[i].ToString();
int leftIndex = LeftDisplayIndexes[i];
int rightIndex = RightDisplayIndexes.IndexOf(leftIndex);
string answerLabel = _letters[rightIndex].ToString();
Console.WriteLine("{0}. {1}", leftLabel, answerLabel);
}
}
public void Setup()
{
do
{
LeftDisplayIndexes.Shuffle();
Thread.Sleep(300);
RightDisplayIndexes.Shuffle();
} while (SomeLinesAreMatched());
}
private bool SomeLinesAreMatched()
{
for (int i = 0; i < LeftDisplayIndexes.Count; i++)
{
int leftIndex = LeftDisplayIndexes[i];
int rightIndex = RightDisplayIndexes[i];
if (leftIndex == rightIndex)
return true;
}
return false;
}
public void DisplayAsAnswer(int numberedIndex)
{
Console.WriteLine("");
Console.WriteLine("--ANSWER TO {0}:-------------------------", _numbers[numberedIndex]);
for (int i = 0; i < Collection.Count; i++)
{
int leftIndex = LeftDisplayIndexes[i];
int rightIndex = RightDisplayIndexes[i];
Console.WriteLine("{0}. {1,-12}{2}. {3}", _numbers[i], Collection[leftIndex].LeftText, _letters[i], Collection[rightIndex].RightText);
}
}
}
public class MatchingItem
{
public string LeftText { get; set; }
public string RightText { get; set; }
public MatchingItem(string leftText, string rightText)
{
LeftText = leftText;
RightText = rightText;
}
}
public static class Helpers
{
public static void Shuffle<T>(this IList<T> list)
{
Random rng = new Random();
int n = list.Count;
while (n > 1)
{
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
}
}
}
Move Random rng = new Random(); to a static variable.
MSDN says "The default seed value is derived from the system clock and has finite resolution". When you create many Random objects within a small time range, they all get the same seed, so they all produce the same sequence of values.
By reusing the same Random object you will advance to the next random value from a given seed.
Only make one instance of the Random class. When you construct it without a seed argument, it takes its seed from the computer clock, so two instances created close together can end up with the same seed.
public static class Helpers
{
static Random rng = new Random();
public static void Shuffle<T>(this IList<T> list)
{
int n = list.Count;
while (n > 1)
{
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
}
}
"I have to put the thread to sleep for at least 300 counts between the random shuffling of the two columns, anything lower than 300 returns both columns sorted in the same order, as if it is using the same seed for randomness"
You've answered your own question here. It is "as if it is using the same seed" because it is using the same seed! Due to the relatively coarse granularity of the Windows system clock, multiple Random instances constructed at nearly the same time will have the same seed value.
As Albin suggests, you should just have one Random object and use that. This way instead of a bunch of pseudorandom sequences that all start at the same seed and are therefore identical, your Shuffle method will be based on a single pseudorandom sequence.
Since you have it as an extension method, you probably want it to be reusable. In that case, consider having one overload that accepts a Random and one that doesn't:
static void Shuffle<T>(this IList<T> list, Random random)
{
// Your code goes here.
}
static void Shuffle<T>(this IList<T> list)
{
list.Shuffle(new Random());
}
This allows the caller to provide a static Random object if they are going to call Shuffle many times in a row; on the other hand, if it's just a one-time thing, Shuffle can take care of the Random instantiation itself.
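As a rough sketch of how the two overloads might be filled in: the version below reuses the Fisher-Yates loop from the question, and it assumes the parameterless overload falls back to a shared static generator (a deviation from the snippet above, which creates a new Random each time). ShuffleExtensions and SharedRandom are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ShuffleExtensions
{
    // Shared fallback generator (an assumption of this sketch). Not thread-safe.
    private static readonly Random SharedRandom = new Random();

    // Fisher-Yates shuffle driven by a caller-supplied generator.
    public static void Shuffle<T>(this IList<T> list, Random random)
    {
        for (int n = list.Count - 1; n > 0; n--)
        {
            int k = random.Next(n + 1); // 0 <= k <= n
            T value = list[k];
            list[k] = list[n];
            list[n] = value;
        }
    }

    // Convenience overload for one-off calls.
    public static void Shuffle<T>(this IList<T> list)
    {
        list.Shuffle(SharedRandom);
    }
}
```

A caller that shuffles two lists back to back can then create one Random and pass it to both calls, so the second shuffle continues the sequence instead of restarting it.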
One last thing I want to point out: since the solution involves a single shared Random object, you should be aware that the Random class is not thread-safe. If there's a chance you might call Shuffle from multiple threads concurrently, you'll need to lock around your Next call (what I prefer is a [ThreadStatic] Random object for each thread, each one seeded with a value provided by a "core" Random, but that's a bit more involved).
Otherwise you could end up with Next suddenly just returning an endless sequence of zeroes.
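The [ThreadStatic] approach mentioned above could look roughly like this. It is only a sketch: the names ThreadSafeRandom, SeedSource and Instance are made up for illustration, and the "core" generator is guarded by a lock so that handing out seeds is itself thread-safe.

```csharp
using System;

public static class ThreadSafeRandom
{
    // The "core" generator: only used to produce seeds, always under the lock.
    private static readonly Random SeedSource = new Random();
    private static readonly object SeedLock = new object();

    // One generator per thread; [ThreadStatic] fields start out null on every thread.
    [ThreadStatic]
    private static Random _local;

    public static Random Instance
    {
        get
        {
            if (_local == null)
            {
                int seed;
                lock (SeedLock)
                {
                    seed = SeedSource.Next();
                }
                _local = new Random(seed);
            }
            return _local;
        }
    }
}
```

Code on any thread can then call ThreadSafeRandom.Instance.Next(n) without further locking, as long as the returned object is not handed to another thread.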
The problem is that you are creating your Random objects too close to each other in time. When you do that, their internal pseudo-random generators are seeded with the same system time, and the sequence of numbers they produce will be identical.
The simplest solution is to reuse a single Random object, either by passing it as an argument to your shuffle algorithm or storing it as a member-variable of the class in which the shuffle is implemented.
The way random generators work, roughly, is that they have a seed from which the random values are derived. When you create a new Random object without specifying a seed, the seed is taken from the system clock, which only has a resolution on the order of milliseconds.
Let's say when you create the first Random object, the seed is 10000. After calling it three times, the seeds were 20000, 40000, 80000, generating whatever numbers come from those seeds (let's say 5, 6, 2). If you create a new Random object very quickly, the same seed will be used, 10000. So, if you call it three times, you'll get the same seeds, 20000, 40000, and 80000, and the same numbers from them.
However, if you re-use the same object, the latest seed was 80000, so instead you'll generate three new seeds, 160000, 320000, and 640000, which are very likely to give you new values.
That's why you have to use one random generator, without creating a new one every time.
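The effect is easy to reproduce by passing the seed explicitly. In this small sketch, SeedDemo and its methods are hypothetical helpers, and 10000 is just the illustrative seed from the explanation above, not a value the runtime would actually pick.

```csharp
using System;
using System.Linq;

public static class SeedDemo
{
    // Draws `count` values in [0, 100) from the given generator.
    public static int[] Draw(Random random, int count)
    {
        var values = new int[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = random.Next(100);
        }
        return values;
    }

    // Two separate generators built from the same seed produce the same sequence.
    public static bool SameSeedGivesSameSequence(int seed)
    {
        int[] first = Draw(new Random(seed), 5);
        int[] second = Draw(new Random(seed), 5);
        return first.SequenceEqual(second);
    }
}
```

SeedDemo.SameSeedGivesSameSequence(10000) returns true. Calling Draw twice on a single generator instead continues the sequence, which is what reusing one Random gives you.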
Try to use Random() just one time. You'll get the idea.
Hi I coded this OneAtRandom() extension method:
public static class GenericIListExtensions
{
public static T OneAtRandom<T>(this IList<T> list)
{
list.ThrowIfNull("list");
if (list.Count == 0)
throw new ArgumentException("OneAtRandom() cannot be called on 'list' with 0 elements");
int randomIdx = new Random().Next(list.Count);
return list[randomIdx];
}
}
Testing it using this unit test fails:
[Test]
public void ShouldNotAlwaysReturnTheSameValueIfOneAtRandomCalledOnListOfLengthTwo()
{
int SomeStatisticallyUnlikelyNumberOf = 100;
IList<string> list = new List<string>() { FirstString, SecondString };
bool firstStringFound = false;
bool secondStringFound = false;
SomeStatisticallyUnlikelyNumberOf.Times(() =>
{
string theString = list.OneAtRandom();
if (theString == FirstString) firstStringFound = true;
if (theString == SecondString) secondStringFound = true;
});
Assert.That(firstStringFound && secondStringFound);
}
It seems that int randomIdx = new Random().Next(list.Count); is generating the same number 100 times in a row. I think this is possibly because the seed is based on the time?
How can I get this to work properly?
Thanks :)
You shouldn't be calling new Random() for every iteration, because each new instance is reseeded and, when created in quick succession, generates the same sequence of numbers again. Create one Random object at the start of your application and pass it into your function as a parameter.
public static class GenericIListExtensions
{
public static T OneAtRandom<T>(this IList<T> list, Random random)
{
list.ThrowIfNull("list");
if (list.Count == 0)
throw new ArgumentException("OneAtRandom() cannot be called on 'list' with 0 elements");
int randomIdx = random.Next(list.Count);
return list[randomIdx];
}
}
This also has the advantage of making your code more testable as you can pass in a Random that is seeded to a value of your choice so that your tests are repeatable.
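To illustrate the repeatability point, here is a sketch of a check that uses a fixed seed. It replaces the ThrowIfNull helper from the question with plain null checks so that it stands alone; RandomPickExtensions, RepeatabilityCheck, PickSequence and the seed value are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class RandomPickExtensions
{
    public static T OneAtRandom<T>(this IList<T> list, Random random)
    {
        if (list == null) throw new ArgumentNullException("list");
        if (random == null) throw new ArgumentNullException("random");
        if (list.Count == 0)
            throw new ArgumentException("OneAtRandom() cannot be called on 'list' with 0 elements");
        return list[random.Next(list.Count)];
    }
}

public static class RepeatabilityCheck
{
    // Picks `count` items using a generator built from a fixed seed,
    // so every run with the same seed yields the same picks.
    public static List<string> PickSequence(int seed, int count)
    {
        var list = new List<string> { "first", "second" };
        var random = new Random(seed);
        var picks = new List<string>(count);
        for (int i = 0; i < count; i++)
        {
            picks.Add(list.OneAtRandom(random));
        }
        return picks;
    }
}
```

Two calls such as RepeatabilityCheck.PickSequence(12345, 100) return identical lists, which is what lets a unit test assert on exact results instead of on statistical likelihood.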
It's generating the same number 100 times because every call creates a new generator, and generators created in such quick succession all start from the same time-based seed.
Move the "new Random()" to the constructor or a static var, and use the generated object.
You could use a seed based on the current time to create the instance of Random. A sample on MSDN uses the following code:
int randomInstancesToCreate = 4;
Random[] randomEngines = new Random[randomInstancesToCreate];
for (int ctr = 0; ctr < randomInstancesToCreate; ctr++)
{
randomEngines[ctr] = new Random(unchecked((int) (DateTime.Now.Ticks >> ctr)));
}