I have the following function, which I am using to find the terminal cumulative positive and negative values, and it is working:
public class CumulativeTotal
{
[Test]
public void CalculatesTerminalValue()
{
IEnumerable<decimal> sequence = new decimal[] { 10, 20, 20, -20, -50, 10 };
var values = FindTerminalValues(sequence);
Assert.That(values.Item1, Is.EqualTo(-20));
Assert.That(values.Item2, Is.EqualTo(50));
Assert.Pass();
}
public static Tuple<decimal,decimal> FindTerminalValues(IEnumerable<decimal> values)
{
decimal largest = 0;
decimal smallest = 0;
decimal current = 0;
foreach (var value in values)
{
current += value;
if (current > largest)
largest = current;
else if (current < smallest)
smallest = current;
}
return new Tuple<decimal, decimal>(smallest,largest);
}
}
However, in the interests of learning, how could I implement this with LINQ?
I can see there is a package called MoreLinq, but I'm not sure where to start!
You can try the standard LINQ Aggregate method:
// Let's return a named tuple: unlike "min" and "max",
// .Item1 and .Item2 are not readable
public static (decimal min, decimal max) FindTerminalValues(IEnumerable<decimal> values) {
//public method arguments validation
if (values is null)
throw new ArgumentNullException(nameof(values));
(var min, var max, _) = values
.Aggregate((min: decimal.MaxValue, max: decimal.MinValue, curr: 0m),
(s, a) => (Math.Min(s.min, s.curr + a),
Math.Max(s.max, s.curr + a),
s.curr + a));
return (min, max);
}
Yes, you can use MoreLinq like this; it has the Scan method.
public static Tuple<decimal, decimal> FindTerminalValues(IEnumerable<decimal> values)
{
var cumulativeSum = values.Scan((acc, x) => acc + x).ToList();
decimal min = cumulativeSum.Min();
decimal max = cumulativeSum.Max();
return new Tuple<decimal, decimal>(min, max);
}
The Scan extension method generates a new sequence by applying a function to each element in the input sequence, using the previous result as an accumulator. In this case, the function is simply the addition operator, so the Scan method generates a sequence of the cumulative sums of the input sequence.
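For instance, the running totals for the sequence in the question look like this (a small self-contained sketch, assuming the MoreLinq NuGet package is referenced):
using System;
using System.Linq;
using MoreLinq;
class ScanDemo
{
    static void Main()
    {
        var sequence = new decimal[] { 10, 20, 20, -20, -50, 10 };
        // Scan yields every intermediate running total: 10, 30, 50, 30, -20, -10
        var cumulativeSum = sequence.Scan((acc, x) => acc + x).ToList();
        Console.WriteLine(string.Join(", ", cumulativeSum));
        // The terminal values are then simply the Min and Max of that sequence.
        Console.WriteLine($"min: {cumulativeSum.Min()}, max: {cumulativeSum.Max()}"); // min: -20, max: 50
    }
}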
The major flaw in the code you've presented is that if the running sum of the sequence stays below zero or above zero the whole time, the algorithm incorrectly returns zero as one of the terminals.
Take this:
IEnumerable<decimal> sequence = new decimal[] { 10, 20, };
Your current algorithm returns (0, 30) when it should be (10, 30).
To correct that you must start with the first value of the sequence as the default minimum and maximum.
Here's an implementation that does that:
public static (decimal min, decimal max) FindTerminalValues(IEnumerable<decimal> values)
{
if (!values.Any())
throw new System.ArgumentException("no values");
decimal first = values.First();
IEnumerable<decimal> scan = values.Scan((x, y) => x + y);
return scan.Aggregate(
(min: first, max: first),
(a, x) =>
(
min: x < a.min ? x : a.min,
max: x > a.max ? x : a.max)
);
}
It uses System.Interactive to get the Scan operator (but you could use MoreLinq).
However, the one downside to this approach is that IEnumerable<decimal> is not guaranteed to return the same values every time. You either need to (1) pass in a decimal[], List<decimal>, or other structure that will always return the same sequence, or (2) ensure you only iterate the IEnumerable<decimal> once.
Here's how to do (2):
public static (decimal min, decimal max) FindTerminalValues(IEnumerable<decimal> values)
{
var e = values.GetEnumerator();
if (!e.MoveNext())
throw new System.ArgumentException("no values");
var terminal = (min: e.Current, max: e.Current);
decimal value = e.Current;
while (e.MoveNext())
{
value += e.Current;
terminal = (Math.Min(value, terminal.min), Math.Max(value, terminal.max));
}
return terminal;
}
You can use the Aggregate method in LINQ to achieve this.
The Aggregate method applies a function to each element in a sequence and returns the accumulated result. It takes an initial accumulator as a parameter; here the accumulator is a tuple that keeps track of the running sum together with the smallest and largest values reached so far.
public static Tuple<decimal,decimal> FindTerminalValues(IEnumerable<decimal> values)
{
    return values.Aggregate(
        // Initial accumulator value: (running sum, smallest, largest)
        Tuple.Create(0m, 0m, 0m),
        // Accumulation function:
        (acc, value) =>
        {
            // Add the current value to the running sum:
            var current = acc.Item1 + value;
            // Update the smallest and largest accumulated values:
            var smallest = Math.Min(current, acc.Item2);
            var largest = Math.Max(current, acc.Item3);
            // Return the updated accumulator value:
            return Tuple.Create(current, smallest, largest);
        },
        // Result selector: drop the running sum and keep (smallest, largest):
        acc => new Tuple<decimal, decimal>(acc.Item2, acc.Item3));
}
I have a variable representing a quantity in some given unit:
enum Unit
{
Single,
Thousand,
Million,
Billion,
Trillion
}
public class Quantity
{
public double number;
public Unit numberUnit;
public Int64 GetNumberInSingleUnits()
{
// ???
}
}
For example, imagine
var GDP_Of_America = new Quantity { number = 16.66, numberUnit = Unit.Trillion };
Int64 gdp = GDP_Of_America.GetNumberInSingleUnits(); // should return 16,660,000,000,000
My question is basically - how can I implement the "GetNumberInSingleUnits" function?
I can't just multiply with some UInt64 factor, e.g.
double num = 0.5;
UInt64 factor = 1000000000000;
var result = num * factor; // won't work! results in double
The regular numeric operations result in a double, but the result may be too large for a double to represent exactly.
How could I do this conversion?
PS: I know the class "Quantity" is not a great way to store information, but this is dictated by the input data of my application, which is in non-single units (e.g. millions, billions, etc.).
Like I said, decimals can help you here:
public enum Unit
{
Micro = -6, Milli = -3, Centi = -2, Deci = -1,
One /* Don't really like this name */, Deca, Hecto, Kilo, Mega = 6, Giga = 9
}
public struct Quantity
{
public decimal Value { get; private set; }
public Unit Unit { get; private set; }
public Quantity(decimal value, Unit unit) :
this()
{
Value = value;
Unit = unit;
}
public decimal OneValue /* This one either */
{
get
{
return Value * (decimal)Math.Pow(10, (int)Unit);
}
}
}
With decimals you won't lose a lot of precision until after you decide to convert them to long (and beware of over/underflows).
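For example, using the Quantity struct and Unit enum above (the output comments are my own annotations):
var distance = new Quantity(16.66m, Unit.Giga);
Console.WriteLine(distance.OneValue);        // 16660000000.00 (decimal keeps the scale of 16.66)
Console.WriteLine((long)distance.OneValue);  // 16660000000; the cast throws OverflowException if the value does not fit in a long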
Anton's answer seems like a good solution.
Just for the sake of discussion, here is another potential way.
I don't like this one at all, as it seems very messy; however, I think it might avoid imprecision, if that ever turned out to be an issue with decimals.
public Int64 GetAsInt64(double number)
{
// Returns 0 for single units, 3 for thousands, 6 for millions, etc.
uint e = GetFactorExponent();
// Converts to scientific notation
// e.g. number = -1.2345, unit millions to "-1.2345e6"
string str = String.Format("{0:#,0.###########################}", number) + "e" + e;
// Parses scientific notation into Int64
Int64 result = Int64.Parse(str, NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint | NumberStyles.AllowExponent | NumberStyles.AllowThousands);
return result;
}
I am working on a system that needs to accept and display complex fractions. The code for accepting fractions and turning them into a double works, but when I want to display that value, I need to convert back to a fractional representation.
EDIT: I have fixed the overflow problem, but that didn't solve fractions like 1/3 or 5/6. So I have devised a very hacky way to do this. I have code which generates the decimal representation of every fraction 0->64 over 1->64, and saves the most simplified form. This way, I can iterate through the list and find the closest fraction, and simply display that. Will post code once I have some.
I have code now that works for the vast majority of numbers, but occasionally I will get a tiny fraction like 1/321. This gets converted to a double, but cannot be converted back, because in my approach, the numerator causes an integer overflow.
Here is my code. I'm wondering if there is a better approach, or if there is some way to safely convert these to longs without losing the precision needed for a correct result:
public static String DecimalToFraction(double dec)
{
string str = dec.ToString();
if (str.Contains('.'))
{
String[] parts = str.Split('.');
long whole = long.Parse(parts[0]);
long numerator = long.Parse(parts[1]);
long denominator = (long)Math.Pow(10, parts[1].Length);
long divisor = GCD(numerator, denominator);
long num = numerator / divisor;
long den = denominator / divisor;
String fraction = num + "/" + den;
if (whole > 0)
{
return whole + " " + fraction;
}
else
{
return fraction;
}
}
else
{
return str;
}
}
public static long GCD(long a, long b)
{
return b == 0 ? a : GCD(b, a % b);
}
Recently I had to code a similar scenario. In my case, converting from decimal to rational number had to be a little more mathematically correct so I ended up implementing a Continued Fraction algorithm.
Although it is tailor-made for my concrete implementation of RationalNumber, you should get the idea. It's a relatively simple algorithm that works reasonably well for any rational number approximation. Note that the implementation will give you the closest approximation with the required precision.
/// <summary>
/// Represents a rational number with 64-bit signed integer numerator and denominator.
/// </summary>
[Serializable]
public struct RationalNumber : IComparable, IFormattable, IConvertible, IComparable<RationalNumber>, IEquatable<RationalNumber>
{
private const int MAXITERATIONCOUNT = 20;
public RationalNumber(long number) {...}
public RationalNumber(long numerator, long denominator) {...}
public RationalNumber(RationalNumber numerator, RationalNumber denominator) {...}
...
/// <summary>
/// Defines an implicit conversion of a 64-bit signed integer to a rational number.
/// </summary>
/// <param name="value">The value to convert to a rational number.</param>
/// <returns>A rational number that contains the value of the value parameter as its numerator and 1 as its denominator.</returns>
public static implicit operator RationalNumber(long value)
{
return new RationalNumber(value);
}
/// <summary>
/// Defines an explicit conversion of a rational number to a double-precision floating-point number.
/// </summary>
/// <param name="value">The value to convert to a double-precision floating-point number.</param>
/// <returns>A double-precision floating-point number that contains the resulting value of dividing the rational number's numerator by its denominator.</returns>
public static explicit operator double(RationalNumber value)
{
return (double)value.numerator / value.Denominator;
}
...
/// <summary>
/// Adds two rational numbers.
/// </summary>
/// <param name="left">The first value to add.</param>
/// <param name="right">The second value to add.</param>
/// <returns>The sum of left and right.</returns>
public static RationalNumber operator +(RationalNumber left, RationalNumber right)
{
//First we try directly adding in a checked context. If an overflow occurs we use the least common multiple and return the result. If it overflows again, it
//will be up to the consumer to decide what he will do with it.
//Cost penalty should be minimal as adding numbers that cause an overflow should be very rare.
RationalNumber result;
try
{
long numerator = checked(left.numerator * right.Denominator + right.numerator * left.Denominator);
long denominator = checked(left.Denominator * right.Denominator);
result = new RationalNumber(numerator,denominator);
}
catch (OverflowException)
{
long lcm = RationalNumber.getLeastCommonMultiple(left.Denominator, right.Denominator);
result = new RationalNumber(left.numerator * (lcm / left.Denominator) + right.numerator * (lcm / right.Denominator), lcm);
}
return result;
}
private static long getGreatestCommonDivisor(long i1, long i2)
{
Debug.Assert(i1 != 0 || i2 != 0, "Whoops!. Both arguments are 0, this should not happen.");
//Division based algorithm
long i = Math.Abs(i1);
long j = Math.Abs(i2);
long t;
while (j != 0)
{
t = j;
j = i % j;
i = t;
}
return i;
}
private static long getLeastCommonMultiple(long i1, long i2)
{
if (i1 == 0 && i2 == 0)
return 0;
long lcm = i1 / getGreatestCommonDivisor(i1, i2) * i2;
return lcm < 0 ? -lcm : lcm;
}
...
/// <summary>
/// Returns the nearest rational number approximation to a double-precision floating-point number with a specified precision.
/// </summary>
/// <param name="target">Target value of the approximation.</param>
/// <param name="precision">Minimum precision of the approximation.</param>
/// <returns>Nearest rational number with, at least, the required precision.</returns>
/// <exception cref="System.ArgumentException">Can not find a rational number approximation with specified precision.</exception>
/// <exception cref="System.OverflowException">target is larger than Mathematics.RationalNumber.MaxValue or smaller than Mathematics.RationalNumber.MinValue.</exception>
/// <remarks>It is important to clarify that the method returns the first rational number found that complies with the specified precision.
/// The method is not required to return an exact rational number approximation even if such number exists.
/// The returned rational number will always be in coprime form.</remarks>
public static RationalNumber GetNearestRationalNumber(double target, double precision)
{
//Continued fraction algorithm: http://en.wikipedia.org/wiki/Continued_fraction
//Implemented recursively. Problem is figuring out when precision is met without unwinding each solution. Haven't figured out how to do that.
//Current implementation evaluates a Rational approximation for increasing algorithm depths until precision criteria is met or maximum depth is reached (MAXITERATIONCOUNT)
//Efficiency is probably improvable but this method will not be used in any performance critical code. No use in optimizing it unless there is a good reason.
//Current implementation works reasonably well.
RationalNumber nearestRational = RationalNumber.zero;
int steps = 0;
while (Math.Abs(target - (double)nearestRational) > precision)
{
if (steps > MAXITERATIONCOUNT)
throw new ArgumentException(Strings.RationalMaximumIterationsExceptionMessage, "precision");
nearestRational = getNearestRationalNumber(target, 0, steps++);
}
return nearestRational;
}
private static RationalNumber getNearestRationalNumber(double number, int currentStep, int maximumSteps)
{
long integerPart;
integerPart = checked((long)number);
double fractionalPart = number - integerPart;
while (currentStep < maximumSteps && fractionalPart != 0)
{
return integerPart + new RationalNumber(1, getNearestRationalNumber(1 / fractionalPart, ++currentStep, maximumSteps));
}
return new RationalNumber(integerPart);
}
}
UPDATE: Whoops, forgot to include the operator + code. Fixed it.
You could use BigRational, which Microsoft released under their BCL project on CodePlex. It supports arbitrarily large rational numbers, and actually stores the value internally as a ratio. The nice thing is that you can treat it largely as a normal numeric type, since all of the operators are overloaded for you.
Interestingly, it lacks a way to print the number as a decimal. I wrote some code that did this, though, in a previous answer of mine. However, there are no guarantees on its performance or quality (I barely remember writing it).
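Roughly, the idea is repeated long division on the numerator and denominator. Here is a sketch, assuming you can get at the ratio as two BigInteger values; ToDecimalString is a hypothetical helper of mine, not part of BigRational:
using System.Numerics;
using System.Text;
static string ToDecimalString(BigInteger numerator, BigInteger denominator, int digits)
{
    string sign = numerator.Sign * denominator.Sign < 0 ? "-" : "";
    numerator = BigInteger.Abs(numerator);
    denominator = BigInteger.Abs(denominator);
    // Whole part first, then generate one decimal digit at a time from the remainder.
    BigInteger whole = BigInteger.DivRem(numerator, denominator, out BigInteger remainder);
    var sb = new StringBuilder(sign + whole);
    if (digits > 0 && !remainder.IsZero)
    {
        sb.Append('.');
        for (int i = 0; i < digits && !remainder.IsZero; i++)
        {
            remainder *= 10;
            sb.Append(BigInteger.DivRem(remainder, denominator, out remainder));
        }
    }
    return sb.ToString();
}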
Keep the number as a fraction:
struct Fraction
{
private int _numerator;
private int _denominator;
public int Numerator { get { return _numerator; } }
public int Denominator { get { return _denominator; } }
public double Value { get { return ((double) Numerator)/Denominator; } }
public Fraction( int n, int d )
{
// move negative to numerator.
if( d < 0 )
{
_numerator = -n;
_denominator = -d;
}
else if( d > 0 )
{
_numerator = n;
_denominator = d;
}
else
throw new ArgumentException( "Denominator cannot be 0" ); // NumberFormatException is Java; use ArgumentException in C#
}
public override string ToString()
{
string ret = "";
int whole = Numerator / Denominator;
if( whole != 0 )
ret += whole + " ";
// keep the sign on the fraction when there is no whole part to carry it
int remainder = whole != 0 ? Math.Abs(Numerator % Denominator) : Numerator % Denominator;
ret += remainder + "/" + Denominator;
return ret;
}
}
Please check these 2 methods:
/// <summary>
/// Converts Decimals into Fractions.
/// </summary>
/// <param name="value">Decimal value</param>
/// <returns>Fraction in string type</returns>
public string DecimalToFraction(double value)
{
string result;
double numerator, realValue = value;
int num, den, decimals, length;
num = (int)value;
value = value - num;
value = Math.Round(value, 5);
length = value.ToString().Length;
decimals = length - 2;
numerator = value;
for (int i = 0; i < decimals; i++)
{
if (realValue < 1)
{
numerator = numerator * 10;
}
else
{
realValue = realValue * 10;
numerator = realValue;
}
}
den = length - 2;
string ten = "1";
for (int i = 0; i < den; i++)
{
ten = ten + "0";
}
den = int.Parse(ten);
num = (int)numerator;
result = SimplifiedFractions(num, den);
return result;
}
/// <summary>
/// Converts Fractions into Simplest form.
/// </summary>
/// <param name="num">Numerator</param>
/// <param name="den">Denominator</param>
/// <returns>Simplest Fractions in string type</returns>
string SimplifiedFractions(int num, int den)
{
int remNum, remDen, counter;
if (num > den)
{
counter = den;
}
else
{
counter = num;
}
for (int i = 2; i <= counter; i++)
{
remNum = num % i;
if (remNum == 0)
{
remDen = den % i;
if (remDen == 0)
{
num = num / i;
den = den / i;
i--;
}
}
}
return num.ToString() + "/" + den.ToString();
}
We have a very data intensive system. It stores raw data, then computes percentages based on the number of correct responses / total trials.
Recently we have had customers who want to import old data into our system.
I need a way to convert a percentage to the nearest fraction.
Examples.
33% needs to give me 2/6. EVEN though 1/3 is .33333333
67% needs to give me 4/6. EVEN though 4/6 is .6666667
I realize I could just compute that to be 67/100, but that means I'd have to add 100 data points to the system when 6 would suffice.
Does anyone have any ideas?
EDIT
The denominator could be anything. They are giving me a raw, rounded percentage and I'm trying to get as close to it with RAW data as possible.
Your requirements are contradictory: on the one hand, you want to "convert a percentage to the nearest fraction" (*), but on the other hand, you want fractions with small(est) numbers. You need to find some compromise on when/how to drop precision in favor of smaller numbers. Your problem as it stands is not solvable.
(*) The nearest fraction f for any given (integer) percentage n is n/100. Per definition.
I have tried to satisfy your requirement by using continued fractions. By limiting the depth to three I got a reasonable approximation.
I failed to come up with an iterative (or recursive) approach in reasonable time. Nevertheless I have cleaned it up a little. (I know that 3-letter variable names are not good, but I can't think of good names for them :-/ )
The code gives you the best rational approximation within the specified tolerance it can find. The resulting fraction is reduced and is the best approximation among all fractions with the same or lower denominator.
public partial class Form1 : Form
{
Random rand = new Random();
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
for (int i = 0; i < 10; i++)
{
double value = rand.NextDouble();
var fraction = getFraction(value);
var numerator = fraction.Key;
var denominator = fraction.Value;
System.Console.WriteLine(string.Format("Value {0:0.0000} approximated by {1}/{2} = {3:0.0000}", value, numerator, denominator, (double)numerator / denominator));
}
/*
Output:
Value 0,4691 approximated by 8/17 = 0,4706
Value 0,0740 approximated by 1/14 = 0,0714
Value 0,7690 approximated by 3/4 = 0,7500
Value 0,7450 approximated by 3/4 = 0,7500
Value 0,3748 approximated by 3/8 = 0,3750
Value 0,7324 approximated by 3/4 = 0,7500
Value 0,5975 approximated by 3/5 = 0,6000
Value 0,7544 approximated by 3/4 = 0,7500
Value 0,7212 approximated by 5/7 = 0,7143
Value 0,0469 approximated by 1/21 = 0,0476
Value 0,2755 approximated by 2/7 = 0,2857
Value 0,8763 approximated by 7/8 = 0,8750
Value 0,8255 approximated by 5/6 = 0,8333
Value 0,6170 approximated by 3/5 = 0,6000
Value 0,3692 approximated by 3/8 = 0,3750
Value 0,8057 approximated by 4/5 = 0,8000
Value 0,3928 approximated by 2/5 = 0,4000
Value 0,0235 approximated by 1/43 = 0,0233
Value 0,8528 approximated by 6/7 = 0,8571
Value 0,4536 approximated by 5/11 = 0,4545
*/
}
private KeyValuePair<int, int> getFraction(double value, double tolerance = 0.02)
{
double f0 = 1 / value;
double f1 = 1 / (f0 - Math.Truncate(f0));
int a_t = (int)Math.Truncate(f0);
int a_r = (int)Math.Round(f0);
int b_t = (int)Math.Truncate(f1);
int b_r = (int) Math.Round(f1);
int c = (int)Math.Round(1 / (f1 - Math.Truncate(f1)));
if (Math.Abs(1.0 / a_r - value) <= tolerance)
return new KeyValuePair<int, int>(1, a_r);
else if (Math.Abs(b_r / (a_t * b_r + 1.0) - value) <= tolerance)
return new KeyValuePair<int, int>(b_r, a_t * b_r + 1);
else
return new KeyValuePair<int, int>(c * b_t + 1, c * a_t * b_t + a_t + c);
}
}
Would it have to return 2/6 rather than 1/3? If it's always in sixths, then
Math.Round(33 * 6 / 100.0) = 2
Answering my own question here. Would this work?
public static Fraction Convert(decimal value) {
for (decimal numerator = 1; numerator <= 10; numerator++) {
for (decimal denominator = 1; denominator < 10; denominator++) {
var result = numerator / denominator;
if (Math.Abs(value - result) < .01m)
return new Fraction() { Numerator = numerator, Denominator = denominator };
}
}
throw new Exception();
}
This will keep my denominator below 10.
I have a set of values, and an associated percentage for each:
a: 70% chance
b: 20% chance
c: 10% chance
I want to select a value (a, b, c) based on the percentage chance given.
How do I approach this?
My attempt so far looks like this:
r = random.random()
if r <= .7:
return a
elif r <= .9:
return b
else:
return c
I'm stuck coming up with an algorithm to handle this. How should I approach it so it can handle larger sets of values without just chaining together if-else blocks?
(any explanation or answers in pseudo-code are fine. a python or C# implementation would be especially helpful)
Here is a complete solution in C#:
public class ProportionValue<T>
{
public double Proportion { get; set; }
public T Value { get; set; }
}
public static class ProportionValue
{
public static ProportionValue<T> Create<T>(double proportion, T value)
{
return new ProportionValue<T> { Proportion = proportion, Value = value };
}
static Random random = new Random();
public static T ChooseByRandom<T>(
this IEnumerable<ProportionValue<T>> collection)
{
var rnd = random.NextDouble();
foreach (var item in collection)
{
if (rnd < item.Proportion)
return item.Value;
rnd -= item.Proportion;
}
throw new InvalidOperationException(
"The proportions in the collection do not add up to 1.");
}
}
Usage:
var list = new[] {
ProportionValue.Create(0.7, "a"),
ProportionValue.Create(0.2, "b"),
ProportionValue.Create(0.1, "c")
};
// Outputs "a" with probability 0.7, etc.
Console.WriteLine(list.ChooseByRandom());
For Python:
>>> import random
>>> dst = 70, 20, 10
>>> vls = 'a', 'b', 'c'
>>> picks = [v for v, d in zip(vls, dst) for _ in range(d)]
>>> for _ in range(12): print random.choice(picks),
...
a c c b a a a a a a a a
>>> for _ in range(12): print random.choice(picks),
...
a c a c a b b b a a a a
>>> for _ in range(12): print random.choice(picks),
...
a a a a c c a c a a c a
>>>
General idea: make a list where each item is repeated a number of times proportional to the probability it should have; use random.choice to pick one at random (uniformly), this will match your required probability distribution. Can be a bit wasteful of memory if your probabilities are expressed in peculiar ways (e.g., 70, 20, 10 makes a 100-items list where 7, 2, 1 would make a list of just 10 items with exactly the same behavior), but you could divide all the counts in the probabilities list by their greatest common factor if you think that's likely to be a big deal in your specific application scenario.
Apart from memory consumption issues, this should be the fastest solution -- just one random number generation per required output result, and the fastest possible lookup from that random number, no comparisons &c. If your likely probabilities are very weird (e.g., floating point numbers that need to be matched to many, many significant digits), other approaches may be preferable;-).
Knuth references Walker's method of aliases. Searching on this, I find http://code.activestate.com/recipes/576564-walkers-alias-method-for-random-objects-with-diffe/ and http://prxq.wordpress.com/2006/04/17/the-alias-method/. This gives the exact probabilities required in constant time per number generated with linear time for setup (curiously, n log n time for setup if you use exactly the method Knuth describes, which does a preparatory sort you can avoid).
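For reference, here is a rough C# sketch of the alias method (my own illustration of the technique those links describe, not the recipe code itself):
using System;
using System.Collections.Generic;
using System.Linq;
// Setup is O(n); each call to Next() is O(1).
class AliasSampler
{
    private readonly double[] prob;   // chance of keeping the column's own index
    private readonly int[] alias;     // fallback index for each column
    private readonly Random rng = new Random();
    public AliasSampler(IList<double> weights)
    {
        int n = weights.Count;
        double total = weights.Sum();
        prob = new double[n];
        alias = new int[n];
        // Scale the weights so the average column height is exactly 1.
        double[] scaled = weights.Select(w => w * n / total).ToArray();
        var small = new Stack<int>();
        var large = new Stack<int>();
        for (int i = 0; i < n; i++)
            (scaled[i] < 1.0 ? small : large).Push(i);
        // Pair each under-full column with an over-full one.
        while (small.Count > 0 && large.Count > 0)
        {
            int s = small.Pop(), l = large.Pop();
            prob[s] = scaled[s];
            alias[s] = l;
            scaled[l] += scaled[s] - 1.0;
            (scaled[l] < 1.0 ? small : large).Push(l);
        }
        // Whatever is left over is (numerically) exactly full.
        while (large.Count > 0) prob[large.Pop()] = 1.0;
        while (small.Count > 0) prob[small.Pop()] = 1.0;
    }
    public int Next()
    {
        int column = rng.Next(prob.Length);
        return rng.NextDouble() < prob[column] ? column : alias[column];
    }
}
// Usage: new AliasSampler(new double[] { 70, 20, 10 }).Next()
// returns 0, 1 or 2 with probability 0.7, 0.2 and 0.1 respectively.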
Take the list of weights and find the cumulative totals: 70, 70+20, 70+20+10. Pick a random number greater than or equal to zero and less than the total. Iterate over the items and return the first value for which the cumulative sum of the weights is greater than this random number:
def select( values ):
variate = random.random() * sum( values.values() )
cumulative = 0.0
for item, weight in values.items():
cumulative += weight
if variate < cumulative:
return item
return item # Shouldn't get here, but just in case of rounding...
print select( { "a": 70, "b": 20, "c": 10 } )
This solution, as implemented, should also be able to handle fractional weights and weights that add up to any number so long as they're all non-negative.
Let T = the sum of all item weights
Let R = a random number between 0 and T
Iterate the item list subtracting each item weight from R and return the item that causes the result to become <= 0.
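A minimal C# sketch of that subtraction approach (the names are illustrative, not from any particular library):
using System;
using System.Collections.Generic;
using System.Linq;
static class WeightedPicker
{
    // Subtract each weight from a random number in [0, total); the item that
    // drives it to or below zero is the one picked.
    public static T Pick<T>(IList<(T item, double weight)> items, Random rng)
    {
        double r = rng.NextDouble() * items.Sum(x => x.weight);
        foreach (var (item, weight) in items)
        {
            r -= weight;
            if (r <= 0)
                return item;
        }
        return items[items.Count - 1].item; // guard against floating-point rounding
    }
}
// Usage:
// var rng = new Random();
// var picked = WeightedPicker.Pick(new[] { ("a", 70.0), ("b", 20.0), ("c", 10.0) }, rng);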
def weighted_choice(probabilities):
random_position = random.random() * sum(probabilities)
current_position = 0.0
for i, p in enumerate(probabilities):
current_position += p
if random_position < current_position:
return i
return None
Because random.random will always return < 1.0, the final return should never be reached.
import random
def selector(weights):
i=random.random()*sum(x for x,y in weights)
for w,v in weights:
if w>=i:
break
i-=w
return v
weights = ((70,'a'),(20,'b'),(10,'c'))
print [selector(weights) for x in range(10)]
It works equally well for fractional weights:
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
print [selector(weights) for x in range(10)]
If you have a lot of weights, you can use bisect to reduce the number of iterations required
import random
import bisect
def make_acc_weights(weights):
acc=0
acc_weights = []
for w,v in weights:
acc+=w
acc_weights.append((acc,v))
return acc_weights
def selector(acc_weights):
    i=random.random()*acc_weights[-1][0]
    return acc_weights[bisect.bisect(acc_weights, (i,))][1]
weights = ((70,'a'),(20,'b'),(10,'c'))
acc_weights = make_acc_weights(weights)
print [selector(acc_weights) for x in range(100)]
Also works fine for fractional weights
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
acc_weights = make_acc_weights(weights)
print [selector(acc_weights) for x in range(100)]
Today, the updated Python documentation gives an example of how to make a weighted random.choice():
If the weights are small integer ratios, a simple technique is to build a sample population with repeats:
>>> weighted_choices = [('Red', 3), ('Blue', 2), ('Yellow', 1), ('Green', 4)]
>>> population = [val for val, cnt in weighted_choices for i in range(cnt)]
>>> random.choice(population)
'Green'
A more general approach is to arrange the weights in a cumulative distribution with itertools.accumulate(), and then locate the random value with bisect.bisect():
>>> choices, weights = zip(*weighted_choices)
>>> cumdist = list(itertools.accumulate(weights))
>>> x = random.random() * cumdist[-1]
>>> choices[bisect.bisect(cumdist, x)]
'Blue'
One note: itertools.accumulate() needs Python 3.2; otherwise, define it using the equivalent code shown in the docs.
I think you can use an array of small objects (I implemented it in Java; I know a little C# but am afraid I would write it wrong, so you may need to port it yourself). The C# code would be smaller with struct and var, but I hope you get the idea.
class PercentString {
double percent;
String value;
// Constructor for 2 values
}
ArrayList<PercentString> list = new ArrayList<PercentString>();
list.add(new PercentString(70, "a"));
list.add(new PercentString(20, "b"));
list.add(new PercentString(10, "c"));
double random = Math.random() * 100; // the weights above sum to 100
double percent = 0;
for (int i = 0; i < list.size(); i++) {
    PercentString p = list.get(i);
    percent += p.percent;
    if (random < percent) {
return p.value;
}
}
If you really care about speed and want to generate the random values quickly, Walker's algorithm that mcdowella mentioned in https://stackoverflow.com/a/3655773/1212517 is pretty much the best way to go (O(1) time for random(), and O(N) time for preprocess()).
For anyone who is interested, here is my own PHP implementation of the algorithm:
/**
* Pre-process the samples (Walker's alias method).
* #param array key represents the sample, value is the weight
*/
protected function preprocess($weights){
$N = count($weights);
$sum = array_sum($weights);
$avg = $sum / (double)$N;
//divide the array of weights to values smaller and geq than sum/N
$smaller = array_filter($weights, function($itm) use ($avg){ return $avg > $itm;}); $sN = count($smaller);
$greater_eq = array_filter($weights, function($itm) use ($avg){ return $avg <= $itm;}); $gN = count($greater_eq);
$bin = array(); //bins
//we want to fill N bins
for($i = 0;$i<$N;$i++){
//At first, decide for a first value in this bin
//if there are small intervals left, we choose one
if($sN > 0){
$choice1 = each($smaller);
unset($smaller[$choice1['key']]);
$sN--;
} else{ //otherwise, we split a large interval
$choice1 = each($greater_eq);
unset($greater_eq[$choice1['key']]);
}
//splitting happens here - the unused part of interval is thrown back to the array
if($choice1['value'] >= $avg){
if($choice1['value'] - $avg >= $avg){
$greater_eq[$choice1['key']] = $choice1['value'] - $avg;
}else if($choice1['value'] - $avg > 0){
$smaller[$choice1['key']] = $choice1['value'] - $avg;
$sN++;
}
//this bin comprises of only one value
$bin[] = array(1=>$choice1['key'], 2=>null, 'p1'=>1, 'p2'=>0);
}else{
//make the second choice for the current bin
$choice2 = each($greater_eq);
unset($greater_eq[$choice2['key']]);
//splitting on the second interval
if($choice2['value'] - $avg + $choice1['value'] >= $avg){
$greater_eq[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
}else{
$smaller[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
$sN++;
}
//this bin comprises of two values
$choice2['value'] = $avg - $choice1['value'];
$bin[] = array(1=>$choice1['key'], 2=>$choice2['key'],
'p1'=>$choice1['value'] / $avg,
'p2'=>$choice2['value'] / $avg);
}
}
$this->bins = $bin;
}
/**
* Choose a random sample according to the weights.
*/
public function random(){
$bin = $this->bins[array_rand($this->bins)];
$randValue = (lcg_value() < $bin['p1'])?$bin[1]:$bin[2];
}
Here is my version that can be applied to any IList and normalizes the weights. It is based on Timwi's solution: selection based on percentage weighting.
/// <summary>
/// return a random element of the list or default if list is empty
/// </summary>
/// <param name="e"></param>
/// <param name="weightSelector">
/// return chances to be picked for the element. A weigh of 0 or less means 0 chance to be picked.
/// If all elements have weight of 0 or less they all have equal chances to be picked.
/// </param>
/// <returns></returns>
public static T AnyOrDefault<T>(this IList<T> e, Func<T, double> weightSelector)
{
if (e.Count < 1)
return default(T);
if (e.Count == 1)
return e[0];
var weights = e.Select(o => Math.Max(weightSelector(o), 0)).ToArray();
var sum = weights.Sum(d => d);
var rnd = new Random().NextDouble(); // note: consider reusing a single Random instance; creating a new one per call can repeat values in tight loops
for (int i = 0; i < weights.Length; i++)
{
//Normalize weight
var w = sum == 0
? 1 / (double)e.Count
: weights[i] / sum;
if (rnd < w)
return e[i];
rnd -= w;
}
throw new Exception("Should not happen");
}
I've my own solution for this:
public class Randomizator3000
{
public class Item<T>
{
public T value;
public float weight;
public static float GetTotalWeight<T>(Item<T>[] p_itens)
{
float __toReturn = 0;
foreach(var item in p_itens)
{
__toReturn += item.weight;
}
return __toReturn;
}
}
private static System.Random _randHolder;
private static System.Random _random
{
get
{
if(_randHolder == null)
_randHolder = new System.Random();
return _randHolder;
}
}
public static T PickOne<T>(Item<T>[] p_itens)
{
if(p_itens == null || p_itens.Length == 0)
{
return default(T);
}
float __randomizedValue = (float)_random.NextDouble() * (Item<T>.GetTotalWeight(p_itens));
float __adding = 0;
for(int i = 0; i < p_itens.Length; i ++)
{
float __cacheValue = p_itens[i].weight + __adding;
if(__randomizedValue <= __cacheValue)
{
return p_itens[i].value;
}
__adding = __cacheValue;
}
return p_itens[p_itens.Length - 1].value;
}
}
And using it should be something like this (that's in Unity3D):
using UnityEngine;
using System.Collections;
public class teste : MonoBehaviour
{
Randomizator3000.Item<string>[] lista;
void Start()
{
lista = new Randomizator3000.Item<string>[10];
lista[0] = new Randomizator3000.Item<string>();
lista[0].weight = 10;
lista[0].value = "a";
lista[1] = new Randomizator3000.Item<string>();
lista[1].weight = 10;
lista[1].value = "b";
lista[2] = new Randomizator3000.Item<string>();
lista[2].weight = 10;
lista[2].value = "c";
lista[3] = new Randomizator3000.Item<string>();
lista[3].weight = 10;
lista[3].value = "d";
lista[4] = new Randomizator3000.Item<string>();
lista[4].weight = 10;
lista[4].value = "e";
lista[5] = new Randomizator3000.Item<string>();
lista[5].weight = 10;
lista[5].value = "f";
lista[6] = new Randomizator3000.Item<string>();
lista[6].weight = 10;
lista[6].value = "g";
lista[7] = new Randomizator3000.Item<string>();
lista[7].weight = 10;
lista[7].value = "h";
lista[8] = new Randomizator3000.Item<string>();
lista[8].weight = 10;
lista[8].value = "i";
lista[9] = new Randomizator3000.Item<string>();
lista[9].weight = 10;
lista[9].value = "j";
}
void Update ()
{
Debug.Log(Randomizator3000.PickOne<string>(lista));
}
}
In this example each value has a 10% chance to be displayed in the debug log =3
Based loosely on python's numpy.random.choice(a=items, p=probs), which takes an array and a probability array of the same size.
public T RandomChoice<T>(IEnumerable<T> a, IEnumerable<double> p)
{
IEnumerator<T> ae = a.GetEnumerator();
Random random = new Random(); // note: consider sharing one Random instance; creating a new one per call can repeat values in tight loops
double target = random.NextDouble();
double accumulator = 0;
foreach (var prob in p)
{
ae.MoveNext();
accumulator += prob;
if (accumulator > target)
{
break;
}
}
return ae.Current;
}
The probability array p must sum to (approx.) 1. This is to keep it consistent with the numpy interface (and mathematics), but you could easily change that if you wanted.
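For example, with the values from the question (assuming RandomChoice is in scope):
var items = new[] { "a", "b", "c" };
var probs = new[] { 0.7, 0.2, 0.1 };
string picked = RandomChoice(items, probs); // "a" about 70% of the time, "b" 20%, "c" 10%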