I want to generate uniform integers that satisfy 0 <= result <= maxValue.
I already have a generator that returns uniform values in the full range of the built-in unsigned integer types. Let's call the methods for this byte Byte(), ushort UInt16(), uint UInt32() and ulong UInt64(). Assume that the results of these methods are perfectly uniform.
The signatures of the methods I want are uint UniformUInt(uint maxValue) and ulong UniformUInt(ulong maxValue).
What I'm looking for:
Correctness
I'd prefer the return values to be uniformly distributed in the given interval.
But a very small bias is acceptable if it increases performance significantly. By that I mean a bias of an order that would allow a distinguisher to succeed with probability 2/3 given 2^64 values.
It must work correctly for any maxValue.
Performance
The method should be fast.
Efficiency
The method should consume little raw randomness, since, depending on the underlying generator, generating the raw bytes might be costly. Wasting a few bits is fine, but consuming, say, 128 bits to generate a single number is probably excessive.
It's also possible to cache some leftover randomness from the previous call in member variables.
Be careful with int overflows and wrapping behavior.
I already have a solution (I'll post it as an answer), but it's a bit ugly for my tastes. So I'd like to get ideas for better solutions.
Suggestions on how to unit test with large maxValues would be nice too, since I can't generate a histogram with 2^64 buckets and 2^74 random values. Another complication is that with certain bugs, the distributions for some maxValues are heavily biased, while others are off only very slightly.
How about something like this as a general-purpose solution? The algorithm is based on the one used by Java's nextInt method, rejecting any values that would cause a non-uniform distribution. As long as the output of your UInt32 method is perfectly uniform, this should be too.
uint UniformUInt(uint inclusiveMaxValue)
{
    unchecked
    {
        uint exclusiveMaxValue = inclusiveMaxValue + 1;

        // if exclusiveMaxValue is a power of two then we can just use a mask
        // also handles the edge case where inclusiveMaxValue is uint.MaxValue
        if ((exclusiveMaxValue & (~exclusiveMaxValue + 1)) == exclusiveMaxValue)
            return UInt32() & inclusiveMaxValue;

        uint bits, val;
        do
        {
            bits = UInt32();
            val = bits % exclusiveMaxValue;
            // if (bits - val + inclusiveMaxValue) overflows then val has been
            // taken from an incomplete chunk at the end of the range of bits
            // in that case we reject it and loop again
        } while (bits - val + inclusiveMaxValue < inclusiveMaxValue);

        return val;
    }
}
The rejection process could, theoretically, keep looping forever; in practice the expected number of calls to UInt32 is below two, since fewer than half of the 2^32 possible draws can ever be rejected. It's difficult to suggest any generally applicable optimisations without knowing (a) the expected usage patterns, and (b) the performance characteristics of your underlying RNG.
For example, if most callers will be specifying a max value <= 255 then it might not make sense to ask for four bytes of randomness every time. On the other hand, the performance benefit of requesting fewer bytes might be outweighed by the additional cost of always checking how many you actually need. (And, of course, once you do have specific information then you can keep optimising and testing until your results are good enough.)
I am not sure that this is an answer. It definitely needs more space than a comment, so I have to write it here, but I am willing to delete it if others think this is stupid.
From the original question I gather that:
Entropy bits are very expensive.
Everything else should be considered expensive, but less so than entropy.
My idea is to use binary digits to halve, quarter, ... the maxValue space until it is reduced to a single number, something like the pseudocode below.
I'll use maxValue = 333 (decimal) as an example and assume a function getBit() that randomly returns 0 or 1.
offset := 0
space := maxValue
while (space > 0)
{
    // Right-shift the value, keeping the rightmost bit; this should be
    // efficient on x86 and x64, if coded in real code, not pseudocode
    remains := space & 1
    part := floor(space / 2)
    space := part

    // In the 333 example, part is now 166, but 2*166 = 332. If we were to simply choose one
    // half of the space, we would be heavily biased towards the upper half, so in case
    // we have a remainder, we consume a bit of entropy to decide which half is bigger
    if (remains)
        if (getBit())
            part++

    // Now we decide which half to choose, consuming a bit of entropy
    if (getBit())
        offset += part

    // Exit condition: the remaining number space = 0 is guaranteed to be reached
    // In the 333 example, offset will be 0, 166 or 167, and the remaining space will be 166
}
randomResult := offset
getBit() can either come from your entropy source, if it is bit-based, or by consuming n bits of entropy at once on the first call (obviously with n being the optimum for your entropy source) and then shifting this cache until it is empty.
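For illustration, a getBit() built on the UInt32() source from the question could cache its leftover bits between calls (a minimal sketch; the field names are mine):

uint _bitBuffer;   // leftover raw bits from the last UInt32() call
int _bitsLeft;     // how many of those bits are still unused

uint GetBit()
{
    if (_bitsLeft == 0)
    {
        _bitBuffer = UInt32(); // refill the cache from the generator
        _bitsLeft = 32;
    }
    uint bit = _bitBuffer & 1; // hand out the lowest cached bit
    _bitBuffer >>= 1;
    _bitsLeft--;
    return bit;
}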
My current solution. A bit ugly for my tastes. It also has two divisions per generated number, which might negatively impact performance (I haven't profiled this part yet).
uint UniformUInt(uint maxResult)
{
    uint rand;
    uint count = maxResult + 1;
    if (maxResult < 0x100)
    {
        uint usefulCount = (0x100 / count) * count;
        do
        {
            rand = Byte();
        } while (rand >= usefulCount);
        return rand % count;
    }
    else if (maxResult < 0x10000)
    {
        uint usefulCount = (0x10000 / count) * count;
        do
        {
            rand = UInt16();
        } while (rand >= usefulCount);
        return rand % count;
    }
    else if (maxResult != uint.MaxValue)
    {
        uint usefulCount = (uint.MaxValue / count) * count; // reduces upper bound by 1, to avoid long division
        do
        {
            rand = UInt32();
        } while (rand >= usefulCount);
        return rand % count;
    }
    else
    {
        return UInt32();
    }
}
ulong UniformUInt(ulong maxResult)
{
    if (maxResult < 0x100000000UL)
        return UniformUInt((uint)maxResult);
    else if (maxResult < ulong.MaxValue)
    {
        ulong rand;
        ulong count = maxResult + 1;
        ulong usefulCount = (ulong.MaxValue / count) * count; // reduces upper bound by 1, since ulong can't represent any more
        do
        {
            rand = UInt64();
        } while (rand >= usefulCount);
        return rand % count;
    }
    else
        return UInt64();
}
Related
I am developing a gaming platform that is subject to heavy regulatory scrutiny. I chose Math.NET because it seemed like a good fit. However, I have just received this comment back from our auditors.
Could I have some comments on whether this is accurate and how it can be resolved?
In RandomSource(), Next(int, int) is defined as follows:
public override sealed int Next(int minValue, int maxValue)
{
    if (minValue > maxValue)
    {
        throw new ArgumentException(Resources.ArgumentMinValueGreaterThanMaxValue);
    }

    if (_threadSafe)
    {
        lock (_lock)
        {
            return (int)(DoSample()*(maxValue - minValue)) + minValue;
        }
    }

    return (int)(DoSample()*(maxValue - minValue)) + minValue;
}
This creates a bias in the same way as before: it uses an unscaled value from the RNG and multiplies it by the range without previously eliminating the bias (unless the range is a power of 2, there will be a bias).
Update: The implementation of Next(minInclusive, maxExclusive) has been changed in Math.NET Numerics v3.13 following this discussion. Since v3.13 it no longer involves floating point numbers; instead it samples integers with as many bits as needed to support the requested range (a power of two) and rejects those outside of the actual range. This way it avoids adding any bias on top of the byte sampling itself (as provided e.g. by the crypto RNG).
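In outline, the scheme described above looks something like this (my sketch of the idea only, not the actual Math.NET source; the helper name and the raw uint source are assumptions):

static int NextByRejection(Func<uint> nextUInt32, int minInclusive, int maxExclusive)
{
    uint range = (uint)(maxExclusive - minInclusive); // assumes minInclusive <= maxExclusive
    if (range <= 1) return minInclusive;

    // build the smallest all-ones mask covering range - 1,
    // i.e. sample only as many bits as the range needs
    uint mask = range - 1;
    mask |= mask >> 1; mask |= mask >> 2; mask |= mask >> 4;
    mask |= mask >> 8; mask |= mask >> 16;

    uint result;
    do
    {
        result = nextUInt32() & mask; // uniform power-of-two sample
    } while (result >= range);        // reject values outside the actual range
    return minInclusive + (int)result;
}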
Assumption: DoSample() returns a uniformly distributed sample in the range [0,1) (double precision floating point number).
Multiplying it by the range R = max - min results in a uniformly distributed sample in the range [0, R). Casting this to an integer, which is essentially a floor operation, yields a uniformly distributed discrete sample from 0, 1, 2, ..., R-1. I don't see how the fact that R is even, odd, or a power of two could affect the bias in this step.
A few runs to compute 100'000'000 samples also do not indicate obvious bias, but of course this is no proof:
var r = new CryptoRandomSource();
long[] h = new long[8];
for (int i = 0; i < 100000000; i++)
{
    h[r.Next(2,7)]++;
}
Run 1: 0 0 19996313 20001286 19998092 19998328 20005981 0
Run 2: 0 0 20000288 20002035 20006269 19994927 19996481 0
Run 3: 0 0 19998296 19997777 20001463 20002759 19999705 0
I've come up with this solution for a value between 0 and max inclusive. I'm no maths expert, so comments are welcome.
It seems to satisfy the regulatory spec I have, which says:
"2b) If a particular random number selected is outside the range of equal distribution of re-scaling values, it is permissible to discard that random number and select the next in sequence for the purpose of re-scaling."
private readonly CryptoRandomSource _random = new CryptoRandomSource();

private int GetRandomNumber(int max)
{
    int number;
    // smallest power of two that covers the max + 1 possible results
    var nextPowerOfTwo = (int)Math.Pow(2, Math.Ceiling(Math.Log(max + 1) / Math.Log(2)));
    do
    {
        // Note: 2nd param of Next is an *exclusive* bound, so this samples
        // uniformly from the power-of-two range [0, nextPowerOfTwo)
        number = _random.Next(0, nextPowerOfTwo);
    } while (number > max);
    return number;
}
I have a method that converts value to a newBase number of length length.
The logic in English is:
If we calculated every possible combination of numbers from 0 to (c-1)
with a length of x,
what set would occur at point i?
While the method below does work perfectly, because very large numbers are used, it can take a long time to complete:
For example, value = ((65536^480000)-1)/2, newBase = 65536, length = 480000 takes about an hour to complete on a 64-bit, quad-core PC.
private int[] GetValues(BigInteger value, int newBase, int length)
{
    Stack<int> result = new Stack<int>();
    while (value > 0)
    {
        result.Push((int)(value % newBase));
        if (value < newBase)
            value = 0;
        else
            value = value / newBase;
    }
    for (var i = result.Count; i < length; i++)
    {
        result.Push(0);
    }
    return result.ToArray();
}
My question is, how can I change this method into something that will allow multiple threads to work out part of the number?
I am working in C#, but if you're not familiar with that, then pseudocode is fine too.
Note: The method is from this question: Cartesian product subset returning set of mostly 0
If that GetValues method is really the bottleneck, there are several things you can do to speed it up.
First, you're dividing by newBase every time through the loop. Since newBase is an int, and the BigInteger divide method divides by a BigInteger, you're potentially incurring the cost of an int-to-BigInteger conversion on every iteration. You might consider:
BigInteger bigNewBase = newBase;
Also, you can cut the number of divides in half by calling DivRem:
while (value > 0)
{
    BigInteger rem;
    value = BigInteger.DivRem(value, bigNewBase, out rem);
    result.Push((int)rem);
}
One other optimization, as somebody mentioned in comments, would be to store the digits in a pre-allocated array. You'll have to call Array.Reverse to get them in the proper order, but that takes approximately no time.
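For illustration, here is a sketch combining the three suggestions (my code; it assumes the result fits in length digits, and filling the array from the back happens to avoid the Array.Reverse as well):

private static int[] GetValuesFast(BigInteger value, int newBase, int length)
{
    BigInteger bigNewBase = newBase;  // convert once, not on every iteration
    int[] result = new int[length];  // leading digits stay 0 by default
    int i = length - 1;
    while (value > 0)
    {
        BigInteger rem;
        value = BigInteger.DivRem(value, bigNewBase, out rem);
        result[i--] = (int)rem;      // least significant digit goes at the end
    }
    return result;
}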
That method, by the way, doesn't lend itself to parallelizing, because computing each digit depends on the quotient left over from computing the previous digit.
I need to set all the low order bits of a given BigInteger to 0 until only two 1 bits are left. In other words leave the highest and second-highest bits set while unsetting all others.
The number could be any combination of bits. It may even be all 1s or all 0s. Example:
MSB 0000 0000
1101 1010
0010 0111
...
...
...
LSB 0100 1010
We can easily take out corner cases such as 0, 1, powers of 2, etc. I'm not sure how to apply popular bit-manipulation algorithms to an array of bytes representing one number.
I have already looked at bithacks, but I have the following constraints: the BigInteger structure only exposes its underlying data through the ToByteArray method, which is itself expensive and unnecessary. Since there is no way around this, I don't want to slow things down further by implementing a bit-counting algorithm optimized for 32/64-bit integers (which most are).
In short, I have a byte[] representing an arbitrarily large number. Speed is the key factor here.
NOTE: In case it helps, the numbers I am dealing with have around 5,000,000 bits. They keep decreasing with each iteration of the algorithm, so I could probably switch techniques as the magnitude of the number decreases.
Why I need to do this: I am working with a 2D graph and am particularly interested in coordinates whose x and y values are powers of 2. So (x+y) will always have two bits set, and (x-y) will always have consecutive bits set. Given an arbitrary coordinate (x, y), I need to transform an intersection by getting values with all bits unset except the two most significant set bits.
Try the following (not sure if it's actually valid C#, but it should be close enough):
// find the next non-zero byte (I'm assuming little endian) or return -1
int find_next_byte(byte[] data, int i) {
    while (i >= 0 && data[i] == 0) --i;
    return i;
}

// find a bit mask of the next non-zero bit or return 0
int find_next_bit(int value, int b) {
    while (b > 0 && ((value & b) == 0)) b >>= 1;
    return b;
}

byte[] data = number.ToByteArray(); // little-endian bytes of your number
int i = find_next_byte(data, data.Length - 1);
// find the first 1 bit
int b = find_next_bit(data[i], 1 << 7);
// try to find the second 1 bit
b = find_next_bit(data[i], b >> 1);
if (b > 0) {
    // found 2 bits, removing the rest
    if (b > 1) data[i] &= (byte)~(b - 1);
} else {
    // we only found 1 bit, find the next non-zero byte
    i = find_next_byte(data, i - 1);
    b = find_next_bit(data[i], 1 << 7);
    if (b > 1) data[i] &= (byte)~(b - 1);
}
// remove the rest (Array.Clear(data, 0, i) would do this in one call)
for (--i; i >= 0; --i) data[i] = 0;
Untested.
This would probably be a bit faster if compiled as unmanaged code, or even with a C or C++ compiler.
As harold noted correctly, if you have no a priori knowledge about your number, this O(n) method is the best you can do. If you can, you should keep the position of the highest two non-zero bytes, which would drastically reduce the time needed to perform your transformation.
I'm not sure if this is getting optimised out or not, but this code appears to be 16x faster than ToByteArray. It also avoids the memory copy, and it means you get the results as uint instead of byte, so you should see further improvements there.
// create delegate to get private _bits field
var par = Expression.Parameter(typeof(BigInteger));
var bits = Expression.Field(par, "_bits");
var lambda = Expression.Lambda(bits, par);
var func = (Func<BigInteger, uint[]>)lambda.Compile();

// test call our delegate
var bigint = BigInteger.Parse("3498574578238348969856895698745697868975687978");
int time = Environment.TickCount;
for (int y = 0; y < 10000000; y++)
{
    var x = func(bigint);
}
Console.WriteLine(Environment.TickCount - time);

// compare time to ToByteArray
time = Environment.TickCount;
for (int y = 0; y < 10000000; y++)
{
    var x = bigint.ToByteArray();
}
Console.WriteLine(Environment.TickCount - time);
From there, finding the top two bits should be pretty easy. The first set bit will be in the first int, I presume; then it is just a matter of searching for the second topmost bit. If it is in the same integer, just clear the first bit and find the new topmost bit; otherwise, search for the next non-zero int and find its topmost bit.
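As a sketch of that search (my code, not the answer's; it assumes the least significant uint comes first in the array, so it scans from the end — reverse the direction if the layout turns out to be the other way around):

static void FindTopTwoBits(uint[] bits, out long first, out long second)
{
    first = second = -1; // absolute bit positions, -1 if not found
    for (int i = bits.Length - 1; i >= 0 && second < 0; i--)
    {
        uint w = bits[i];
        if (w == 0) continue; // skip empty words entirely
        for (int b = 31; b >= 0 && second < 0; b--)
        {
            if ((w & (1u << b)) == 0) continue;
            long pos = (long)i * 32 + b;
            if (first < 0) first = pos;
            else second = pos;
        }
    }
}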
EDIT: To make things simple, just copy/paste this class into your project. It creates extension methods, which means you can just call mybigint.GetUnderlyingBitsArray(). I added a method to get the sign as well, and, to make it more generic, created a function that allows accessing any private field of any object. I found this to be slower than my original code in debug mode, but the same speed in release mode. I would advise performance testing this yourself.
static class BigIntegerEx
{
    private static Func<BigInteger, uint[]> getUnderlyingBitsArray;
    private static Func<BigInteger, int> getUnderlyingSign;

    static BigIntegerEx()
    {
        getUnderlyingBitsArray = CompileFuncToGetPrivateField<BigInteger, uint[]>("_bits");
        getUnderlyingSign = CompileFuncToGetPrivateField<BigInteger, int>("_sign");
    }

    private static Func<TObject, TField> CompileFuncToGetPrivateField<TObject, TField>(string fieldName)
    {
        var par = Expression.Parameter(typeof(TObject));
        var field = Expression.Field(par, fieldName);
        var lambda = Expression.Lambda(field, par);
        return (Func<TObject, TField>)lambda.Compile();
    }

    public static uint[] GetUnderlyingBitsArray(this BigInteger source)
    {
        return getUnderlyingBitsArray(source);
    }

    public static int GetUnderlyingSign(this BigInteger source)
    {
        return getUnderlyingSign(source);
    }
}
I'm using the MD5 algorithm to hash the key for an on-disk hash table (I know it's questionable whether this is the best algorithm to use for this, but I'm going with it for now. The problem is generalizable to any algorithm that produces a byte array). My problem is this:
The size of the hash code determines the number of combinations (buckets) in the hash table. Since MD5 is 128 bits, there is a huge number of combinations (~3.4e38), which is way too big for my purpose. So what I want to do is pick off the first n bits of the byte array that MD5 produces, and convert those into a long (or ulong) value. Since MD5 produces a byte array, it would be easy to do if I wanted an integral number of bytes, but this leads to too big a jump in the number of combinations. I'm finding the single-bit version to be a lot trickier.
Goal:
n = 10 // I.e. I want 2^10 combinations
long pos = someFcn(byte[] key, n)
where key is the value being hashed, and n is the number of bits of the MD5 result I want to use. pos, then, will be an integer from 0 to 1023 (in the case of n = 10). If n = 11, the code will be from 0 to 2^11-1 = 2047, etc. It has to be somewhat fast/efficient.
Doesn't seem that hard but it's eluding me. Any help would be much appreciated. Thanks.
First, convert the first four bytes into an integer with BitConverter.ToInt32. It gets four bytes no matter what, but this probably won't make it measurably slower, since you're working with 32-bit registers for the rest of the calculations anyway, and complex logic like "if it's < 16 then do this with the first two bytes" would just make things more complicated.
Then, given that integer, take the lowest N bits. If you really want a specific number of bits [a power-of-two number of buckets] not known at compile time, ~((-1)<<N) is a nice trick to get 2^N-1.
Or you could simply use ToUInt32 instead and take the result modulo a prime number [it might be slightly better to convert to UInt64 instead; then you've got fully half the bits to start with, in this case].
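Putting those two steps together might look like this (a minimal sketch; BucketIndex is my name, and it assumes 0 < n < 32, since at n = 32 the shift wraps around):

static long BucketIndex(byte[] hash, int n)
{
    uint value = BitConverter.ToUInt32(hash, 0); // first four bytes of the MD5 result
    uint mask = (uint)~((-1) << n);              // the 2^n - 1 trick from above
    return value & mask;                         // an index in [0, 2^n)
}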
To obtain the first 10 bits, for example:
int result = ((int)key[0] << 2) | (((int)key[1] >> 6) & 0x03);
If you have an array like this,
unsigned char data[2000];
then you can just scrape off the first n bits into an integer like so:
typedef unsigned long long int MyInt;

MyInt scrape(size_t n, unsigned char * data)
{
    MyInt result = 0;
    size_t b;
    for (b = 0; b < n / 8; ++b)
    {
        result <<= 8;
        result += data[b];
    }
    const size_t remaining_bits = n % 8;
    if (remaining_bits > 0)
    {
        result <<= remaining_bits;
        result += (data[b] >> (8 - remaining_bits));
    }
    return result;
}
I'm assuming that CHAR_BIT == 8; feel free to generalize the code if you like. Also, the size of the array times 8 must be at least n.
Imagine two bitmasks, I'll just use 8 bits for simplicity:
01101010
10111011
The 2nd, 4th, and 6th bits (counting from the least significant) are 1 in both. I want to pick one of those common "on" bits at random. But I want to do this in O(1).
The only way I've found to do this so far is to pick a random "on" bit in one, then check whether it's also on in the other, and repeat until I find a match. This is still O(n), and in my case the majority of the bits are off in both masks. I do, of course, AND them together to check up front whether there are any common bits at all.
Is there a way to do this? If so, I can increase the speed of my function by around 6%. I'm using C# if that matters. Thanks!
Mike
If you are willing to accept an O(lg n) solution, at the cost of a possibly non-uniform probability, recursively split in half: AND with the top half of the bits set, then with the bottom half set. If both results are non-zero, choose one of them randomly; otherwise choose the non-zero one. Then split what remains in half again, and so on. This will take 10 comparisons for a 32-bit number; maybe not as few as you would like, but better than 32.
You can save a few ANDs by choosing whether to AND with the high half or the low half at random: if there are no hits, take the other half, and if there are hits, take the half you tested.
The random number only needs to be generated once, as you only use one bit at each test; just shift the used bit out when you are done with it.
If you have lots of bits, this will be more efficient. I do not see how you can get it down to O(1), though.
For example, with a 32-bit number, first AND with either 0xffff0000 or 0x0000ffff; if the result is non-zero (say you ANDed with 0xffff0000), continue with 0xff000000 or 0x00ff0000, and so on until you get down to one bit. This ends up being a lot of tedious code; 32 bits takes 5 layers.
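The tedious layers can also be rolled into a loop (a sketch of the same halving idea, in C# since the question uses it; the names are mine, and like the scheme above the result is not uniformly distributed):

static uint PickBitByHalving(uint w, Random rng)
{
    if (w == 0) return 0;
    int lo = 0;    // start of the window that still holds a set bit
    int width = 32;
    while (width > 1)
    {
        int half = width / 2;
        uint lowMask = ((1u << half) - 1) << lo; // lower half of the window
        uint highMask = lowMask << half;         // upper half of the window
        bool lowHit = (w & lowMask) != 0;
        bool highHit = (w & highMask) != 0;
        if (lowHit && highHit)
        {
            if (rng.Next(2) == 1) lo += half;    // both halves hit: pick one at random
        }
        else if (highHit)
        {
            lo += half;                          // only the upper half has bits
        }
        width = half;
    }
    return 1u << lo;
}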
Do you want a uniform random distribution? If so, I don't see any good way around counting the bits and then selecting one at random, or selecting random bits until you hit one that is set.
If you don't care about uniform, you can select a set bit out of a word randomly with:
unsigned int pick_random(unsigned int w, int size) {
    int bitpos = rng() % size;
    unsigned int mask = ~((1U << bitpos) - 1);
    if (mask & w)
        w &= mask;
    return w - (w & (w-1));
}
where rng() is your random number generator, w is the word you want to pick from, and size is the relevant size of the word in bits (which may be the machine word size, or may be less, as long as you don't set the upper bits of the word). Then, for your example, you would use pick_random(0x6a & 0xbb, 8) or whatever values you like.
This function uniformly randomly selects one bit which is high in both masks. If there are no possible bits to pick, zero is returned instead. The running time is O(n), where n is the number of high bits in the ANDed masks. So if you have a low number of high bits in your masks, this function could be fast, even though the worst case is O(n), which happens when all the bits are high. The implementation in C is as follows:
unsigned int randomMasksBit(unsigned a, unsigned b){
    unsigned int i = a & b; // Calculate the bits which are high in both masks.
    unsigned int count = 0;
    unsigned int randomBit = 0;
    while (i){ // Loop through all high bits.
        count++;
        // Randomly pick one bit from the bit stream uniformly, by selecting
        // a random floating point number between 0 and 1 and checking if it
        // is less than the probability needed for random selection.
        if ((rand() / (double)RAND_MAX) < (1 / (double)count)) randomBit = i & -i;
        i &= i - 1; // Move on to the next high bit.
    }
    return randomBit;
}
O(1) with uniform distribution (or as uniform as the random generator offers) can be done, depending on whether you count certain mathematical operations as O(1). As a rule we would, though in the case of bit-tweaking one might argue that they are not.
The trick is that while it's easy enough to get the lowest set bit and the highest set bit, in order to have a uniform distribution we need to randomly pick a partition point, and then randomly pick whether we go for the highest bit below it or the lowest bit above it (trying the other approach if that returns zero).
I've broken this down a bit more than might be usual so the steps can be followed easily. The only question about constant timing I can see is whether Math.Pow and Math.Log should be considered O(1).
Hence:
public static uint FindRandomSharedBit(uint x, uint y)
{
    // AND the two numbers together, to find the shared bits.
    return FindRandomBit(x & y);
}

public static uint FindRandomBit(uint val)
{
    // if there's none, we can escape out quickly.
    if (val == 0)
        return 0;

    Random rnd = new Random();

    // pick a partition point. Note that Random.Next(1, 32) is in range 1 to 31
    int maskPoint = rnd.Next(1, 32);

    // pick which to try first.
    bool tryLowFirst = rnd.Next(0, 2) == 1;

    // will turn off all bits above our partition point.
    uint lowerMask = Convert.ToUInt32(Math.Pow(2, maskPoint) - 1);

    // will turn off all bits below our partition point
    uint higherMask = ~lowerMask;

    if (tryLowFirst)
    {
        uint lowRes = FindLowestBit(val & higherMask);
        return lowRes != 0 ? lowRes : FindHighestBit(val & lowerMask);
    }

    uint hiRes = FindHighestBit(val & lowerMask);
    return hiRes != 0 ? hiRes : FindLowestBit(val & higherMask);
}

public static uint FindLowestBit(uint masked)
{                                  // e.g. 00100100
    uint minusOne = masked - 1;    // e.g. 00100011
    uint xord = masked ^ minusOne; // e.g. 00000111
    uint plusOne = xord + 1;       // e.g. 00001000
    return plusOne >> 1;           // e.g. 00000100
}

public static uint FindHighestBit(uint masked)
{
    return (uint)Math.Pow(2, Math.Floor(Math.Log(masked, 2)));
}
I believe that, if you want uniformity, the answer will have to be Theta(n) in the number of bits, if it has to work for all possible combinations.
The following C++ snippet (stolen) checks whether a given number is a power of 2.
if (!var || (var & (var - 1))) {
    printf("%u is not power of 2\n", var);
}
else {
    printf("%u is power of 2\n", var);
}
If you have few enough bits to worry about, you can get O(1) using a lookup table:
var lookup8bits = new int[256][] {
    new int[] {},
    new int[] {0},
    new int[] {1},
    new int[] {0, 1},
    ...
    new int[] {0, 1, 2, 3, 4, 5, 6, 7}
};
Failing that, you can find the least significant set bit of a number x with (x & -x), assuming 2's complement. For example, if x = 46 = 101110b, then -x = 111...111010010b, hence x & -x = 10b = 2.
You can use this technique to enumerate the set bits of x in O(n) time, where n is the number of set bits in x.
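For example, the enumeration might look like this (a sketch; SetBitMasks is my name, and it needs System.Collections.Generic in scope):

static IEnumerable<uint> SetBitMasks(uint x)
{
    while (x != 0)
    {
        yield return x & (~x + 1); // x & -x: isolate the least significant set bit
        x &= x - 1;                // clear that bit and continue
    }
}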
Note that computing a pseudo random number is going to take you a lot longer than enumerating the set bits in x!
This can't be done in O(1) in general, and any solution for a fixed number of bits N (unless it's totally, ridiculously stupid) will have a constant upper bound for that N.