What are the chances of Random.NextDouble being exactly 0?


The documentation of Random.NextDouble():
Returns a random floating-point number that is greater than or equal to 0.0, and less than 1.0.
So, it can be exactly 0. But what are the chances for that?
var random = new Random();
var x = random.NextDouble();
if(x == 0){
// probability for this?
}
It would be easy to calculate the probability for Random.Next() being 0, but I have no idea how to do it in this case...

As mentioned in the comments, it depends on the internal implementation of NextDouble. In the "old" .NET Framework, and in modern .NET up to version 5, it looks like this:
protected virtual double Sample() {
return (InternalSample()*(1.0/MBIG));
}
InternalSample returns an integer in the range 0 to Int32.MaxValue, with 0 included and Int32.MaxValue excluded. We can assume that the distribution of InternalSample is uniform (the docs for the Next method, which just calls InternalSample, hint that it is, and there seems to be no reason to use a non-uniform distribution in a general-purpose RNG for integers). That means every number is equally likely. There are then 2,147,483,647 possible values, so the probability of drawing 0 is 1 / 2,147,483,647.
In .NET 6+ there are two implementations. The first is used when you provide an explicit seed value to the Random constructor. It is the same as above and is kept for compatibility reasons, so that old code relying on a seed value to produce deterministic results will not break when moving to the new .NET version.
The second implementation is new and is used when you do NOT pass a seed to the Random constructor. Source code:
public override double NextDouble() =>
// As described in http://prng.di.unimi.it/:
// "A standard double (64-bit) floating-point number in IEEE floating point format has 52 bits of significand,
// plus an implicit bit at the left of the significand. Thus, the representation can actually store numbers with
// 53 significant binary digits. Because of this fact, in C99 a 64-bit unsigned integer x should be converted to
// a 64-bit double using the expression
// (x >> 11) * 0x1.0p-53"
(NextUInt64() >> 11) * (1.0 / (1ul << 53));
We first obtain a random 64-bit unsigned integer. Now, we could multiply it by 1 / 2^64 to obtain a double in the 0..1 range, but that would make the resulting distribution biased. A double is represented by a 53-bit mantissa (52 bits are explicit and one is implicit), an exponent and a sign. Only the mantissa carries significant digits, so that leaves us with 53 bits to represent integer values exactly. But we have a 64-bit integer here. This means integer values up to 2^53 can be represented exactly by a double, but not every bigger integer can. For example:
ulong l1 = 1ul << 53;
ulong l2 = l1 + 1;
double d1 = l1;
double d2 = l2;
Console.WriteLine(d1 == d2);
Prints "True", so two different integers map to the same double value. That means that if we just multiply our 64-bit integer by 1 / 2^64, we'll get a biased, non-uniform distribution, because many integers bigger than 2^53 will map to the same values.
So instead, we throw away 11 bits and multiply the result by 1 / 2^53 to get a uniform distribution in the 0..1 range. The probability of getting 0 is then 1 / 2^53 (1 / 9,007,199,254,740,992). This implementation is better than the old one, because it provides many more distinct doubles in the 0..1 range (2^53 compared to about 2^31 in the old one).
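As a rough illustration of the shift-and-scale conversion described above, the sketch below applies the same expression to hand-picked 64-bit inputs instead of random ones. The names UnitDoubleSketch and ToUnitDouble are made up for this example and are not part of .NET.

```csharp
using System;

static class UnitDoubleSketch
{
    // Same expression as in the quoted source: keep the top 53 bits, scale by 2^-53.
    // ToUnitDouble is a hypothetical helper name used only in this sketch.
    public static double ToUnitDouble(ulong x) => (x >> 11) * (1.0 / (1ul << 53));

    static void Main()
    {
        // Any input whose top 53 bits are all zero maps to exactly 0.0.
        Console.WriteLine(ToUnitDouble(0) == 0.0);
        Console.WriteLine(ToUnitDouble(2047) == 0.0);

        // The smallest non-zero result is 2^-53.
        Console.WriteLine(ToUnitDouble(2048) == 1.0 / 9007199254740992.0);

        // The largest result is 1 - 2^-53, which stays below 1.0.
        Console.WriteLine(ToUnitDouble(ulong.MaxValue) < 1.0);
    }
}
```

Exactly 2^11 = 2048 of the 2^64 possible inputs map to zero, which gives the same 2^11 / 2^64 = 1 / 2^53 probability, assuming NextUInt64 is uniform.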
You also asked in comments:
If one knows how many numbers there are between 0 inclusive and 1
exclusive (according to IEEE 754), it would be possible to answer the
'probability' question, because 0 is one of all of them
That's not so. There are actually far more than 2^53 representable numbers in the 0..1 range in IEEE 754. We have 52 bits of mantissa and 11 bits of exponent, about half of whose values are negative exponents. Almost every negative exponent (roughly half of that 11-bit range), combined with any mantissa, gives a distinct value in the 0..1 range.
Why can't we use the full set of values IEEE 754 allows in 0..1 to generate a random number? Because those values are not uniformly spaced (just as the full double range is not uniformly spaced). For example, there are many more representable numbers in the 0 .. 0.5 range than in the 0.5 .. 1 range.
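To make the non-uniform spacing concrete, here is a small sketch that counts representable doubles in a range by subtracting raw bit patterns. It relies on the fact that, for non-negative finite doubles, the bit pattern read as an integer grows with the numeric value. CountDoubles is a hypothetical helper name.

```csharp
using System;

static class DoubleCountSketch
{
    // Number of representable doubles in [low, high), assuming 0 <= low <= high
    // and both are finite. Hypothetical helper for illustration only.
    public static long CountDoubles(double low, double high) =>
        BitConverter.DoubleToInt64Bits(high) - BitConverter.DoubleToInt64Bits(low);

    static void Main()
    {
        long lowerHalf = CountDoubles(0.0, 0.5);
        long upperHalf = CountDoubles(0.5, 1.0);

        Console.WriteLine($"[0, 0.5): {lowerHalf}");
        Console.WriteLine($"[0.5, 1): {upperHalf}");
        Console.WriteLine($"ratio: {lowerHalf / upperHalf}");
    }
}
```

The upper half contains exactly 2^52 values (one exponent, every mantissa), while the lower half contains 1022 times as many, so picking a representable double at random would heavily favor small numbers.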

This is from a strictly academic perspective.
From Double Struct:
All floating-point numbers also have a limited number of significant
digits, which also determines how accurately a floating-point value
approximates a real number. A Double value has up to 15 decimal digits
of precision, although a maximum of 17 digits is maintained
internally. This means that some floating-point operations may lack
the precision to change a floating point value.
If only 15 decimal digits are significant, then your possible return values are:
0.000000000000000
To:
0.999999999999999
Said differently, you have 10^15 possible (comparably different, "distinct") values (see Permutations in the first answer):
10^15 = 1,000,000,000,000,000
Zero is just ONE of those possibilities:
1 / 1,000,000,000,000,000 = 0.000000000000001
Stated as a percentage:
0.0000000000001% chance of zero being randomly selected?
I think this is the closest "correct" answer you're going to get...
...whether it performs this way in practice is possibly a different story.

Just create a simple program, and let it run until you are satisfied by the number of tries done. (see: https://onlinegdb.com/ij1M50gRQ)
Random r = new Random();
Double d ;
int attempts=0;
int attempts0=0;
while (true) {
d = Math.Round(r.NextDouble(),3);
if(d==0) attempts0++;
attempts++;
if (attempts%1000000==0) Console.WriteLine($"Attempts: {attempts}, with {attempts0} times a 0 value, this is {Math.Round(100.0*attempts0/attempts,3)} %");
}
example output:
...
Attempts: 208000000, with 103831 times a 0 value, this is 0.05 %
Attempts: 209000000, with 104315 times a 0 value, this is 0.05 %
Attempts: 210000000, with 104787 times a 0 value, this is 0.05 %
Attempts: 211000000, with 105305 times a 0 value, this is 0.05 %
Attempts: 212000000, with 105853 times a 0 value, this is 0.05 %
Attempts: 213000000, with 106349 times a 0 value, this is 0.05 %
Attempts: 214000000, with 106839 times a 0 value, this is 0.05 %
...
Changing the value of d to be rounded to 2 decimals will return 0.5%

Related

Where does the noise in ToString("G51") come from?

If you evaluate
(Math.PI).ToString("G51")
the result is
"3,141592653589793115997963468544185161590576171875"
Math.PI is correct up to the 15th decimal, which is in accordance with the precision of a double. "G17" is used for round-tripping, so I expected ToString to stop after 17 digits, but instead it kept producing digits all the way to the end of the string above.
Q: Since the digits after the 15th decimal are beyond the precision afforded by the 53 bits of mantissa, how are the remaining digits calculated?

A double has a 53 bit mantissa which corresponds to 15 decimals
A double has a 53 bit mantissa which corresponds to roughly 15 decimals.
Math.PI is truncated at 53 bits in binary, which is not precisely equal to 3.1415926535897, but it's also not quite equal to 3.14159265358979, etc.
Just like 1/3 is not quite 0.3, or 0.33, etc.
If you continue the decimal expansion long enough it will eventually terminate or start repeating, since Math.PI, unlike π, is a rational number.
And as you can verify with a BigRational type
using System.Numerics;
var r = new BigRational(Math.PI);
BigRational dem;
BigRational.NumDen(r, out dem);
var num = r * dem;
Console.WriteLine(num);
Console.WriteLine("-----------------");
Console.WriteLine(dem);
it's
884279719003555
---------------
281474976710656
The denominator is a power of 2 (which it has to be, since Math.PI has a binary mantissa and so a terminating representation in base 2). It therefore also has a terminating representation in base 10, which is exactly what was given by (Math.PI).ToString("G51"):
3.141592653589793115997963468544185161590576171875
as
Console.WriteLine(r == BigRational.Parse("3.141592653589793115997963468544185161590576171875"));
outputs
True

The .ToString("G51") method uses 'school math'. First it takes the integer part, then it multiplies the remainder by 10, uses the integer part of that result as the next digit, and multiplies the new remainder by 10 again.
It keeps doing that as long as there is a remainder, up to 51 digits.
Since a double has an exponent, there is still a remainder after about 18 decimal digits, and that remainder produces the next digit in ToString.
However, as you have seen, those extra digits say nothing about π itself; they are simply the result of carrying on the arithmetic on the stored value.
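The 'school math' procedure above can be sketched with System.Numerics.BigInteger, which ships with .NET. This is not how ToString is implemented internally; it is only a sketch that decodes the stored bits of a non-negative finite double and long-divides them exactly. ExactDecimal is a hypothetical helper name.

```csharp
using System;
using System.Numerics;
using System.Text;

static class ExactDecimalSketch
{
    // Exact decimal expansion of a finite, non-negative double (hypothetical helper).
    public static string ExactDecimal(double value)
    {
        if (!(value >= 0) || double.IsInfinity(value))
            throw new ArgumentOutOfRangeException(nameof(value));

        long bits = BitConverter.DoubleToInt64Bits(value);
        int exponentField = (int)((bits >> 52) & 0x7FF);
        long fraction = bits & ((1L << 52) - 1);

        // Normal numbers carry an implicit leading 1 bit; subnormals do not.
        BigInteger mantissa = exponentField == 0 ? fraction : fraction | (1L << 52);
        int exponent = (exponentField == 0 ? 1 : exponentField) - 1075;

        // At this point value == mantissa * 2^exponent, exactly.
        if (exponent >= 0)
            return (mantissa << exponent).ToString();

        BigInteger denominator = BigInteger.One << -exponent;
        BigInteger integerPart = BigInteger.DivRem(mantissa, denominator, out BigInteger remainder);

        var sb = new StringBuilder(integerPart.ToString());
        if (!remainder.IsZero) sb.Append('.');

        // Long division: multiply the remainder by 10, emit the quotient digit, repeat.
        // The loop terminates because the denominator is a power of 2.
        while (!remainder.IsZero)
        {
            BigInteger digit = BigInteger.DivRem(remainder * 10, denominator, out BigInteger next);
            sb.Append((char)('0' + (int)digit));
            remainder = next;
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(ExactDecimal(Math.PI));
    }
}
```

For Math.PI this produces the same 48 fractional digits as the "G51" output quoted in the question, which is consistent with the digits being an exact expansion of the stored binary value rather than noise.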

Why does C# round (100-1)/2 to 49 instead of 50?

According to my calculator: (100-1) / 2 = 49.5
If I have an int like this:
int mid = (100 - 1) / 2;
And printing mid will give me:
49
Why will C# give me 49 instead of 50? Aren't you supposed to round to the next whole number if it is .5 so that the number would be 50?

When performing integer division (by which we mean both arguments are integral types) the CLR will truncate the result; effectively rounding down for positive results.
If you want "standard" or midpoint rounding, you need to explicitly use Math.Round and floating point division (at least one argument is float, double or decimal).

This isn't rounding but integer arithmetic, which discards the fractional part, so
(100 - 1) / 2 = 99 / 2 = 49.5, which truncates to 49
and
(100 - 1) / -2 = 99 / -2 = -49.5, which truncates to -49
If you want real rounding then you must make at least one of the variables a floating point value and then call Math.Round on the result.

As others mentioned, integer division will truncate. This makes sense since 99 % 2 == 1 and you would expect that subtracting the modulus from the original value wouldn't change the division result, i.e. (99/2 == (99-(99%2))/2), so 99/2 = 98/2
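A short sketch of the difference between the two approaches mentioned in the answers. The helper names Truncated and Rounded are made up for this example. Note that Math.Round without a mode argument rounds midpoints to the nearest even number, so the sketch passes MidpointRounding.AwayFromZero to get the "school" behavior the question expects.

```csharp
using System;

static class DivisionSketch
{
    // Integer division: the fractional part is discarded (truncation toward zero).
    public static int Truncated(int a, int b) => a / b;

    // Floating-point division followed by explicit midpoint rounding.
    public static double Rounded(int a, int b) =>
        Math.Round((double)a / b, MidpointRounding.AwayFromZero);

    static void Main()
    {
        Console.WriteLine(Truncated(99, 2));   // 49
        Console.WriteLine(Truncated(99, -2));  // -49
        Console.WriteLine(Rounded(99, 2));     // 50

        // Default Math.Round sends midpoints to the even neighbor.
        Console.WriteLine(Math.Round(2.5));    // 2
    }
}
```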

Double vs Decimal Rounding in C#

Why does:
double dividend = 1.0;
double divisor = 3.0;
Console.WriteLine(dividend / divisor * divisor);
output 1.0,
but:
decimal dividend = 1;
decimal divisor = 3;
Console.WriteLine(dividend / divisor * divisor);
outputs 0.9999999999999999999999999999
?
I understand that 1/3 can't be computed exactly, so there must be some rounding.
But why does Double round the answer to 1.0, but Decimal does not?
Also, why does double compute 1.0/3.0 to be 0.33333333333333331?
If rounding is used, then wouldn't the last 3 get rounded to 0, why 1?

Why 1/3 as a double is 0.33333333333333331
The closest way to represent 1/3 in binary is like this:
0.0101010101...
That's the same as the series 1/4 + (1/4)^2 + (1/4)^3 + (1/4)^4...
Of course, this is limited by the number of bits you can store in a double. A double is 64 bits, but one of those is the sign bit and another 11 represent the exponent (think of it like scientific notation, but in binary). So the rest, which is called the mantissa or significand is 52 bits. Assume a 1 to start and then use two bits for each subsequent power of 1/4. That means you can store:
1/4 + 1/4^2 + ... + 1/4 ^ 27
which is 0.33333333333333331
Why multiplying by 3 rounds this to 1
So 1/3 represented in binary and limited by the size of a double is:
0.010101010101010101010101010101010101010101010101010101
I'm not saying that's how it's stored. Like I said, you store the bits starting after the 1, and you use separate bits for the exponent and the sign. But I think it's useful to consider how you'd actually write it in base 2.
Let's stick with this "mathematician's binary" representation and ignore the size limits of a double. You don't have to do it this way, but I find it convenient. If we want to take this approximation for 1/3 and multiply by 3, that's the same as bit shifting to multiply by 2 and then adding what you started with. This gives us 1/3 * 3 = 0.111111111111111111111111111111111111111111111111111111
But can a double store that? No, remember, you can only have 52 bits of mantissa after the first 1, and that number has 54 ones. So we know that it'll be rounded, in this case rounded up to exactly 1.
Why for decimal you get 0.9999999999999999999999999999
With decimal, you get 96 bits to represent an integer, with additional bits representing the exponent up to 28 powers of 10. So even though ultimately it's all stored as binary, here we're working with powers of 10 so it makes sense to think of the number in base 10. 96 bits lets us express up to 79,228,162,514,264,337,593,543,950,335, but to represent 1/3 we're going to go with all 3's, up to the 28 of them that we can shift to the right of the decimal point: 0.3333333333333333333333333333.
Multiplying this approximation for 1/3 by 3 gives us a number we can represent exactly. It's just 28 9's, all shifted to the right of the decimal point: 0.9999999999999999999999999999. So unlike with double, there's no second round of rounding at this point.
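The two behaviors described above can be checked side by side. This is only a sketch; ThirdsSketch and its method names are made up for the example.

```csharp
using System;

static class ThirdsSketch
{
    // 1/3 is rounded once on division and once more on multiplication.
    public static double DoubleRoundTrip()
    {
        double dividend = 1.0, divisor = 3.0;
        return dividend / divisor * divisor;
    }

    // 1/3 is rounded on division only; the product is exactly representable.
    public static decimal DecimalRoundTrip()
    {
        decimal dividend = 1m, divisor = 3m;
        return dividend / divisor * divisor;
    }

    static void Main()
    {
        // Stored bits of 1.0/3.0: the mantissa is the repeating 0101... pattern.
        Console.WriteLine(BitConverter.DoubleToInt64Bits(1.0 / 3.0).ToString("X"));

        Console.WriteLine(DoubleRoundTrip() == 1.0);
        Console.WriteLine(DecimalRoundTrip());
        Console.WriteLine(1m - DecimalRoundTrip());
    }
}
```

The last line shows the leftover error in the decimal case, which is one unit in the 28th decimal place.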

This is by design: the decimal type is optimized for accuracy in base 10, whereas the double type trades some accuracy for higher performance.
The Decimal value type represents decimal numbers ranging from positive 79,228,162,514,264,337,593,543,950,335 to negative 79,228,162,514,264,337,593,543,950,335.
The Decimal value type is appropriate for financial calculations requiring large numbers of significant integral and fractional digits and no round-off errors. The Decimal type does not eliminate the need for rounding. Rather, it minimizes errors due to rounding. Thus your code produces a result of 0.9999999999999999999999999999 rather than 1.
One reason that infinite decimals are a necessary extension of finite decimals is to represent fractions. Using long division, a simple division of integers like 1⁄9 becomes a recurring decimal, 0.111…, in which the digits repeat without end. This decimal yields a quick proof for 0.999… = 1. Multiplication of 9 times 1 produces 9 in each digit, so 9 × 0.111… equals 0.999… and 9 × 1⁄9 equals 1, so 0.999… = 1.

Floating point operations ambiguity [duplicate]

Why is there a bias in floating point ops? Any specific reason?
static void Main()
{
    float x = (float) 1.6;
    int y = (int)(x * 100);
    float a = (float) 1.4;
    int b = (int)(a * 100);
    Console.WriteLine(y);
    Console.WriteLine(b);
    Console.ReadKey();
}
Output:
160
139

Any rational number that has a denominator that is not a power of 2 will lead to an infinite number of digits when represented as a binary. Here you have 8/5 and 7/5. Therefore there is no exact binary representation as a floating-point number (unless you have infinite memory).
The exact binary representation of 1.6 is 110011001100110011001100110011001100...
The exact binary representation of 1.4 is 101100110011001100110011001100110011...
Both values have an infinite number of digits (1100 is repeated endlessly).
float values have a precision of 24 bits. So the binary representation of any value will be rounded to 24 bits. If you round the given values to 24 bits you get:
1.6: 110011001100110011001101 (decimal 13421773) - rounded up
1.4: 101100110011001100110011 (decimal 11744051) - rounded down
Both values have an exponent of 0 (the first bit is 2^0 = 1, the second is 2^-1 = 0.5 etc.).
Since the first bit in a 24 bit value is 2^23 you can calculate the exact decimal values by dividing the 24 bit values (13421773 and 11744051) by two 23 times.
The values are: 1.60000002384185791015625 and 1.39999997615814208984375.
When using floating-point types you always have to consider that their precision is finite. Values that can be written exactly as decimal values might be rounded up or down when represented in binary. Casting to int does not account for that, because it truncates the given value. You should always use something like Math.Round.
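The effect can be made visible by widening the float to double, which is an exact conversion and so exposes the value that is really stored. The sketch below deliberately does the multiplication in double, because the precision of intermediate float arithmetic may differ between runtimes, and the float-only code from the question does not necessarily print the same numbers everywhere. The helper names are made up for this example.

```csharp
using System;

static class FloatCastSketch
{
    // Widening float to double is exact, so this is the value actually stored.
    public static double Stored(float f) => (double)f;

    // Truncating cast, as in the question (but computed in double).
    public static int Truncate(float f) => (int)(Stored(f) * 100);

    // Rounding to the nearest integer before converting.
    public static int Round(float f) => (int)Math.Round(Stored(f) * 100);

    static void Main()
    {
        Console.WriteLine(Stored(1.6f) > 1.6);   // stored slightly above 1.6
        Console.WriteLine(Stored(1.4f) < 1.4);   // stored slightly below 1.4

        Console.WriteLine(Truncate(1.6f));  // 160
        Console.WriteLine(Truncate(1.4f));  // 139
        Console.WriteLine(Round(1.4f));     // 140
    }
}
```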
If you really need an exact representation of rational numbers you need a completely different approach. Since rational numbers are fractions you can use integers to represent them. Here is an example of how you can achieve that.
However, you can not write Rational x = (Rational)1.6 then. You have to write something like Rational x = new Rational(8, 5) (or new Rational(16, 10) etc.).
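The example the answer originally pointed to is not reproduced here. As a stand-in, the following is a minimal hypothetical Rational type, written only to show the idea of storing a fraction as two integers. It supports just construction, multiplication, equality and truncation to int, and it ignores overflow.

```csharp
using System;

// Hypothetical minimal type for illustration; not the implementation the answer referred to.
readonly struct Rational : IEquatable<Rational>
{
    public long Numerator { get; }
    public long Denominator { get; }

    public Rational(long numerator, long denominator)
    {
        if (denominator == 0) throw new DivideByZeroException();
        if (denominator < 0) { numerator = -numerator; denominator = -denominator; }

        // Store the fraction in lowest terms so equal values compare equal.
        long g = Gcd(Math.Abs(numerator), denominator);
        Numerator = numerator / g;
        Denominator = denominator / g;
    }

    private static long Gcd(long a, long b)
    {
        while (b != 0) { long t = a % b; a = b; b = t; }
        return a;
    }

    public static Rational operator *(Rational x, Rational y) =>
        new Rational(x.Numerator * y.Numerator, x.Denominator * y.Denominator);

    // Truncates toward zero, like an int cast on a floating-point value.
    public static explicit operator int(Rational r) => (int)(r.Numerator / r.Denominator);

    public bool Equals(Rational other) =>
        Numerator == other.Numerator && Denominator == other.Denominator;
    public override bool Equals(object obj) => obj is Rational r && Equals(r);
    public override int GetHashCode() => Numerator.GetHashCode() ^ Denominator.GetHashCode();
    public override string ToString() => $"{Numerator}/{Denominator}";
}
```

With such a type, (int)(new Rational(7, 5) * new Rational(100, 1)) evaluates to 140, because 7/5 is stored exactly and no binary rounding takes place.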

This is due to the fact that floating point arithmetic is not precise. When you set a to 1.4, internally it may not be exactly 1.4, just as close as can be made with machine precision. If it is fractionally less than 1.4, then multiplying by 100 and casting to integer will take only the integer portion which in this case would be 139. You will get far more technically precise answers but essentially this is what is happening.
In the case of your output for the 1.6 case, the floating point representation may actually be minutely larger than 1.6 and so when you multiply by 100, the total is slightly larger than 160 and so the integer cast gives you what you expect. The fact is that there is simply not enough precision available in a computer to store every real number exactly.
See this link for details of the conversion from floating point to integer types http://msdn.microsoft.com/en-us/library/aa691289%28v=vs.71%29.aspx - it has its own section.

The floating point types float (32 bit) and double (64 bit) have limited precision, and moreover the value is represented in binary internally. Just as you cannot represent 1/7 precisely in a decimal system (~ 0.1428571428571428...), 1/10 cannot be represented precisely in a binary system.
You can however use the decimal type. It still has a limited (however high) precision, but the numbers are represented in a decimal way internally. Therefore a value like 1/10 is represented exactly like 0.1000000000000000000000000000 internally. 1/7 is still a problem for decimal. But at least you don't get a loss of precision by converting to binary and then back to decimal.
Consider using decimal.

Check if decimal contains decimal places by looking at the bytes

There is a similar question here. Sometimes that solution throws exceptions because the numbers might be too large.
I think that if there is a way of looking at the bytes of a decimal number it will be more efficient. For example a decimal number has to be represented by some n number of bytes. For example an Int32 is represented by 32 bits and all the numbers that start with the bit of 1 are negative. Maybe there is some kind of similar relationship with decimal numbers. How could you look at the bytes of a decimal number? or the bytes of an integer number?

If you are really talking about decimal numbers (as opposed to floating-point numbers), then Decimal.GetBits will let you look at the individual bits of a decimal. The MSDN page also contains a description of the meaning of the bits.
On the other hand, if you just want to check whether a number has a fractional part or not, doing a simple
var hasFractionalPart = (myValue - Math.Round(myValue) != 0);
is much easier than decoding the binary structure. This should work for decimals as well as classic floating-point data types such as float or double. In the latter case, due to floating-point rounding error, it might make sense to check for Math.Abs(myValue - Math.Round(myValue)) > someThreshold instead of comparing to 0.

If you want a reasonably efficient way of getting the 'decimal' value of a decimal type you can just mod it by one.
decimal number = 4.75M;
decimal fractionalPart = number % 1;
Console.WriteLine(fractionalPart); //will print 0.75
While it may not be the theoretically optimal solution, it'll be quite fast, and almost certainly fast enough for your purposes (far better than string manipulation and parsing, which is a common naive approach).

You can use Decimal.GetBits in order to retrieve the bits from a decimal structure.
The MSDN page linked above details how they are laid out in memory:
The binary representation of a Decimal number consists of a 1-bit sign, a 96-bit integer number, and a scaling factor used to divide the integer number and specify what portion of it is a decimal fraction. The scaling factor is implicitly the number 10, raised to an exponent ranging from 0 to 28.
The return value is a four-element array of 32-bit signed integers.
The first, second, and third elements of the returned array contain the low, middle, and high 32 bits of the 96-bit integer number.
The fourth element of the returned array contains the scale factor and sign. It consists of the following parts:
Bits 0 to 15, the lower word, are unused and must be zero.
Bits 16 to 23 must contain an exponent between 0 and 28, which indicates the power of 10 to divide the integer number.
Bits 24 to 30 are unused and must be zero.
Bit 31 contains the sign; 0 meaning positive, and 1 meaning negative.

Going with Oded's detailed info to use GetBits, I came up with this
const int EXP_MASK = 0x00FF0000;
bool hasDecimal = (Decimal.GetBits(value)[3] & EXP_MASK) != 0x0;
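One caveat with a scale-only test is that a non-zero scale does not by itself mean a non-zero fractional part: the literal 1.0M is stored with scale 1 even though its fraction is zero. The sketch below decodes the fields described in the quoted layout and compares them with the modulo check from the earlier answer. The helper names are made up for this example.

```csharp
using System;

static class DecimalBitsSketch
{
    // Bits 16 to 23 of the fourth element hold the power-of-ten scale.
    public static int Scale(decimal value) => (decimal.GetBits(value)[3] >> 16) & 0xFF;

    // Bit 31 is the sign, so the 32-bit signed element is negative when it is set.
    public static bool IsNegative(decimal value) => decimal.GetBits(value)[3] < 0;

    // Value-based check: true only when something non-zero follows the decimal point.
    public static bool HasFraction(decimal value) => value % 1 != 0;

    static void Main()
    {
        foreach (decimal d in new[] { 4.75M, 1.0M, 1M, -4.75M })
        {
            Console.WriteLine(
                $"{d}: scale={Scale(d)}, negative={IsNegative(d)}, hasFraction={HasFraction(d)}");
        }
    }
}
```

If "contains decimal places" is meant in the numeric sense, the modulo check is the safer of the two; the scale test answers the different question of how the value happens to be stored.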
