Where does the noise in ToString("G51") come from?

If you evaluate
(Math.PI).ToString("G51")
the result is
"3,141592653589793115997963468544185161590576171875"
Math.PI is correct up to the 15th decimal, which is in accordance with the precision of a double. "G17" is used for round-tripping, so I expected ToString to stop producing digits after 17, but instead it continued to produce digits all the way up to "G51".
Q: Since the digits after the 15th decimal are beyond the precision afforded by the 53 bits of mantissa, how are the remaining digits calculated?

A double has a 53 bit mantissa which corresponds to roughly 15 decimals.
Math.PI is π truncated at 53 bits in binary, which is not precisely equal to
3.1415926535897
but it's also not quite equal to
3.14159265358979
and so on.
Just like 1/3 is not quite 0.3, or 0.33, etc.
If you continue the decimal expansion long enough, it will eventually terminate or start repeating, since Math.PI, unlike π, is a rational number.
And as you can verify with a BigRational type
using System.Numerics;
var r = new BigRational(Math.PI);
BigRational dem;
BigRational.NumDen(r, out dem);
var num = r * dem;
Console.WriteLine(num);
Console.WriteLine("-----------------");
Console.WriteLine(dem);
it's
884279719003555
---------------
281474976710656
Since the denominator is a power of 2 (which it has to be, because Math.PI has a binary mantissa and so a terminating representation in base-2), it also has a terminating representation in base-10, which is exactly what was given by (Math.PI).ToString("G51"):
3.141592653589793115997963468544185161590576171875
as
Console.WriteLine(r == BigRational.Parse("3.141592653589793115997963468544185161590576171875"));
outputs
True
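The same expansion can be reproduced without a BigRational type. The following is a minimal sketch, assuming only System.Numerics.BigInteger; ExactDecimal is a hypothetical helper name (not a framework API) and it only handles finite, non-negative values. It splits the double into an integer mantissa m and a power-of-two exponent, then uses m / 2^k = m * 5^k / 10^k to obtain every decimal digit exactly.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: exact decimal expansion of a finite, non-negative double.
static string ExactDecimal(double value)
{
    long bits = BitConverter.DoubleToInt64Bits(value);
    int biasedExponent = (int)((bits >> 52) & 0x7FF);
    long fraction = bits & 0xFFFFFFFFFFFFFL; // low 52 bits
    // Normal numbers carry an implicit leading 1; subnormals do not.
    long mantissa = biasedExponent == 0 ? fraction : fraction | (1L << 52);
    int exponent = (biasedExponent == 0 ? 1 : biasedExponent) - 1075;
    // At this point: value == mantissa * 2^exponent

    if (exponent >= 0)
        return (new BigInteger(mantissa) << exponent).ToString();

    int k = -exponent;
    // mantissa / 2^k == mantissa * 5^k / 10^k, so the digits are those of mantissa * 5^k.
    string digits = (mantissa * BigInteger.Pow(5, k)).ToString().PadLeft(k + 1, '0');
    string text = digits.Substring(0, digits.Length - k) + "." + digits.Substring(digits.Length - k);
    return text.TrimEnd('0').TrimEnd('.');
}

Console.WriteLine(ExactDecimal(Math.PI));
// 3.141592653589793115997963468544185161590576171875
```

Because the denominator is always a power of two, the expansion always terminates; the digits past the 17th are not noise, they are simply the exact value of the stored binary fraction.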

The .ToString("G51") method uses 'school math'. First it takes the integer part, then it multiplies the remainder by 10, uses the integer part of that result as the next digit, and multiplies the new remainder by 10 again.
It will keep doing that as long as there is a remainder, up to 51 decimal digits.
Since a double has an exponent, there will still be a remainder after about 18 decimal digits, and that remainder produces the next digit in the ToString method.
However, as you have seen, the remainder part is not accurate with respect to π; it is simply the result of the arithmetic on the stored value.
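The digit-by-digit procedure described above can be sketched with exact integer arithmetic. This is only an illustration, not the actual ToString implementation; it assumes the numerator and denominator quoted in the BigRational answer on this page, and the variable names are made up for the example.

```csharp
using System;
using System.Numerics;
using System.Text;

// Math.PI as an exact fraction (values quoted from the BigRational answer).
BigInteger numerator = BigInteger.Parse("884279719003555");
BigInteger denominator = BigInteger.Parse("281474976710656"); // 2^48

// Integer part first, then one decimal digit per step of long division.
BigInteger integerPart = BigInteger.DivRem(numerator, denominator, out BigInteger remainder);
var longDivision = new StringBuilder(integerPart.ToString()).Append('.');
while (!remainder.IsZero)
{
    // Multiply the remainder by 10; the quotient is the next digit.
    BigInteger digit = BigInteger.DivRem(remainder * 10, denominator, out BigInteger next);
    longDivision.Append(digit.ToString());
    remainder = next;
}

Console.WriteLine(longDivision.ToString());
// 3.141592653589793115997963468544185161590576171875
```

The loop stops after 48 fractional digits because the denominator is 2^48: each multiplication by 10 cancels one factor of 2.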

Related

What are the chances of Random.NextDouble being exactly 0?

The documentation of Random.NextDouble():
Returns a random floating-point number that is greater than or equal to 0.0, and less than 1.0.
So, it can be exactly 0. But what are the chances for that?
var random = new Random();
var x = random.NextDouble();
if(x == 0){
// probability for this?
}
It would be easy to calculate the probability for Random.Next() being 0, but I have no idea how to do it in this case...

As mentioned in comments, it depends on internal implementation of NextDouble. In "old" .NET Framework, and in modern .NET up to version 5, it looks like this:
protected virtual double Sample() {
return (InternalSample()*(1.0/MBIG));
}
InternalSample returns integer in 0 to Int32.MaxValue range, 0 included, int.MaxValue excluded. We can assume that the distribution of InternalSample is uniform (in the docs for Next method, which just calls InternalSample, there are clues that it is, and it seems there is no reason to use non-uniform distribution in general-purpose RNG for integers). That means every number is equally likely. Then, we have 2,147,483,647 numbers in distribution, and the probability to draw 0 is 1 / 2,147,483,647.
In modern .NET 6+ there are two implementations. First is used when you provide explicit seed value to Random constructor. This implementation is the same as above, and is used for compatibility reasons - so that old code relying on the seed value to produce deterministic results will not break while moving to the new .NET version.
Second implementation is a new one and is used when you do NOT pass seed into Random constructor. Source code:
public override double NextDouble() =>
// As described in http://prng.di.unimi.it/:
// "A standard double (64-bit) floating-point number in IEEE floating point format has 52 bits of significand,
// plus an implicit bit at the left of the significand. Thus, the representation can actually store numbers with
// 53 significant binary digits. Because of this fact, in C99 a 64-bit unsigned integer x should be converted to
// a 64-bit double using the expression
// (x >> 11) * 0x1.0p-53"
(NextUInt64() >> 11) * (1.0 / (1ul << 53));
We first obtain a random 64-bit unsigned integer. Now, we could multiply it by 1 / 2^64 to obtain a double in the 0..1 range, but that would make the resulting distribution biased. A double is represented by a 53-bit mantissa (52 bits are explicit and one is implicit), an exponent and a sign. That leaves us with 53 bits to represent integer values exactly. But we have a 64-bit integer here. This means integer values up to 2^53 can be represented exactly by a double, but not all bigger integers can. For example:
ulong l1 = 1ul << 53;
ulong l2 = l1 + 1;
double d1 = l1;
double d2 = l2;
Console.WriteLine(d1 == d2);
Prints "True", so two different integers map to the same double value. That means if we just multiply our 64-bit integer by 1 / 2^64, we'll get a biased, non-uniform distribution, because many integers bigger than 2^53 will map to the same values.
So instead, we throw away 11 bits and multiply the result by 1 / 2^53 to get a uniform distribution in the 0..1 range. The probability of getting 0 is then 1 / 2^53 (1 / 9,007,199,254,740,992). This implementation is better than the old one, because it provides many more distinct doubles in the 0..1 range (2^53 compared to about 2^31 in the old one).
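To see the endpoints of that mapping, here is a small sketch that applies the same expression as the quoted source to hand-picked inputs instead of random ones. ToUnitDouble is a hypothetical name used only for this illustration.

```csharp
using System;

// Same expression as the quoted NextDouble source; the name is hypothetical.
static double ToUnitDouble(ulong x) => (x >> 11) * (1.0 / (1ul << 53));

double smallest = ToUnitDouble(0UL);            // every bit zero
double largest = ToUnitDouble(ulong.MaxValue);  // every bit one
double step = 1.0 / (1ul << 53);                // spacing between neighbouring outputs

Console.WriteLine(smallest == 0.0);             // True
Console.WriteLine(largest < 1.0);               // True
Console.WriteLine(largest == 1.0 - step);       // True
Console.WriteLine(ToUnitDouble(2047UL) == 0.0); // True: the low 11 bits are discarded
```

Only the 2^11 inputs whose upper 53 bits are all zero map to 0.0, which is where the 2^11 / 2^64 = 1 / 2^53 probability comes from.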
You also asked in comments:
If one knows how many numbers there are between 0 inclusive and 1
exclusive (according to IEEE 754), it would be possible to answer the
'probability' question, because 0 is one of all of them
That's not so. There are actually many more than 2^53 representable numbers in the 0..1 range in IEEE 754. We have 52 bits of mantissa and 11 bits of exponent, and roughly half of the exponent values are negative exponents. Almost every negative exponent, combined with any mantissa, gives a distinct value in the 0..1 range.
Why can't we use every representable value in the 0..1 range to generate a random number? Because these values are not uniformly spaced (just as the full double range is not uniformly spaced). For example, there are more representable numbers in the 0 .. 0.5 range than in the 0.5 .. 1 range.
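The uneven spacing can be checked by counting bit patterns. This is a sketch that relies on one fact about IEEE 754: for non-negative doubles, the raw 64-bit pattern increases together with the value. CountDoubles is a hypothetical helper name.

```csharp
using System;

// For non-negative doubles the raw bit pattern grows with the value,
// so subtracting bit patterns counts the representable values in between.
static long CountDoubles(double fromInclusive, double toExclusive) =>
    BitConverter.DoubleToInt64Bits(toExclusive) - BitConverter.DoubleToInt64Bits(fromInclusive);

long lowerHalf = CountDoubles(0.0, 0.5);
long upperHalf = CountDoubles(0.5, 1.0);

Console.WriteLine($"[0, 0.5): {lowerHalf} doubles");
Console.WriteLine($"[0.5, 1): {upperHalf} doubles");
Console.WriteLine($"ratio:    {lowerHalf / upperHalf}");
```

The upper half holds exactly 2^52 values (one binade), while the lower half holds 1022 times as many, so picking a representable double at random would heavily favour small numbers.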

This is from a strictly academic perspective.
From Double Struct:
All floating-point numbers also have a limited number of significant
digits, which also determines how accurately a floating-point value
approximates a real number. A Double value has up to 15 decimal digits
of precision, although a maximum of 17 digits is maintained
internally. This means that some floating-point operations may lack
the precision to change a floating point value.
If only 15 decimal digits are significant, then your possible return values are:
0.000000000000000
To:
0.999999999999999
Said differently, you have 10^15 possible (comparably different, "distinct") values (see Permutations in the first answer):
10^15 = 1,000,000,000,000,000
Zero is just ONE of those possibilities:
1 / 1,000,000,000,000,000 = 0.000000000000001
Stated as a percentage:
0.0000000000001% chance of zero being randomly selected?
I think this is the closest "correct" answer you're going to get...
...whether it performs this way in practice is possibly a different story.

Just create a simple program, and let it run until you are satisfied by the number of tries done. (see: https://onlinegdb.com/ij1M50gRQ)
Random r = new Random();
double d;
long attempts = 0;   // long, so the counters cannot overflow on long runs
long attempts0 = 0;
while (true) {
    // Round to 3 decimals, so every value below 0.0005 counts as 0
    d = Math.Round(r.NextDouble(), 3);
    if (d == 0) attempts0++;
    attempts++;
    if (attempts % 1000000 == 0) Console.WriteLine($"Attempts: {attempts}, with {attempts0} times a 0 value, this is {Math.Round(100.0 * attempts0 / attempts, 3)} %");
}
example output:
...
Attempts: 208000000, with 103831 times a 0 value, this is 0.05 %
Attempts: 209000000, with 104315 times a 0 value, this is 0.05 %
Attempts: 210000000, with 104787 times a 0 value, this is 0.05 %
Attempts: 211000000, with 105305 times a 0 value, this is 0.05 %
Attempts: 212000000, with 105853 times a 0 value, this is 0.05 %
Attempts: 213000000, with 106349 times a 0 value, this is 0.05 %
Attempts: 214000000, with 106839 times a 0 value, this is 0.05 %
...
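Changing d to be rounded to 2 decimals will return 0.5% instead. Note that this measures how often NextDouble() rounds to 0 (that is, falls below 0.0005 or 0.005), not how often it returns exactly 0.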

Why does C# round (100-1)/2 to 49 instead of 50?

According to my calculator: (100-1) / 2 = 49.5
If I have an int like this:
int mid = (100 - 1) / 2;
And printing mid will give me:
49
Why will C# give me 49 instead of 50? Aren't you supposed to round to the next whole number if it is .5 so that the number would be 50?

When performing integer division (by which we mean both arguments are integral types) the CLR will truncate the result; effectively rounding down for positive results.
If you want "standard" or midpoint rounding, you need to explicitly use Math.Round and floating point division (at least one argument is float, double or decimal).
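A short sketch of the difference, using only System.Math. One detail worth knowing: Math.Round uses MidpointRounding.ToEven by default, so pass MidpointRounding.AwayFromZero if you want the "school" behaviour of always rounding .5 up for positive values.

```csharp
using System;

int truncated = (100 - 1) / 2;      // integer division: 49
int negative = (100 - 1) / -2;      // truncates toward zero: -49
double exact = (100 - 1) / 2.0;     // floating-point division: 49.5
int rounded = (int)Math.Round(exact, MidpointRounding.AwayFromZero); // 50
int toEven = (int)Math.Round(48.5); // default midpoint mode is ToEven: 48

Console.WriteLine($"{truncated} {negative} {exact} {rounded} {toEven}");
```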

This isn't rounding but integer arithmetic, which truncates the fractional part of the result, so
(100 - 1) / 2 = 99 / 2 = 49.5, which truncates to 49
and
(100 - 1) / -2 = 99 / -2 = -49.5, which truncates to -49
If you want real rounding, then you must make at least one of the operands a floating point value and then call Math.Round on the result.

As others mentioned, integer division will truncate. This makes sense since 99 % 2 == 1, and you would expect that subtracting the modulus from the original value wouldn't change the division result, i.e. 99/2 == (99-(99%2))/2, so 99/2 == 98/2.

Float and double - Significand numbers- Mantissa POV?

With single precision (32 bits), the bits are divided like this:
So we have 23 bits of mantissa/significand.
So we can represent 2^23 numbers (via 23 bits), which is 8388608, a number 7 digits long.
BUT
I was reading that the mantissa is normalized (the leading digit in the mantissa will always be a 1), so the pattern is actually 1.mmm and only the mmm is stored in the mantissa.
For example, look here:
0.75 is represented but it's actually 1.75
Question #1
So basically it adds 1 more digit of precision... no?
If so, then we have 8 significant digits!
So why does MSDN say 7?
Question #2
In double there are 52 bits for the mantissa (0..51).
If I add 1 for the normalized mantissa, that's 2^53 possibilities, which is 9007199254740992 (16 digits),
and MS does say 15-16:
Why is there this inconsistency? Am I missing something?

It doesn't add one more decimal digit - just a single binary digit. So instead of 23 bits, you have 24 bits. This is handy, because the only number you can't represent as starting with a one is zero, and that's a special value.
In short, you're not looking at 2 ^ 24 (that would be a decimal number, base-10) - you're looking at 2 ^ (-24). That's the most important difference between float-double and decimal. decimal is what you imagine floats to be, ie. a simple exponent-shifted, base-10 number. float and double aren't that.
Now, decimal digits versus binary digits is a tricky matter. You're mistaken in your understanding that the precision has anything to do with the 2 ^ 24 figure - that would only be true if you were talking about e.g. the decimal type, which actually stores decimal values as decimal point offsets of a normal (huge-ass) integer.
Just like 1 / 3 cannot be written in decimal (0.333333...), many simple decimal numbers can't be represented in a float precisely (0.2 is the typical example). decimal doesn't have a problem with that - it's just 2 shifted one digit to the right, easy peasy. For floats, however, you have to represent this value as a sum of negative powers of two - 0.5, 0.25, 0.125 ... The reverse problem does not occur, because 2 is a factor of 10: every finite binary "decimal" can be represented with finite precision in decimal.
Now, in fact, float can easily represent a number with 24 decimal digits - it just has to be 2 ^ (-24) - a number you're not going to encounter in your usual day job, and a weird number in decimal. So where does the 7 (actually more like 7.22...) come from? Simple, just take the decimal logarithm of 2 ^ (-24): it is about -7.22, so 24 binary digits are worth roughly 7.22 decimal digits.
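The arithmetic behind those figures is short enough to check directly. This sketch just multiplies the number of significand bits (24 for float, 53 for double, both counting the implicit leading bit) by log10(2).

```csharp
using System;
using System.Globalization;

double floatDigits = 24 * Math.Log10(2);   // 24-bit significand of float
double doubleDigits = 53 * Math.Log10(2);  // 53-bit significand of double

Console.WriteLine(floatDigits.ToString("F2", CultureInfo.InvariantCulture));  // 7.22
Console.WriteLine(doubleDigits.ToString("F2", CultureInfo.InvariantCulture)); // 15.95
```

This is why the documentation quotes about 7 digits for float and 15-16 for double: the extra implicit bit is worth only about 0.3 of a decimal digit.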
The fact that it seems that 0.2 can be represented "exactly" in a float is simply because every time you e.g. convert it to a string, you're rounding. So, even though the number isn't 0.2 exactly, it ends up that way when you convert it to a decimal number.
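One way to look past that rounding is to widen the float to a double, which is an exact conversion, and print it with more digits. A minimal sketch, with variable names chosen only for this example:

```csharp
using System;
using System.Globalization;

float f = 0.2f;
double widened = f; // widening float to double is exact, so this exposes the stored value

string shortForm = f.ToString(CultureInfo.InvariantCulture);
string longForm = widened.ToString("G17", CultureInfo.InvariantCulture);

Console.WriteLine(shortForm);      // 0.2
Console.WriteLine(longForm);       // 0.20000000298023224
Console.WriteLine(widened == 0.2); // False
```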
All this means that when you need decimal precision, you want to use decimal, as simple as that. This is not because it's a better base for calculations, it's simply because humans use it, and they will not be happy if your application gives different results from what they calculate on a piece of paper - especially when dealing with money. Accountants are very focused on having everything correct to the least significant digit.
Floats are used where it's not about decimal precision, but rather about generally having some sort of precision - this makes them well suited for physics calculations and similar, because you don't actually care about having the number come up the same in decimal - you're working with a given precision, and you're going to get that - 24 significant binary "decimals".

The implied leading 1 adds one more binary digit of precision, not decimal.

Accuracy of decimal

I use the decimal type for high-precision (monetary) calculations.
But I came across this simple division today:
1 / (1 / 37) which should result in 37 again
http://www.wolframalpha.com/input/?i=1%2F+%281%2F37%29
But C# gives me:
37.000000000000000000000000037M
I tried both these:
1m/(1m/37m);
and
Decimal.Divide(1, Decimal.Divide(1, 37))
but both yield the same results. How is the behaviour explainable?

Decimal stores the value as decimal floating point with only limited precision. The result of 1 / 37 is not stored precisely; it is stored as 0.027027027027027027027027027M. The true number has the group 027 repeating indefinitely in its decimal representation. For that reason, you cannot get a precise decimal representation for every possible number.
If you use Double in the same calculation, the end result is correct in this case (but it does not mean it will always be better).
A good answer on that topic is here: Difference between decimal, float and double in .NET?

Decimal data type has an accuracy of 28-29 significant digits.
So what you have to understand is when you consider 28-29 significant digits you are still not exact.
So when you compute a decimal value for 1/37, note that at this stage you only get an accuracy of 28-29 digits. For example, 1/37 is 0.02 when you keep 2 decimal places and 0.027 when you keep 3 decimal places. Imagine you divide 1 by these values in each case: you get 50 in the first case and 37.037... in the second. Keeping 28-29 digits (decimal) gets you as close as 37.000000000000000000000000037. To get an exact 37 you would simply need more significant digits than the 28-29 that decimal offers.
Always do computations with maximum significant digits and round off only your answer with Math.Round for desired result.
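As a sketch of that advice: keep full precision through the intermediate steps and round only the final value. The choice of 20 decimal places here is an arbitrary assumption for the example; pick whatever your domain requires.

```csharp
using System;

decimal reciprocal = 1m / 37m;        // rounded to what decimal can hold
decimal roundTrip = 1m / reciprocal;  // the rounding error is carried along
decimal cleaned = Math.Round(roundTrip, 20); // round only the final answer

Console.WriteLine(roundTrip);         // the question reports 37.000000000000000000000000037
Console.WriteLine(roundTrip == 37m);  // False
Console.WriteLine(cleaned == 37m);    // True
```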

Double vs Decimal Rounding in C#

Why does:
double dividend = 1.0;
double divisor = 3.0;
Console.WriteLine(dividend / divisor * divisor);
output 1.0,
but:
decimal dividend = 1;
decimal divisor = 3;
Console.WriteLine(dividend / divisor * divisor);
outputs 0.9999999999999999999999999999
?
I understand that 1/3 can't be computed exactly, so there must be some rounding.
But why does Double round the answer to 1.0, but Decimal does not?
Also, why does double compute 1.0/3.0 to be 0.33333333333333331?
If rounding is used, then wouldn't the last 3 get rounded to 0, why 1?

Why 1/3 as a double is 0.33333333333333331
The closest way to represent 1/3 in binary is like this:
0.0101010101...
That's the same as the series 1/4 + (1/4)^2 + (1/4)^3 + (1/4)^4...
Of course, this is limited by the number of bits you can store in a double. A double is 64 bits, but one of those is the sign bit and another 11 represent the exponent (think of it like scientific notation, but in binary). So the rest, which is called the mantissa or significand is 52 bits. Assume a 1 to start and then use two bits for each subsequent power of 1/4. That means you can store:
1/4 + 1/4^2 + ... + 1/4 ^ 27
which is 0.33333333333333331
Why multiplying by 3 rounds this to 1
So 1/3 represented in binary and limited by the size of a double is:
0.010101010101010101010101010101010101010101010101010101
I'm not saying that's how it's stored. Like I said, you store the bits starting after the 1, and you use separate bits for the exponent and the sign. But I think it's useful to consider how you'd actually write it in base 2.
Let's stick with this "mathematician's binary" representation and ignore the size limits of a double. You don't have to do it this way, but I find it convenient. If we want to take this approximation for 1/3 and multiply by 3, that's the same as bit shifting to multiply by 2 and then adding what you started with. This gives us 1/3 * 3 = 0.111111111111111111111111111111111111111111111111111111
But can a double store that? No, remember, you can only have 52 bits of mantissa after the first 1, and that number has 54 ones. So we know that it'll be rounded, in this case rounded up to exactly 1.
Why for decimal you get 0.9999999999999999999999999999
With decimal, you get 96 bits to represent an integer, with additional bits representing the exponent up to 28 powers of 10. So even though ultimately it's all stored as binary, here we're working with powers of 10 so it makes sense to think of the number in base 10. 96 bits lets us express up to 79,228,162,514,264,337,593,543,950,335, but to represent 1/3 we're going to go with all 3's, up to the 28 of them that we can shift to the right of the decimal point: 0.3333333333333333333333333333.
Multiplying this approximation for 1/3 by 3 gives us a number we can represent exactly. It's just 28 9's, all shifted to the right of the decimal point: 0.9999999999999999999999999999. So unlike with double's there's not a second round of rounding at this point.
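Both behaviours can be observed side by side. This sketch only restates the question's two snippets with the intermediate third kept in a variable, so the stored approximation can be printed as well.

```csharp
using System;
using System.Globalization;

double doubleThird = 1.0 / 3.0;
double doubleBack = doubleThird * 3.0;   // rounded a second time, lands on exactly 1
decimal decimalThird = 1m / 3m;
decimal decimalBack = decimalThird * 3m; // exactly representable, so no second rounding

Console.WriteLine(doubleThird.ToString("G17", CultureInfo.InvariantCulture)); // 0.33333333333333331
Console.WriteLine(doubleBack == 1.0);                                         // True
Console.WriteLine(decimalThird.ToString(CultureInfo.InvariantCulture));       // 0.3333333333333333333333333333
Console.WriteLine(decimalBack.ToString(CultureInfo.InvariantCulture));        // 0.9999999999999999999999999999
```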

This is by design: the decimal type is optimized for accuracy, whereas the double type trades some accuracy for higher performance.
The Decimal value type represents decimal numbers ranging from positive 79,228,162,514,264,337,593,543,950,335 to negative 79,228,162,514,264,337,593,543,950,335.
The Decimal value type is appropriate for financial calculations requiring large numbers of significant integral and fractional digits and no round-off errors. The Decimal type does not eliminate the need for rounding. Rather, it minimizes errors due to rounding. Thus your code produces a result of 0.9999999999999999999999999999 rather than 1.
One reason that infinite decimals are a necessary extension of finite decimals is to represent fractions. Using long division, a simple division of integers like 1⁄9 becomes a recurring decimal, 0.111…, in which the digits repeat without end. This decimal yields a quick proof for 0.999… = 1. Multiplication of 9 times 1 produces 9 in each digit, so 9 × 0.111… equals 0.999… and 9 × 1⁄9 equals 1, so 0.999… = 1:
