How do I detect total loss of precision with Doubles?


When I run the following code, I get 0 printed on both lines:
Double a = 9.88131291682493E-324;
Double b = a*0.1D;
Console.WriteLine(b);
Console.WriteLine(BitConverter.DoubleToInt64Bits(b));
I would expect to get Double.NaN if an operation's result goes out of range. Instead I get 0. It looks like, to be able to detect when this happens, I have to:
Before the operation check if any of the operands is zero
After the operation, if neither of operands were zero, check if the result is zero. If not let it run. If it is zero, assign Double.NaN to it instead to indicate that it's not really a zero, it's just a result that can't be represented within this variable.
That's rather unwieldy. Is there a better way? What is Double.NaN designed for? I'm assuming some operations must return it; surely the designers did not put it there just in case? Is it possible that this is a bug in the BCL? (I know it's unlikely, but that's why I'd like to understand how Double.NaN is supposed to work.)
Update
By the way, this problem is not specific to double. decimal exhibits it all the same:
Decimal a = 0.0000000000000000000000000001m;
Decimal b = a* 0.1m;
Console.WriteLine(b);
That also gives zero.
In my case I need double, because I need the range they provide (I'm working on probabilistic calculations) and I'm not that worried about precision.
What I need, though, is to be able to detect when my results stop meaning anything, that is, when calculations drop the value so low that it can no longer be represented by double.
Is there a practical way of detecting this?

Double works exactly according to the floating point numbers specification, IEEE 754. So no, it's not an error in BCL - it's just the way IEEE 754 floating points work.
The reason, of course, is that it's not what floats are designed for at all. Instead, you might want to use decimal, which is a precise decimal number, unlike float/double.
There are a few special values in floating point numbers, with different meanings:
Infinity - e.g. 1f / 0f.
-Infinity - e.g. -1f / 0f.
NaN - e.g. 0f / 0f or Math.Sqrt(-1)
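As a rough illustration of the list above, here is a minimal sketch of how these values show up in C#. It assumes a C# 9+ project with top-level statements; the variable names are mine, not from the question.

```csharp
using System;

double zero = 0.0;

double positiveInfinity = 1.0 / zero;   // non-zero divided by zero
double negativeInfinity = -1.0 / zero;
double fromZeroOverZero = zero / zero;  // indeterminate, so NaN
double fromSqrt = Math.Sqrt(-1.0);      // invalid operation, so NaN

Console.WriteLine(double.IsPositiveInfinity(positiveInfinity)); // True
Console.WriteLine(double.IsNegativeInfinity(negativeInfinity)); // True
Console.WriteLine(double.IsNaN(fromZeroOverZero));              // True
Console.WriteLine(double.IsNaN(fromSqrt));                      // True

// NaN never compares equal to anything, not even another NaN,
// so test for it with double.IsNaN rather than ==.
Console.WriteLine(fromZeroOverZero == fromSqrt);                // False
```

Note that none of these come from a result that is merely too small; that case is the underflow discussed further down.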
However, as the commenters below noted, while decimal does in fact check for overflows, coming too close to zero is not considered an overflow, just like with floating point numbers. So if you really need to check for this, you will have to make your own * and / methods. With decimal numbers, you shouldn't really care, though.
If you need this kind of precision for multiplication and division (that is, you want your divisions to be reversible by multiplication), you should probably use rational numbers instead - two integers (big integers if necessary). And use a checked context - that will produce an exception on overflow.
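To make the rational-number suggestion concrete, here is a minimal sketch assuming System.Numerics.BigInteger is available and C# 9+ top-level statements are in use. The helpers Reduce, Multiply and Divide are hypothetical names of mine, and a rational is simply a (numerator, denominator) tuple. With BigInteger there is nothing to overflow, so no checked context is needed; that part of the advice applies if you use fixed-size integers such as long.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: bring a fraction to lowest terms with a positive denominator.
(BigInteger Num, BigInteger Den) Reduce(BigInteger num, BigInteger den)
{
    if (den.IsZero) throw new DivideByZeroException();
    BigInteger gcd = BigInteger.GreatestCommonDivisor(num, den);
    if (den.Sign < 0) gcd = -gcd; // dividing by a negative gcd flips both signs
    return (num / gcd, den / gcd);
}

(BigInteger Num, BigInteger Den) Multiply(
    (BigInteger Num, BigInteger Den) x, (BigInteger Num, BigInteger Den) y)
    => Reduce(x.Num * y.Num, x.Den * y.Den);

(BigInteger Num, BigInteger Den) Divide(
    (BigInteger Num, BigInteger Den) x, (BigInteger Num, BigInteger Den) y)
    => Reduce(x.Num * y.Den, x.Den * y.Num);

// 10^-400 is far below anything a double can hold.
var tiny = (Num: BigInteger.One, Den: BigInteger.Pow(10, 400));
var tenth = (Num: BigInteger.One, Den: new BigInteger(10));

var smaller = Multiply(tiny, tenth);
var roundTrip = Divide(smaller, tenth);

Console.WriteLine(smaller.Num.IsZero); // False: the value did not collapse to zero
Console.WriteLine(roundTrip == tiny);  // True: the multiplication was reversible
```

The price is speed and memory: the numerators and denominators grow as the calculation proceeds.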
IEEE 754 in fact does handle underflow. There are two problems:
The return value is 0 (or -0 for negative underflow). The exception flag for underflow is set, but there's no way to read it in .NET.
This only occurs for the loss of precision when you get too close to zero. But you lost most of your precision long before that. Whatever "precise" number you had is long gone - the operations are not reversible, and they are not precise.
So if you really do care about reversibility etc., stick to rational numbers. Neither decimal nor double will work, C# or not. If you don't need that kind of precision, you shouldn't care about underflows anyway - just pick the lowest reasonable number, and declare anything under that as "invalid"; make sure you're far away from the actual limit of precision - double.Epsilon will not help, obviously.
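If you do decide to write your own multiplication, it might look something like the sketch below. MultiplyOrNaN is a hypothetical helper of mine, not a BCL method, and the threshold is the smallest positive normal double (2^-1022), written out as a literal. Treating every subnormal result as "lost" is my assumption; you may prefer to flag only an exact zero.

```csharp
using System;

// Hypothetical helper: multiply, but return NaN when the result underflowed.
double MultiplyOrNaN(double x, double y)
{
    // Smallest positive normal double; anything below it has reduced precision.
    const double MinNormal = 2.2250738585072014E-308;

    double product = x * y;
    bool operandsNonZero = x != 0.0 && y != 0.0;

    // A zero or subnormal product of two non-zero operands means underflow.
    if (operandsNonZero && Math.Abs(product) < MinNormal)
    {
        return double.NaN;
    }
    return product;
}

double a = 9.88131291682493E-324;
double b = MultiplyOrNaN(a, 0.1);

Console.WriteLine(double.IsNaN(b));          // True: the plain product would be 0
Console.WriteLine(MultiplyOrNaN(0.5, 0.25)); // 0.125
Console.WriteLine(MultiplyOrNaN(0.0, 5.0));  // 0: a genuine zero is left alone
```

Because NaN propagates through later arithmetic, a single underflow stays visible in the final result.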

All you need is epsilon.
This is a "small number", small enough that you're no longer interested in anything below it.
You could use:
double epsilon = 1E-50;
and whenever one of your factors gets smaller than epsilon you take action (for example, treat it like 0.0).
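A minimal sketch of that idea, assuming C# 9+ top-level statements. The 1E-50 floor is the value from above; IsMeaningful and the loop are only illustrative.

```csharp
using System;

const double Floor = 1E-50; // "epsilon" from above; pick whatever suits your data

// Hypothetical helper: false once a value is too small to be trusted.
bool IsMeaningful(double value) => Math.Abs(value) >= Floor;

double p = 1.0;
int steps = 0;

// Keep multiplying in a small factor until the running product drops below the floor.
while (IsMeaningful(p))
{
    p *= 1E-7;
    steps++;
}

Console.WriteLine(steps);           // 8: the product is about 1E-56 at this point
Console.WriteLine(IsMeaningful(p)); // False, long before double itself gives up
```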

Related

Does multiplying a floating point number by 0 always yield 0?

Floating point numbers are tricky, since many natural arithmetic properties don't hold.
I SUPPOSE that this particular property holds nevertheless, but I prefer to ask rather than be hit by hard to detect errors.
Assume that d is an arbitrary variable of type double. May I assume that after the following operation:
d *= 0;
The following check will always return true?
d == 0
I suppose that this will not hold if d is positive / negative infinity or NaN. However, are there any other problems I need to be aware of?
I hear that in floating point there are actually two zeroes, namely +0 and -0. If d is negative in the beginning, will d *= 0 return -0 instead? If so, will it be equal to 0?
I hear that floating point operations are subjected to inaccuracy. Thus, is it possible that multiplying by 0 will instead return something like 0.0000001 or -0.000001 which will not be equal to 0? My assumption is that this will likely be impossible, but I don't have enough knowledge to back this assumption up so I prefer to ask.
Any other problems I didn't foresee?

The short answer is yes, given your specific example, d will be 0. This is because 0 has an exact representation in every FP model, so when you multiply a double by the value '0' (or '0.0'), the result is not subject to rounding/truncation errors.
The issues you mentioned come into play during FP arithmetic as a result of the inherent approximations that occur when you have a finite resolution.
For example, the most accurate 64-bit representation of the value 0.002 in the IEEE-754 floating point standard is 0x3F60624D_D2F1A9FC which amounts to 2.00000000000000004163336342344E-3 (source: http://www.binaryconvert.com/convert_double.html)
The value 0 does not suffer from this loss of precision. However, obviously if you were to do something like this, you would run into problems:
double d = 0.009;
d -= (0.001 * 9);
if (d == 0)
{
    // This check will fail
}
This is why exact equality comparisons between floating point values are almost always advised against.
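For the specific cases raised in the question (negative values, infinities and NaN), a small sketch along these lines can be used to check the behaviour. It assumes C# 9+ top-level statements, and the variable names are mine.

```csharp
using System;

double positive = 42.5;
double negative = -3.5;
positive *= 0; // +0.0
negative *= 0; // -0.0

Console.WriteLine(positive == 0); // True
Console.WriteLine(negative == 0); // True: -0.0 compares equal to 0

// The sign of the zero is still there, and shows up when dividing by it.
Console.WriteLine(double.IsNegativeInfinity(1.0 / negative)); // True

// Infinities and NaN are the exceptions: multiplying them by 0 gives NaN.
double infinity = double.PositiveInfinity;
double notANumber = double.NaN;
Console.WriteLine(double.IsNaN(infinity * 0));   // True
Console.WriteLine(double.IsNaN(notANumber * 0)); // True
```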

Why .Net supports 'infinity' value in floats? [duplicate]

I'm just curious, why in IEEE-754 any non zero float number divided by zero results in infinite value? It's a nonsense from the mathematical perspective. So I think that correct result for this operation is NaN.
Function f(x) = 1/x is not defined when x=0, if x is a real number. For example, function sqrt is not defined for any negative number and sqrt(-1.0f) in IEEE-754 produces a NaN value. But 1.0f/0 is Inf.
But for some reason this is not the case in IEEE-754. There must be a reason for this, maybe some optimization or compatibility reasons.
So what's the point?

It's a nonsense from the mathematical perspective.
Yes. No. Sort of.
The thing is: Floating-point numbers are approximations. You want to use a wide range of exponents and a limited number of digits and get results which are not completely wrong. :)
The idea behind IEEE-754 is that every operation could trigger "traps" which indicate possible problems. They are
Illegal (senseless operation like sqrt of negative number)
Overflow (too big)
Underflow (too small)
Division by zero (The thing you do not like)
Inexact (This operation may give you wrong results because you are losing precision)
Now, many people, such as scientists and engineers, do not want to be bothered with writing trap routines. So Kahan, the primary architect of IEEE-754, decided that every operation should also return a sensible default value if no trap routines exist.
They are
NaN for illegal values
signed infinities for Overflow
signed zeroes for Underflow
NaN for indeterminate results (0/0) and infinities for x/0 where x != 0
normal operation result for Inexact
The thing is that in 99% of all cases zeroes are caused by underflow, and therefore in 99% of all cases Infinity is "correct" even if wrong from a mathematical perspective.
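The overflow and underflow defaults can be observed from C# with a sketch like the following (C# 9+ top-level statements assumed, names are mine). .NET does not expose the trap or flag side of the standard, so only the default results are visible.

```csharp
using System;

double huge = double.MaxValue;
double tiny = double.Epsilon; // smallest positive double

double overflow = huge * 2.0;          // too big: +Infinity
double negativeOverflow = -huge * 2.0; // too big, negative: -Infinity
double underflow = tiny / 4.0;         // too small: +0
double negativeUnderflow = -tiny / 4.0; // too small, negative: -0

Console.WriteLine(double.IsPositiveInfinity(overflow));         // True
Console.WriteLine(double.IsNegativeInfinity(negativeOverflow)); // True
Console.WriteLine(underflow == 0.0);                            // True

// Dividing by the underflowed value gives an infinity whose sign
// matches the sign of the tiny number that was lost.
Console.WriteLine(double.IsPositiveInfinity(1.0 / underflow));         // True
Console.WriteLine(double.IsNegativeInfinity(1.0 / negativeUnderflow)); // True
```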

I'm not sure why you would believe this to be nonsense.
The simplistic definition of a / b, at least for non-zero b, is the unique number of bs that has to be subtracted from a before you get to zero.
Expanding that to the case where b can be zero, the number that has to be subtracted from any non-zero number to get to zero is indeed infinite, because you'll never get to zero.
Another way to look at it is to talk in terms of limits. As a positive number n approaches zero, the expression 1 / n approaches "infinity". You'll notice I've quoted that word because I'm a firm believer in not propagating the delusion that infinity is actually a concrete number :-)
NaN is reserved for situations where the number cannot be represented (even approximately) by any other value (including the infinities), it is considered distinct from all those other values.
For example, 0 / 0 (using our simplistic definition above) can have any amount of bs subtracted from a to reach 0. Hence the result is indeterminate - it could be 1, 7, 42, 3.14159 or any other value.
Similarly things like the square root of a negative number, which has no value in the real plane used by IEEE754 (you have to go to the complex plane for that), cannot be represented.

In mathematics, division by zero is undefined because zero has no sign, therefore two results are equally possible, and exclusive: negative infinity or positive infinity (but not both).
In (most) computing, 0.0 has a sign. Therefore we know what direction we are approaching from, and what sign infinity would have. This is especially true when 0.0 represents a non-zero value too small to be expressed by the system, as is frequently the case.
The only time NaN would be appropriate is if the system knows with certainty that the denominator is truly, exactly zero. And it can't unless there is a special way to designate that, which would add overhead.

NOTE:
I re-wrote this following a valuable comment from @Cubic.
I think the correct answer to this has to come from calculus and the notion of limits. Consider the limit of f(x)/g(x) as x->0 under the assumption that g(0) == 0. There are two broad cases that are interesting here:
If f(0) != 0, then the limit as x->0 is either plus or minus infinity, or it's undefined. If g(x) takes both signs in the neighborhood of x==0, then the limit is undefined (left and right limits don't agree). If g(x) has only one sign near 0, however, the limit will be defined and be either positive or negative infinity. More on this later.
If f(0) == 0 as well, then the limit can be anything, including positive infinity, negative infinity, a finite number, or undefined.
In the second case, generally speaking, you cannot say anything at all. Arguably, in the second case NaN is the only viable answer.
Now in the first case, why choose one particular sign when either is possible or it might be undefined? As a practical matter, it gives you more flexibility in cases where you do know something about the sign of the denominator, at relatively little cost in the cases where you don't. You may have a formula, for example, where you know analytically that g(x) >= 0 for all x, say, for example, g(x) = x*x. In that case the limit is defined and it's infinity with sign equal to the sign of f(0). You might want to take advantage of that as a convenience in your code. In other cases, where you don't know anything about the sign of g, you cannot generally take advantage of it, but the cost here is just that you need to trap for a few extra cases - positive and negative infinity - in addition to NaN if you want to fully error check your code. There is some price there, but it's not large compared to the flexibility gained in other cases.
Why worry about general functions when the question was about "simple division"? One common reason is that if you're computing your numerator and denominator through other arithmetic operations, you accumulate round-off errors. The presence of those errors can be abstracted into the general formula format shown above. For example f(x) = x + e, where x is the analytically correct, exact answer, e represents the error from round-off, and f(x) is the floating point number that you actually have on the machine at execution.

Value of a double variable not exact after multiplying with 100 [duplicate]

This question already has answers here:
Is floating point math broken?
If I execute the following expression in C#:
double i = 10*0.69;
i is: 6.8999999999999995. Why?
I understand numbers such as 1/3 can be hard to represent in binary as it has infinite recurring decimal places but this is not the case for 0.69. And 0.69 can easily be represented in binary, one binary number for 69 and another to denote the position of the decimal place.
How do I work around this? Use the decimal type?

Because you've misunderstood floating point arithmetic and how data is stored.
In fact, your code isn't actually performing any arithmetic at execution time in this particular case - the compiler will have done it, then saved a constant in the generated executable. However, it can't store an exact value of 6.9, because that value cannot be precisely represented in floating point format, just like 1/3 can't be precisely stored in a finite decimal representation.
See if this article helps you.

why doesn't the framework work around this and hide this problem from me and give me the right answer,0.69!!!
Stop behaving like a dilbert manager, and accept that computers, though cool and awesome, have limits. In your specific case, it doesn't just "hide" the problem, because you have specifically told it not to. The language (the computer) provides alternatives to the format, that you didn't choose. You chose double, which has certain advantages over decimal, and certain downsides. Now, knowing the answer, you're upset that the downsides don't magically disappear.
As a programmer, you are responsible for hiding this downside from managers, and there are many ways to do that. However, the makers of C# have a responsibility to make floating point work correctly, and correct floating point will occasionally result in incorrect math.
So will every other number storage method, as we do not have infinite bits. Our job as programmers is to work with limited resources to make cool things happen. They got you 90% of the way there, just get the torch home.

And 0.69 can easily be represented in binary, one binary number for 69 and another to denote the position of the decimal place.
I think this is a common mistake - you're thinking of floating point numbers as if they are base-10 (i.e. decimal - hence my emphasis).
So - you're thinking that there are two whole-number parts to this double: 69 and divide by 100 to get the decimal place to move - which could also be expressed as:
69 x 10 to the power of -2.
However floats store the 'position of the point' as base-2.
Your float actually gets stored as:
68999999999999995 x 2 to the power of some big negative number
This isn't as much of a problem once you're used to it - most people know and expect that 1/3 can't be expressed accurately as a decimal or percentage. It's just that the fractions that can't be expressed in base-2 are different.
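To see what is actually stored, you can pull a double apart into its integer significand and power-of-two exponent. This is a minimal sketch assuming C# 9+ top-level statements; the variable names are mine, and it only handles normal (non-zero, non-subnormal, finite) values.

```csharp
using System;

double value = 0.69;
long bits = BitConverter.DoubleToInt64Bits(value);

// IEEE 754 double layout: 1 sign bit, 11 exponent bits, 52 fraction bits.
int rawExponent = (int)((bits >> 52) & 0x7FF);
long fraction = bits & 0xFFFFFFFFFFFFFL;

// Normal numbers carry an implicit leading 1 in front of the fraction.
long significand = fraction | (1L << 52);

// Remove the exponent bias (1023) and shift by the 52 fraction bits so that
// value == significand * 2^exponent with an integer significand.
int exponent = rawExponent - 1023 - 52;

Console.WriteLine($"{significand} x 2^{exponent}"); // 6214967485771284 x 2^-53
```

So 0.69 is stored as a whole number times a power of two, and that product lands very slightly below 0.69 rather than exactly on it.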

but why doesn't the framework work around this and hide this problem from me and give me the right answer,0.69!!!
Because you told it to use binary floating point, and the solution is to use decimal floating point, so you are suggesting that the framework should disregard the type you specified and use decimal instead, which is very much slower because it is not directly implemented in hardware.
A more efficient solution is to not output the full value of the representation and explicitly specify the accuracy required by your output. If you format the output to two decimal places, you will see the result you expect. However, if this is a financial application, decimal is precisely what you should use - you've seen Superman III (and Office Space) haven't you ;)
Note that it is all a finite approximation of an infinite range, it is merely that decimal and double use a different set of approximations. The advantage of decimal is it produces the same approximations that you would if you were performing the calculation yourself. For example if you calculated 1/3, you would eventually stop writing 3's when it was 'good enough'.
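Both options look roughly like this in C#. This is a sketch assuming C# 9+ top-level statements; the invariant culture is used only so the decimal separator is predictable.

```csharp
using System;
using System.Globalization;

// Option 1: keep double, but format the output to the precision you need.
double viaDouble = 10 * 0.69;
string formatted = viaDouble.ToString("F2", CultureInfo.InvariantCulture);
Console.WriteLine(formatted); // 6.90

// Option 2: use decimal, which stores base-10 digits, so 0.69 is held exactly.
decimal viaDecimal = 10m * 0.69m;
Console.WriteLine(viaDecimal.ToString(CultureInfo.InvariantCulture)); // 6.90
```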

For the same reason that 1 / 3 in a decimal system comes out as 0.3333333333333333333333333333333333333333333 and not the exact fraction, which is infinitely long.

To work around it (e.g. to display on screen) try this:
double i = (double) Decimal.Multiply(10, (Decimal) 0.69);
Everyone seems to have answered your first question, but ignored the second part.

Why does a parsed double not equal an initialized double supposedly of the same value?

When I execute this line:
double dParsed = double.Parse("0.00000002036");
dParsed actually gets the value: 0.000000020360000000000002
Compared to this line,
double dInitialized = 0.00000002036;
in which case the value of dInitialized is exactly 0.00000002036
This inconsistency is a trifle annoying, because I want to run tests along the lines of:
[Subject("parsing doubles")]
public class when_parsing_crazy_doubles
{
    static double dInitialized = 0.00000002036;
    static double dParsed;

    Because of = () => dParsed = double.Parse("0.00000002036");
    It should_match = () => dParsed.ShouldBeLike(dInitialized);
}
This of course fails with:
Machine.Specifications.SpecificationException
"":
Expected: [2.036E-08]
But was: [2.036E-08]
In my production code, the 'parsed' doubles are read from a data file whereas the comparison values are hard coded as object initializers. Over many hundreds of records, 4 or 5 of them don't match. The original data appears in the text file like this:
0.00000002036 0.90908165072 6256.77753019160
So the values being parsed have only 11 decimal places. Any ideas for working around this inconsistency?
While I accept that comparing doubles for equality is risky, I'm surprised that the compiler can get an exact representation when the text is used as an object initializer, but that double.Parse can't get an exact representation when parsing exactly the same text. How can I limit the parsed doubles to 11 decimal places?

Compared to this line,
double dInitialized = 0.00000002036;
in which case the value of dInitialized is exactly 0.00000002036
If you have anything remotely resembling a commodity computer, dInitialized is not initialized as exactly 0.00000002036. It can't be because the base 10 number 0.00000002036 does not have a finite representation in base 2.
Your mistake is expecting two doubles to compare equal. That's usually not a good idea. Unless you have very good reasons and know what you are doing, it is best to not compare two doubles for equality or inequality. Instead test whether the difference between the two lies within some small epsilon of zero.
Getting the size of that epsilon right is a bit tricky. If your two numbers are both small, (less than one, for example), an epsilon of 1e-15 might well be appropriate. If the numbers are large (larger than ten, for example), that small of an epsilon value is equivalent to testing for equality.
Edit: I didn't answer the question.
How can I limit the parsed doubles to 11 decimal places?
If you don't have to worry about very small values,
static double epsilon = 1e-11;
if (Math.Abs(dParsed-dInitialized) > epsilon*Math.Abs(dInitialized)) {
    noteTestAsFailed();
}
You should be able to safely change that epsilon to 4e-16.
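The same relative test can be wrapped in a reusable helper. The sketch below assumes C# 9+ top-level statements; NearlyEqual is a hypothetical name, and scaling by the larger magnitude (rather than by dInitialized alone) is my choice so that the argument order does not matter.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: true when x and y agree to within a relative tolerance.
bool NearlyEqual(double x, double y, double relativeTolerance)
{
    if (x == y) return true; // identical values, including both being zero
    if (double.IsInfinity(x) || double.IsInfinity(y)) return false;

    double scale = Math.Max(Math.Abs(x), Math.Abs(y));
    return Math.Abs(x - y) <= relativeTolerance * scale;
}

double initialized = 0.00000002036;
double parsed = double.Parse("0.00000002036", CultureInfo.InvariantCulture);

Console.WriteLine(NearlyEqual(parsed, initialized, 1e-11)); // True
Console.WriteLine(NearlyEqual(1.0, 1.1, 1e-11));            // False
```

Any comparison involving NaN returns false here, which matches how NaN behaves with ==.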
Edit #2: Why is it that the compiler and double.Parse produce different internal representations for the same text?
That's kind of obvious, isn't it? The compiler and double.Parse use different algorithms. The number in question 0.00000002036 is very close to being on the cusp of whether rounding up or rounding down should be used to yield a representable value that is within half an ULP of the desired value (0.00000002036). The "right" value is the one that is within a half an ULP of the desired value. In this case, the compiler makes the right decision of picking the rounded-down value while the parser makes the wrong decision of picking the rounded-up value.
The value 0.00000002036 is a nasty corner case. It is not an exactly representable value. The two closest values that can be represented exactly as IEEE doubles are 6153432421838462/2^78 and 6153432421838463/2^78. The value halfway between these two is 12306864843676925/2^79, which is very, very close to 0.00000002036. That's what makes this a corner case. I suspect all of the values you found where the compiled value is not identically equal to the value from double.Parse are corner cases, cases where the desired value is almost halfway between the two closest exactly representable values.
Edit #3:
Here are a number of different ways to interpret 0.00000002036:
2/1e8 + 3/1e10 + 6/1e11
2*1e-8 + 3*1e-10 + 6*1e-11
2.036 * 1e-8
2.036 / 1e8
2036 * 1e-11
2036 / 1e11
On an ideal computer all of these will be the same. Don't count on that being the case on a computer that uses finite precision arithmetic.
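If you want to see how these behave on your own machine, a sketch like this prints each one next to the literal (C# 9+ top-level statements assumed). I deliberately give no expected output: which of them match the literal exactly depends on how each intermediate step rounds.

```csharp
using System;

double literal = 0.00000002036;

// The six interpretations listed above, evaluated in double arithmetic.
double[] candidates =
{
    2 / 1e8 + 3 / 1e10 + 6 / 1e11,
    2 * 1e-8 + 3 * 1e-10 + 6 * 1e-11,
    2.036 * 1e-8,
    2.036 / 1e8,
    2036 * 1e-11,
    2036 / 1e11,
};

foreach (double candidate in candidates)
{
    // "R" requests a string that round-trips back to the same double.
    Console.WriteLine($"{candidate:R}  equal to literal: {candidate == literal}");
}
```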

Division by zero: int vs. float

Dividing an int by zero will throw an exception, but dividing a float won't - at least in Java. Why does a float have additional NaN info, while an int type doesn't?

The representation of a float has been designed such that some special combinations of bits are reserved to store special values such as NaN, infinity, etc.
There are no unused representations for an int type - every bit pattern corresponds to an integer. This has many advantages:
The range of an integer type is as large as possible - no bit patterns are wasted.
The representation of an integer is easy to understand because there are no special cases.
Integer arithmetic can be done at extremely high speed even on very simple processors.

A clear explanation of float arithmetic is given here:
http://www.artima.com/underthehood/floatingP.html

I think the real reason, the root of this, is the well known fact: computers store everything in zeroes and ones.
What does it have to do with integers, floats and zero division? It's pretty simple. If you have only zeroes and ones, it is pretty easy to combine them into integer numbers, like you do with decimal digits. So "10" becomes two, "11" becomes three and so on. This kind of integer representation is so natural that no one would think of inventing anything else for integers, it would just make CPUs more complicated and things more confusing. The only "invention" that was required is to figure out how to store negative numbers, but that's also very natural and simple if you start from the point that x+(-x) should always be equal to zero, without using any special kind of addition here. That's why 11111111 is -1 for 8-bit integers, because if you add 1 to it, it becomes 100000000, then the ninth bit is truncated due to overflow and you get your zero. But this natural format has no place for infinities and NaNs, and nobody wanted to invent a non-natural representation just for that. Well, I wouldn't be surprised if someone actually did that, but there is no way such a format would become well-known and widely used.
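The 8-bit example can be checked directly. The thread is about Java, but here is a sketch in C# (C# 9+ top-level statements assumed), where byte is the unsigned and sbyte the signed 8-bit type.

```csharp
using System;

byte allOnes = 0b1111_1111; // 255 when read as unsigned

// The same eight bits read as a signed two's complement value.
sbyte asSigned = unchecked((sbyte)allOnes);

// Adding 1 gives 1_0000_0000 (nine bits); the cast keeps only the low eight.
byte wrapped = unchecked((byte)(allOnes + 1));

Console.WriteLine(asSigned); // -1
Console.WriteLine(wrapped);  // 0
```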
Now, for floating-point numbers, there is no natural representation. Even if we translate 0.5 to binary, it would still be something like 0.1, only now we have a "binary point" instead of a decimal point. But CPUs can't naturally represent a "point", only 1 and 0. So some kind of special format was needed. There was simply no other way to go. And then someone probably suggested, "Hey guys, while we are at it, why not include a special representation for infinity and other numeric nonsense?" and so it was done.
This is the reason why these formats are so different. How to handle divisions by zero, it's up to language designers, but for floating-points they have the choice between inf/NaN and exceptions, while for integers they don't naturally have such kind of thing.

Basically, it's a purely arbitrary decision.
The traditional int tries to use all the bits for representing possible numbers, whereas IEEE 754 standard reserves a special value for NaN.
The standard could be changed for ints to include special values, at a cost of less efficient operations. The developers usually expect int operations to be very efficient, whereas the operations with floating point numbers are (purely psychologically) more allowed to be slower.

Ints and floats are represented differently inside the machine. Integers usually use a signed, two's complement representation that is (essentially) the number written out in base two. Floats, on the other hand, use a more complex representation that can hold much larger and much smaller values. However, the machine reserves several special bit patterns for floats to mean things other than numbers. There are values for NaN, and for positive or negative infinity, for example. This means that if you divide a float by zero, there is a series of bits that the computer can use to encode that you divided by zero. For ints, all bit patterns are used to encode numbers, so there's no meaningful series of bits the computer could use to represent the error.
This isn't an essential property of ints, though. One could, in theory, make an integer representation that handles division by zero by returning some NaN variant. It's just not what's done in practice.

Java reflects the way most CPUs are implemented. Integer divide by zero causes an interrupt on x86/x64 and Floating point divide by zero results in Infinity, Negative infinity or NaN. Note: with floating point you can also divide by negative zero. :P
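The question is about Java, but C# draws the same line between the two kinds of division, so here is a minimal sketch in C# (C# 9+ top-level statements assumed, names are mine). The divisors are variables because the C# compiler rejects a constant integer division by zero at compile time.

```csharp
using System;

int intZero = 0;
double doubleZero = 0.0;

bool intDivisionThrew = false;
try
{
    int unused = 1 / intZero; // integer division by zero
    Console.WriteLine(unused);
}
catch (DivideByZeroException)
{
    intDivisionThrew = true;
}

double byPositiveZero = 1.0 / doubleZero;  // +Infinity, no exception
double byNegativeZero = 1.0 / -doubleZero; // -Infinity: negating 0.0 gives -0.0

Console.WriteLine(intDivisionThrew);                          // True
Console.WriteLine(double.IsPositiveInfinity(byPositiveZero)); // True
Console.WriteLine(double.IsNegativeInfinity(byNegativeZero)); // True
```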
