Although my question sounds trivial, it really is NOT. Hope you can help me.
I want to implement interval arithmetic in my .NET (C#) project. This means that every number is defined by a lower bound and an upper bound. This is helpful for problems like
1 / 3 = 0.333333333333333 (15 significant digits)
since you would then have
1 / 3 = [ 0.33333333333333 , 0.33333333333334 ] (14 significant digits each)
, so I know FOR SURE that the right answer lies between those two numbers. Without the interval representation I would already be carrying a rounding error (i.e. 0.0000000000000003).
To achieve this I wrote my own Interval type that overloads all standard operators like +-*/, etc. To make this type work correctly I need to be able to round the result of 1 / 3 in two directions. Rounding the result down will give me the lower bound for my interval, rounding the result up will give me the upper bound for my interval.
.NET has the Math.Round(double,int) method which rounds a double to int decimal places. Looks great, but it can't be forced to round up or down. Math.Round(1.0/3.0,14) would round down, but the up-rounding to 0.33...34 that I also need can't be achieved like this.
But there are Math.Ceiling and Math.Floor, you might say! Okay, those methods round to the next lower or higher integer. So if I want to round to 14 decimal places I first need to rescale my result:
1 / 3 = 0.333333333333333 -> * 1E14 -> 33333333333333.3
Now I can call Math.Ceiling and Math.Floor and get both rounded results after scaling back:
33333333333333 & 33333333333334 -> / 1E14 -> 0.33333333333333 & 0.33333333333334
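In code, the approach I'm describing would look roughly like this (the helper name is made up, it's just the scale / Floor / Ceiling steps from above):

static Tuple<double, double> ScaleRoundToInterval(double value)
{
    // Sketch of the scale / Floor / Ceiling idea described above.
    // Breaks near double.MaxValue (the scaling overflows) and for inputs
    // whose binary representation was already rounded, as explained below.
    const double scale = 1E14;
    double scaled = value * scale;
    return Tuple.Create(Math.Floor(scaled) / scale, Math.Ceiling(scaled) / scale);
}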
Looks great, but: let's say my number gets near double.MaxValue. I can't just multiply a value near double.MaxValue by 1E14, since the scaling overflows (the result becomes infinity). So this is no solution either.
And, to top it all off: all of this fails even harder when trying to round 0.9999999999999999999999999 (more than 15 digits), since the internal representation has already been rounded to 1 before I can even start trying to round down.
I could try to somehow parse a string containing the double but this won't help since (1/3 * 3).ToString() will already print 1 instead of 0.99...9.
Decimal does not work either, since I don't need that much precision - 14 digits are enough - but I still want the full double range!
In C++, where several interval arithmetic implementations exist, this problem can be solved by telling the processor to dynamically switch its rounding mode to, for example, "always down" or "always up". I couldn't find any way to do this in .NET.
So, do you have any ideas?
Thanks in advance!
Assume nextDown(x) is a function that returns the largest double that is less than x, and nextUp(x) is a function that returns the smallest double that is greater than x. See Get next smallest Double number for implementation ideas.
Where you would have rounded a lower bound result down, instead use the nextDown of the round-to-nearest result. Where you would have rounded an upper bound up, use the nextUp of the round-to-nearest result.
This method ensures the interval continues to contain the exact real number result. It introduces extra rounding error - in some cases the lower bound will be one ULP smaller than it should be, and/or the upper bound will be one ULP bigger. However, it is a minimal widening of the interval, much less widening than you would get working in decimal or by suppressing low significance bits.
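For illustration, here is a minimal sketch of what nextDown/nextUp could look like in C# by stepping the raw bit pattern (these are not BCL methods, and edge cases such as NaN, the infinities and double.MaxValue are glossed over):

static double NextUp(double x)
{
    // Smallest double greater than x (sketch; ignores NaN/infinity edge cases).
    if (x == 0.0) return double.Epsilon;
    long bits = BitConverter.DoubleToInt64Bits(x);
    // For positive values the next larger double has the next higher bit
    // pattern; for negative values it has the next lower one.
    return BitConverter.Int64BitsToDouble(x > 0 ? bits + 1 : bits - 1);
}

static double NextDown(double x)
{
    // Largest double less than x.
    return -NextUp(-x);
}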
This might be more like a long comment than a real answer.
This code returns an "interval" (I just use Tuple<,>, you can use your own Interval type) based on truncating the seven least significant bits:
static Tuple<double, double> GetMinMaxIntervalBasedOnBinaryNumbersThatAreRoundOnLastSevenBits(double number)
{
    if (double.IsInfinity(number) || double.IsNaN(number))
        return Tuple.Create(number, number); // maybe treat this case differently

    var i = BitConverter.DoubleToInt64Bits(number);

    const int numberOfBitsToClear = 7; // your seven, can change this value, must be below 52
    const long precision = 1L << numberOfBitsToClear;
    const long bitMask = ~(precision - 1L);

    // truncate i
    i &= bitMask;

    return Tuple.Create(BitConverter.Int64BitsToDouble(i), BitConverter.Int64BitsToDouble(i + precision));
}
Disclaimer: I am not sure if this is useful for any purpose. In particular not sure it is useful for interval arithmetic.
With this code, GetMinMaxIntervalBasedOnBinaryNumbersThatAreRoundOnLastSevenBits(1.0 / 3.0) returns the tuple (0.333333333333329, 0.333333333333336).
This code, just like the code you ask for in your question, has the obvious "issue" that if the original value is close to (or even equal to) one of the "round" numbers we use, then the returned interval is "skewed", with the original number being close to one of the ends of the interval. For example, with input 42.0 (already round), you get out the tuple (42, 42.0000000000009).
One good thing about this code is I expect it to be extremely fast.
When I run the following code, I get 0 printed on both lines:
Double a = 9.88131291682493E-324;
Double b = a * 0.1D;
Console.WriteLine(b);
Console.WriteLine(BitConverter.DoubleToInt64Bits(b));
I would expect to get Double.NaN if an operation's result goes out of range. Instead I get 0. It looks like, to be able to detect when this happens, I have to check:
Before the operation, check whether either operand is zero.
After the operation, if neither operand was zero, check whether the result is zero. If it isn't, let it run. If it is zero, assign Double.NaN to it instead, to indicate that it's not really a zero - it's just a result that can't be represented in this variable.
That's rather unwieldy. Is there a better way? What is Double.NaN designed for? I'm assuming some operations must return it; surely the designers did not put it there just in case? Is it possible that this is a bug in the BCL? (I know it's unlikely, but that's why I'd like to understand how Double.NaN is supposed to work.)
Update
By the way, this problem is not specific to double. decimal exposes it all the same:
Decimal a = 0.0000000000000000000000000001m;
Decimal b = a * 0.1m;
Console.WriteLine(b);
That also gives zero.
In my case I need double, because I need the range they provide (I'm working on probabilistic calculations) and I'm not that worried about precision.
What I need, though, is to be able to detect when my results stop meaning anything - that is, when the calculations drive the value so low that it can no longer be represented by a double.
Is there a practical way of detecting this?
Double works exactly according to the floating-point specification, IEEE 754. So no, it's not an error in the BCL - it's just the way IEEE 754 floating-point numbers work.
The reason, of course, is that it's not what floats are designed for at all. Instead, you might want to use decimal, which is a precise decimal number, unlike float/double.
There are a few special values in floating-point numbers, with different meanings:
Infinity - e.g. 1f / 0f.
-Infinity - e.g. -1f / 0f.
NaN - e.g. 0f / 0f or Math.Sqrt(-1)
However, as the commenters below noted, while decimal does in fact check for overflows, coming too close to zero is not considered an overflow, just like with floating point numbers. So if you really need to check for this, you will have to make your own * and / methods. With decimal numbers, you shouldn't really care, though.
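For example, a custom multiply that flags underflow might look roughly like this (the method name is made up; it is just the check described above, applied to double):

static double MultiplyOrNaN(double a, double b)
{
    double result = a * b;
    // The product can only be zero with two non-zero operands if it underflowed.
    if (result == 0.0 && a != 0.0 && b != 0.0)
        return double.NaN;
    return result;
}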
If you need this kind of precision for multiplication and division (that is, you want your divisions to be reversible by multiplication), you should probably use rational numbers instead - two integers (big integers if necessary). And use a checked context - that will produce an exception on overflow.
IEEE 754 in fact does handle underflow. There are two problems:
The return value is 0 (or -0 for negative underflow). The exception flag for underflow is set, but there's no way to get at it in .NET.
This only occurs for the loss of precision when you get too close to zero. But you lost most of your precision long before that. Whatever "precise" number you had is long gone - the operations are not reversible, and they are not precise.
So if you really do care about reversibility etc., stick to rational numbers. Neither decimal nor double will work, C# or not. If you're not that precise, you shouldn't care about underflows anyway - just pick the lowest reasonable number, and declare anything under that as "invalid"; make sure you're far away from the actual maximum precision - double.Epsilon will not help, obviously.
All you need is epsilon.
This is a "small number" which is small enough so you're no longer interested in.
You could use:
double epsilon = 1E-50;
and whenever one of your factors gets smaller than epsilon you take action (for example, treat it like 0.0).
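For example (the helper name is made up):

const double epsilon = 1E-50;

// Treat anything smaller in magnitude than epsilon as zero.
static double FlushTiny(double value)
{
    return Math.Abs(value) < epsilon ? 0.0 : value;
}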
When I execute this line:
double dParsed = double.Parse("0.00000002036");
dParsed actually gets the value: 0.000000020360000000000002
Compared to this line,
double dInitialized = 0.00000002036;
in which case the value of dInitialized is exactly 0.00000002036
Here they are in the debugger:
This inconsistency is a trifle annoying, because I want to run tests along the lines of:
[Subject("parsing doubles")]
public class when_parsing_crazy_doubles
{
static double dInitialized = 0.00000002036;
static double dParsed;
Because of = () => dParsed = double.Parse("0.00000002036");
It should_match = () => dParsed.ShouldBeLike(dInitialized);
}
This of course fails with:
Machine.Specifications.SpecificationException
"":
Expected: [2.036E-08]
But was: [2.036E-08]
In my production code, the 'parsed' doubles are read from a data file whereas the comparison values are hard coded as object initializers. Over many hundreds of records, 4 or 5 of them don't match. The original data appears in the text file like this:
0.00000002036 0.90908165072 6256.77753019160
So the values being parsed have only 11 decimal places. Any ideas for working around this inconsistency?
While I accept that comparing doubles for equality is risky, I'm surprised that the compiler can get an exact representation when the text is used as an object initializer, but that double.Parse can't get an exact representation when parsing exactly the same text. How can I limit the parsed doubles to 11 decimal places?
Compared to this line,
double dInitialized = 0.00000002036;
in which case the value of dInitialized is exactly 0.00000002036
If you have anything remotely resembling a commodity computer, dInitialized is not initialized as exactly 0.00000002036. It can't be because the base 10 number 0.00000002036 does not have a finite representation in base 2.
Your mistake is expecting two doubles to compare equal. That's usually not a good idea. Unless you have very good reasons and know what you are doing, it is best to not compare two doubles for equality or inequality. Instead test whether the difference between the two lies within some small epsilon of zero.
Getting the size of that epsilon right is a bit tricky. If your two numbers are both small, (less than one, for example), an epsilon of 1e-15 might well be appropriate. If the numbers are large (larger than ten, for example), that small of an epsilon value is equivalent to testing for equality.
Edit: I didn't answer the question.
How can I limit the parsed doubles to 11 decimal places?
If you don't have to worry about very small values,
static double epsilon = 1e-11;

if (Math.Abs(dParsed - dInitialized) > epsilon * Math.Abs(dInitialized)) {
    noteTestAsFailed();
}
You should be able to safely change that epsilon to 4e-16.
Edit #2: Why is it that the compiler and double.Parse produce different internal representations for the same text?
That's kind of obvious, isn't it? The compiler and double.Parse use different algorithms. The number in question, 0.00000002036, is very close to the cusp of whether rounding up or rounding down yields the representable value that is within half an ULP of the desired value. The "right" value is the one within half an ULP of the desired value. In this case, the compiler makes the right decision of picking the rounded-down value, while the parser makes the wrong decision of picking the rounded-up value.
The value 0.00000002036 is a nasty corner case. It is not an exactly representable value. The two closest values that can be represented exactly as IEEE doubles are 6153432421838462/2^78 and 6153432421838463/2^78. The value halfway between these two is 12306864843676925/2^79, which is very, very close to 0.00000002036. That's what makes this a corner case. I suspect all of the values you found where the compiled value is not identically equal to the value from double.Parse are corner cases, cases where the desired value is almost halfway between the two closest exactly representable values.
Edit #3:
Here are a number of different ways to interpret 0.00000002036:
2/1e8 + 3/1e10 + 6/1e11
2*1e-8 + 3*1e-10 + 6*1e-11
2.036 * 1e-8
2.036 / 1e8
2036 * 1e-11
2036 / 1e11
On an ideal computer all of these will be the same. Don't count on that being the case on a computer that uses finite precision arithmetic.
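If you want to check what your particular machine and runtime do, here is a small sketch that prints the raw bit patterns of the compiled literal, the parsed string and a couple of the interpretations above (on the runtime where the asker saw the discrepancy, two of these will differ in the last bit; on other runtimes they may all agree):

static void DumpBits(string label, double d)
{
    // Print the round-trip value and the underlying 64-bit pattern.
    Console.WriteLine("{0,-20} {1:R}  0x{2:X16}", label, d, BitConverter.DoubleToInt64Bits(d));
}

// e.g.
DumpBits("compiler literal", 0.00000002036);
DumpBits("double.Parse", double.Parse("0.00000002036"));
DumpBits("2.036 * 1e-8", 2.036 * 1e-8);
DumpBits("2036 / 1e11", 2036 / 1e11);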
I have a database table that needs to be converted into current form. This table has three columns that are of type Double (it's Pervasive.SQL, if anyone cares).
My problem is that this table has been around for a long time, and it's been acted upon by code going back some 15 years or better.
Historically, we have always used Double.MinValue (or whatever language equivalent at the time) to represent "blank" values provided by the user. The absence of a value, in other words, is actually stored as a value that we can recognize later and react to intelligently.
So, today my problem is that I need to loop through these records and insert them into a newly created table (this is the "conversion" I spoke of). However, I am not seeing consistent values in the tables I am converting. Here are the ones I know of for sure:
2.2250738585072014E-308
3.99285938963E-313
3.99099435427E-313
1.1125369292536007E-308
-5.389000690742776E279
2.104687961E-314
Now, I recognize that there are other ways that Double.MinValue might exist or at least be represented. Having done some google searches, I found that the first one is another representation of Double.MinValue (actually DBL_MIN referenced here: http://msdn.microsoft.com/en-us/library/6bs3y5ya(v=vs.100).aspx).
I don't want to get too long-winded, so I'll solicit questions if this is not enough information to help me. Suffice it to say, I need a reliable way of spotting all of the previous values of "minimum" and replacing them with the C# Double.MinValue constant as I loop over these data rows.
If it proves to be dataRow["Value"] < someConstant, then so be it. But I'll let the math theorists help me out with that determination.
Thank you for the time.
EDIT:
Here's what I am doing with these values as I find them. It's part of a generic method that assembles values to be written to the database:
else if (c.DataType == typeof(System.Double))
{
    if (inRow[c] == DBNull.Value)
        retString += @"NULL";
    else
    {
        Double d;
        if (Double.TryParse(inRow[c].ToString(), out d))
            retString += d.ToStringFull();
    }
}
Until now, it simply accepted them. And that's bad, because when the application finds them they look like acceptable data rather than Double.MinValue, and therefore aren't treated as blanks. But that's what they are.
This is utter craziness. Let's look at some of those numbers in detail. These are all tiny numbers just barely larger than zero:
2.2250738585072014E-308
This is 1/2^1022 -- it is a normal double. This is one of the two "special" numbers in your set; it is the smallest normal double that is larger than zero. The rest of the small doubles on your list are subnormal doubles.
1.1125369292536007E-308
This is 1/2^1023 -- it is a subnormal double. This is also a special number; it is half the smallest normal double larger than zero. (I originally said that it was the largest subnormal double but of course that is not true; see the comments.)
3.99285938963E-313
This isn't anything special. It's a subnormal double equal to a fraction where the numerator is 154145 and the denominator is a rather large power of two.
3.99099435427E-313
This isn't anything special either. This time the numerator is 154073.
2.104687961E-314
This isn't anything special either. The numerator is 2129967929 and the denominator is an even larger power of two.
All the numbers so far have been very close to zero and positive. This number is very far from zero and negative, and therefore stands out:
-5.389000690742776E279
But again it is nothing special; it is nowhere even close to the negative double with the largest absolute value, which is about -1.79E308, roughly 10^28 times larger in magnitude.
This is a complete mess.
My advice is stop this madness immediately. It makes absolutely no sense to use values that are incredibly close to zero to represent "blank" values; values that are incredibly close to zero should be rounded to zero, not treated as blanks!
Double already has a representation for "blank" values, namely Double.NaN -- Not A Number; it is bizarre to use a valid value to represent an invalid value when the domain already includes a specific "invalid" value. (Remember that there are actually a large number of distinct NaN bit patterns; use IsNaN to determine whether a double is a NaN.)
So my advice is:
Examine individually every number in the database that is a subnormal or very small normal double. Some of those probably ought to be zero and ended up as tiny values due to rounding errors. Replace them with zero. The ones that ought to be blank, replace with database null (best practice) or double NaN (acceptable, but not as good as database null.)
Write a program to find every number in the database that is impossibly large in absolute value and replace it with database null or double NaN.
Update all clients so that they understand the convention you're using to represent blank values.
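As a rough starting point for that scan, something along these lines might do (the name and thresholds are illustrative only and would need to be tuned against your actual data):

static bool LooksLikeBlankSentinel(double d)
{
    // Flag values that are plausibly corrupted "blank" markers rather than real
    // data: NaN, infinities, near-subnormal positives, or absurdly large magnitudes.
    if (double.IsNaN(d) || double.IsInfinity(d))
        return true;
    double abs = Math.Abs(d);
    if (abs > 0.0 && abs < 1e-300)   // down in or near the subnormal range
        return true;
    if (abs > 1e270)                 // implausibly huge for the domain
        return true;
    return false;
}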
You seem to want to check if a double is really small and positive or really big, finite, and negative. (Others have detailed some problems with your approach in the comments; I'm not going to go into that here.) A test like this:
if (d == d && (d > 0 && d < 1e-290 || d < -1e270 && d + d != d))
might do roughly what you want. You'll probably need to tweak the numbers above. The d == d test is checking for NaN, while the d + d != d test is checking for infinities.
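An equivalent formulation using the BCL helpers, if you find it more readable (same thresholds, just a sketch):

if (!double.IsNaN(d) && !double.IsInfinity(d)
    && (d > 0 && d < 1e-290 || d < -1e270))
{
    // probably one of the legacy "blank" markers
}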
I shouldn't be getting negative numbers, but I do - see the screenshot below:
Here is the code:
for (double i = 8.0; i <= 12; i = i + 0.5)
{
    double aa = i - Convert.ToInt32(i);
    Console.WriteLine(" " + i + " " + aa);
}
If you check the documentation:
Return Value
Type: System.Int32
value, rounded to the nearest 32-bit signed integer. If value is halfway between two whole numbers, the even number is returned; that is, 4.5 is converted to 4, and 5.5 is converted to 6.
This means that every other number will round up, and then down, then up, and then down, which means you'll get negative numbers half the time.
The purpose of this method is to even out bias introduced by always rounding in a particular direction. Consider summing up a huge number of values, rounding them each first. If you always round up, the final sum will always be larger than summing the un-rounded values and then rounding the sum. However, if you round half up and half down according to the rule laid out above, the final sum of the rounded numbers is more likely to be closer to a rounded sum.
You can also read more about this on Wikipedia: Round. It is sometimes called banker's rounding, although as far as I know banks don't use this method.
To ensure you're rounding as you want to:
Down: Math.Floor(Double)
Up: Math.Ceiling(Double)
Even/AwayFromZero: Math.Round(Double, MidpointRounding)
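Applied to the loop from the question, a small sketch: Math.Floor keeps the fractional part non-negative, and MidpointRounding.AwayFromZero is shown for comparison:

for (double i = 8.0; i <= 12; i = i + 0.5)
{
    double fractional = i - Math.Floor(i);                              // always >= 0
    double awayFromZero = Math.Round(i, MidpointRounding.AwayFromZero); // 8.5 -> 9, 9.5 -> 10
    Console.WriteLine(i + " " + fractional + " " + awayFromZero);
}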
I don't know what you would expect, but the double is rounded in this case, not truncated.
value: rounded to the nearest 32-bit signed integer. If value is halfway between two whole numbers, the even number is returned; that is, 4.5 is converted to 4, and 5.5 is converted to 6.
Check the Convert.ToInt32(double) documentation.
Try this; it solves your problem with the negative values:
for (double i = 8.0; i <= 12; i = i + 0.5)
{
    double aa = Convert.ToInt32(i);
    Console.WriteLine(aa + " " + i);
}
double aa = (i - Convert.ToInt32(i));
looks like it's alternately rounding up and down.
Not particularly surprising.
I'm learning TDD and decided to create a Calculator class to start.
I did the basics first, and now I'm on the square root function.
I'm using this method to get the root: http://www.math.com/school/subject1/lessons/S1U1L9DP.html
I tested it with a few numbers, and I always get accurate answers.
It's pretty easy to understand.
Now I'm having a weird problem: with some numbers I'm getting the right answer, and with some I don't.
I debugged the code and found out that I'm not getting the right answer when I do the subtraction.
I'm using decimals to get the most accurate result.
When I do:
18 / 4.25
I am currently getting: 4.2352941176470588235294117647
when it should be: 4.2352941176470588235294117647059 (using Windows Calculator)
At the end of the road, this is the closest I get to the root of 18:
4.2426406871192851464050688705 ^ 2 = 18.000000000000000000000022892
My question is:
Can I get more accurate than this?
4.2352941176470588235294117647 contains 29 digits.
decimal is defined to have 28-29 significant digits. You can't store a more accurate number in a decimal.
What field of engineering or science are you working in where the 30th and more digits are significant to the accuracy of the overall calculation?
(It would also, possibly, help if you'd shown some more actual code. The only code you've shown is 18 / 4.25, which can't be an actual expression in your code, since the second number is a double literal, and you can't assign the result of this expression to a decimal without a cast).
If you need arbitrary precision, then there isn't a standard "BigRational" type, but there is a BigInteger. You could use that to construct a BigRational type if you need that (storing numerator and denominator as two separate integers). One guess of why there isn't a standard type yet is that decisions on when to e.g. normalize such rationals may affect performance or equality comparisons.
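If you do go down that road, a bare-bones sketch of such a type built on BigInteger might look like this (BigRational here is a made-up type, not part of the BCL; only * and / are shown, normalized via the GCD):

using System;
using System.Numerics;

struct BigRational
{
    public readonly BigInteger Numerator;
    public readonly BigInteger Denominator;

    public BigRational(BigInteger numerator, BigInteger denominator)
    {
        if (denominator.IsZero)
            throw new DivideByZeroException();
        // Normalize: keep the denominator positive and reduce by the GCD.
        if (denominator.Sign < 0) { numerator = -numerator; denominator = -denominator; }
        BigInteger gcd = BigInteger.GreatestCommonDivisor(numerator, denominator);
        if (!gcd.IsOne) { numerator /= gcd; denominator /= gcd; }
        Numerator = numerator;
        Denominator = denominator;
    }

    public static BigRational operator *(BigRational a, BigRational b)
    {
        return new BigRational(a.Numerator * b.Numerator, a.Denominator * b.Denominator);
    }

    public static BigRational operator /(BigRational a, BigRational b)
    {
        return new BigRational(a.Numerator * b.Denominator, a.Denominator * b.Numerator);
    }
}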
Floating point calculations are not accurate. Decimals make the accuracy better, because they are 128 bits long, but they are still floating point numbers.
Comparing two floating point numbers is not done with ==, but rather:
static bool SameDecimal(decimal a, decimal b)
{
    return Math.Abs(a - b) < 1e-10m;
}
This method will allow you to compare two decimals (I assume 1e-10 is a small enough difference for you, it should be for everyday uses).
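For example, with the numbers from the question (just to illustrate the usage):

decimal root = 4.2426406871192851464050688705m;
// root * root comes out as 18.000000000000000000000022892, which is
// within 1e-10 of 18, so the comparison succeeds.
Console.WriteLine(SameDecimal(root * root, 18m)); // True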