Why am I losing data when going from double to decimal? - C#

I have the following code which I can't change...
public static decimal Convert(decimal value, Measurement currentMeasurement, Measurement targetMeasurement, bool roundResult = true)
{
double result = Convert(System.Convert.ToDouble(value), currentMeasurement, targetMeasurement, roundResult);
return System.Convert.ToDecimal(result);
}
Now result is returned as -23.333333333333336, but once the conversion to a decimal takes place it becomes -23.3333333333333M.
I thought decimals could hold bigger values and were hence more accurate, so how am I losing data going from double to decimal?

This is by design. Quoting from the documentation of Convert.ToDecimal:
The Decimal value returned by this method contains a maximum of 15 significant digits. If the value parameter contains more than 15 significant digits, it is rounded using rounding to nearest. The following example illustrates how the Convert.ToDecimal(Double) method uses rounding to nearest to return a Decimal value with 15 significant digits.
Console.WriteLine(Convert.ToDecimal(123456789012345500.12D)); // Displays 123456789012346000
Console.WriteLine(Convert.ToDecimal(123456789012346500.12D)); // Displays 123456789012346000
Console.WriteLine(Convert.ToDecimal(10030.12345678905D)); // Displays 10030.123456789
Console.WriteLine(Convert.ToDecimal(10030.12345678915D)); // Displays 10030.1234567892
The reason for this is mostly that double can only guarantee 15 decimal digits of precision, anyway. Everything that's displayed after them (converted to a string it's 17 digits because that's what double uses internally and because that's the number you might need to exactly reconstruct every possible double value from a string) is not guaranteed to be part of the exact value being represented. So Convert takes the only sensible route and rounds them away. After all, if you have a type that can represent decimal values exactly, you wouldn't want to start with digits that are inaccurate.
So you're not losing data, per se. In fact, you're only losing garbage. Garbage you thought was data.
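To see the two precision levels side by side, here is a minimal sketch using the value from the question (the exact output strings may vary slightly between .NET versions):
double result = -23.333333333333336;                  // the value from the question
Console.WriteLine(result.ToString("R"));              // -23.333333333333336 (17 digits, needed only to round-trip the bits)
Console.WriteLine(result.ToString("G15"));            // -23.3333333333333 (the 15 digits double actually guarantees)
Console.WriteLine(System.Convert.ToDecimal(result));  // -23.3333333333333 (ToDecimal keeps at most 15 significant digits)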
EDIT: To clarify my point from the comments: Conversions between different numeric data types may incur a loss of precision. This is especially the case between double and decimal because both types are capable of representing values the other type cannot represent. Furthermore, both double and decimal have fairly specific use cases they're intended for, which is also evident from the documentation (emphasis mine):
The Double value type represents a double-precision 64-bit number with values ranging from negative 1.79769313486232e308 to positive 1.79769313486232e308, as well as positive or negative zero, PositiveInfinity, NegativeInfinity, and not a number (NaN). It is intended to represent values that are extremely large (such as distances between planets or galaxies) or extremely small (the molecular mass of a substance in kilograms) and that often are imprecise (such as the distance from earth to another solar system). The Double type complies with the IEC 60559:1989 (IEEE 754) standard for binary floating-point arithmetic.
The Decimal value type represents decimal numbers ranging from positive 79,228,162,514,264,337,593,543,950,335 to negative 79,228,162,514,264,337,593,543,950,335. The Decimal value type is appropriate for financial calculations that require large numbers of significant integral and fractional digits and no round-off errors. The Decimal type does not eliminate the need for rounding. Rather, it minimizes errors due to rounding.
This basically means that for quantities that will not grow unreasonably large and that need an accurate decimal representation, you should use decimal (which sounds fairly obvious when written that way). In practice this most often means financial calculations, as the documentation already states.
On the other hand, in the vast majority of other cases, double is the right way to go and usually does not hurt as a default choice (languages like Lua and JavaScript get away just fine with double being the only numeric data type).
In your specific case, since you mentioned in the comments that those are temperature readings, it is very, very, very simple: just use double throughout. You have temperature readings. Wikipedia suggests that highly specialized thermometers reach around 10^-3 °C precision. Which basically means that the differences in value of around 10^-13 (!) you are worried about here are simply irrelevant. Your thermometer gives you (let's be generous) five accurate digits, and you worry about the ten digits of random garbage that come after them. Just don't.
I'm sure a physicist or other scientist might be able to chime in here with proper handling of measurements and their precision, but we were taught in school that it's utter bullshit to even give values more precise than the measurements are. And calculations potentially affect (and reduce) that precision.
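If the stray digits bother you when displaying such readings, a small sketch of the obvious fix (the three decimal places are just an assumption about the sensor's precision):
double reading = -23.333333333333336;      // raw converted reading
double display = Math.Round(reading, 3);   // keep only the precision the thermometer can actually deliver
Console.WriteLine(display);                // -23.333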

Related

Can casting a long value to double potentially cause a value to be lost [duplicate]

I have read in different posts on Stack Overflow and in the C# documentation that converting long (or any other data type representing a number) to double loses precision. This is quite obvious due to the representation of floating-point numbers.
My question is, how big is the loss of precision if I convert a larger number to double? Do I have to expect differences larger than +/- X?
The reason I would like to know this is that I have to deal with a continuous counter which is a long. This value is read by my application as a string, needs to be converted, has to be divided by e.g. 10 or some other small number, and is then processed further.
Would decimal be more appropriate for this task?
converting long (or any other data type representing a number) to double loses precision. This is quite obvious due to the representation of floating point numbers.
This is less obvious than it seems, because precision loss depends on the value of the long. For values between -2^52 and 2^52 there is no precision loss at all.
How big is the loss of precision if I convert a larger number to double? Do I have to expect differences larger than +/- X
For numbers with magnitude above 2^52 you will experience some precision loss, depending on how much above the 52-bit limit you go. If the absolute value of your long fits in, say, 58 bits, then the magnitude of your precision loss will be 58-52=6 bits, or +/-64.
Would decimal be more appropriate for this task?
decimal has a different representation than double, and it uses a different base. Since you are planning to divide your number by "small numbers", the different representations will give you different errors on division. Specifically, double will be better at handling division by powers of two (2, 4, 8, 16, etc.), because such division can be accomplished by subtracting from the exponent without touching the mantissa. Similarly, large decimals will suffer no loss of significant digits when divided by ten, hundred, etc.
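A minimal sketch of both effects (the constants are made up for illustration): a long above 2^52 loses its low bits when converted to double, while the same value survives a decimal conversion and divides by ten exactly:
long big = (1L << 58) + 1;          // 288230376151711745, needs more than double's 53 significand bits
double asDouble = big;
Console.WriteLine((long)asDouble);  // 288230376151711744, the trailing +1 is lost
decimal asDecimal = big;            // decimal holds 28-29 digits, so the full value fits
Console.WriteLine(asDecimal / 10);  // 28823037615171174.5, exact division by ten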
long
long is a 64-bit integer type and can hold values from –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (max. 19 digits).
double
double is a 64-bit floating-point type with a precision of 15 to 16 significant digits. So data can certainly be lost if your numbers are greater than ~100,000,000,000,000.
decimal
decimal is a 128-bit decimal type and can hold up to 28-29 digits. So it's always safe to cast long to decimal.
Recommendation
I would advise that you find out the exact expectations about the numbers you will be working with. Then you can make an informed decision in choosing the appropriate data type. Since you are reading your numbers from a string, isn't it possible that they will be even greater than 28 digits? In that case, none of the types listed will work for you, and you'll have to use some sort of BigInt implementation instead (such as System.Numerics.BigInteger).
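If the counter really can exceed what long and decimal can hold, System.Numerics.BigInteger is the usual fallback; a hedged sketch (the sample string is made up):
// requires: using System.Numerics;
string raw = "123456789012345678901234567890123";  // 33 digits, too large for long, double, or decimal
BigInteger counter = BigInteger.Parse(raw);
Console.WriteLine(counter / 10);                    // 12345678901234567890123456789012 (integer division, no precision loss)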

C# decimal type equality

Is the equality comparison for C# decimal types any more likely to work as we would intuitively expect than other floating point types?
I guess that depends on your intuition. I would assume that some people would think of the result of dividing 1 by 3 as the fraction 1/3, and others would think more along the lines of "Oh, 1 divided by 3 can't be represented as a decimal number, we'll have to decide how many digits to keep, let's go with 0.333".
If you think in the former way, Decimal won't help you much, but if you think in the latter way, and are explicit about rounding when needed, it is more likely that operations that are "intuitively" not subject to rounding errors in decimal, e.g. dividing by 10, will behave as you expect. This is more intuitive to most people than the behavior of a binary floating-point type, where powers of 2 behave nicely, but powers of 10 do not.
Basically, no. The Decimal type simply represents a specialised sort of floating-point number that is designed to reduce rounding error specifically in the base 10 system. That is, the internal representation of a Decimal is in fact in base 10 (denary) and not the usual binary. Hence, it is a rather more appropriate type for monetary calculations -- though not of course limited to such applications.
From the MSDN page for the structure:
The Decimal value type represents decimal numbers ranging from positive 79,228,162,514,264,337,593,543,950,335 to negative 79,228,162,514,264,337,593,543,950,335. The Decimal value type is appropriate for financial calculations requiring large numbers of significant integral and fractional digits and no round-off errors. The Decimal type does not eliminate the need for rounding. Rather, it minimizes errors due to rounding. For example, the following code produces a result of 0.9999999999999999999999999999 rather than 1.
A decimal number is a floating-point value that consists of a sign, a numeric value where each digit in the value ranges from 0 to 9, and a scaling factor that indicates the position of a floating decimal point that separates the integral and fractional parts of the numeric value.
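The "following code" the quote refers to is not reproduced above; a minimal reconstruction along the same lines:
decimal dividend = 1m;
decimal divisor = 3m;
Console.WriteLine(dividend / divisor * divisor);   // 0.9999999999999999999999999999, not 1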

How to decide what to use - double or decimal? [duplicate]

Possible Duplicate:
decimal vs double! - Which one should I use and when?
I'm using double type for price in my trading software.
I've noticed that sometimes there are odd errors.
They occur if price contains 4 digits after "dot", like 2.1234.
When I sent from my program "2.1234" on the market order appears at the price of "2.1235".
I don't use decimal because I don't need "extreme" precision. I don't need to distinguish, for example, "2.00000000003" from "2.00000000002". I need a maximum of 6 digits after the dot.
The question is - where is the line? When to use decimal?
Should I use decimal for any financial operations? Even if I need just one digit after the dot? (1.1, 1.2, etc.)
I know decimal is pretty slow so I would prefer to use double unless decimal is absolutely required.
Use decimal whenever you're dealing with quantities that you want to (and can) be represented exactly in base-10. That includes monetary values, because you want 2.1234 to be represented exactly as 2.1234.
Use double when you don't need an exact representation in base-10. This is usually good for handling measurements, because those are already approximations, not exact quantities.
Of course, if having an exact base-10 representation is not important to you, other factors come into consideration, which may or may not matter depending on the specific situation:
double has a larger range (it can handle very large and very small magnitudes);
decimal has more precision (has more significant digits);
you may need to use double to interact with some older APIs that are not aware of decimal;
double is faster than decimal;
decimal has a larger memory footprint;
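A hedged sketch of the difference for the 2.1234 price from the question; the loop count is arbitrary, and the exact double output depends on accumulated rounding:
double dSum = 0;
decimal mSum = 0m;
for (int i = 0; i < 10000; i++)
{
    dSum += 2.1234;      // each double addend is only the nearest binary approximation of 2.1234
    mSum += 2.1234m;     // each decimal addend is exactly 2.1234
}
Console.WriteLine(dSum); // something close to, but usually not exactly, 21234
Console.WriteLine(mSum); // 21234.0000 exactly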
When accuracy is needed and important, use decimal.
When accuracy is not that important, then you can use double.
In your case, you should be using decimal, as it's a financial matter.
For financial operations I always use the decimal type.
Use decimal; it's built to represent base-10 values (i.e. prices) well.
Decimal is the way to go when dealing with prices.
If it's financial software you should probably use decimal. This wiki article summarises quite nicely.
A simple response is in this example:
decimal d = 0.3M+0.3M+0.3M;
bool ret = d == 0.9M; // true
double db = 0.3 + 0.3 + 0.3;
bool dret = db == 0.9; // false
The test with the double fails since 0.3 in its binary (base-2) representation is periodic, so you lose precision. The decimal is stored in a base-10 representation, so you do not lose significant digits unexpectedly. Decimals are unfortunately dramatically slower than doubles. Usually we use decimal for financial calculations, where every digit has to be considered to avoid tolerance issues, and double/float for engineering.
Double is meant as a generic floating-point data type; decimal is specifically meant for money and financial domains. Even though double usually works just fine, decimal might prevent problems in some cases (e.g. rounding errors when you get to values in the billions).
There is an explanation of it on MSDN.
As soon as you start to do calculations on doubles you may get unexpected rounding problems, because a double uses a binary representation of the number while the decimal uses a decimal representation preserving the decimal digits. That is probably what you are experiencing. If you only serialize and deserialize doubles to text or a database without doing any rounding, you will actually not lose any precision.
However, decimals are much more suited to representing monetary values where you are concerned about the decimal digits (and not the binary digits that a double uses internally). But if you need to do complex calculations (e.g. integrals as used in actuarial computations) you will have to convert the decimal to double before doing the calculation, negating the advantages of using decimals.
A decimal also "remembers" how many digits it has, e.g. even though decimal 1.230 is equal to 1.23 the first is still aware of the trailing zero and can display it if formatted as text.
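A quick illustration of that scale-preserving behaviour:
decimal a = 1.230m;
decimal b = 1.23m;
Console.WriteLine(a == b);   // True: the numeric values are equal
Console.WriteLine(a);        // 1.230: the trailing zero (the scale) is remembered
Console.WriteLine(b);        // 1.23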
If you always know the maximum number of decimals you are going to have (digits after the point), then a good practice is to use fixed-point notation. That will give you an exact result while still working very fast.
The simplest way to use fixed point is to store the number as an integer count of the smallest unit. For example, if the price always has 2 decimals you would be saving the amount in cents ($12.45 is stored in an int with value 1245, which thus represents 1245 cents). With four decimals you would be storing ten-thousandths (12.3456 would be stored in an int with value 123456, representing 123456 ten-thousandths), and so on.
The disadvantage of this is that you sometimes need a conversion, for example if you are multiplying two values together (0.1 * 0.1 = 0.01 while 1 * 1 = 1; the unit has changed from tenths to hundredths). And if you are going to use some other mathematical functions you also have to take things like this into consideration.
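A hedged sketch of the cents approach described above (the rate and scale factors are made up):
long priceCents = 1245;                    // $12.45 stored as 1245 cents
long totalCents = priceCents * 3;          // multiplying by a plain integer needs no rescaling: 3735 cents
Console.WriteLine($"{totalCents / 100}.{totalCents % 100:D2}");   // 37.35

long rateTenThousandths = 750;                            // a made-up 7.50% rate (0.0750) stored in ten-thousandths
long feeCents = priceCents * rateTenThousandths / 10000;  // rescale after multiplying two scaled values: 93 cents (truncated)
Console.WriteLine(feeCents);                              // 93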
On the other hand, if the number of decimals varies a lot, using fixed point is a bad idea. And if high-precision floating-point calculations are needed, the decimal datatype was constructed for exactly that purpose.

Why is System.Math and for example MathNet.Numerics based on double?

All the methods in System.Math take double parameters and return double values. The constants are also of type double. I checked out MathNet.Numerics, and the same seems to be the case there.
Why is this? Especially for constants. Isn't decimal supposed to be more exact? Wouldn't that often be kind of useful when doing calculations?
This is a classic speed-versus-accuracy trade off.
However, keep in mind that for PI, for example, the most digits you will ever need is 41.
The largest number of digits of pi that you will ever need is 41. To compute the circumference of the universe with an error less than the diameter of a proton, you need 41 digits of pi. It seems safe to conclude that 41 digits is sufficient accuracy in pi for any circle measurement problem you're likely to encounter. Thus, in the over one trillion digits of pi computed in 2002, all digits beyond the 41st have no practical value.
In addition, decimal and double have a slightly different internal storage structure. Decimals are designed to store base-10 data, whereas doubles (and floats) are made to hold binary data. On a binary machine (like every computer in existence) a double will have fewer wasted bits when storing any number within its range.
Also consider:
System.Double: 8 bytes, approximately ±5.0e-324 to ±1.7e308, with 15 or 16 significant figures
System.Decimal: 16 bytes, approximately ±1.0e-28 to ±7.9e28, with 28 or 29 significant figures
As you can see, decimal has a smaller range, but a higher precision.
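Concretely (output formatting may differ slightly between .NET versions):
Console.WriteLine(double.MaxValue);    // roughly 1.8E+308: huge range, but only 15-17 significant digits
Console.WriteLine(decimal.MaxValue);   // 79228162514264337593543950335: smaller range, but 28-29 exact digits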
No - decimals are no more "exact" than doubles or, for that matter, any other type. The concept of "exactness" (when speaking about numerical representations in a computer) is what is wrong. Any type is absolutely 100% exact at representing some numbers. Unsigned bytes are 100% exact at representing the whole numbers from 0 to 255, but they're no good for fractions, for negatives, or for integers outside the range.
Decimals are 100% exact at representing a certain set of base-10 values. Doubles (since they store their value using binary IEEE exponential representation) are exact at representing a set of binary numbers.
Neither is any more exact than the other in general; they are simply intended for different purposes.
To elaborate a bit further, since I seem to not have been clear enough for some readers...
If you take every number which is representable as a decimal, and mark every one of them on a number line, between every adjacent pair of them there is an additional infinity of real numbers which are not representable as a decimal. The exact same statement can be made about the numbers which can be represented as a double. If you marked every decimal on the number line in blue, and every double in red, except for the integers, there would be very few places where the same value was marked in both colors.
In general, for 99.99999 % of the marks, (please don't nitpick my percentage) the blue set (decimals) is a completely different set of numbers from the red set (the doubles).
This is because our very definition of the blue set is that it is a base-10 mantissa/exponent representation, and a double is a base-2 mantissa/exponent representation. Any value represented as a base-2 mantissa and exponent (e.g. 1.00110101001 x 2^(-11101001101001)) means: take the mantissa value (1.00110101001) and multiply it by 2 raised to the power of the exponent (when the exponent is negative this is equivalent to dividing by 2 to the power of the absolute value of the exponent). This means that where the exponent is negative (or where any portion of the mantissa is a fractional binary), the number generally cannot be represented as a decimal mantissa and exponent, and vice versa.
For any arbitrary real number that falls randomly on the real number line, it will either be closer to one of the blue decimals or to one of the red doubles.
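A short sketch of the "different marks on the number line" point, using 0.1:
double tenth = 0.1;
Console.WriteLine(tenth.ToString("G17"));      // 0.10000000000000001: the nearest binary (red) mark to 0.1
Console.WriteLine(0.1m);                       // 0.1: a decimal (blue) mark that hits the value exactly
Console.WriteLine(0.1 + 0.1 + 0.1 == 0.3);     // False: the binary rounding errors do not cancel
Console.WriteLine(0.1m + 0.1m + 0.1m == 0.3m); // True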
Decimal is more precise but has less of a range. You would generally use Double for physics and mathematical calculations but you would use Decimal for financial and monetary calculations.
See the following articles on msdn for details.
Double
http://msdn.microsoft.com/en-us/library/678hzkk9.aspx
Decimal
http://msdn.microsoft.com/en-us/library/364x0z75.aspx
It seems like most of the arguments here against "it does not do what I want" boil down to "but it's faster"; well, so is ANSI C plus the GMP library, but nobody is advocating that, right?
If you particularly want to control accuracy, then there are other languages which have taken the time to implement exact precision, in a user controllable way:
http://www.doughellmann.com/PyMOTW/decimal/
If precision is really important to you, then you are probably better off using languages that mathematicians would use. If you do not like Fortran then Python is a modern alternative.
Whatever language you are working in, remember the golden rule:
Avoid mixing types...
So do convert a and b to the same type before you attempt a operator b.
If I were to hazard a guess, I'd say those functions leverage low-level math functionality (perhaps in C) that does not use decimals internally, and so returning a decimal would require a cast from double to decimal anyway. Besides, the purpose of the decimal value type is to ensure accuracy; these functions do not and cannot return 100% accurate results without infinite precision (e.g., irrational numbers).
Neither Decimal nor float nor double is good enough if you require something to be precise. Furthermore, Decimal is so expensive and overused out there that it is becoming a regular joke.
If you work in fractions and require ultimate precision, use fractions. It's the same old rule: convert once and only when necessary. Your rounding rules will also vary per app, domain and so on, but surely you can find an odd example or two where it is suitable. But again, if you want fractions and ultimate precision, the answer is not to use anything but fractions. Consider that you might want a feature of arbitrary precision as well.
The actual problem with the CLR in general is that it is oddly difficult, and plain broken, to implement a library that deals with numerics in a generic fashion, largely due to bad primitive design and the shortcomings of the most popular compiler for the platform. It's almost the same as with the Java fiasco.
double just turns out to be the best compromise covering most domains, and it works well, despite the fact that the MS JIT is still incapable of utilising CPU technology that is about 15 years old now.
Double is a built-in type. It is supported by the FPU/SSE core (formerly known as the "math coprocessor"), which is why it is blazingly fast, especially at multiplication and scientific functions.
Decimal is actually a complex structure, consisting of several integers.
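You can peek at that structure with decimal.GetBits, which returns the 96-bit integer in three ints plus a fourth int packing the sign and the scale:
int[] parts = decimal.GetBits(1.230m);       // 1.230 is stored as the integer 1230 with a scale of 3
Console.WriteLine(parts[0]);                 // 1230 (low 32 bits of the 96-bit integer)
Console.WriteLine((parts[3] >> 16) & 0xFF);  // 3: the power of ten the integer is divided by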

How to deal with the fact that most decimal fractions cannot be accurately represented in binary?

So, we know that fractions such as 0.1 cannot be accurately represented in binary, which causes precision problems (such as mentioned here: Formatting doubles for output in C#).
And we know we have the decimal type for a decimal representation of numbers... but the problem is, a lot of Math methods do not support the decimal type, so we have to convert them to double, which ruins the number again.
so what should we do?
Oh, what should we do about the fact that most decimal fractions cannot be represented in binary? Or, for that matter, that binary fractions cannot be represented in Decimal?
Or, even, that an infinity (in fact, an uncountable infinity) of real numbers in all bases cannot be accurately represented in any computerized system?
Nothing! To recall an old cliché, you can get close enough for government work... In fact, you can get close enough for any work... There is no limit to the degree of accuracy the computer can generate; it just cannot be infinite (which is what would be required for a number representation scheme to be able to represent every possible real number).
You see, any number representation scheme you can design, in any computer, can only represent a finite number of distinct real numbers with 100% accuracy. And between each adjacent pair of those numbers (those that can be represented with 100% accuracy), there will always be an infinity of other numbers that it cannot represent with 100% accuracy.
so what should we do?
We just keep on breathing. It really isn't a structural problem. We have a limited precision but usually more than enough. You just have to remember to format/round when presenting the numbers.
The problem in the following snippet is with the WriteLine(), not in the calculation(s):
double x = 6.9 - 10 * 0.69;
Console.WriteLine("x = {0}", x);
If you have a specific problem, then post it. There usually are ways to prevent loss of precision. If you really need >= 30 decimal digits, you need a special library.
Keep in mind that the precision you need, and the rounding rules required, will depend on your problem domain.
If you are writing software to control a nuclear reactor, or to model the first billionth of a second of the universe after the big bang (my friend actually did that), you will need much higher precision than if you are calculating sales tax (something I do for a living).
In the finance world, for example, there will be specific requirements on precision either implicitly or explicitly. Some US taxing jurisdictions specify tax rates to 5 digits after the decimal place. Your rounding scheme needs to allow for that much precision. When much of Western Europe converted to the Euro, there was a very specific approach to rounding that was written into law. During that transition period, it was essential to round exactly as required.
Know the rules of your domain, and test that your rounding scheme satisfies those rules.
I think everyone is implying:
Inverting a sparse matrix? "There's an app for that", etc., etc.
Numerical computation is one well-flogged horse. If you have a problem, it was probably put to pasture before 1970 or even much earlier, and carried forward library by library or snippet by snippet into the future.
You could shift the decimal point so that the numbers are whole, then do 64-bit integer arithmetic, then shift it back. Then you would only have to worry about overflow problems.
And we know we have the decimal type for a decimal representation of numbers... but the problem is, a lot of Math methods do not support the decimal type, so we have to convert them to double, which ruins the number again.
Several of the Math methods do support decimal: Abs, Ceiling, Floor, Max, Min, Round, Sign, and Truncate. What these functions have in common is that they return exact results. This is consistent with the purpose of decimal: To do exact arithmetic with base-10 numbers.
The trig and Exp/Log/Pow functions return approximate answers, so what would be the point of having overloads for an "exact" arithmetic type?
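A hedged sketch of that split: keep exact operations in decimal, and round-trip through double only for the inherently approximate functions:
decimal price = 123.456m;
decimal rounded = Math.Round(price, 2, MidpointRounding.AwayFromZero);  // 123.46: exact, stays decimal
Console.WriteLine(rounded);

decimal area = 2m;
double side = Math.Sqrt((double)area);   // no decimal overload exists; the result is approximate anyway
Console.WriteLine(side);                 // about 1.4142135623730951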
