C# decimal type equality

Is the equality comparison for C# decimal types any more likely to work as we would intuitively expect than other floating point types?

I guess that depends on your intuition. I would assume that some people would think of the result of dividing 1 by 3 as the fraction 1/3, and others would think more along the lines of "Oh, 1 divided by 3 can't be represented as a decimal number, we'll have to decide how many digits to keep, let's go with 0.333".
If you think in the former way, Decimal won't help you much, but if you think in the latter way, and are explicit about rounding when needed, it is more likely that operations that are "intuitively" not subject to rounding errors in decimal, e.g. dividing by 10, will behave as you expect. This is more intuitive to most people than the behavior of a binary floating-point type, where powers of 2 behave nicely, but powers of 10 do not.
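That split in intuition can be seen directly in code; a small sketch of the standard behaviour:

```csharp
using System;

class DecimalIntuitionDemo
{
    static void Main()
    {
        // 1/3 is not exactly representable in decimal either; the quotient
        // is rounded to 28 significant digits, so multiplying back by 3
        // does not recover 1:
        Console.WriteLine((1m / 3m) * 3m);       // 0.9999999999999999999999999999

        // Sums of tenths, however, are exact in decimal...
        Console.WriteLine(0.1m + 0.2m == 0.3m);  // True

        // ...but not in binary floating point, where 0.1 has no exact form:
        Console.WriteLine(0.1 + 0.2 == 0.3);     // False (sum is 0.30000000000000004)
    }
}
```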

Basically, no. The Decimal type simply represents a specialised sort of floating-point number that is designed to reduce rounding error specifically in the base 10 system. That is, the internal representation of a Decimal is in fact in base 10 (denary) and not the usual binary. Hence, it is a rather more appropriate type for monetary calculations -- though not of course limited to such applications.
From the MSDN page for the structure:
The Decimal value type represents decimal numbers ranging from positive 79,228,162,514,264,337,593,543,950,335 to negative 79,228,162,514,264,337,593,543,950,335. The Decimal value type is appropriate for financial calculations requiring large numbers of significant integral and fractional digits and no round-off errors. The Decimal type does not eliminate the need for rounding. Rather, it minimizes errors due to rounding. For example, the following code produces a result of 0.9999999999999999999999999999 rather than 1.
A decimal number is a floating-point value that consists of a sign, a numeric value where each digit in the value ranges from 0 to 9, and a scaling factor that indicates the position of a floating decimal point that separates the integral and fractional parts of the numeric value.
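The sign, coefficient, and scaling factor described there can be inspected with decimal.GetBits; a small sketch (the bit positions are those documented for the CLR layout):

```csharp
using System;

class DecimalPartsDemo
{
    static void Main()
    {
        // decimal.GetBits returns four ints: a 96-bit coefficient in the
        // first three, then a word holding the scale (bits 16-23) and the
        // sign (bit 31).
        int[] bits = decimal.GetBits(-123.456m);

        int scale = (bits[3] >> 16) & 0xFF;             // how far the point floats
        bool isNegative = (bits[3] & int.MinValue) != 0;

        Console.WriteLine(bits[0]);      // 123456 (low word of the coefficient)
        Console.WriteLine(scale);        // 3  ->  123456 x 10^-3 = 123.456
        Console.WriteLine(isNegative);   // True
    }
}
```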


Can casting a long value to double potentially cause a value to be lost [duplicate]

I have read in different posts on Stack Overflow and in the C# documentation that converting long (or any other data type representing a number) to double loses precision. This is quite obvious given the representation of floating-point numbers.
My question is, how big is the loss of precision if I convert a larger number to double? Do I have to expect differences larger than +/- X ?
The reason I would like to know this is that I have to deal with a continuous counter, which is a long. This value is read by my application as a string, needs to be cast, has to be divided by e.g. 10 or some other small number, and is then processed further.
Would decimal be more appropriate for this task?
converting long (or any other data type representing a number) to double loses precision. This is quite obvious due to the representation of floating point numbers.
This is less obvious than it seems, because precision loss depends on the value of the long. For values between -2^52 and 2^52 there is no precision loss at all.
How big is the loss of precision if I convert a larger number to double? Do I have to expect differences larger than +/- X
For numbers with magnitude above 2^52 you will experience some precision loss, depending on how far above the 52-bit limit you go. If the absolute value of your long fits in, say, 58 bits, then the magnitude of your precision loss will be 58 - 52 = 6 bits, or +/-64.
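A sketch of that round trip (the 58-bit value below is an arbitrary illustration):

```csharp
using System;

class LongDoubleRoundTripDemo
{
    static void Main()
    {
        // Every integer of magnitude up to 2^52 survives the round trip:
        long small = (1L << 52) - 1;
        Console.WriteLine((long)(double)small == small);   // True

        // A 59-bit value keeps only its top 53 bits; representable doubles
        // near 2^58 are spaced 64 apart, so the low bits get rounded off:
        long big = (1L << 58) + 31;
        long back = (long)(double)big;
        Console.WriteLine(big - back);                     // 31
    }
}
```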
Would decimal be more appropriate for this task?
decimal has a different representation than double, and it uses a different base. Since you are planning to divide your number by "small numbers", different representations would give you different errors on division. Specifically, double will be better at handling division by powers of two (2, 4, 8, 16, etc.) because such division can be accomplished by subtracting from exponent, without touching the mantissa. Similarly, large decimals would suffer no loss of significant digits when divided by ten, hundred, etc.
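The classic illustration of that contrast is dividing by a power of two versus accumulating tenths:

```csharp
using System;

class DivisionBaseDemo
{
    static void Main()
    {
        // Dividing a double by a power of two is exact (only the exponent changes):
        double d = 123.0;
        Console.WriteLine(d / 8 * 8 == d);   // True

        // Ten steps of 0.1 accumulate binary rounding error in double...
        double dsum = 0;
        for (int i = 0; i < 10; i++) dsum += 0.1;
        Console.WriteLine(dsum == 1.0);      // False

        // ...but are exact in decimal, where 0.1 is representable:
        decimal msum = 0;
        for (int i = 0; i < 10; i++) msum += 0.1m;
        Console.WriteLine(msum == 1m);       // True
    }
}
```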
long
long is a 64-bit integer type and can hold values from –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (max. 19 digits).
double
double is a 64-bit floating-point type with a precision of 15 to 16 significant digits. So data can certainly be lost once your numbers grow beyond about 9 x 10^15 (2^53, above which not every integer is representable).
decimal
decimal is a 128-bit decimal type and can hold up to 28-29 significant digits. Since its integral range exceeds that of long, it's always safe to cast a long to decimal.
Recommendation
I would advise that you find out the exact expectations about the numbers you will be working with. Then you can make an informed decision in choosing the appropriate data type. Since you are reading your numbers from a string, isn't it possible that they will be even greater than 28 digits? In that case, none of the types listed will work for you, and instead you'll have to use some sort of a BigInt implementation.
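A sketch of that check, using a hypothetical 30-digit counter string; System.Numerics.BigInteger is the arbitrary-precision integer type that ships with .NET:

```csharp
using System;
using System.Numerics;

class CounterParsingDemo
{
    static void Main()
    {
        // Hypothetical counter value with 30 digits:
        string raw = "123456789012345678901234567890";

        Console.WriteLine(long.TryParse(raw, out _));     // False: more than 19 digits
        Console.WriteLine(decimal.TryParse(raw, out _));  // False: above decimal.MaxValue

        // BigInteger handles any length, including the division step:
        BigInteger counter = BigInteger.Parse(raw);
        Console.WriteLine(counter / 10);                  // 12345678901234567890123456789
    }
}
```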

Why am I losing data when going from double to decimal

I have the following code which I can't change...
public static decimal Convert(decimal value, Measurement currentMeasurement, Measurement targetMeasurement, bool roundResult = true)
{
    double result = Convert(System.Convert.ToDouble(value), currentMeasurement, targetMeasurement, roundResult);
    return System.Convert.ToDecimal(result);
}
Now result is returned as -23.333333333333336, but once the conversion to a decimal takes place it becomes -23.3333333333333M.
I thought decimals could hold bigger values and were hence more accurate so how am I losing data going from double to decimal?
This is by design. Quoting from the documentation of Convert.ToDecimal:
The Decimal value returned by this method contains a maximum of 15 significant digits. If the value parameter contains more than 15 significant digits, it is rounded using rounding to nearest. The following example illustrates how the Convert.ToDecimal(Double) method uses rounding to nearest to return a Decimal value with 15 significant digits.
Console.WriteLine(Convert.ToDecimal(123456789012345500.12D)); // Displays 123456789012346000
Console.WriteLine(Convert.ToDecimal(123456789012346500.12D)); // Displays 123456789012346000
Console.WriteLine(Convert.ToDecimal(10030.12345678905D)); // Displays 10030.123456789
Console.WriteLine(Convert.ToDecimal(10030.12345678915D)); // Displays 10030.1234567892
The reason for this is mostly that double can only guarantee 15 decimal digits of precision anyway. Everything displayed after them (converted to a string it's 17 digits, because that's what double uses internally, and because that's the number you might need to exactly reconstruct every possible double value from a string) is not guaranteed to be part of the exact value being represented. So Convert takes the only sensible route and rounds them away. After all, if you have a type that can represent decimal values exactly, you wouldn't want to start with digits that are inaccurate.
So you're not losing data, per se. In fact, you're only losing garbage. Garbage you thought of being data.
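The two views of the same value can be printed side by side (using the -23.33... result from the question):

```csharp
using System;
using System.Globalization;

class DoubleToDecimalDemo
{
    static void Main()
    {
        double result = -23.333333333333336;

        // "G17" prints enough digits to round-trip the exact double...
        string full = result.ToString("G17", CultureInfo.InvariantCulture);
        Console.WriteLine(full);                                           // all 17 digits
        Console.WriteLine(double.Parse(full, CultureInfo.InvariantCulture) == result); // True

        // ...while Convert.ToDecimal keeps only 15 significant digits:
        Console.WriteLine(Convert.ToDecimal(result));                      // -23.3333333333333
    }
}
```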
EDIT: To clarify my point from the comments: Conversions between different numeric data types may incur a loss of precision. This is especially the case between double and decimal because both types are capable of representing values the other type cannot represent. Furthermore, both double and decimal have fairly specific use cases they're intended for, which is also evident from the documentation (emphasis mine):
The Double value type represents a double-precision 64-bit number with values ranging from negative 1.79769313486232e308 to positive 1.79769313486232e308, as well as positive or negative zero, PositiveInfinity, NegativeInfinity, and not a number (NaN). It is intended to represent values that are extremely large (such as distances between planets or galaxies) or extremely small (the molecular mass of a substance in kilograms) and that often are imprecise (such as the distance from earth to another solar system). The Double type complies with the IEC 60559:1989 (IEEE 754) standard for binary floating-point arithmetic.

The Decimal value type represents decimal numbers ranging from positive 79,228,162,514,264,337,593,543,950,335 to negative 79,228,162,514,264,337,593,543,950,335. The Decimal value type is appropriate for financial calculations that require large numbers of significant integral and fractional digits and no round-off errors. The Decimal type does not eliminate the need for rounding. Rather, it minimizes errors due to rounding.
This basically means that for quantities that will not grow unreasonably large and you need an accurate decimal representation, you should use decimal (sounds fairly obvious when written that way). In practice this most often means financial calculations, as the documentation already states.
On the other hand, in the vast majority of other cases, double is the right way to go and usually does not hurt as a default choice (languages like Lua and JavaScript get away just fine with double being the only numeric data type).
In your specific case, since you mentioned in the comments that those are temperature readings, it is very, very, very simple: just use double throughout. You have temperature readings. Wikipedia suggests that highly-specialized thermometers reach around 10^-3 °C precision. Which basically means that the differences of around 10^-13 (!) that you are worried about here are simply irrelevant. Your thermometer gives you (let's be generous) five accurate digits, and you worry about the ten digits of random garbage that come after them. Just don't.
I'm sure a physicist or other scientist might be able to chime in here with proper handling of measurements and their precision, but we were taught in school that it's utter bullshit to even give values more precise than the measurements are. And calculations potentially affect (and reduce) that precision.

Which values cannot be represented correctly by a double

The Double data type cannot correctly represent some base 10 values. This is because of how floating point numbers represent real numbers. What this means is that when representing monetary values, one should use the decimal value type to prevent errors. (feel free to correct errors in this preamble)
What I want to know is: which values present such a problem for the Double data type under a 64-bit architecture in the standard .NET Framework (C#, if that makes a difference)?
I expect the answer to be a formula or rule for finding such values, but I would also like some example values.
Any number which cannot be written as a finite sum of powers of 2 (with positive and negative exponents) cannot be exactly represented as a binary floating-point number.
The common IEEE formats for 32- and 64-bit representations of floating-point numbers impose further constraints; they limit the number of binary digits in both the significand and the exponent. So there are maximum and minimum representable numbers (approximately +/- 10^308 (base-10) if memory serves) and limits to the precision of a number that can be represented. This limit on the precision means that, for 64-bit numbers, the difference between the exponent of the largest power of 2 and the smallest power in a number is limited to 52, so if your number includes a term in 2^52 it can't also include a term in 2^-1.
Simple examples of numbers which cannot be exactly represented in binary floating-point numbers include 1/3, 2/3, 1/5.
Since the set of floating-point numbers (in any representation) is finite, and the set of real numbers is infinite, one algorithm to find a real number which is not exactly representable as a floating-point number is to select a real number at random. The probability that the real number is exactly representable as a floating-point number is 0.
You generally need to be prepared for the possibility that any value you store in a double has some small amount of error. Unless you're storing a constant value, chances are it could be something with at least some error. If it's imperative that there never be any error, and the values aren't constant, you probably shouldn't be using a floating point type.
What you probably should be asking in many cases is, "How do I deal with the minor floating point errors?" You'll want to know what types of operations can result in a lot of error, and what types don't. You'll want to ensure that comparing two values for "equality" actually just ensures they are "close enough" rather than exactly equal, etc.
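One possible shape for such a "close enough" comparison; the tolerance here is an arbitrary placeholder that a real application would need to tune:

```csharp
using System;

class NearlyEqualDemo
{
    // Combined absolute/relative tolerance: absolute for values near zero,
    // relative once the magnitudes grow. The epsilon is an arbitrary choice.
    static bool NearlyEqual(double a, double b, double epsilon = 1e-9)
    {
        if (a == b) return true;  // covers exact matches and infinities
        double scale = Math.Max(Math.Abs(a), Math.Abs(b));
        return Math.Abs(a - b) <= epsilon * Math.Max(scale, 1.0);
    }

    static void Main()
    {
        Console.WriteLine(0.1 + 0.2 == 0.3);             // False
        Console.WriteLine(NearlyEqual(0.1 + 0.2, 0.3));  // True
    }
}
```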
This question actually goes beyond any single programming language or platform. The inaccuracy is actually inherent in binary data.
Consider that with a double, each binary digit N at 0-based index I to the left of the radix point represents the value N * 2^I, and each digit at 1-based index I to the right of the radix point represents the value N * 2^(-I).
As an example, 5.625 (base 10) would be 101.101 (base 2).
Given this, any decimal value that can't be expressed as a sum of terms 2^(-I) for different values of I will have an inexact value as a double.
A float is represented as s, e and m in the following formula
s * m * 2^e
This means that any number that cannot be represented using the given expression (and in the respective domains of s, e and m) cannot be represented exactly.
Basically, you can represent all numbers between 0 and 2^53 - 1 multiplied by a certain power of two (possibly a negative power).
As an example, all numbers between 0 and 2^53 - 1 can be represented multiplied by 2^0 = 1. And you can also represent all those numbers divided by 2 (giving a .5 fraction). And so on.
This answer does not fully cover the topic, but I hope it helps.
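A couple of those boundaries, spot-checked:

```csharp
using System;

class ExactRangeDemo
{
    static void Main()
    {
        // 2^53 - 1 is the largest value below which every integer is exact;
        // just above it, adjacent integers collapse to the same double:
        double maxExact = 9007199254740991;              // 2^53 - 1
        Console.WriteLine(maxExact + 1 == maxExact + 2); // True: 2^53 + 1 rounds away

        // Any such integer scaled by a power of two is also exact:
        Console.WriteLine(3.0 / 4.0 == 0.75);            // True: 3 x 2^-2
    }
}
```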

Addition of Double values inconsistent

I came across the following issue while developing an engineering rule value engine using an eval(...) implementation.
Dim first As Double = 1.1
Dim second As Double = 2.2
Dim sum As Double = first + second
If (sum = 3.3) Then
Console.WriteLine("Matched")
Else
Console.WriteLine("Not Matched")
End If
'Above condition returns false because sum's value is 3.3000000000000003 instead of 3.3
It looks like something goes wrong around the 15th digit. Can someone give a better explanation of this, please?
Is Math.Round(...) the only solution available, or is there something else I can attempt?
You are not adding decimals - you are adding up doubles.
Not all doubles can be represented accurately in a computer, hence the error. I suggest reading this article for background (What Every Computer Scientist Should Know About Floating-Point Arithmetic).
Use the Decimal type instead; it doesn't suffer from these issues (note the D suffix, which makes the literals Decimal rather than Double):
Dim first As Decimal = 1.1D
Dim second As Decimal = 2.2D
Dim sum As Decimal = first + second
If (sum = 3.3D) Then
Console.WriteLine("Matched")
Else
Console.WriteLine("Not Matched")
End If
That's how double numbers work on a PC.
The best way to compare them is to use a construction like this:
if (Math.Abs(sum - 3.3) <= 1E-9)
    Console.WriteLine("Matched")
Instead of 1E-9 you can use another number representing the acceptable error in the comparison.
Equality comparisons with floating-point operations are unreliable because of how fractional values are represented within the machine. You should have some sort of epsilon value by which you're comparing. Here is an article that describes it much more thoroughly:
http://www.cygnus-software.com/papers/comparingfloats/Comparing%20floating%20point%20numbers.htm
Edit: Math.Round will not be an ideal choice because of the error generated with it for certain comparisons. You are better off determining an epsilon value that can be used to limit the amount of error in the comparison (basically determining the level of accuracy).
A double uses floating-point arithmetic, which is approximate but more efficient. If you need to compare against exact values, use the decimal data type instead.
In C#, Java, Python, and many other languages, decimals/floats are not perfect. Because of the way they are represented (using multipliers and exponents), they often have inaccuracies. See http://www.yoda.arachsys.com/csharp/decimal.html for more info.
From the documentation:
http://msdn.microsoft.com/en-us/library/system.double.aspx
Floating-Point Values and Loss of Precision

Remember that a floating-point number can only approximate a decimal number, and that the precision of a floating-point number determines how accurately that number approximates a decimal number. By default, a Double value contains 15 decimal digits of precision, although a maximum of 17 digits is maintained internally. The precision of a floating-point number has several consequences:

Two floating-point numbers that appear equal for a particular precision might not compare equal because their least significant digits are different.

A mathematical or comparison operation that uses a floating-point number might not yield the same result if a decimal number is used, because the floating-point number might not exactly approximate the decimal number.

A value might not roundtrip if a floating-point number is involved. A value is said to roundtrip if an operation converts an original floating-point number to another form, an inverse operation transforms the converted form back to a floating-point number, and the final floating-point number is equal to the original floating-point number. The roundtrip might fail because one or more least significant digits are lost or changed in a conversion.

In addition, the result of arithmetic and assignment operations with Double values may differ slightly by platform because of the loss of precision of the Double type. For example, the result of assigning a literal Double value may differ in the 32-bit and 64-bit versions of the .NET Framework. The following example illustrates this difference when the literal value -4.42330604244772E-305 and a variable whose value is -4.42330604244772E-305 are assigned to a Double variable. Note that the result of the Parse(String) method in this case does not suffer from a loss of precision.
This is a well-known problem with floating-point arithmetic. Look into binary coding for further details.
Use the type "decimal" if that will fit your needs.
But in general, you should never compare floating-point values to constant floating-point values with the equality sign.
Failing that, compare to the number of places that you want to compare to (e.g. if it is 4, then you would write: if sum > 3.2999 and sum < 3.3001).

Is .NET “decimal” arithmetic independent of platform/architecture?

I asked about System.Double recently and was told that computations may differ depending on platform/architecture. Unfortunately, I cannot find any information to tell me whether the same applies to System.Decimal.
Am I guaranteed to get exactly the same result for any particular decimal computation independently of platform/architecture?
Am I guaranteed to get exactly the same result for any particular decimal computation independently of platform/architecture?
The C# 4 spec is clear that the value you get will be computed the same on any platform.
As LukeH's answer notes, the ECMA version of the C# 2 spec grants leeway to conforming implementations to provide more precision, so an implementation of C# 2.0 on another platform might provide a higher-precision answer.
For the purposes of this answer I'll just discuss the C# 4.0 specified behaviour.
The C# 4.0 spec says:
The result of an operation on values of type decimal is that which would result from calculating an exact result (preserving scale, as defined for each operator) and then rounding to fit the representation. Results are rounded to the nearest representable value, and, when a result is equally close to two representable values, to the value that has an even number in the least significant digit position [...]. A zero result always has a sign of 0 and a scale of 0.
Since the calculation of the exact value of an operation should be the same on any platform, and the rounding algorithm is well-defined, the resulting value should be the same regardless of platform.
However, note the parenthetical and that last sentence about the zeroes. It might not be clear why that information is necessary.
One of the oddities of the decimal system is that almost every quantity has more than one possible representation. Consider the exact value 123.456. A decimal is the combination of a 96-bit integer, a 1-bit sign, and an eight-bit scaling factor that represents an exponent from 0 to 28. That means that the exact value 123.456 could be represented by the decimals 123456 x 10^-3 or 1234560 x 10^-4 or 12345600 x 10^-5. Scale matters.
The C# specification also mandates how information about scale is computed. The literal 123.456m would be encoded as 123456 x 10^-3, and 123.4560m would be encoded as 1234560 x 10^-4.
Observe the effects of this feature in action:
decimal d1 = 111.111000m;
decimal d2 = 111.111m;
decimal d3 = d1 + d1;
decimal d4 = d2 + d2;
decimal d5 = d1 + d2;
Console.WriteLine(d1);
Console.WriteLine(d2);
Console.WriteLine(d3);
Console.WriteLine(d4);
Console.WriteLine(d5);
Console.WriteLine(d3 == d4);
Console.WriteLine(d4 == d5);
Console.WriteLine(d5 == d3);
This produces
111.111000
111.111
222.222000
222.222
222.222000
True
True
True
Notice how information about significant zero figures is preserved across operations on decimals, and that decimal.ToString knows about that and displays the preserved zeroes if it can. Notice also how decimal equality knows to make comparisons based on exact values, even if those values have different binary and string representations.
The spec I think does not actually say that decimal.ToString() needs to correctly print out values with trailing zeroes based on their scales, but it would be foolish of an implementation to not do so; I would consider that a bug.
I also note that the internal memory format of a decimal in the CLR implementation is 128 bits, subdivided into: 16 unused bits, 8 scale bits, 7 more unused bits, 1 sign bit and 96 mantissa bits. The exact layout of those bits in memory is not defined by the specification, and if another implementation wants to stuff additional information into those 23 unused bits for its own purposes, it can do so. In the CLR implementation the unused bits are supposed to always be zero.
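That layout can be probed with decimal.GetBits; the masks below follow the bit positions just described (sign in bit 31 and scale in bits 16-23 of the last word, with the rest unused):

```csharp
using System;

class DecimalLayoutDemo
{
    static void Main()
    {
        int[] bits = decimal.GetBits(1.5m);   // coefficient 15, scale 1

        int flags = bits[3];
        int scale = (flags >> 16) & 0xFF;
        bool sign = (flags & int.MinValue) != 0;

        Console.WriteLine(scale);               // 1
        Console.WriteLine(sign);                // False
        // In the CLR implementation the 23 unused bits are always zero:
        Console.WriteLine(flags & 0x7F00FFFF);  // 0
    }
}
```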
Even though the format of floating point types is clearly defined, floating point calculations can indeed have differing results depending on architecture, as stated in section 4.1.6 of the C# specification:
Floating-point operations may be performed with higher precision than the result type of the operation. For example, some hardware architectures support an "extended" or "long double" floating-point type with greater range and precision than the double type, and implicitly perform all floating-point operations using this higher precision type. Only at excessive cost in performance can such hardware architectures be made to perform floating-point operations with less precision, and rather than require an implementation to forfeit both performance and precision, C# allows a higher precision type to be used for all floating-point operations.
While the decimal type is subject to approximation in order for a value to be represented within its finite range, the range is, by definition, defined to be suitable for financial and monetary calculations. Therefore, it has a higher precision (and smaller range) than float or double. It is also more clearly defined than the other floating point types such that it would appear to be platform-independent (see section 4.1.7 - I suspect this platform independence is more because there isn't standard hardware support for types with the size and precision of decimal rather than because of the type itself, so this may change with future specifications and hardware architectures).
If you need to know if a specific implementation of the decimal type is correct, you should be able to craft some unit tests using the specification that will test the correctness.
The decimal type is represented in what amounts to base-10 using a struct (containing integers, I believe), as opposed to double and other floating-point types, which represent non-integral values in base-2. Therefore, decimals are exact representations of base-10 values, within a standardized precision, on any architecture. This is true for any architecture running a correct implementation of the .NET spec.
So to answer your question, since the behavior of decimal is standardized this way in the specification, decimal values should be the same on any architecture conforming to that spec. If they don't conform to that spec, then they're not really .NET.
A reading of the specification suggests that decimal -- like float and double -- might be allowed some leeway in its implementation so long as it meets certain minimum standards.
Here are some excerpts from the ECMA C# spec (section 11.1.7). All emphasis in bold is mine.
The decimal type can represent values including those in the range 1 x 10^-28 through 1 x 10^28 with at least 28 significant digits.

The finite set of values of type decimal are of the form (-1)^s x c x 10^-e, where the sign s is 0 or 1, the coefficient c is given by 0 <= c < Cmax, and the scale e is such that Emin <= e <= Emax, where Cmax is at least 1 x 10^28, Emin <= 0, and Emax >= 28. The decimal type does not necessarily support signed zeros, infinities, or NaN's.

For decimals with an absolute value less than 1.0m, the value is exact to at least the 28th decimal place. For decimals with an absolute value greater than or equal to 1.0m, the value is exact to at least 28 digits.
Note that the wording of the Microsoft C# spec (section 4.1.7) is significantly different to that of the ECMA spec. It appears to lock down the behaviour of decimal a lot more strictly.
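Some of those minima can be spot-checked against the CLR implementation (the checks below pass there, though per the ECMA wording another conforming implementation could exceed them):

```csharp
using System;

class DecimalSpecCheckDemo
{
    static void Main()
    {
        // Cmax in the CLR is 2^96 - 1, comfortably above 1 x 10^28:
        Console.WriteLine(decimal.MaxValue);   // 79228162514264337593543950335

        // A quotient carries 28 significant digits:
        Console.WriteLine(1m / 3m);            // 0.3333333333333333333333333333
    }
}
```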
