Very, very large C# floating-point numbers

Very, very large C# floating-point numbers - c#

I am doing some population modeling (for fun, mostly to play with the concepts of Carrying Capacity and the Logistics Function). The model works with multiple planets (about 100,000 of them, right now). When the population reaches carrying capacity on one planet, the inhabitants start branching out to nearby planets, and so on.
Problem: 100,000+ planets can house a LOT of people. More than a C# Decimal can handle. Since I'm doing averages and other stuff with these numbers, I need the capability to work with floating points (or I'd just use a BigInt library).
Does anyone know of a BigFloatingPoint class (or whatever) I can use? Google is being very unhelpful today. I could probably write a class that would work well enough, but I'd rather use something pre-existing, if such a thing exists.

Use units of megapeople to achieve more headroom.
Also, Decimal lets you have 100,000 planets each with 100000000000000 times the population of the Earth, if my arithmetic is right. Are you sure that's not enough?

Even if each planet has 100 billion people, the total is still only 1E16. This is well within the limit of a signed 64 bit integer (2^63 goes to 9,223,372,036,854,775,807 which is almost 1E19...
You could go with a Million Billion people per planet, with 100000 planets before you got close to the limit...
As to fractions and averages and such, can't you convert to a Float or double when you do any such calculations ?

Do you really need 28 digit precision? Could you use floating point for some calculations?
(double to be exact: ±5.0e−324 to ±1.7e308)

Related

Most efficient way of multiplying and dividing fixed scale decimal numbers

Background
I work in the field of financial trading and am currently optimizing a real-time C# trading application.
Through extensive profiling I have identified that the performance of System.Decimal is now a bottleneck. As a result I am currently coding up a couple of more efficient fixed scale 64-bit 'decimal' structures (one signed, one unsigned) to perform base10 arithmatic. Using a fixed scale of 9 (i.e. 9 digits after the decimal point) means the underlying 64-bit integer can be used to represent the values:
-9,223,372,036.854775808 to 9,223,372,036.854775807
and
0 to 18,446,744,073.709551615
respectively.
This makes most operations trivial (i.e. comparisons, addition, subtraction). However, for multiplication and division I am currently falling back on the implementation provided by System.Decimal. I assume the external FCallMultiply method it invokes for multiplication uses either the Karatsuba or Toom–Cook algorithm under the covers. For division, I'm not sure which particular algorithm it would use.
Question
Does anyone know if, due to the fixed scale of my decimal values, there are any faster multiplication and division algorithms I can employ which are likely to out-perform System.Decimal.
I would appreciate your thoughts...

I have done something similar, by using the Schönhage Strassen algorithm.
I cannot find any sources now, but you can try to convert this code to the C# language.
P.S. i cannot say for sure about System.Decimal, but the "Karatsuba algorithm" is used by System.Numerics.BigInteger

My take of fixed point arithmetic (in general, not knowing about about C# or .NET in particular (VS Express acting up) (then, there's Fixed point math in c#? and Why no fixed point type in C#?):
The main point is a fixed scale - and that this is conceptual, first and foremost - the hardware couldn't care less about meaning/interpretation of numbers (or much anything) (unless it supports something, if for marketing reasons)
the easy: addition/subtraction - just ignore scaling
multiplication: compute the double-wide product, divide by scale
division: multiply (widened) dividend by scale and divide
the ugly - transcendental functions beyond exponentiation (exponentiate, multiply by scale to half that power)
in choosing a scale, don't forget conversion to and from digits, which may vastly outnumber multiplication&division (and give using a square a thought, see above …)
That said, "multiples of word size" and powers of two have been popular choices for scale due to speed in multiplying and dividing by such a scale. This still may make a difference with contemporary processors, if not for main ALUs of PCs - think SIMD extensions, GPUs, embedded …
Given what little I was able to discern of your application and requirements (consider disclosing more), three generic choices to consider are 10^9 (to the 9th power), 2^30 and 2^32. The latter representations may be called 34.30 and 32.32 for the bit lengths of their integral and fractional parts, respectively.
With a language that allows to create types (especially supporting operators in addition to invokable procedures), I deem designing and implementing that new type according the principle of least surprise important.

Value of a double variable not exact after multiplying with 100 [duplicate]

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 7 years ago.
If I execute the following expression in C#:
double i = 10*0.69;
i is: 6.8999999999999995. Why?
I understand numbers such as 1/3 can be hard to represent in binary as it has infinite recurring decimal places but this is not the case for 0.69. And 0.69 can easily be represented in binary, one binary number for 69 and another to denote the position of the decimal place.
How do I work around this? Use the decimal type?

Because you've misunderstood floating point arithmetic and how data is stored.
In fact, your code isn't actually performing any arithmetic at execution time in this particular case - the compiler will have done it, then saved a constant in the generated executable. However, it can't store an exact value of 6.9, because that value cannot be precisely represented in floating point point format, just like 1/3 can't be precisely stored in a finite decimal representation.
See if this article helps you.

why doesn't the framework work around this and hide this problem from me and give me the
right answer,0.69!!!
Stop behaving like a dilbert manager, and accept that computers, though cool and awesome, have limits. In your specific case, it doesn't just "hide" the problem, because you have specifically told it not to. The language (the computer) provides alternatives to the format, that you didn't choose. You chose double, which has certain advantages over decimal, and certain downsides. Now, knowing the answer, you're upset that the downsides don't magically disappear.
As a programmer, you are responsible for hiding this downside from managers, and there are many ways to do that. However, the makers of C# have a responsibility to make floating point work correctly, and correct floating point will occasionally result in incorrect math.
So will every other number storage method, as we do not have infinite bits. Our job as programmers is to work with limited resources to make cool things happen. They got you 90% of the way there, just get the torch home.

And 0.69 can easily be represented in
binary, one binary number for 69 and
another to denote the position of the
decimal place.
I think this is a common mistake - you're thinking of floating point numbers as if they are base-10 (i.e decimal - hence my emphasis).
So - you're thinking that there are two whole-number parts to this double: 69 and divide by 100 to get the decimal place to move - which could also be expressed as:
69 x 10 to the power of -2.
However floats store the 'position of the point' as base-2.
Your float actually gets stored as:
68999999999999995 x 2 to the power of some big negative number
This isn't as much of a problem once you're used to it - most people know and expect that 1/3 can't be expressed accurately as a decimal or percentage. It's just that the fractions that can't be expressed in base-2 are different.

but why doesn't the framework work around this and hide this problem from me and give me the right answer,0.69!!!
Because you told it to use binary floating point, and the solution is to use decimal floating point, so you are suggesting that the framework should disregard the type you specified and use decimal instead, which is very much slower because it is not directly implemented in hardware.
A more efficient solution is to not output the full value of the representation and explicitly specify the accuracy required by your output. If you format the output to two decimal places, you will see the result you expect. However if this is a financial application decimal is precisely what you should use - you've seen Superman III (and Office Space) haven't you ;)
Note that it is all a finite approximation of an infinite range, it is merely that decimal and double use a different set of approximations. The advantage of decimal is it produces the same approximations that you would if you were performing the calculation yourself. For example if you calculated 1/3, you would eventually stop writing 3's when it was 'good enough'.

For the same reason that 1 / 3 in a decimal systems comes out as 0.3333333333333333333333333333333333333333333 and not the exact fraction, which is infinitely long.

To work around it (e.g. to display on screen) try this:
double i = (double) Decimal.Multiply(10, (Decimal) 0.69);
Everyone seems to have answered your first question, but ignored the second part.

'Beautify' number by rounding erroneous digits appropriately

I want my cake and to eat it. I want to beautify (round) numbers to the largest extent possible without compromising accuracy for other calculations. I'm using doubles in C# (with some string conversion manipulation too).
Here's the issue. I understand the inherent limitations in double number representation (so please don't explain that). HOWEVER, I want to round the number in some way to appear aesthetically pleasing to the end user (I am making a calculator). The problem is rounding by X significant digits works in one case, but not in the other, whilst rounding by decimal place works in the other, but not the first case.
Observe:
CASE A: Math.Sin(Math.Pi) = 0.000000000000000122460635382238
CASE B: 0.000000000000001/3 = 0.000000000000000333333333333333
For the first case, I want to round by DECIMAL PLACES. That would give me the nice neat zero I'm looking for. Rounding by Sig digits would mean I would keep the erroneous digits too.
However for the second case, I want to round by SIGNIFICANT DIGITS, as I would lose tons of accuracy if I rounded merely by decimal places.
Is there a general way I can cater to both types of calculation?

I don't thinks it's feasible to do that to the result itself and precision has nothing to do with it.
Consider this input: (1+3)/2^3 . You can "beautify" it by showing the result as sin(30) or cos(60) or 1/2 and a whole lot of other interpretations. Choosing the wrong "beautification" can mislead your user, making them think their function has something to do with sin(x).
If your calculator keeps all the initial input as variables you could keep all the operations postponed until you need the result and then make sure you simplify the result until it matches your needs. And you'll need to consider using rational numbers, e, Pi and other irrational numbers may not be as easy to deal with.

The best solution to this is to keep every bit you can get during calculations, and leave the display format up to the end user. The user should have some idea how many significant digits make sense in their situation, given both the nature of the calculations and the use of the result.
Default to a reasonable number of significant digits for a few calculations in the floating point format you are using internally - about 12 if you are using double. If the user changes the format, immediately redisplay in the new format.

The best solution is to use arbitrary-precision and/or symbolic arithmetic, although these result in much more complex code and slower speed. But since performance isn't important for a calculator (in case of a button calculator and not the one that you enter expressions to calculate) you can use them without issue
Anyway there's a good trade-off which is to use decimal floating point. You'll need to limit the input/output precision but use a higher precision for the internal representation so that you can discard values very close to zero like the sin case above. For better results you could detect some edge cases such as sine/cosine of 45 degree's multiples... and directly return the exact result.
Edit: just found a good solution but haven't had an opportunity to try.
Here’s something I bet you never think about, and for good reason: how are floating-point numbers rendered as text strings? This is a surprisingly tough problem, but it’s been regarded as essentially solved since about 1990.
Prior to Steele and White’s "How to print floating-point numbers accurately", implementations of printf and similar rendering functions did their best to render floating point numbers, but there was wide variation in how well they behaved. A number such as 1.3 might be rendered as 1.29999999, for instance, or if a number was put through a feedback loop of being written out and its written representation read back, each successive result could drift further and further away from the original.
...
In 2010, Florian Loitsch published a wonderful paper in PLDI, "Printing floating-point numbers quickly and accurately with integers", which represents the biggest step in this field in 20 years: he mostly figured out how to use machine integers to perform accurate rendering! Why do I say "mostly"? Because although Loitsch's "Grisu3" algorithm is very fast, it gives up on about 0.5% of numbers, in which case you have to fall back to Dragon4 or a derivative
Here be dragons: advances in problems you didn’t even know you had

Which SQL Server field type is best for storing price values?

I am wondering what's the best type for a price field in SQL Server for a shop-like structure?
Looking at this overview we have data types called money, smallmoney, then we have decimal/numeric and lastly float and real.
Name, memory/disk-usage and value ranges:
Money: 8 bytes (values: -922,337,203,685,477.5808 to +922,337,203,685,477.5807)
Smallmoney: 4 bytes (values: -214,748.3648 to +214,748.3647)
Decimal: 9 [default, min. 5] bytes (values: -10^38 +1 to 10^38 -1 )
Float: 8 bytes (values: -1.79E+308 to 1.79E+308 )
Real: 4 bytes (values: -3.40E+38 to 3.40E+38 )
Is it really wise to store price values in those types? What about eg. INT?
Int: 4 bytes (values: -2,147,483,648 to 2,147,483,647)
Lets say a shop uses dollars, they have cents, but I don't see prices being $49.2142342 so the use of a lot of decimals showing cents seems waste of SQL bandwidth. Secondly, most shops wouldn't show any prices near 200.000.000 (not in normal web-shops at least, unless someone is trying to sell me a famous tower in Paris)
So why not go for an int?
An int is fast, its only 4 bytes and you can easily make decimals, by saving values in cents instead of dollars and then divide when you present the values.
The other approach would be to use smallmoney which is 4 bytes too, but this will require the math part of the CPU to do the calc, where as Int is integer power... on the downside you will need to divide every single outcome.
Are there any "currency" related problems with regional settings when using smallmoney/money fields? what will these transfer too in C#/.NET ?
Any pros/cons? Go for integer prices or smallmoney or some other?
What does your experience tell?

If you're absolutely sure your numbers will always stay within the range of smallmoney, use that and you can save a few bytes. Otherwise, I would use money. But remember, storage is cheap these days. The extra 4 bytes over 100 million records is still less than half a GB. As #marc_s points out, however, using smallmoney if you can will reduce the memory footprint of SQL server.
Long story short, if you can get away with smallmoney, do. If you think you might go over the max, use money.
But, do not use a floating-decimal type or you will get rounding issues and will start losing or gaining random cents, unless you deal with them properly.
My argument against using int: Why reinvent the wheel by storing an int and then having to remember to divide by 100 (10000) to retrieve the value and multiply back when you go to store the value. My understanding is the money types use an int or long as the underlying storage type anyway.
As far as the corresponding data type in .NET, it will be decimal (which will also avoid rounding issues in your C# code).

Use the Money datatype if you are storing money (unless modelling huge amounts of money like the national debt) - it avoids precision/rounding issues.
The Many Benefits of Money…Data Type!

USE NUMERIC / DECIMAL. Avoid MONEY / SMALLMONEY. Here's an example of why. Sooner or later the MONEY / SMALLMONEY types will likely let you down due to rounding errors. The money types are completely redundant and achieve nothing useful - a currency amount being just another decimal number like any other.
Lastly, the MONEY / SMALLMONEY types are proprietary to Microsoft. NUMERIC / DECIMAL are part of the SQL standard. They are used, recognised and understood by more people and are supported by most DBMSs and other software.

Personally, I'd use smallmoney or money to store shop prices.
Using int adds complexity elsewhere.
And 200 million is perfectly valid price in Korean Won or Indonesian Rupees too...

SQL data types money and smallmoney both resolve to c# decimal type:
http://msdn.microsoft.com/en-us/library/system.data.sqltypes.sqlmoney(v=VS.71).aspx
So I'm thinking that you might as well go for decimal. Personally I've been using double all my life working in the financial industry and haven't experienced performance issues, etc. Actually, I've found that for certain calculations, etc., having a larger data type allows for higher degree of accuracy.

I would go for the Money datatype. Invididually you may not exceed the value in Smallmoney, but it would be easy for multiple items to exceed it.

In my pawnshop app, the pawnshop operators lend from $5.00 to $10,000.00
When they calculate the loan amount they round it to the nearest dollar in order to
avoid dealing with cents (the same applies for interest payments). When the loan amount is above $50.00 they will round it to the nearest $5.00 (i.e. $50, $55, $60 ...), again to minimize running out of dollar bills. Therefore, I use DECIMAL(7,2) for transaction.calculated_loan_amount and DECIMAL(5,0) for transaction.loan_amount.
The app calculates the loan amount to the penny and places that amount in loan_amount where it gets rounded to the nearest dollar when below $50 or to the nearest $5.00 when greater.

How deal with the fact that most decimal fractions cannot be accurately represented in binary?

So, we know that fractions such as 0.1, cannot be accurately represented in binary base, which cause precise problems (such as mentioned here: Formatting doubles for output in C#).
And we know we have the decimal type for a decimal representation of numbers... but the problem is, a lot of Math methods, do not supporting decimal type, so we have convert them to double, which ruins the number again.
so what should we do?

Oh, what should we do about the fact that most decimal fractions cannot be represented in binary? or for that matter, that binary fractions cannot be represented in Decimal ?
or, even, that an infinity (in fact, a non-countable infinity) of real numbers in all bases cannot be accurately represented in any computerized system??
nothing! To recall an old cliche, You can get close enough for government work... In fact, you can get close enough for any work... There is no limit to the degree of accuracy the computer can generate, it just cannot be infinite, (which is what would be required for a number representation scheme to be able to represent every possible real number)
You see, for every number representation scheme you can design, in any computer, it can only represent a finite number of distinct different real numbers with 100.00 % accuracy. And between each adjacent pair of those numbers (those that can be represented with 100% accuracy), there will always be an infinity of other numbers that it cannot represent with 100% accuracy.

so what should we do?
We just keep on breathing. It really isn't a structural problem. We have a limited precision but usually more than enough. You just have to remember to format/round when presenting the numbers.
The problem in the following snippet is with the WriteLine(), not in the calculation(s):
double x = 6.9 - 10 * 0.69;
Console.WriteLine("x = {0}", x);
If you have a specific problem, th post it. There usually are ways to prevent loss of precision. If you really need >= 30 decimal digits, you need a special library.

Keep in mind that the precision you need, and the rounding rules required, will depend on your problem domain.
If you are writing software to control a nuclear reactor, or to model the first billionth of a second of the universe after the big bang (my friend actually did that), you will need much higher precision than if you are calculating sales tax (something I do for a living).
In the finance world, for example, there will be specific requirements on precision either implicitly or explicitly. Some US taxing jurisdictions specify tax rates to 5 digits after the decimal place. Your rounding scheme needs to allow for that much precision. When much of Western Europe converted to the Euro, there was a very specific approach to rounding that was written into law. During that transition period, it was essential to round exactly as required.
Know the rules of your domain, and test that your rounding scheme satisfies those rules.

I think everyone implying:
Inverting a sparse matrix? "There's an app for that", etc, etc
Numerical computation is one well-flogged horse. If you have a problem, it was probably put to pasture before 1970 or even much earlier, carried forward library by library or snippet by snippet into the future.

you could shift the decimal point so that the numbers are whole, then do 64 bit integer arithmetic, then shift it back. Then you would only have to worry about overflow problems.

And we know we have the decimal type
for a decimal representation of
numbers... but the problem is, a lot
of Math methods, do not supporting
decimal type, so we have convert them
to double, which ruins the number
again.
Several of the Math methods do support decimal: Abs, Ceiling, Floor, Max, Min, Round, Sign, and Truncate. What these functions have in common is that they return exact results. This is consistent with the purpose of decimal: To do exact arithmetic with base-10 numbers.
The trig and Exp/Log/Pow functions return approximate answers, so what would be the point of having overloads for an "exact" arithmetic type?

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.