I have tried BigInteger, decimal, float and long but no luck.
Screenshot of required output example
It is a fairly easy task to write your own rational class; remember, rationals are just pairs of integers, and you already have BigInteger.
In this series of articles I show how to devise your own big integer and big rational classes starting from absolutely nothing, not even integers. Note that this is not fast and not intended to be fast; it is intended to be educational. You can use the techniques I describe in this series to help you when designing your arithmetic class.
https://ericlippert.com/2013/09/16/math-from-scratch-part-one/
Or, if you don't want to write it yourself, you can always use the one from Microsoft:
http://bcl.codeplex.com/wikipage?title=BigRational&referringTitle=Home
But that said...
I need a minimum of 128 decimal places to calculate precise probabilities of events between different time steps
Do you need 128 decimal places to represent 128 digits of precision, or of magnitude? Because if it is just magnitude, then simply do a transformation of your probability math into logarithms and do the math in doubles.
The easiest way to achieve arbitrary precision numbers is to combine the BigInteger class from System.Numerics with an int exponent. You could use BigInteger for your exponent, but this is likely overkill as the numbers would be well beyong meaningful in scale.
So if you create a class along these lines:
public class ArbDecimal
{
BigInteger value;
int exponent;
public override string ToString()
{
StringBuilder sb = new StringBuilder();
int place;
foreach (char digit in value.ToString())
{
if (place++ == value.ToString().Length - exponent)
{
sb.Append('.');
}
sb.Append(digit);
}
return sb.ToString();
}
}
You should then be able to define your mathematical operations using the laws of indices with the value and exponent fields.
For instance, to achieve addition, you would scale the larger value to have the same exponent as the smaller one by multiplying it by 10^(largerExp-smallerExp) then adding the two values and rescaling.
In your class, the number 0.01 would be represented like:
value = 1
exponent = -2
Due to the fact that 1*10^-2 = 0.01.
Utilising this method, you can store arbitrarily precise (and large) numbers limited only by the available ram and the .NET framework's object size limit.
Related
When I Initialize a ulong with the value 18446744073709551615 and then add a 1 to It and display to the Console It displays a 0 which is totally expected.
I know this question sounds stupid but I have to ask It. if my Computer has a 64-bit architecture CPU how is my calculator able to work with larger numbers than 18446744073709551615?
I suppose floating-point has a lot to do here.
I would like to know exactly how this happens.
Thank you.
working with larger numbers than 18446744073709551615
"if my Computer has a 64-bit architecture CPU" --> The architecture bit size is largely irrelevant.
Consider how you are able to add 2 decimal digits whose sum is more than 9. There is a carry generated and then used when adding the next most significant decimal place.
The CPU can do the same but with base 18446744073709551616 instead of base 10. It uses a carry bit as well as a sign and overflow bit to perform extended math.
I suppose floating-point has a lot to do here.
This is nothing to do with floating point.
; you say you're using ulong, which means your using unsigned 64-but arithmetic. The largest value you can store is therefore "all ones", for 64 bits - aka UInt64.MaxValue, which as you've discovered: https://learn.microsoft.com/en-us/dotnet/api/system.uint64.maxvalue
If you want to store arbitrarily large numbers: there are APIs for that - for example BigInteger. However, arbitrary size cones at a cost, so it isn't the default, and certainly isn't what you get when you use ulong (or double, or decimal, etc - all the compiler-level numeric types have fixed size).
So: consider using BigInteger
You either way have a 64 bits architecture processor and limited to doing 64 bits math - your problem is a bit hard to explain without taking an explicit example of how this is solved with BigInteger in System.Numerics namespace, available in .NET Framework 4.8 for example. The basis is to 'decompose' the number into an array representation.
mathematical expression 'decompose' here meaning :
"express (a number or function) as a combination of simpler components."
Internally BigInteger uses an internal array (actually multiple internal constructs) and a helper class called BigIntegerBuilder. In can implicitly convert an UInt64 integer without problem, for even bigger numbers you can use the + operator for example.
BigInteger bignum = new BigInteger(18446744073709551615);
bignum += 1;
You can read about the implicit operator here:
https://referencesource.microsoft.com/#System.Numerics/System/Numerics/BigInteger.cs
public static BigInteger operator +(BigInteger left, BigInteger right)
{
left.AssertValid();
right.AssertValid();
if (right.IsZero) return left;
if (left.IsZero) return right;
int sign1 = +1;
int sign2 = +1;
BigIntegerBuilder reg1 = new BigIntegerBuilder(left, ref sign1);
BigIntegerBuilder reg2 = new BigIntegerBuilder(right, ref sign2);
if (sign1 == sign2)
reg1.Add(ref reg2);
else
reg1.Sub(ref sign1, ref reg2);
return reg1.GetInteger(sign1);
}
In the code above from ReferenceSource you can see that we use the BigIntegerBuilder to add the left and right parts, which are also BigInteger constructs.
Interesting, it seems to keep its internal structure into an private array called "_bits", so that is the answer to your question. BigInteger keeps track of an array of 32-bits valued integer array and is therefore able to handle big integers, even beyond 64 bits.
You can drop this code into a console application or Linqpad (which has the .Dump() method I use here) and inspect :
BigInteger bignum = new BigInteger(18446744073709551615);
bignum.GetType().GetField("_bits",
BindingFlags.NonPublic | BindingFlags.Instance).GetValue(bignum).Dump();
A detail about BigInteger is revealed in a comment in the source code of BigInteger on Reference Source. So for integer values, BigInteger stores the value in the _sign field, for other values the field _bits is used.
Obviously, the internal array needs to be able to be converted into a representation in the decimal system (base-10) so humans can read it, the ToString() method converts the BigInteger to a string representation.
For a better in-depth understanding here, consider doing .NET source stepping to step way into the code how you carry out the mathematics here. But for a basic understanding, the BigInteger uses an internal representation of which is composed with 32 bits array which is transformed into a readable format which allows bigger numbers, bigger than even Int64.
// For values int.MinValue < n <= int.MaxValue, the value is stored in sign
// and _bits is null. For all other values, sign is +1 or -1 and the bits are in _bits
I'm trying to make a function HundredPosToZero to convert hundred position to zero, for example :
HundredPosToZero(4239) // 4039
This is my implement :
public HundredPosToZero(int num){
return num / 1000 * 1000 + num % 100;
}
However, I'm thinking of why not use bitwise operator like 4239 & 1011 to do the same thing? But I can not figure out how to implement it since 4239 is not a binary, any advice with this approach?
Not really. There is nothing special about decimal hundreds in binary. It would be possible with e.g. a hexadecimal number, but decimal numbers don't play well with binary :)
It is simply impossible if you want this operation to be the same for all numbers as "hundreds" in regular representation of integers don't occupy the same set of bits.
If you use some other binary representation of numbers (like BCD) that allocates groups of bits to unique decimal digits then you can do that easily.
The Double data type cannot correctly represent some base 10 values. This is because of how floating point numbers represent real numbers. What this means is that when representing monetary values, one should use the decimal value type to prevent errors. (feel free to correct errors in this preamble)
What I want to know is what are the values which present such a problem under the Double data-type under a 64 bit architecture in the standard .Net framework (C# if that makes a difference) ?
I expect the answer the be a formula or rule to find such values but I would also like some example values.
Any number which cannot be written as the sum of positive and negative powers of 2 cannot be exactly represented as a binary floating-point number.
The common IEEE formats for 32- and 64-bit representations of floating-point numbers impose further constraints; they limit the number of binary digits in both the significand and the exponent. So there are maximum and minimum representable numbers (approximately +/- 10^308 (base-10) if memory serves) and limits to the precision of a number that can be represented. This limit on the precision means that, for 64-bit numbers, the difference between the exponent of the largest power of 2 and the smallest power in a number is limited to 52, so if your number includes a term in 2^52 it can't also include a term in 2^-1.
Simple examples of numbers which cannot be exactly represented in binary floating-point numbers include 1/3, 2/3, 1/5.
Since the set of floating-point numbers (in any representation) is finite, and the set of real numbers is infinite, one algorithm to find a real number which is not exactly representable as a floating-point number is to select a real number at random. The probability that the real number is exactly representable as a floating-point number is 0.
You generally need to be prepared for the possibility that any value you store in a double has some small amount of error. Unless you're storing a constant value, chances are it could be something with at least some error. If it's imperative that there never be any error, and the values aren't constant, you probably shouldn't be using a floating point type.
What you probably should be asking in many cases is, "How do I deal with the minor floating point errors?" You'll want to know what types of operations can result in a lot of error, and what types don't. You'll want to ensure that comparing two values for "equality" actually just ensures they are "close enough" rather than exactly equal, etc.
This question actually goes beyond any single programming language or platform. The inaccuracy is actually inherent in binary data.
Consider that with a double, each number N to the left (at 0-based index I) of the decimal point represents the value N * 2^I and every digit to the right of the decimal point represents the value N * 2^(-I).
As an example, 5.625 (base 10) would be 101.101 (base 2).
Given this calculation, and decimal value that can't be calculated as a sum of 2^(-I) for different values of I would have an incorrect value as a double.
A float is represented as s, e and m in the following formula
s * m * 2^e
This means that any number that cannot be represented using the given expression (and in the respective domains of s, e and m) cannot be represented exactly.
Basically, you can represent all numbers between 0 and 2^53 - 1 multiplied by a certain power of two (possibly a negative power).
As an example, all numbers between 0 and 2^53 - 1 can be represented multiplied with 2^0 = 1. And you can also represent all those numbers by dividing them by 2 (with a .5 fraction). And so on.
This answer does not fully cover the topic, but I hope it helps.
I'm messing around with Fourier transformations. Now I've created a class that does an implementation of the DFT (not doing anything like FFT atm). This is the implementation I've used:
public static Complex[] Dft(double[] data)
{
int length = data.Length;
Complex[] result = new Complex[length];
for (int k = 1; k <= length; k++)
{
Complex c = Complex.Zero;
for (int n = 1; n <= length; n++)
{
c += Complex.FromPolarCoordinates(data[n-1], (-2 * Math.PI * n * k) / length);
}
result[k-1] = 1 / Math.Sqrt(length) * c;
}
return result;
}
And these are the results I get from Dft({2,3,4})
Well it seems pretty okay, since those are the values I expect. There is only one thing I find confusing. And it all has to do with the rounding of doubles.
First of all, why are the first two numbers not exactly the same (0,8660..443 8 ) vs (0,8660..443). And why can't it calculate a zero, where you'd expect it. I know 2.8E-15 is pretty close to zero, but well it's not.
Anyone know how these, marginal, errors occur and if I can and want to do something about it.
It might seem that there's not a real problem, because it's just small errors. However, how do you deal with these rounding errors if you're for example comparing 2 values.
5,2 + 0i != 5,1961524 + i2.828107*10^-15
Cheers
I think you've already explained it to yourself - limited precision means limited precision. End of story.
If you want to clean up the results, you can do some rounding of your own to a more reasonable number of siginificant digits - then your zeros will show up where you want them.
To answer the question raised by your comment, don't try to compare floating point numbers directly - use a range:
if (Math.Abs(float1 - float2) < 0.001) {
// they're the same!
}
The comp.lang.c FAQ has a lot of questions & answers about floating point, which you might be interested in reading.
From http://support.microsoft.com/kb/125056
Emphasis mine.
There are many situations in which precision, rounding, and accuracy in floating-point calculations can work to generate results that are surprising to the programmer. There are four general rules that should be followed:
In a calculation involving both single and double precision, the result will not usually be any more accurate than single precision. If double precision is required, be certain all terms in the calculation, including constants, are specified in double precision.
Never assume that a simple numeric value is accurately represented in the computer. Most floating-point values can't be precisely represented as a finite binary value. For example .1 is .0001100110011... in binary (it repeats forever), so it can't be represented with complete accuracy on a computer using binary arithmetic, which includes all PCs.
Never assume that the result is accurate to the last decimal place. There are always small differences between the "true" answer and what can be calculated with the finite precision of any floating point processing unit.
Never compare two floating-point values to see if they are equal or not- equal. This is a corollary to rule 3. There are almost always going to be small differences between numbers that "should" be equal. Instead, always check to see if the numbers are nearly equal. In other words, check to see if the difference between them is very small or insignificant.
Note that although I referenced a microsoft document, this is not a windows problem. It's a problem with using binary and is in the CPU itself.
And, as a second side note, I tend to use the Decimal datatype instead of double: See this related SO question: decimal vs double! - Which one should I use and when?
In C# you'll want to use the 'decimal' type, not double for accuracy with decimal points.
As to the 'why'... repsensenting fractions in different base systems gives different answers. For example 1/3 in a base 10 system is 0.33333 recurring, but in a base 3 system is 0.1.
The double is a binary value, at base 2. When converting to base 10 decimal you can expect to have these rounding errors.
All the methods in System.Math takes double as parameters and returns parameters. The constants are also of type double. I checked out MathNet.Numerics, and the same seems to be the case there.
Why is this? Especially for constants. Isn't decimal supposed to be more exact? Wouldn't that often be kind of useful when doing calculations?
This is a classic speed-versus-accuracy trade off.
However, keep in mind that for PI, for example, the most digits you will ever need is 41.
The largest number of digits of pi
that you will ever need is 41. To
compute the circumference of the
universe with an error less than the
diameter of a proton, you need 41
digits of pi †. It seems safe to
conclude that 41 digits is sufficient
accuracy in pi for any circle
measurement problem you're likely to
encounter. Thus, in the over one
trillion digits of pi computed in
2002, all digits beyond the 41st have
no practical value.
In addition, decimal and double have a slightly different internal storage structure. Decimals are designed to store base 10 data, where as doubles (and floats), are made to hold binary data. On a binary machine (like every computer in existence) a double will have fewer wasted bits when storing any number within its range.
Also consider:
System.Double 8 bytes Approximately ±5.0e-324 to ±1.7e308 with 15 or 16 significant figures
System.Decimal 12 bytes Approximately ±1.0e-28 to ±7.9e28 with 28 or 29 significant figures
As you can see, decimal has a smaller range, but a higher precision.
No, - decimals are no more "exact" than doubles, or for that matter, any type. The concept of "exactness", (when speaking about numerical representations in a compuiter), is what is wrong. Any type is absolutely 100% exact at representing some numbers. unsigned bytes are 100% exact at representing the whole numbers from 0 to 255. but they're no good for fractions or for negatives or integers outside the range.
Decimals are 100% exact at representing a certain set of base 10 values. doubles (since they store their value using binary IEEE exponential representation) are exact at representing a set of binary numbers.
Neither is any more exact than than the other in general, they are simply for different purposes.
To elaborate a bit furthur, since I seem to not be clear enough for some readers...
If you take every number which is representable as a decimal, and mark every one of them on a number line, between every adjacent pair of them there is an additional infinity of real numbers which are not representable as a decimal. The exact same statement can be made about the numbers which can be represented as a double. If you marked every decimal on the number line in blue, and every double in red, except for the integers, there would be very few places where the same value was marked in both colors.
In general, for 99.99999 % of the marks, (please don't nitpick my percentage) the blue set (decimals) is a completely different set of numbers from the red set (the doubles).
This is because by our very definition for the blue set is that it is a base 10 mantissa/exponent representation, and a double is a base 2 mantissa/exponent representation. Any value represented as base 2 mantissa and exponent, (1.00110101001 x 2 ^ (-11101001101001) means take the mantissa value (1.00110101001) and multiply it by 2 raised to the power of the exponent (when exponent is negative this is equivilent to dividing by 2 to the power of the absolute value of the exponent). This means that where the exponent is negative, (or where any portion of the mantissa is a fractional binary) the number cannot be represented as a decimal mantissa and exponent, and vice versa.
For any arbitrary real number, that falls randomly on the real number line, it will either be closer to one of the blue decimals, or to one of the red doubles.
Decimal is more precise but has less of a range. You would generally use Double for physics and mathematical calculations but you would use Decimal for financial and monetary calculations.
See the following articles on msdn for details.
Double
http://msdn.microsoft.com/en-us/library/678hzkk9.aspx
Decimal
http://msdn.microsoft.com/en-us/library/364x0z75.aspx
Seems like most of the arguments here to "It does not do what I want" are "but it's faster", well so is ANSI C+Gmp library, but nobody is advocating that right?
If you particularly want to control accuracy, then there are other languages which have taken the time to implement exact precision, in a user controllable way:
http://www.doughellmann.com/PyMOTW/decimal/
If precision is really important to you, then you are probably better off using languages that mathematicians would use. If you do not like Fortran then Python is a modern alternative.
Whatever language you are working in, remember the golden rule:
Avoid mixing types...
So do convert a and b to be the same before you attempt a operator b
If I were to hazard a guess, I'd say those functions leverage low-level math functionality (perhaps in C) that does not use decimals internally, and so returning a decimal would require a cast from double to decimal anyway. Besides, the purpose of the decimal value type is to ensure accuracy; these functions do not and cannot return 100% accurate results without infinite precision (e.g., irrational numbers).
Neither Decimal nor float or double are good enough if you require something to be precise. Furthermore, Decimal is so expensive and overused out there it is becoming a regular joke.
If you work in fractions and require ultimate precision, use fractions. It's same old rule, convert once and only when necessary. Your rounding rules too will vary per app, domain and so on, but sure you can find an odd example or two where it is suitable. But again, if you want fractions and ultimate precision, the answer is not to use anything but fractions. Consider you might want a feature of arbitrary precision as well.
The actual problem with CLR in general is that it is so odd and plain broken to implement a library that deals with numerics in generic fashion largely due to bad primitive design and shortcoming of the most popular compiler for the platform. It's almost the same as with Java fiasco.
double just turns out to be the best compromise covering most domains, and it works well, despite the fact MS JIT is still incapable of utilising a CPU tech that is about 15 years old now.
[piece to users of MSDN slowdown compilers]
Double is a built-in type. Is is supported by FPU/SSE core (formerly known as "Math coprocessor"), that's why it is blazingly fast. Especially at multiplication and scientific functions.
Decimal is actually a complex structure, consisting of several integers.