I've been using System.Math quite a lot lately, and the other day I was wondering how Microsoft implemented the Sqrt method in the library. So I popped open my best mate Reflector and tried to disassemble the method, but it showed:
[MethodImpl(MethodImplOptions.InternalCall),ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
public static extern double Sqrt(double d);
That day for the first time ever, I realized how dependent my kids are on the framework, to eat.
Jokes aside, I was wondering what sort of algorithm MS used to implement this method. In other words, how would you write your own implementation of Math.Sqrt in C# if you had no library support?
Any of the methods you find with Reflector or the Reference Source that have the MethodImplOptions.InternalCall attribute are actually implemented in C++ inside the CLR. You can get the source code for these from the SSCLI20 distribution. The relevant file is clr/src/vm/ecall.cpp; it contains a table of method names with function pointers, used by the JIT compiler to embed the call address directly into the generated machine code. The relevant table section is:
FCIntrinsic("Cos", COMDouble::Cos, CORINFO_INTRINSIC_Cos)
FCIntrinsic("Sqrt", COMDouble::Sqrt, CORINFO_INTRINSIC_Sqrt)
FCIntrinsic("Round", COMDouble::Round, CORINFO_INTRINSIC_Round)
...
Which takes you to clr/src/classlibnative/float/comfloat.cpp
FCIMPL1_V(double, COMDouble::Sqrt, double d)
WRAPPER_CONTRACT;
STATIC_CONTRACT_SO_TOLERANT;
return (double) sqrt(d);
FCIMPLEND
It just calls the CRT function. But that's not what happens in the x86 jitter; note the 'intrinsic' in the table declaration. You won't find that in the SSCLI20 version of the jitter, which is a simple one unencumbered by patents. The shipping one, however, does turn it into an intrinsic:
double d = 2.0;
Console.WriteLine(Math.Sqrt(d));
translates to
00000008 fld dword ptr ds:[0072156Ch]
0000000e fsqrt
..etc
In other words, Math.Sqrt() translates to a single floating point machine code instruction. Check this answer for details on how that beats native code handily.
The function will be translated into assembler instructions. Such as the fsqrt instruction of the x87.
You could implement floating point numbers in software, but that will most likely be much slower. I think an iterative algorithm is the typical implementation for Sqrt.
Google.com will give you more answers than StackOverflow.com
Have a look at this page:
http://en.wikipedia.org/wiki/Methods_of_computing_square_roots
One algorithm can be found under the title "Binary numeral system (base 2)" on the wiki page above.
But software implementations will NOT be efficient. Modern CPUs have hardware implementations of math functions in the FPU. You just need to invoke the correct processor instructions (in assembly or machine language).
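As a rough illustration of the base-2 method from that page, here is a minimal sketch of a bit-by-bit integer square root. The class and method names (`SoftSqrt`, `IntegerSqrt`) are made up for this example, and it only handles unsigned 32-bit integers; it is not how Math.Sqrt itself is implemented.

```csharp
using System;

public static class SoftSqrt
{
    // Digit-by-digit (base 2) square root: returns floor(sqrt(n)).
    public static uint IntegerSqrt(uint n)
    {
        uint result = 0;
        uint bit = 1u << 30; // highest power of four that fits in 32 bits

        // Start from the highest power of four that is <= n.
        while (bit > n) bit >>= 2;

        while (bit != 0)
        {
            if (n >= result + bit)
            {
                // This bit belongs to the root: subtract and record it.
                n -= result + bit;
                result = (result >> 1) + bit;
            }
            else
            {
                result >>= 1;
            }
            bit >>= 2;
        }
        return result;
    }
}
```

It uses only shifts, additions and comparisons, which is why it shows up in environments without an FPU. On a desktop CPU the hardware instruction will still be far faster.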
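public double Sqrt(int number)
{
    if (number < 0) return double.NaN; // no real square root
    if (number == 0) return 0;         // avoids dividing by zero below
    double x = number / 2.0;           // number / 2 would be integer division
    // Newton's iteration: x = (x + number / x) / 2
    for (int i = 0; i < 100; i++) x = (x + number / x) / 2d;
    return x;
}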
Very crude method but if I used something more elaborate such as log method, you could ask "and how can I implement the log method?"
To recreate the System.Math.Sqrt function, just do this:
public static double Sqrt(double n) => Math.Pow(n, 0.5); // 1 / 2 would be integer division, i.e. 0
I have values greater than 1.97626258336499E-323
I can't use BigInteger either, as it handles only integer values.
Any help is appreciated
Here is the code that failed. It also failed with some of the solutions suggested by other users:
BigValue / (Math.Pow((1 + ret), j));
Where BigValue is something like 15000.25
ret is -0.99197104212554987
And j will go to around 500-600.
I don't understand how to use the Rational class for this either.
Try BigRational from Microsoft's base class library team. It uses big integers to store the value as a fraction, but supports all kinds of operators.
When it comes to printing it as a decimal, I think you need to write your own implementation for that. I have one written somewhere for this class, but I'd have to find it.
Here is something that may be useful. I used it a while back with no problems. It is a .NET BigDecimal class; you can download it from CodePlex (or just look at the source):
http://bigdecimal.codeplex.com/releases/view/44790
It is written in VB.Net (.Net 4.0), but that shouldn't matter.
An example of its use in C#: http://www.dreamincode.net/forums/blog/217/entry-2522-the-madman-scribblings/
You will have to switch to a language that has a BigFloat type (e.g. Haskell and Python have native packages), or else find a third-party big float library with a C# binding. There was some discussion of such a binding for GNU MP, but it faded away. Maybe you'll write one!
See also this discussion where MS BigRational is discussed. However this is different from BigFloat.
One solution might be to do your problems in log space instead.
Your example would become:
exp(log(Number) - log(1-0.9999999) * 400)
Learn how to use logs to work with numbers like these. Yes, you CAN use a big float package, but that is overkill almost always. You can usually get what you need using logs.
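To make the log-space idea concrete for the expression from the question, BigValue / Math.Pow(1 + ret, j), here is a minimal sketch. The names `LogSpace`, `Log10Quotient` and `Format` are invented for illustration, and the sketch assumes 1 + ret and BigValue are both positive. With the values from the question the decimal exponent comes out at roughly a thousand or more, far outside what a double can hold, so the result is kept as a base-10 logarithm and only split into mantissa and exponent for display.

```csharp
using System;
using System.Globalization;

public static class LogSpace
{
    // log10 of bigValue / (1 + ret)^j, computed without forming the quotient.
    // Assumes bigValue > 0 and 1 + ret > 0.
    public static double Log10Quotient(double bigValue, double ret, int j)
    {
        return Math.Log10(bigValue) - j * Math.Log10(1.0 + ret);
    }

    // Turns a base-10 logarithm into "mantissaE±exponent" text for display.
    public static string Format(double log10Value)
    {
        double exponent = Math.Floor(log10Value);
        double mantissa = Math.Pow(10.0, log10Value - exponent);
        return string.Format(CultureInfo.InvariantCulture,
                             "{0:F6}E{1:+0;-0}", mantissa, exponent);
    }
}
```

For example, `LogSpace.Format(LogSpace.Log10Quotient(15000.25, -0.99197104212554987, 600))` yields a printable value even though the quotient itself would overflow. Further arithmetic should stay in log space (add logs to multiply, subtract to divide).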
I've encountered non-optimal code in several open source projects, where programmers did not think about what they were using.
There is up to a tenfold performance difference between the two cases, because Math.Pow uses the Exp and Ln functions internally, as explained in this answer.
Plain multiplication is better than Math.Pow in most cases (with small powers), but the best, of course, is the exponentiation by squaring algorithm.
Thus, I think that the compiler or JITter must perform such optimization with powers and other functions. Why is it still not introduced? Am I right?
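For reference, exponentiation by squaring for an integer exponent can be sketched as below. The `FastPow` class is a made-up name for this illustration. Note that it only covers integer exponents, and its result can differ from Math.Pow in the last bits because the rounding happens at different steps.

```csharp
using System;

public static class FastPow
{
    // Computes x^n for an integer n using O(log |n|) multiplications.
    public static double Pow(double x, int n)
    {
        long e = n;          // widen so that negating int.MinValue is safe
        if (e < 0)
        {
            x = 1.0 / x;     // x^(-n) == (1/x)^n
            e = -e;
        }

        double result = 1.0;
        while (e > 0)
        {
            if ((e & 1) == 1) result *= x; // this bit of the exponent is set
            x *= x;                        // square the base for the next bit
            e >>= 1;
        }
        return result;
    }
}
```

`FastPow.Pow(4, 7)` needs five multiplications instead of the six in `4*4*4*4*4*4*4`; the saving only becomes noticeable for larger exponents.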
Read the answer you've referenced again. It clearly states that the CRT uses a pow() function which Microsoft bought from Intel. The example you see using Math.Log and Math.Exp is one that the writer of the article found in a programming book.
The "problem" with general exponentiation methods is that they are built to produce the most accurate results for all cases. This often results in sub-optimal performance for certain cases. To increase the performance of those cases, conditional logic must be added, which costs performance in all cases. Because squaring or cubing a value is so simple to write without the Math.Pow method, there is no need to optimize these cases and take the extra loss for all the others.
I would say that would be a bad idea, because the two methods do NOT return the same results every time.
Here is a small test script:
var r = new Random();
// Random is not thread-safe, so the samples are generated sequentially.
var allIdentical = Enumerable.Range(0, 1000).All(p =>
{
    var d = r.NextDouble();
    var pow = Math.Pow(d, 2.0);
    var sqr = d * d;
    var identical = pow == sqr;
    if (!identical)
        MessageBox.Show(d.ToString()); // first value where the results differ
    return identical;
});
The two implementations have different accuracies. A reliable calculation should be reproducible. If, for example, the squaring optimization were applied only in the release build, then the debug and release versions would return different results. That can make debugging quite a mess...
I'm working on an image processing library which extends OpenCV, HALCON, ... . The library must work with .NET Framework 3.5, and since my experience with .NET is limited I would like to ask some questions regarding performance.
I have encountered a few specific things which I cannot properly explain to myself, and I would like to ask you a) why they happen and b) what the best practice is for dealing with them.
My first question is about Math.Pow. I already found some answers here on StackOverflow which explain the why quite well (a), but not what to do about it (b). My benchmark program looks like this:
Stopwatch watch = new Stopwatch(); // from the System.Diagnostics namespace
watch.Start();
for (int i = 0; i < 1000000; i++)
{
    double result = Math.Pow(4, 7); // the function call
}
watch.Stop();
The result was not very nice (~300 ms on my computer; I ran the test 10 times and calculated the average value).
My first idea was to check whether this is because it is a static function. So I implemented my own class:
class MyMath
{
public static double Pow (double x, double y) //Using some expensive functions to calculate the power
{
return Math.Exp(Math.Log(x) * y);
}
public static double PowLoop (double x, int y) // Using Loop
{
double res = x;
for(int i = 1; i < y; i++)
res *= x;
return res;
}
public static double Pow7 (double x) // Using inline calls
{
return x * x * x * x * x * x * x;
}
}
The third thing I checked was what happens if I replace Math.Pow(4,7) directly with 4*4*4*4*4*4*4.
The results are (the average out of 10 test runs)
300 ms Math.Pow(4,7)
356 ms MyMath.Pow(4,7) //gives wrong rounded results
264 ms MyMath.PowLoop(4,7)
92 ms MyMath.Pow7(4)
16 ms 4*4*4*4*4*4*4
So my situation is basically this: don't use Math for Pow. My only problem is... do I really have to implement my own Math class now? It seems somehow inefficient to implement a class of my own just for the power function. (By the way, PowLoop and Pow7 are even faster in the Release build, by ~25%, while Math.Pow is not.)
So my final questions are
a) Am I wrong not to use Math.Pow at all (except maybe for fractional exponents)? That makes me somewhat sad.
b) If you have code to optimize, do you really write out all such mathematical operations directly?
c) Is there maybe already a faster (open-source^^) library for mathematical operations?
d) The source of my question is basically this: I had assumed that the .NET Framework itself already provides very optimized code and compile results for such basic operations, be it the Math class or array handling, and I was a little surprised how much I would gain by writing my own code. Are there other general areas in C# where I should look out because I cannot trust the framework directly?
A few things to bear in mind:
You probably don't need to optimise this bit of code. You've just done a million calls to the function in less than a second. Is this really going to cause big problems in your program?
Math.Pow is probably fairly optimal anyway. At a guess, it will be calling a proper numerics library written in a lower level language, which means you shouldn't expect orders of magnitude increases.
Numerical programming is harder than you think. Even the algorithms that you think you know how to calculate, aren't calculated that way. For example, when you calculate the mean, you shouldn't just add up the numbers and divide by how many numbers you have. (Modern numerics libraries use a two pass routine to correct for floating point errors.)
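To illustrate the point about the mean, one common two-pass scheme looks like the sketch below. This is an illustration under my own assumptions (the `Stats.Mean` name is invented), not a claim about what any particular numerics library does internally.

```csharp
using System;

public static class Stats
{
    // Two-pass mean: the second pass adds back the rounding error
    // that accumulated in the naive sum of the first pass.
    public static double Mean(double[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.");

        // Pass 1: naive estimate.
        double sum = 0.0;
        foreach (double v in values) sum += v;
        double mean = sum / values.Length;

        // Pass 2: average of the residuals around the estimate.
        // In exact arithmetic this is zero; in floating point it is a small correction.
        double residual = 0.0;
        foreach (double v in values) residual += v - mean;

        return mean + residual / values.Length;
    }
}
```

For small, well-behaved inputs the correction is zero and the result matches the naive mean. It starts to matter when many values of very different magnitude are summed.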
That said, if you decide that you definitely do need to optimise, then consider using integers rather than floating point values, or outsourcing this to another numerics library.
Firstly, integer operations are much faster than floating point. If you don't need floating point values, don't use the floating point data type. This is generally true for any programming language.
Secondly, as you have stated yourself, Math.Pow can handle reals. It makes use of a much more intricate algorithm than a simple loop. No wonder it is slower than simply looping. If you get rid of the loop and just do n multiplications, you also cut out the overhead of setting up the loop, which makes it faster. But if you don't use a loop, you have to know the value of the exponent beforehand; it can't be supplied at runtime.
I am not really sure why Math.Exp and Math.Log are faster. But if you use Math.Log, you can't find the power of negative values.
Basically, ints are faster and avoiding loops avoids extra overhead, but you are trading off some flexibility when you go for those. It is generally a good idea to avoid reals when all you need are integers, but in this case coding up a custom function when one already exists seems a little too much.
The question you have to ask yourself is whether this is worth it. Is Math.Pow actually slowing your program down? And in any case, the Math.Pow already bundled with your language is often the fastest or very close to that. If you really wanted to make an alternate implementation that is really general purpose (i.e. not limited to only integers, positive values, etc.), you will probably end up using the same algorithm used in the default implementation anyway.
When you are talking about making a million iterations of a line of code then obviously every little detail will make a difference.
Math.Pow() is a function call which will be substantially slower than your manual 4*4...*4 example.
Don't write your own class, as it's doubtful you'll be able to write anything more optimised than the standard Math class.
In Java I run:
System.out.println(Math.log(249.0/251.0));
Output: -0.008000042667076265
In C# I run:
Math.Log (x/y); // where x, y are almost assuredly 249.0 and 251.0 respectively
Output: -0.175281838 (printed out later in the program)
Google claims:
Log(249.0/251.0)
Output: -0.00347437439
And MacOS claims about the same thing (the first difference between Google and Snow Leopard is at about 10^-8, which is negligible).
Is there any reason that these results should all vary so widely or am I missing something very obvious? (I did check that java and C# both use base e). Even mildly different values of e don't seem to account for such a big difference. Any suggestions?
EDIT:
Verifying on Wolfram Alpha seems to suggest that Java is right (or that Wolfram Alpha uses Java Math for logarithms...) and that my C# program doesn't have the right input, but I am disinclined to believe this because taking (e^(google result) - 249/251) gives me an error of 0.0044 which is pretty big in my opinion, suggesting that there is a different problem at hand...
You're looking at logarithms with different bases:
Java's System.out.println(Math.log(249.0/251.0)); is a natural log (base e)
C#'s Math.Log (x,y); gives the log of x with base specified by y
Google's Log(249.0/251.0) gives the log base 10
Though I don't get the result you do from C# (Math.Log( 249.0, 251.0) == 0.998552147171426).
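A small sketch that puts the three variants side by side. The `LogBases` wrapper and its method names are only for illustration; the Math.Log and Math.Log10 overloads are the standard ones.

```csharp
using System;

public static class LogBases
{
    // Natural logarithm (base e), same as Java's Math.log.
    public static double Natural(double value)
    {
        return Math.Log(value);
    }

    // Common logarithm (base 10), which is what Google's Log computes.
    public static double Base10(double value)
    {
        return Math.Log10(value);
    }

    // Two-argument overload: logarithm of value in the given base.
    public static double InBase(double value, double newBase)
    {
        return Math.Log(value, newBase);
    }
}
```

With 249.0/251.0 as input, Natural gives about -0.0080000 and Base10 about -0.0034744, while InBase(249.0, 251.0) gives about 0.9985521, matching the figures quoted above.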
You have a mistake somewhere in your C# program between where the log is calculated and where it is printed out. Math.Log gives the correct answer:
class P
{
static void Main()
{
System.Console.WriteLine(System.Math.Log(249.0/251.0));
}
}
prints out -0.00800004266707626
Even experienced programmers write C# code like this sometimes:
double x = 2.5;
double y = 3;
if (x + 0.5 == 3) {
// this will never be executed
}
Basically, it's common knowledge that two doubles (or floats) can never be precisely equal to each other, because of the way the computer handles floating point arithmetic.
The problem is, everyone sort-of knows this, but code like this is still all over the place. It's just so easy to overlook.
Questions for you:
How have you dealt with this in your development organization?
Is this such a common thing that we all should be screaming really loud for VS2010 to include a compile-time warning whenever someone compares two doubles/floats?
UPDATE: Folks, thanks for the comments. I want to clarify that I most certainly understand that the code above is incorrect. Yes, you never want to == compare doubles and floats. Instead, you should use epsilon-based comparison. That's obvious. The real question here is "how do you pinpoint the problem", not "how do you solve the technical issue".
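For readers who have not seen it, an epsilon-based comparison might look like the sketch below. The helper name `FloatCompare.NearlyEqual` and the default tolerances are assumptions chosen for illustration; suitable tolerances depend on the magnitudes and error sources in the actual calculation.

```csharp
using System;

public static class FloatCompare
{
    // True if a and b differ by no more than an absolute tolerance
    // or a tolerance relative to the larger magnitude.
    public static bool NearlyEqual(double a, double b,
                                   double relTol = 1e-9, double absTol = 1e-12)
    {
        if (a == b) return true;                               // exact match, including equal infinities
        if (double.IsNaN(a) || double.IsNaN(b)) return false;  // NaN equals nothing

        double diff = Math.Abs(a - b);
        if (double.IsInfinity(diff)) return false;             // one side infinite, the other not

        double scale = Math.Max(Math.Abs(a), Math.Abs(b));
        return diff <= Math.Max(absTol, relTol * scale);
    }
}
```

With this helper, `FloatCompare.NearlyEqual(0.1 + 0.2, 0.3)` is true even though `0.1 + 0.2 == 0.3` is false for doubles.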
Floating point values certainly can be equal to each other, and in the case you've given they always will be equal. You should almost never compare for equality using equals, but you do need to understand why - and why the example you've shown isn't appropriate.
I don't think it's something the compiler should necessarily warn about, but you may want to see whether it's something FxCop can pick up on. I can't see it in the warning list, but it may be there somewhere...
Personally I'm reasonably confident that competent developers would be able to spot this in code review, but that does rely on you having a code review in place to start with. It also relies on your developers knowing when to use double and when to use decimal, which is something I've found often isn't the case...
static int _yes = 0;
static int _no = 0;
static void Main(string[] args)
{
for (int i = 0; i < 1000000; i++)
{
double x = 1;
double y = 2;
if (y - 1 == x)
{
_yes++;
}
else
{
_no++;
}
}
Console.WriteLine("Yes: " + _yes);
Console.WriteLine("No: " + _no);
Console.Read();
}
Output
Yes: 1000000
No: 0
In our organization we have a lot of financial calculations and we don't use float and double for such tasks. We use Decimal in .NET, BigDecimal in Java and Numeric in MSSQL to escape round-off errors.
This article describes the problem: What Every Computer Scientist Should Know About Floating-Point Arithmetic
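A tiny sketch of the difference between double and Decimal for decimal fractions. The `RoundOffDemo` class is a made-up name for this example.

```csharp
using System;

public static class RoundOffDemo
{
    // double stores binary fractions, so 0.1 and 0.2 are only approximations.
    public static bool DoubleSumMatches()
    {
        double a = 0.1, b = 0.2;
        return a + b == 0.3;   // false
    }

    // decimal stores base-10 digits, so these values are represented exactly.
    public static bool DecimalSumMatches()
    {
        decimal a = 0.1m, b = 0.2m;
        return a + b == 0.3m;  // true
    }
}
```

Decimal is not free of rounding in general (1m / 3m is still rounded), but it avoids the surprises that come from converting decimal fractions to binary.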
If FxCop or similar (as Jon suggests) doesn't work out for you a more heavy handed approach might be to take a copy of the code - replace all instances of float or double with a class you've written that's somewhat similar to System.Double, except that you overload the == operator to generate a warning!
I don't know if this is feasible in practice as I've not tried it - but let us know if you do try :-)
Mono's Gendarme is an FxCop-like tool. It has a rule called AvoidFloatingPointEqualityRule under the Correctness category. You could try it to find instances of this error in your code. I haven't used it, but it should analyse regular .NET DLLs. The FxCop rule with the same name was removed long ago.