C# and strange exponentiation of complex numbers

Have been playing around with complex numbers in C#, and I found something interesting. Not sure if it's a bug or if I have just missed something, but when I run the following code:
var num = new Complex(0, 1);
var complexPow = Complex.Pow(num, 2);
var numTimesNum = num * num;
Console.WriteLine("Complex.Pow(num, 2) = {0} num*num = {1}", complexPow.ToString(), numTimesNum.ToString());
I get the following output:
Complex.Pow(num, 2) = (-1, 1.22460635382238E-16) num*num = (-1, 0)
If memory serves, a complex number times itself should be just -1 with no imaginary part (or rather an imaginary part of 0). So why doesn't Complex.Pow(num, 2) give -1? Where does the 1.22460635382238E-16 come from?
If it matters, I am using Mono since I'm not in Windows at the moment. I think it might be 64-bit, since I am running a 64-bit OS, but I am not sure where to check.
Take care,
Kerr.
EDIT:
Ok, I explained poorly. I of course mean the square of i is -1, not the square of any complex number. Thanks for pointing it out. A bit tired right now, so my brain doesn't work too well, lol.
EDIT 2:
To clarify something, I have been reading a little math lately and decided to make a small scripting language for fun. Ok, "scripting language" is an overstatement; it just evaluates equations and nothing more.

You're seeing floating-point imprecision.
1.22460635382238E-16 is actually 0.00000000000000012....
Complex.Pow() is probably implemented through De Moivre's formula, using trigonometry to compute arbitrary powers.
It is therefore subject to inaccuracy from both floating-point arithmetic and trig.
It apparently does not have any special-case code for integral powers, which can be simpler.
Ordinary complex multiplication only involves simple arithmetic, so it is not subject to floating-point inaccuracies when the numbers are integral.
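For illustration, here is a hand-rolled polar-form power in C# - the same idea as De Moivre's formula, though whether the BCL does exactly this is an implementation detail. The tiny residue comes from sin(pi) not being exactly zero for the double approximation of pi. (PolarPow is a hypothetical helper written for this sketch.)
using System;
using System.Numerics;

class PolarPowDemo
{
    // Sketch: raise z to a power via polar form, |z|^n * (cos(n*phi) + i*sin(n*phi)).
    static Complex PolarPow(Complex z, double n)
    {
        double r = Complex.Abs(z);                      // magnitude
        double phi = Math.Atan2(z.Imaginary, z.Real);   // argument (pi/2 for i, not exactly representable)
        return Complex.FromPolarCoordinates(Math.Pow(r, n), n * phi);
    }

    static void Main()
    {
        var i = new Complex(0, 1);
        Console.WriteLine(PolarPow(i, 2));   // roughly (-1, 1.22E-16): the sin(pi) round-off
        Console.WriteLine(i * i);            // (-1, 0): plain arithmetic, no trig involved
    }
}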

So why doesn't Complex.Pow(num, 2) give -1? Where does the 1.22460635382238E-16 come from?
I suspect it's a rounding error, basically. It's a very small number, after all. I don't know the details of Complex.Pow, but I wouldn't be at all surprised to find it used some trigonometry somewhere - you may well be observing the fact that pi/2 isn't exactly representable as a double.
The * operation is able to avoid this by being more simply defined - Complex.Pow could be special-cased to just use x * x where the power is 2, but I expect that hasn't been done. Instead, a general algorithm is used which gives an answer very close to the hypothetical one, but which can result in small errors.
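If that matters for your expression evaluator, a thin wrapper that special-cases small integral exponents avoids the trig path entirely. This is a hypothetical helper, not anything in the BCL; it assumes using System.Numerics.
// Sketch: use repeated multiplication for small non-negative integer powers,
// and fall back to Complex.Pow for everything else.
static Complex IntAwarePow(Complex z, double exponent)
{
    int n = (int)exponent;
    if (n == exponent && n >= 0 && n <= 64)
    {
        var result = Complex.One;
        for (int k = 0; k < n; k++)
            result *= z;
        return result;        // gives exactly (-1, 0) for i squared
    }
    return Complex.Pow(z, exponent);
}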

So why doesn't Complex.Pow(num, 2) give -1? Where does the 1.22460635382238E-16 come from?
Standard issues with floating point representations, and the algorithms that are used to compute Complex.Pow (it's not as simple as you think). Note that 1.22460635382238E-16 is extremely small, close to machine epsilon. Additionally, a key fact here is that (0, 1) is really (1, pi / 2) in polar coordinates, and pi / 2 doesn't have an exact representation in floating point.
If this is at all uncomfortable to you, I recommend reading What Every Computer Scientist Should Know About Floating-Point Arithmetic. It should be required reading for any college CS curriculum.

Related

margin of error when checking if a point lies on a line

When working with floating-point values it is as easy as breathing to run into approximation errors by comparing quantities that should be the same. I want to know if there is a way, built into some .NET (or even external) library for C#, to sidestep the problem.
An example: rather than comparing two float values like this
if(myVector3.X == anotherVector3.X)
I would appreciate something like this
if(myVector3.X.isInTheNeighbourhood(anotherVector3.X))
This is not well-written, I know; it is just to simplify the explanation. What I am actually doing is checking whether a point (Vector3) lies on a line segment. So basically the calculations I make are nothing more than
(x - x1)/(x2 - x1) = (y - y1)/(y2 - y1) = (z - z1)/(z2 - z1)
But these values won't always be exactly the same, so I need to write some code that includes a tolerance, a sort of mathematical neighbourhood, to accept values close enough to the line.
Hope that I made myself clear.
Has anyone a solution for this?
I'm proposing the use of an exact predicate. This is an alternative approach to what you actually asked, but it might be worth considering.
Suppose your three points really are at the locations indicated by their respective double-precision coordinates. A simple double-precision computation like the one suggested in your question will likely still return a wrong result. However, there are techniques to obtain exact results and detect these cases. One technique is turning all numbers into arbitrary-precision floating-point (or integer) numbers and doing most of the computation with integers under the hood. Another, which should work faster on modern hardware, expresses intermediate results as sums of doubles. E.g. when computing a+b, you obtain two resulting numbers: the first is the sum as you'd usually compute it, the other is a correction term that notes down the error. In many cases, the bigger terms are sufficient to make a choice, which leads to the concept of adaptive precision.
All of this, including the application to geometric predicates, has been outlined nicely by Jonathan Richard Shewchuk in his page Adaptive Precision Floating-Point Arithmetic and Fast Robust Predicates for Computational Geometry. He has a paper on it, and some C code which should be possible to adapt to C#. Or perhaps to compile in C and link to C#, thus forming a mixed language project. Note however that it makes some assumptions on how the compiler treats floating point computations. In particular, you have to be careful that certain intermediate results don't have excess precision, like the 80-bit numbers stored in a 80387-style floating point unit. I've looked into this myself recently, and the easiest solution might be asking the compiler to use SSE instructions instead of x87 ones.
In your case, you are asking whether a point p lies on the segment connecting p1 and p2. One predicate which would be very much in the spirit of that paper would be the position of a fourth point in relation to a plane spanned by three others: it is either above, below or within that plane, and the orient3d predicate can tell you which. So one possible approach would be taking four arbitrary points q1 through q4 in general position, i.e. not all in a single plane, and no three on a single line. Then you could check whether p,p1,p2,qi are coplanar (sign is zero) for all i ∈ {1,2,3,4}. If they are, then p does at least lie on the line spanned by p1 and p2. Next you could check the orientations of p,p1,qi,qj for different i,j until you find a non-zero sign, then you could see whether p,p2,qi,qj has a different sign. If it does, then p is indeed between p1 and p2, hence on the same line. If you find no i,j such that p,p1,qi,qj is non-zero, then p=p1. Likewise if you have found one non-zero sign, but the corresponding sign of p,p2,qi,qj is zero, then p=p2. It is up to you whether you want to include the endpoints of your line segment. This paragraph might not be the most elegant way to do this, but it leverages the existing orient3d implementation, so it might be easier to use than writing a new predicate from scratch.
Note that in most cases, a point will not exactly lie on a line segment, due to rounding of the point coordinates themselves. The above will only help you to reliably detect those rare cases when it does, so you can deal with them. It may also allow you to make consistent choices for other predicates as well. In the world of CGAL, this approach would be termed “exact predicates, inexact constructions”, since the predicates can be computed reliably, but the geometric objects on which they operate are still subject to approximation. If you really need points that reliably lie on the line segments, and not just close by, then an exact construction approach using some exact number type would be preferable. Or you go with the approaches the other answers suggest.
You need to calculate the distance of the point to the line. That's simple mathematics. Then you decide how far of a distance "close enough" is in your case.
This problem is described as floating-point tolerance in this article, which points out the importance of measuring relative rather than absolute tolerance when comparing large values: http://realtimecollisiondetection.net/blog/?p=89
I've never had a situation where large floating-point values are possible, so I've always hard-coded a magic value into my comparisons. E.g.:
if (Math.Abs(value1 - value2) < 0.05f)
{
...
}
Which is "bad" but it works so long as value1 and value2 can't get too big.
In your case you really want to calculate the distance of the point from the line and then check that the distance is small enough, rather than testing that the point is exactly on the line. I am rubbish at math myself so I don't fully understand it, but let me copy someone else's answer for calculating the shortest distance between a 3D point and a 3D line:
Vector3 u = new Vector3(x2 - x1, y2 - y1, z2 - z1);    // direction of the line (from point 1 to point 2)
Vector3 pq = new Vector3(x3 - x1, y3 - y1, z3 - z1);   // from the line's start to the point (x3, y3, z3)
float distance = Vector3.Cross(pq, u).Length() / u.Length();
3D Perpendicular Point on Line From 3D point
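Putting those pieces together, a full check might be sketched like this. The names and the tolerance are illustrative only, and it assumes a Vector3 type with static Cross/Dot and a Length() method, as in XNA or System.Numerics:
// Sketch: p is "on" the segment p1-p2 if its distance to the line is within
// tolerance and its projection falls between the endpoints. Assumes p1 != p2.
static bool IsPointOnSegment(Vector3 p, Vector3 p1, Vector3 p2, float tolerance)
{
    Vector3 u = p2 - p1;                              // segment direction
    Vector3 pq = p - p1;                              // from p1 to the candidate point
    float distance = Vector3.Cross(pq, u).Length() / u.Length();
    if (distance > tolerance)
        return false;
    float t = Vector3.Dot(pq, u) / Vector3.Dot(u, u); // projection parameter along the segment
    return t >= 0f && t <= 1f;
}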

How to combine float representation with discontinuous function?

I have read tons of things about floating-point error and floating-point approximation, and all that.
The thing is: I never read an answer to a real-world problem. And today, I came across a real-world problem. And this is really bad, and I really don't know how to get around it.
Take a look at this example :
[TestMethod]
public void TestMethod1()
{
    float t1 = 8460.32F;
    float t2 = 5990;
    var x = t1 - t2;
    var y = F(x);
    Assert.AreEqual(x, y);
}

float F(float x)
{
    if (x <= 2470.32F) { return x; }
    else { return -x; }
}
x is supposed to be 2470.32. But in fact, due to rounding error, its value is 2470.32031.
Most of the time, this is not a problem. Functions are continuous, and all is good; the result is only off by a tiny value.
But here, we have a discontinuous function, and the error is really, really big. The test fails exactly at the point of discontinuity.
How can I handle the rounding error with discontinuous functions?
The key problem here is:
The function has a large (and significant) change in output value in certain cases when there is a small change in input value.
You are passing an incorrect input value to the function.
As you write, “due to rounding error, [x’s value] is 2470.32031”. Suppose you could write any code you desire—simply describe the function to be performed, and a team of expert programmers will provide complete, bug-free source code within seconds. What would you tell them?
The problem you are posing is, “I am going to pass a wrong value, 2470.32031, to this function. I want it to know that the correct value is something else and to provide the result for the correct value, which I did not pass, instead of the incorrect value, which I did pass.”
In general, that problem is impossible to solve, because it is impossible to distinguish when 2470.32031 is passed to the function but 2470.32 is intended from when 2470.32031 is passed to the function and 2470.32031 is intended. You cannot expect a computer to read your mind. When you pass incorrect input, you cannot expect correct output.
What this tells us is that no solution inside of the function F is possible. Therefore, we must zoom out and look at the larger problem. You must examine whether the value passed to F can be improved (calculated in a better way or with higher precision or with supplementary information) or whether the nature of the problem is such that, when 2470.32031 is passed, 2470.32 is always intended, so that this knowledge can be incorporated into F.
NOTE: this answer is essentially the same as Eric's.
It just highlights the testing point of view, since a test is a form of specification.
The problem here is that testMethod1 does not test F.
Rather, it tests that the conversion of the decimal quantity 8460.32 to float, together with the float subtraction, is inexact.
But is that the intention of the test?
All you can say is that in certain bad conditions (near discontinuity), a small error on input will result in a large error on output, so the test could express that it is an expected result.
Note that the function F is almost perfect, except maybe for the float value 2470.32F itself.
Indeed, the floating-point approximation rounds the decimal up, by excess (the computed x exceeds 2470.32 by exactly 1/3200).
So the answer should be:
Assert.AreEqual(F(2470.32F), -2470.32F); /* because 2470.32F exceeds the decimal 2470.32 */
If you want to test such low level requirements, you'll need a library with high (arbitrary/infinite) precision to perform the tests.
If you can't afford such imprecision in function F, then float is a mismatch, and you'll have to find another implementation with increased, arbitrary, or infinite precision.
It's up to you to specify your needs, and TestMethod1 should express this specification better than it does right now.
If you need the 8460.32 number to be exactly that without rounding error, you could look at the .NET Decimal type which was created explicitly to represent base 10 fractional numbers without rounding error. How they perform that magic is beyond me.
Now, I realize this may be impractical for you because the float presumably comes from somewhere, and refactoring it to the Decimal type could be too much work; but if you need that much precision for the discontinuous function that relies on the value, you'll either need a more precise type or some mathematical trickery. Perhaps there is some way to always ensure that a float is created whose rounding error leaves it less than the actual number? I'm not sure whether such a thing exists, but it would also solve your issue.
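For what it's worth, the subtraction from the question is exact once the literals are written as decimal (a minimal sketch):
decimal t1 = 8460.32m;
decimal t2 = 5990m;
Console.WriteLine(t1 - t2);   // 2470.32 exactly: decimal stores base-10 digits, so no binary rounding here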
You have three numbers represented in your application, and you have accepted imprecision in each of them by representing them as floats.
So I think you can reasonably claim that your program is working correctly:
(one number +/- some imprecision) - (another number +/- some imprecision)
is not quite bigger than (another number +/- some imprecision).
Viewed in decimal representation on paper it looks wrong, but that's not what you've implemented. What's the origin of the data? How precisely was 8460.32 known? Had it been 8460.31999, what should have happened? 8460.32001? Was the original value known to such precision?
In the end if you want to model more accuracy use a different data type, as suggested elsewhere.
I always just assume that when comparing floating point values a small margin of error is needed because of rounding issues. In your case, this would most likely mean choosing values in your test method that aren't quite so stringent--e.g., define a very small error constant and subtract that value from x. Here's a SO question that relates to this.
Edit to better address the concluding question: Presumably it doesn't matter what the function outputs on the discontinuity exactly, so test just slightly on either side of it. If it does matter, then really about the best you can do is allow either of two outputs from the function at that point.
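With MSTest that could look like the sketch below. The eps margin is arbitrary; the point is to test clearly on either side of the breakpoint rather than exactly at it.
[TestMethod]
public void TestEitherSideOfDiscontinuity()
{
    const float eps = 0.01f;                                  // arbitrary margin around 2470.32
    Assert.AreEqual(2470.32f - eps, F(2470.32f - eps));       // below the breakpoint: F is the identity
    Assert.AreEqual(-(2470.32f + eps), F(2470.32f + eps));    // above the breakpoint: F negates
}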

Multiply or divide in C#/.NET [duplicate]

Here's a silly fun question:
Let's say we have to perform a simple operation where we need half of the value of a variable. There are typically two ways of doing this:
y = x / 2.0;
// or...
y = x * 0.5;
Assuming we're using the standard operators provided with the language, which one has better performance?
I'm guessing multiplication is typically better so I try to stick to that when I code, but I would like to confirm this.
Although personally I'm interested in the answer for Python 2.4-2.5, feel free to also post an answer for other languages! And if you'd like, feel free to post other fancier ways (like using bitwise shift operators) as well.
Python:
time python -c 'for i in xrange(int(1e8)): t=12341234234.234 / 2.0'
real 0m26.676s
user 0m25.154s
sys 0m0.076s
time python -c 'for i in xrange(int(1e8)): t=12341234234.234 * 0.5'
real 0m17.932s
user 0m16.481s
sys 0m0.048s
multiplication is 33% faster
Lua:
time lua -e 'for i=1,1e8 do t=12341234234.234 / 2.0 end'
real 0m7.956s
user 0m7.332s
sys 0m0.032s
time lua -e 'for i=1,1e8 do t=12341234234.234 * 0.5 end'
real 0m7.997s
user 0m7.516s
sys 0m0.036s
=> no real difference
LuaJIT:
time luajit -O -e 'for i=1,1e8 do t=12341234234.234 / 2.0 end'
real 0m1.921s
user 0m1.668s
sys 0m0.004s
time luajit -O -e 'for i=1,1e8 do t=12341234234.234 * 0.5 end'
real 0m1.843s
user 0m1.676s
sys 0m0.000s
=>it's only 5% faster
conclusions: in Python it's faster to multiply than to divide, but as you get closer to the CPU using more advanced VMs or JITs, the advantage disappears. It's quite possible that a future Python VM would make it irrelevant
Always use whatever is the clearest. Anything else you do is trying to outsmart the compiler. If the compiler is at all intelligent, it will do the best to optimize the result, but nothing can make the next guy not hate you for your crappy bitshifting solution (I love bit manipulation by the way, it's fun. But fun != readable)
Premature optimization is the root of all evil. Always remember the three rules of optimization!
Don't optimize.
If you are an expert, see rule #1
If you are an expert and can justify the need, then use the following procedure:
Code it unoptimized.
Determine how fast is "fast enough" -- note which user requirement/story requires that metric.
Write a speed test.
Test the existing code -- if it's fast enough, you're done.
Recode it optimized.
Test the optimized code. If it doesn't meet the metric, throw it away and keep the original.
If it meets the test, keep the original code in as comments.
Also, doing things like removing inner loops when they aren't required or choosing a linked list over an array for an insertion sort are not optimizations, just programming.
I think this is getting so nitpicky that you would be better off doing whatever makes the code more readable. Unless you perform the operations thousands, if not millions, of times, I doubt anyone will ever notice the difference.
If you really have to make the choice, benchmarking is the only way to go. Find what function(s) are giving you problems, then find out where in the function the problems occur, and fix those sections. However, I still doubt that a single mathematical operation (even one repeated many, many times) would be a cause of any bottleneck.
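Since the question is about C#/.NET, such a micro-benchmark could be sketched with Stopwatch. This is illustrative only; a serious measurement needs a Release build, warm-up, and ideally a tool like BenchmarkDotNet.
using System;
using System.Diagnostics;

class DivideVsMultiply
{
    static void Main()
    {
        const int N = 100000000;
        double x = 12341234234.234, acc = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++) acc += (x + i) / 2.0;     // division
        sw.Stop();
        Console.WriteLine("division:       {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < N; i++) acc += (x + i) * 0.5;     // multiplication
        sw.Stop();
        Console.WriteLine("multiplication: {0} ms", sw.ElapsedMilliseconds);

        Console.WriteLine(acc);   // use the accumulator so the loops cannot be optimized away
    }
}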
Multiplication is faster, division is more accurate. You'll lose some precision if your divisor isn't a power of 2:
y = x / 3.0;
y = x * 0.333333; // how many 3's should there be, and how will the compiler round?
Even if you let the compiler figure out the inverted constant to perfect precision, the answer can still be different.
x = 100.0;
x / 3.0 == x * (1.0/3.0) // is false in the test I just performed
The speed issue is only likely to matter in C/C++ or JIT languages, and even then only if the operation is in a loop at a bottleneck.
If you want to optimize your code but still be clear, try this:
y = x * (1.0 / 2.0);
The compiler should be able to do the divide at compile-time, so you get a multiply at run-time. I would expect the precision to be the same as in the y = x / 2.0 case.
Where this may matter a LOT is on embedded processors where floating-point arithmetic has to be emulated in software.
Just going to add something for the "other languages" option.
C: Since this is just an academic exercise that really makes no difference, I thought I would contribute something different.
I compiled to assembly with no optimizations and looked at the result.
The code:
int main() {
    volatile int a;
    volatile int b;
    asm("## 5/2\n");
    a = 5;
    a = a / 2;
    asm("## 5*0.5");
    b = 5;
    b = b * 0.5;
    asm("## done");
    return a + b;
}
compiled with gcc tdiv.c -O1 -o tdiv.s -S
the division by 2:
movl $5, -4(%ebp)
movl -4(%ebp), %eax
movl %eax, %edx
shrl $31, %edx
addl %edx, %eax
sarl %eax
movl %eax, -4(%ebp)
and the multiplication by 0.5:
movl $5, -8(%ebp)
movl -8(%ebp), %eax
pushl %eax
fildl (%esp)
leal 4(%esp), %esp
fmuls LC0
fnstcw -10(%ebp)
movzwl -10(%ebp), %eax
orw $3072, %ax
movw %ax, -12(%ebp)
fldcw -12(%ebp)
fistpl -16(%ebp)
fldcw -10(%ebp)
movl -16(%ebp), %eax
movl %eax, -8(%ebp)
However, when I changed those ints to doubles (which is what python would probably do), I got this:
division:
flds LC0
fstl -8(%ebp)
fldl -8(%ebp)
flds LC1
fmul %st, %st(1)
fxch %st(1)
fstpl -8(%ebp)
fxch %st(1)
multiplication:
fstpl -16(%ebp)
fldl -16(%ebp)
fmulp %st, %st(1)
fstpl -16(%ebp)
I haven't benchmarked any of this code, but just by examining it you can see that with integers, division by 2 is shorter than multiplication by 0.5. With doubles, multiplication is shorter because the compiler uses the processor's floating-point opcodes, which probably run faster (but I actually don't know) than not using them for the same operation. So ultimately this answer has shown that the performance of multiplication by 0.5 vs. division by 2 depends on the implementation of the language and the platform it runs on. Ultimately the difference is negligible and is something you should virtually never worry about, except in terms of readability.
As a side note, you can see that in my program main() returns a + b. When I take the volatile keyword away, you'll never guess what the assembly looks like (excluding the program setup):
## 5/2
## 5*0.5
## done
movl $5, %eax
leave
ret
it folded the division, the multiplication, AND the addition into a single constant-load instruction! Clearly you don't have to worry about this if the optimizer is at all respectable.
Sorry for the overly long answer.
Firstly, unless you are working in C or ASSEMBLY, you're probably in a higher level language where memory stalls and general call overheads will absolutely dwarf the difference between multiply and divide to the point of irrelevance. So, just pick what reads better in that case.
If you're talking from a very high level it won't be measurably slower for anything you're likely to use it for. You'll see in other answers, people need to do a million multiply/divides just to measure some sub-millisecond difference between the two.
If you're still curious, from a low level optimisation point of view:
Divide tends to have a significantly longer pipeline than multiply. This means it takes longer to get the result, but if you can keep the processor busy with non-dependent tasks, then it doesn't end up costing you any more than a multiply.
How long the pipeline difference is is completely hardware dependent. The last hardware I used was something like 9 cycles for an FPU multiply and 50 cycles for an FPU divide. Sounds like a lot, but then you'd lose 1000 cycles for a memory miss, so that puts things in perspective.
An analogy is putting a pie in a microwave while you watch a TV show. The total time it took you away from the TV show is how long it was to put it in the microwave, and take it out of the microwave. The rest of your time you still watched the TV show. So if the pie took 10 minutes to cook instead of 1 minute, it didn't actually use up any more of your tv watching time.
In practice, if you're going to get to the level of caring about the difference between multiply and divide, you need to understand pipelines, caches, branch stalls, out-of-order execution, and pipeline dependencies. If this doesn't sound like where you were intending to go with this question, then the correct answer is to ignore the difference between the two.
Many (many) years ago it was absolutely critical to avoid divides and always use multiplies, but back then memory hits were less relevant, and divides were much worse. These days I rate readability higher, but if there's no readability difference, I think it's a good habit to opt for multiplies.
Write whichever more clearly states your intent.
After your program works, figure out what's slow, and make that faster.
Don't do it the other way around.
Do whatever you need. Think of your reader first, do not worry about performance until you are sure you have a performance problem.
Let compiler do the performance for you.
Actually there is a good reason why, as a general rule of thumb, multiplication will be faster than division. Floating-point division in hardware is done either with shift-and-conditional-subtract algorithms ("long division" with binary numbers) or - more likely these days - with iterations like Goldschmidt's algorithm. Shift and subtract needs at least one cycle per bit of precision (the iterations are nearly impossible to parallelize, unlike the shift-and-add steps of multiplication), and iterative algorithms do at least one multiplication per iteration. In either case, it's highly likely that the division will take more cycles. Of course this does not account for quirks in compilers, data movement, or precision. By and large, though, if you are coding an inner loop in a time-sensitive part of a program, writing 0.5 * x or 1.0/2.0 * x rather than x / 2.0 is a reasonable thing to do. The pedantry of "code what's clearest" is absolutely true, but all three of these are so close in readability that the pedantry is in this case just pedantic.
If you are working with integers or other non-floating-point types, don't forget your bit-shifting operators: << >>
int y = 10;
y = y >> 1;
Console.WriteLine("value halved: " + y);
y = y << 1;
Console.WriteLine("now value doubled: " + y);
Multiplication is usually faster - certainly never slower.
However, if it is not speed critical, write whichever is clearest.
I have always learned that multiplication is more efficient.
Floating-point division is (generally) especially slow, so while floating-point multiplication is also relatively slow, it's probably faster than floating-point division.
But I'm more inclined to answer "it doesn't really matter", unless profiling has shown that division is a bottleneck compared with multiplication. I'm guessing, though, that the choice of multiplication vs. division isn't going to have a big performance impact in your application.
This becomes more of a question when you are programming in assembly or perhaps C. I figure that with most modern languages that optimization such as this is being done for me.
Be wary of "guessing multiplication is typically better so I try to stick to that when I code".
In the context of this specific question, "better" here means "faster", which is not very useful.
Thinking about speed can be a serious mistake. There are profound error implications in the specific algebraic form of the calculation.
See Floating Point arithmetic with error analysis. See Basic Issues in Floating Point Arithmetic and Error Analysis.
While some floating-point values are exact, most floating point values are an approximation; they are some ideal value plus some error. Every operation applies to the ideal value and the error value.
The biggest problems come from trying to manipulate two nearly-equal numbers. The right-most bits (the error bits) come to dominate the results.
>>> for i in range(7):
... a=1/(10.0**i)
... b=(1/10.0)**i
... print i, a, b, a-b
...
0 1.0 1.0 0.0
1 0.1 0.1 0.0
2 0.01 0.01 -1.73472347598e-18
3 0.001 0.001 -2.16840434497e-19
4 0.0001 0.0001 -1.35525271561e-20
5 1e-05 1e-05 -1.69406589451e-21
6 1e-06 1e-06 -4.23516473627e-22
In this example, you can see that as the values get smaller, the difference between nearly equal numbers creates non-zero results where the correct answer is zero.
I've read somewhere that multiplication is more efficient in C/C++; no idea regarding interpreted languages - the difference is probably negligible due to all the other overhead.
Unless it becomes an issue, stick with what is more maintainable/readable - I hate it when people tell me this, but it's so true.
I would suggest multiplication in general, because you don't have to spend the cycles ensuring that your divisor is not 0. This doesn't apply, of course, if your divisor is a constant.
As with posts #24 (multiplication is faster) and #30 - but sometimes they are both just as easy to understand:
1*1e-6F;
1/1e6F;
I find them both just as easy to read, and I have to repeat them billions of times. So it is useful to know that multiplication is usually faster.
There is a difference, but it is compiler dependent. At first, on vs2003 (C++), I got no significant difference for double types (64-bit floating point). However, running the tests again on vs2010, I detected a huge difference, up to a factor of 4 faster for multiplication. Tracking this down, it seems that vs2003 and vs2010 generate different FPU code.
On a Pentium 4, 2.8 GHz, vs2003:
Multiplication: 8.09
Division: 7.97
On a Xeon W3530, vs2003:
Multiplication: 4.68
Division: 4.64
On a Xeon W3530, vs2010:
Multiplication: 5.33
Division: 21.05
It seems that on vs2003 a division in a loop (where the divisor was used multiple times) was translated to a multiplication by the inverse. On vs2010 this optimization is not applied any more (I suppose because there is a slightly different result between the two methods). Note also that the CPU performs divisions faster as soon as the numerator is 0.0. I do not know the precise algorithm hardwired in the chip, but maybe it is number dependent.
Edit 18-03-2013: the observation for vs2010
Java android, profiled on Samsung GT-S5830
public void Multiplication()
{
    float a = 1.0f;
    for (int i = 0; i < 1000000; i++)
    {
        a *= 0.5f;
    }
}
public void Division()
{
    float a = 1.0f;
    for (int i = 0; i < 1000000; i++)
    {
        a /= 2.0f;
    }
}
Results?
Multiplication(): time/call: 1524.375 ms
Division(): time/call: 1220.003 ms
Division is about 20% faster than multiplication (!)
After such a long and interesting discussion, here is my take on this: there is no final answer to this question. As some people pointed out, it depends both on the hardware (cf. piotrk's and gast128's answers) and on the compiler (cf. Javier's tests). If speed is not critical, and if your application does not need to process huge amounts of data in real time, you may opt for clarity and use a division, whereas if processing speed or processor load is an issue, multiplication might be the safer bet.
Finally, unless you know exactly on what platform your application will be deployed, a benchmark is meaningless. And for code clarity, a single comment would do the job!
Here's a silly fun answer:
x / 2.0 is not equivalent to x * 0.5
Let's say you wrote this method on Oct 22, 2008.
double half(double x) => x / 2.0;
Now, 10 years later you learn that you can optimize this piece of code. The method is referenced in hundreds of formulas throughout your application. So you change it, and experience a remarkable 5% performance improvement.
double half(double x) => x * 0.5;
Was it the right decision to change the code? In maths, the two expressions are indeed equivalent. In computer science, that does not always hold true. Please read Minimizing the effect of accuracy problems for more details. If your calculated values are - at some point - compared with other values, you will change the outcome of edge cases. E.g.:
double quantize(double x)
{
    if (half(x) > threshold)
        return 1;
    else
        return -1;
}
Bottom line is: once you settle on either of the two, stick to it!
Well, if we assume that an add/subtract operation costs 1, then a multiply costs 5, and a divide costs about 20.
Technically there is no such thing as division, there is just multiplication by inverse elements. For example, you never divide by 2, you in fact multiply by 0.5.
'Division' - let's kid ourselves that it exists for a second - is always harder than multiplication because to 'divide' x by y one first needs to compute the value y^{-1} such that y*y^{-1} = 1 and then do the multiplication x*y^{-1}. If you already know y^{-1}, then not having to calculate it from y must be an optimization.

Why does System.MidpointRounding.AwayFromZero not round up in this instance?

In .NET, why does System.Math.Round(1.035, 2, MidpointRounding.AwayFromZero) yield 1.03 instead of 1.04? I feel like the answer to my question lies in the section labeled "Note to Callers" at http://msdn.microsoft.com/en-us/library/ef48waz8.aspx, but I'm unable to wrap my head around the explanation.
Your suspicion is exactly right. Numbers with a fractional portion, when expressed as literals in .NET, are by default doubles. A double (like a float) is an approximation of a decimal value, not a precise decimal value; it is the closest value that can be represented in base 2 (binary). In this case, the approximation is ever so vanishingly on the small side of 1.035. If you write it using an explicit Decimal it works as you expect:
Console.WriteLine(Math.Round(1.035m, 2, MidpointRounding.AwayFromZero));
Console.ReadKey();
To understand why doubles and floats work the way they do, imagine representing the number 1/3 in decimal (or binary, which suffers from the same problem). You can't - it translates to .3333333..., meaning that representing it accurately would require an infinite amount of memory.
Computers get around this using approximations. I'd explain precisely how, but I'd probably get it wrong. You can read all about it here though: http://en.wikipedia.org/wiki/IEEE_754-1985
The binary representation of 1.035d is 0x3FF08F5C28F5C28F, which is in fact 1.03499999999999992006394222699E0, so System.Math.Round(1.035, 2, MidpointRounding.AwayFromZero) yields 1.03 instead of 1.04, which is correct.
However, the binary representation of 4.005d is 0x4010051EB851EB85, which is 4.00499999999999989341858963598, so System.Math.Round(4.005, 2, MidpointRounding.AwayFromZero) should yield 4.00, but it yields 4.01, which is wrong (or a smart 'fix'). If you check it in MS SQL with select ROUND(CAST(4.005 AS float), 2), you get 4.00.
I don't understand why .NET applies this 'smart fix', which makes things worse.
You can check binary representation of a double at:
http://www.binaryconvert.com/convert_double.html
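Both observations are easy to verify from C# itself (a quick sketch; "G17" prints enough digits to show the stored value, and BitConverter.DoubleToInt64Bits exposes the raw bits):
using System;

class ShowBits
{
    static void Main()
    {
        Console.WriteLine(1.035.ToString("G17"));                                  // 1.0349999999999999
        Console.WriteLine(BitConverter.DoubleToInt64Bits(1.035).ToString("X16"));  // 3FF08F5C28F5C28F
        Console.WriteLine(Math.Round(1.035, 2, MidpointRounding.AwayFromZero));    // 1.03 (double)
        Console.WriteLine(Math.Round(1.035m, 2, MidpointRounding.AwayFromZero));   // 1.04 (decimal)
    }
}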
It's because the binary representation of 1.035 is closer to 1.03 than to 1.04.
For better results do it this way -
decimal result = decimal.Round(1.035m, 2, MidpointRounding.AwayFromZero);
I believe the example you're referring to is a different issue; as far as I understand, they're saying that 0.1 isn't stored, in float, as exactly 0.1. It's actually slightly off because of how floats are stored in binary. So suppose it actually looks more like 0.0999999999999 (or similar), something very, very slightly less than 0.1 - so slightly that it doesn't tend to make much difference. One noticeable difference, though, is that adding this to your number and rounding can appear to go the wrong way: even though the numbers are extremely close, the result is still considered "less than" the .5 needed to round up.
If I misunderstood that page, I hope somebody corrects me :)
I don't see how it relates to your call, though, because you're being more explicit. Perhaps it's just storing your number in a similar fashion.
At a guess I'd say that internally 1.035 can't be represented in binary as exactly 1.035 and it's probably (under the hood) 1.0349999999999999, which would be why it rounds down.
Just a guess though.
Decimal rounding is OK, but it is a slow operation.
A faster workaround is to multiply the number by (1.0 + 1E-15) first; that does the trick and works like a charm with the MidpointRounding.AwayFromZero enum option.
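As a sketch, that workaround might look like the helper below. The name is made up for this example, and note that it deliberately nudges magnitudes upward before rounding, so treat it as a pragmatic hack rather than a correct rounding rule.
static double RoundMidpointUp(double value, int digits)
{
    // Nudge by ~1 part in 10^15 so values stored just below the decimal midpoint
    // (e.g. 1.0349999...) cross it before Math.Round sees them.
    return Math.Round(value * (1.0 + 1e-15), digits, MidpointRounding.AwayFromZero);
}
// RoundMidpointUp(1.035, 2) -> 1.04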

long/large numbers and modulus in .NET

I'm currently writing a quick custom encoding method where I stamp a key with a number to verify that it is a valid key.
Basically I was taking whatever number comes out of the encoding and multiplying it by a key.
I would then deploy the product of those numbers to the user/customer who purchases the key. I wanted to simply use (Code % Key == 0) to verify that the key is valid, but for large values the mod operation does not seem to behave as expected.
Number = 468721387;
Key = 12345678;
Code = Number * Key;
Using the numbers above:
Code % Key == 11418772
And for smaller numbers it would correctly return 0. Is there a reliable way to check divisibility for a long in .NET?
Thanks!
EDIT:
Ok, tell me if I'm special and missing something...
long a = DateTime.Now.Ticks;
long b = 12345;
long c = a * b;
long d = c % b;
d == 10001 (Bad)
and
long a = DateTime.Now.Ticks;
long b = 12;
long c = a * b;
long d = c % b;
d == 0 (Good)
What am I doing wrong?
As others have said, your problem is integer overflow. You can make this more obvious by checking "Check for arithmetic overflow/underflow" in the "Advanced Build Settings" dialog. When you do so, you'll get an OverflowException when you perform DateTime.Now.Ticks * 12345.
One simple solution is just to change "long" to "decimal" (or "double") in your code.
In .NET 4.0, there is a new BigInteger class.
Finally, you say you're "... writing a quick custom encoding method ...", so a simple homebrew solution may be satisfactory for your needs. However, if this is production code, you might consider more robust solutions involving cryptography or something from a third-party who specializes in software licensing.
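If BigInteger is an option, the divisibility check from the question becomes straightforward. A minimal sketch, assuming .NET 4.0+ and a reference to System.Numerics:
using System;
using System.Numerics;

class KeyCheck
{
    static void Main()
    {
        BigInteger number = 468721387;
        BigInteger key = 12345678;
        BigInteger code = number * key;       // arbitrary precision, so no overflow

        Console.WriteLine(code);              // 5786683315615386
        Console.WriteLine(code % key == 0);   // True
    }
}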
The answers that say that integer overflow is the likely culprit are almost certainly correct; you can verify that by putting a "checked" block around the multiplication and seeing if it throws an exception.
But there is a much larger problem here that everyone seems to be ignoring.
The best thing to do is to take a large step back and reconsider the wisdom of this entire scheme. It appears that you are attempting to design a crypto-based security system but you are clearly not an expert on cryptographic arithmetic. That is a huge red warning flag. If you need a crypto-based security system DO NOT ATTEMPT TO ROLL YOUR OWN. There are plenty of off-the-shelf crypto systems that are built by experts, heavily tested, and readily available. Use one of them.
If you are in fact hell-bent on rolling your own crypto, getting the math right in 64 bits is the least of your worries. 64 bit integers are way too small for this crypto application. You need to be using a much larger integer size; otherwise, finding a key that matches the code is trivial.
Again, I cannot emphasize strongly enough how difficult it is to construct correct crypto-based security code that actually protects real users from real threats.
Integer Overflow...see my comment.
The value of the multiplication you're doing overflows the int data type and causes it to wrap (int values fall between -2,147,483,648 and 2,147,483,647).
Pick a more appropriate data type to hold a value as large as 5786683315615386 (the result of your multiplication).
UPDATE
Your new example changes things a little.
You're using long, but now you're using System.DateTime.Ticks which on Mono (not sure about the MS platform) is returning 633909674610619350.
When you multiply that by a large number, you are now overflowing a long just like you were overflowing an int previously. At that point, you'll probably need to use a double to work with the values you want (decimal may work as well, depending on how large your multiplier gets).
Apparently, your Code fails to fit in the int data type. Try using long instead:
long code = (long)number * key;
The (long) cast is necessary. Without the cast, the multiplication will be done in 32-bit integer arithmetic (assuming the number and key variables are typed int) and the result will be cast to long, which is not what you want. By casting one of the operands to long, you tell the compiler to perform the multiplication on two long numbers.
