Parallel for loop with BigINtegers - c#

So i am working on a program to factor a given number. you write a number in the command line and it gives you all the factors. now, doing this sequentially is VERY slow. because it uses only one thread, if i'm not mistaking. now, i thought about doing it with Parallel.For and it worked, only with integers, so wanted to try it with BigIntegers
heres the normal code:
public static void Factor(BigInteger f)
{
for(BigInteger i = 1; i <= f; i++)
{
if (f % i == 0)
{
Console.WriteLine("{0} is a factor of {1}", i, f);
}
}
}
the above code should be pretty easy to understand. but as i said, this Very slow for big numbers (above one billion it starts to get impractical) and heres my parallel code:
public static void ParallelFacotr(BigInteger d)
{
Parallel.For<BigInteger>(0, d, new ParallelOptions() { MaxDegreeOfParallelism = Environment.ProcessorCount }, i =>
{
try
{
if (d % i == 0)
{
Console.WriteLine("{0} is a factor of {1}", i, d);
}
}
catch (DivideByZeroException)
{
Console.WriteLine("Done!");
Console.ReadKey();
Launcher.Main();
}
});
}
now, the above code (parallel one) works just fine with integers (int) but its VERY fast. it factored 1000.000.000 in just 2 seconds. so i thought why not try it with bigger integers. and also, i thought that putting <Biginteger> after the parallel.for would do it. but it doesn't. so, how do you work with bigintegers in parallel for loop? and i already tried just a regular parallel loop with a bigInteger as the argument, but then it gives an error saying that it cannot convert from BigInteger to int. so how do you

Improve your algorithm efficiency first.
While it is possible to use BigInteger you will not have CPU ALU arithmetical logic to resolve arbitrarily big numbers logic in hardware, so it will be noticeably slower. So unless you need bigger numbers than 9 quintillion or exactly 9,223,372,036,854,775,807 then you can use type long.
A second thing to not is that you do not need to loop over all elements as it needs to be multiple of something, so you can reduce
for(BigInteger i = 1; i <= f; i++)
for(long i = 1; i <= Math.Sqrt(f); i++)
That would mean that instead of needing to iterate over 1000000000 items you iterate over 31623.
Additionally, if you still plan on using BigInt then check the parameters:
It should be something in the lines of
Parallel.For(
0,
(int)d,
() => BigInteger.Zero,
(x, state, subTotal) => subTotal + BigInteger.One,
Just for trivia. Some programming languages are more efficient in solving problems than others and in this case, there is a languages Wolfram (previously Mathematica) when solving problems is simpler, granted that you know what you are doing.
However they do have google alternative that answers you directly and it has a decent AI that processes your natural language to give you an exact answer as best as it could.
So finding factors of numbers is easy as:
Factor[9223372036854775809]
or use web api https://www.wolframalpha.com/input/?i=factor+9223372036854775809
You can also call Wolfram kernel from C#, but terms and conditions apply.

Related

Should try-catch be avoided for known cases

I have a case which I know will happen but very scarce. For example in every 10 thousand times the code runs, this might happen once.
I can check for this case by a simple if but this if will run many times with no use.
On the other hand I can place the code in try-catch block and when that special case happens I do what is needed to recover.
The question is which one is better? I know that generally speaking try-catch should not be used for known cases because of the overhead issue and also the application logic should not rely on catch code, but running an if multiple times will have more performance issue. I have tested this using this small test code:
static void Main(string[] args)
{
Stopwatch sc = new Stopwatch();
var list = new List<int>();
var rnd = new Random();
for (int i = 0; i < 100000000; i++)
{
list.Add(rnd.Next());
}
sc.Start();
DoWithIf(list);
sc.Stop();
Console.WriteLine($"Done with IFs in {sc.ElapsedMilliseconds} milliseconds");
sc.Restart();
DoWithTryCatch(list);
sc.Stop();
Console.WriteLine($"Done with TRY-CATCH in {sc.ElapsedMilliseconds} milliseconds");
Console.ReadKey();
}
private static int[] DoWithTryCatch(List<int> list)
{
var res = new int[list.Count ];
try
{
for (int i = 0; i < list.Count; i++)
{
res[i] = list[i];
}
return res;
}
catch
{
return res;
}
}
private static int[] DoWithIf(List<int> list)
{
var res = new int[list.Count - 1];
for (int i = 0; i < list.Count; i++)
{
if (i < res.Length)
res[i] = list[i];
}
return res;
}
This code simply copies a lot of numbers to an array with not enough size. In my machine checking array bounds each time takes around 210 milliseconds to run while using try-catch that will hit catch once runs in around 190 milliseconds.
Also if you think it depends on the case my case is that I get push notifications in an app and will check if I have the topic of the message. If not I will get and store the topic information for next messages. There are many messages in few topics.
So, it would be accurate to say that in your test, the if option was slower than the try...catch option by 20 milliseconds, for a loop of 100000000 times.
That translates to 20 / 100,000,000 - that's 0.0000002 milliseconds for each iteration.
Do you really think that kind of nano-optimization is worth writing code that goes goes against proper design standards?
Exceptions are for exceptional cases, the things that you can't control or can't test in advance - for instance, when you are reading data from a database and the connection terminates in the middle - stuff like that.
Using exceptions for things that can be easily tested with simple code - well, that's just plain wrong.
If, for instance, you would have demonstrated a meaningful performance difference between these two options then perhaps you could justify using try...catch instead of if - but that's clearly not the case here.
So, to summarize - use if, not try...catch.
You should design your code for clarity, not for performance.
Write code that conveys the algorithm it is implementing in the clearest way possible.
Set performance goals and measure your code's performance against them.
If your code doesn't measure to your performance goals, Find the bottle necks and treat them.
Don't go wasting your time on nano-optimizations when you design the code.
In your case, you have somehow missed the obvious optimization: if you worry that calling an if 100.000 times is too much... don't?
private static int[] DoWithIf(List<int> list)
{
var res = new int[list.Count - 1];
var bounds = Math.Min(res.Length, list.Count)
for (int i = 0; i < bounds; i++)
{
res[i] = list[i];
}
return res;
}
So I know this is only a test case, but the answer is: optimize if you need it and for what you need it. If you have something in a loop that's supposedly costly, then try to move it out of the loop. Optimize based on logic, not based on compiler constructs. If you are down to optimizing compiler constructs, you should not be coding in a managed and/or high level language anyway.

Shuffling an array bottlenecks on Random.Nex(int)

I've been working on a small piece of code that sorts the provided array. The array should be sorted as fast as possible. Randomization is not that important. After profiling the method I found out that the biggest hog is Random.Next. Which takes up about 70% of the method execution time. After searching online for faster random generators I found no plug and play libraries that offer any improved performance.
So I was wondering whether there are any ways to improve the performance of this code any more.
[MethodImpl(MethodImplOptions.NoInlining)]
private static void Shuffle(byte[] chars)
{
for (var i = 0; i < chars.Length; i++)
{
var index = rnd.Next(chars.Length);
byte tmpStore = chars[index];
chars[index] = chars[i];
chars[i] = tmpStore;
}
}
Alright, this is getting into micro-optimization territory.
Random.Next(int) actually performs some ops internally that we can optimize out:
int index = (int)(rnd.Next() * (1.0 / int.Max) * chars.Length);
Since you're using the same maxValue over and over in a loop, a trivial optimization would be to precalculate your denominator outside of the loop. This way we get rid of an int->double conversion and a multiply:
double d = chars.Length / (double)int.Max;
And then:
int index = (int)(rnd.Next() * d);
On a separate note: your shuffle isn't going to have a uniform distribution. See Jeff Atwood's post The Danger of Naïveté which deals specifically with this subject and shows how to perform a uniform Fisher-Yates shuffle.
If n^n isn't too big for the double range, you could create one random double number, multiply it by n^n, then use modulo(n) each iteration as the current random number prior to dividing the random result by n as preparation for the next iteration.

Fibonacci program does not complete execution

Problem Description
By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms
Program
using System;
class fibonacci {
// function to return Nth value of fibonacci
public static long fibo_n(long N) {
long fibon=0;
switch (N) {
case 1: fibon=1; break;
case 2: fibon=2; break;
}
while(N>2) {
fibon=fibo_n(N-1)+fibo_n(N-2);
}
return fibon;
}
}
class fibo_tester {
static void Main() {
int N=2;
long sum=0;
while(fibonacci.fibo_n(N) <= 13) {
sum = sum + fibonacci.fibo_n(N);
N=N+3;
}
Console.WriteLine("Sum is {0}: ", sum);
}
}
I reduced the number to 13 for testing instead of 4 million, but it still hangs. Could someone please advise?
EDIT 2
switch (N) {
case 1: fibon=1; break;
case 2: fibon=2; break;
default: fibon=fibo_n(N-1)+fibo_n(N-2); break;
}
while(N>2) {
fibon=fibo_n(N-1)+fibo_n(N-2);
}
Is going to loop indefinitely, N is never updated in the loop. You probably need to remove the While() clause and change it all to return fibo_n(N-1)+fibo_n(N-2); I'm not sure what you are doing with the switch statements, etc. But that should be a start.
I would actually replace it with this (if you want to use the switch):
class fibonacci
{
// function to return Nth value of fibonacci
public static long fibo_n(long N)
{
switch (N)
{
case 0:
return 0;
case 1:
return 1;
default:
return fibo_n(N - 1) + fibo_n(N - 2);
}
}
}
You might want to consider storing the values for each value of N in a dictionary or some sort of set so that you can look them up later. Since in your main program it seems like you are going to be looping over the values (successively larger N's that may have already been calculated). I'm not sure what the N+3 is about in your main loop, but you probably are going to miss something there (false assumption of the N-1, N-2 in the recursion perhaps?)
Also, if summing and dependent on your platform and how large of a value you test for (sicne your testing the sum of the first X Fibonacci numbers) you may have to use a ulong or find some other datatype that can handle larger numbers. If I don't change everything from long to ulong on my system, the values wrap around. Fibonacci number can't be negative anyways, so why not use a ulong, or uint64 or something with more bits.
You are using BOTH a loop AND recursion - you'll need to pick one or the other. The problem spot in your code is here:
while (N>2) {
fibon = fibo_n(N-1)+fibo_n(N-2);
}
In this spot the value of N never actually changes - that happens in the recursive step. One example (not good) way to write this is:
public static long fibo_n(long N) {
if (N <= 0) return 0;
if (N == 1) return 1;
if (N <= 4) return N - 1;
return fibo_n(N-1) + fibo_n(N-2);
}
Best of luck!
Performance Note:
The reason this isn't a good approach is because function calls use memory in C# and they also take time. Looping is always faster than recursion and in this case there isn't a good reason to solve this with recursion (as opposed to a simple loop). I'll leave the code as is so you can compare to what you have, but I think you can find much better better code options elsewhere (although not all of these will be in C#).
Let's say you call your method with N equal to 3.
The switch doesn't do anything.
N is greater than 2, so we go into the while loop. The while loop does something; it computes a value, assigns it to fibon. Then N, since it hasn't changed, is still greater than 2, so we compute it again. It'll still be greater than two, so we compute it again. We never stop computing it.
Every time you assign a value to fibon you should instead be returning that value, because you're done; you have nothing left to compute. There's also no need for the while loop. You never need to execute that code more than once.
Your implementation is also super inefficient. To compute fib(5) you compute fib(4) and fib(3). To Then when computing fib(4) you compute fib(3) [again] and fib(2), when computing fib(3) both times they each compute fib(2) (so that's three times there), and if we do the math we end up seeing that you perform on the order of 2^n computations when computing fib(n). On top of that, you're calling fib n times as you're counting up from zero in your main loop. That's tons and tons and tons of needless work re-computing the same values. If you just have a simple loop adding the numbers together as you go you can compute the result with on the order of N computations.

Is there a better performing functional version of this iterative algorithm in C#?

I was hoping to figure out a way to write the below in a functional style with extension functions. Ideally this functional style would perform well compared to the iterative/loop version. I'm guessing that there isn't a way. Probably because of the many additional function calls and stack allocations, etc.
Fundamentally I think the pattern which is making it troublesome is that it both calculates a value to use for the Predicate and then needs that calculated value again as part of the resulting collection.
// This is what is passed to each function.
// Do not assume the array is in order.
var a = (0).To(999999).ToArray().Shuffle();
// Approx times in release mode (on my machine):
// Functional is avg 20ms per call
// Iterative is avg 5ms per call
// Linq is avg 14ms per call
private static List<int> Iterative(int[] a)
{
var squares = new List<int>(a.Length);
for (int i = 0; i < a.Length; i++)
{
var n = a[i];
if (n % 2 == 0)
{
int square = n * n;
if (square < 1000000)
{
squares.Add(square);
}
}
}
return squares;
}
private static List<int> Functional(int[] a)
{
return
a
.Where(x => x % 2 == 0 && x * x < 1000000)
.Select(x => x * x)
.ToList();
}
private static List<int> Linq(int[] a)
{
var squares =
from num in a
where num % 2 == 0 && num * num < 1000000
select num * num;
return squares.ToList();
}
An alternative to Konrad's suggestion. This avoids the double calculation, but also avoids even calculating the square when it doesn't have to:
return a.Where(x => x % 2 == 0)
.Select(x => x * x)
.Where(square => square < 1000000)
.ToList();
Personally, I wouldn't sweat the difference in performance until I'd seen it be significant in a larger context.
(I'm assuming that this is just an example, by the way. Normally you'd possibly compute the square root of 1000000 once and then just compare n with that, to shave off a few milliseconds. It does require two comparisons or an Abs operation though, of course.)
EDIT: Note that a more functional version would avoid using ToList at all. Return IEnumerable<int> instead, and let the caller transform it into a List<T> if they want to. If they don't, they don't take the hit. If they only want the first 5 values, they can call Take(5). That laziness could be a big performance win over the original version, depending on the context.
Just solving your problem of the double calculation:
return (from x in a
let sq = x * x
where x % 2 == 0 && sq < 1000000
select sq).ToList();
That said, I’m not sure that this will lead to much performance improvement. Is the functional variant actually noticeably faster than the iterative one? The code offers quite a lot of potential for automated optimisation.
How about some parallel processing? Or does the solution have to be LINQ (which I consider to be slow).
var squares = new List<int>(a.Length);
Parallel.ForEach(a, n =>
{
if(n < 1000 && n % 2 == 0) squares.Add(n * n);
}
The Linq version would be:
return a.AsParallel()
.Where(n => n < 1000 && n % 2 == 0)
.Select(n => n * n)
.ToList();
I don't think there's a functional solution that will be completely on-par with the iterative solution performance-wise. In my timings (see below) the 'functional' implementation from the OP appears to be around twice as slow as the iterative implementation.
Micro-benchmarks like this one are prone to all manner of issues. A common tactic in dealing with variability problems is to repeatedly call the method being timed and compute an average time per call - like this:
// from main
Time(Functional, "Functional", a);
Time(Linq, "Linq", a);
Time(Iterative, "Iterative", a);
// ...
static int reps = 1000;
private static List<int> Time(Func<int[],List<int>> func, string name, int[] a)
{
var sw = System.Diagnostics.Stopwatch.StartNew();
List<int> ret = null;
for(int i = 0; i < reps; ++i)
{
ret = func(a);
}
sw.Stop();
Console.WriteLine(
"{0} per call timings - {1} ticks, {2} ms",
name,
sw.ElapsedTicks/(double)reps,
sw.ElapsedMilliseconds/(double)reps);
return ret;
}
Here are the timings from one session:
Functional per call timings - 46493.541 ticks, 16.945 ms
Linq per call timings - 46526.734 ticks, 16.958 ms
Iterative per call timings - 21971.274 ticks, 8.008 ms
There are a host of other challenges as well: strobe-effects with the timer use, how and when the just-in-time compiler does its thing, the garbage collector running its collections, the order that competing algorithms are run, the type of cpu, the OS swapping other processes in and out, etc.
I tried my hand at a little optimization. I removed the square from the test (num * num < 1000000) - changing it to (num < 1000) - which seemed safe since there are no negatives in the input - that is, I took the square root of both sides of the inequality. Surprisingly, I got different results as compared to the methods in the OP - there were only 500 items in my optimized output as compared to the 241,849 from the three implementations in the OP implementations. So why the difference? Much of the input when squared overflows 32 bit integers, so those extra 241,349 items came from numbers that when squared overflowed to either negative numbers or numbers under 1 million while still passing our evenness test.
optimized (functional) timing:
Optimized per call timings - 16849.529 ticks, 6.141 ms
This was one of the functional implementations altered as suggested. It output the 500 items passing the criteria as expected. It is deceptively "faster" only because it output fewer items than the iterative solution.
We can make the original implementations blow up with an OverflowException by adding a checked block around their implementations. Here is a checked block added to the "Iterative" method:
private static List<int> Iterative(int[] a)
{
checked
{
var squares = new List<int>(a.Length);
// rest of method omitted for brevity...
return squares;
}
}

Dynamic compilation for performance

I have an idea of how I can improve the performance with dynamic code generation, but I'm not sure which is the best way to approach this problem.
Suppose I have a class
class Calculator
{
int Value1;
int Value2;
//..........
int ValueN;
void DoCalc()
{
if (Value1 > 0)
{
DoValue1RelatedStuff();
}
if (Value2 > 0)
{
DoValue2RelatedStuff();
}
//....
//....
//....
if (ValueN > 0)
{
DoValueNRelatedStuff();
}
}
}
The DoCalc method is at the lowest level and it is called many times during calculation. Another important aspect is that ValueN are only set at the beginning and do not change during calculation. So many of the ifs in the DoCalc method are unnecessary, as many of ValueN are 0. So I was hoping that dynamic code generation could help to improve performance.
For instance if I create a method
void DoCalc_Specific()
{
const Value1 = 0;
const Value2 = 0;
const ValueN = 1;
if (Value1 > 0)
{
DoValue1RelatedStuff();
}
if (Value2 > 0)
{
DoValue2RelatedStuff();
}
....
....
....
if (ValueN > 0)
{
DoValueNRelatedStuff();
}
}
and compile it with optimizations switched on the C# compiler is smart enough to only keep the necessary stuff. So I would like to create such method at run time based on the values of ValueN and use the generated method during calculations.
I guess that I could use expression trees for that, but expression trees works only with simple lambda functions, so I cannot use things like if, while etc. inside the function body. So in this case I need to change this method in an appropriate way.
Another possibility is to create the necessary code as a string and compile it dynamically. But it would be much better for me if I could take the existing method and modify it accordingly.
There's also Reflection.Emit, but I don't want to stick with it as it would be very difficult to maintain.
BTW. I'm not restricted to C#. So I'm open to suggestions of programming languages that are best suited for this kind of problem. Except for LISP for a couple of reasons.
One important clarification. DoValue1RelatedStuff() is not a method call in my algorithm. It's just some formula-based calculation and it's pretty fast. I should have written it like this
if (Value1 > 0)
{
// Do Value1 Related Stuff
}
I have run some performance tests and I can see that with two ifs when one is disabled the optimized method is about 2 times faster than with the redundant if.
Here's the code I used for testing:
public class Program
{
static void Main(string[] args)
{
int x = 0, y = 2;
var if_st = DateTime.Now.Ticks;
for (var i = 0; i < 10000000; i++)
{
WithIf(x, y);
}
var if_et = DateTime.Now.Ticks - if_st;
Console.WriteLine(if_et.ToString());
var noif_st = DateTime.Now.Ticks;
for (var i = 0; i < 10000000; i++)
{
Without(x, y);
}
var noif_et = DateTime.Now.Ticks - noif_st;
Console.WriteLine(noif_et.ToString());
Console.ReadLine();
}
static double WithIf(int x, int y)
{
var result = 0.0;
for (var i = 0; i < 100; i++)
{
if (x > 0)
{
result += x * 0.01;
}
if (y > 0)
{
result += y * 0.01;
}
}
return result;
}
static double Without(int x, int y)
{
var result = 0.0;
for (var i = 0; i < 100; i++)
{
result += y * 0.01;
}
return result;
}
}
I would usually not even think about such an optimization. How much work does DoValueXRelatedStuff() do? More than 10 to 50 processor cycles? Yes? That means you are going to build quite a complex system to save less then 10% execution time (and this seems quite optimistic to me). This can easily go down to less then 1%.
Is there no room for other optimizations? Better algorithms? An do you really need to eliminate single branches taking only a single processor cycle (if the branch prediction is correct)? Yes? Shouldn't you think about writing your code in assembler or something else more machine specific instead of using .NET?
Could you give the order of N, the complexity of a typical method, and the ratio of expressions usually evaluating to true?
It would surprise me to find a scenario where the overhead of evaluating the if statements is worth the effort to dynamically emit code.
Modern CPU's support branch prediction and branch predication, which makes the overhead for branches in small segments of code approach zero.
Have you tried to benchmark two hand-coded versions of the code, one that has all the if-statements in place but provides zero values for most, and one that removes all of those same if branches?
If you are really into code optimisation - before you do anything - run the profiler! It will show you where the bottleneck is and which areas are worth optimising.
Also - if the language choice is not limited (except for LISP) then nothing will beat assembler in terms of performance ;)
I remember achieving some performance magic by rewriting some inner functions (like the one you have) using assembler.
Before you do anything, do you actually have a problem?
i.e. does it run long enough to bother you?
If so, find out what is actually taking time, not what you guess. This is the quick, dirty, and highly effective method I use to see where time goes.
Now, you are talking about interpreting versus compiling. Interpreted code is typically 1-2 orders of magnitude slower than compiled code. The reason is that interpreters are continually figuring out what to do next, and then forgetting, while compiled code just knows.
If you are in this situation, then it may make sense to pay the price of translating so as to get the speed of compiled code.

Categories

Resources