Sieve of Eratosthenes run slower after improving - c#

I try to improve basic Sieve of Eratosthenes algorithm by avoiding cross out duplicate multiple of primes, but it turn out to be worse than my expectation
I has implemented two method that return primes in range [2..max)
Basic sieve
public static List<int> Sieve22Max_Basic(int n) {
var primes = new List<int>();
var sieve = new BitArray(n, true); // default all number are prime
//int crossTotal = 0;
int sqrt_n = (int)Math.Sqrt(n) + 1;
for (int p = 2; p < sqrt_n; ++p) {
if (sieve[p]) {
primes.Add(p);
//var cross = new List<int>();
int inc = p == 2 ? p : 2 * p;
for (int mul = p * p; mul < n; mul += inc) {
// cross out multiple of prime p
// cross.Add(mul);
//++crossTotal;
sieve[mul] = false;
}
//if (cross.Count > 0)
// Console.WriteLine($"Prime {p}, cross out: {string.Join(' ', cross)}");
}
}
//Console.WriteLine($"crossTotal: {crossTotal:n0}");
for (int p = sqrt_n; p < n; ++p)
if (sieve[p])
primes.Add(p);
return primes;
}
Run Sieve22Max_Basic(100), see some multiples are cross more than one (ex: 45, 75, 63)
Prime 2, cross out: 4 6 8 ... 96 98
Prime 3, cross out: 9 15 21 27 33 39 45 51 57 63 69 75 81 87 93 99
Prime 5, cross out: 25 35 45 55 65 75 85 95
Prime 7, cross out: 49 63 77 91
Enhance Sieve
Then, I try to improve by using array that store smallest prime divisor (spd) of each number.
45 = 3 x 5 // spd[45] = 3
75 = 3 x 5 x 5 // spd[75] = 3
63 = 3 x 3 x 7 // spd[63] = 3
When go through multiple of prime p, I will not cross out number mul that have spd[mul] < p because mul was cross out by spd[mul] before
public static List<int> Sieve22Max_Enh(int n) {
var sieve = new BitArray(n, true);
var spd = new int[n];
for (int i = 0; i < n; ++i) spd[i] = i;
var primes = new List<int>();
//int crossTotal = 0;
int sqrt_n = (int)Math.Sqrt(n) + 1;
for (int p = 2; p < sqrt_n; ++p) {
if (sieve[p]) {
primes.Add(p);
//var cross = new List<int>();
int inc = p == 2 ? 1 : 2;
for (long mul = p; mul * p < n; mul += inc) {
if (spd[mul] >= p) {
sieve[(int)(mul * p)] = false;
spd[mul * p] = p;
//++crossTotal;
//cross.Add((int)(mul * p));
}
}
//if (cross.Count > 0)
// Console.WriteLine($"Prime {p}, cross out: {string.Join(' ', cross)}");
}
}
//Console.WriteLine($"crossTotal: {crossTotal:n0}");
for (int p = sqrt_n; p < n; ++p)
if (sieve[p])
primes.Add(p);
return primes;
}
Test
I test on my laptop (core i7 - 2.6 Ghz), with n = 1 billion
Sieve22Max_Basic take only 6s while Sieve22Max_Enh take more than 10s to complete
var timer = new Stopwatch();
int n = 1_000_000_000;
timer.Restart();
Console.WriteLine("==== Sieve22Max_Basic ===");
var list = Sieve22Max_Basic(n);
Console.WriteLine($"Count: {list.Count:n0}, Last: {list[list.Count - 1]:n0}, elapsed: {timer.Elapsed}");
Console.WriteLine();
timer.Restart();
Console.WriteLine("==== Sieve22Max_Enh ===");
list = Sieve22Max_Enh(n);
Console.WriteLine($"Count: {list.Count:n0}, Last: {list[list.Count - 1]:n0}, elapsed: {timer.Elapsed}");
You can try at https://onlinegdb.com/tWfMuDDK0
What does it make slower?

Compare your two loops from the original and improved versions.
Original:
int inc = p == 2 ? p : 2 * p;
for (int mul = p * p; mul < n; mul += inc) {
sieve[mul] = false;
}
Improved:
int inc = p == 2 ? 1 : 2;
for (long mul = p; mul * p < n; mul += inc) {
if (spd[mul] >= p) {
sieve[(int)(mul * p)] = false;
spd[mul * p] = p;
}
}
Some observations:
Both loops run the same amount of iterations.
For every iteration, the original loop executes three very fast operations: 1) change a value in BitArray, mul += inc and check mul < n.
For every iteration of the improved loop, we execute more operations: check spd[mul] >= p, mul += inc, mul * p (in the for-loop condition), check mul * p < n.
Increment += and loop condition check for < are the same in both loops; check spd[mul] >= p and change a value in BitArray are comparable in how long they take; but the additional op mul * p in the second loop's condition is multiplication – it's expensive!
But also, for every iteration of the second loop, if spd[mul] >= p is true, then we also execute: mul * p (again!), cast to int, change a value in BitArray, mul * p(third time!), I assume again a cast to int in the indexing of spd, and assign a value in the array spd.
To summarize, your second improved loop's every iteration is computationally "heavier". This is the reason why your improved version is slower.

Related

Codewar task. How to find simple square numbers by the condition?

I tried to solve this problem on Codewar, but I don`t understand how to find exceptions.
In this Kata, you will be given a number n (n > 0) and your task will be to return the smallest square number N (N > 0) such that n + N is also a perfect square. If there is no answer, return -1 (nil in Clojure, Nothing in Haskell).
Here code that I wrote:
using System;
public class SqNum
{
public static long solve(long n){
int N = 1;
long lTemp = n;
double sum, result;
bool isSqare;
while(true)
{
sum = lTemp + N;
result = Math.Sqrt(sum);
isSqare = result % 1 == 0;
if(n == 4)
{
return -1;
}
if(isSqare == true)
{
return Convert.ToInt32(result);
}
N++;
}
return -1;
}
}
If N (square) is p^2, and n+N=r^2, you can write
n + p^2 = r^2
n = r^2 - p^2
n = (r - p) * (r + p)
If we represent n as product of pair of divisors:
n = a * b // let a<=b
a * b = (r - p) * (r + p)
We have system
r - p = a
r + p = b
and
p = (b - a) / 2
When p is the smallest? In the case of maximal close factors b and a. So we can try to calculate divisors starting from square root of n. Also note that a and b should have the same parity (to make p integer)
Pseudocode:
int N = -1;
for (int a = Math.Ceiling(Math.Sqrt(n)) - 1; a > 0; a--) {
if (n % a == 0) {
int bma = n / a - a;
if (bma % 2 == 0) {
int p = bma / 2;
int N = p * p;
break;
}
}
}
Examples:
n = 16
first divisor below sqrt(16) a = 2, b=8
p = 3, 16 + 9 = 25
n = 13
first divisor below sqrt(13) a = 1, b=13
p = 6, 13 + 36 = 49
n = 72
first divisor below sqrt(72) is a = 8, b= 9 - but distinct parity!
so the next suitable pair is a = 6, b = 12
p = 3, 72 + 9 = 81

How to do the loop and solve the following

I'm supposed to code a program that writes out a division just like in school.
Example:
13:3=4.333333333333
13
1
10
10
10....
So my approach was:
Solve the division then get the solution in a List.
Then question if the first number (in this case 1) is divisible by 3.
If not put it down and add the second number and so on...
I managed to do this the first time. It's sloppy but works. The problem is that it only works with numbers that when divided get to have a decimal in it.
Exapmle:
123:13
This is the first code:
do
{
for (int number = 1; number <= divNum; number++)
if (number % divisor == 0) countH++;
for (int i = 0; i < count; i++)
Console.Write(" ");
if ((c = divNum % divisor ) < divisor )
{
Console.WriteLine(" " + ((divNum- (countH * divisor ))) * 10);
}
else Console.WriteLine(" " + (divNum- (countH * divisor )));
c = divNum % divisor ;
if (c < divisor )
{
divNum = c * 10;
}
count++; countH = 0;
} while ((divNum >= divisor ) && (count < x));
Any ideas or help? Sorry if this is a bad question.
************ added
Try of a better explanation:
1 cant be divided by 13, so it goes down, we get the 2 down and try 12 divided by 13, still nothing so we get the 3 down and try 123:13, 13 goes 9 times in 123 so we have 123-9*13 = 6 the six goes down we write 9 in the result. We try 6:13 not going so we drop a 0 next to 6. Next we try 60:13, 13 goes 4 times so 60-4*13 = 8, we get the 8 down. And so on..
123:13=9.46153....
123
60
80
20
70
50
....
Something like this should work. Not the fastest solution most likely, but should do the job.
var number = 123;
var b = 12;
int quotient;
double remainder = number;
var x = 10;
do
{
quotient = (int)Math.Floor(remainder / b);
remainder = remainder - (quotient * b);
for (int i = 0; i < count; i++)
Console.Write(" ");
remainder *= 10;
Console.WriteLine(" " + remainder);
count++;
} while ((remainder > 0) && (count < x));

Why aren't 10 and 100 considered Kaprekar numbers?

I'm trying to do the Modified Kaprekar Numbers problem (https://www.hackerrank.com/challenges/kaprekar-numbers) which describes a Kaprekar number by
Here's an explanation from Wikipedia about the ORIGINAL Kaprekar
Number (spot the difference!): In mathematics, a Kaprekar number for a
given base is a non-negative integer, the representation of whose
square in that base can be split into two parts that add up to the
original number again. For instance, 45 is a Kaprekar number, because
45² = 2025 and 20+25 = 45.
and what I don't understand is why 10 and 100 aren't Kaprekar numbers.
10^2 = 1000 and 10 + 00 = 10
Right?
So my solution
// Returns the number represented by the digits
// in the range arr[i], arr[i + 1], ..., arr[j - 1].
// If there are no elements in range, return 0.
static int NumberInRange(int[] arr, int i, int j)
{
int result = 0;
for(; i < j; ++i)
{
result *= 10;
result += arr[i];
}
return result;
}
// Returns true or false depending on whether k
// is a Kaprekar number.
// Example: IsKaprekar(45) = true because 45^2=2025 and 20+25=45
// Example: IsKaprekar(9) = false because the set of the split
// digits of 7^2=49 are {49,0},{4,9} and
// neither of 49+0 or 4+9 equal 7.
static bool IsKaprekar(int k)
{
int square = k * k;
int[] digits = square.ToString().Select(c => (int)Char.GetNumericValue(c)).ToArray();
for(int i = 0; i < digits.Length; ++i)
{
int right = NumberInRange(digits, 0, i);
int left = NumberInRange(digits, i, digits.Length);
if((right + left) == k)
return true;
}
return false;
}
is saying all the Kaprekar numbers between 1 and 100 are
1 9 10 45 55 99 100
whereas the "right" answer is
1 9 45 55 99
In 100+00 the right is 00, which is wrong because in a kaprekar number the right may start with zero (ex: 025) but cannot be entirely 0.
Therefore you can put a condition in the loop that
if(right==0)
return false;
The reason is because 10 x 10 = 100. Then you substring the right part with a length equals d = 2, that is digit count of original value (10), then the left part would be 1.
So l = 1 and r = 00, l + r = 1, that is not equals to 10.
The same for 100. 100 x 100 = 10000. l = 10, r = 000, so l + r = 10 not equal 100.
Here is my solution in JAVA.
static void kaprekarNumbers(int p, int q) {
long[] result = IntStream.rangeClosed(p, q).mapToLong(Long::valueOf)
.filter(v -> {
int d = String.valueOf(v).length();
Long sq = v * v;
String sqSt = sq.toString();
if (sqSt.length() > 1) {
long r = Long.parseLong(sqSt.substring(sqSt.length() - d));
long l = Long.parseLong(sqSt.substring(0, sqSt.length() - d));
return r + l == v;
} else return v == 1;
}).toArray();
if (result.length > 0) {
for (long l : result) {
System.out.print(l + " ");
}
} else {
System.out.println("INVALID RANGE");
}
}
How about something like this.
static bool IsKaprekar(int k)
{
int t;
for (int digits = new String(k).length(); digits > 0; digits--, t *= 10);
long sq = k * k;
long first = sq / t;
long second = sq % t;
return k == first + second;
}
find a number to divide and mod the square with in order to split it. This number should be a factor of 10 based on the number of digits in the original number.
calculate the square.
split the square.
compare the original to the sum of the splits.

Result not matching. Floating point error?

I am trying to rewrite the R function acf that computes Auto-Correlation into C#:
class AC
{
static void Main(string[] args)
{
double[] y = new double[] { 772.9, 909.4, 1080.3, 1276.2, 1380.6, 1354.8, 1096.9, 1066.7, 1108.7, 1109, 1203.7, 1328.2, 1380, 1435.3, 1416.2, 1494.9, 1525.6, 1551.1, 1539.2, 1629.1, 1665.3, 1708.7, 1799.4, 1873.3, 1973.3, 2087.6, 2208.3, 2271.4, 2365.6, 2423.3, 2416.2, 2484.8, 2608.5, 2744.1, 2729.3, 2695, 2826.7, 2958.6, 3115.2, 3192.4, 3187.1, 3248.8, 3166, 3279.1, 3489.9, 3585.2, 3676.5 };
Console.WriteLine(String.Join("\n", acf(y, 17)));
Console.Read();
}
public static double[] acf(double[] series, int maxlag)
{
List<double> acf_values = new List<double>();
float flen = (float)series.Length;
float xbar = ((float)series.Sum()) / flen;
int N = series.Length;
double variance = 0.0;
for (int j = 0; j < N; j++)
{
variance += (series[j] - xbar)*(series[j] - xbar);
}
variance = variance / N;
for (int lag = 0; lag < maxlag + 1; lag++)
{
if (lag == 0)
{
acf_values.Add(1.0);
continue;
}
double autocv = 0.0;
for (int k = 0; k < N - lag; k++)
{
autocv += (series[k] - xbar) * (series[lag + k] - xbar);
}
autocv = autocv / (N - lag);
acf_values.Add(autocv / variance);
}
return acf_values.ToArray();
}
}
I have two problems with this code:
For large arrays (length = 25000), this code takes about 1-2 seconds whereas R's acf function returns in less than 200 ms.
The output does not match R's output exactly.
Any suggestions on where I messed up or any optimizations to the code?
C# R
1 1 1
2 0.945805846 0.925682317
3 0.89060465 0.85270658
4 0.840762283 0.787096604
5 0.806487301 0.737850083
6 0.780259665 0.697253317
7 0.7433111 0.648420319
8 0.690344341 0.587527097
9 0.625632533 0.519141887
10 0.556860982 0.450228026
11 0.488922355 0.38489632
12 0.425406196 0.325843042
13 0.367735169 0.273845337
14 0.299647764 0.216766466
15 0.22344712 0.156888402
16 0.14575994 0.099240809
17 0.072389526 0.047746281
18 -0.003238526 -0.002067146
You might try changing this line:
autocv = autocv / (N - lag);
to this:
autocv = autocv / N;
Either of these is an acceptable divisor for the expected value, and R is clearly using the second one.
To see this without having access to a C# compiler, we can read in the table that you have, and adjust the values by dividing each value in the C# column by N/(N - lag), and see that they agree with the values from R.
N is 47 here, and lag ranges from 0 to 17, so N - lag is 47:30.
After copying the table above into my local clipboard:
cr <- read.table(file='clipboard', comment='', check.names=FALSE)
cr$adj <- cr[[1]]/47*(47:30)
max(abs(cr$R - cr$adj))
## [1] 2.2766e-09
A much closer approximation.
You might do better if you define flen and xbar as type double as floats do not have 9 decimal digits of precision.
The reason that R is so much faster is that acf is implemented as native and non-managed code (either C or FORTRAN).

Mathematically navigating a large 2D Numeric grid in C#

I am trying to find certain coordinates of interest within a very large virtual grid. This grid does not actually exist in memory since the dimensions are huge. For the sake of this question, let's assume those dimensions to be (Width x Height) = (Int32.MaxValue x Int32.MaxValue).
1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 21 24 27 30
4 8 12 16 20 24 28 32 36 40
5 10 15 20 25 30 35 40 45 50
6 12 18 24 30 36 42 48 54 60
7 14 21 28 35 42 49 56 63 70
8 16 24 32 40 48 56 64 72 80
9 18 27 36 45 54 63 72 81 90
10 20 30 40 50 60 70 80 90 100
Known data about grid:
Dimensions of the grid = (Int32.MaxValue x Int32.MaxValue).
Value at any given (x, y) coordinate = Product of X and Y = (x * y).
Given the above large set of finite numbers, I need to calculate a set of coordinates whose value (x * y) is a power of e. Let's say e is 2 in this case.
Since looping through the grid is not an option, I thought about looping through:
int n = 0;
long r = 0;
List<long> powers = new List<long>();
while (r < (Int32.MaxValue * Int32.MaxValue))
{
r = Math.Pow(e, n++);
powers.Add(r);
}
This gives us a unique set of powers. I now need to find out at what coordinates each power exists. Let's take 2^3=8. As shown in the grid above, 8 exists in 4 coordinates: (8,1), (4,2), (2,4) & (1, 8).
Clearly the problem here is finding multiple factors of the number 8 but this would become impractical for larger numbers. Is there another way to achieve this and am I missing something?
Using sets won't work since the factors don't exist in memory.
Is there a creative way to factor knowing that the number in question will always be a power of e?
The best method is to factor e into it prime components. Lets say they are as follows: {a^m, b^p, c^q}. Then you build set for each power of e, for example if m=2, p=1, q=3,
e^1 = {a, a, b, c, c, c}
e^2 = (a, a, a, a, b, b, c, c, c, c, c, c}
etc. up to e^K > Int32.MaxValue * Int32.MaxValue
then for each set you need to iterate over each unique subset of these sets to form one coordinate. The other coordinate is what remains. You will need one nested loop for each of the unique primes in e. For example:
Lets say for e^2
M=m*m;
P=p*p;
Q=q*q;
c1 = 1 ;
for (i=0 ; i<=M ; i++)
{
t1 = c1 ;
for (j=0 ; j<=P ; j++)
{
t2 = c1 ;
for (k=0 ; k<=Q ; k++)
{
// c1 holds first coordinate
c2 = e*e/c1 ;
// store c1, c2
c1 *= c ;
}
c1 = t2*b ;
}
c1 = t1*a ;
}
There should be (M+1)(P+1)(Q+1) unique coordinates.
Another solution, not as sophisticated as the idea from Commodore63, but therefore maybe a little bit simpler (no need to do a prime factorization and calculating all proper subsets):
const int MaxX = 50;
const int MaxY = 50;
const int b = 6;
var maxExponent = (int)Math.Log((long)MaxX * MaxY, b);
var result = new List<Tuple<int, int>>[maxExponent + 1];
for (var i = 0; i < result.Length; ++i)
result[i] = new List<Tuple<int, int>>();
// Add the trivial case
result[0].Add(Tuple.Create(1, 1));
// Add all (x,y) with x*y = b
for (var factor = 1; factor <= (int)Math.Sqrt(b); ++factor)
if (b % factor == 0)
result[1].Add(Tuple.Create(factor, b / factor));
// Now handle the rest, meaning x > b, y <= x, x != 1, y != 1
for (var x = b; x <= MaxX; ++x) {
if (x % b != 0)
continue;
// Get the max exponent for b in x and the remaining factor
int exp = 1;
int lastFactor = x / b;
while (lastFactor >= b && lastFactor % b == 0) {
++exp;
lastFactor = lastFactor / b;
}
if (lastFactor > 1) {
// Find 1 < y < b with x*y yielding a power of b
for (var y = 2; y < b; ++y)
if (lastFactor * y == b)
result[exp + 1].Add(Tuple.Create(x, y));
} else {
// lastFactor == 1 meaning that x is a power of b
// that means that y has to be a power of b (with y <= x)
for (var k = 1; k <= exp; ++k)
result[exp + k].Add(Tuple.Create(x, (int)Math.Pow(b, k)));
}
}
// Output the result
for (var i = 0; i < result.Length; ++i) {
Console.WriteLine("Exponent {0} - Power {1}:", i, Math.Pow(b, i));
foreach (var pair in result[i]) {
Console.WriteLine(" {0}", pair);
//if (pair.Item1 != pair.Item2)
// Console.WriteLine(" ({0}, {1})", pair.Item2, pair.Item1);
}
}

Categories

Resources