I am trying to figure out what the difference between the following for loops is.
The first is code that I wrote while practicing algorithms on codewars.com. It times out when attempting the larger test cases.
The second is one of the top solutions. It seems functionally similar (obviously it's more concise), but it runs much faster and does not time out. Can anyone explain to me what the difference is? Also, the return statement in the second snippet is confusing to me. What exactly does this syntax mean? Maybe this is where it is more efficient.
public static long findNb(long m)
{
    int sum = 0;
    int x = new int();
    for (int n = 0; sum < m; n++)
    {
        sum += n*n*n;
        x = n;
        System.Console.WriteLine(x);
    }
    if (sum == m)
    {
        return x;
    }
    return -1;
}
vs
public static long findNb(long m) //seems similar but doesn't time out
{
    long total = 1, i = 2;
    for (; total < m; i++) total += i * i * i;
    return total == m ? i - 1 : -1;
}
The second approach uses long for the total value. Chances are that you're using an m value that's high enough to exceed the number of values representable by int. So your math overflows and the sum becomes a negative number (the sum of the first n cubes is (n(n+1)/2)^2, which already exceeds int.MaxValue around n = 304). You get caught in an infinite loop, where sum can never get as big as m.
And, like everyone else says, get rid of the WriteLine.
Also, the return statement in the second snippet is confusing to me. What exactly does this syntax mean?
It's the conditional (ternary) operator: condition ? valueIfTrue : valueIfFalse evaluates to the first value when the condition is true and to the second value otherwise.
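For illustration, the return statement in the second snippet is equivalent to the following if/else, written in terms of the snippet's own total, i and m (a sketch, not the original author's code):

// Equivalent of: return total == m ? i - 1 : -1;
if (total == m)
    return i - 1;   // the loop increments i once more after the final add, so step back by one
else
    return -1;      // the running total overshot m, so no exact match exists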
Both approaches are roughly the same, except for the unwanted System.Console.WriteLine(x); which spoils the fun: printing to the console (UI!) is a slow operation.
If you are looking for a fast solution (especially for large m and a long loop) you can just precompute all (77936) values:
using System.Collections.Generic;

public class Solver {
    static Dictionary<long, long> s_Sums = new Dictionary<long, long>();

    private static void Build() {
        long total = 0;
        for (long i = 0; i <= 77936; ++i) {
            total += i * i * i;
            s_Sums.Add(total, i);
        }
    }

    static Solver() {
        Build();
    }

    public static long findNb(long m) {
        return s_Sums.TryGetValue(m, out long result)
            ? result
            : -1;
    }
}
When I run into micro-optimisation challenges like this, I always use BenchmarkDotNet. It's the tool to use to get all the insights into performance, memory allocations, differences between .NET Framework versions, 64-bit vs 32-bit, etc.
But as others write - remember to remove the WriteLine() statement :)
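For example, a minimal BenchmarkDotNet setup for comparing the two snippets could look roughly like this (it assumes the BenchmarkDotNet NuGet package; Kata.FindNbOriginal and Kata.FindNbFast are hypothetical names standing in for the two findNb versions above):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class FindNbBenchmarks
{
    // Some large test input; adjust to whatever values the kata feeds you.
    private const long M = 4183059834009;

    [Benchmark]
    public long Original() => Kata.FindNbOriginal(M);   // hypothetical wrapper around the first snippet

    [Benchmark]
    public long Fast() => Kata.FindNbFast(M);           // hypothetical wrapper around the second snippet
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<FindNbBenchmarks>();
}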
Related
As part of learning C# I engage in CodeSignal challenges. So far everything has been going well for me, except for the test stated in the title.
The problem is that my code is not efficient enough to run under 3 seconds when the length of the array is 10^5 and the number of consecutive elements (k) is 1000. My code is as follows:
int arrayMaxConsecutiveSum(int[] inputArray, int k) {
    int sum = 0;
    int max = 0;
    for (int i = 0; i <= inputArray.Length - k; i++)
    {
        sum = inputArray.Skip(i).Take(k).Sum();
        if (sum > max)
            max = sum;
    }
    return max;
}
All visible tests on the website run OK, but when it comes to the hidden tests, test 20 fails with an error stating that
19/20 tests passed. Execution time limit exceeded on test 20: Program exceeded the execution time limit. Make sure that it completes execution in a few seconds for any possible input.
I also tried unlocking solutions; the C# one is somewhat similar to mine but doesn't use LINQ. I ran it against the hidden tests and the same error occurred, which is weird, since it was submitted as a solution even though it didn't pass all the tests.
Is there any faster way to get the sum of an array?
I also thought of unlocking the hidden tests, but I think it won't give me any specific solution as the problem would still persist.
It would seem that you are doing the addition of k numbers for every loop. This pseudo code should be more efficient:
Take the sum of the first k elements and set this to be the max.
Loop as you had before, but each time subtract from the existing sum the element at i-1 and add the element at i + k.
Check for max as before and repeat.
The difference here is the number of additions in each iteration. In the original code you add k elements in every iteration; in this approach each iteration subtracts a single element from and adds a single element to an existing sum, so that's 2 operations versus k operations. Your code starts to slow down as k gets large for large arrays.
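A rough C# sketch of those three steps (my own illustration of the pseudo code above, not the answerer's code):

// Sliding-window maximum sum: seed with the first k elements, then slide one element at a time.
static int MaxConsecutiveSum(int[] inputArray, int k)
{
    int sum = 0;
    for (int j = 0; j < k; j++) sum += inputArray[j];   // sum of the first k elements
    int max = sum;

    for (int i = k; i < inputArray.Length; i++)
    {
        sum += inputArray[i] - inputArray[i - k];        // add the new element, drop the one that falls out
        if (sum > max) max = sum;
    }
    return max;
}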
For this specific case, I would suggest not using the Skip method, as it iterates over the collection every time. You can check the Skip implementation here. Copying the code for reference:
public static IEnumerable<TSource> Skip<TSource>(this IEnumerable<TSource> source, int count) {
    if (source == null) throw Error.ArgumentNull("source");
    return SkipIterator<TSource>(source, count);
}

static IEnumerable<TSource> SkipIterator<TSource>(IEnumerable<TSource> source, int count) {
    using (IEnumerator<TSource> e = source.GetEnumerator()) {
        while (count > 0 && e.MoveNext()) count--;
        if (count <= 0) {
            while (e.MoveNext()) yield return e.Current;
        }
    }
}
As you can see, Skip walks the collection from the start every time, so if you have a huge collection and k is a high number, the execution time becomes sluggish.
Instead of using Skip, you can write a simple for loop which iterates over only the required items:
public static int arrayMaxConsecutiveSum(int[] inputArray, int k)
{
    int sum = 0;
    int max = 0;
    for (int i = 0; i <= inputArray.Length - k; i++)
    {
        sum = 0;
        for (int j = i; j < k + i; j++)
        {
            sum += inputArray[j];
        }
        if (sum > max)
            max = sum;
    }
    return max;
}
You can check this dotnet fiddle -- https://dotnetfiddle.net/RrUmZX -- where you can compare the time difference. For thorough benchmarking, I would suggest looking into BenchmarkDotNet.
You need to be careful when using LINQ and thinking about performance. Not that it's slow, but that it can easily hide a big operation behind a single word. In this line:
sum = inputArray.Skip(i).Take(k).Sum();
Skip(i) and Take(k) will both take approximately as long as a for loop, stepping through thousands of rows, and that line is run for every one of the thousands of items in the main loop.
There's no magic command that is faster, instead you have to rethink your approach to do the minimum of steps inside the loop. In this case you could remember the sum from each step and just add or remove individual values, rather than recalculating the whole thing every time.
public static int arrayMaxConsecutiveSum(int[] inputArray, int k)
{
    int sum = 0;
    int max = 0;
    for (int i = 0; i < inputArray.Length; i++)
    {
        // Add the next item
        sum += inputArray[i];
        // Limit the sum to k items by removing the one that falls out of the window
        if (i >= k) sum -= inputArray[i - k];
        // Is this the highest sum so far? (only consider full windows of k items)
        if (i >= k - 1 && sum > max)
            max = sum;
    }
    return max;
}
This is my solution.
public int ArrayMaxConsecutiveSum(int[] inputArray, int k)
{
    int max = inputArray.Take(k).Sum();
    int sum = max;
    for (int i = 1; i <= inputArray.Length - k; i++)
    {
        sum = sum - inputArray[i - 1] + inputArray[i + k - 1];
        if (sum > max)
            max = sum;
    }
    return max;
}
Yes, you should never run Take and Skip repeatedly on large lists, but here's a purely LINQ-based solution that is both easy to understand and performs the task in sufficient time. Iterative code will still outperform it, so you have to weigh the trade-off for your use case: raw benchmark numbers versus readability.
int arrayMaxConsecutiveSum(int[] inputArray, int k)
{
    var sum = inputArray.Take(k).Sum();
    return Math.Max(sum, Enumerable.Range(k, inputArray.Length - k)
        .Max(i => sum += inputArray[i] - inputArray[i - k]));
}
This is a formula to approximate arcsine(x) using Taylor series from this blog
This is my implementation in C#. I don't know where the mistake is; the code gives a wrong result when it runs:
When i = 0, the division will be 1/x. So I assign temp = 1/x at startup. For each iteration, I change "temp" after "i".
I use a continuous loop until two consecutive values are very "near" each other. When the delta between two consecutive values is very small, I return the value.
My test case:
Input is x = 1, so the expected arcsin(x) will be arcsin(1) = PI/2 = 1.57079633 rad.
using System;

class Arc {
    static double abs(double x)
    {
        return x >= 0 ? x : -x;
    }

    static double pow(double mu, long n)
    {
        double kq = mu;
        for (long i = 2; i <= n; i++)
        {
            kq *= mu;
        }
        return kq;
    }

    static long fact(long n)
    {
        long gt = 1;
        for (long i = 2; i <= n; i++)
        {
            gt *= i;
        }
        return gt;
    }

    #region arcsin
    static double arcsinX(double x)
    {
        int i = 0;
        double temp = 0;
        while (true)
        {
            //i++;
            var iFactSquare = fact(i) * fact(i);
            var tempNew = (double)fact(2 * i) / (pow(4, i) * iFactSquare * (2 * i + 1)) * pow(x, 2 * i + 1);
            if (abs(tempNew - temp) < 0.00000001)
            {
                return tempNew;
            }
            temp = tempNew;
            i++;
        }
    }
    #endregion

    public static void Main()
    {
        Console.WriteLine(arcsinX(1));
        Console.ReadLine();
    }
}
In many series evaluations, it is often convenient to use the quotient between terms to update the term. The quotient here is
a[n]/a[n-1] = [(2n)! * x^(2n+1) / (4^n * (n!)^2 * (2n+1))] * [4^(n-1) * ((n-1)!)^2 * (2n-1) / ((2n-2)! * x^(2n-1))]
            = (2n * (2n-1)^2 * x^2) / (4 * n^2 * (2n+1))
            = ((2n-1)^2 * x^2) / (2n * (2n+1))
Thus a loop to compute the series value is
sum = 1;
term = 1;
n = 1;
while (1 != 1 + term) {
    term *= (n - 0.5) * (n - 0.5) * x * x / (n * (n + 0.5));
    sum += term;
    n += 1;
}
return x * sum;
The convergence is only guaranteed for abs(x)<1, for the evaluation at x=1 you have to employ angle halving, which in general is a good idea to speed up convergence.
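To make that concrete, here is a rough C# sketch of the quotient-based loop above, with one possible angle-halving wrapper bolted on so that x = 1 also converges. The halving identity arcsin(x) = 2*arcsin(x / (sqrt(2)*sqrt(1 + sqrt(1 - x^2)))) is standard, but the 0.7 threshold is my own choice for this sketch, not part of the original answer:

using System;

class ArcsinSeries
{
    static double Arcsin(double x)
    {
        // Angle halving: keep halving while |x| is large so the series converges quickly.
        if (Math.Abs(x) > 0.7)
        {
            double half = x / (Math.Sqrt(2.0) * Math.Sqrt(1.0 + Math.Sqrt(1.0 - x * x)));
            return 2.0 * Arcsin(half);
        }

        // Quotient-based series evaluation from the loop above.
        double sum = 1.0, term = 1.0;
        int n = 1;
        while (1.0 + term != 1.0)   // stop once the term no longer changes the sum
        {
            term *= (n - 0.5) * (n - 0.5) * x * x / (n * (n + 0.5));
            sum += term;
            n++;
        }
        return x * sum;
    }

    static void Main()
    {
        Console.WriteLine(Arcsin(1.0));   // should print roughly 1.5707963267948966 (PI/2)
    }
}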
You are saving two different temp values (temp and tempNew) to check whether continuing the computation is still worthwhile. This is good, except that you never accumulate these terms into a running total.
This is a summation. You need to add every newly calculated value to the total, but you are only keeping track of the most recently calculated term. You can only ever return the last calculated term of the series, so you will always get an extremely small number as your result. Turn this into a summation and the problem should go away.
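For illustration only, here is one way to apply that fix to the loop from the question. It keeps the question's fact, pow and abs helpers and only addresses the missing summation; the overflow issue discussed in the next answer still applies:

static double arcsinX(double x)
{
    int i = 0;
    double sum = 0;
    while (true)
    {
        var iFactSquare = fact(i) * fact(i);
        // The i-th term of the series, exactly as computed in the question.
        var term = (double)fact(2 * i) / (pow(4, i) * iFactSquare * (2 * i + 1)) * pow(x, 2 * i + 1);
        sum += term;                      // accumulate every term instead of discarding it
        if (abs(term) < 0.00000001)       // stop once the newest term is negligible
        {
            return sum;
        }
        i++;
    }
}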
NOTE: I've made this a community wiki answer because I was hardly the first person to think of this (just the first to put it down in a comment). If you feel that more needs to be added to make the answer complete, just edit it in!
The general suspicion is that this is down to integer overflow, namely that one of your values (probably the return of fact(), and therefore iFactSquare) is getting too big for the type you have chosen. It goes negative because you are using signed types: when the value exceeds the largest representable positive number, it wraps around into the negatives.
Try tracking how large i gets during your calculation, and figure out how big a number it would give you when run through your fact and pow functions. If it's bigger than long.MaxValue (long is always 64-bit in C#), as we suspect, then try using a double instead.
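One hedged way to confirm that suspicion is to wrap the factorial in a checked block, so .NET throws an OverflowException instead of silently wrapping around:

static long fact(long n)
{
    long gt = 1;
    checked   // overflow inside this block throws OverflowException instead of wrapping
    {
        for (long i = 2; i <= n; i++)
        {
            gt *= i;
        }
    }
    return gt;
}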
I'm trying to implement logistic regression by myself in C#. I found a library (Accord.NET) that I use to minimize the cost function. However, I'm always getting different minima. Therefore I think something may be wrong in the cost function that I wrote.
static double costfunction(double[] thetas)
{
    int i = 0;
    double sum = 0;
    double[][] theta_matrix_transposed = MatrixCreate(1, thetas.Length);
    while (i != thetas.Length) { theta_matrix_transposed[0][i] = thetas[i]; i++; }
    i = 0;
    while (i != m) // m is the number of examples
    {
        int z = 0;
        double[][] x_matrix = MatrixCreate(thetas.Length, 1);
        while (z != thetas.Length) { x_matrix[z][0] = x[z][i]; z++; } // Put values from the training set into the matrix
        double p = MatrixProduct(theta_matrix_transposed, x_matrix)[0][0];
        sum += y[i] * Math.Log(sigmoid(p)) + (1 - y[i]) * Math.Log(1 - sigmoid(p));
        i++;
    }
    double value = (-1 / m) * sum;
    return value;
}

static double sigmoid(double z)
{
    return 1 / (1 + Math.Exp(-z));
}
x is a list of lists that represent the training set, one list for each feature. What's wrong with the code? Why am I getting different results every time I run the L-BFGS? Thank you for your patience, I'm just getting started with machine learning!
That is very common with these optimization algorithms: the minimum you arrive at depends on your weight initialization. The fact that you are getting different minima doesn't necessarily mean something is wrong with your implementation. Instead, check your gradients to make sure they are correct using the finite-differences method, and also look at your train/validation/test accuracy to see whether it is acceptable.
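For illustration, a rough finite-difference check against the cost function from the question might look like this (analyticGradient is a hypothetical array holding whatever gradient you hand to the optimizer):

// Sketch of a finite-difference gradient check: perturb each theta slightly
// and compare the numeric slope of the cost against the analytic gradient.
static void CheckGradient(double[] thetas, double[] analyticGradient)
{
    const double eps = 1e-5;
    for (int j = 0; j < thetas.Length; j++)
    {
        double saved = thetas[j];

        thetas[j] = saved + eps;
        double costPlus = costfunction(thetas);

        thetas[j] = saved - eps;
        double costMinus = costfunction(thetas);

        thetas[j] = saved;  // restore the original value

        double numeric = (costPlus - costMinus) / (2 * eps);
        Console.WriteLine($"theta[{j}]: numeric {numeric}, analytic {analyticGradient[j]}");
    }
}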
I've found two different methods to get the max value from an array, but I'm not really familiar with parallel programming, so I don't really understand them.
I was wondering: do these methods do the same thing, or am I missing something?
I really don't have much information about them. Not even comments...
The first method:
int[] vec = ... (I guess the content doesn't matter)
static int naiveMax()
{
    int max = vec[0];
    object obj = new object();
    Parallel.For(0, vec.Length, i =>
    {
        lock (obj) {
            if (vec[i] > max) max = vec[i];
        }
    });
    return max;
}
And the second one:
static int Max()
{
    int max = vec[0];
    object obj = new object();
    Parallel.For(0, vec.Length, // could be Parallel.For<int>
        () => vec[0],
        (i, loopState, partial) =>
        {
            if (vec[i] > partial) partial = vec[i];
            return partial;
        },
        partial =>
        {
            lock (obj) {
                if (partial > max) max = partial;
            }
        });
    return max;
}
Do these do the same thing or something different, and if so, what? Thanks ;)
Both find the maximum value in an array of integers. In an attempt to find the maximum value faster, they do it "in parallel" using the Parallel.For Method. Both methods fail at this, though.
To see this, we first need a sufficiently large array of integers. For small arrays, parallel processing doesn't give us a speed-up anyway.
int[] values = new int[100000000];
Random random = new Random();
for (int i = 0; i < values.Length; i++)
{
    values[i] = random.Next();
}
Now we can run the two methods and see how long they take. Using an appropriate performance measurement setup (Stopwatch, array of 100,000,000 integers, 100 iterations, Release build, no debugger attached, JIT warm-up) I get the following results on my machine:
naiveMax 00:06:03.3737078
Max 00:00:15.2453303
So Max is much much better than naiveMax (6 minutes! cough).
But how does it compare to, say, PLINQ?
static int MaxPlinq(int[] values)
{
    return values.AsParallel().Max();
}
MaxPlinq 00:00:11.2335842
Not bad, saved a few seconds. Now, what about a plain, old, sequential for loop for comparison?
static int Simple(int[] values)
{
    int result = values[0];
    for (int i = 0; i < values.Length; i++)
    {
        if (result < values[i]) result = values[i];
    }
    return result;
}
Simple 00:00:05.7837002
I think we have a winner.
Lesson learned: Parallel.For is not pixie dust that you can sprinkle over your code to
make it magically run faster. If performance matters, use the right tools and measure, measure, measure, ...
They appear to do the same thing; however, they are very inefficient. The point of parallelization is to improve the speed of code that can be executed independently. Due to race conditions, discovering the maximum (as implemented here) requires a lock around the actual comparison logic, which means you're spinning up many threads and related resources simply to run the code sequentially anyway, defeating the purpose of parallelization entirely.
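For what it's worth, here is a rough sketch (mine, not from either method above) of how the lock cost can be paid once per partition rather than once per element by using Partitioner ranges. As the benchmarks above show, you should still measure it against the plain sequential loop before assuming it wins:

using System.Collections.Concurrent;
using System.Threading.Tasks;

static int PartitionedMax(int[] values)
{
    int max = values[0];
    object gate = new object();

    Parallel.ForEach(
        Partitioner.Create(0, values.Length),   // split the index range into chunks
        range =>
        {
            // Scan this chunk without taking any lock per element.
            int localMax = values[range.Item1];
            for (int i = range.Item1; i < range.Item2; i++)
            {
                if (values[i] > localMax) localMax = values[i];
            }

            // One lock per chunk to merge into the shared result.
            lock (gate)
            {
                if (localMax > max) max = localMax;
            }
        });

    return max;
}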
I'm looking for a library or existing code to simplify fractions.
Does anyone have anything at hand or any links?
P.S. I already understand the process but really don't want to rewrite the wheel
Update
Ok i've checked out the fraction library on the CodeProject
BUT the problem I have is a little bit trickier than simplifying a fraction.
I have to reduce a percentage split which could be 20% / 50% / 30% (always equal to 100%)
I think you just need to divide by the GCD of all the numbers.
void Simplify(int[] numbers)
{
    int gcd = GCD(numbers);
    for (int i = 0; i < numbers.Length; i++)
        numbers[i] /= gcd;
}

int GCD(int a, int b)
{
    while (b > 0)
    {
        int rem = a % b;
        a = b;
        b = rem;
    }
    return a;
}

int GCD(int[] args)
{
    // using LINQ:
    return args.Aggregate((gcd, arg) => GCD(gcd, arg));
}
I haven't tried the code, but it seems simple enough to be right (assuming your numbers are all positive integers and you don't pass an empty array).
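For example, applied to the percentage split from the question (my own quick check, not the original answerer's):

int[] split = { 20, 50, 30 };
Simplify(split);
// GCD(20, 50, 30) is 10, so split is now { 2, 5, 3 }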
You can use Microsoft.FSharp.Math.BigRational, which is in the free F# Power Pack library. Although it depends on F# (which is gratis and included in VS2010), it can be used from C#.
BigRational reduced = BigRational.FromInt(4)/BigRational.FromInt(6);
Console.WriteLine(reduced);
2/3
Console.WriteLine(reduced.Numerator);
2
Console.WriteLine(reduced.Denominator);
3
This library looks like it might be what you need:
var f = new Fraction(numerator, denominator);
numerator = f.Numerator;
denominator = f.Denominator;
I haven't tested it, though, so you may need to play around with it a bit to get it to work.
The best example of Fraction (aka Rational) I've seen is in Timothy Budd's "Classic Data Structures in C++". His implementation is very good. It includes a simple implementation of GCD algorithm.
It shouldn't be hard to adapt to C#.
A custom solution:
void simplify(int[] numbers)
{
    for (int divideBy = 50; divideBy > 0; divideBy--)
    {
        bool divisible = true;
        foreach (int cur in numbers)
        {
            // check for divisibility
            if ((int)(cur / divideBy) * divideBy != cur)
            {
                divisible = false;
                break;
            }
        }
        if (divisible)
        {
            for (int i = 0; i < numbers.GetLength(0); i++)
            {
                numbers[i] /= divideBy;
            }
        }
    }
}
Example usage:
int[] percentages = { 20, 30, 50 };
simplify(percentages);
foreach (int p in percentages)
{
    Console.WriteLine(p);
}
Outputs:
2
3
5
By the way, this is my first C# program. I thought it would be a fun problem to try a new language with, and now I'm in love! It's like Java, but everything I wished were a bit different is exactly how I wanted it.
<3 c#
Edit: Btw don't forget to make it static void if it's for your Main class.