I asked a question about porting Excel's BetaInv function to .NET: BetaInv function in SQL Server
I have now managed to write that function in pure, dependency-free C# code, and I get the same results as MS Excel up to 6 or 7 digits after the decimal point. The results are fine for us. The problem is that this code is embedded in a SQL CLR function that gets called thousands of times from a stored procedure, and it makes the execution of the whole procedure about 50% slower: it takes anywhere from 30 seconds up to a minute depending on whether I use that function or not.
Here is some of the code. I am not asking for a deep analysis, but does anybody see any major performance issue in the way I am doing these calculations? For example, would using other data types instead of doubles help?
private static double betacf(double a, double b, double x)
{
int m, m2;
double aa, c, d, del, h, qab, qam, qap;
qab = a + b;
qap = a + 1.0;
qam = a - 1.0;
c = 1.0; // First step of Lentz’s method.
d = 1.0 - qab * x / qap;
if (System.Math.Abs(d) < FPMIN)
{
d = FPMIN;
}
d = 1.0 / d;
h = d;
for (m = 1; m <= MAXIT; ++m)
{
m2 = 2 * m;
aa = m * (b - m) * x / ((qam + m2) * (a + m2));
d = 1.0 + aa * d; //One step (the even one) of the recurrence.
if (System.Math.Abs(d) < FPMIN)
{
d = FPMIN;
}
c = 1.0 + aa / c;
if (System.Math.Abs(c) < FPMIN)
{
c = FPMIN;
}
d = 1.0 / d;
h *= d * c;
aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2));
d = 1.0 + aa * d; // Next step of the recurrence (the odd one).
if (System.Math.Abs(d) < FPMIN)
{
d = FPMIN;
}
c = 1.0 + aa / c;
if (System.Math.Abs(c) < FPMIN)
{
c = FPMIN;
}
d = 1.0 / d;
del = d * c;
h *= del;
if (System.Math.Abs(del - 1.0) < EPS)
{
// Are we done?
break;
}
}
if (m > MAXIT)
{
return 0;
}
else
{
return h;
}
}
private static double gammln(double xx)
{
double x, y, tmp, ser;
double[] cof = new double[] { 76.180091729471457, -86.505320329416776, 24.014098240830911, -1.231739572450155, 0.001208650973866179, -0.000005395239384953 };
y = xx;
x = xx;
tmp = x + 5.5;
tmp -= (x + 0.5) * System.Math.Log(tmp);
ser = 1.0000000001900149;
for (int j = 0; j <= 5; ++j)
{
y += 1;
ser += cof[j] / y;
}
return -tmp + System.Math.Log(2.5066282746310007 * ser / x);
}
The only thing that stands out for me, and is usually a performance hit, is memory allocation. I don't know how often gammln is called, but you might want to move the double[] cof = new double[] {} to a static, one-time-only allocation.
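A minimal sketch of that change, assuming the rest of gammln stays exactly as posted. The class name GammaSketch and the field name Cof are made up for illustration; only the placement of the array differs from the question's code.

```csharp
using System;

public static class GammaSketch
{
    // Allocated once when the type is loaded, instead of on every call.
    private static readonly double[] Cof =
    {
        76.180091729471457, -86.505320329416776, 24.014098240830911,
        -1.231739572450155, 0.001208650973866179, -0.000005395239384953
    };

    // Same arithmetic as the question's gammln; only the array moved.
    public static double Gammln(double xx)
    {
        double x = xx, y = xx;
        double tmp = x + 5.5;
        tmp -= (x + 0.5) * Math.Log(tmp);
        double ser = 1.0000000001900149;
        for (int j = 0; j < Cof.Length; ++j)
        {
            y += 1;
            ser += Cof[j] / y;
        }
        return -tmp + Math.Log(2.5066282746310007 * ser / x);
    }
}
```

Whether this is measurable depends on how often gammln runs per BetaInv call, so it is worth profiling before and after.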
double is usually the best type. Especially since the functions in Math take doubles. Unfortunately I see no obvious improvements to make on your code.
It might be possible to use lookup tables to get a better first estimate on which you iterate, but since I don't know the math behind what you're doing, I don't know if that's possible in this specific case.
Obviously larger epsilons will improve performance. So choose it as large as possible while fulfilling your accuracy demands.
If the function gets called repeatedly with the same parameters you might be able to cache results.
One thing that looks odd is the way you force small values for c, d,... to FPMIN. My instinct is that this might lead to suboptimal step sizes.
All I've got is unrolling the j loop in gammln, but it'll make at most a tiny difference.
A more radical thought would be to rewrite in pure T-SQL, since it has everything you use: + - * / abs log are all available.
So I have a problem that I've been stuck on for 3 days.
You want to participate in the 6/49 lottery with only one (simple) variant and you want to know what odds of winning you have:
-at category I (6 numbers)
-at category II (5 numbers)
-at category III (4 numbers)
Write a console app which gets from input the number of total balls, the number of extracted balls, and the category, then print the odds of winning with a precision of 10 decimals if you play with one simple variant.
Inputs:
40
5
II
Result I must print:
0.0002659542
static void Main(string[] args)
{
int numberOfBalls = Convert.ToInt32(Console.ReadLine());
int balls = Convert.ToInt32(Console.ReadLine());
string line = Console.ReadLine();
int theCategory = FindCategory(line);
double theResult = CalculateChance(numberOfBalls, balls, theCategory);
Console.WriteLine(theResult);
}
static int FindCategory (string input)
{
int category = 0;
switch (input)
{
case "I":
category = 1;
break;
case "II":
category = 2;
break;
case "III":
category = 3;
break;
default:
Console.WriteLine("Wrong category.");
break;
}
return category;
}
static int CalculateFactorial(int x)
{
int factorial = 1;
for (int i = 1; i <= x; i++)
factorial *= i;
return factorial;
}
static int CalculateCombinations(int x, int y)
{
int combinations = CalculateFactorial(x) / (CalculateFactorial(y) * CalculateFactorial(x - y));
return combinations;
}
static double CalculateChance(int a, int b, int c)
{
double result = c / CalculateCombinations(a, b);
return result;
}
Now my problems: I'm pretty sure I have to use combinations. To use combinations I need factorials. But in the combinations formula I'm working with pretty big factorials, so my numbers overflow. My second problem is that I don't really understand what I have to do with those categories, and I'm pretty sure I'm getting that method wrong as well. I'm new to programming, so please bear with me. And I can only use basic stuff for this problem, like conditions, methods, primitives, arrays.
Let's start with combinatorics; first, let's agree on terms:
a - all possible numbers (40 in your test case)
t - all taken numbers (5 in your test case)
c - category (2 in your test case)
So we have
t - c + 1 for numbers which win and c - 1 for numbers which lose. Let's count combinations:
All combinations: take t from a possible ones:
A = a! / t! / (a - t)!
Winning numbers' combinations: take t - c + 1 winning number from t possible ones:
W = t! / (t - c + 1)! / (t - t + c - 1)! = t! / (t - c + 1)! / (c - 1)!
Lost numbers' combinations: take c - 1 losing numbers from a - t possible ones:
L = (a - t)! / (c - 1)! / (a - t - c + 1)!
All combinations with category c, i.e. with exactly t - c + 1 winning and c - 1 losing numbers:
C = L * W
Probability:
P = C / A = L * W / A =
t! * t! * (a - t)! * (a - t)! / (t - c + 1)! / (c - 1)! / (c - 1)! / (a - t - c + 1)! / a!
Ugh! Now let's implement some code for it:
Code:
// double: note that int is too small for 40! and similar values
private static double Factorial(int value) {
double result = 1.0;
for (int i = 2; i <= value; ++i)
result *= i;
return result;
}
private static double Chances(int a, int t, int c) =>
Factorial(a - t) * Factorial(t) * Factorial(a - t) * Factorial(t) /
Factorial(t - c + 1) /
Factorial(c - 1) /
Factorial(c - 1) /
Factorial(a - t - c + 1) /
Factorial(a);
Test:
Console.Write(Chances(40, 5, 2));
Outcome:
0.00026595421332263435
Edit:
In terms of combinations, if C(x, y) means "take y items from x", we
have
A = C(a, t); W = C(t, t - c + 1); L = C(a - t, c - 1)
and
P = W * L / A = C(t, t - c + 1) * C(a - t, c - 1) / C(a, t)
Code for Combinations is quite easy; the only trick is that we return double:
// Let's get rid of noisy "Compute": what else can we do but compute?
// Just "Combinations" without pesky prefix.
static double Combinations(int x, int y) =>
Factorial(x) / Factorial(y) / Factorial(x - y);
private static double Chances(int a, int t, int c) =>
Combinations(t, t - c + 1) *
Combinations(a - t, c - 1) /
Combinations(a, t);
You can fiddle with the solution.
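If the factorials themselves become a concern for larger inputs (a double overflows beyond 170!), one possible variation is to build C(x, y) factor by factor, so that no full factorial is ever formed. This is only a sketch that uses loops and primitives; the class name LotterySketch is hypothetical, while Combinations and Chances follow the formulas above.

```csharp
using System;
using System.Globalization;

public static class LotterySketch
{
    // C(n, k) = product over i = 1..k of (n - k + i) / i.
    // After each step the running value is itself a binomial coefficient.
    public static double Combinations(int n, int k)
    {
        if (k < 0 || k > n) return 0.0;
        if (k > n - k) k = n - k;          // C(n, k) == C(n, n - k)
        double result = 1.0;
        for (int i = 1; i <= k; i++)
            result = result * (n - k + i) / i;
        return result;
    }

    // P = C(t, t - c + 1) * C(a - t, c - 1) / C(a, t)
    public static double Chances(int a, int t, int c) =>
        Combinations(t, t - c + 1) * Combinations(a - t, c - 1) / Combinations(a, t);

    // Formats with 10 decimals, independent of the machine's culture.
    public static string Format(double p) =>
        p.ToString("F10", CultureInfo.InvariantCulture);
}
```

For the test case, LotterySketch.Format(LotterySketch.Chances(40, 5, 2)) gives 0.0002659542, matching the expected output.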
I'm writing a class which should calculate angles (in degrees), but after trying for a long time I can't get it to work.
When r = 45 and h = 0, sum should be around 77.14°
but it returns NaN. Why? I know it must have something to do with the Math.Atan method and an invalid value.
My code:
private int r = 0, h = 0;
public Schutzwinkelberechnung(int r, int h)
{
this.r = r;
this.h = h;
}
public double getSchutzwinkel()
{
double sum = 0;
sum = 180 / Math.PI * Math.Atan((Math.Sqrt(2 * h * r - (h * h)) - r * Math.Sin(Math.Asin((Math.Sqrt(2 * h * r)) / (2 * r)))) / (h - r + r * (Math.Cos(Math.Asin(Math.Sqrt(2 * h * r) / (2 * r))))));
return sum;
}
Does someone see my mistake? I got the formula from an excel sheet.
EDIT: Okay, my problem was that I had a parsing error while creating the object or getting the user input; apparently I solved it by accident. Of course I have to add a simple exception, as Nick Dechiara said. Thank you very much for the fast reply, I appreciate it.
EDIT2: The exception in my Excel sheet is:
if(h < 2) {h = 2}
so that explains everything, and I wasn't paying attention at all. Thanks again for all the answers.
int r = 45, h = 2;
sum = 77.14°
A good approach to debugging these kinds of issues is to break the equation into smaller pieces, so it is easier to debug.
double r = 45;
double h = 0;
double sqrt2hr = Math.Sqrt(2 * h * r);
double asinsqrt2hr = Math.Asin((sqrt2hr) / (2 * r));
double a = (Math.Sqrt(2 * h * r - (h * h)) - r * Math.Sin(asinsqrt2hr));
double b = (h - r + r * (Math.Cos(asinsqrt2hr)));
double sum = 180 / Math.PI * Math.Atan(a / b);
Now if we put a breakpoint at sum and let the code run, we see that both a and b are equal to zero. This gives us a / b = 0 / 0 = NaN in the final line.
Now we can ask, why is this happening? Well, in the case of b you have h - r + r * cos(asin(0)), which is 0 - 45 + 45 * 1 and evaluates to 0, so b becomes 0. You probably have an error in your math there.
In the case of a, we have 2 * h * r - h * h, which also evaluates to 0.
You probably either A) have an error in your equation, or B) need to include a special case for when h = 0, as that is breaking your math here.
Definitely break up the expression, something like
var a = Asin(Sqrt(2 * h * r) / (2 * r));
var b = Sqrt(2 * h * r - h * h) - r * Sin(a);
var c = h - r + r * Cos(a);
var sum = 180 / PI * Atan(b / c);
and you will find b=0 and c=0. You might consider changing the last expression into
var sum = 180 / PI * Atan2(b , c);
which will return a value when b=0 and c=0.
PS. Also, use using static System.Math; in the beginning of the code to shorten such math expressions.
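Putting the pieces of this thread together, a sketch might look like the following. It assumes the h < 2 rule from the question's EDIT2 is the intended special case, and it uses Atan2 as suggested above. The class and method names are made up for illustration, and the parameters are doubles rather than the ints used in the question.

```csharp
using System;

public static class SchutzwinkelSketch
{
    // Returns the angle in degrees for radius r and height h.
    public static double Schutzwinkel(double r, double h)
    {
        // Special case taken from the Excel sheet (EDIT2 in the question).
        if (h < 2) h = 2;

        double alpha = Math.Asin(Math.Sqrt(2 * h * r) / (2 * r));
        double numerator = Math.Sqrt(2 * h * r - h * h) - r * Math.Sin(alpha);
        double denominator = h - r + r * Math.Cos(alpha);

        // Atan2 stays defined even if both arguments are zero.
        return 180 / Math.PI * Math.Atan2(numerator, denominator);
    }
}
```

With r = 45 and h = 0 the height is raised to 2, and the result is about 77.14°, the value quoted in the question.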
I had a look at the wave format today and created a little wave generator. I create a sine sound like this:
public static Wave GetSine(double length, double hz)
{
int bitRate = 44100;
int l = (int)(bitRate * length);
double f = 1.0 / bitRate;
Int16[] data = new Int16[l];
for (int i = 0; i < l; i++)
{
data[i] = (Int16)(Math.Sin(hz * i * f * Math.PI * 2) * Int16.MaxValue);
}
return new Wave(false, Wave.MakeInt16WaveData(data));
}
MakeInt16WaveData looks like this:
public static byte[] MakeInt16WaveData(Int16[] ints)
{
int s = sizeof(Int16);
byte[] buf = new byte[s * ints.Length];
for(int i = 0; i < ints.Length; i++)
{
Buffer.BlockCopy(BitConverter.GetBytes(ints[i]), 0, buf, i * s, s);
}
return buf;
}
This works as expected! Now I wanted to swoop from one frequency to another like this:
public static Wave GetSineSwoop(double length, double hzStart, double hzEnd)
{
int bitRate = 44100;
int l = (int)(bitRate * length);
double f = 1.0 / bitRate;
Int16[] data = new Int16[l];
double hz;
double hzDelta = hzEnd - hzStart;
for (int i = 0; i < l; i++)
{
hz = hzStart + ((double)i / l) * hzDelta * 0.5; // why *0.5 ?
data[i] = (Int16)(Math.Sin(hz * i * f * Math.PI * 2) * Int16.MaxValue);
}
return new Wave(false, Wave.MakeInt16WaveData(data));
}
Now, when I swooped from 200 to 100 Hz, the sound played from 200 to 0 hertz. For some reason I had to multiply the delta by 0.5 to get the correct output. What might be the issue here ? Is this an audio thing or is there a bug in my code ?
Thanks
Edit by TaW: I take the liberty of adding screenshots of the data in a chart to illustrate the problem. The first is with the 0.5 factor, the second with 1.0, and the third and fourth with 2.0 and 5.0:
[chart screenshots]
Edit: here is an example, a swoop from 200 to 100 Hz:
[screenshot: debug values]
[screenshot: the wave clearly does not end at 100 Hz]
Digging out my rusty math I think it may be because:
Going in L steps from frequency F1 to F2 you have a frequency of
Fi = F1 + i * ( F2 - F1 ) / L
or with
( F2 - F1 ) / L = S
Fi = F1 + i * S
Now, to find out how far we have progressed (that is, the phase that goes into the sine), we need the integral of the frequency over i, which would be
I(Fi) = i * F1 + 1/2 * i ^ 2 * S
This is the term inside your formula for the sine, including the factor 0.5: hz * i with hz = F1 + 0.5 * i * S.
Note that you can gain efficiency by moving the constant part (S/2) out of the loop.
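An alternative that sidesteps the 0.5 factor is to accumulate the phase sample by sample, which is a discrete version of the integral above. The sketch below is not the asker's code: the names SweepSketch and LinearSweep and the sampleRate parameter are made up for illustration, and it returns the raw samples instead of a Wave object.

```csharp
using System;

public static class SweepSketch
{
    // Linear sweep from hzStart to hzEnd, built by summing the
    // per-sample phase increment 2*pi*f/sampleRate.
    public static short[] LinearSweep(double seconds, double hzStart, double hzEnd,
                                      int sampleRate = 44100)
    {
        int n = (int)(sampleRate * seconds);
        var data = new short[n];
        double phase = 0.0;
        for (int i = 0; i < n; i++)
        {
            data[i] = (short)(Math.Sin(phase) * short.MaxValue);
            // Instantaneous frequency at sample i (no 0.5 factor here).
            double hz = hzStart + (hzEnd - hzStart) * i / n;
            phase += 2.0 * Math.PI * hz / sampleRate;
        }
        return data;
    }
}
```

Because the frequency is summed rather than multiplied by i, the sweep ends at hzEnd, and the same loop works for non-linear frequency curves.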
I am trying to replicate some formulas but am having trouble translating the math to code.
Here is the simple Exponential Moving Average
In c#:
out[1] = values[1];
for (i in 2:N(X)) {
tmp = (times[i] - times[i-1]) / tau;
w = exp(-tmp);
w2 = (1 - w) / tmp;
out[i] = out[i-1] * w + values[i] * (1 - w2) + values[i-1] * (w2 - w);
}
In Python:
mu = numpy.exp ((ts[1] - ts[0]) / self.tau)
nu = 1.0 - mu
return numpy.array ([
mu * el + nu * arr[0] for el, arr in zip (last, arrays)
])
I want to be able to specify different kernels and am not sure how to go about it as described here:
This is all done so I can eventually recreate this moving differential given here:
Thanks for any help given
One possible approach here is to have a method that returns the kernel.
From what I am able to see, inputs to this method would be kerneltype, i, and otherInputs.
A simple approach would be:
for (int i = 1; i < values.Length; i++)
{
    tmp = (times[i] - times[i-1]) / tau;
    //w = exp(-tmp);
    //w2 = (1 - w) / tmp;
    List<Object> kernelInputsInitial = new List<Object>();
    kernelInputsInitial.Add(tmp); //takes in the first argument
    kernelInputsInitial.Add(true); //expected to calculate the first term
    w = GetKernel(KernelType.Exponential, i, kernelInputsInitial);
    List<Object> kernelInputsSecondTerm = new List<Object>();
    kernelInputsSecondTerm.Add(w); //takes in the first term
    kernelInputsSecondTerm.Add(false); //expected to calculate the second term
    w2 = GetKernel(KernelType.Exponential, i, kernelInputsSecondTerm);
    // "out" is a reserved word in C#, so the result array is called output here
    output[i] = output[i-1] * w + values[i] * (1 - w2) + values[i-1] * (w2 - w);
    ....
}
This is of course terribly, terribly rough, and a lot of improvement can be made, but it is intended to merely get the point across.
I would use an interface to represent a kernel, and have classes derived per kernel. In my experience, that produces sufficiently readable and maintainable code, but there's always room for improvement.
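A rough sketch of the interface idea, under some assumptions: the names IEmaKernel, LinearKernel, NextPointKernel and EmaSketch are invented here, and the timestamps are assumed to be strictly increasing. LinearKernel reproduces the update from the question; NextPointKernel is only a simple hypothetical variant to show how a second kernel would plug in, not something taken from the formulas the question refers to.

```csharp
using System;

// One weighting scheme per class; tmp = (t[i] - t[i-1]) / tau.
public interface IEmaKernel
{
    double Step(double prevOut, double prevValue, double value, double tmp);
}

// The update rule posted in the question.
public sealed class LinearKernel : IEmaKernel
{
    public double Step(double prevOut, double prevValue, double value, double tmp)
    {
        double w = Math.Exp(-tmp);
        double w2 = (1 - w) / tmp;
        return prevOut * w + value * (1 - w2) + prevValue * (w2 - w);
    }
}

// Hypothetical variant: weights only the current value.
public sealed class NextPointKernel : IEmaKernel
{
    public double Step(double prevOut, double prevValue, double value, double tmp)
    {
        double w = Math.Exp(-tmp);
        return prevOut * w + value * (1 - w);
    }
}

public static class EmaSketch
{
    public static double[] Ema(double[] values, double[] times, double tau, IEmaKernel kernel)
    {
        var output = new double[values.Length];
        if (values.Length == 0) return output;
        output[0] = values[0];
        for (int i = 1; i < values.Length; i++)
        {
            double tmp = (times[i] - times[i - 1]) / tau;
            output[i] = kernel.Step(output[i - 1], values[i - 1], values[i], tmp);
        }
        return output;
    }
}
```

The loop stays the same for every kernel, so adding a new one means adding a class rather than touching the loop.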
I need to divide a numeric range to some segments that have same length. But I can't decide which way is more accurate. For example:
double r1 = 100.0, r2 = 1000.0, r = r2 - r1;
int n = 30;
double[] position = new double[n];
for (int i = 0; i < n; i++)
{
position[i] = r1 + (double)i / n * r;
// position[i] = r1 + i * r / n;
}
It's about (double)int1 / int2 * double or int1 * double / int2. Which way is more accurate? Which way should I use?
Update
The following code will show the difference:
double r1 = 1000.0, r2 = 100000.0, r = r2 - r1;
int n = 300;
double[] position = new double[n];
for (int i = 0; i < n; i++)
{
double v1 = r1 + (double)i / n * r;
double v2 = position[i] = r1 + i * r / n;
if (v1 != v2)
{
Console.WriteLine(v2 - v1);
}
}
Disclaimer: All numbers I am going to give as examples are not exact, but show the principle of what's happening behind the scenes.
Let's examine two cases:
(1) int1 = 1000, int2= 3, double = 3.0
The first method will give you: (1000.0 / 3) * 3 == 333.33333 * 3.0 == 999.999...
While the second will give (1000 * 3.0) / 3 == 3000 / 3 == 1000
In this scenario - the second method is more accurate.
(2) int1 = 2, int2 = 2, double = Double.MAX_VALUE
The first will yield (2.0 / 2) * Double.MAX_VALUE == 1 * Double.MAX_VALUE == Double.MAX_VALUE
While the second will give (2 * Double.MAX_VALUE) / 2, which (in Java) results in Infinity. I am not sure what the floating-point standard says about such cases, whether it might overflow or is always Infinity, but it is definitely an issue.
So, in this case - the first method is more accurate.
Things might get more complicated if the integers are longs or the double is a float, since there are long values that cannot be represented exactly by doubles, so loss of accuracy might happen for large values in this case; in any case, large double values are less accurate.
Conclusion: Which is better is domain specific. In some cases the first method should be better and in some the second. It really depends on the values of int1, int2, and double.
However, AFAIK, the general rule of thumb with double-precision operations is to keep the calculations as small as possible (don't create huge numbers and then decrease them back; keep them small as long as you can). This issue is known as loss of significant digits.
Neither is particularly faster, since the compiler or the JIT process may reorder the operation for efficiency anyway.
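The overflow case (2) can be checked in C# with a small sketch like the one below. The class and method names are made up for illustration. The stated behaviour assumes a runtime that evaluates double arithmetic in 64-bit IEEE 754, as current .NET runtimes do; older x87-based JITs could keep the intermediate product in extended precision and behave differently.

```csharp
using System;

public static class OrderSketch
{
    // (double)int1 / int2 * double
    public static double DivideFirst(int i, int n, double r) => (double)i / n * r;

    // int1 * double / int2
    public static double MultiplyFirst(int i, int n, double r) => i * r / n;
}
```

With i = 2, n = 2 and r = double.MaxValue, DivideFirst returns double.MaxValue, while MultiplyFirst overflows in the intermediate product and returns positive infinity.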
Maybe I misunderstand your requirement but why do any division/multiplication inside the loop at all? Maybe this would get the same results:
decimal r1 = 100.0m, r2 = 1000.0m, r = r2 - r1;
int n = 30;
decimal[] position = new decimal[n];
decimal diff = r / n;
decimal current = r1;
for (int i = 0; i < n; i++)
{
position[i] = current;
current += diff;
}