I have a question about using Math.Exp() in C#. The code is for kernel density estimation; I had no prior knowledge of kernel density estimation, so I read the Wikipedia article and some papers and tried to write it in C#. The problem is that as "distance" gets larger, the result becomes 0. This confuses me, and I cannot find any other way to get the right result.
disExp = Math.Pow(Math.E, -(distance / 2 * Math.Pow(h, 2)));
So, can anyone help me find a solution, or give me some ideas about kernel density estimation in C#?
Try this
public static double[,] KernelDensityEstimation(double[] data, double sigma, int nsteps)
{
// Probability density function (PDF) signal analysis.
// Works like ksdensity in MATLAB.
// Performs kernel density estimation (KDE) on one-dimensional data.
// http://en.wikipedia.org/wiki/Kernel_density_estimation
// Input:  -data:   input data, one-dimensional
//         -sigma:  bandwidth (sometimes called "h")
//         -nsteps: number of abscissa points (default 100; the original
//                  MATLAB version also accepts an array of abscissa points)
// Output: -x: equispaced abscissa points
//         -y: estimates of p(x)
// This function is part of the Kernel Methods Toolbox (KMBOX) for MATLAB.
// http://sourceforge.net/p/kmbox
// Converted to C# code by ksandric
double[,] result = new double[nsteps, 2];
double[] x = new double[nsteps], y = new double[nsteps];
double MAX = Double.MinValue, MIN = Double.MaxValue;
int N = data.Length; // number of data points
// Find MIN MAX values in data
for (int i = 0; i < N; i++)
{
if (MAX < data[i])
{
MAX = data[i];
}
if (MIN > data[i])
{
MIN = data[i];
}
}
// Like MATLAB linspace(MIN, MAX, nsteps);
x[0] = MIN;
for (int i = 1; i < nsteps; i++)
{
x[i] = x[i - 1] + ((MAX - MIN) / (nsteps - 1)); // divide by nsteps - 1 so the last point reaches MAX, as linspace does
}
// kernel density estimation
double c = 1.0 / (Math.Sqrt(2 * Math.PI * sigma * sigma));
for (int i = 0; i < N; i++)
{
for (int j = 0; j < nsteps; j++)
{
y[j] = y[j] + 1.0 / N * c * Math.Exp(-(data[i] - x[j]) * (data[i] - x[j]) / (2 * sigma * sigma));
}
}
// copy x and y into the result array; convenient for creating a plot(x, y)
for (int i = 0; i < nsteps; i++)
{
result[i, 0] = x[i];
result[i, 1] = y[i];
}
return result;
}
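For completeness, a minimal usage sketch (the sample data and the bandwidth here are made up; pick a bandwidth that suits your data):

double[] samples = { 1.2, 1.9, 2.1, 2.8, 3.1, 3.3, 4.0, 4.6 }; // assumed example data
double sigma = 0.5;  // bandwidth h (an assumption; tune for your data)
int nsteps = 100;
double[,] kde = KernelDensityEstimation(samples, sigma, nsteps);
for (int i = 0; i < nsteps; i++)
{
    Console.WriteLine("x = {0:F3}, p(x) = {1:F4}", kde[i, 0], kde[i, 1]);
}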
This is a piece of my code, which computes the derivative (a convolution of the image with a filter). It works correctly, but it takes a long time (because of the image height and width).
"Data" is a grey image bitmap.
"Filter" is [3,3] matrix.
"fh" and "fw" maximum values are 3.
I am looking to speed up this code.
I also tried using Parallel.For, but it didn't work correctly (index out of bounds errors).
private float[,] Differentiate(int[,] Data, int[,] Filter)
{
int i, j, k, l, Fh, Fw;
Fw = Filter.GetLength(0);
Fh = Filter.GetLength(1);
float sum = 0;
float[,] Output = new float[Width, Height];
for (i = Fw / 2; i <= (Width - Fw / 2) - 1; i++)
{
for (j = Fh / 2; j <= (Height - Fh / 2) - 1; j++)
{
sum=0;
for(k = -Fw/2; k <= Fw/2; k++)
{
for(l = -Fh/2; l <= Fh/2; l++)
{
sum = sum + Data[i+k, j+l] * Filter[Fw/2+k, Fh/2+l];
}
}
Output[i,j] = sum;
}
}
return Output;
}
For parallel execution you need to drop the C-style variable declarations at the beginning of the method and declare the variables in the actual scope where they are used, so they are not shared between threads. Making the code parallel should provide some performance benefit, but turning every loop into a Parallel.For is not a good idea, as there is a limit to how many threads can actually run in parallel. I would parallelize the top-level loop only:
private static float[,] Differentiate(int[,] Data, int[,] Filter)
{
var Fw = Filter.GetLength(0);
var Fh = Filter.GetLength(1);
float[,] Output = new float[Width, Height];
// Note: Parallel.For's upper bound is exclusive, so Width - Fw / 2
// matches the sequential loop's i <= (Width - Fw / 2) - 1
Parallel.For(Fw / 2, Width - Fw / 2, i =>
{
for (var j = Fh / 2; j <= (Height - Fh / 2) - 1; j++)
{
var sum = 0f; // float accumulator, matching the sequential version
for (var k = -Fw / 2; k <= Fw / 2; k++)
{
for (var l = -Fh / 2; l <= Fh / 2; l++)
{
sum = sum + Data[i + k, j + l] * Filter[Fw / 2 + k, Fh / 2 + l];
}
}
Output[i, j] = sum;
}
});
return Output;
}
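Note that only the outer loop is parallelized: each thread gets its own value of i, and every (i, j) pair writes to a distinct element of Output, so the threads share no mutable state and no locking is needed.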
This is a perfect example of a task where using the GPU is better than using the CPU. A GPU is able to perform trillions of floating point operations per second (TFlops), while CPU performance is still measured in GFlops. The catch is that the GPU only pays off for SIMD-style work (Single Instruction Multiple Data): it excels at data-parallel tasks, but if different data needs different instructions, using the GPU has no advantage.
In your program, the elements of your bitmap go through the same calculations: the same computations just with slightly different data (SIMD!). So using the GPU is a great option. This won't be too complex because with your calculations threads on the GPU would not need to exchange information, nor would they be dependent on results of previous iterations (Each element would be processed by a different thread on the GPU).
You can use, for example, OpenCL to easily access the GPU. More on OpenCL and using the GPU here: https://www.codeproject.com/Articles/502829/GPGPU-image-processing-basics-using-OpenCL-NET
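For illustration only, here is roughly what the device-side convolution could look like in OpenCL C, held in a C# string constant. The host-side setup (context, command queue, buffers, kernel launch) is omitted and depends on the wrapper library you choose; every name below is made up.

const string ConvolveKernelSource = @"
__kernel void convolve(__global const int* data,
                       __global const int* filter,
                       __global float* output,
                       const int width, const int height,
                       const int fw, const int fh)
{
    // one work-item per output pixel
    int x = get_global_id(0);
    int y = get_global_id(1);
    if (x < fw / 2 || x >= width - fw / 2 || y < fh / 2 || y >= height - fh / 2)
        return;
    float sum = 0.0f;
    for (int k = -fw / 2; k <= fw / 2; k++)
        for (int l = -fh / 2; l <= fh / 2; l++)
            sum += data[(x + k) + (y + l) * width] * filter[(fw / 2 + k) + (fh / 2 + l) * fw];
    output[x + y * width] = sum;
}";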
I'm taking the Coursera machine learning course right now and I can't get my gradient descent linear regression function to minimize. I use one dependent variable, an intercept, and four values of x and y, so the equations are fairly simple. The final value of the gradient descent equation varies wildly depending on the initial values of alpha and beta, and I can't figure out why.
I've only been coding for about two weeks, so my knowledge is limited, to say the least; please keep this in mind if you take the time to help.
using System;
namespace LinearRegression
{
class Program
{
static void Main(string[] args)
{
Random rnd = new Random();
const int N = 4;
//We randomize the initial values of alpha and beta
double theta1 = rnd.Next(0, 100);
double theta2 = rnd.Next(0, 100);
//Values of x, i.e. the independent variable
double[] x = new double[N] { 1, 2, 3, 4 };
//Values of y, i.e. the dependent variable
double[] y = new double[N] { 5, 7, 9, 12 };
double sumOfSquares1;
double sumOfSquares2;
double temp1;
double temp2;
double sum;
double learningRate = 0.001;
int count = 0;
do
{
//We reset the Generalized cost function, called sum of squares
//since I originally used SS to
//determine if the function was minimized
sumOfSquares1 = 0;
sumOfSquares2 = 0;
//Adding 1 to counter for each iteration to keep track of how
//many iterations are completed thus far
count += 1;
//First we calculate the Generalized cost function, which is
//to be minimized
sum = 0;
for (int i = 0; i < (N - 1); i++)
{
sum += Math.Pow((theta1 + theta2 * x[i] - y[i]), 2);
}
//Since we have 4 values of x and y we have 1/(2*N) = 1 /8 = 0.125
sumOfSquares1 = 0.125 * sum;
//Then we calculate the new alpha value, using the derivative of
//the cost function.
sum = 0;
for (int i = 0; i < (N - 1); i++)
{
sum += theta1 + theta2 * x[i] - y[i];
}
//Since we have 4 values of x and y we have 1/(N) = 1 /4 = 0.25
temp1 = theta1 - learningRate * 0.25 * sum;
//Same for the beta value, it has a different derivative
sum = 0;
for (int i = 0; i < (N - 1); i++)
{
sum += (theta1 + theta2 * x[i]) * x[i] - y[i];
}
temp2 = theta2 - learningRate * 0.25 * sum;
//We change the values of alpha and beta at the same time, otherwise the
//function won't work
theta1 = temp1;
theta2 = temp2;
//We then calculate the cost function again, with new alpha and beta values
sum = 0;
for (int i = 0; i < (N - 1); i++)
{
sum += Math.Pow((theta1 + theta2 * x[i] - y[i]), 2);
}
sumOfSquares2 = 0.125 * sum;
Console.WriteLine("Alpha: {0:N}", theta1);
Console.WriteLine("Beta: {0:N}", theta2);
Console.WriteLine("GCF Before: {0:N}", sumOfSquares1);
Console.WriteLine("GCF After: {0:N}", sumOfSquares2);
Console.WriteLine("Iterations: {0}", count);
Console.WriteLine(" ");
} while (sumOfSquares2 <= sumOfSquares1 && count < 5000);
//we end the iteration cycle once the generalized cost function
//cannot be reduced any further or after 5000 iterations
Console.ReadLine();
}
}
}
There are two bugs in the code.
First, I assume that you would like to iterate through all the elements in the array, so rework the for loops like this: for (int i = 0; i < N; i++)
Second, when updating the theta2 value the summation is not calculated correctly. According to the update function it should look like this: sum += (theta1 + theta2 * x[i] - y[i]) * x[i];
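Putting both fixes together, the two parameter-update loops become (a sketch reusing the variable names from your code):

sum = 0;
for (int i = 0; i < N; i++)                        // iterate over all N points
{
    sum += theta1 + theta2 * x[i] - y[i];
}
temp1 = theta1 - learningRate * 0.25 * sum;

sum = 0;
for (int i = 0; i < N; i++)
{
    sum += (theta1 + theta2 * x[i] - y[i]) * x[i]; // corrected derivative for theta2
}
temp2 = theta2 - learningRate * 0.25 * sum;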
Why do the final values depend on the initial values?
Because the gradient descent update step is calculated from these values. If the initial values (the starting point) are too big or too small, the starting point is too far away from the final values. You could solve this problem by:
Increasing the number of iterations (e.g. 5000 to 50000): the gradient descent algorithm has more time to converge.
Increasing the learning rate (e.g. 0.001 to 0.01): the gradient descent update steps are bigger, so it converges faster. Note: if the learning rate is too big, it is possible to step over the minimum.
The slope (theta2) is around 2.3 and the intercept (theta1) is around 2.5 for the given data. I have created a GitHub project to fix your code, and I have also added a shorter solution using LINQ. It is 5 lines of code. If you are curious, check it out here.
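For reference, a closed-form ordinary least squares fit in LINQ looks roughly like this (a sketch, not necessarily the linked project's exact code; requires using System.Linq):

double meanX = x.Average(), meanY = y.Average();
double beta = x.Zip(y, (xi, yi) => (xi - meanX) * (yi - meanY)).Sum()
            / x.Sum(xi => (xi - meanX) * (xi - meanX));
double alpha = meanY - beta * meanX;
Console.WriteLine("Intercept: {0:N}, Slope: {1:N}", alpha, beta);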
I am a newcomer to sound programming. I have a real-time sound visualizer (http://www.codeproject.com/Articles/20025/Sound-visualizer-in-C), which I downloaded from codeproject.com.
In the AudioFrame.cs class there is the following line:
_fftLeft = FourierTransform.FFTDb(ref _waveLeft);
_fftLeft is a double array, and _waveLeft is also a double array. As shown above, they applied the FourierTransform.cs class's FFTDb function to the _waveLeft array.
Here is FFTDb function:
static public double[] FFTDb(ref double[] x)
{
n = x.Length;
nu = (int)(Math.Log(n) / Math.Log(2));
int n2 = n / 2;
int nu1 = nu - 1;
double[] xre = new double[n];
double[] xim = new double[n];
double[] decibel = new double[n2];
double tr, ti, p, arg, c, s;
for (int i = 0; i < n; i++)
{
xre[i] = x[i];
xim[i] = 0.0f;
}
int k = 0;
for (int l = 1; l <= nu; l++)
{
while (k < n)
{
for (int i = 1; i <= n2; i++)
{
p = BitReverse(k >> nu1);
arg = 2 * (double)Math.PI * p / n;
c = (double)Math.Cos(arg);
s = (double)Math.Sin(arg);
tr = xre[k + n2] * c + xim[k + n2] * s;
ti = xim[k + n2] * c - xre[k + n2] * s;
xre[k + n2] = xre[k] - tr;
xim[k + n2] = xim[k] - ti;
xre[k] += tr;
xim[k] += ti;
k++;
}
k += n2;
}
k = 0;
nu1--;
n2 = n2 / 2;
}
k = 0;
int r;
while (k < n)
{
r = BitReverse(k);
if (r > k)
{
tr = xre[k];
ti = xim[k];
xre[k] = xre[r];
xim[k] = xim[r];
xre[r] = tr;
xim[r] = ti;
}
k++;
}
for (int i = 0; i < n / 2; i++)
decibel[i] = 10.0 * Math.Log10((float)(Math.Sqrt((xre[i] * xre[i]) + (xim[i] * xim[i]))));
return decibel;
}
When I play a musical note on a guitar, I want to know its frequency in numerical form. I wrote a foreach loop to inspect the output of the _fftLeft array, as below:
foreach (double myarray in _fftLeft)
{
Console.WriteLine(myarray );
}
The output contains lots of real-time values, like these:
41.3672743963389
43.0176034462662
35.3677383746087
42.5968946936404
42.0600935794783
36.7521669642071
41.6356709559342
41.7189032845742
41.1002451261724
40.8035583510188
45.604366914128
39.645552593115
I want to know what those values are (frequencies or not). If they are frequencies, why does the output contain low values? And when I play a guitar note, I want to detect the frequency of that particular note.
Based on the posted code, FFTDb first computes the FFT, then computes and returns the magnitudes of the frequency spectrum on the logarithmic decibel scale. In other words, _fftLeft contains magnitudes for a discrete set of frequencies. The actual values of those frequencies can be computed from the array index and the sampling frequency, according to this answer.
As an example, if you were plotting the _fftLeft output for a pure sinusoidal tone input you should be able to see a clear spike in the index corresponding to the sinusoidal frequency. For a guitar note however you are likely going to see multiple spikes in magnitude corresponding to the harmonics. To detect the note's frequency aka pitch is a more complicated topic and typically requires the use of one of several pitch detection algorithms.
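To map an index in _fftLeft to a frequency, use the sampling rate and the FFT size. A sketch (sampleRate is an assumption; substitute whatever your capture code actually uses):

double sampleRate = 44100.0;     // assumption: the visualizer's capture rate
int fftSize = _waveLeft.Length;  // the FFT input length (a power of two)
for (int i = 0; i < _fftLeft.Length; i++)
{
    double freqHz = i * sampleRate / fftSize;   // frequency of bin i in Hz
    Console.WriteLine("{0,8:F1} Hz : {1:F2} dB", freqHz, _fftLeft[i]);
}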
I'm attempting to implement a simple Gaussian Blur function, however when run on the image, it just comes back as more opaque than the original; no blur takes place.
public double[,] CreateGaussianFilter(int size)
{
double[,] gKernel = new double[size,size];
for (int y = 0; y < size; y++)
for (int x = 0; x < size; x++)
gKernel[y,x] = 0.0;
// set standard deviation to 1.0
double sigma = 1.0;
double r, s = 2.0 * sigma * sigma;
// sum is for normalization
double sum = 0.0;
// generate kernel
for (int x = -size/2; x <= size/2; x++)
{
for(int y = -size/2; y <= size/2; y++)
{
r = Math.Sqrt(x*x + y*y);
gKernel[x + size/2, y + size/2] = (Math.Exp(-(r*r)/s))/(Math.PI * s);
sum += gKernel[x + size/2, y + size/2];
}
}
// normalize the Kernel
for(int i = 0; i < size; ++i)
for(int j = 0; j < size; ++j)
gKernel[i,j] /= sum;
return gKernel;
}
public void GaussianFilter(ref LockBitmap image, double[,] filter)
{
int size = filter.GetLength(0);
for (int y = size/2; y < image.Height - size/2; y++)
{
for (int x = size/2; x < image.Width - size/2; x++)
{
//Grab surrounding pixels and stick them in an accumulator
double sum = 0.0;
int filter_y = 0;
for (int r = y - (size / 2); r < y + (size / 2); r++)
{
int filter_x = 0;
for (int c = x - (size / 2); c < x + (size / 2); c++)
{
//Multiple surrounding pixels by filter, add them up and set the center pixel (x,y) to this value
Color pixelVal = image.GetPixel(c, r);
double grayVal = (pixelVal.B + pixelVal.R + pixelVal.G) / 3.0;
sum += grayVal * filter[filter_y,filter_x];
filter_x++;
}
filter_y++;
}
//set the xy pixel
image.SetPixel(x,y, Color.FromArgb(255, (int)sum,(int)sum, (int)sum));
}
}
}
Any suggestions are much appreciated. Thanks!
There are a number of issues with your solution:
1. A convolved image getting darker generally means the kernel has a gain of less than 1. Though perhaps not in this case; see (5).
2. Gaussian blur is a separable kernel and can be performed in far less time than brute force.
3. Averaging RGB to gray is not an optically "correct" way of computing luminance.
4. GetPixel/SetPixel approaches are generally very slow. If you are in a language supporting pointers you should use them. Looks like C#? Use unsafe code to get access to pointers.
5. Casting to int truncates, and this could be your source of decreased brightness: you are in essence always rounding down.
6. Your nested loops in the kernel-generating function contain excessive bounds adjustments. This could be made much faster, or better yet replaced with a separable approach.
7. You are convolving in a single buffer, therefore you are convolving already-convolved values. A sketch addressing (5) and (7) follows below.
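A minimal sketch of fixes for (5) and (7), assuming your LockBitmap class and separate source and destination bitmaps so that already-blurred pixels are never read back:

public void GaussianFilter(LockBitmap source, LockBitmap dest, double[,] filter)
{
    int size = filter.GetLength(0);
    for (int y = size / 2; y < source.Height - size / 2; y++)
    {
        for (int x = size / 2; x < source.Width - size / 2; x++)
        {
            double sum = 0.0;
            for (int fy = 0; fy < size; fy++)       // walk the full kernel,
            {                                       // including its last row/column
                for (int fx = 0; fx < size; fx++)
                {
                    Color p = source.GetPixel(x + fx - size / 2, y + fy - size / 2);
                    double gray = (p.R + p.G + p.B) / 3.0;
                    sum += gray * filter[fy, fx];
                }
            }
            int v = (int)Math.Round(sum);           // round instead of truncating (5)
            v = Math.Max(0, Math.Min(255, v));      // clamp to the valid byte range
            dest.SetPixel(x, y, Color.FromArgb(255, v, v, v)); // write to dest, not source (7)
        }
    }
}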
Do you know of a .net library to perform a LOESS/LOWESS regression? (preferably free/open source)
Port from Java to C#:
public class LoessInterpolator
{
public static double DEFAULT_BANDWIDTH = 0.3;
public static int DEFAULT_ROBUSTNESS_ITERS = 2;
/**
* The bandwidth parameter: when computing the loess fit at
* a particular point, this fraction of source points closest
* to the current point is taken into account for computing
* a least-squares regression.
*
* A sensible value is usually 0.25 to 0.5.
*/
private double bandwidth;
/**
* The number of robustness iterations parameter: this many
* robustness iterations are done.
*
* A sensible value is usually 0 (just the initial fit without any
* robustness iterations) to 4.
*/
private int robustnessIters;
public LoessInterpolator()
{
this.bandwidth = DEFAULT_BANDWIDTH;
this.robustnessIters = DEFAULT_ROBUSTNESS_ITERS;
}
public LoessInterpolator(double bandwidth, int robustnessIters)
{
if (bandwidth < 0 || bandwidth > 1)
{
throw new ApplicationException(string.Format("bandwidth must be in the interval [0,1], but got {0}", bandwidth));
}
this.bandwidth = bandwidth;
if (robustnessIters < 0)
{
throw new ApplicationException(string.Format("the number of robustness iterations must be non-negative, but got {0}", robustnessIters));
}
this.robustnessIters = robustnessIters;
}
/**
* Compute a loess fit on the data at the original abscissae.
*
* @param xval the arguments for the interpolation points
* @param yval the values for the interpolation points
* @return values of the loess fit at corresponding original abscissae
* @throws MathException if some of the following conditions are false:
* <ul>
* <li> Arguments and values are of the same size that is greater than zero</li>
* <li> The arguments are in a strictly increasing order</li>
* <li> All arguments and values are finite real numbers</li>
* </ul>
*/
public double[] smooth(double[] xval, double[] yval)
{
if (xval.Length != yval.Length)
{
throw new ApplicationException(string.Format("Loess expects the abscissa and ordinate arrays to be of the same size, but got {0} abscisssae and {1} ordinatae", xval.Length, yval.Length));
}
int n = xval.Length;
if (n == 0)
{
throw new ApplicationException("Loess expects at least 1 point");
}
checkAllFiniteReal(xval, true);
checkAllFiniteReal(yval, false);
checkStrictlyIncreasing(xval);
if (n == 1)
{
return new double[] { yval[0] };
}
if (n == 2)
{
return new double[] { yval[0], yval[1] };
}
int bandwidthInPoints = (int)(bandwidth * n);
if (bandwidthInPoints < 2)
{
throw new ApplicationException(string.Format("the bandwidth must be large enough to accomodate at least 2 points. There are {0} " +
" data points, and bandwidth must be at least {1} but it is only {2}",
n, 2.0 / n, bandwidth
));
}
double[] res = new double[n];
double[] residuals = new double[n];
double[] sortedResiduals = new double[n];
double[] robustnessWeights = new double[n];
// Do an initial fit and 'robustnessIters' robustness iterations.
// This is equivalent to doing 'robustnessIters+1' robustness iterations
// starting with all robustness weights set to 1.
for (int i = 0; i < robustnessWeights.Length; i++) robustnessWeights[i] = 1;
for (int iter = 0; iter <= robustnessIters; ++iter)
{
int[] bandwidthInterval = { 0, bandwidthInPoints - 1 };
// At each x, compute a local weighted linear regression
for (int i = 0; i < n; ++i)
{
double x = xval[i];
// Find out the interval of source points on which
// a regression is to be made.
if (i > 0)
{
updateBandwidthInterval(xval, i, bandwidthInterval);
}
int ileft = bandwidthInterval[0];
int iright = bandwidthInterval[1];
// Compute the point of the bandwidth interval that is
// farthest from x
int edge;
if (xval[i] - xval[ileft] > xval[iright] - xval[i])
{
edge = ileft;
}
else
{
edge = iright;
}
// Compute a least-squares linear fit weighted by
// the product of robustness weights and the tricube
// weight function.
// See http://en.wikipedia.org/wiki/Linear_regression
// (section "Univariate linear case")
// and http://en.wikipedia.org/wiki/Weighted_least_squares
// (section "Weighted least squares")
double sumWeights = 0;
double sumX = 0, sumXSquared = 0, sumY = 0, sumXY = 0;
double denom = Math.Abs(1.0 / (xval[edge] - x));
for (int k = ileft; k <= iright; ++k)
{
double xk = xval[k];
double yk = yval[k];
double dist;
if (k < i)
{
dist = (x - xk);
}
else
{
dist = (xk - x);
}
double w = tricube(dist * denom) * robustnessWeights[k];
double xkw = xk * w;
sumWeights += w;
sumX += xkw;
sumXSquared += xk * xkw;
sumY += yk * w;
sumXY += yk * xkw;
}
double meanX = sumX / sumWeights;
double meanY = sumY / sumWeights;
double meanXY = sumXY / sumWeights;
double meanXSquared = sumXSquared / sumWeights;
double beta;
if (meanXSquared == meanX * meanX)
{
beta = 0;
}
else
{
beta = (meanXY - meanX * meanY) / (meanXSquared - meanX * meanX);
}
double alpha = meanY - beta * meanX;
res[i] = beta * x + alpha;
residuals[i] = Math.Abs(yval[i] - res[i]);
}
// No need to recompute the robustness weights at the last
// iteration, they won't be needed anymore
if (iter == robustnessIters)
{
break;
}
// Recompute the robustness weights.
// Find the median residual.
// An arraycopy and a sort are completely tractable here,
// because the preceding loop is a lot more expensive
System.Array.Copy(residuals, sortedResiduals, n);
//System.arraycopy(residuals, 0, sortedResiduals, 0, n);
Array.Sort<double>(sortedResiduals);
double medianResidual = sortedResiduals[n / 2];
if (medianResidual == 0)
{
break;
}
for (int i = 0; i < n; ++i)
{
double arg = residuals[i] / (6 * medianResidual);
robustnessWeights[i] = (arg >= 1) ? 0 : Math.Pow(1 - arg * arg, 2);
}
}
return res;
}
/**
* Given an index interval into xval that embraces a certain number of
* points closest to xval[i-1], update the interval so that it embraces
* the same number of points closest to xval[i]
*
* @param xval arguments array
* @param i the index around which the new interval should be computed
* @param bandwidthInterval a two-element array {left, right} such that: <p/>
* <tt>(left==0 or xval[i] - xval[left-1] > xval[right] - xval[i])</tt>
* <p/> and also <p/>
* <tt>(right==xval.length-1 or xval[right+1] - xval[i] > xval[i] - xval[left])</tt>.
* The array will be updated.
*/
private static void updateBandwidthInterval(double[] xval, int i, int[] bandwidthInterval)
{
int left = bandwidthInterval[0];
int right = bandwidthInterval[1];
// The right edge should be adjusted if the next point to the right
// is closer to xval[i] than the leftmost point of the current interval
if (right < xval.Length - 1 && xval[right + 1] - xval[i] < xval[i] - xval[left])
{
bandwidthInterval[0]++;
bandwidthInterval[1]++;
}
}
/**
* Compute the tricube weight function
*
* @param x the argument
* @return (1-|x|^3)^3
*/
private static double tricube(double x)
{
double tmp = Math.Abs(x);
tmp = 1 - tmp * tmp * tmp;
return tmp * tmp * tmp;
}
/**
* Check that all elements of an array are finite real numbers.
*
* @param values the values array
* @param isAbscissae if true, elements are abscissae otherwise they are ordinatae
* @throws MathException if one of the values is not
* a finite real number
*/
private static void checkAllFiniteReal(double[] values, bool isAbscissae)
{
for (int i = 0; i < values.Length; i++)
{
double x = values[i];
if (Double.IsInfinity(x) || Double.IsNaN(x))
{
string pattern = isAbscissae ?
"all abscissae must be finite real numbers, but {0}-th is {1}" :
"all ordinatae must be finite real numbers, but {0}-th is {1}";
throw new ApplicationException(string.Format(pattern, i, x));
}
}
}
/**
* Check that elements of the abscissae array are in a strictly
* increasing order.
*
* @param xval the abscissae array
* @throws MathException if the abscissae array
* is not in a strictly increasing order
*/
private static void checkStrictlyIncreasing(double[] xval)
{
for (int i = 0; i < xval.Length; ++i)
{
if (i >= 1 && xval[i - 1] >= xval[i])
{
throw new ApplicationException(string.Format(
"the abscissae array must be sorted in a strictly " +
"increasing order, but the {0}-th element is {1} " +
"whereas {2}-th is {3}",
i - 1, xval[i - 1], i, xval[i]));
}
}
}
}
Since I'm unable to comment on other people's posts (new user), and people seem to think I should do that instead of editing the above answer, I'm simply going to write this as an answer, even though I know it would be better as a comment.
The updateBandwidthInterval method in the above answer forgets to check the left side, as described in the method comment. This can give NaN issues for sumWeights. The code below should fix that. I encountered this while doing a C++ implementation based on the above.
/**
* Given an index interval into xval that embraces a certain number of
* points closest to xval[i-1], update the interval so that it embraces
* the same number of points closest to xval[i]
*
* @param xval arguments array
* @param i the index around which the new interval should be computed
* @param bandwidthInterval a two-element array {left, right} such that: <p/>
* <tt>(left==0 or xval[i] - xval[left-1] > xval[right] - xval[i])</tt>
* <p/> and also <p/>
* <tt>(right==xval.length-1 or xval[right+1] - xval[i] > xval[i] - xval[left])</tt>.
* The array will be updated.
*/
private static void updateBandwidthInterval(double[] xval, int i, int[] bandwidthInterval)
{
int left = bandwidthInterval[0];
int right = bandwidthInterval[1];
// The edges should be adjusted if the previous point to the
// left is closer to x than the current point to the right or
// if the next point to the right is closer
// to x than the leftmost point of the current interval
if (left != 0 &&
xval[i] - xval[left - 1] < xval[right] - xval[i])
{
bandwidthInterval[0]++;
bandwidthInterval[1]++;
}
else if (right < xval.Length - 1 &&
xval[right + 1] - xval[i] < xval[i] - xval[left])
{
bandwidthInterval[0]++;
bandwidthInterval[1]++;
}
}
I hope someone finds this useful even five years later. This is the original code posted by Tutcugil, but with the missing methods added and updated.
using System;
using System.Linq;
namespace StockCorrelation
{
public class LoessInterpolator
{
public static double DEFAULT_BANDWIDTH = 0.3;
public static int DEFAULT_ROBUSTNESS_ITERS = 2;
/**
* The bandwidth parameter: when computing the loess fit at
* a particular point, this fraction of source points closest
* to the current point is taken into account for computing
* a least-squares regression.
*
* A sensible value is usually 0.25 to 0.5.
*/
private double bandwidth;
/**
* The number of robustness iterations parameter: this many
* robustness iterations are done.
*
* A sensible value is usually 0 (just the initial fit without any
* robustness iterations) to 4.
*/
private int robustnessIters;
public LoessInterpolator()
{
this.bandwidth = DEFAULT_BANDWIDTH;
this.robustnessIters = DEFAULT_ROBUSTNESS_ITERS;
}
public LoessInterpolator(double bandwidth, int robustnessIters)
{
if (bandwidth < 0 || bandwidth > 1)
{
throw new ApplicationException(string.Format("bandwidth must be in the interval [0,1], but got {0}", bandwidth));
}
this.bandwidth = bandwidth;
if (robustnessIters < 0)
{
throw new ApplicationException(string.Format("the number of robustness iterations must be non-negative, but got {0}", robustnessIters));
}
this.robustnessIters = robustnessIters;
}
/**
* Compute a loess fit on the data at the original abscissae.
*
* @param xval the arguments for the interpolation points
* @param yval the values for the interpolation points
* @return values of the loess fit at corresponding original abscissae
* @throws MathException if some of the following conditions are false:
* <ul>
* <li> Arguments and values are of the same size that is greater than zero</li>
* <li> The arguments are in a strictly increasing order</li>
* <li> All arguments and values are finite real numbers</li>
* </ul>
*/
public double[] smooth(double[] xval, double[] yval, double[] weights)
{
if (xval.Length != yval.Length)
{
throw new ApplicationException(string.Format("Loess expects the abscissa and ordinate arrays to be of the same size, but got {0} abscisssae and {1} ordinatae", xval.Length, yval.Length));
}
int n = xval.Length;
if (n == 0)
{
throw new ApplicationException("Loess expects at least 1 point");
}
checkAllFiniteReal(xval, true);
checkAllFiniteReal(yval, false);
checkStrictlyIncreasing(xval);
if (n == 1)
{
return new double[] { yval[0] };
}
if (n == 2)
{
return new double[] { yval[0], yval[1] };
}
int bandwidthInPoints = (int)(bandwidth * n);
if (bandwidthInPoints < 2)
{
throw new ApplicationException(string.Format("the bandwidth must be large enough to accomodate at least 2 points. There are {0} " +
" data points, and bandwidth must be at least {1} but it is only {2}",
n, 2.0 / n, bandwidth
));
}
double[] res = new double[n];
double[] residuals = new double[n];
double[] sortedResiduals = new double[n];
double[] robustnessWeights = new double[n];
// Do an initial fit and 'robustnessIters' robustness iterations.
// This is equivalent to doing 'robustnessIters+1' robustness iterations
// starting with all robustness weights set to 1.
for (int i = 0; i < robustnessWeights.Length; i++) robustnessWeights[i] = 1;
for (int iter = 0; iter <= robustnessIters; ++iter)
{
int[] bandwidthInterval = { 0, bandwidthInPoints - 1 };
// At each x, compute a local weighted linear regression
for (int i = 0; i < n; ++i)
{
double x = xval[i];
// Find out the interval of source points on which
// a regression is to be made.
if (i > 0)
{
updateBandwidthInterval(xval, weights, i, bandwidthInterval);
}
int ileft = bandwidthInterval[0];
int iright = bandwidthInterval[1];
// Compute the point of the bandwidth interval that is
// farthest from x
int edge;
if (xval[i] - xval[ileft] > xval[iright] - xval[i])
{
edge = ileft;
}
else
{
edge = iright;
}
// Compute a least-squares linear fit weighted by
// the product of robustness weights and the tricube
// weight function.
// See http://en.wikipedia.org/wiki/Linear_regression
// (section "Univariate linear case")
// and http://en.wikipedia.org/wiki/Weighted_least_squares
// (section "Weighted least squares")
double sumWeights = 0;
double sumX = 0, sumXSquared = 0, sumY = 0, sumXY = 0;
double denom = Math.Abs(1.0 / (xval[edge] - x));
for (int k = ileft; k <= iright; ++k)
{
double xk = xval[k];
double yk = yval[k];
double dist;
if (k < i)
{
dist = (x - xk);
}
else
{
dist = (xk - x);
}
double w = tricube(dist * denom) * robustnessWeights[k];
double xkw = xk * w;
sumWeights += w;
sumX += xkw;
sumXSquared += xk * xkw;
sumY += yk * w;
sumXY += yk * xkw;
}
double meanX = sumX / sumWeights;
double meanY = sumY / sumWeights;
double meanXY = sumXY / sumWeights;
double meanXSquared = sumXSquared / sumWeights;
double beta;
if (meanXSquared == meanX * meanX)
{
beta = 0;
}
else
{
beta = (meanXY - meanX * meanY) / (meanXSquared - meanX * meanX);
}
double alpha = meanY - beta * meanX;
res[i] = beta * x + alpha;
residuals[i] = Math.Abs(yval[i] - res[i]);
}
// No need to recompute the robustness weights at the last
// iteration, they won't be needed anymore
if (iter == robustnessIters)
{
break;
}
// Recompute the robustness weights.
// Find the median residual.
// An arraycopy and a sort are completely tractable here,
// because the preceding loop is a lot more expensive
System.Array.Copy(residuals, sortedResiduals, n);
//System.arraycopy(residuals, 0, sortedResiduals, 0, n);
Array.Sort<double>(sortedResiduals);
double medianResidual = sortedResiduals[n / 2];
if (medianResidual == 0)
{
break;
}
for (int i = 0; i < n; ++i)
{
double arg = residuals[i] / (6 * medianResidual);
robustnessWeights[i] = (arg >= 1) ? 0 : Math.Pow(1 - arg * arg, 2);
}
}
return res;
}
public double[] smooth(double[] xval, double[] yval)
{
if (xval.Length != yval.Length)
{
throw new Exception($"xval and yval len are different");
}
double[] unitWeights = Enumerable.Repeat(1.0, xval.Length).ToArray();
return smooth(xval, yval, unitWeights);
}
/**
* Given an index interval into xval that embraces a certain number of
* points closest to xval[i-1], update the interval so that it embraces
* the same number of points closest to xval[i]
*
* @param xval arguments array
* @param i the index around which the new interval should be computed
* @param bandwidthInterval a two-element array {left, right} such that: <p/>
* <tt>(left==0 or xval[i] - xval[left-1] > xval[right] - xval[i])</tt>
* <p/> and also <p/>
* <tt>(right==xval.length-1 or xval[right+1] - xval[i] > xval[i] - xval[left])</tt>.
* The array will be updated.
*/
private static void updateBandwidthInterval(double[] xval, double[] weights,
int i,
int[] bandwidthInterval)
{
int left = bandwidthInterval[0];
int right = bandwidthInterval[1];
// The right edge should be adjusted if the next point to the right
// is closer to xval[i] than the leftmost point of the current interval
int nextRight = nextNonzero(weights, right);
if (nextRight < xval.Length && xval[nextRight] - xval[i] < xval[i] - xval[left])
{
int nextLeft = nextNonzero(weights, bandwidthInterval[0]);
bandwidthInterval[0] = nextLeft;
bandwidthInterval[1] = nextRight;
}
}
private static int nextNonzero(double[] weights, int i)
{
int j = i + 1;
while (j < weights.Length && weights[j] == 0)
{
++j;
}
return j;
}
/**
* Compute the tricube weight function
*
* @param x the argument
* @return (1-|x|^3)^3
*/
private static double tricube(double x)
{
double tmp = Math.Abs(x);
tmp = 1 - tmp * tmp * tmp;
return tmp * tmp * tmp;
}
/**
* Check that all elements of an array are finite real numbers.
*
* @param values the values array
* @param isAbscissae if true, elements are abscissae otherwise they are ordinatae
* @throws MathException if one of the values is not
* a finite real number
*/
private static void checkAllFiniteReal(double[] values, bool isAbscissae)
{
for (int i = 0; i < values.Length; i++)
{
double x = values[i];
if (Double.IsInfinity(x) || Double.IsNaN(x))
{
string pattern = isAbscissae ?
"all abscissae must be finite real numbers, but {0}-th is {1}" :
"all ordinatae must be finite real numbers, but {0}-th is {1}";
throw new ApplicationException(string.Format(pattern, i, x));
}
}
}
/**
* Check that elements of the abscissae array are in a strictly
* increasing order.
*
* @param xval the abscissae array
* @throws MathException if the abscissae array
* is not in a strictly increasing order
*/
private static void checkStrictlyIncreasing(double[] xval)
{
for (int i = 0; i < xval.Length; ++i)
{
if (i >= 1 && xval[i - 1] >= xval[i])
{
throw new ApplicationException(string.Format(
"the abscissae array must be sorted in a strictly " +
"increasing order, but the {0}-th element is {1} " +
"whereas {2}-th is {3}",
i - 1, xval[i - 1], i, xval[i]));
}
}
}
}
}
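A minimal usage sketch (the data here is made up; xval must be strictly increasing):

using System;
using StockCorrelation;

class Program
{
    static void Main()
    {
        double[] xval = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        double[] yval = { 2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8, 18.1, 19.9 };
        var loess = new LoessInterpolator(0.5, 2);  // bandwidth 0.5, 2 robustness iterations
        double[] smoothed = loess.smooth(xval, yval);
        for (int i = 0; i < xval.Length; i++)
            Console.WriteLine("x = {0}: y = {1} -> {2:F3}", xval[i], yval[i], smoothed[i]);
    }
}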