Changing activation function from sigmoid to tanh? - C#

I'm trying to change my neural net from using the sigmoid activation for the hidden and output layers to the tanh function.
I'm confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation?
This is the output calculation:
public void calcOutput()
{
if (!isBias)
{
float sum = 0;
float bias = 0;
//System.out.println("Looking through " + connections.size() + " connections");
for (int i = 0; i < connections.Count; i++)
{
Connection c = (Connection) connections[i];
Node from = c.getFrom();
Node to = c.getTo();
// Is this connection moving forward to us
// Ignore connections that we send our output to
if (to == this)
{
// This isn't really necessary
// But I am treating the bias individually in case I need to at some point
if (from.isBias) bias = from.getOutput()*c.getWeight();
else sum += from.getOutput()*c.getWeight();
}
}
// Output is the result of the tanh function
output = Tanh(bias+sum);
}
}
It worked great for how I trained it before, but now I want to train it to give 1 or -1 as output.
When I change
output = Sigmoid(bias+sum);
to
output = Tanh(bias+sum);
the results are all messed up...
Sigmoid:
public static float Sigmoid(float x)
{
return 1.0f / (1.0f + (float) Mathf.Exp(-x));
}
Tanh:
public float Tanh(float x)
{
//return (float)(Mathf.Exp(x) - Mathf.Exp(-x)) / (Mathf.Exp(x) + Mathf.Exp(-x));
//return (float)(1.7159f * System.Math.Tanh(2.0 / 3.0 * x)); // 2/3 would be integer division (0) in C#
return (float)System.Math.Tanh(x);
}
As you can see, I tried different formulas I found for tanh, but none of the outputs make sense: I get -1 where I ask for 0, or 0.76159 where I ask for 1, or it keeps flipping between a positive and a negative number when asking for -1, along with other mismatches...
-EDIT- Updated with the currently working code (the calcOutput above has also been changed to what I use now):
public float[] train(float[] inputs, float[] answer)
{
float[] result = feedForward(inputs);
deltaOutput = new float[result.Length];
for(int ii=0; ii<result.Length; ii++)
{
deltaOutput[ii] = 0.66666667f * (1.7159f - (result[ii]*result[ii])) * (answer[ii]-result[ii]);
}
// BACKPROPAGATION
for(int ii=0; ii<output.Length; ii++)
{
ArrayList connections = output[ii].getConnections();
for (int i = 0; i < connections.Count; i++)
{
Connection c = (Connection) connections[i];
Node node = c.getFrom();
float o = node.getOutput();
float deltaWeight = o*deltaOutput[ii];
c.adjustWeight(LEARNING_CONSTANT*deltaWeight);
}
}
// ADJUST HIDDEN WEIGHTS
for (int i = 0; i < hidden.Length; i++)
{
ArrayList connections = hidden[i].getConnections();
//Debug.Log(connections.Count);
float sum = 0;
// Sum output delta * hidden layer connections (just one output)
for (int j = 0; j < connections.Count; j++)
{
Connection c = (Connection) connections[j];
// Is this a connection from hidden layer to next layer (output)?
if (c.getFrom() == hidden[i])
{
for(int k=0; k<deltaOutput.Length; k++)
sum += c.getWeight()*deltaOutput[k];
}
}
// Then adjust the weights coming in based:
// Above sum * derivative of sigmoid output function for hidden neurons
for (int j = 0; j < connections.Count; j++)
{
Connection c = (Connection) connections[j];
// Is this a connection from previous layer (input) to hidden layer?
if (c.getTo() == hidden[i])
{
float o = hidden[i].getOutput();
float deltaHidden = o * (1 - o); // Derivative of sigmoid(x)
deltaHidden *= sum;
Node node = c.getFrom();
float deltaWeight = node.getOutput()*deltaHidden;
c.adjustWeight(LEARNING_CONSTANT*deltaWeight);
}
}
}
return result;
}

"I'm confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation?"
You should be using the derivative of the sigmoid function somewhere in your backpropagation code. You will also need to replace that with the derivative of the tanh function, which is 1 - (tanh(x))^2.
Your code looks like C#. I get this:
Console.WriteLine(Math.Tanh(0)); // prints 0
Console.WriteLine(Math.Tanh(-1)); // prints -0.761594155955765
Console.WriteLine(Math.Tanh(1)); // prints 0.761594155955765
Console.WriteLine(Math.Tanh(0.234)); // prints 0.229820548214317
Console.WriteLine(Math.Tanh(-4)); // prints -0.999329299739067
This is in line with the shape of the tanh curve, which passes through 0 at x = 0 and saturates towards -1 and 1.
I think you're reading the results wrong: you get the correct answer for 1. Are you sure you get -1 for tanh(0)?
If you're sure there's a problem, please post more code.
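The derivative swap mentioned at the start of this answer could look roughly like the sketch below. It assumes plain tanh(x) as the activation (not the scaled 1.7159 * tanh(2x/3) variant) and a squared-error loss; the class and method names are made up for illustration and are not part of the question's code.

```csharp
using System;

public static class TanhBackprop
{
    // tanh'(x) = 1 - tanh(x)^2, written in terms of the neuron's output o = tanh(x).
    public static float TanhDerivativeFromOutput(float o)
    {
        return 1f - o * o;
    }

    // Sigmoid counterpart, for comparison: s'(x) = s(x) * (1 - s(x)).
    public static float SigmoidDerivativeFromOutput(float o)
    {
        return o * (1f - o);
    }

    // Delta of an output neuron for squared error: (target - output) * f'(net).
    public static float OutputDelta(float target, float output)
    {
        return (target - output) * TanhDerivativeFromOutput(output);
    }

    // Delta of a hidden neuron: f'(net) times the weighted sum of downstream deltas.
    public static float HiddenDelta(float output, float weightedDeltaSum)
    {
        return TanhDerivativeFromOutput(output) * weightedDeltaSum;
    }
}
```

Every place in the training code that multiplies by o * (1 - o) is using the sigmoid derivative, so both the output-layer and the hidden-layer deltas would need the 1 - o * o form once the neurons use tanh.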

Related

Calculation Not Updating

I have been running this code with one variable successfully, but now I am trying to change it to a foreach loop that takes a range of values and shows each result in a message box. My issue is that ecll always retains the value computed for the first number passed, and the calculation is never updated for each subsequent number in the array.
Where did I err that is preventing this from being updated for each subsequent number in the array?
private void btnGetData_Click(object sender, EventArgs e)
{
int start = 2;
int end = 10;
int[] nums = Enumerable.Range(start, end - start).ToArray();
foreach (int n in nums)
{
float tp_x = 0, tp_y = 0;
SAP = new List<PointF>();
float fbw = m_pd.bl[m_pd.bl.Count - 1].m_Width;
float Location1_X = tp_x + fbw;
float Location1_Y = tp_y;
SAP.Add(new PointF(Location1_X, Location1_Y));
float iBH = gbh(m_pd.bl.Count - 1);
float lbw = m_pd.bl[0].m_Width;
float Location2_X = tp_x + lbw;
float Location2_Y = tp_y + (iBH) + 1.5f;
PointF rip = new PointF();
if (!Getrip(ftp, rhep, ref rip))
{
SAP = null;
return;
}
for (int iRowIndex = saii; iRowIndex < m_pd.pp.Count; iRowIndex++)
{
float Xvalue = m_pd.pp[iRowIndex].X;
float Yvalue = m_pd.pp[iRowIndex].Y;
SAP.Add(new PointF(Xvalue, Yvalue));
if (Yvalue == LeftIntersectionPoint.Y)
{
pp.X = Xvalue;
pp.Y = Yvalue;
continue;
}
if (Xvalue >= rip.X)
{
Xvalue = rip.X;
SAP[SAP.Count - 1] = new PointF(rip.X, rip.Y);
}
if (Xvalue == rip.X)
{
break;
}
pp.X = Xvalue;
pp.Y = Yvalue;
}
double ecll = Getll(Location1_X, Location1_Y, rip.X, rip.Y);
MessageBox.Show(Convert.ToString(ecll));
txtLength.Text = ll.ToString("0.00");
}
}
I feel like this is more of a comment based on what's going on here, but I need the code formatting to explain this properly.
Let's simplify away from your points, widths, etc. I think we can all agree that n is never used within your function, so let's look at a similar example.
Say I have a function I wrote that adds 1 to 1:
var newNum = 1 + 1;
It does what is expected, sets newNum to 2, but let's say I wanted to enhance it so that it adds 1 to the numbers in nums (from your original function):
int start = 2;
int end = 10;
int[] nums = Enumerable.Range(start, end - start).ToArray();
but if I try to reuse my function outright:
foreach (int n in nums)
{
var newNum = 1 + 1;
}
On every single pass, newNum is always going to be set to 2 because I'm not using the loop variable.
What I should do is write this:
foreach (int n in nums)
{
var newNum = 1 + n;
}
So, since nums holds 2 through 9 (Enumerable.Range(2, 8) yields eight values), I should see newNum set to 3 through 10 over the iterations.
Each iteration of the foreach loop assigns a new value to n.
However, in the loop body that value (n) is not used anywhere in the calculations (unless I am missing something), so of course the result of these calculations will always be the same.
As soon as you include n in some of the calculations, the result will change...

Using FFTW in C# to compute HilbertTransform

I want to implement the Hilbert transform in C#. From this article I saw that the fastest open-source FFT implementation seems to be FFTW, so I downloaded that example and used it to learn how to use the FFTW wrapper for C#.
I have a current signal of 200,000 points which I'm using for testing. Getting the Hilbert transform through the FFT is relatively simple:
Compute the FFT.
Multiply all positive frequencies by 2, except for the DC and Nyquist components (zero-based indices 0 and n/2, if the sample size is even).
Multiply all the negative frequencies by 0 (zero-based indices n/2 + 1 to n - 1).
Compute the inverse FFT.
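The multiplier from the two middle steps can be sketched on its own, independent of any FFT library. This is only an illustration under the assumption of a full-length spectrum of n complex bins stored as interleaved real/imaginary doubles; HilbertSketch, BuildMultiplier and Apply are hypothetical names, not part of FFTW or its C# wrapper.

```csharp
using System;

public static class HilbertSketch
{
    // Per-bin multiplier for a full-length (n-bin) complex spectrum, zero-based.
    public static double[] BuildMultiplier(int n)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
        double[] h = new double[n];            // negative-frequency bins stay 0
        h[0] = 1.0;                            // DC is kept unscaled
        int lastPositive = (n % 2 == 0) ? n / 2 - 1 : (n - 1) / 2;
        for (int i = 1; i <= lastPositive; i++) h[i] = 2.0;
        if (n % 2 == 0) h[n / 2] = 1.0;        // Nyquist bin exists for even n only
        return h;
    }

    // Scales an interleaved [re0, im0, re1, im1, ...] spectrum in place.
    public static void Apply(double[] interleavedSpectrum, double[] h)
    {
        if (interleavedSpectrum.Length != 2 * h.Length)
            throw new ArgumentException("spectrum must hold one (re, im) pair per multiplier");
        for (int i = 0; i < h.Length; i++)
        {
            interleavedSpectrum[2 * i] *= h[i];
            interleavedSpectrum[2 * i + 1] *= h[i];
        }
    }
}
```

For n = 8 the multiplier is 1, 2, 2, 2, 1, 0, 0, 0 and for n = 7 it is 1, 2, 2, 2, 0, 0, 0. The scaled spectrum then goes through a complex inverse FFT, with the result divided by n when the library leaves the transform unnormalised (as FFTW does).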
So far, I've done all of it. The only problem is the inverse FFT: I'm not able to get the same results with FFTW as with the ifft from Matlab.
My code
RealArray _input;
ComplexArray _fft;
void ComputeFFT()
{
_fft = new ComplexArray(_length / 2 + 1);
_input.Set(Data);
_plan = Plan.Create1(_length, _input, _fft, Options.Estimate);
_plan.Execute();
}
At this point I have an FFT with only the non-negative frequencies, so I don't need to multiply the negative frequencies by zero: they don't even exist. With the following code, I can get my original signal back:
double[] ComputeIFFT(ComplexArray input)
{
double[] temp = new double[_length];
RealArray output = new RealArray(_length);
_plan = Plan.Create1(_length, input, output, Options.Estimate);
_plan.Execute();
temp = output.ToArray();
for (int i = 0; i < _length; ++i)
{
temp[i] /= _length;
}
return temp;
}
The problem comes when I try to get a complex inverse from the signal.
void ComputeHilbert()
{
double[] fft = FFT.ToArray();
double[] h = new double[_length / 2 + 1];
double[] temp = new double[_length * 2];
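bool fftLengthIsOdd = (_length & 1) == 1; // bitwise AND tests the lowest bit
h[0] = 1;
for (int i = 1; i < _length / 2; i++) h[i] = 2;
// Even length: the Nyquist bin is left unscaled. Odd length: the last bin is an ordinary positive frequency.
if (_length > 1) h[_length / 2] = fftLengthIsOdd ? 2 : 1;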
for (int i = 0; i <= _length / 2; i++)
{
temp[2 * i] = fft[2*i] * h[i];
temp[2 * i + 1] = fft[2*i + 1] * h[i];
}
ComplexArray _tempHilbert = new ComplexArray(_length);
_tempHilbert.Set(temp);
_hilbert = ComputeIFFT(_tempHilbert);
_hilbertComputed = true;
}
It's important to note that when I apply the ToArray() method to a ComplexArray object, I get a double[] with twice the length of the original array, with the real and imaginary parts stored consecutively. That is, for a ComplexArray object containing "3 + 1i", I would get a double vector with [3, 1].
So, at this moment, what I have is something like:
[DC Frequency, 2*positive frequencies, Nyquist Frequency, zeros]
If I export this data to Matlab and compute the IFFT, I get the same result as its hilbert(signal).
However, if I try to apply the IFFT provided by fftw, I get weird values from Nyquist Frequency to the end (that is to say, the zeros mess with fftw).
This is the ifft I'm using to do this:
double[] ComputeIFFT(ComplexArray input)
{
double[] temp;
ComplexArray output = new ComplexArray(_length);
_plan = Plan.Create1(_length, input, output, Direction.Backward, Options.Estimate);
_plan.Execute();
temp = output.ToArray();
for (int i = 0; i < _length; ++i)
{
temp[i] /= _length;
}
return temp;
}
So, just to sum it up, my problem is the way I calculate the ifft. It doesn't seem to work well with zeros. Or maybe Matlab is able to recognise that it has to apply a different approach and I should do that manually, but I don't know how.
Thank you very much for your help in advance, much appreciated!
So the problem was the ComputeIFFT function. In the for loop, I was doing i < _length, but the length of temp array is 2 * _length, because it holds both real and imaginary values.
That's why I only got half of the values right.
The correct code for it is:
double[] ComputeIFFT(ComplexArray input)
{
double[] temp;
ComplexArray output = new ComplexArray(_length);
_plan = Plan.Create1(_length, input, output, Direction.Backward, Options.Estimate);
_plan.Execute();
temp = output.ToArray();
for (int i = 0; i < temp.Length; ++i)
{
temp[i] /= _length;
}
return temp;
}
I hope this will be useful for anyone trying to implement the Hilbert Transform through FFTW in C#.

C# Exp cannot get result

I have a question about using Math.Exp() in C#. This code is for kernel density estimation, which I don't know much about, so I looked up some wiki articles and papers.
I tried to write it in C#. The problem is that as "distance" gets larger, the result becomes 0. This confuses me, and I cannot find another way to get the right result.
disExp = Math.Pow(Math.E, -(distance / 2 * Math.Pow(h, 2)));
So, can anyone help me find a solution, or give me some ideas about kernel density estimation in C#? Sorry for my poor English.
Try this
public static double[,] KernelDensityEstimation(double[] data, double sigma, int nsteps)
{
// Probability density function (PDF) estimate for signal analysis.
// Works like ksdensity in MATLAB.
// Performs kernel density estimation (KDE) on one-dimensional data.
// http://en.wikipedia.org/wiki/Kernel_density_estimation
// Input: -data: input data, one-dimensional
// -sigma: bandwidth (sometimes called "h")
// -nsteps: number of abscissa points
// Output: -x: equispaced abscissa points
// -y: estimates of p(x)
// This function is part of the Kernel Methods Toolbox(KMBOX) for MATLAB.
// http://sourceforge.net/p/kmbox
// Converted to C# code by ksandric
double[,] result = new double[nsteps, 2];
double[] x = new double[nsteps], y = new double[nsteps];
double MAX = Double.MinValue, MIN = Double.MaxValue;
int N = data.Length; // number of data points
// Find MIN MAX values in data
for (int i = 0; i < N; i++)
{
if (MAX < data[i])
{
MAX = data[i];
}
if (MIN > data[i])
{
MIN = data[i];
}
}
// Like MATLAB linspace(MIN, MAX, nsteps);
x[0] = MIN;
for (int i = 1; i < nsteps; i++)
{
x[i] = MIN + i * (MAX - MIN) / (nsteps - 1); // last point lands on MAX, as with linspace
}
// kernel density estimation
double c = 1.0 / (Math.Sqrt(2 * Math.PI * sigma * sigma));
for (int i = 0; i < N; i++)
{
for (int j = 0; j < nsteps; j++)
{
y[j] = y[j] + 1.0 / N * c * Math.Exp(-(data[i] - x[j]) * (data[i] - x[j]) / (2 * sigma * sigma));
}
}
// compilation of the X,Y to result. Good for creating plot(x, y)
for (int i = 0; i < nsteps; i++)
{
result[i, 0] = x[i];
result[i, 1] = y[i];
}
return result;
}
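Regarding the one-line expression in the question: in C#, distance / 2 * Math.Pow(h, 2) is evaluated left to right as (distance / 2) * h^2, so the bandwidth multiplies the exponent instead of dividing it. A minimal sketch of the usual Gaussian weight is below, assuming that distance is the plain (unsquared) distance and h is a positive bandwidth; KernelSketch and GaussianWeight are illustrative names only.

```csharp
using System;

public static class KernelSketch
{
    // Gaussian kernel weight exp(-d^2 / (2 h^2)) for an unsquared distance d and bandwidth h > 0.
    public static double GaussianWeight(double distance, double h)
    {
        return Math.Exp(-(distance * distance) / (2.0 * h * h));
    }
}
```

Even with the parentheses in the right place, the weight still tends to 0 as the distance grows well beyond h. That is the expected behaviour of a Gaussian kernel: far-away samples contribute almost nothing to the density estimate.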

Neural Net backpropagation doesn't work properly

Lately I've implemented my own neural network (using different guides, but mainly from here) for future use (I intend to use it for an OCR program I'll develop). Currently I'm testing it, and I'm having this weird problem.
Whenever I give my network a training example, the algorithm changes the weights in a way that leads to the desired output. However, after a few training examples the weights get messed up, making the network work well for some outputs and wrong for others (even if I enter the input of the training examples exactly as it was).
I would appreciate if someone directed me towards the problem, should they see it.
Here are the methods for calculating the error of the neurons and adjusting the weights:
private static void UpdateOutputLayerDelta(NeuralNetwork Network, List<double> ExpectedOutputs)
{
for (int i = 0; i < Network.OutputLayer.Neurons.Count; i++)
{
double NeuronOutput = Network.OutputLayer.Neurons[i].Output;
Network.OutputLayer.Neurons[i].ErrorFactor = ExpectedOutputs[i]-NeuronOutput; //calculating the error factor
Network.OutputLayer.Neurons[i].Delta = NeuronOutput * (1 - NeuronOutput) * Network.OutputLayer.Neurons[i].ErrorFactor; //calculating the neuron's delta
}
}
//step 3 method
private static void UpdateNetworkDelta(NeuralNetwork Network)
{
NeuronLayer UpperLayer = Network.OutputLayer;
for (int i = Network.HiddenLayers.Count - 1; i >= 0; i--)
{
foreach (Neuron LowerLayerNeuron in Network.HiddenLayers[i].Neurons)
{
for (int j = 0; j < UpperLayer.Neurons.Count; j++)
{
Neuron UpperLayerNeuron = UpperLayer.Neurons[j];
LowerLayerNeuron.ErrorFactor += UpperLayerNeuron.Delta * UpperLayerNeuron.Weights[j + 1]/*+1 because of bias*/;
}
LowerLayerNeuron.Delta = LowerLayerNeuron.Output * (1 - LowerLayerNeuron.Output) * LowerLayerNeuron.ErrorFactor;
}
UpperLayer = Network.HiddenLayers[i];
}
}
//step 4 method
private static void AdjustWeights(NeuralNetwork Network, List<double> NetworkInputs)
{
//Adjusting the weights of the hidden layers
List<double> LowerLayerOutputs = new List<double>(NetworkInputs);
for (int i = 0; i < Network.HiddenLayers.Count; i++)
{
foreach (Neuron UpperLayerNeuron in Network.HiddenLayers[i].Neurons)
{
UpperLayerNeuron.Weights[0] += -LearningRate * UpperLayerNeuron.Delta;
for (int j = 1; j < UpperLayerNeuron.Weights.Count; j++)
UpperLayerNeuron.Weights[j] += -LearningRate * UpperLayerNeuron.Delta * LowerLayerOutputs[j - 1] /*-1 because of bias*/;
}
LowerLayerOutputs = Network.HiddenLayers[i].GetLayerOutputs();
}
//Adjusting the weight of the output layer
foreach (Neuron OutputNeuron in Network.OutputLayer.Neurons)
{
OutputNeuron.Weights[0] += -LearningRate * OutputNeuron.Delta * 1; //updating the bias - TODO: change this if the bias is also changed throughout the program
for (int j = 1; j < OutputNeuron.Weights.Count; j++)
OutputNeuron.Weights[j] += -LearningRate * OutputNeuron.Delta * LowerLayerOutputs[j - 1];
}
}
The learning rate is 0.5, and the neurons' activation function is a sigmoid function.
EDIT: I've noticed I never implemented the function to calculate the overall error, E = 0.5 * Sum((t - y)^2) for each training example. Could that be the problem? And if so, how should I fix it?
The learning rate 0.5 seems a bit too large. Usually values closer to 0.01 or 0.1 are used. Also, it usually helps in convergence if training patterns are presented in random order. More useful hints can be found here: Neural Network FAQ (comp.ai.neural archive).
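Beyond the learning rate, two details in the question's UpdateNetworkDelta may be worth checking. The weight read for the backward pass is Weights[j + 1], where j is the index of the upper neuron, whereas AdjustWeights treats Weights[k + 1] as the weight coming from lower neuron k. Also, ErrorFactor is accumulated with += and no reset is shown, so unless it is cleared elsewhere it would carry over between training examples. The sketch below shows the hidden-layer delta with that indexing on plain arrays; the array layout and names are assumptions for illustration, not the question's classes.

```csharp
using System;

public static class BackpropSketch
{
    // hiddenOutputs[k]: sigmoid output of hidden neuron k.
    // upperWeights[j][0]: bias weight of upper neuron j;
    // upperWeights[j][k + 1]: weight from hidden neuron k to upper neuron j.
    // upperDeltas[j]: already-computed delta of upper neuron j.
    public static double[] HiddenDeltas(double[] hiddenOutputs, double[][] upperWeights, double[] upperDeltas)
    {
        double[] deltas = new double[hiddenOutputs.Length];
        for (int k = 0; k < hiddenOutputs.Length; k++)
        {
            double errorFactor = 0.0; // starts from zero for every neuron and every example
            for (int j = 0; j < upperDeltas.Length; j++)
            {
                errorFactor += upperDeltas[j] * upperWeights[j][k + 1];
            }
            double o = hiddenOutputs[k];
            deltas[k] = o * (1.0 - o) * errorFactor; // sigmoid derivative times back-propagated error
        }
        return deltas;
    }
}
```

With the delta defined as (expected - output) * f'(net), as in the question, the update that reduces the squared error is weight += rate * delta * input, so the minus sign in AdjustWeights might also deserve a second look.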

Simple Cluster algorithm 2D. Detecting clumps of points

Does anyone know a simple algorithm to implement in C# to detect monster groups in a 2D game?
Example:
There are monsters within a range of 100 around the character. I want to detect which monsters are within range 2 of each other, and if there are at least 5 together, use the Area of Effect skill on that location. Otherwise, use the single-target skill.
A link to an implementation would be great, C# preferably. I just get lost reading the Wikipedia articles.
EDIT:
"your question is incomplete. what do you want to do exactly? do you want to find all groups? the biggest group? any group, if there are groups, none otherwise? please be more specific." -gilad hoch
I want to find all groups within 100 units of range around the main character. A group should be formed if there are at least 5 monsters all within range 2 of each other, or maybe within range 10 of the center monster.
So the result should be probably a new List of groups or a List of potential target locations.
A very simple clustering algorithm is the k-means algorithm. It works like this:
create random points (the centres)
assign every data point to the nearest centre, which forms the groups
relocate each centre to the middle of its group
repeat the last two steps several times.
You can find an implementation here, for example, or just google for "kmean c#":
http://kunuk.wordpress.com/2011/09/20/markerclusterer-with-c-example-and-html-canvas-part-3/
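The steps above can be sketched in a few lines of C#. This is a rough illustration, not a tuned implementation: it assumes points are given as { x, y } arrays, and it seeds the centres with the first k points instead of random ones so that the result is repeatable. KMeansSketch and Cluster are made-up names.

```csharp
using System;

public static class KMeansSketch
{
    // points[i] = { x, y }. Returns the cluster label (0..k-1) of every point.
    public static int[] Cluster(double[][] points, int k, int iterations)
    {
        if (k <= 0 || k > points.Length)
            throw new ArgumentException("k must be between 1 and the number of points");

        // Assumption: the first k points seed the centres (deterministic stand-in for random points).
        double[][] centres = new double[k][];
        for (int c = 0; c < k; c++) centres[c] = new[] { points[c][0], points[c][1] };

        int[] labels = new int[points.Length];
        for (int it = 0; it < iterations; it++)
        {
            // Assign every point to its nearest centre (squared distance is enough for comparing).
            for (int i = 0; i < points.Length; i++)
            {
                double best = double.MaxValue;
                for (int c = 0; c < k; c++)
                {
                    double dx = points[i][0] - centres[c][0];
                    double dy = points[i][1] - centres[c][1];
                    double d2 = dx * dx + dy * dy;
                    if (d2 < best) { best = d2; labels[i] = c; }
                }
            }

            // Move each centre to the mean of its group; an empty group keeps its old centre.
            double[] sumX = new double[k];
            double[] sumY = new double[k];
            int[] count = new int[k];
            for (int i = 0; i < points.Length; i++)
            {
                sumX[labels[i]] += points[i][0];
                sumY[labels[i]] += points[i][1];
                count[labels[i]]++;
            }
            for (int c = 0; c < k; c++)
            {
                if (count[c] > 0) centres[c] = new[] { sumX[c] / count[c], sumY[c] / count[c] };
            }
        }
        return labels;
    }
}
```

For the monster question, one could count how many points share each label and only use the area skill on groups with at least 5 members. Note that k-means needs k chosen up front and does not enforce the "within range 2" rule by itself, so the group radius would still have to be checked afterwards.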
I recently implemented the algorithm given in this paper by Efraty, which solves the problem by considering the intersections of circles of radius 2 centered at each given point. In simple terms, if you order the points in which two circles intersect in clockwise order, then you can do something similar to a line sweep to figure out the point in which a bomb (or AoE spell) needs to be launched to hit the most enemies. The implementation is this:
#include <stdio.h>
#include <cmath>
#include <algorithm>
using namespace std;
#define INF 1e16
#define eps 1e-8
#define MAXN 210
#define RADIUS 2
struct point {
double x,y;
point() {}
point(double xx, double yy) : x(xx), y(yy) {}
point operator*(double ot) {
return point(x*ot, y*ot);
}
point operator+(point ot) {
return point(x+ot.x, y+ot.y);
}
point operator-(point ot) {
return point(x-ot.x, y-ot.y);
}
point operator/(double ot) {
return point(x/ot, y/ot);
}
};
struct inter {
double x,y;
bool entry;
double comp;
bool operator< (inter ot) const {
return comp < ot.comp;
}
};
double dist(point a, point b) {
double dx = a.x-b.x;
double dy = a.y-b.y;
return sqrt(dx*dx+dy*dy);
}
int N,K;
point p[MAXN];
inter it[2*MAXN];
struct distst {
int id;
double dst; // kept as double so that sorting by distance is not affected by truncation
bool operator<(distst ot) const {return dst<ot.dst;}
};
distst dst[MAXN][MAXN];
point best_point;
double calc_depth(double r, int i) {
int left_inter = 0;
point left = p[i];
left.y -= r;
best_point = left;
int tam = 0;
for (int k = 0; k < N; k++) {
int j = dst[i][k].id;
if (i==j) continue;
double d = dist(p[i], p[j]);
if (d > 2*r + eps) break;
if (fabs(d)<eps) {
left_inter++;
continue;
}
bool is_left = dist(p[j], left) < r+eps;
if (is_left) {
left_inter++;
}
double a = (d*d) / (2*d);
point diff = p[j] - p[i];
point p2 = p[i] + (diff * a) / d;
double h = sqrt(r*r - a*a);
it[tam].x = p2.x + h*( p[j].y - p[i].y ) / d;
it[tam].y = p2.y - h*( p[j].x - p[i].x ) / d;
it[tam+1].x = p2.x - h*( p[j].y - p[i].y ) / d;
it[tam+1].y = p2.y + h*( p[j].x - p[i].x ) / d;
it[tam].x -= p[i].x;
it[tam].y -= p[i].y;
it[tam+1].x -= p[i].x;
it[tam+1].y -= p[i].y;
it[tam].comp = atan2(it[tam].x, it[tam].y);
it[tam+1].comp = atan2(it[tam+1].x, it[tam+1].y);
if (it[tam] < it[tam+1]) {
it[tam].entry = true;
it[tam+1].entry = false;
}
else {
it[tam].entry = false;
it[tam+1].entry = true;
}
if (is_left) {
swap(it[tam].entry, it[tam+1].entry);
}
tam+=2;
}
int curr,best;
curr = best = left_inter;
sort(it,it+tam);
for (int j = 0; j < tam; j++) {
if (it[j].entry) curr++;
else curr--;
if (curr > best) {
best = curr;
best_point = p[i] + point(it[j].x, it[j].y); // it[] holds offsets relative to p[i]
}
}
return best;
}
int main() {
scanf("%d", &N);
for (int i = 0; i < N; i++) {
scanf("%lf %lf", &p[i].x, &p[i].y);
}
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
dst[i][j].id = j;
dst[i][j].dst = dist(p[i],p[j]);
}
sort(dst[i],dst[i]+N);
}
int best = 0;
point target = p[0];
for (int i = 0; i < N; i++) {
int depth = calc_depth(RADIUS, i);
if (depth > best) {
best = depth;
target = best_point;
}
}
printf("A bomb at (%lf, %lf) will hit %d target(s).\n", target.x, target.y, best+1);
}
Sample usage:
2 (number of points)
0 0
3 0
A bomb at (1.500000, 1.322876) will hit 2 target(s).
