Lately I've implemented my own neural network (using different guides, but mainly from here) for future use (I intend to use it for an OCR program I'll develop). Currently I'm testing it, and I'm running into a strange problem.
Whenever I give my network a training example, the algorithm changes the weights in a way that leads to the desired output. However, after a few training examples, the weights get messed up, making the network work well for some outputs and wrong for others (even when I feed in the exact inputs from the training examples).
I would appreciate it if someone could point me towards the problem, should they see it.
Here are the methods for calculating the error of the neurons and for adjusting the weights:
private static void UpdateOutputLayerDelta(NeuralNetwork Network, List<double> ExpectedOutputs)
{
    for (int i = 0; i < Network.OutputLayer.Neurons.Count; i++)
    {
        double NeuronOutput = Network.OutputLayer.Neurons[i].Output;
        Network.OutputLayer.Neurons[i].ErrorFactor = ExpectedOutputs[i] - NeuronOutput; //calculating the error factor
        Network.OutputLayer.Neurons[i].Delta = NeuronOutput * (1 - NeuronOutput) * Network.OutputLayer.Neurons[i].ErrorFactor; //calculating the neuron's delta
    }
}

//step 3 method
private static void UpdateNetworkDelta(NeuralNetwork Network)
{
    NeuronLayer UpperLayer = Network.OutputLayer;
    for (int i = Network.HiddenLayers.Count - 1; i >= 0; i--)
    {
        foreach (Neuron LowerLayerNeuron in Network.HiddenLayers[i].Neurons)
        {
            for (int j = 0; j < UpperLayer.Neurons.Count; j++)
            {
                Neuron UpperLayerNeuron = UpperLayer.Neurons[j];
                LowerLayerNeuron.ErrorFactor += UpperLayerNeuron.Delta * UpperLayerNeuron.Weights[j + 1] /*+1 because of bias*/;
            }
            LowerLayerNeuron.Delta = LowerLayerNeuron.Output * (1 - LowerLayerNeuron.Output) * LowerLayerNeuron.ErrorFactor;
        }
        UpperLayer = Network.HiddenLayers[i];
    }
}

//step 4 method
private static void AdjustWeights(NeuralNetwork Network, List<double> NetworkInputs)
{
    //Adjusting the weights of the hidden layers
    List<double> LowerLayerOutputs = new List<double>(NetworkInputs);
    for (int i = 0; i < Network.HiddenLayers.Count; i++)
    {
        foreach (Neuron UpperLayerNeuron in Network.HiddenLayers[i].Neurons)
        {
            UpperLayerNeuron.Weights[0] += -LearningRate * UpperLayerNeuron.Delta;
            for (int j = 1; j < UpperLayerNeuron.Weights.Count; j++)
                UpperLayerNeuron.Weights[j] += -LearningRate * UpperLayerNeuron.Delta * LowerLayerOutputs[j - 1] /*-1 because of bias*/;
        }
        LowerLayerOutputs = Network.HiddenLayers[i].GetLayerOutputs();
    }

    //Adjusting the weight of the output layer
    foreach (Neuron OutputNeuron in Network.OutputLayer.Neurons)
    {
        OutputNeuron.Weights[0] += -LearningRate * OutputNeuron.Delta * 1; //updating the bias - TODO: change this if the bias is also changed throughout the program
        for (int j = 1; j < OutputNeuron.Weights.Count; j++)
            OutputNeuron.Weights[j] += -LearningRate * OutputNeuron.Delta * LowerLayerOutputs[j - 1];
    }
}
The learning rate is 0.5, and the neurons' activation function is a sigmoid function.
EDIT: I've noticed I never implemented the function to calculate the overall error, E = 0.5 * Sum((t - y)^2), for each training example. Could that be the problem? And if so, how should I fix it?
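If it helps, here is a minimal sketch of that error calculation over one example, using the same shapes as the code above (the method name is mine, not part of the original program):
//Hypothetical helper: E = 0.5 * Sum((t - y)^2) for one training example.
//Useful for monitoring convergence; computing it does not by itself change the training.
private static double CalculateExampleError(NeuralNetwork Network, List<double> ExpectedOutputs)
{
    double error = 0;
    for (int i = 0; i < Network.OutputLayer.Neurons.Count; i++)
    {
        double diff = ExpectedOutputs[i] - Network.OutputLayer.Neurons[i].Output;
        error += 0.5 * diff * diff;
    }
    return error;
}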
The learning rate of 0.5 seems a bit too large; values closer to 0.01 or 0.1 are usually used. It also usually helps convergence to present the training patterns in random order. More useful hints can be found here: Neural Network FAQ (comp.ai.neural-nets archive).
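As a rough sketch of what that looks like in a training loop (the TrainOnExample method, the network variable and the trainingExamples list are placeholders for your own code, not a real API):
//Hypothetical outer training loop: smaller learning rate plus shuffled presentation order.
Random rng = new Random();
double learningRate = 0.05; // much smaller than 0.5

for (int epoch = 0; epoch < 1000; epoch++)
{
    // Shuffle the training set before each pass (Fisher-Yates)
    for (int i = trainingExamples.Count - 1; i > 0; i--)
    {
        int j = rng.Next(i + 1);
        var tmp = trainingExamples[i];
        trainingExamples[i] = trainingExamples[j];
        trainingExamples[j] = tmp;
    }

    foreach (var example in trainingExamples)
        TrainOnExample(network, example.Inputs, example.ExpectedOutputs, learningRate);
}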
I used OxyPlot for drawing contours, but they seem a little rough in some places, so I need to smooth them. I was wondering whether it's possible to smooth the contours with OxyPlot, or whether I will need something else.
I know that you can smooth a LineSeries, so by looking inside the code for LineSeries I tried to put the smoothing code into ContourSeries, but I am not quite able to figure out how everything works. I also tried to find smoothing algorithms on the internet, but I wasn't able to find many, and those that I tried didn't work at all.
I have also looked for another library for drawing contours, but OxyPlot seems to be the best free option for WPF; if you have other suggestions that could give better results, I would appreciate them.
After some more searching I found an algorithm that did a pretty good job, although some lines that needed to be closed were not; the performance was really good compared to the other algorithms I tried.
Here's the algorithm:
private void SmoothContour(List<DataPoint> points, int severity = 1)
{
    for (int i = 1; i < points.Count; i++)
    {
        var start = (i - severity > 0 ? i - severity + 1 : 0);
        var end = (i + severity < points.Count ? i + severity + 1 : points.Count);

        double sumX = 0, sumY = 0;
        for (int j = start; j < end; j++)
        {
            sumX += points[j].X;
            sumY += points[j].Y;
        }

        DataPoint sum = new DataPoint(sumX, sumY);
        DataPoint avg = new DataPoint(sum.X / (end - start), sum.Y / (end - start));
        points[i] = avg;
    }
}
It's based on this method: https://stackoverflow.com/a/18830268/11788646
After that I put this method in the ContourSeries class and call it like this:
List<Contour> smoothedContours = new List<Contour>();
for (int i = 0; i < contours.Count; i++)
{
    List<DataPoint> smoothedPoints = new List<DataPoint>(contours[i].Points);
    for (int j = 0; j < SmoothLevel; j++)
    {
        SmoothContour(smoothedPoints);
    }

    Contour contour = new Contour(smoothedPoints, contours[i].ContourLevel);
    smoothedContours.Add(contour);
}
contours = smoothedContours;
It's located in the CalculateContours() method, just after the call to JoinContourSegments(). I also added a SmoothLevel property to control how much smoothing is applied: the higher it is, the smoother the lines are, but when it is set too high it doesn't do too well. So I keep it at 10 and it's good.
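For reference, usage then looks something like this (a sketch only: SmoothLevel is the custom property described above and only exists in this locally modified build of OxyPlot):
var series = new ContourSeries
{
    SmoothLevel = 10 // number of smoothing passes applied inside CalculateContours()
};
// ...set the series' data and coordinates as you normally would for a ContourSeries, then:
plotModel.Series.Add(series);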
This is a piece of my code, which calculates a derivative of the image (a convolution with a filter). It works correctly, but it takes a long time (because of the image height and width).
"Data" is a greyscale image bitmap.
"Filter" is a [3,3] matrix.
The maximum values of "Fh" and "Fw" are 3.
I am looking to speed up this code.
I also tried using Parallel.For, but it didn't work correctly (an index-out-of-bounds error).
private float[,] Differentiate(int[,] Data, int[,] Filter)
{
    int i, j, k, l, Fh, Fw;

    Fw = Filter.GetLength(0);
    Fh = Filter.GetLength(1);
    float sum = 0;

    float[,] Output = new float[Width, Height];

    for (i = Fw / 2; i <= (Width - Fw / 2) - 1; i++)
    {
        for (j = Fh / 2; j <= (Height - Fh / 2) - 1; j++)
        {
            sum = 0;
            for (k = -Fw / 2; k <= Fw / 2; k++)
            {
                for (l = -Fh / 2; l <= Fh / 2; l++)
                {
                    sum = sum + Data[i + k, j + l] * Filter[Fw / 2 + k, Fh / 2 + l];
                }
            }
            Output[i, j] = sum;
        }
    }
    return Output;
}
For parallel execution you need to drop the C-style variable declarations at the beginning of the method and declare the variables in the scope where they are actually used, so they are not shared between threads. Making it parallel should provide some performance benefit, but turning every loop into a Parallel.For is not a good idea, as there is a limit to how many threads can actually run in parallel. I would parallelize the top-level loop only:
private static float[,] Differentiate(int[,] Data, int[,] Filter)
{
    var Fw = Filter.GetLength(0);
    var Fh = Filter.GetLength(1);
    float[,] Output = new float[Width, Height];

    // Parallel.For's upper bound is exclusive, so Width - Fw / 2 matches the
    // original loop's condition i <= (Width - Fw / 2) - 1.
    Parallel.For(Fw / 2, Width - Fw / 2, i =>
    {
        for (var j = Fh / 2; j <= (Height - Fh / 2) - 1; j++)
        {
            float sum = 0; // declared inside the body, so nothing is shared between threads
            for (var k = -Fw / 2; k <= Fw / 2; k++)
            {
                for (var l = -Fh / 2; l <= Fh / 2; l++)
                {
                    sum += Data[i + k, j + l] * Filter[Fw / 2 + k, Fh / 2 + l];
                }
            }
            Output[i, j] = sum;
        }
    });
    return Output;
}
This is a perfect example of a task where using the GPU is better than using the CPU. A GPU is able to perform trillions of floating point operations per second (TFLOPS), while CPU performance is still measured in GFLOPS. The catch is that the GPU only pays off for SIMD-style work (Single Instruction, Multiple Data): it excels at data-parallel tasks, and if different data needs different instructions it has no advantage.
In your program the elements of your bitmap all go through the same calculations, just with slightly different data (SIMD!), so using the GPU is a great option. It won't be too complex either, because with your calculation the threads on the GPU would not need to exchange information, nor would they depend on results of previous iterations (each element would be processed by a different thread on the GPU).
You can use, for example, OpenCL to easily access the GPU. More on OpenCL and using the GPU here: https://www.codeproject.com/Articles/502829/GPGPU-image-processing-basics-using-OpenCL-NET
I am new to optimization problems and have gone through several math libraries like Alglib, DotNumerics and Microsoft Solver Foundation, but I have had no luck getting started; perhaps some experts can shed some light.
I want to find the optimal translation of 3D points on a reference contour onto a target contour.
Below is the constrained optimization problem. How do I solve it if I wish to use DotNumerics, for instance? I have no idea how to get started:
Pr : 3D points on the reference contour
Pt : 3D points on the target contour
t(Pr) : translation vector of point Pr <-- this is what I'm looking for
Below is the example provided by DotNumerics. How should I feed in all my 3D points and get a translation vector out? (A sketch of one way to set this up follows the example.)
public void OptimizationLBFGSBConstrained()
{
    //This example minimize the function
    //f(x0,x2,...,xn)= (x0-0)^2+(x1-1)^2+...(xn-n)^2
    //The minimum is at (0,1,2,3...,n) for the unconstrained case.
    //using DotNumerics.Optimization;

    L_BFGS_B LBFGSB = new L_BFGS_B();
    int numVariables = 5;
    OptBoundVariable[] variables = new OptBoundVariable[numVariables];

    //Constrained Minimization on the interval (-10,10), initial Guess=-2;
    for (int i = 0; i < numVariables; i++) variables[i] = new OptBoundVariable("x" + i.ToString(), -2, -10, 10);

    double[] minimum = LBFGSB.ComputeMin(ObjetiveFunction, Gradient, variables);

    ObjectDumper.Write("L-BFGS-B Method. Constrained Minimization on the interval (-10,10)");
    for (int i = 0; i < minimum.Length; i++) ObjectDumper.Write("x" + i.ToString() + " = " + minimum[i].ToString());

    //Constrained Minimization on the interval (-10,3), initial Guess=-2;
    for (int i = 0; i < numVariables; i++) variables[i].UpperBound = 3;

    minimum = LBFGSB.ComputeMin(ObjetiveFunction, Gradient, variables);

    ObjectDumper.Write("L-BFGS-B Method. Constrained Minimization on the interval (-10,3)");
    for (int i = 0; i < minimum.Length; i++) ObjectDumper.Write("x" + i.ToString() + " = " + minimum[i].ToString());

    //f(x0,x2,...,xn)= (x0-0)^2+(x1-1)^2+...(xn-n)^2
    //private double ObjetiveFunction(double[] x)
    //{
    //    int numVariables = 5;
    //    double f = 0;
    //    for (int i = 0; i < numVariables; i++) f += Math.Pow(x[i] - i, 2);
    //    return f;
    //}

    //private double[] Gradient(double[] x)
    //{
    //    int numVariables = 5;
    //    double[] grad = new double[x.Length];
    //    for (int i = 0; i < numVariables; i++) grad[i] = 2 * (x[i] - i);
    //    return grad;
    //}
}
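To connect this to the question: if the objective is taken to be the sum of squared distances between corresponding points Pr_i + t and Pt_i, then t has just three variables (tx, ty, tz) and the gradient is easy to write down. Below is a minimal sketch along those lines; the Pr/Pt arrays and all names are placeholders, and only the L_BFGS_B/OptBoundVariable/ComputeMin calls already shown in the DotNumerics example above are assumed.
using DotNumerics.Optimization;

public class TranslationFit
{
    // Placeholder corresponding point pairs (replace with the sampled contour points).
    static double[][] Pr = { new[] { 0.0, 0.0, 0.0 }, new[] { 1.0, 0.0, 0.0 } };
    static double[][] Pt = { new[] { 0.5, 0.2, 0.0 }, new[] { 1.5, 0.2, 0.0 } };

    // f(t) = sum_i |Pr_i + t - Pt_i|^2
    static double Objective(double[] t)
    {
        double f = 0;
        for (int i = 0; i < Pr.Length; i++)
            for (int d = 0; d < 3; d++)
            {
                double r = Pr[i][d] + t[d] - Pt[i][d];
                f += r * r;
            }
        return f;
    }

    // df/dt_d = 2 * sum_i (Pr_i[d] + t[d] - Pt_i[d])
    static double[] Gradient(double[] t)
    {
        double[] g = new double[3];
        for (int i = 0; i < Pr.Length; i++)
            for (int d = 0; d < 3; d++)
                g[d] += 2 * (Pr[i][d] + t[d] - Pt[i][d]);
        return g;
    }

    public static double[] Fit()
    {
        L_BFGS_B solver = new L_BFGS_B();
        // Three bound variables tx, ty, tz: initial guess 0, generous bounds.
        OptBoundVariable[] t = new OptBoundVariable[3];
        for (int d = 0; d < 3; d++) t[d] = new OptBoundVariable("t" + d, 0, -100, 100);
        return solver.ComputeMin(Objective, Gradient, t);
    }
}
Note that for this particular unconstrained least-squares objective the minimizer is simply the average of (Pt_i - Pr_i), so a solver really only earns its keep once constraints or a more complicated objective are added.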
Edit 1:
To make things more complicated, I have added the real problem I have been working on in Unity. I sampled 5 iso-contour lines from the reference model and did the same for the target model (different mesh and different vertex positions).
- On the first iso-contour of the reference model I sampled 8 normalized points (equally split by distance), and did the same for the target model. I therefore have 2 corresponding point sets (the target model's normalized point positions will always change, since every user has a different body size).
- Next, I repeated the steps above for the rest of the iso-contours.
- Once that is done, I will use the formula above to optimize the problem and get the optimal translation vector, so that I can translate all vertices from the reference model to the target model with one single translation vector (not sure this is possible).
Is this how optimization works?
Please ignore the red line in the yellow iso-contour.
I'm trying to change my neural net from using sigmoid activation for the hidden and output layers to the tanh function.
I'm confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation?
This is the output calculation:
public void calcOutput()
{
    if (!isBias)
    {
        float sum = 0;
        float bias = 0;
        //System.out.println("Looking through " + connections.size() + " connections");
        for (int i = 0; i < connections.Count; i++)
        {
            Connection c = (Connection) connections[i];
            Node from = c.getFrom();
            Node to = c.getTo();
            // Is this connection moving forward to us?
            // Ignore connections that we send our output to
            if (to == this)
            {
                // This isn't really necessary
                // But I am treating the bias individually in case I need to at some point
                if (from.isBias) bias = from.getOutput() * c.getWeight();
                else sum += from.getOutput() * c.getWeight();
            }
        }
        // Output is the result of the activation function (tanh here, previously sigmoid)
        output = Tanh(bias + sum);
    }
}
It works great for how I trained it before, but now I want to train it to give 1 or -1 as output.
When I change
output = Sigmoid(bias+sum);
to
output = Tanh(bias+sum);
the results are all messed up...
Sigmoid:
public static float Sigmoid(float x)
{
    return 1.0f / (1.0f + (float) Mathf.Exp(-x));
}
Tanh:
public float Tanh(float x)
{
    //return (float)(Mathf.Exp(x) - Mathf.Exp(-x)) / (Mathf.Exp(x) + Mathf.Exp(-x));
    //return (float)(1.7159f * System.Math.Tanh(2/3 * x));
    return (float)System.Math.Tanh(x);
}
As you can see, I tried different formulas I found for tanh, but none of the outputs make sense: I get -1 where I expect 0, or 0.76159 where I expect 1, or it keeps flipping between a positive and a negative number when asking for -1, and other mismatches...
-EDIT- Updated, currently working code (I changed the above calcOutput to what I use now):
public float[] train(float[] inputs, float[] answer)
{
    float[] result = feedForward(inputs);

    deltaOutput = new float[result.Length];
    for (int ii = 0; ii < result.Length; ii++)
    {
        deltaOutput[ii] = 0.66666667f * (1.7159f - (result[ii] * result[ii])) * (answer[ii] - result[ii]);
    }

    // BACKPROPAGATION
    for (int ii = 0; ii < output.Length; ii++)
    {
        ArrayList connections = output[ii].getConnections();
        for (int i = 0; i < connections.Count; i++)
        {
            Connection c = (Connection) connections[i];
            Node node = c.getFrom();
            float o = node.getOutput();
            float deltaWeight = o * deltaOutput[ii];
            c.adjustWeight(LEARNING_CONSTANT * deltaWeight);
        }
    }

    // ADJUST HIDDEN WEIGHTS
    for (int i = 0; i < hidden.Length; i++)
    {
        ArrayList connections = hidden[i].getConnections();
        //Debug.Log(connections.Count);
        float sum = 0;

        // Sum output delta * hidden layer connections (just one output)
        for (int j = 0; j < connections.Count; j++)
        {
            Connection c = (Connection) connections[j];
            // Is this a connection from hidden layer to next layer (output)?
            if (c.getFrom() == hidden[i])
            {
                for (int k = 0; k < deltaOutput.Length; k++)
                    sum += c.getWeight() * deltaOutput[k];
            }
        }

        // Then adjust the weights coming in based on:
        // the above sum * derivative of sigmoid output function for hidden neurons
        for (int j = 0; j < connections.Count; j++)
        {
            Connection c = (Connection) connections[j];
            // Is this a connection from previous layer (input) to hidden layer?
            if (c.getTo() == hidden[i])
            {
                float o = hidden[i].getOutput();
                float deltaHidden = o * (1 - o); // Derivative of sigmoid(x)
                deltaHidden *= sum;
                Node node = c.getFrom();
                float deltaWeight = node.getOutput() * deltaHidden;
                c.adjustWeight(LEARNING_CONSTANT * deltaWeight);
            }
        }
    }
    return result;
}
I'm confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation? This is the output calculation:
You should be using the derivative of the sigmoid function somewhere in your backpropagation code. You will also need to replace that with the derivative of the tanh function, which is 1 - (tanh(x))^2.
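Concretely, wherever the backpropagation code currently computes the sigmoid derivative from the neuron's output as o * (1 - o), the tanh version would use 1 - o * o instead. A small sketch with placeholder names (not your actual classes):
// Derivative of the activation, computed from the neuron's output o = f(x).
// For sigmoid: f'(x) = o * (1 - o).  For tanh: f'(x) = 1 - o * o, since o = tanh(x).
static float TanhDerivativeFromOutput(float o)
{
    return 1f - o * o;
}

// Hypothetical delta calculations using it:
// output layer:  delta = (target - o) * TanhDerivativeFromOutput(o);
// hidden layer:  delta = sumOfDownstreamDeltas * TanhDerivativeFromOutput(o);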
Your code looks like C#. I get this:
Console.WriteLine(Math.Tanh(0)); // prints 0
Console.WriteLine(Math.Tanh(-1)); // prints -0.761594155955765
Console.WriteLine(Math.Tanh(1)); // prints 0.761594155955765
Console.WriteLine(Math.Tanh(0.234)); // prints 0.229820548214317
Console.WriteLine(Math.Tanh(-4)); // prints -0.999329299739067
Which is in line with the tanh plot:
I think you're reading the results wrong: you get the correct answer for 1. Are you sure you get -1 for tanh(0)?
If you're sure there's a problem, please post more code.
Hi and thanks for looking!
Background
I have a computing task that requires either a lot of time, or parallel computing.
Specifically, I need to loop through a list of about 50 images, Base64 encode them, and then calculate the Levenshtein distance between each newly encoded item and the values in an XML file containing about 2000 Base64-encoded images, in order to find the string in the XML file with the smallest Levenshtein distance from the benchmark string.
A regular foreach loop works, but is too slow so I have chosen to use PLINQ to take advantage of my Core i7 multi-core processor:
Parallel.ForEach(candidates, item => findImage(total, currentWinner, benchmark, item));
The task starts brilliantly, racing along at high speed, but then I get an "Out of Memory" exception.
I am using C#, .NET 4, Forms App.
Question
How do I tweak my PLINQ code so that I don't run out of available memory?
Update/Sample Code
Here is the method that is called to initiate the parallel foreach:
private void btnGo_Click(object sender, EventArgs e)
{
    XDocument doc = XDocument.Load(@"C:\Foo.xml");
    var imagesNode = doc.Element("images").Elements("image"); //Each "image" node contains a Base64 encoded string.

    string benchmark = tbData.Text; //A Base64 encoded string.
    IEnumerable<XElement> candidates = imagesNode;

    currentWinner = 1000000; //Set the "current" low score to a million and bubble lower scores into its place iteratively.

    Parallel.ForEach(candidates, i =>
    {
        dist = Levenshtein(benchmark, i.Element("score").Value);
        if (dist < currentWinner)
        {
            currentWinner = dist;
            path = i.Element("path").Value;
        }
    });
}
...and here is the Levenshtein distance method:
public static int Levenshtein(string s, string t)
{
    int n = s.Length;
    int m = t.Length;
    var d = new int[n + 1, m + 1];

    // Step 1
    if (n == 0)
    {
        return m;
    }
    if (m == 0)
    {
        return n;
    }

    // Step 2
    for (int i = 0; i <= n; d[i, 0] = i++)
    {
    }
    for (int j = 0; j <= m; d[0, j] = j++)
    {
    }

    // Step 3
    for (int i = 1; i <= n; i++)
    {
        // Step 4
        for (int j = 1; j <= m; j++)
        {
            // Step 5
            int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;

            // Step 6
            d[i, j] = Math.Min(
                Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                d[i - 1, j - 1] + cost);
        }
    }

    // Step 7
    return d[n, m];
}
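As an aside, since memory pressure is the concern here: this implementation allocates an (n+1) x (m+1) matrix for every comparison, and the same distance can be computed with just two rows. A minimal sketch of that variant (my addition, not part of the original post):
// Levenshtein distance using two rolling rows instead of a full matrix,
// reducing the per-call allocation from O(n*m) to O(m).
public static int LevenshteinTwoRow(string s, string t)
{
    int n = s.Length;
    int m = t.Length;
    if (n == 0) return m;
    if (m == 0) return n;

    int[] prev = new int[m + 1];
    int[] curr = new int[m + 1];
    for (int j = 0; j <= m; j++) prev[j] = j;

    for (int i = 1; i <= n; i++)
    {
        curr[0] = i;
        for (int j = 1; j <= m; j++)
        {
            int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
            curr[j] = Math.Min(
                Math.Min(prev[j] + 1, curr[j - 1] + 1),
                prev[j - 1] + cost);
        }
        // Swap the rows for the next iteration.
        int[] tmp = prev;
        prev = curr;
        curr = tmp;
    }

    return prev[m]; // after the final swap, 'prev' holds the last computed row
}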
Thanks in advance!
Update
Ran into this error again today under different circumstances. I was working on a desktop app with high memory demands. Make sure that you have set the project to target the 64-bit architecture so it can access all available memory; my project was set to x86 by default, so I kept getting out-of-memory exceptions. Of course, this only works if you can count on 64-bit processors for your deployment.
End Update
After struggling a bit with this, it appears to have been operator error:
I was making calls to the UI thread from the parallel threads in order to update progress labels, but I was not doing it in a thread-safe way.
Additionally, I was running the app without the debugger, so there was an uncaught exception each time the code attempted to update the UI thread from a parallel thread, which is what caused the overflow.
Without being an expert on PLINQ, I am guessing that it handles all of the low-level allocation stuff for you as long as you don't make a goofy smelly code error like this one.
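For completeness, here is a minimal sketch of the thread-safe version of that progress update (assuming a WinForms Label called lblProgress, which is a placeholder name):
// Marshal the UI update onto the UI thread instead of touching the control
// directly from the parallel worker threads.
Parallel.ForEach(candidates, item =>
{
    int dist = Levenshtein(benchmark, item.Element("score").Value);

    // BeginInvoke queues the update on the UI thread without blocking the worker.
    lblProgress.BeginInvoke((Action)(() =>
    {
        lblProgress.Text = "Last distance: " + dist;
    }));
});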
Hope this helps someone else.