I've made a neural network in C# from scratch.
The problem is that when I try to train it for an XOR logic gate, it always converges to 0.5, instead of alternating between 0 and 1.
The neural network is made to work with a custumizable number of layers and nodes.
I've been over the theory countless of times, but I have no idea where the problem could be. I have excluded the bias in the backpropagation.
The backpropagation algorithm:
Neuralnetwork is constructed as follows:
nn.layerlist[x] = specific layer
nn.layerlist[x].network[y] = specific node
nn.layerlist[x].network[y].weight[z] = specific weight leading towards that node
public static NeuralNetwork Calculate(NeuralNetwork nn, List<double> Target, double LearningRate)
{
var OutputLayer = nn.LayerList.Count - 1;
var node = nn.LayerList[OutputLayer].Network[0];
List<List<double>> WeightList = new List<List<double>>();
List<double> Weights = new List<double>();
//Weightlist initialiseren
for (int i = 0; i < node.Weights.Count; i++)
Weights.Add(0);
// Door de laatste layer eerst gaan
for (int i = 0; i < Target.Count(); i++)
{
for (int j = 0; j < node.Weights.Count; j++)
{
var weight = nn.LayerList[OutputLayer].Network[i].Weights[j];
var outputcurrent = nn.LayerList[OutputLayer].Network[i].Value;
var OutputPrevious = nn.LayerList[OutputLayer - 1].Network[j].Value;
var Cost = (outputcurrent - Target[i]) * (outputcurrent * (1 - outputcurrent));
Weights[j] += Cost * weight;
weight = weight - (Cost * OutputPrevious * LearningRate);
nn.LayerList[OutputLayer].Network[i].Weights[j] = weight;
}
}
WeightList.Add(Weights);
int layercount = nn.LayerList.Count - 1;
//alle layers in reverse bijlangs gaan
for (int i = layercount; i > 0; i--)
{
var WeightsHidden = new List<double>();
for(int k = 0; k < 1000; k++) //TODO: Change this to something dynamic!
WeightsHidden.Add(0);
for (int j = 0; j < nn.LayerList[i].Network.Count; j++)
{
for (int k = 0; k < nn.LayerList[i].Network[j].Weights.Count; k++)
{
var weight = nn.LayerList[i].Network[j].Weights[k];
var outputcurrent = nn.LayerList[i].Network[j].Value;
var OutputPrevious = nn.LayerList[i - 1].Network[k].Value;
Console.WriteLine("Total Weights: {0}", WeightList[WeightList.Count - 1][k]);
var Cost = (outputcurrent * (1 - outputcurrent)) * WeightList[WeightList.Count - 1][k];
WeightsHidden[k] += Cost * weight;
weight = weight - (Cost * OutputPrevious * LearningRate);
nn.LayerList[i].Network[j].Weights[k] = weight;
}
}
WeightList.Add(WeightsHidden);
}
return nn;
}
}
The code is quite complex, but I have no idea how to format it in any other way.
I would like to neural network to put out the desired result like an XOR gate.
Related
I'm trying to implement FFT, using butterfly notation with decimation in spatial domain. I have two separated functions, one to calculate the FFT on 1D array, and the other that applies the first function over columns of the image, puts these columns in a temporary array, and then computes the FFT of rows in the temporary array.
Here is the code
public static Complex[] FFT(Complex[] input)
{
// Get the length of the input array
int n = input.Length;
Complex I = new Complex(0, 1);
// Check if the input has a length of 1
if (n == 1)
{
return input;
}
// Split the input into even and odd elements
Complex[] even = new Complex[n / 2];
Complex[] odd = new Complex[n / 2];
for (int i = 0; i < n; i++)
{
if (i % 2 == 0)
{
even[i / 2] = input[i];
}
else
{
odd[(i - 1) / 2] = input[i];
}
}
// Compute the FFT of the even and odd elements
Complex[] evenFFT = FFT(even);
Complex[] oddFFT = FFT(odd);
// Combine the FFT of the even and odd elements using the butterfly notation
Complex[] output = new Complex[n];
for (int i = 0; i < n / 2; i++)
{
Complex w = Complex.Exp(-2 * I * Math.PI * i / n);
output[i] = evenFFT[i] + w * oddFFT[i];
output[i + n / 2] = evenFFT[i] - w * oddFFT[i];
}
return output;
}
public static Complex[,] FFT2D(Complex[,] input)
{
int N = input.GetLength(0);
int M = input.GetLength(1);
Complex[,] output = new Complex[128, 128];
Complex[,] columnsFFT = new Complex[128, 128];
//Perform FFT over the columns of the input
for(int i = 0; i < M; i++)
{
//Put all the values from i'th column in the tempColumn variable
var tempColumn = new Complex[128];
for(int j = 0; j < N; j++)
{
tempColumn[j] = input[j, i];
}
//Calculate the FFT of i'th column
tempColumn = FFT(tempColumn);
//Assign the column to columnsFFT, after calculating its FFT
for(int z = 0; z < 128; z++)
{
columnsFFT[z, i] = tempColumn[z];
}
}
//Perform FFT over the rows of the columnsFFT
for(int i = 0; i < N; i++)
{
//Put the values from i'th row in tempRow, so we can perform the FFT on its entirety
var tempRow = new Complex[128];
for(int j = 0; j < M; j++)
{
tempRow[j] = columnsFFT[i, j];
}
//Calculate the FFT on tempRow
tempRow = FFT(tempRow);
//Assign the tempRow to the output
for(int z = 0; z < M; z++)
{
if (i == 0)
Console.WriteLine(tempRow[z].Real);
output[i, z] = tempRow[z];
}
}
return output;
}
The problem that i have is that it simply doesn't work, or i think it doesn't, based on the image output i get.
Input image
Output
What is it that i'm doing wrong here?
It's possible to write neural network for different size of training data inputs and outputs
for example:
inputs are 1. (1,2,3,4) , 2. (2,3,1), 3. (1,2,3,4,5) and so on...
and the same for outputs 1. (0,0,1,1) 2. (1,1,1) 3. (0,0,1,1,1)
So far I have managed to write the one which only works with the same size of training data which mean that all my training data needs to have the same length.
So far I'm stuck with this
NeuralNetwork net;
int[] layers = new int[3]
{
3/*Always the same*/,
1/*Always the same*/,
3 /*Always the same*/
};
string[] activation = new string[2] { "leakyrelu", "leakyrelu" };
net = new NeuralNetwork(layers, activation);
What I need
NeuralNetwork net1;
int[] layers1 = new int[3]
{
input.Length /*Based on input's Length*/,
1/*Always the same*/,
output.Length /*Based on output's Length*/
};
string[] activation1 = new string[2] { "leakyrelu", "leakyrelu" };
net = new NeuralNetwork(layers, activation);
// BackPropagate
public void BackPropagate(float[] inputs, float[] expected)
{
float[] output = FeedForward(inputs);
cost = 0;
for (int i = 0; i < output.Length; i++) cost += (float)Math.Pow(output[i] - expected[i], 2);
cost = cost / 2;
float[][] gamma;
List<float[]> gammaList = new List<float[]>();
for (int i = 0; i < layers.Length; i++)
{
gammaList.Add(new float[layers[i]]);
}
gamma = gammaList.ToArray();
int layer = layers.Length - 2;
for (int i = 0; i < output.Length; i++)
gamma[layers.Length-1][i] = (output[i] - expected[i]) * activateDer(output[i],layer);
for (int i = 0; i < neurons[layers.Length - 1].Length; i++)
{
biases[layers.Length - 1][i] -= gamma[layers.Length - 1][i] * learningRate;
for (int j = 0; j < neurons[layers.Length - 2].Length; j++)
{
weights[layers.Length - 2][i][j] -= gamma[layers.Length - 1][i] * neurons[layers.Length-2][j] * learningRate;
}
}
for (int i = layers.Length - 2; i > 0; i--)
{
layer = i - 1;
for (int j = 0; j < neurons[i].Length; j++)
{
gamma[i][j] = 0;
for (int k = 0; k < gamma[i+1].Length; k++)
{
gamma[i][j] = gamma[i + 1][k] * weights[i][k][j];
}
gamma[i][j] *= activateDer(neurons[i][j],layer);
}
for (int j = 0; j < neurons[i].Length; j++)
{
biases[i][j] -= gamma[i][j] * learningRate;
for (int k = 0; k < neurons[i-1].Length; k++)
{
weights[i - 1][j][k] -= gamma[i][j] * neurons[i-1][k] * learningRate;
}
}
}
}
I am creating a library for Neural Networks; I have successfully implemented the Back Propagation algorithm, but I'm having trouble with Resilient Propagation.
I have been using the XOR scenario to test my implementation; the error seems to decrease after a few epochs, then it increases and eventually stops changing.
The network I am testing on has 3 layers - 2 input, 2 hidden and 1 output neuron/s. The synapse/dendrite weights are on the following layer's neurons, for example, the input-hidden weights would be stored on the hidden neurons and the hidden-output weights would be stored on the output neurons.
This is my gradient calculation (EDIT 4):
private void CalculateGradient()
{
for (int t = 0; t < TrainingInputs.Count; ++t) //loop through training data
{
Network.Run(TrainingInputs[t]);
for (int l = Network.Layers.Count - 1; l > 0; l--)
{
for (int i = 0; i < Network.Layers[l].Neurons.Count; i++)
{
Network.Layers[l].Neurons[i].Delta = l < Network.Layers.Count - 1
? CalculateNonLastGradient(l + 1, i, Network.Layers[l].Neurons[i].Value)
: CalculateLastGradient(TrainingOutputs[t][i], Network.Layers[l].Neurons[i].Value);
}
}
for (int l = Network.Layers.Count - 1; l > 0; l--)
{
for (int j = 0; j < Network.Layers[l].Neurons.Count; ++j)
{
double grad = Network.Layers[l].Neurons[j].Delta;
biasGrads[l][j] += grad;
for (int i = 0; i < Network.Layers[l - 1].Neurons.Count; ++i)
{
grad = Network.Layers[l].Neurons[j].Delta * Network.Layers[l - 1].Neurons[i].Value;
grads[l][j][i] += grad;
}
}
}
int o = 0;
Layer layer = Network.Layers[Network.Layers.Count - 1];
errors.Add(layer.Neurons.Sum(n => Math.Abs(TrainingOutputs[t][o++] - n.Value)));
}
Error = errors.Average();
}
private double CalculateLastGradient(double ideal, double nValue)
{
return Network.Activation.Derivitive(nValue) * (ideal - nValue);
}
private double CalculateNonLastGradient(int nextLayer, int j, double nValue)
{
double sum = 0.0;
for (int i = 0; i < Network.Layers[nextLayer].Neurons.Count; i++)
{
sum += Network.Layers[nextLayer].Neurons[i].Delta * Network.Layers[nextLayer].Neurons[i].Dendrites[j].Weight;
}
return Network.Activation.Derivitive(nValue) * sum;
}
My RProp implementation (EDIT 4):
public bool Algorithm()
{
//initialise matrices
if (!initializedMatrices)
{
InitializeMatrices();
}
ZeroOut();
//calculate gradients
CalculateGradient();
for (int l = 1; l < Network.Layers.Count; l++) //layers
{
for (int i = 0; i < Network.Layers[l - 1].Neurons.Count; ++i) //prev layer neurons
{
for (int j = 0; j < Network.Layers[l].Neurons.Count; ++j) //current layer neurons
{
double delta = prevDeltas[l][j][i];
int change = Math.Sign(prevGrads[l][j][i] * grads[l][j][i]);
if (change > 0)
{
delta = Math.Min(delta * etaPlus, deltaMax);
double deltaWeight = -Math.Sign(grads[l][j][i]) * delta;
Network.Layers[l].Neurons[j].Dendrites[i].Weight += deltaWeight;
}
else if (change < 0)
{
delta = Math.Max(delta * etaMinus, deltaMin);
Network.Layers[l].Neurons[j].Dendrites[i].Weight -= prevDeltas[l][j][i];
prevGrads[l][j][i] = 0;
}
else if (change == 0)
{
double deltaWeight = -Math.Sign(grads[l][j][i]) * delta;
Network.Layers[l].Neurons[j].Dendrites[i].Weight += deltaWeight;
}
prevGrads[l][j][i] = grads[l][j][i];
prevDeltas[l][j][i] = delta;
} //j
} //i
for (int i = 1; i < Network.Layers[l].Neurons.Count; ++i)
{
double delta = prevBiasDeltas[l][i];
int change = Math.Sign(prevBiasGrads[l][i] * biasGrads[l][i]);
if (change > 0)
{
delta = Math.Min(prevBiasDeltas[l][i] * etaPlus, deltaMax);
double biasDeltaWeight = -Math.Sign(biasGrads[l][i]) * delta;
Network.Layers[l].Neurons[i].Bias += biasDeltaWeight;
}
else if (change < 0)
{
delta = Math.Max(prevBiasDeltas[l][i] * etaMinus, deltaMin);
Network.Layers[l].Neurons[i].Bias -= prevBiasDeltas[l][i];
prevBiasGrads[l][i] = 0;
}
else if (change == 0)
{
double biasDeltaWeight = -Math.Sign(biasGrads[l][i]) * delta;
Network.Layers[l].Neurons[i].Bias += biasDeltaWeight;
}
prevBiasGrads[l][i] = biasGrads[l][i];
prevBiasDeltas[l][i] = delta;
}
}
return true;
}
Could anyone point me as to what could be going wrong?
EDIT:
The issue seems to be that the delta is not changing.
EDIT 2:
I have fixed the delta not changing by initializing the first previous deltas to 0.01 and Zero'ing the gradients before each call. The error increases very fast, then decreases very slowly with TANH; with Sigmoid it does the opposite.
EDIT 3:
The loop for the bias' was starting at 0, when it should have been 1. Fixing this fixed the original issue, but has brought up at new one - the error stops decreasing after a certain point.
EDIT 4: I realised that I had not accumulated the gradients for each neuron over the training set. The error now decreases, but very slowly.
I've run into a stall trying to put together some code to average out 10x10 subarrays of a 2D multidimensional array.
Given a multidimensional array
var myArray = new byte[100, 100];
How should I go about creating 100 subarrays of 100 bytes (10x10) each.
Here are some examples of the value indexes the subarrays from the multidimensional would contain.
[x1,y1,x2,y2]
Subarray1[0,0][9,9]
Subarray2[10,10][19,19]
Subarray3[20,20][29,29]
Given these subarrays, I would then need to average the subarray values to create a byte[10,10] from the original byte[100,100].
I realize this is not unbelievably difficult, but after spending 4 days debugging very low-level code and now getting stuck on this would appreciate some fresh eyes.
Use this as a reference. I used ints just for ease of use. Code is untested. but the idea is there.
var rowSize = 100;
var colSize = 100;
var arr = new int[rowSize, colSize];
var r = new Random();
for (int i = 0; i < rowSize; i++)
for (int j = 0; j < colSize; j++)
arr[i, j] = r.Next(20);
for (var subcol = 0; subcol < colSize / 10; subcol++)
{
for (var subrow = 0; subrow < colSize/10; subrow++)
{
var startX = subcol*10;
var startY = subrow*10;
var avg = 0;
for (var x=0; x<10; x++)
for (var y = 0; y < 10; y++)
avg += arr[startX + x, startY + y];
avg /= 10*10;
Console.WriteLine(avg);
}
}
It looks like you're new to SO. Next time try to post your attempt at the problem; it's better to fix your code.
The only challenge is figuring out the function, that given the subarray index we're trying to populate, would give you the correct row and column indexes in your original 100x100 array; the rest would just be a matter of copying the values:
// psuedocode
// given a subarrayIndex of 0 to 99, these will calculate the correct indices
rowIndexIn100x100Array = (subarrayIndex / 10) * 10 + subArrayRowIndexToPopulate;
colIndexIn100x100Array = (subarrayIndex % 10) * 10 + subArrayColIndexToPopulate;
I'll leave it as an exercise to you to deduce why the above functions correctly calculate the indices.
With the above, we can easily map the values:
var subArrays = new List<byte[,]>();
for (int subarrayIndex = 0; subarrayIndex < 100; subarrayIndex++)
{
var subarray = new byte[10, 10];
for (int i = 0; i < 10; i++)
for (int j = 0; j < 10; j++)
{
int rowIndexIn100x100Array = (subarrayIndex / 10) * 10 + i;
int colIndexIn100x100Array = (subarrayIndex % 10) * 10 + j;
subarray[i, j] = originalArray[rowIndexIn100x100Array, colIndexIn100x100Array];
}
subArrays.Add(subarray);
}
Once we have the 10x10 arrays, calculating the average would be trivial using LINQ:
var averages = new byte[10, 10];
for (int i = 0; i < 10; i++)
for (int j = 0; j < 10; j++)
{
averages[i, j] = (byte)subArrays[(i * 10) + j].Cast<byte>().Average(b => b);
}
Fiddle.
I come looking for general tips about the program I'm writing now.
The goal is:
Use neural network program to recognize 3 letters [D,O,M] (or display "nothing is recognized" if i input anything other than those 3).
Here's what I have so far:
A class for my single neuron
public class neuron
{
double[] weights;
public neuron()
{
weights = null;
}
public neuron(int size)
{
weights = new double[size + 1];
Random r = new Random();
for (int i = 0; i <= size; i++)
{
weights[i] = r.NextDouble() / 5 - 0.1;
}
}
public double output(double[] wej)
{
double s = 0.0;
for (int i = 0; i < weights.Length; i++) s += weights[i] * wej[i];
s = 1 / (1 + Math.Exp(s));
return s;
}
}
A class for a layer:
public class layer
{
neuron[] tab;
public layer()
{
tab = null;
}
public layer(int numNeurons, int numInputs)
{
tab = new neuron[numNeurons];
for (int i = 0; i < numNeurons; i++)
{
tab[i] = new neuron(numInputs);
}
}
public double[] compute(double[] wejscia)
{
double[] output = new double[tab.Length + 1];
output[0] = 1;
for (int i = 1; i <= tab.Length; i++)
{
output[i] = tab[i - 1].output(wejscia);
}
return output;
}
}
And finally a class for a network
public class network
{
layer[] layers = null;
public network(int numLayers, int numInputs, int[] npl)
{
layers = new layer[numLayers];
for (int i = 0; i < numLayers; i++)
{
layers[i] = new layer(npl[i], (i == 0) ? numInputs : (npl[i - 1]));
}
}
double[] compute(double[] inputs)
{
double[] output = layers[0].compute(inputs);
for (int i = 1; i < layers.Length; i++)
{
output = layers[i].compute(output);
}
return output;
}
}
Now for the algorythm I chose:
I have a picture box, size 200x200, where you can draw a letter (or read one from jpg file).
I then convert it to my first array(get the whole picture) and 2nd one(cut the non relevant background around it) like so:
Bitmap bmp2 = new Bitmap(this.pictureBox1.Image);
int[,] binaryfrom = new int[bmp2.Width, bmp2.Height];
int minrow=0, maxrow=0, mincol=0, maxcol=0;
for (int i = 0; i < bmp2.Height; i++)
{
for (int j = 0; j < bmp2.Width; j++)
{
if (bmp2.GetPixel(j, i).R == 0)
{
binaryfrom[i, j] = 1;
if (minrow == 0) minrow = i;
if (maxrow < i) maxrow = i;
if (mincol == 0) mincol = j;
else if (mincol > j) mincol = j;
if (maxcol < j) maxcol = j;
}
else
{
binaryfrom[i, j] = 0;
}
}
}
int[,] boundaries = new int[binaryfrom.GetLength(0)-minrow-(binaryfrom.GetLength(0)-(maxrow+1)),binaryfrom.GetLength(1)-mincol-(binaryfrom.GetLength(1)-(maxcol+1))];
for(int i = 0; i < boundaries.GetLength(0); i++)
{
for(int j = 0; j < boundaries.GetLength(1); j++)
{
boundaries[i, j] = binaryfrom[i + minrow, j + mincol];
}
}
And convert it to my final array of 12x8 like so (i know I could shorten this a fair bit, but wanted to have every step in different loop so I can see what went wrong easier[if anything did]):
int[,] finalnet = new int[12, 8];
int k = 1;
int l = 1;
for (int i = 0; i < finalnet.GetLength(0); i++)
{
for (int j = 0; j < finalnet.GetLength(1); j++)
{
finalnet[i, j] = 0;
}
}
while (k <= finalnet.GetLength(0))
{
while (l <= finalnet.GetLength(1))
{
for (int i = (int)(boundaries.GetLength(0) / finalnet.GetLength(0)) * (k - 1); i < (int)(boundaries.GetLength(0) / finalnet.GetLength(0)) * k; i++)
{
for (int j = (int)(boundaries.GetLength(1) / finalnet.GetLength(1)) * (l - 1); j < (int)(boundaries.GetLength(1) / finalnet.GetLength(1)) * l; j++)
{
if (boundaries[i, j] == 1) finalnet[k-1, l-1] = 1;
}
}
l++;
}
l = 1;
k++;
}
int a = boundaries.GetLength(0);
int b = finalnet.GetLength(1);
if((a%b) != 0){
k = 1;
while (k <= finalnet.GetLength(1))
{
for (int i = (int)(boundaries.GetLength(0) / finalnet.GetLength(0)) * finalnet.GetLength(0); i < boundaries.GetLength(0); i++)
{
for (int j = (int)(boundaries.GetLength(1) / finalnet.GetLength(1)) * (k - 1); j < (int)(boundaries.GetLength(1) / finalnet.GetLength(1)) * k; j++)
{
if (boundaries[i, j] == 1) finalnet[finalnet.GetLength(0) - 1, k - 1] = 1;
}
}
k++;
}
}
if (boundaries.GetLength(1) % finalnet.GetLength(1) != 0)
{
k = 1;
while (k <= finalnet.GetLength(0))
{
for (int i = (int)(boundaries.GetLength(0) / finalnet.GetLength(0)) * (k - 1); i < (int)(boundaries.GetLength(0) / finalnet.GetLength(0)) * k; i++)
{
for (int j = (int)(boundaries.GetLength(1) / finalnet.GetLength(1)) * finalnet.GetLength(1); j < boundaries.GetLength(1); j++)
{
if (boundaries[i, j] == 1) finalnet[k - 1, finalnet.GetLength(1) - 1] = 1;
}
}
k++;
}
for (int i = (int)(boundaries.GetLength(0) / finalnet.GetLength(0)) * finalnet.GetLength(0); i < boundaries.GetLength(0); i++)
{
for (int j = (int)(boundaries.GetLength(1) / finalnet.GetLength(1)) * finalnet.GetLength(1); j < boundaries.GetLength(1); j++)
{
if (boundaries[i, j] == 1) finalnet[finalnet.GetLength(0) - 1, finalnet.GetLength(1) - 1] = 1;
}
}
}
The result is a 12x8 (I can change it in the code to get it from some form controls) array of 0 and 1, where 1 form the rough shape of a letter you drawn.
Now my questions are:
Is this a correct algorythm?
Is my function
1/(1+Math.Exp(x))
good one to use here?
What should be the topology? 2 or 3 layers, and if 3, how many neurons in hidden layer? I have 96 inputs (every field of the finalnet array), so should I also take 96 neurons in the first layer? Should I have 3 neurons in the final layer or 4(to take into account the "not recognized" case), or is it not necessary?
Thank you for your help.
EDIT: Oh, and I forgot to add, I'm gonna train my network using Backpropagation algorythm.
You may need 4 layers at least to get accurate results using back propagation method. 1 input, 2 middle layers, and an output layer.
12 * 8 matrix is too small(and you may end up in data loss which will result in total failure) - try some thing 16 * 16. If you want to reduce the size then you have to peel out the outer layers of black pixels further.
Think about training the network with your reference characters.
Remember that you have to feed back the output back to the input layer again and iterate it multiple times.
A while back I created a neural net to recognize digits 0-9 (python, sorry), so based on my (short) experience, 3 layers are ok and 96/50/3 topology will probably do a good job. As for the output layer, it's your choice; you can either backpropagate all 0s when the input image is not a D, O or M or use the fourth output neuron to indicate that the letter was not recognized. I think that the first option would be the best one because it's simpler (shorter training time, less problems debugging the net...), you just need to apply a threshold under which you classify the image as 'not recognized'.
I also used the sigmoid as activation function, I didn't try others but it worked :)