Sine predictions from ML.NET not really working - C#

I am currently experimenting with ML.NET, but I am stuck on the very first project. I am trying to predict sine values:
Generating a list of X and Y values with a function for sine (y = sin(x))
Using that list for ML.NET to learn
Make Y-predictions for the next X-values
Append these predictions to the list
Result: I am always getting one single result for any following number.
Sine was chosen simply as an easily verifiable function.
This is the current code:
class Program
{
private const string FILEPATH = @"sinus.txt";
private const float XSTART = 0f;
private const float XEND = 20f;
private const float XSTEP = 0.1f;
private const float XEND_FORECAST = 30f;
static void Main(string[] args)
{
GenerateSinusList();
var pipeline = new LearningPipeline();
pipeline.Add(new TextLoader(FILEPATH).CreateFrom<Sinus>(separator: ';'));
pipeline.Add(new ColumnConcatenator("Features", "X"));
pipeline.Add(new FastTreeRegressor());
var model = pipeline.Train<Sinus, SinusForecast>();
PredictUpcomingValues(model);
Console.WriteLine("done");
Console.ReadLine();
}
static void PredictUpcomingValues(PredictionModel<Sinus, SinusForecast> model)
{
using (var sw = System.IO.File.AppendText(FILEPATH))
{
sw.WriteLine();
for (double i = XEND + XSTEP; i < XEND_FORECAST; i += XSTEP)
{
var prediction = model.Predict(new Sinus() { X = (float)i });
var t = string.Format("{0};{1}", i, prediction.ResultY);
sw.WriteLine(t.Replace(',', '.')); //Quick localization fix
}
sw.Close();
}
}
static void GenerateSinusList()
{
var sinus = GenerateSine(XSTART, XEND, XSTEP);
var text = string.Join(System.Environment.NewLine, sinus.Select(x => string.Format("{0:};{1}", x.Key, x.Value)));
System.IO.File.WriteAllText(FILEPATH, text.Replace(',', '.'));
}
static Dictionary<float, float> GenerateSine(float from, float to, float step)
{
Dictionary<float, float> result = new Dictionary<float, float>((int)((to - from) / step) + 1);
for (double i = from; i < to; i += step)
{
result[(float)i] = (float)Math.Sin(i);
}
return result;
}
public class Sinus
{
[Column("0")]
public float X;
[Column("1", name: "Label")]
public float Y;
}
public class SinusForecast
{
[ColumnName("Score")]
public float ResultY;
}
}
The result of this: each value > 20 returns 0.5429355. The list looks like this:
...
19.4;0.523066
19.5;0.6055401
19.6;0.6819639
19.7;0.7515736
19.8;0.8136739
19.9;0.8676443
20.1;0.5429355 << first predicted
20.2;0.5429355
20.3;0.5429355
20.4;0.5429355
20.5;0.5429355
20.6;0.5429355
...
Edit: I am using ML.NET 0.4.0

Decision trees are not very good at extrapolation (i.e. predicting on data that is outside the range of the training data). If you make predictions on the training data, the scores will not be constant and will actually be somewhat reasonable.
One approach to turn this into an interpolation problem is to map all the inputs to their corresponding values in one period of the sine function. If you add another features column that is mod(X, 2*Pi), you get great predictions on the test data as well.
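For illustration, a minimal sketch of that idea against the 0.x LearningPipeline API used above (the XMod column name, its ordinal, and the three-column file layout are my own assumptions, not something the answer prescribes):
public class Sinus
{
[Column("0")]
public float X;
[Column("1")]
public float XMod; // X mapped into one sine period: X mod 2*Pi
[Column("2", name: "Label")]
public float Y;
}
// when writing the training file, emit X;XMod;Y per line:
// float xMod = (float)(x % (2 * Math.PI));
// and concatenate both inputs into the feature vector:
pipeline.Add(new ColumnConcatenator("Features", "X", "XMod"));
// at prediction time, compute XMod for the new X as well:
var prediction = model.Predict(new Sinus() { X = (float)i, XMod = (float)(i % (2 * Math.PI)) });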

Related

Random number with Probabilities in C#

I have converted this Java program into a C# program.
using System;
using System.Collections.Generic;
namespace RandomNumberWith_Distribution__Test
{
public class DistributedRandomNumberGenerator
{
private Dictionary<Int32, Double> distribution;
private double distSum;
public DistributedRandomNumberGenerator()
{
distribution = new Dictionary<Int32, Double>();
}
public void addNumber(int val, double dist)
{
distribution.Add(val, dist);// are these two
distSum += dist; // lines correctly translated?
}
public int getDistributedRandomNumber()
{
double rand = new Random().NextDouble();//generate a double random number
double ratio = 1.0f / distSum;//why is ratio needed?
double tempDist = 0;
foreach (Int32 i in distribution.Keys)
{
tempDist += distribution[i];
if (rand / ratio <= tempDist)//what does "rand/ratio" signify? What does this comparison achieve?
{
return i;
}
}
return 0;
}
}
public class MainClass
{
public static void Main(String[] args)
{
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(2, 0.3d);
drng.addNumber(3, 0.5d);
//=================
// start simulation
int testCount = 1000000;
Dictionary<Int32, Double> test = new Dictionary<Int32, Double>();
for (int i = 0; i < testCount; i++)
{
int random = drng.getDistributedRandomNumber();
if (test.ContainsKey(random))
{
double prob = test[random]; // are these
prob = prob + 1.0 / testCount;// three lines
test[random] = prob; // correctly translated?
}
else
{
test.Add(random, 1.0 / testCount);// is this line correctly translated?
}
}
foreach (var item in test.Keys)
{
Console.WriteLine($"{item}, {test[item]}");
}
Console.ReadLine();
}
}
}
I have several questions:
Can you check if the marked-by-comment lines are correct or need explanation?
Why doesn't getDistributedRandomNumber() check whether the sum of the distribution is 1 before proceeding to further calculations?
The method
public void addNumber(int val, double dist)
is not correctly translated, since you are missing the following lines:
if (this.distribution.get(value) != null) {
distSum -= this.distribution.get(value);
}
Those lines should cover the case when you call the following (based on your example code):
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(1, 0.5d);
So when addNumber is called twice with the same first argument, the missing code checks whether that argument is already present in the dictionary and, if so, removes the "old" value before inserting the new one.
Your method should look like this:
public void addNumber(int val, double dist)
{
if (distribution.TryGetValue(val, out var oldDist)) //get the old "dist" value, based on the "val"
{
distribution.Remove(val); //remove the old entry
distSum -= oldDist; //subtract the old "dist" value from "distSum"
}
distribution.Add(val, dist); //add the "val" with the current "dist" value to the dictionary
distSum += dist; //add the current "dist" value to "distSum"
}
Now to your second method
public int getDistributedRandomNumber()
Instead of initializing a new instance of Random every time this method is called, you should initialize it only once, so change the line
double rand = new Random().NextDouble();
to this
double rand = _random.NextDouble();
and initialize the field _random outside of any method, at the class level, like this
public class DistributedRandomNumberGenerator
{
private Dictionary<Int32, Double> distribution;
private double distSum;
private Random _random = new Random();
... rest of your code
}
This will prevent new Random().NextDouble() from producing the same number over and over again if called in a loop.
You can read about this problem here: Random number generator only generating one random number
As a side note, private fields in C# are conventionally named with a prefixed underscore. You should consider renaming distribution to _distribution; the same applies to distSum.
Next:
double ratio = 1.0f / distSum;//why is ratio needed?
The ratio is needed because the method tries its best to do its job with the information you have provided. Imagine you only call this:
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
int random = drng.getDistributedRandomNumber();
With those lines you told the class you want the number 1 in 20% of the cases, but what about the other 80%?
And that's where the ratio variable comes into play: it calculates a comparable value based on the sum of probabilities you have given.
eg.
double ratio = 1.0f / distSum;
With the latest example, drng.addNumber(1, 0.2d), distSum will be 0.2, which translates to a probability of 20%.
double ratio = 1.0f / 0.2;
The ratio is 5.0: with a total probability of 20%, the ratio is 5 because 100% / 5 = 20%.
Now let's have a look at how the code reacts when the ratio is 5
double tempDist = 0;
foreach (Int32 i in distribution.Keys)
{
tempDist += distribution[i];
if (rand / ratio <= tempDist)
{
return i;
}
}
rand will at any given time be a value greater than or equal to 0.0 and less than 1.0 (that's how NextDouble works), so let's assume 0.254557522132321 as rand.
Now let's look at what happens step by step
double tempDist = 0; //initialize with 0
foreach (Int32 i in distribution.Keys) //step through the added probabilities
{
tempDist += distribution[i]; //get the probabilities and add it to a temporary probability sum
//as a reminder
//rand = 0.254557522132321
//ratio = 5
//rand / ratio = 0.0509115044264642
//tempDist = 0.2
// if will result in true
if (rand / ratio <= tempDist)
{
return i;
}
}
If we didn't apply the ratio, the if would be false, but that would be wrong: since we only have a single value inside our dictionary, the if statement should return true no matter what the rand value might be, and that's the purpose of rand / ratio.
It "fixes" the randomly generated number based on the sum of probabilities we added. rand / ratio only makes a difference if the probabilities you provided don't sum up to exactly 1 = 100%.
E.g. if your example were this
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(2, 0.3d);
drng.addNumber(3, 0.5d);
You can see that the provided probabilities sum to 1 (0.2 + 0.3 + 0.5); in this case the line
if (rand / ratio <= tempDist)
Would look like this
if (rand / 1 <= tempDist)
Dividing by 1 never changes the value (rand / 1 = rand), so the only use case for this division is when the probabilities you provided do not sum to a perfect 100%, whether more or less.
As a side note, I would suggest changing your code to this
//call the dictionary distributions (notice the plural)
//don't use .Keys
//var distribution will be a KeyValuePair
foreach (var distribution in distributions)
{
//access the .Value member of the KeyValuePair
tempDist += distribution.Value;
if (rand / ratio <= tempDist)
{
return distribution.Key; //the loop variable is now a KeyValuePair, so return its key
}
}
Your test routine seems to be correctly translated.
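Putting the pieces together, a minimal sketch of the corrected class might look like this (the PascalCased method names and the _-prefixed fields are my own choices; the behaviour is exactly what was discussed above):
public class DistributedRandomNumberGenerator
{
private readonly Dictionary<int, double> _distributions = new Dictionary<int, double>();
private double _distSum;
private readonly Random _random = new Random();
public void AddNumber(int val, double dist)
{
if (_distributions.TryGetValue(val, out var oldDist))
{
_distSum -= oldDist; // replace the old weight instead of stacking it
}
_distributions[val] = dist; // the indexer overwrites an existing entry
_distSum += dist;
}
public int GetDistributedRandomNumber()
{
double rand = _random.NextDouble(); // single Random instance, created once
double ratio = 1.0 / _distSum; // compensates for weights that don't sum to 1
double tempDist = 0;
foreach (var distribution in _distributions)
{
tempDist += distribution.Value;
if (rand / ratio <= tempDist)
{
return distribution.Key;
}
}
return 0;
}
}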

Check if an STL file may contain two models

An STL file may contain two 3D models. Is there any way I can detect if there are 2 or more models stored in one STL file?
In my current code, it can detect that there are 2 models in the example, but there are instances where it detects many models even though the file only has one.
The Triangle class structure has Vertices that contain 3 points (x, y, z).
Sample STL File:
EDIT: Using @Gebb's answer this is how I implemented it:
private int GetNumberOfModels(List<TopoVertex> vertices)
{
Vertex[][] triangles = new Vertex[vertices.Count() / 3][];
int vertIdx = 0;
for(int i = 0; i < vertices.Count() / 3; i++)
{
Vertex v1 = new Vertex(vertices[vertIdx].pos.x, vertices[vertIdx].pos.y, vertices[vertIdx].pos.z);
Vertex v2 = new Vertex(vertices[vertIdx + 1].pos.x, vertices[vertIdx + 1].pos.y, vertices[vertIdx + 1].pos.z);
Vertex v3 = new Vertex(vertices[vertIdx + 2].pos.x, vertices[vertIdx + 2].pos.y, vertices[vertIdx + 2].pos.z);
triangles[i] = new Vertex[] { v1, v2, v3 };
vertIdx += 3;
}
var uniqueVertices = new HashSet<Vertex>(triangles.SelectMany(t => t));
int vertexCount = uniqueVertices.Count;
// The DisjointUnionSets class works with integers, so we need a map from vertex
// to integer (its id).
Dictionary<Vertex, int> indexedVertices = uniqueVertices
.Zip(
Enumerable.Range(0, vertexCount),
(v, i) => new { v, i })
.ToDictionary(vi => vi.v, vi => vi.i);
int[][] indexedTriangles =
triangles
.Select(t => t.Select(v => indexedVertices[v]).ToArray())
.ToArray();
var du = new XYZ.view.wpf.DisjointUnionSets(vertexCount);
// Iterate over the "triangles" consisting of vertex ids.
foreach (int[] triangle in indexedTriangles)
{
int vertex0 = triangle[0];
// Mark 0-th vertexes connected component as connected to those of all other vertices.
foreach (int v in triangle.Skip(1))
{
du.Union(vertex0, v);
}
}
var connectedComponents =
new HashSet<int>(Enumerable.Range(0, vertexCount).Select(x => du.Find(x)));
return connectedComponents.Count;
}
In some cases it produces the correct output, but for the example image above it outputs 3 instead of 2. I am now trying to adapt the snippet @Gebb gave to work with float values, since I believe the floating-point comparisons are what matter here. Does anyone have a way to do that as well? Maybe I need another perspective.
You could do this by representing vertices and connections between them as a graph and finding the number of connected components of the graph with the help of the Disjoint-set data structure.
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using Vertex = System.ValueTuple<double,double,double>;
namespace UnionFindSample
{
internal class DisjointUnionSets
{
private readonly int _n;
private readonly int[] _rank;
private readonly int[] _parent;
public DisjointUnionSets(int n)
{
_rank = new int[n];
_parent = new int[n];
_n = n;
MakeSet();
}
// Creates n sets with single item in each
public void MakeSet()
{
for (var i = 0; i < _n; i++)
// Initially, all elements are in
// their own set.
_parent[i] = i;
}
// Finds the representative of the set
// that x is an element of.
public int Find(int x)
{
if (_parent[x] != x)
{
// if x is not the parent of itself, then x is not the representative of
// his set.
// We do the path compression by moving x’s node directly under the representative
// of this set.
_parent[x] = Find(_parent[x]);
}
return _parent[x];
}
// Unites the set that includes x and
// the set that includes y
public void Union(int x, int y)
{
// Find representatives of two sets.
int xRoot = Find(x), yRoot = Find(y);
// Elements are in the same set, no need to unite anything.
if (xRoot == yRoot)
{
return;
}
if (_rank[xRoot] < _rank[yRoot])
{
// Then move x under y so that depth of tree remains equal to _rank[yRoot].
_parent[xRoot] = yRoot;
}
else if (_rank[yRoot] < _rank[xRoot])
{
// Then move y under x so that depth of tree remains equal to _rank[xRoot].
_parent[yRoot] = xRoot;
}
else
{
// if ranks are the same
// then move y under x (doesn't matter which one goes where).
_parent[yRoot] = xRoot;
// And increment the result tree's
// rank by 1
_rank[xRoot] = _rank[xRoot] + 1;
}
}
}
internal class Program
{
private static void Main(string[] args)
{
string file = args[0];
Vertex[][] triangles = ParseStl(file);
var uniqueVertices = new HashSet<Vertex>(triangles.SelectMany(t => t));
int vertexCount = uniqueVertices.Count;
// The DisjointUnionSets class works with integers, so we need a map from vertex
// to integer (its id).
Dictionary<Vertex, int> indexedVertices = uniqueVertices
.Zip(
Enumerable.Range(0, vertexCount),
(v, i) => new {v, i})
.ToDictionary(vi => vi.v, vi => vi.i);
int[][] indexedTriangles =
triangles
.Select(t => t.Select(v => indexedVertices[v]).ToArray())
.ToArray();
var du = new DisjointUnionSets(vertexCount);
// Iterate over the "triangles" consisting of vertex ids.
foreach (int[] triangle in indexedTriangles)
{
int vertex0 = triangle[0];
// Mark 0-th vertexes connected component as connected to those of all other vertices.
foreach (int v in triangle.Skip(1))
{
du.Union(vertex0, v);
}
}
var connectedComponents =
new HashSet<int>(Enumerable.Range(0, vertexCount).Select(x => du.Find(x)));
int count = connectedComponents.Count;
Console.WriteLine($"Number of connected components: {count}.");
var groups = triangles.GroupBy(t => du.Find(indexedVertices[t[0]]));
foreach (IGrouping<int, Vertex[]> g in groups)
{
Console.WriteLine($"Group id={g.Key}:");
foreach (Vertex[] triangle in g)
{
string tr = string.Join(' ', triangle);
Console.WriteLine($"\t{tr}");
}
}
}
private static Regex _triangleStart = new Regex(@"^\s+outer loop");
private static Regex _triangleEnd = new Regex(@"^\s+endloop");
private static Regex _vertex = new Regex(@"^\s+vertex\s+(\S+)\s+(\S+)\s+(\S+)");
private static Vertex[][] ParseStl(string file)
{
double ParseCoordinate(GroupCollection gs, int i) =>
double.Parse(gs[i].Captures[0].Value, CultureInfo.InvariantCulture);
var triangles = new List<Vertex[]>();
bool isInsideTriangle = false;
List<Vertex> triangle = new List<Vertex>();
foreach (string line in File.ReadAllLines(file))
{
if (isInsideTriangle)
{
if (_triangleEnd.IsMatch(line))
{
isInsideTriangle = false;
triangles.Add(triangle.ToArray());
triangle = new List<Vertex>();
continue;
}
Match vMatch = _vertex.Match(line);
if (vMatch.Success)
{
double x1 = ParseCoordinate(vMatch.Groups, 1);
double x2 = ParseCoordinate(vMatch.Groups, 2);
double x3 = ParseCoordinate(vMatch.Groups, 3);
triangle.Add((x1, x2, x3));
}
}
else
{
if (_triangleStart.IsMatch(line))
{
isInsideTriangle = true;
}
}
}
return triangles.ToArray();
}
}
}
I'm also using the fact that System.ValueTuple implements Equals and GetHashCode in an appropriate way, so we can easily compare vertices (this is used implicitly by HashSet) and use them as keys in a dictionary.
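On the follow-up about float values: the tuple comparison above is exact, so two vertices that differ only by rounding noise land in different components. One possible tweak (my assumption, not part of the original answer) is to quantize each coordinate to a tolerance before building the Vertex:
// Sketch: snap each coordinate to a grid of size "tolerance" (1e-5 here, tune for your models)
// so nearly identical vertices produce the same tuple and therefore the same hash key.
static Vertex Quantize(double x, double y, double z, double tolerance = 1e-5)
{
return (Math.Round(x / tolerance) * tolerance,
Math.Round(y / tolerance) * tolerance,
Math.Round(z / tolerance) * tolerance);
}
Calling Quantize wherever a Vertex is constructed (in ParseStl here, or in GetNumberOfModels above) makes the disjoint-set treat shared vertices as a single node, which should collapse the spurious extra components.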

How to add conditional inside Linq Aggregate in C#?

I'm trying LINQ over an imperative style, but I can't convert this conditional inside Aggregate to LINQ.
Consider the following two examples.
Simple example:
public enum Color {
Red, // Red score = 10
Yellow, // Yellow score = 5
Green, // Green score = 2
}
//Populate our sample list
public List<Color> Colors = new List<Color> {Color.Red, Color.Green, Color.Green, Color.Yellow};
//I need help on this one
public float Score => Colors.Aggregate(0.0f, (total, next) =>
{
//How to properly use conditional inside Aggregate?
if (next == Color.Red) {
return total + 10.0f;
} else if (next == Color.Yellow) {
return total + 5.0f;
} else if (next == Color.Green) {
return total + 2.0f;
}
//edit: forgot the default
return total;
});
Log(Score); //19
Edit: I have tried moving the conditional to Select, but that just moves the problem, which is how to add a conditional inside LINQ Select?
public float Score => Colors.Select(x =>
{
// The problem still happening
if (x == Color.Red) {
return 10.0f;
} else if (x == Color.Yellow) {
return 5.0f;
} else if (x == Color.Green) {
return 2.0f;
}
return 0.0f;
})
.Aggregate(0.0f, (total, next) => total + next);
And here is the complex example; basically it's just a stat modifier for a game.
// This is a Game Status Modifier, for example: "Strength 30 + 10%"
public enum StatModType
{
Flat = 100, // Flat addition to Stat
PercentAdd = 200, // Percent addition to Stat
... // many other type of addition
}
private float _baseValue = 30.0f;
public List<StatModifier> StatModifiers = new List<StatModifier>
{...} //dummy data
public float Value => StatModifiers.Aggregate(_baseValue, (finalValue, mod) =>
{
//I need help on this one
if (mod.Type == StatModType.Flat)
return finalValue + mod.Value;
else if (mod.Type == StatModType.PercentAdd)
// When we encounter a "PercentAdd" modifier
return finalValue + finalValue * mod.Value;
else if (mod.Type == ...)
//and continues below every time I add more modifier types..
});
Log(Value); // Strength = 33;
Edit: I'll just post the imperative code in case someone needs it (credit: https://forum.unity.com/threads/tutorial-character-stats-aka-attributes-system.504095/); I also have a hard time reading this one:
private float CalculateFinalValue()
{
float finalValue = BaseValue;
float sumPercentAdd = 0; // This will hold the sum of our "PercentAdd" modifiers
for (int i = 0; i < statModifiers.Count; i++)
{
StatModifier mod = statModifiers[i];
if (mod.Type == StatModType.Flat)
{
finalValue += mod.Value;
}
else if (mod.Type == StatModType.PercentAdd) // When we encounter a "PercentAdd" modifier
{
sumPercentAdd += mod.Value; // Start adding together all modifiers of this type
// If we're at the end of the list OR the next modifer isn't of this type
if (i + 1 >= statModifiers.Count || statModifiers[i + 1].Type != StatModType.PercentAdd)
{
finalValue *= 1 + sumPercentAdd; // Multiply the sum with the "finalValue", like we do for "PercentMult" modifiers
sumPercentAdd = 0; // Reset the sum back to 0
}
}
else if (mod.Type == StatModType.PercentMult) // Percent renamed to PercentMult
{
finalValue *= 1 + mod.Value;
}
}
return (float)Math.Round(finalValue, 4);
}
How can I add a conditional inside an Aggregate / Reduce / Scan function?
I suggest extracting the model in both cases, i.e.
Simple Example:
private static Dictionary<Color, float> s_ColorScores = new Dictionary<Color, float>() {
{Color.Red, 10.0f},
{Color.Yellow, 5.0f},
{Color.Green, 2.0f},
};
...
float Score = Colors
.Sum(color => s_ColorScores[color]);
Complex Example:
private static Dictionary<StatModType, Func<float, float, float>> s_Modifications = new
Dictionary<StatModType, Func<float, float, float>> {
{StatModType.Flat, (prior, value) => prior + value},
{StatModType.PercentAdd, (prior, value) => prior + prior * value},
//TODO: add modification rules here
};
public float Value => StatModifiers
.Aggregate(_baseValue, (prior, mod) => s_Modifications[mod.Type](prior, mod.Value));
So you are going to have the game's model (s_ColorScores, s_Modifications, ...) with rules, settings, balances etc. (which you will probably want to tune; maybe a Color.Yellow score of 6.0f is a better choice) separated from the simple business logic.
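If you also want to keep the original behaviour of scoring unknown colors as 0 (the default return total branch), a small variation of the lookup could be (my addition, not part of the model above):
float Score = Colors
.Sum(color => s_ColorScores.TryGetValue(color, out var score) ? score : 0.0f);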
Assuming that the behaviors associated with the enum types are static and not dynamic, based on this MSDocs article another approach would be to use enumeration classes instead of enum types. To simplify this, you could use the SmartEnum package.
Using this lib and approach, your use cases turn into:
Simple Example:
public sealed class Color: SmartEnum<Color>
{
public static readonly Color Red = new Color (nameof(Red), 1, 10.0f);
public static readonly Color Yellow = new Color (nameof(Yellow), 2, 20.0f);
public static readonly Color Green = new Color (nameof(Green), 3, 30.0f);
private Color(string name, int value, float score)
: base(name, value)
{
this.Score = score;
}
public float Score {get;}
}
float TotalScore = Colors
.Sum(color => color.Score);
Complex Example:
public sealed class StatMod: SmartEnum<StatMod>
{
public static readonly StatMod FlatAdd = new StatMod(nameof(FlatAdd), 200, (prev, val)=>prev+val);
public static readonly StatMod PercentAdd = new StatMod(nameof(PercentAdd), 300, (prev,val)=>prev + prev * val);
private StatMod(string name, int value, Func<float, float, float> reduce) : base(name, value)
{
this.Reduce = reduce;
}
public Func<float, float, float> Reduce {get;}
}
public float Value => StatModifiers
.Aggregate(_baseValue, (prior, mod) => mod.Reduce(prior, mod.Value));

Alglib Data fitting with minlmoptimize does not minimize the results. Full C# included

I'm having trouble implementing the LM optimizer in the alglib library. I'm not sure why the parameters are hardly changing at all while I still receive an exit code of 4. I have been unable to determine what I am doing wrong from the alglib documentation. Below is the full source I am running:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Threading.Tasks;
namespace FBkineticsFitter
{
class Program
{
public static int Main(string[] args)
{
/*
* This code finds the parameters ka, kd, and Bmax from the minimization of the residuals using "V" mode of the Levenberg-Marquardt optimizer (alglib library).
* This optimizer is used because the equation is non-linear and this particular version of the optimizer does not require the ab initio calculation of partial
* derivatives, a Jacobian matrix, or other parameter-space definitions, so its implementation is simple.
*
* The equations being solved represent a model of a protein-protein interaction where protein in solution is interacting with immobilized protein on a sensor
* in a 1:1 stoichiometry. Mass transport limit is not taken into account. The details of this equation are described in:
* R.B.M. Schasfoort and Anna J. Tudos Handbook of Surface Plasmon Resonance, 2008, Chapter 5, ISBN: 978-0-85404-267-8
*
* Y=((ka*Cpro*Bmax)/(ka*Cpro+kd))*(1-exp(-1*X*(ka*Cpro+kd))) ; this equation describes the association phase
*
* Y=Req*exp(-1*X*kd) ; this equation describes the dissociation phase
*
* The data are fit globally such that Bmax and Req parameters are linked and kd parameters are linked during simultaneous optimization for the most robust fit
*
* Y= signal
* X= time
* ka= association constant
* kd= dissociation constant
* Bmax= maximum binding capacity at equilibrium
* Req=(Cpro/(Cpro+kobs))*Bmax :. in this case Req=Bmax because Cpro=0 during the dissociation step
* Cpro= concentration of protein in solution
*
* additional calculations:
* kobs=ka*Cpro
* kD=kd/ka
*/
GetRawDataXY(@"C:\Results.txt");
double epsg = .0000001;
double epsf = 0;
double epsx = 0;
int maxits = 0;
alglib.minlmstate state;
alglib.minlmreport rep;
alglib.minlmcreatev(2, GlobalVariables.param, 0.0001, out state);
alglib.minlmsetcond(state, epsg, epsf, epsx, maxits);
alglib.minlmoptimize(state, Calc_residuals, null, null);
alglib.minlmresults(state, out GlobalVariables.param, out rep);
System.Console.WriteLine("{0}", rep.terminationtype); ////1=relative function improvement is no more than EpsF. 2=relative step is no more than EpsX. 4=gradient norm is no more than EpsG. 5=MaxIts steps was taken. 7=stopping conditions are too stringent,further improvement is impossible, we return best X found so far. 8= terminated by user
System.Console.WriteLine("{0}", alglib.ap.format(GlobalVariables.param, 20));
System.Console.ReadLine();
return 0;
}
public static void Calc_residuals(double[] param, double[] fi, object obj)
{
/*calculate the difference of the model and the raw data at each X (I.E. residuals)
* the sum of the square of the residuals is returned to the optimized function to be minimized*/
fi[0] = 0;
fi[1] = 0;
for (int i = 0; i < GlobalVariables.rawXYdata[0].Count();i++ )
{
if (GlobalVariables.rawXYdata[1][i] <= GlobalVariables.breakpoint)
{
fi[0] += System.Math.Pow((kaEQN(GlobalVariables.rawXYdata[0][i]) - GlobalVariables.rawXYdata[1][i]), 2);
}
else
{
fi[1] += System.Math.Pow((kdEQN(GlobalVariables.rawXYdata[0][i]) - GlobalVariables.rawXYdata[1][i]), 2);
}
}
}
public static double kdEQN(double x)
{
/*Calculate kd Y value based on the incremented parameters*/
return GlobalVariables.param[2] * Math.Exp(-1 * x * GlobalVariables.param[1]);
}
public static double kaEQN(double x)
{
/*Calculate ka Y value based on the incremented parameters*/
return ((GlobalVariables.param[0] * GlobalVariables.Cpro * GlobalVariables.param[2]) / (GlobalVariables.param[0] * GlobalVariables.Cpro + GlobalVariables.param[1])) * (1 - Math.Exp(-1 * x * (GlobalVariables.param[0] * GlobalVariables.Cpro + GlobalVariables.param[1])));
}
public static void GetRawDataXY(string filename)
{
/*Read in Raw data From tab delim txt*/
string[] elements = { "x", "y" };
int count = 0;
GlobalVariables.rawXYdata[0] = new double[1798];
GlobalVariables.rawXYdata[1] = new double[1798];
using (StreamReader sr = new StreamReader(filename))
{
while (sr.Peek() >= 0)
{
elements = sr.ReadLine().Split('\t');
GlobalVariables.rawXYdata[0][count] = Convert.ToDouble(elements[0]);
GlobalVariables.rawXYdata[1][count] = Convert.ToDouble(elements[1]);
count++;
}
}
}
public class GlobalVariables
{
public static double[] param = new double[] { 1, .02, 0.13 }; ////ka,kd,Bmax these are initial guesses for the algorithm
public static double[][] rawXYdata = new double[2][];
public static double Cpro = 100E-9;
public static double kD = 0;
public static double breakpoint = 180;
}
}
}
According to Sergey Bochkanova, the issue is the following:
"You should use param[] array which is provided to you by optimizer. It creates its internal copy of your param, and updates this copy - not your param array.
From the optimizer point of view, it has function which never changes when it changes its internal copy of param. So, it terminates right after first iteration."
Here is the updated and working example code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Threading.Tasks;
namespace FBkineticsFitter
{
class Program
{
public static int Main(string[] args)
{
/*
* This code finds the parameters ka, kd, and Bmax from the minimization of the residuals using "V" mode of the Levenberg-Marquardt optimizer (alglib library).
* This optimizer is used because the equation is non-linear and this particular version of the optimizer does not require the ab initio calculation of partial
* derivatives, a Jacobian matrix, or other parameter-space definitions, so its implementation is simple.
*
* The equations being solved represent a model of a protein-protein interaction where protein in solution is interacting with immobilized protein on a sensor
* in a 1:1 stoichiometry. Mass transport limit is not taken into account. The details of this equation are described in:
* R.B.M. Schasfoort and Anna J. Tudos Handbook of Surface Plasmon Resonance, 2008, Chapter 5, ISBN: 978-0-85404-267-8
*
* Y=((Cpro*Rmax)/(Cpro+kd))*(1-exp(-1*X*(ka*Cpro+kd))) ; this equation describes the association phase
*
* Y=Req*exp(-1*X*kd)+NS ; this equation describes the dissociation phase
*
* According to ForteBio's Application Notes #14 the amplitudes of the data can be correctly accounted for by modifying the above equations as follows:
*
* Y=(Rmax*(1/(1+(kd/(ka*Cpro))))*(1-exp(((-1*Cpro)+kd)*X)) ; this equation describes the association phase
*
* Y=Y0*(exp(-1*kd*(X-X0))) ; this equation describes the dissociation phase
*
*
*
* The data are fit simultaneously such that all fitting parameters are linked during optimization for the most robust fit
*
* Y= signal
* X= time
* ka= association constant [fitting parameter 0]
* kd= dissociation constant [fitting parameter 1]
* Rmax= maximum binding capacity at equilibrium [fitting parameter 2]
* KD=kd/ka
* kobs=ka*Cpro+kd
* Req=(Cpro/(Cpro+KD))*Rmax
* Cpro= concentration of protein in solution
* NS= non-specific binding at time=infinity (constant correction for end point of fit) [this is taken into account in the amplitude corrected formula: Y0=Ylast]
* Y0= the initial value of Y for the first point of the dissociation curve (I.E. the last point of the association phase)
* X0= the initial value of X for the first point of the dissociation phase
*
*/
GetRawDataXY(@"C:\Results.txt");
double epsg = .00001;
double epsf = 0;
double epsx = 0;
int maxits = 10000;
alglib.minlmstate state;
alglib.minlmreport rep;
double[] param = new double[] { 1000000, .0100, 0.20};////ka,kd,Rmax these are initial guesses for the algorithm and should be mid range for the expected data., The last parameter Rmax should be guessed as the maximum Y-value of Ka
double[] scaling= new double[] { 1E6,1,1};
alglib.minlmcreatev(2, param, 0.001, out state);
alglib.minlmsetcond(state, epsg, epsf, epsx, maxits);
alglib.minlmsetgradientcheck(state, 1);
alglib.minlmsetscale(state, scaling);
alglib.minlmoptimize(state, Calc_residuals, null, V.rawXYdata);
alglib.minlmresults(state, out param, out rep);
System.Console.WriteLine("{0}", rep.terminationtype); ////1=relative function improvement is no more than EpsF. 2=relative step is no more than EpsX. 4=gradient norm is no more than EpsG. 5=MaxIts steps was taken. 7=stopping conditions are too stringent,further improvement is impossible, we return best X found so far. 8= terminated by user
System.Console.WriteLine("{0}", alglib.ap.format(param, 25));
System.Console.ReadLine();
return 0;
}
public static void Calc_residuals(double[] param, double[] fi, object obj)
{
/*calculate the difference of the model and the raw data at each X (I.E. residuals)
* the sum of the square of the residuals is returned to the optimized function to be minimized*/
CalcVariables(param);
fi[0] = 0;
fi[1] = 0;
for (int i = 0; i < V.rawXYdata[0].Count(); i++)
{
if (V.rawXYdata[0][i] <= V.breakpoint)
{
fi[0] += System.Math.Pow((kaEQN(V.rawXYdata[0][i], param) - V.rawXYdata[1][i]), 2);
}
else
{
if (!V.breakpointreached)
{
V.breakpointreached = true;
V.X_0 = V.rawXYdata[0][i];
V.Y_0 = V.rawXYdata[1][i];
}
fi[1] += System.Math.Pow((kdEQN(V.rawXYdata[0][i], param) - V.rawXYdata[1][i]), 2);
}
}
if (param[0] <= 0 || param[1] <=0 || param[2] <= 0)////Exponentiates the error if the parameters go negative to favor positive non-zero values
{
fi[0] = Math.Pow(fi[0], 2);
fi[1] = Math.Pow(fi[1], 2);
}
System.Console.WriteLine("{0}"+" "+V.Cpro+" -->"+fi[0], alglib.ap.format(param, 5));
Console.WriteLine((kdEQN(V.rawXYdata[0][114], param)));
}
public static double kdEQN(double X, double[] param)
{
/*Calculate kd Y value based on the incremented parameters*/
return (V.Rmax * (1 / (1 + (V.kd / (V.ka * V.Cpro)))) * (1 - Math.Exp((-1 * V.ka * V.Cpro) * V.X_0))) * Math.Exp((-1 * V.kd) * (X - V.X_0));
}
public static double kaEQN(double X, double[] param)
{
/*Calculate ka Y value based on the incremented parameters*/
return ((V.Cpro * V.Rmax) / (V.Cpro + V.kd)) * (1 - Math.Exp(-1 * X * ((V.ka * V.Cpro) + V.kd)));
}
public static void GetRawDataXY(string filename)
{
/*Read in Raw data From tab delim txt*/
string[] elements = { "x", "y" };
int count = 0;
V.rawXYdata[0] = new double[226];
V.rawXYdata[1] = new double[226];
using (StreamReader sr = new StreamReader(filename))
{
while (sr.Peek() >= 0)
{
elements = sr.ReadLine().Split('\t');
V.rawXYdata[0][count] = Convert.ToDouble(elements[0]);
V.rawXYdata[1][count] = Convert.ToDouble(elements[1]);
count++;
}
}
}
public class V
{
/*Global Variables*/
public static double[][] rawXYdata = new double[2][];
public static double Cpro = 100E-9;
public static bool breakpointreached = false;
public static double X_0 = 0;
public static double Y_0 = 0;
public static double ka = 0;
public static double kd = 0;
public static double Rmax = 0;
public static double KD = 0;
public static double Kobs = 0;
public static double Req = 0;
public static double breakpoint = 180;
}
public static void CalcVariables(double[] param)
{
V.ka = param[0];
V.kd = param[1];
V.Rmax = param[2];
V.KD = param[1] / param[0];
V.Kobs = param[0] * V.Cpro + param[1];
V.Req = (V.Cpro / (V.Cpro + param[0] * V.Cpro + param[1])) * param[2];
}
}
}

Simplifying redundant variable assignment

I don't like this code; it is overcomplicated and impractical, so I'm looking to simplify it.
I want it to change a variable by a random amount, and I need to put at least 150 variables into this code.
//Variable list
public double price1 = 100;
public double price2 = 100;
public double price3 = 100;
public void DaysEnd(){ //Simplified version of inefficient code
var = price1;
HVariation();
price1 = newvar;
var = price2;
HVariation();
price2 = newvar;
var = price2;
MVariation();
price2 = newvar;
var = price3;
LVariation();
price3 = newvar;
}
public void Hvariation(){
newvar = var + (var * (Random.NextDouble(0 - 0.5, 0.5)));
}
public void Mvariation(){
newvar = var + (var * (Random.NextDouble(0 - 0.25, 0.25)));
}
public void Lvariation(){
newvar = var + (var * (Random.NextDouble(0 - 0.1, 0.5)));
}
This should get you started
List<double> values = new List<double> { 100, 100, 200, 500, ... };
values = values.Select(val => Hvariation(val)).ToList();
// now all values have been altered by Hvariation
...
private readonly Random _rand = new Random();
public double Hvariation(double val) {
return val + (val * (_rand.NextDouble(-0.5, 0.5)));
}
The first thing to do is find repeated code. For example:
var = price3;
LVariation(); //Different variations
price3 = newvar;
This can be turned into a method (that takes the variation as a parameter).
To do this, you will also need to make a default variation that takes the min and max:
public void Variation(double min, double max){
newvar = var + (var * (Random.NextDouble(min, max)));
}
You can then put this together to reduce code to look some thing like this:
public double UpdatePrice(double price, double min, double max)
{
var = price;
Variation(min, max);
return newvar;
}
In general, if I have to copy the code more than once (or even once if the amount copied is significant), I turn the code into a method.
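For example, DaysEnd could then shrink to something like this (a sketch, assuming the min/max ranges from the original H/M/L methods):
public void DaysEnd()
{
price1 = UpdatePrice(price1, -0.5, 0.5); // high variation
price2 = UpdatePrice(price2, -0.25, 0.25); // medium variation
price3 = UpdatePrice(price3, -0.1, 0.5); // low variation
}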
You can simplify this by defining a variation level and passing it into a single method instead of defining three variation methods. I'm not sure whether you need arrays or can use lists (in which case lists are preferable), but you can store your values in an array instead of defining a variable name for each one, and separate them into logical groupings as needed. You can then apply the change/transformation to each array using LINQ. An example of this would be
public enum VariationLevel
{
High,
Medium,
Low
};
public double[] HighVariancePrices =
{
100, 100, 100, 100, 100
};
public double[] MediumVariancePrices =
{
100, 100, 100, 100, 100
};
public double[] LowVariancePrices =
{
100, 100, 100, 100, 100
};
public void DaysEnd()
{
HighVariancePrices = HighVariancePrices.Select(price => GetVariation(price, VariationLevel.High)).ToArray();
MediumVariancePrices = MediumVariancePrices.Select(price => GetVariation(price, VariationLevel.Medium)).ToArray();
LowVariancePrices = LowVariancePrices.Select(price => GetVariation(price, VariationLevel.Low)).ToArray();
}
public double GetVariation(double value, VariationLevel variationLevel)
{
switch (variationLevel)
{
case VariationLevel.High:
return value + (value * (Random.NextDouble(0 - 0.5, 0.5)));
case VariationLevel.Medium:
return value + (value * (Random.NextDouble(0 - 0.25, 0.25)));
case VariationLevel.Low:
return value + (value * (Random.NextDouble(0 - 0.1, 0.5)));
default:
throw new ArgumentOutOfRangeException(nameof(variationLevel)); // required so all code paths return (or throw)
}
}
However, the code around Random.NextDouble() doesn't compile (because NextDouble doesn't take arguments), so I'm not certain what you're trying to do there, but that's outside the scope of "how can I simplify my code?" Hope this helps some.
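If the intent behind Random.NextDouble(min, max) was a range overload (the BCL has no such overload), a small extension method could stand in for it; this is only a sketch under that assumption:
public static class RandomExtensions
{
// Returns a double in [min, max); assumes min < max.
public static double NextDouble(this Random random, double min, double max)
{
return min + random.NextDouble() * (max - min);
}
}
With that in place, instance calls such as _rand.NextDouble(-0.5, 0.5) in the snippets above compile; the static-looking Random.NextDouble(...) calls would still need to go through a Random instance.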
