Array data normalization - c#

I have an array of values (between -1.0 and 1.0) that represent intensity (Black to White). I need a way to map the double values from -1.0 through 1.0 to 0 through 255 and back.
More generally, I have an array of data and I need to map from the min and max value of the data to a supplied min and max. The basic structure should be like:
private static int[] NormalizeData(double[] data, int min, int max)
{
var sorted = data.OrderBy(d => d);
double dataMin = sorted.First(); // OrderBy sorts ascending, so the first element is the minimum
double dataMax = sorted.Last();
int[] ret = new int[data.Length];
for (int i = 0; i < data.Length; i++)
{
ret[i] = (int)data[i]; // Normalization here
}
return ret;
}

This works:
private static int[] NormalizeData(IEnumerable<double> data, int min, int max)
{
double dataMax = data.Max();
double dataMin = data.Min();
double range = dataMax - dataMin;
return data
.Select(d => (d - dataMin) / range)
.Select(n => (int)((1 - n) * min + n * max))
.ToArray();
}
The first select normalizes the input to be from 0 to 1 (0 being minimum, 1 being the maximum). The second select takes that normalized number, and maps it to the new minimum and maximum.
Note that using the LINQ Min() and Max() functions is faster than sorting the input for larger datasets: O(n) vs. O(n * lg(n)).
Also, if you want to go the other way, then you'll want it to return doubles instead of ints.
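For the reverse direction mentioned above, a minimal sketch might look like this. The class and method names (Denormalizer, DenormalizeData) and the explicit source range parameters are assumptions made for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Denormalizer
{
    // Hypothetical helper: maps ints in [srcMin, srcMax] linearly onto doubles in [dstMin, dstMax].
    public static double[] DenormalizeData(IEnumerable<int> data, int srcMin, int srcMax,
                                           double dstMin, double dstMax)
    {
        if (srcMax == srcMin)
            throw new ArgumentException("Source range must not be empty.");

        double srcRange = srcMax - srcMin;
        return data
            .Select(v => (v - srcMin) / srcRange)        // position in [0, 1]
            .Select(n => (1 - n) * dstMin + n * dstMax)  // map onto the target range
            .ToArray();
    }
}
```

Calling Denormalizer.DenormalizeData(ints, 0, 255, -1.0, 1.0) maps 0 to -1.0 and 255 to 1.0. Because the forward conversion truncates to an int, a round trip will generally not reproduce the original doubles exactly.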

public static double Scale(this double elementToScale,
double rangeMin, double rangeMax,
double scaledRangeMin, double scaledRangeMax)
{
var scaled = scaledRangeMin + ((elementToScale - rangeMin) * (scaledRangeMax - scaledRangeMin) / (rangeMax - rangeMin));
return scaled;
}
Usage:
// double [-1,1] to int [0-255]
int[] integers = doubles.Select(x => (int)Math.Round(x.Scale(-1, 1, 0, 255))).ToArray(); // Scale returns double, so convert explicitly
// int [0-255] to double [-1,1]
double[] doubles = integers.Select(x => ((double)x).Scale(0,255,-1,1)).ToArray();
If you don't know the min and max of the source range in advance ([0-255] and [-1,1] in the example), you can get them from the data with LINQ's Min() and Max().
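As a rough sketch of that last point, the source range can be taken from the data itself. ScaleToInts is a hypothetical helper name, and the choice to map constant input to scaledMin is an assumption; Scale is repeated only so the example is self-contained.

```csharp
using System;
using System.Linq;

public static class ScaleExtensions
{
    // Same linear mapping as the Scale method above.
    public static double Scale(this double value, double rangeMin, double rangeMax,
                               double scaledRangeMin, double scaledRangeMax)
    {
        return scaledRangeMin
            + (value - rangeMin) * (scaledRangeMax - scaledRangeMin) / (rangeMax - rangeMin);
    }

    // Hypothetical helper: uses the data's own min and max as the source range.
    public static int[] ScaleToInts(this double[] data, int scaledMin, int scaledMax)
    {
        double dataMin = data.Min(); // throws InvalidOperationException on an empty array
        double dataMax = data.Max();
        if (dataMax == dataMin)
        {
            // All values are equal, so the source range is empty (assumption: map to scaledMin).
            return data.Select(_ => scaledMin).ToArray();
        }
        return data
            .Select(x => (int)Math.Round(x.Scale(dataMin, dataMax, scaledMin, scaledMax)))
            .ToArray();
    }
}
```

With this, new[] { -0.5, 0.0, 0.5 }.ScaleToInts(0, 255) maps the smallest value to 0 and the largest to 255, whatever the actual data range is.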

private static int[] NormalizeData(double[] data, int min, int max) {
int[] ret = new int[data.Length];
for (int i = 0; i < data.Length; i++) {
// Assumes data is in [-1, 1] and min is 0: shift to [0, 2], then scale to [0, max].
ret[i] = (int)((max * (data[i] + 1)) / 2);
}
return ret;
}
static void Main(string[] args) {
double[] data = { 1.0, -1, 0, -.5, .5 };
int[] normalized = NormalizeData(data, 0, 255);
foreach (var v in normalized) {
Console.WriteLine(v);
}
}

EDIT:
How about this:
private static int[] NormalizeData(double[] data, int min, int max)
{
var sorted = data.OrderBy(d => d);
double dataMin = sorted.First(); // ascending order: the first element is the minimum
double dataMax = sorted.Last();
int[] ret = new int[data.Length];
double dataRange = dataMax - dataMin;
double newRange = max - min;
for (int i = 0; i < data.Length; i++)
{
// Position of the value within the data range (0..1), mapped onto [min, max].
ret[i] = (int)Math.Round(min + (data[i] - dataMin) / dataRange * newRange);
}
return ret;
}

Assuming a strictly linear transformation and that you want dataMin to map to min and dataMax to map to max:
double dataRange = dataMax - dataMin;
int newRange = max - min;
double pct = (data[i] - dataMin) / dataRange;
int newValue = (int)Math.Round(min + (pct * newRange)); // Math.Round returns a double
That can certainly be optimized, but it shows the basic idea. Basically, you figure out the position (as a percentage) of the value in the original range and then map that percentage to the target range.
Note that if dataMin is -0.5 and dataMax is 0.5, this might not produce the results that you're looking for because -0.5 will map to 0 and 0.5 will map to 255. If you want things to map exactly as stated, you'll have to define the source range as well.
As an aside, there's no particular reason to sort the items just to get the min and max. You can write:
double dataMax = data.Max();
double dataMin = data.Min();
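To illustrate the point about defining the source range explicitly, here is a minimal sketch. MapToRange and its parameter names are hypothetical, and clamping out-of-range inputs is an assumption about the desired behavior.

```csharp
using System;

public static class RangeMapper
{
    // Hypothetical helper: maps value from [srcMin, srcMax] onto [min, max],
    // clamping inputs that fall outside the declared source range.
    public static int MapToRange(double value, double srcMin, double srcMax, int min, int max)
    {
        if (srcMax <= srcMin)
            throw new ArgumentException("srcMax must be greater than srcMin.");

        double clamped = Math.Max(srcMin, Math.Min(srcMax, value));
        double pct = (clamped - srcMin) / (srcMax - srcMin); // position in [0, 1]
        return (int)Math.Round(min + pct * (max - min));
    }
}
```

With the source range fixed at [-1, 1], MapToRange(-0.5, -1, 1, 0, 255) gives 64 and MapToRange(0.5, -1, 1, 0, 255) gives 191, regardless of which values actually occur in the data.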

To normalize your array, which in this example acts as a vector mathematically, you first need to decide how many dimensions (elements) the vector has.
It's not really clear from the example whether you want to normalize the entire array, taking all of its elements into account. If so, calculate the dot product of the array with itself and take its square root; that is the length of the vector. Then divide every element by that length to normalize the array to a length of 1.0.
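A minimal sketch of that unit-length normalization, assuming the whole array is treated as one vector (the names VectorMath and NormalizeToUnitLength are made up for illustration):

```csharp
using System;
using System.Linq;

public static class VectorMath
{
    // Hypothetical helper: divides every element by the Euclidean length of the array.
    public static double[] NormalizeToUnitLength(double[] vector)
    {
        // Square root of the dot product of the vector with itself.
        double length = Math.Sqrt(vector.Sum(v => v * v));
        if (length == 0)
            throw new ArgumentException("A zero vector cannot be normalized.");

        return vector.Select(v => v / length).ToArray();
    }
}
```

For example, { 3, 4 } has length 5, so it normalizes to roughly { 0.6, 0.8 }.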
What you describe above is not actually a normalization of the data but a conversion from one range to another. To solve that, you could use something like the following:
private static double[] convertToScale(double[] data, double oldMin, double oldMax,double min, double max)
{
double oldDiff = 0 - oldMin;
double oldScale = oldMax - oldMin;
double diff = 0 - min;
double scale = max - min;
double[] ret = new double[data.Length];
for (int i = 0; i < data.Length; i++)
{
double scaledFromZeroToOne = (oldDiff+data[i])/oldScale; // Normalization here [0,1]
double value = (scaledFromZeroToOne*scale)-diff;
ret[i] = value;
}
return ret;
}
I believe this function solves the problem described above.
You can call it like this:
double[] result = convertToScale(input,-1.0,1.0,0,255);
And then cast everything to int if you'd rather have the values represented as ints.
Hope it helps.

Related

What is the fastest way to calculate min, max, mean, median and standard deviation from C# array?

I have a large array or list of doubles which is not sorted, and I want to calculate min, max, mean, median and standard deviation in the most efficient way. Of course I could simply use LINQ to calculate them one by one, but I think one can go faster. Sample code:
var list = new List<double>(){1.0, 2.5, 0.11, 0.7, 8.2, 3.4, 1.0};
var (min, max, mean, median, std) = CalculateMetrics(list);
private (double, double, double, double, double) CalculateMetrics(List<double> list) {
// TODO
}
So what is the most efficient way? Using libraries is also fine for me.
All of the descriptive statistics you want, except the median, can be computed in a single pass through your list. The trick to getting the standard deviation is to accumulate both the sum and the sum of squares of your samples. Here's an example of that.
int count = 0;
double sum = 0.0;
double sumsq = 0.0;
double max = double.MinValue;
double min = double.MaxValue;
foreach (double sample in list)
{
count++;
sum += sample;
sumsq += sample * sample;
if (sample > max) max = sample;
if (sample < min) min = sample;
}
double mean = sum / count;
double stdev = Math.Sqrt((sumsq / count) - (mean * mean));
Because this makes only one pass through the list, it works with any IEnumerable collection of samples, and is compatible with LINQ.
Obviously this is quick-n-dirty example code. I leave it to you to build it into a useful function.
On an empty list, count stays 0 and the floating-point division yields NaN rather than a meaningful mean, so check for that case. And, if you have very large numbers or very long lists, that subtraction in the computation of stdev may lose precision and give you back a useless number.
But it works well for most applications.
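If the precision loss mentioned above is a concern, one commonly used alternative is Welford's online algorithm, which updates the mean and the sum of squared deviations incrementally instead of subtracting two large numbers at the end. This is a different technique from the sum and sum-of-squares code above, sketched here with made-up names (RunningStats, Compute); it returns the population standard deviation, like the original.

```csharp
using System;
using System.Collections.Generic;

public static class RunningStats
{
    // Sketch of Welford's online algorithm (single pass, population standard deviation).
    public static (double min, double max, double mean, double std) Compute(IEnumerable<double> samples)
    {
        int count = 0;
        double mean = 0.0;
        double m2 = 0.0; // running sum of squared deviations from the current mean
        double min = double.PositiveInfinity;
        double max = double.NegativeInfinity;

        foreach (double x in samples)
        {
            count++;
            double delta = x - mean;
            mean += delta / count;
            m2 += delta * (x - mean); // second factor uses the updated mean
            if (x < min) min = x;
            if (x > max) max = x;
        }

        if (count == 0)
            throw new InvalidOperationException("Sequence contains no elements.");

        return (min, max, mean, Math.Sqrt(m2 / count));
    }
}
```

For the samples { 2, 4, 4, 4, 5, 5, 7, 9 } this gives a mean of 5 and a standard deviation of 2.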
Because the median needs the values in sorted order, and the standard deviation (computed this way) needs the mean first, it is hard to do all of this in a single O(n) pass.
Here's my best attempt:
private (double min, double max, double mean, double median, double std) CalculateMetrics(List<double> list)
{
var mean = list.Average();
var std = Math.Sqrt(list.Aggregate(0.0, (a, x) => a + (x - mean) * (x - mean)) / list.Count());
var sorted = list.OrderBy(x => x).ToList();
var median = sorted.Count % 2 == 0 ? (sorted[sorted.Count / 2 - 1] + sorted[sorted.Count / 2]) / 2 : sorted[sorted.Count / 2];
return (sorted.First(), sorted.Last(), mean, median, std);
}
Two-pass solution (two loops over the data, which is still O(n)):
private static (double, double, double, double, double) CalculateMetrics(double[] list)
{
if (list.Length < 1)
{
throw new Exception();
}
double min = list[0];
double max = list[0];
double median = list[list.Length / 2]; // Only the true median if the array is already sorted
double sum = 0;
foreach (double el in list)
{
if (el > max)
{
max = el;
}
if (el < min)
{
min = el;
}
sum += el;
}
double mean = sum / list.Length;
double sumStd = 0;
foreach (var el in list)
{
sumStd += Math.Pow(el - mean, 2) / list.Length;
}
double stdDev = Math.Sqrt(sumStd);
return (min, max, mean, median, stdDev);
}
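Since the question says the input is not sorted, the median needs the values in order at some point. A minimal sketch that sorts a copy, so the caller's array is left untouched (MedianHelper and Median are hypothetical names):

```csharp
using System;

public static class MedianHelper
{
    // Hypothetical helper: returns the median without modifying the caller's array.
    public static double Median(double[] values)
    {
        if (values.Length == 0)
            throw new ArgumentException("Array must not be empty.");

        double[] sorted = (double[])values.Clone();
        Array.Sort(sorted); // O(n log n)

        int mid = sorted.Length / 2;
        return sorted.Length % 2 == 0
            ? (sorted[mid - 1] + sorted[mid]) / 2.0 // even count: average the two middle values
            : sorted[mid];
    }
}
```

For the sample list from the question (1.0, 2.5, 0.11, 0.7, 8.2, 3.4, 1.0) this returns 1.0. The sort makes the overall cost O(n log n); a selection algorithm could bring the median down to O(n) on average, at the price of more code.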

Normalization of a decimal value

My problem is this: I need to generate random numbers from a Gaussian (normal) distribution and create a histogram with a bin width of 0.1.
class Gaussian
{
public static double Next(Random r, double mu = 0, double sigma = 1)
{
var u1 = 1.0 - r.NextDouble(); // in (0, 1], so Math.Log(u1) is never -Infinity
var u2 = r.NextDouble();
var rand_std_normal = Math.Sqrt(-2.0 * Math.Log(u1)) *
Math.Sin(2.0 * Math.PI * u2);
var rand_normal = mu + sigma * rand_std_normal;
return rand_normal;
}
}
I am using the above function to generate a Gaussian random value.
Now, to create the histogram, I need a calculation that converts each Gaussian value into an array index. Something like the following:
static void Main(string[] args)
{
const int N = 1000;
int time = N;
const double binsDistance = 0.1;
int binsCount = (int)(N * binsDistance);
Random rand = new Random();
int[] array = new int[binsCount];
int side = 0;
for (int i = 0; i < time; i++)
{
double gauss = Gaussian.Next(rand);
int binNo = Normalization.Normalize(0, binsCount - 1, gauss);
array[binNo]++;
}
}
For that, I tried two calculations.
The first one is here.
The second one is here.
The problem with the first one is, it can't handle negative numbers properly.
The problem with the second one is, it is generating too many zero values.
So, I have two questions:
What is the basic difference between #1 and #2?
How can I achieve what I am trying to do?
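One possible way to turn a Gaussian value into a bin index is to pick a fixed range up front and divide by the bin width. The sketch below is an assumption about what is wanted, not a reconstruction of either linked calculation: the names Histogram and BinIndex are made up, and the range has to be chosen by the caller because a normal variate is unbounded.

```csharp
using System;

public static class Histogram
{
    // Hypothetical helper: index of the bin of width binWidth that contains value,
    // for bins covering [lo, lo + binCount * binWidth). Values outside that range
    // are clamped into the first or last bin.
    public static int BinIndex(double value, double lo, double binWidth, int binCount)
    {
        double raw = Math.Floor((value - lo) / binWidth);
        if (raw < 0) return 0;
        if (raw >= binCount) return binCount - 1;
        return (int)raw;
    }
}
```

With lo = -5.0, a bin width of 0.1 and 100 bins (covering -5 to 5 standard deviations for mu = 0 and sigma = 1), the loop body in Main could become array[Histogram.BinIndex(gauss, -5.0, binsDistance, binsCount)]++. Using Math.Floor rather than a plain cast matters for negative inputs, because a cast truncates toward zero.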

Scale FFT result frequencies in log

I am programming a visualizer, with pretty good results so far. I have an array of size 1500 containing the magnitudes of the frequencies. Now I want to convert this array into an array with 100 values. For example, the 1st index of the 2nd array should hold the average of the first two values of the first array, and the 2nd index of the 2nd array should hold the average of the values at indices 3-6. But I don't know how to calculate this properly. So how can I convert the first array into the second one?
I have found an answer in the Rainmeter source code. Maybe it will now be clearer what I wanted to do. Here is the C# code:
To get an array of a specific length, log-scaled between the min. and max. frequencies:
private float[] getFrequencies(int min, int max, int nBands)
{
float[] returnVal = new float[nBands];
double step = (Math.Log((double)max / min) / nBands) / Math.Log(2.0); // cast avoids integer division
returnVal[0] = (float)(min * Math.Pow(2.0, step / 2.0));
for (int iBand = 1; iBand < nBands; ++iBand)
{
returnVal[iBand] = (float)(returnVal[iBand - 1] * Math.Pow(2.0, step));
}
return returnVal;
}
And to fill the output array:
private double[] getLogArray(double[] data, int nBands, int minFreq, int maxFreq)
{
float[] bandFreq = getFrequencies(minFreq, maxFreq, nBands);
// sampleRate and samples are fields defined elsewhere in the class (not shown here).
float df = (float)sampleRate / samples;
float scalar = 1.0f / sampleRate;
double[] bandOut = new double[nBands];
int iBin = 0;
int iBand = 0;
float f0 = 0.0f;
while (iBin <= (samples / 2) && iBand < nBands)
{
float fLin1 = ((float)iBin + 0.5f) * df;
float fLog1 = bandFreq[iBand];
float x = (float)data[iBin];
if (fLin1 <= fLog1)
{
bandOut[iBand] += (fLin1 - f0) * x * scalar;
f0 = fLin1;
iBin += 1;
}
else
{
bandOut[iBand] += (fLog1 - f0) * x * scalar;
f0 = fLog1;
iBand += 1;
}
}
return bandOut;
}
Have a nice day and sorry for the late response.
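A simpler way to get the kind of grouping the question describes is to cut the input array at geometrically growing indices and average each slice. This is not the Rainmeter approach above, which weights each FFT bin by how much of its frequency span falls inside a band; it is only a rough sketch, it assumes data.Length >= nBands, and the names LogBands and AverageIntoLogBands are made up for illustration.

```csharp
using System;

public static class LogBands
{
    // Hypothetical helper: averages data into nBands slices whose widths grow geometrically.
    public static double[] AverageIntoLogBands(double[] data, int nBands)
    {
        if (nBands < 1 || data.Length < nBands)
            throw new ArgumentException("Need at least one input value per band.");

        double[] bands = new double[nBands];
        int start = 0;
        for (int band = 0; band < nBands; band++)
        {
            // Upper edge grows geometrically from 1 up to data.Length.
            int end = (int)Math.Round(Math.Pow(data.Length, (band + 1) / (double)nBands));
            end = Math.Max(end, start + 1);                         // at least one value per band
            end = Math.Min(end, data.Length - (nBands - 1 - band)); // leave room for later bands

            double sum = 0;
            for (int i = start; i < end; i++) sum += data[i];
            bands[band] = sum / (end - start);
            start = end;
        }
        return bands;
    }
}
```

For a 1500-element input and 100 bands, the first bands cover a single value each and the last ones cover many, which is the general shape asked for; the exact slice boundaries will differ from the "2, then 4" example in the question.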

Find the min and max for quadratic equation

How can I find the min and max of this function using C#?
f(x,y) = x^2 + y^2 + 25 * (sin(x)^2 + sin(y)^2), where (x,y) is in (-2Pi, 2Pi)
Solving it by hand I got min = 0 and max = 8Pi^2 = 78.957.
I tried to write the code based on code for a linear/quadratic equation, but something goes totally wrong.
This code gives min = -4.?? and max = 96. Could you help me find my mistake, please?
I uploaded the code to Dropbox if anyone wants to have a look: https://www.dropbox.com/s/p7y6krk2gk29i9e/Program.cs
double[] X, Y, Result; // Range array and result array.
private void BtnRun_Click(object sender, EventArgs e)
{
//Set any Range for the function
X = setRange(-2 * Math.PI, 2 * Math.PI, 10000);
Y = setRange(-2 * Math.PI, 2 * Math.PI, 10000);
Result = getOutput_twoVariablesFunction(X, Y);
int MaxIndex = getMaxIndex(Result);
int MinIndex = getMinIndex(Result);
TxtMin.Text = Result[MinIndex].ToString();
TxtMax.Text = Result[MaxIndex].ToString();
}
private double twoVariablesFunction(double x,double y)
{
double f;
//Set any two variables function
f = Math.Pow(x, 2) + Math.Pow(y, 2) + 25 * (Math.Pow(Math.Sin(x), 2) + Math.Pow(Math.Sin(y), 2));
return f;
}
private double[] setRange(double Start, double End, int Sample)
{
double Step = (End - Start) / Sample;
double CurrentValue = Start;
double[] Array = new double[Sample];
for (int Index = 0; Index < Sample; Index++)
{
Array[Index] = CurrentValue;
CurrentValue += Step;
}
return Array;
}
private double[] getOutput_twoVariablesFunction(double[] X, double[] Y)
{
int Step = X.Length;
double[] Array = new double[Step];
for (int Index = 0; Index < X.Length ; Index++)
{
Array[Index] = twoVariablesFunction(X[Index], Y[Index]);
}
return Array;
}
private int getMaxIndex(double[] ValuesArray)
{
double M = ValuesArray.Max();
int Index = ValuesArray.ToList().IndexOf(M);
return Index;
}
private int getMinIndex(double[] ValuesArray)
{
double M = ValuesArray.Min();
int Index = ValuesArray.ToList().IndexOf(M);
return Index;
}
Do you want to compute (sin(x))^2 or sin(x^2)? In your f(x,y) formula it looks like (sin(x))^2, but in your method twoVariablesFunction like sin(x^2).
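Separately from the sin question, note that getOutput_twoVariablesFunction pairs X[Index] with Y[Index], so the posted code only ever evaluates the function along the diagonal x = y. A minimal sketch of a full grid search, assuming (sin(x))^2 is what is intended (the names GridSearch, F and FindMinMax are made up for illustration):

```csharp
using System;

public static class GridSearch
{
    // The function from the question, read as (sin x)^2 and (sin y)^2.
    static double F(double x, double y)
    {
        double sx = Math.Sin(x), sy = Math.Sin(y);
        return x * x + y * y + 25 * (sx * sx + sy * sy);
    }

    // Evaluates F at every (x, y) pair on a steps-by-steps grid, both endpoints included.
    public static (double min, double max) FindMinMax(double start, double end, int steps)
    {
        if (steps < 2)
            throw new ArgumentException("Need at least two steps per axis.");

        double min = double.PositiveInfinity, max = double.NegativeInfinity;
        for (int i = 0; i < steps; i++)
        {
            double x = start + (end - start) * i / (steps - 1);
            for (int j = 0; j < steps; j++)
            {
                double y = start + (end - start) * j / (steps - 1);
                double f = F(x, y);
                if (f < min) min = f;
                if (f > max) max = f;
            }
        }
        return (min, max);
    }
}
```

A call such as GridSearch.FindMinMax(-2 * Math.PI, 2 * Math.PI, 1001) costs about a million evaluations. As a sanity check on the hand-derived maximum: evaluating the formula at x = y = 4.9 gives roughly 96.3, which is above 8Pi^2, so that value may be worth rechecking.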

Round decimals and convert to % in C#

I have a list of probabilities like
0.0442857142857143
0.664642857142857
0.291071428571429
I want to convert them to the nearest percentages so that the sum of percentages adds up to 100
so something like this
0.0442857142857143 - 4 %
0.664642857142857 - 67 %
0.291071428571429 - 29 %
I cannot rely on Math.Round to always give me results which will add up to 1. What would be the best way to do this?
This is a method that could do the job.
public int[] Round(params decimal[] values)
{
decimal total = values.Sum();
int[] percents = values.Select(x => (int)Math.Round(x / total * 100)).ToArray();
int totalPercent = percents.Sum();
int diff = 100 - totalPercent;
percents[percents.Length - 1] += diff; // put the rounding remainder on the last entry
return percents;
}
Interesting collection of answers.
The problem here is that you are getting a cumulative error in your rounding operations. In some cases the accumulated error cancels out - some values round up, others down, cancelling the total error. In other cases such as the one you have here, the rounding errors are all negative, giving an accumulated total error of (approximately) -1.
The only way to work around this in the general case is to keep track of the total accumulated error and add/subtract when that error gets large enough. It's tedious, but the only real way to get this right:
static int[] ToIntPercents(double[] values)
{
int[] results = new int[values.Length];
double error = 0;
for (int i = 0; i < values.Length; i++)
{
double val = values[i] * 100;
int percent = (int)Math.Round(val + error);
error += val - percent;
if (Math.Abs(error) >= 0.5)
{
int sign = Math.Sign(error);
percent += sign;
error -= sign;
}
results[i] = percent;
}
return results;
}
This code produces reasonable results for any size array with a sum of approximately +1.0000 (or close enough). Array can contain negative and positive values, just as long as the sum is close enough to +1.0000 to introduce no gross errors.
The code accumulates the rounding errors and when the total error exceeds the acceptable range of -0.5 < error < +0.5 it adjusts the output. Using this method the output array for your numbers would be: [4, 67, 29]. You could change the acceptable error range to be 0 <= error < 1, giving the output [4, 66, 30], but this causes odd results when the array contains negative numbers. If that's your preference, change the if statement in the middle of the method to read:
if (error < 0 || error >= 1)
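For comparison, another commonly used scheme is the largest remainder method: floor every value, then hand the missing percentage points to the entries with the largest fractional parts. This is a different technique from the error accumulation above, sketched under the assumption that the inputs are non-negative and sum to about 1 (PercentRounding and LargestRemainder are made-up names).

```csharp
using System;
using System.Linq;

public static class PercentRounding
{
    // Sketch of the largest remainder method; assumes non-negative inputs summing to about 1.
    public static int[] LargestRemainder(double[] values)
    {
        double[] scaled = values.Select(v => v * 100).ToArray();
        int[] results = scaled.Select(s => (int)Math.Floor(s)).ToArray();
        int shortfall = 100 - results.Sum();

        // Indices ordered by descending fractional part.
        int[] order = Enumerable.Range(0, values.Length)
            .OrderByDescending(i => scaled[i] - Math.Floor(scaled[i]))
            .ToArray();

        // Hand out the missing points, one each, to the largest remainders.
        for (int k = 0; k < shortfall && k < order.Length; k++)
            results[order[k]]++;

        return results;
    }
}
```

For the three probabilities in the question this also gives [4, 67, 29]: the floors are 4, 66 and 29, and the single missing point goes to 66.46, which has the largest fractional part.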
You could just multiply the number by 100 (if you have the decimal number)
0.0442857142857143 * 100 = 4 %
0.664642857142857 * 100 = 66 %
0.291071428571429 * 100 = 29 %
Edit: correct, 0.291071428571429 wouldn't round up to 30%, so these add up to 99 rather than 100.
Since you don't seem to care which number is bumped, I'll use the last one. The algorithm is pretty simple, and it works both for the .4 edge case, where you must add 1, and for the .5 case, where you must remove 1:
1) Round each number but the last one
2) Subtract the sum of those rounded numbers from 100
3) Assign the remainder to the last number
As an extension method, it looks like this :
public static int[] SplitIntoPercentage(this double[] input)
{
int[] results = new int[input.Length];
for (int i = 0; i < input.Length - 1; i++)
{
results[i] = (int)Math.Round(input[i] * 100, MidpointRounding.AwayFromZero);
}
results[input.Length - 1] = 100 - results.Sum();
return results;
}
And here's the associated unit tests :
[TestMethod]
public void IfSumIsUnder100ItShouldBeBumpedToIt()
{
double[] input = new []
{
0.044,
0.664,
0.294
};
var result = input.SplitIntoPercentage();
Assert.AreEqual(100, result.Sum());
Assert.AreEqual(4, result[0]);
Assert.AreEqual(66, result[1]);
Assert.AreEqual(30, result[2]);
}
[TestMethod]
public void IfSumIsOver100ItShouldBeReducedToIt()
{
double[] input = new[]
{
0.045,
0.665,
0.295
};
var result = input.SplitIntoPercentage();
Assert.AreEqual(100, result.Sum());
Assert.AreEqual(5, result[0]);
Assert.AreEqual(67, result[1]);
Assert.AreEqual(28, result[2]);
}
Once refactored a little bit, the result looks like this :
public static int[] SplitIntoPercentage(this double[] input)
{
int[] results = RoundEachValueButTheLast(input);
results = SetTheLastValueAsTheRemainder(input, results);
return results;
}
private static int[] RoundEachValueButTheLast(double[] input)
{
int[] results = new int[input.Length];
for (int i = 0; i < input.Length - 1; i++)
{
results[i] = (int)Math.Round(input[i]*100, MidpointRounding.AwayFromZero);
}
return results;
}
private static int[] SetTheLastValueAsTheRemainder(double[] input, int[] results)
{
results[input.Length - 1] = 100 - results.Sum();
return results;
}
The logic is: first round the value to one decimal place, then round that result to a whole number.
static long PercentageOut(double value)
{
value = value * 100;
value = Math.Round(value, 1, MidpointRounding.AwayFromZero); // Round to one decimal place first
value = Math.Round(value, 0, MidpointRounding.AwayFromZero); // Then round to a whole number (midpoints go away from zero)
return Convert.ToInt64(value);
}
static void Main(string[] args)
{
double d1 = 0.0442857142857143;
double d2 = 0.664642857142857;
double d3 = 0.291071428571429;
long l1 = PercentageOut(d1);
long l2 = PercentageOut(d2);
long l3 = PercentageOut(d3);
Console.WriteLine(l1);
Console.WriteLine(l2);
Console.WriteLine(l3);
}
Output
4
67
29
---
sum is 100 %
