counting weight-average of a sequence of doubles

I have a long sequence of double numbers (let's assume no longer than 100 000).
Let's also assume that each number is no bigger than 200 000. Is my algorithm below suitable for such calculations? Will it be precise enough?
For example, if I add 200 000 to the sum 100 000 times and then divide by 100 000, I expect to get something between 199 999 and 200 001, but not 200 100 or something like that (though for these particular numbers it seems my class works perfectly, thanks to MarcinJuraszek for testing).
class Candle
{
    public Candle(double value)
    {
        ValueUpdated(value);
    }

    private double sum = 0;
    private double count = 0;

    public void ValueUpdated(double value)
    {
        sum += value;
        count++;
    }

    public double WeightAverage
    {
        get { return sum / count; }
    }
}

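Double-precision floating-point numbers have a 53-bit significand (52 bits stored explicitly), which gives roughly log10(2^53) ~= 16 significant decimal digits. Your largest sum is 100 000 * 200 000 = 2 * 10^10, which needs only 11 digits, so you should be perfectly fine. If the inputs are whole numbers, every partial sum is an integer below 2^53 and is therefore represented exactly.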
Why not test it, though?
double sum = 0.0;
int count = 100000;
for (int i = 0; i < count; ++i) {
sum += 200000.0;
}
double average = sum / (double)count;
Console.WriteLine(average); // prints out exactly 200000

You definitely have a bug in your loop. I just tried:
var candle = new Candle(200000);
for (int i = 1; i < 100000; i++)
candle.ValueUpdated(200000);
Console.WriteLine(candle.WeightAverage);
The result is predictable and correct: 200000!
According to MSDN, double precision is 15-16 digits, which is far more than you need.

As noted, a double has a precision of 15-16 significant decimal digits. So as long as your running sum (of either the plain values or, for a weighted average, the values times their weights) needs 15 digits or fewer, you should be fine.
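The Candle class in the question computes a plain mean, since every value implicitly has weight 1. If real weights are involved, a minimal sketch could look like the following. The class name WeightedAverage and its members are hypothetical and not part of the original code.

```csharp
using System;

// Hypothetical helper (not from the question): keeps the running sum of
// value * weight and the running sum of weights separately.
public sealed class WeightedAverage
{
    private double weightedSum; // running sum of value * weight
    private double weightSum;   // running sum of weights

    public void Add(double value, double weight)
    {
        weightedSum += value * weight;
        weightSum += weight;
    }

    // Evaluates to NaN before anything has been added (0.0 / 0.0).
    public double Value => weightedSum / weightSum;
}
```

With whole-number values and weights, both sums stay exact as long as they remain below 2^53, which is the same argument as for the unweighted case.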

Related

Display very Large or Small Numbers in Scientific Notation by Counting the Zero's

I need a method to get the number of zeros after the decimal point when the number before the decimal point is also zero. For example, 0.00000000987654 would work out as 8, since there are 8 zeros after the decimal point. Turning the decimal into a string, I could then display it in scientific notation as 9.87654E-9.
The reason I need this is so I can multiply very small numbers repeatedly, producing results too small for calculators.
So for example, 0.123456789 multiplied by 0.1 a thousand times (0.123456789 * 0.1 * 0.1 * 0.1 * 0.1 ...) works out at 1.234567890000000000000000000E-1001, using the decimal data type with the full 28-digit precision and displayed in scientific notation.
I was able to achieve this when working with factorials. For example, the factorial of 1000 is 1000 * 999 * 998 * 997 * 996 ... all the way down to 1. This number is too large for calculators, so I used iteration to get the result to 28-digit precision in scientific notation.
For the very large numbers I was successful. I achieved this by getting the number of Digits BEFORE the period:
static int Get_Digits_Before_Period(decimal Large_Number)
{
decimal d = decimal.Floor(Large_Number < 0 ? decimal.Negate(Large_Number) : Large_Number);
// 0.xyz should return 0, therefore a special case
if (d == 0m)
return 0;
int cnt = 1;
while ((d = decimal.Floor(d / 10m)) != 0m)
cnt++;
return cnt;
}
I now need a similar method but one for obtaining the number of Zero's AFTER the period.

The scale (power-of-ten exponent) of a decimal ranges from 0 to 28, so the type cannot represent a number such as 1.234567890000000000000000000E-1001. What follows therefore only applies to values in the valid range.
To count the zeros, first fetch the integer and exponent parts of the decimal:
var number = 0.00000000987654m;
var bits = decimal.GetBits(number); //0~2 are integer part.
var exponent = (bits[3] & 0xff0000) >> 16;
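Then subtract the number of significant digits of the integer part from the exponent to get the number of zeros after the decimal point. Note that the loop below treats each 32-bit word separately, so it is only correct while the whole integer part fits in bits[0] as a non-negative int (below 2^31); larger values need the three words combined into a single 96-bit integer first.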
var zeros = exponent;
for(int i = 0; i <= 2; i++)
{
if(bits[i] != 0)
zeros -= (int)Math.Log10(bits[i]) + 1;
}
if(zeros < 0)
zeros = 0;
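An alternative that avoids inspecting the bits is to repeatedly multiply the fractional part by 10 until it reaches 1. The sketch below is not the answer's method; the names DecimalZeros and CountLeadingFractionZeros are made up for illustration, and the integer part is simply discarded rather than checked for being zero.

```csharp
using System;

public static class DecimalZeros
{
    // Counts the zeros between the decimal point and the first non-zero
    // fractional digit, e.g. 0.00000000987654 -> 8. Assumption: the sign
    // and the integer part are ignored.
    public static int CountLeadingFractionZeros(decimal value)
    {
        decimal fraction = Math.Abs(value) % 1m; // keep only the fractional part
        if (fraction == 0m)
            return 0;

        int zeros = 0;
        // Each multiplication shifts one digit in front of the decimal point.
        while ((fraction *= 10m) < 1m)
            zeros++;
        return zeros;
    }
}
```

Because the fraction is always below 1, the multiplication cannot overflow, and the loop runs at most 28 times.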

Double every time brings different values

This is my generating algorithm. It fills an array with random double elements whose sum must be 1:
public static double [] GenerateWithSumOfElementsIsOne(int elements)
{
double sum = 1;
double [] arr = new double [elements];
for (int i = 0; i < elements - 1; i++)
{
arr[i] = RandomHelper.GetRandomNumber(0, sum);
sum -= arr[i];
}
arr[elements - 1] = sum;
return arr;
}
And the method helper
public static double GetRandomNumber(double minimum, double maximum)
{
Random random = new Random();
return random.NextDouble() * (maximum - minimum) + minimum;
}
My test cases are:
[Test]
[TestCase(7)]
[TestCase(5)]
[TestCase(4)]
[TestCase(8)]
[TestCase(10)]
[TestCase(50)]
public void GenerateWithSumOfElementsIsOne(int num)
{
Assert.AreEqual(1, RandomArray.GenerateWithSumOfElementsIsOne(num).Sum());
}
The problem is that each test run returns a different value, as in these cases:
Expected: 1
But was: 0.99999999999999967d
Expected: 1
But was: 0.99999999999999989d
On the next run, sometimes all of the tests pass and sometimes they don't.
I know this is a rounding problem and would appreciate some help, dear experts :)

https://en.wikipedia.org/wiki/Floating-point_arithmetic
In computing, floating-point arithmetic is arithmetic using formulaic
representation of real numbers as an approximation so as to support a
trade-off between range and precision. For this reason, floating-point
computation is often found in systems which include very small and
very large real numbers, which require fast processing times. A number
is, in general, represented approximately to a fixed number of
significant digits (the significand) and scaled using an exponent in
some fixed base; the base for the scaling is normally two, ten, or
sixteen.
In short, this is what floats do: they cannot hold every single value, so they approximate. If you would like more precision, try using a decimal instead, or compare with a tolerance epsilon (an upper bound on the relative error due to rounding in floating-point arithmetic):
var ratio = a / b;
var diff = Math.Abs(ratio - 1);
return diff <= epsilon;

Rounding errors are frequent with floating-point types (like Single and Double). For example, let's compute an easy sum:
// 0.1 + 0.1 + ... + 0.1 = ? (100 times). Is it 0.1 * 100 == 10? No!
Console.WriteLine((Enumerable.Range(1, 100).Sum(i => 0.1)).ToString("R"));
Outcome:
9.99999999999998
That's why you should add a tolerance when comparing floating-point values instead of using == or !=:
// We have at least 8 correct digits,
// i.e. the absolute value of the rounding error is less than the tolerance
Assert.IsTrue(Math.Abs(RandomArray.GenerateWithSumOfElementsIsOne(num).Sum() - 1.0) < 1e-8);
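The tolerance check can be wrapped in a small helper so the intent is visible at the call site. This is only a sketch: FloatCompare and NearlyEqual are hypothetical names, and the default tolerance of 1e-9 is an arbitrary choice that should be tuned to the magnitude of the values being compared.

```csharp
using System;

public static class FloatCompare
{
    // True when a and b differ by no more than the given absolute tolerance.
    // Assumption: an absolute tolerance is adequate here because the values
    // being compared are close to 1.
    public static bool NearlyEqual(double a, double b, double tolerance = 1e-9)
    {
        return Math.Abs(a - b) <= tolerance;
    }
}
```

In the test from the question this would read Assert.IsTrue(FloatCompare.NearlyEqual(1.0, RandomArray.GenerateWithSumOfElementsIsOne(num).Sum())). NUnit's classic Assert.AreEqual also has an overload that takes a delta as its third argument, which serves the same purpose.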

Percentile algorithm

I am writing a program that finds percentile. According to eHow:
Start to calculate the percentile of your test score (as an example we’ll stick with your score of 87). The formula to use is L/N(100) = P where L is the number of tests with scores less than 87, N is the total number of test scores (here 150) and P is the percentile. Count up the total number of test scores that are less than 87. We’ll assume the number is 113. This gives us L = 113 and N = 150.
And so, according to the instructions, I wrote:
string[] n = Interaction.InputBox("Enter the data set. The numbers do not have to be sorted.").Split(',');
List<Single> x = new List<Single> { };
foreach (string i in n)
{
x.Add(Single.Parse(i));
}
x.Sort();
List<double> lowerThan = new List<double> { };
Single score = Single.Parse(Interaction.InputBox("Enter the number."));
uint length = (uint)x.Count;
foreach (Single index in x)
{
if (index > score)
{
lowerThan.Add(index);
}
}
uint lowerThanCount = (uint)lowerThan.Count();
double percentile = lowerThanCount / length * 100;
MessageBox.Show("" + percentile);
Yet the program always returns 0 as the percentile! What errors have I made?

Your calculation
double percentile = lowerThanCount / length * 100;
is all done in integers, since the right-hand side consists entirely of integers. At least one of the operands must be of a floating-point type. So:
double percentile = (float) lowerThanCount / length * 100;

This is effectively a truncation problem: lowerThanCount and length are both uint, so they don't support decimal places, and any division whose true result is a fraction below 1 (e.g. 2/5) results in 0.
For example, if we assume lowerThanCount = 10 and length = 20, the calculation would look something like
double result = (10 / 20) * 100
Mathematically this is
(10 / 20) * 100 = 0.5 * 100
but as 0.5 cannot be represented as an integer, the fractional part is truncated, which leaves you with 0, so the final calculation becomes
0 * 100 = 0;
You can fix this by forcing the calculation to work with a floating-point type instead, e.g.
double percentile = (double)lowerThanCount / length * 100;
In terms of readability, it probably makes more sense to put the cast in the calculation, given that lowerThanCount and length won't ever naturally be floating-point numbers.
Also, your code could be simplified a lot using LINQ:
string[] parts = Interaction.InputBox("Enter the data set. The numbers do not have to be sorted.")
    .Split(',');
// Parse and sort the scores; ToList() is needed because OrderBy returns a lazy sequence
List<float> scores = parts.Select(p => float.Parse(p))
    .OrderBy(v => v)
    .ToList();
float score = float.Parse(Interaction.InputBox("Enter the number."));
List<float> lowerThan = scores.Where(s => s < score).ToList();
float percentile = (float)lowerThan.Count / scores.Count;
// The "P" format multiplies by 100 and appends a percent sign
MessageBox.Show(percentile.ToString("P"));

The problem is in the types that you used for your variables. In this expression
double percentile = lowerThanCount / length * 100;
//                  ^^^^^^^^^^^^^^^^^^^^^^^
//                  This is integer division; since lowerThanCount < length, its result is zero
the division is done on integers, so the result is going to be zero.
Change the type of lowerThanCount to double to fix this problem:
double lowerThanCount = (double)lowerThan.Count();

You are using integer division instead of floating-point division. Cast either length or lowerThanCount to a floating-point type before dividing.

Besides the percentile calculation (should be with floats), I think your count is off here:
foreach (Single index in x)
{
if (index > score)
{
lowerThan.Add(index);
}
}
You go through the values and, if they are larger than score, you put them into lowerThan. To collect the lower values, the comparison should be index < score.
Just a logical mistake?
EDIT: for the percentile problem, here is my fix:
double percentile = ((double)lowerThanCount / (double)length) * 100.0;
You might not need all the (double)'s there, but just to be safe...
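Putting both fixes together (counting the values below the score, and dividing in floating point), a minimal sketch might look like this. Stats and PercentileOf are hypothetical names, and the input-box handling from the question is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Stats
{
    // Percentile rank of score: the share of values strictly below it,
    // times 100 (the L / N * 100 formula quoted in the question).
    public static double PercentileOf(IReadOnlyCollection<float> values, float score)
    {
        if (values.Count == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));

        int lower = values.Count(v => v < score); // L: values below the score
        // Cast before dividing so the division is not done in integers.
        return (double)lower / values.Count * 100.0;
    }
}
```

Sorting is not needed for this calculation, since only the count of lower values matters.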

Unexpected decimal value behavior

I used to think I understood the difference between decimal and double values, but now I'm not able to explain the behavior of this code snippet.
I need to divide the difference between two decimal numbers into some intervals, for example:
decimal minimum = 0.158m;
decimal maximum = 64.0m;
decimal delta = (maximum - minimum) / 6; // 10.640333333333333333333333333
Then I create the intervals in reverse order, but the first result is already unexpected:
for (int i = 5; i >= 0; i--)
{
Interval interval = new Interval(minimum + (delta * i), minimum + (delta * (i + 1)));
}
{53.359666666666666666666666665, 63.999999999999999999999999998}
I would expect the maximum value to be exactly 64. What am I missing here?
Thank you very much!
EDIT: if I use double instead of decimal, it seems to work properly!

You're not missing anything. This is the result of the numbers being rounded several times internally, i.e. a compounding loss of precision. To begin with, delta isn't exactly 10.640333333333333333333333333: the 3s repeat endlessly, so the stored value is already rounded, and that error is scaled up when you multiply by it.
Maybe you could do it like this instead:
for (decimal i = maximum; i >= delta; i -= delta)
{
Interval interval = new Interval(i - delta, i);
}

double has about 15-16 significant digits, while decimal has 28-29. With fewer digits to carry, a double result is more likely than a decimal one to round to the value you expect.
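One way to keep the end points exact is to avoid storing the rounded delta at all: compute each boundary as minimum + range * i / count, and pin the last boundary to maximum. The following is only a sketch; Intervals and Boundaries are hypothetical names, and it returns the boundaries rather than the Interval objects from the question.

```csharp
using System;

public static class Intervals
{
    // Returns count + 1 boundaries running from minimum to maximum inclusive.
    public static decimal[] Boundaries(decimal minimum, decimal maximum, int count)
    {
        if (count <= 0)
            throw new ArgumentOutOfRangeException(nameof(count));

        decimal range = maximum - minimum;
        var result = new decimal[count + 1];
        for (int i = 0; i < count; i++)
        {
            // Multiply first, divide last: each boundary is rounded only once
            // instead of inheriting the rounding error of a stored delta.
            result[i] = minimum + range * i / count;
        }
        result[count] = maximum; // the last boundary is exact by construction
        return result;
    }
}
```

Interval i then runs from boundary i to boundary i + 1, so adjacent intervals share exactly the same end point.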

Evenly divide a dollar amount (decimal) by an integer

I need to write an accounting routine for a program I am building that will give me an even division of a decimal by an integer. So that for example:
$143.13 / 5 =
28.62
28.62
28.63
28.63
28.63
I have seen the article here: Evenly divide in c#, but it seems like it only works for integer divisions. Any idea of an elegant solution to this problem?

Calculate the amounts one at a time, and subtract each amount from the total to make sure that you always have the correct total left:
decimal total = 143.13m;
int divider = 5;
while (divider > 0) {
decimal amount = Math.Round(total / divider, 2);
Console.WriteLine(amount);
total -= amount;
divider--;
}
result:
28,63
28,62
28,63
28,62
28,63

You can solve this (in cents) without constructing an array:
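int a = (int)(100 * amount);    // total in cents
int low_value = a / n;          // the smaller share
int high_value = low_value + 1; // the larger share
int num_highs = a % n;          // how many parts get the larger share
int num_lows = n - num_highs;   // how many parts get the smaller share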

It's easier to deal with cents. I would suggest that instead of 143.13, you divide 14313 into 5 equal parts. Which gives you 2862 and a remainder of 3. You can assign this remainder to the first three parts or any way you like. Finally, convert the cents back to dollars.
Also notice that you will always get a remainder less than the number of parts you want.
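The cents approach can be sketched as a small method. MoneySplit and SplitEvenly are hypothetical names, and the sketch assumes a non-negative total with at most two decimal places. It puts the larger shares last, to match the order shown in the question.

```csharp
using System;

public static class MoneySplit
{
    // Splits total into shares that differ by at most one cent.
    // Assumptions: total is non-negative and has at most two decimal places.
    public static decimal[] SplitEvenly(decimal total, int parts)
    {
        if (parts <= 0)
            throw new ArgumentOutOfRangeException(nameof(parts));

        long cents = (long)(total * 100m); // 143.13 -> 14313
        long baseShare = cents / parts;    // 14313 / 5 -> 2862
        long extra = cents % parts;        // 14313 % 5 -> 3 leftover cents

        var shares = new decimal[parts];
        for (int i = 0; i < parts; i++)
        {
            // The last `extra` shares each receive one additional cent.
            long share = baseShare + (i >= parts - extra ? 1 : 0);
            shares[i] = share / 100m;      // convert back to dollars
        }
        return shares;
    }
}
```

For 143.13 split 5 ways this gives 28.62, 28.62, 28.63, 28.63, 28.63, and the shares always add up to the original total.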

First of all, make sure you don't use a floating point number to represent dollars and cents (see other posts for why, but the simple reason is that not all decimal numbers can be represented as floats, e.g., $1.79).
Here's one way of doing it:
decimal total = 143.13m;
int numberOfEntries = 5;
// Round the per-entry amount down to whole cents (28.626 -> 28.62)
decimal unadjustedEntryAmount = Math.Floor(total / numberOfEntries * 100) / 100;
decimal leftoverAmount = total - (unadjustedEntryAmount * numberOfEntries);
int numberOfPenniesToDistribute = (int)(leftoverAmount * 100);
int numberOfUnadjustedEntries = numberOfEntries - numberOfPenniesToDistribute;
So now you have the unadjusted amounts of 28.62, and then you have to decide how to distribute the remainder. You can either distribute an extra penny to each one starting at the top or at the bottom (looks like you want from the bottom).
for (int i = 0; i < numberOfUnadjustedEntries; i++) {
Console.WriteLine(unadjustedEntryAmount);
}
for (int i = 0; i < numberOfPenniesToDistribute; i++) {
Console.WriteLine(unadjustedEntryAmount + 0.01m);
}
You could also add the entire remainder to the first or last entries. Finally, depending on the accounting needs, you could also create a separate transaction for the remainder.

If you have a float that is guaranteed exactly two digits of precision, what about this:
int cents = (int)Math.Round(amount * 100); // convert to cents
int[] amounts = new int[divisor];
// give every entry the same base share
for (int i = 0; i < divisor; i++) amounts[i] = cents / divisor;
// hand out the leftover cents one at a time
int extra = cents % divisor;
for (int i = 0; i < extra; i++) amounts[i]++;
and then do whatever you want with amounts, which are in cents - you could convert back to floats if you absolutely had to, or format as dollars and cents.
If not clear, the point of all this is not just to divide a float value evenly but to divide a monetary amount as evenly as possible, given that cents are an indivisible unit of USD. To the OP: let me know if this isn't what you wanted.

You can use the algorithm in the question you're referencing by multiplying by 100, using the integer even-division function, and then dividing each of the results by 100 (assuming you only want to handle 2 dp; if you want 3 dp, multiply by 1000, etc.).

It is also possible to use C# iterator generation to make Guffa's answer more convenient:
public static IEnumerable<decimal> Divide(decimal amount, int numBuckets)
{
while(numBuckets > 0)
{
// determine the next amount to return...
var partialAmount = Math.Round(amount / numBuckets, 2);
yield return partialAmount;
// reduce the remaining amount and #buckets
// to account for previously yielded values
amount -= partialAmount;
numBuckets--;
}
}
