Google Code Jam 2013 R1B - Falling Diamonds - c#

Yesterday's Code Jam had a question titled Falling Diamonds. The full text can be found here, but in summary:
Diamonds fall down the Y axis.
If a diamond hits point to point with another diamond, there is a 50/50 chance it will slide to the right or left, provided it is not blocked from doing so.
If a diamond is blocked from sliding one direction, it will always slide the other way.
If a diamond is blocked in both directions, it will stop and rest on the blocking diamonds.
If a diamond hits the ground, it will bury itself half way, then stop.
The orientation of the diamond never changes, i.e. it will slide or sink, but not tumble.
The objective is to find the probability that a diamond will rest at a given coordinate, assuming N diamonds fall.
The above requirements basically boil down to the diamonds building successively larger pyramids, one layer at a time.
Suffice to say, I have not been able to solve this problem to Google's satisfaction. I get the sample from the problem description correct, but fail on the actual input files. Ideally I would like to see a matched input and correct output file that I can play with to try and find my error. Barring that, I would also welcome comments on my code.
In general, my approach is to find how many layers are needed to have one which contains the coordinate. Once I know which layer I am looking at, I can determine a number of values relevant to the layer and the point we are trying to reach: how many diamonds are in the pyramid when this layer is empty, how many diamonds can stack up on a side before the rest are forced the other way, how many have to slide in the same direction to reach the desired point, etc.
I then check to see if the number of diamonds dropping either makes it impossible to reach the point (probability 0), or guarantees we will cover the point (probability 1). The challenge is in the middle ground where it is possible but not guaranteed.
For the middle ground, I first check to see if we are dropping enough to potentially fill a side and force the remaining drops to slide in the opposite direction. The reason is that in this condition we can guarantee that a certain number of diamonds will slide to each side, which reduces the number of drops we have to worry about and resolves the problem of the probability changing when a side gets full. Example: if 12 diamonds drop, it is guaranteed that each side of the outer layer will have 2 or more diamonds in it; whether a given side has 2, 3, or 4 depends on the outcome of just 2 drops, not of all 6 that fall in this layer.
Once I know how many drops are relevant to success, and the number that have to break the same way in order to cover the point, I sum the probabilities that the requisite number, or more, will go the same way.
As I said, I can solve the sample in the problem description, but I am not getting the correct output for the input files. Unfortunately I have not been able to find anything telling me what the correct output is so that I can compare it to what I am getting. Here is my code (I have spent a fair amount of time since the contest ended trying to tune this for success and adding comments to keep from getting myself lost):
protected string Solve(string Line)
{
string[] Inputs = Line.Split();
int N = int.Parse(Inputs[0]);
int X = int.Parse(Inputs[1]);
int Y = int.Parse(Inputs[2]);
int AbsX = X >= 0 ? X : -X;
int SlideCount = AbsX + Y; //number that have to stack up on one side of desired layer in order to force the remaining drops to slide the other way.
int LayerCount = (SlideCount << 1) | 1; //Layer is full when both sides have reached slidecount, and one more drops
int Layer = SlideCount >> 1; //Zero based Index of the layer is 1/2 the slide count
int TotalLayerEmpty = ((Layer * Layer) << 1) - Layer; //Total number of drops required to fill the layer below the desired layer
int LayerDrops = N - TotalLayerEmpty; //how many will drop in this layer
int MinForTarget; //Min number that have to be in the layer to hit the target location, i.e. all fall to correct side
int TargetCovered; //Min number that have to be in the layer to guarantee the target is covered
if (AbsX == 0)
{//if target X is 0 we need the layer to be full for coverage (top one would slide off until both sides were full)
MinForTarget = TargetCovered = LayerCount;
}
else
{
MinForTarget = Y + 1; //Need Y + 1 to hit an altitude of Y
TargetCovered = MinForTarget + SlideCount; //Min number that have to be in the layer to guarantee the target is covered
}
if (LayerDrops >= TargetCovered)
{//if we have enough dropping to guarantee the target is covered, probability is 1
return "1.0";
}
else if (LayerDrops < MinForTarget)
{//if we do not have enough dropping to reach the target under any scenario, probability is 0
return "0.0";
}
else
{//We have enough dropping that reaching the target is possible, but not guaranteed
int BalancedDrops = LayerDrops > SlideCount ? LayerDrops - SlideCount : 0; //guaranteed to have this many on each side
int CriticalDrops = LayerDrops - (BalancedDrops << 1);//the number of drops relevant to the probablity of success
int NumToSucceed = MinForTarget - BalancedDrops;//How many must break our way for success
double SuccessProb = 0;//Probability that the number of diamonds sliding the same way is between NumToSucceed and CriticalDrops
double ProbI;
for (int I = NumToSucceed; I <= CriticalDrops; I++)
{
ProbI = Math.Pow(0.5, I); //Probability that I diamonds will slide the same way
SuccessProb += ProbI;
}
return SuccessProb.ToString();
}
}

Your general approach seems to fit the problem, though the calculation of the last probability is not completely correct.
Let me describe how I solved this. We are looking at pyramids. These pyramids can be assigned a layer, based on how many diamonds the pyramid has. A pyramid of layer 1 has only 1 diamond. A pyramid of layer 2 has 1 + 2 + 3 diamonds. A pyramid of layer 3 has 1 + 2 + 3 + 4 + 5 diamonds. A pyramid of layer n has 1 + 2 + 3 + ... + 2*n-1 diamonds, which equals (2 * n - 1) * n.
Given this, we can calculate the layer of the biggest pyramid we are able to build with a given number of diamonds:
layer = floor( ( sqrt( 1 + 8 * diamonds ) + 1 ) / 4 )
and the number of diamonds which are not needed in order to build this pyramid. These diamonds will start to fill the next bigger pyramid:
overflow = diamonds - layer * ( 2 * layer - 1 )
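As a quick check of the two formulas above, here is a minimal Python sketch. The function name largest_full_pyramid is made up for illustration, and math.isqrt is used instead of a floating-point sqrt so the floor stays exact for large diamond counts.

```python
import math

def largest_full_pyramid(diamonds):
    """Return (layer, overflow) for a given number of dropped diamonds.

    layer    -- size of the biggest complete pyramid, holding layer * (2 * layer - 1) diamonds
    overflow -- diamonds left over, which start filling the next bigger pyramid
    """
    # Integer form of floor((sqrt(1 + 8 * diamonds) + 1) / 4).
    layer = (math.isqrt(1 + 8 * diamonds) + 1) // 4
    overflow = diamonds - layer * (2 * layer - 1)
    return layer, overflow

for n in (1, 6, 7, 15):
    print(n, largest_full_pyramid(n))
```

For 7 diamonds this gives layer 2 with 1 diamond of overflow, since the pyramid of layer 2 needs exactly 1 + 2 + 3 = 6.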
We can now see the following things:
If the point lies inside the pyramid of size layer, it is covered, so p = 1.0.
If the point lies outside the pyramid of size layer + 1 (i.e. the next bigger pyramid), it cannot be covered, so p = 0.0.
If the point lies inside the pyramid of size layer + 1 but outside the pyramid of size layer, it might be covered, so 0 <= p <= 1.
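The three cases can be sketched in Python as follows. This assumes the coordinate convention of the problem statement, where the first diamond rests at (0, 0), so that a point (x, y) belongs to the pyramid of size n exactly when |x| + y <= 2 * (n - 1). The helper name classify_point is hypothetical, and the point is assumed to be a valid diamond position (x + y even, y >= 0).

```python
def classify_point(layer, x, y):
    """Classify (x, y) against the complete pyramid of size `layer`.

    Returns 1.0 if the point is certainly covered, 0.0 if it cannot be
    covered, and None if it lies in the partially filled next pyramid.
    """
    reach = abs(x) + y                  # how far out the point lies from the origin
    if layer >= 1 and reach <= 2 * (layer - 1):
        return 1.0                      # inside the complete pyramid
    if reach > 2 * layer:
        return 0.0                      # outside even the next bigger pyramid
    return None                         # on the outer shell: needs the probability step

print(classify_point(2, 1, 1), classify_point(2, -3, 1), classify_point(2, 5, 1))
```

Only the None case needs any probability calculation.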
Since we only need to solve the last problem, we can simplify the problem statement a little bit: Given are the two sides of the triangle, r and l. Each side has a fixed capacity, the maximum number of diamonds it can take. What is the probability for one configuration (nr, nl), where nr denotes the diamonds on the right side, nl denotes the diamonds on the left side and nr + nl = overflow?
This probability can be calculated using Bernoulli trials:
P( nr ) = binomial_coefficient( overflow, nr ) * pow( 0.5, overflow )
However, this will fail in one case: If one side is completely filled with diamonds, the probabilities change. The probability, that the diamond falls on the completely filled side is now 0, while the probability for the other side is 1.
Assume the following case: Each side can take up to 4 diamonds, while 6 diamonds are still left. The interesting case is now P( 2 ), because in this case, the left side will take 4 diamonds.
Some examples how the 6 diamonds could fall down. r stands for the decision go right, while l stands for go left:
l r l r l l => For every diamond, the probability for each side was 0.5. This case doesn't differ from the unrestricted case. The probability for exactly this sequence is pow( 0.5, 6 ). There are 10 different cases like this: the last diamond goes left, and the number of cases is the number of ways the two r decisions can be placed among the first 5 drops: binomial_coefficient( 5, 2 ) = 10.
l r l l l r => The last diamond had to fall on the right side, because the left side was full. The last probability was 1 for the right side and 0 for the left side. The probability for exactly this sequence is pow( 0.5, 5 ). There are 4 different cases like this (rllllr, lrlllr, llrllr, lllrlr): binomial_coefficient( 4, 1 ) = 4.
l l l l r r => The last two diamonds had to fall on the right side, because the left side was full. The last two probabilities were 1 for the right side and 0 for the left side. The probability for exactly this sequence is pow( 0.5, 4 ). There is exactly one case like this, because binomial_coefficient( 3, 0 ) = 1.
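The counts in these three examples can be double-checked by brute force. The following Python sketch enumerates every sequence with four l and two r decisions for this specific example (capacity 4, overflow 6) and groups them by how many trailing drops were forced. The names are made up for illustration, and the enumeration is only meant for this small case where the right side can never fill up.

```python
from collections import Counter
from itertools import product

CAPACITY = 4   # diamonds each side can take
OVERFLOW = 6   # diamonds still to fall

def forced_tail_counts(capacity=CAPACITY, overflow=OVERFLOW):
    """Group the drop sequences that fill the left side by their number of forced drops."""
    counts = Counter()
    for seq in product("lr", repeat=overflow):
        if seq.count("l") != capacity:
            continue                        # the left side does not end up full
        # Number of drops made up to and including the one that fills the left side.
        filled_at = [i for i, c in enumerate(seq) if c == "l"][capacity - 1] + 1
        forced = overflow - filled_at       # every later drop is forced to the right
        counts[forced] += 1
    return counts

counts = forced_tail_counts()
# A sequence with `forced` forced drops involves only OVERFLOW - forced coin flips.
probability = sum(n * 0.5 ** (OVERFLOW - forced) for forced, n in counts.items())
print(dict(counts), probability)
```

It finds 10, 4 and 1 sequences with 0, 1 and 2 forced drops, and a total probability of 22/64 for P( 2 ).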
The general algorithm is to assume that the last 0, 1, 2, 3, ..., nr elements will inevitably go to the right side, then to calculate the probability for each of these cases (the last 0, 1, 2, 3, ..., nr probabilities will be 1) and multiply each probability by the number of different cases where the last 0, 1, 2, 3, ..., nr probabilities are 1.
See the following code. p will be the probability for the case that nr diamonds will go on the right side and the left side is full:
p = 0.0
for i in range( nr + 1 ):
p += pow( 0.5, overflow - i ) * binomial_coefficient( overflow - i - 1, nr - i )
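A runnable version of the loop above, as a small Python sketch. It assumes Python 3.8 or later for math.comb, and it assumes that the left side is full, i.e. overflow - nr equals the capacity of one side. The function name prob_left_full is made up for illustration.

```python
from math import comb

def prob_left_full(overflow, nr):
    """Probability that exactly nr diamonds end up on the right while the left side is full."""
    p = 0.0
    for i in range(nr + 1):
        # The last i drops are forced to the right (probability 1 each), the drop
        # just before them fills the left side, and the earlier overflow - i - 1
        # drops contain the remaining nr - i free decisions for the right.
        p += 0.5 ** (overflow - i) * comb(overflow - i - 1, nr - i)
    return p

print(prob_left_full(6, 2))  # 0.34375, i.e. 10/64 + 4/32 + 1/16
```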
Now that we can calculate the probability for each individual combination (nr, nl), one can simply add up all cases where nr >= k, with k being the minimal number of diamonds on one side for which the required point is still covered.
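To tie the pieces together, here is a self-contained Python sketch of a complete solver. It is not the linked code. Instead of the closed-form sum it tracks the probability of every possible fill level of the point's side drop by drop, which handles the "one side is full" case directly. It assumes the problem's coordinate convention (first diamond at (0, 0), valid points have x + y even and y >= 0), and all names are made up for illustration.

```python
def falling_diamonds(n, x, y):
    """Probability that a diamond rests at (x, y) after n diamonds have fallen."""
    shell = (abs(x) + y) // 2            # 0-based index of the layer holding the point
    inner = shell * (2 * shell - 1)      # diamonds in the complete pyramid below that layer
    capacity = 2 * shell                 # diamonds one side of this layer can take
    overflow = n - inner                 # diamonds that land in this layer
    if overflow <= 0:
        return 0.0
    if overflow >= 2 * capacity + 1:
        return 1.0                       # the layer is complete, tip included
    if x == 0:
        return 0.0                       # the tip is placed last, once both sides are full
    needed = y + 1                       # diamonds required on the point's side
    dist = {0: 1.0}                      # dist[k]: probability of k diamonds on the point's side
    for dropped in range(overflow):
        nxt = {}
        for mine, p in dist.items():
            other = dropped - mine
            if mine == capacity:         # our side is full: forced to the other side
                moves = [(mine, p)]
            elif other == capacity:      # other side is full: forced to our side
                moves = [(mine + 1, p)]
            else:                        # both sides open: 50/50
                moves = [(mine, p / 2), (mine + 1, p / 2)]
            for k, q in moves:
                nxt[k] = nxt.get(k, 0.0) + q
        dist = nxt
    return sum(p for k, p in dist.items() if k >= needed)

print(falling_diamonds(3, 2, 0), falling_diamonds(3, 1, 1), falling_diamonds(4, 1, 1))
```

These small cases can be checked by hand: with 3 diamonds, (2, 0) is missed only if both sliding diamonds go left (p = 0.75), and (1, 1) is covered only if both go right (p = 0.25).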
See the complete python code I used for this problem: https://github.com/frececroka/codejam-2013-falling-diamonds/blob/master/app.py

Your assumptions are oversimplified. You can download the correct answers for the large dataset, calculated with my solution:
http://pastebin.com/b6xVhp9U
You have to count all the possible combinations of diamonds that will occupy your point of interest. To do that I used this formula:
https://math.stackexchange.com/a/382123/32707
You basically have to:
Calculate the height of the pyramid (i.e. count the FIXED diamonds)
Calculate the number of diamonds that can freely move to the left or to the right
Calculate the probability (with sums of binomial coefficients)
With the latter and the point's Y you can apply that formula to calculate the probability.
Note that you have to check whether the point is inside or outside the fixed pyramid, and you also have to do other minor checks.
Also, don't worry if you are not able to solve this problem, because it was pretty tough. If you want my solution in PHP, here it is:
<?php
set_time_limit(0);
$data = file('2bl.in',FILE_IGNORE_NEW_LINES);
$number = array_shift($data);
for( $i=0;$i<$number;$i++ ) {
$firstLine = array_shift($data);
$firstLine = explode(' ',$firstLine);
$s = $firstLine[0];
$x = $firstLine[1];
$y = $firstLine[2];
$s = calcCase( $s,$x,$y );
appendResult($i+1,$s);
}
function calcCase($s,$x,$y) {
echo "S: [$s] P($x,$y)\n<br>";
$realH = round(calcH($s),1);
echo "RealHeight [$realH] ";
$h = floor($realH);
if (isEven($h))
$h--;
$exactDiamonds = progression($h);
movableDiamonds($s,$h,$exactDiamonds,$movableDiamonds,$unfullyLevel);
$widthLevelPoint = $h-$y;
$spacesX = abs($x) - $widthLevelPoint;
$isFull = (int)isFull($s,$exactDiamonds);
echo "Diamonds: [$s], isFull [$isFull], Height: [$h], exactDiamonds [$exactDiamonds], movableDiamonds [$movableDiamonds], unfullyLevel [$unfullyLevel] <br>
widthlevel [$widthLevelPoint],
distance from pyramid (horizontal) [$spacesX]<br> ";
if ($spacesX>1)
return '0.0';
$pointHeight = $y+1;
if ($x==0 && $pointHeight > $h) {
return '0.0';
}
if ($movableDiamonds==0) {
echo 'Fixed pyramid';
if ( $y<=$h && abs($x) <= $widthLevelPoint )
return '1.0';
else
return '0.0';
}
if ( !$isFull ) {
echo "Pyramid Not Full ";
if ($spacesX>0)
return '0.0';
if ($unfullyLevel == $widthLevelPoint)
return '0.5';
else if ($unfullyLevel > $widthLevelPoint)
return '0.0';
else
return '1.0';
}
echo "Pyramid full";
if ($spacesX<=0)
return '1.0';
if ($movableDiamonds==0)
return '0.0';
if ( $movableDiamonds > ($h+1) ) {
$otherDiamonds = $movableDiamonds - ($h+1);
if ( $otherDiamonds - $pointHeight >= 0 ) {
return '1.0';
}
}
$totalWays = totalWays($movableDiamonds);
$goodWays = goodWays($pointHeight,$movableDiamonds,$totalWays);
echo "<br>GoodWays: [$goodWays], totalWays: [$totalWays]<br>";
return sprintf("%1.7f",$goodWays / $totalWays);
}
function goodWays($pointHeight,$movableDiamonds,$totalWays) {
echo "<br>Point height [$pointHeight] ";
if ($pointHeight>$movableDiamonds)
return 0;
if ( $pointHeight == $movableDiamonds )
return 1;
$good = sumsOfBinomial( $movableDiamonds, $pointHeight );
return $good;
}
function totalWays($diamonds) {
return pow(2,$diamonds);
}
function sumsOfBinomial( $n, $k ) {
$sum = 1; //> Last element (n;n)
for($i=$k;$i<($n);$i++) {
$bc = binomial_coeff($n,$i);
//echo "<br>Binomial Coeff ($n;$i): [$bc] ";
$sum += $bc;
}
return $sum;
}
// calculate binomial coefficient
function binomial_coeff($n, $k) {
$j = $res = 1;
if($k < 0 || $k > $n)
return 0;
if(($n - $k) < $k)
$k = $n - $k;
while($j <= $k) {
$res = bcmul($res, $n--);
$res = bcdiv($res, $j++);
}
return $res;
}
function isEven($n) {
return !($n&1);
}
function isFull($s,$exact) {
return ($exact <= $s);
}
function movableDiamonds($s,$h,$exact,&$movableDiamonds,&$level) {
$baseWidth = $h;
$level=$baseWidth;
//> Full pyramid
if ( isFull($s,$exact) ) {
$movableDiamonds = ( $s-$exact );
return;
}
$movableDiamonds = $s;
while( $level ) {
//echo "<br> movable [$movableDiamonds] removing [$level] <br>" ;
if ($level > $movableDiamonds)
break;
$movableDiamonds = $movableDiamonds-$level;
$level--;
if ($movableDiamonds<=0)
break;
}
return $movableDiamonds;
}
function progression($n) {
return (1/2 * $n *(1+$n) );
}
function calcH($s) {
if ($s<=3)
return 1;
$sqrt = sqrt(1+(4*2*$s));
//echo "Sqrt: [$sqrt] ";
return ( $sqrt-1 ) / 2;
}
function appendResult($caseNumber,$string) {
static $first = true;
//> Cleaning file
if ($first) {
file_put_contents('result.out','');
$first=false;
}
$to = "Case #{$caseNumber}: {$string}";
file_put_contents( 'result.out' ,$to."\n",FILE_APPEND);
echo $to.'<br>';
}

Check if root of cubic equation is complex or not?

I use this Cubic root implementation.
I have equation #1:
x³ -2 x² -5 x + 6 = 0
It gives me 3 complex roots ({real, imaginary}):
{-2, 7.4014868308343765E-17}
{1 , -2.9605947323337506E-16}
{3 , 2.9605947323337506E-16}
But in fact, the right result should be 3 non-complex roots: -2, 1, 3.
In this case I can test the results: substituting the 3 complex roots into the equation returns a non-zero result (fail); substituting the 3 non-complex roots returns zero (pass).
But there are cases where substituting either the 3 complex roots or the 3 non-complex roots into the equation (e.g. 47 x³ + 7 x² - 52 x + 0 = 0) returns non-zero (fail).
I think what causes this issue is because of this code:
/// <summary>
/// Evaluate all cubic roots of this <c>Complex</c>.
/// </summary>
public static (Complex, Complex, Complex) CubicRoots(this Complex complex)
{
var r = Math.Pow(complex.Magnitude, 1d/3d);
var theta = complex.Phase/3;
const double shift = Constants.Pi2/3;
return (Complex.FromPolarCoordinates(r, theta),
Complex.FromPolarCoordinates(r, theta + shift),
Complex.FromPolarCoordinates(r, theta - shift));
}
I know that floating point values can lose precision during calculation (~1E-15), but the problem is that I need to decide whether the imaginary part is zero or non-zero in order to tell whether the root is a complex number or not.
I can't tell the user of my app: "hey user, if you see the imaginary part is close enough to 0, you can decide for yourself that the root's not a complex number".
Currently, I use this method to check:
const int TOLERATE = 15;
bool isRemoveImaginary = System.Math.Round(root.Imaginary, TOLERATE) == 0; //Remove imaginary if it's too close to zero
But I don't know if this method is appropriate. What if TOLERATE = 15 is not enough? Is it the right method to solve this problem at all?
So I want to ask: is there a better way to tell whether a root is complex or not?
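One common alternative to rounding at a fixed number of decimals is to compare the imaginary part against the size of the root itself. The Python sketch below illustrates that idea only. The function names and both tolerance values are assumptions picked for the example, not recommended constants.

```python
def is_effectively_real(root, rel_tol=1e-9, abs_tol=1e-12):
    """Treat root as real when its imaginary part is negligible.

    The imaginary part is compared against the root's own magnitude
    (relative tolerance), with an absolute floor for roots near zero.
    """
    return abs(root.imag) <= max(rel_tol * abs(root), abs_tol)

def clean(root, **tolerances):
    """Drop a negligible imaginary part, otherwise return the root unchanged."""
    return complex(root.real, 0.0) if is_effectively_real(root, **tolerances) else root

roots = [complex(-2, 7.4014868308343765e-17),
         complex(1, -2.9605947323337506e-16),
         complex(3, 2.9605947323337506e-16)]
print([clean(r) for r in roots])
```

A relative test scales with the root, so a root of size 1E6 with an imaginary part of 1E-5 is still treated as real, which a fixed Round(..., 15) would not do.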
Thank you Mark Dickinson.
So according to Wikipedia:
delta > 0: the cubic has three distinct real roots
delta < 0: the cubic has one real root and two non-real complex conjugate roots.
The delta D = (B*B - 4*A*A*A)/(-27*a*a)
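For reference, the sign test can also be done directly on the coefficients. The Python sketch below uses the standard discriminant of a general cubic a x³ + b x² + c x + d as listed on Wikipedia, not the implementation-specific expression above. The function names are made up for illustration. With integer coefficients the result is exact in Python; with floating point coefficients a discriminant very close to zero is subject to rounding.

```python
def cubic_discriminant(a, b, c, d):
    """Discriminant of a*x^3 + b*x^2 + c*x + d (general cubic formula)."""
    return (18 * a * b * c * d
            - 4 * b ** 3 * d
            + b ** 2 * c ** 2
            - 4 * a * c ** 3
            - 27 * a ** 2 * d ** 2)

def count_real_roots(a, b, c, d):
    """Number of distinct real roots according to the sign of the discriminant."""
    delta = cubic_discriminant(a, b, c, d)
    if delta > 0:
        return 3       # three distinct real roots
    if delta < 0:
        return 1       # one real root and a complex conjugate pair
    return None        # delta == 0: repeated root, not decided by the sign test alone

print(cubic_discriminant(1, -2, -5, 6))   # 900 for x³ - 2x² - 5x + 6
print(count_real_roots(-53, 6, 14, -54))  # 1
```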
My idea is:
delta > 0: remove the imaginary parts of all 3 roots.
delta < 0: find the real root, then remove its imaginary part if any (to make sure it's real). Leave the other 2 roots untouched.
Now I have 2 ideas to find the real root:
Idea #1
In theory, the real root should have imaginary = 0, but due to floating point precision, imaginary can deviate from 0 a little (e.g. imaginary = 1E-15 instead of 0). So the idea is: the 1 real root among 3 roots should have the imaginary whose value is closest to 0.
Code:
NumComplex[] arrRoot = { x1, x2, x3 };
if (delta > 0)
{
for (var idxRoot = 0; idxRoot < arrRoot.Length; ++idxRoot)
arrRoot[idxRoot] = arrRoot[idxRoot].RemoveImaginary();
}
else
{
//The root with imaginary closest to 0 should be the real root,
//the other two should be non-real.
var realRootIdx = 0;
var absClosest = double.MaxValue;
double abs;
for (var idxRoot = 0; idxRoot < arrRoot.Length; ++idxRoot)
{
abs = System.Math.Abs(arrRoot[idxRoot].GetImaginary());
if (abs < absClosest)
{
absClosest = abs;
realRootIdx = idxRoot;
}
}
arrRoot[realRootIdx] = arrRoot[realRootIdx].RemoveImaginary();
}
The code above can be wrong if there are 3 roots ({real, imaginary}) like this:
{7, -1E-99}
{3, 1E-15}//1E-15 caused by floating point precision, 1E-15 should be 0
{7, 1E-99}//My code will mistake this because this is closer to 0 than 1E-15.
Maybe if that case does happen in real life, I will come up with a better way to pick the real root.
Idea #2
Take a look at how the 3 roots calculated:
x1 = FromPolarCoordinates(r, theta);
x2 = FromPolarCoordinates(r, theta + shift);
x3 = FromPolarCoordinates(r, theta - shift);
3 roots have the form (know this by tests, not proven by math):
x1 = { A }
x2 = { B, C }
x3 = { B, -C }
Use math knowledge to prove which one among the 3 roots is the real one.
Trial #1: Maybe the root x1 = FromPolarCoordinates(r, theta) is always real? (Failed.) This guess is wrong, as the following case proves: -53 x³ + 6 x² + 14 x - 54 = 0 (thanks again to Mark Dickinson).
I don't know if math can prove something like: while delta < 0, if B < 0 then x3 is real, else x1 is real?
So until I get a better idea, I'll just use idea #1.

Linear interpolation between two numbers with steps

I've a little trouble finding out how to linearly interpolate between two numbers with a defined number of intermediate steps.
Let's say I want to interpolate between 4 and 22 with 8 intermediate steps, like so: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22.
It's easy to figure out that it's x+2 here. But what if the starting value was 5 and the final value 532 with 12 intermediate steps? (In my special case I would need starting and ending value with 16 steps in between)
If you have two fence posts and you put k fence posts between them, you create k + 1 spaces. For instance:
|         |
post1     post2
Adding one post creates two spaces:
|    |    |
post1     post2
If you want those k + 1 spaces to be equal, you can divide the total distance by k + 1 to get the distance between adjacent posts.
d = 22 - 4 = 18
k = 8
e = d / (k + 1) = 18 / 9 = 2
In your other example, the answer is
d = 532 - 5 = 527
k = 12
e = d / (k + 1) = 527 / 13 ≈ 40.54
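The fence-post calculation translates directly into code. A minimal Python sketch, with a made-up function name:

```python
def interpolate(start, end, k):
    """Return start, k evenly spaced intermediate values, and end."""
    spaces = k + 1                         # k posts between two posts create k + 1 spaces
    step = (end - start) / spaces          # distance between adjacent posts
    # The end value is appended as given, so it is not affected by rounding.
    return [start + i * step for i in range(spaces)] + [end]

print(interpolate(4, 22, 8))
print(interpolate(5, 532, 12))
```

The first call produces the values 4, 6, 8, ..., 22 (as floats); the second produces 14 values from 5 to 532 in steps of about 40.54.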
I hesitate to produce two separate answers, but I feel this methodology is sufficiently different from the other one. There's a useful function which may be exactly what you need, appropriately called Mathf.Lerp() (part of Unity's Mathf class).
var start = 5;
var end = 532;
var steps = 13;
for (int i = 0; i <= steps; i++) {
// The type conversion is necessary because both i and steps are integers
var value = Mathf.Lerp(start, end, i / (float)steps);
Debug.Log(value);
}
For actually doing the linear interpolation, use Mathf.MoveTowards().
For figuring out your maximum delta (i.e. the amount you want it to move each step), take the difference, and then divide it by the number of desired steps.
var start = 4;
var end = 22;
var distance = end - start;
var steps = 9; // Your example technically has 9 steps, not 8
var delta = distance / steps;
Note that this conveniently assumes your distance is a clean multiple of steps. If you don't know that this is the case and it's important that you never exceed that number of steps, you may want to explicitly check for it and round the step up. Here's a crude example for integers (it checks the remainder of distance divided by steps; floating point methods may be more complicated):
if (distance % steps > 0) { delta += 1; }
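The same round-up can be sketched in Python, using the remainder of distance divided by steps as the test. This assumes a non-negative integer distance and a positive number of steps; the function name is made up for illustration.

```python
def step_size(start, end, steps):
    """Smallest integer step that covers end - start in at most `steps` steps."""
    distance = end - start
    delta, remainder = divmod(distance, steps)
    if remainder > 0:
        delta += 1          # round up so the number of steps is never exceeded
    return delta

print(step_size(4, 22, 9), step_size(5, 532, 13), step_size(0, 12, 5))
```

For a distance of 12 in 5 steps the truncated step would be 2, which needs 6 steps; rounding up to 3 stays within 5.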

How to interpolate through 3 points/numbers with a defined number of samples? (in c#)

So for example we have 1, 5, and 10 and we want to interpolate between these with 12 points, we should get:
1.0000
1.7273
2.4545
3.1818
3.9091
4.6364
5.4545
6.3636
7.2727
8.1818
9.0909
10.0000
Say we have 5, 10, and 4 and again 12 points; we should get:
5.0000
5.9091
6.8182
7.7273
8.6364
9.5455
9.4545
8.3636
7.2727
6.1818
5.0909
4.0000
This is a generalized solution that works by these principles:
Performs linear interpolation
It calculates a "floating point index" into the input array
This index is used to select 1 (if the fractional part is very close to 0) or 2 numbers from the input array
The integer part of this index is the base input array index
The fractional part says how far towards the next array element we should move
This should work with whatever size input arrays and output collections you would need.
public IEnumerable<double> Interpolate(double[] inputs, int count)
{
double maxCountForIndexCalculation = count - 1;
for (int index = 0; index < count; index++)
{
double floatingIndex = (index / maxCountForIndexCalculation) * (inputs.Length - 1);
int baseIndex = (int)floatingIndex;
double fraction = floatingIndex - baseIndex;
if (Math.Abs(fraction) < 1e-5)
yield return inputs[baseIndex];
else
{
double delta = inputs[baseIndex + 1] - inputs[baseIndex];
yield return inputs[baseIndex] + fraction * delta;
}
}
}
It produces the two collections of outputs you showed in your question but beyond that, I have not tested it. Little error checking is performed so you should add the necessary bits.
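For comparison, here is the same floating-index idea as a Python sketch. It is a port written for illustration, with one added guard against fewer than two inputs or output points, and the same 1e-5 cut-off for treating the fraction as zero.

```python
def interpolate(inputs, count):
    """Yield `count` values linearly interpolated through `inputs`."""
    if count < 2 or len(inputs) < 2:
        raise ValueError("need at least two inputs and two output points")
    last = count - 1
    for index in range(count):
        floating_index = (index / last) * (len(inputs) - 1)
        base = int(floating_index)          # integer part: the left-hand input element
        fraction = floating_index - base    # fractional part: how far towards the next element
        if fraction < 1e-5:
            yield inputs[base]
        else:
            yield inputs[base] + fraction * (inputs[base + 1] - inputs[base])

print([round(v, 4) for v in interpolate([1, 5, 10], 12)])
print([round(v, 4) for v in interpolate([5, 10, 4], 12)])
```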
The problem is an interpolation of two straight lines with different slopes given the end points and the intersection.
Interpolation is defined as following : In the mathematical field of numerical analysis, interpolation is a method of constructing new data points within the range of a discrete set of known data points.
I'm tired of people giving negative points for solutions to hard problems. This is not a simple problem, but one that requires "thinking outside the box". Let's look at the solution for the following input: 1 12 34
I picked these numbers because the results are all integers
The step size L (Lower) = distance of elements from 1 to 12 = 2
The step size H (Higher) = distance of elements from 12 to 34 = 4
So the answer is : 1 3 5 7 9 11 [12] 14 18 22 26 30 34
Notice the distance between the 6th point 11 and center is 1 (half of L)
Notice the distance between the center point 12 and the 7th point is 2 (half of H)
Finally notice the distance between the 6th and 7th points is 3.
My results are scaled exactly the same as the OPs first example.
It is hard to see the sequence with the fractional inputs the OP posted. If you look at the OP's first example and calculate the step distance of the first 6 points, you get 0.72. For the last 6 points the distance is 0.91. The distance from the 6th point to the center is 0.36 (half of 0.72), and from the center to the 7th point it is 0.45 (half of 0.91). Excuse me for rounding the numbers a little bit.
It is a sequence problem just like the ones in junior high school, where you learned arithmetic and geometric sequences. Then as a bonus question you got the sequence 23, 28, 33, 42, 51, 59, 68, 77, 86, which turns out to be the train stations on the NYC 3rd Ave subway system. Solving problems like this requires thinking "outside the box", a phrase that comes from the tests IBM gives to job applicants. These are the people who can solve the Nine Point Problem: http://www.brainstorming.co.uk/puzzles/ninedotsnj.html
I produced the results for the case where the number of points is EVEN, which in your case is 12. You will need to complete the code if the number of points is ODD.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
const int NUMBER_POINTS = 12;
static void Main(string[] args)
{
List<List<float>> tests = new List<List<float>>() {
new List<float>() { 1,5, 10},
new List<float>() { 5,10, 4}
};
foreach (List<float> test in tests)
{
List<float> output = new List<float>();
float midPoint = test[1];
if(NUMBER_POINTS % 2 == 0)
{
//even number of points
//add lower numbers
float lowerDelta = (test[1] - test[0])/((NUMBER_POINTS / 2) - .5F);
for (int i = 0; i < NUMBER_POINTS / 2; i++)
{
output.Add(test[0] + (i * lowerDelta));
}
float upperDelta = (test[2] - test[1]) / ((NUMBER_POINTS / 2) - .5F);
for (int i = 0; i < NUMBER_POINTS / 2; i++)
{
output.Add(test[1] + (i * upperDelta) + (upperDelta / 2F));
}
}
else
{
}
Console.WriteLine("Numbers = {0}", string.Join(" ", output.Select(x => x.ToString())));
}
Console.ReadLine();
}
}
}

Hysteresis Round to solve "flickering" values due to noise

Background:
We have an embedded system that converts linear positions (0 mm - 40 mm) from a potentiometer voltage to its digital value using a 10-bit analog to digital converter.
------------------------
0mm | | 40 mm
------------------------
We show the user the linear position at 1 mm increments. Ex. 1mm, 2mm, 3mm, etc.
The problem:
Our system can be used in electromagnetically noisy environments which can cause the linear position to "flicker" due to noise entering the ADC. For example, we will see values like: 39,40,39,40,39,38,40 etc. when the potentiometer is at 39 mm.
Since we are rounding to every 1 mm, we will see flicker between 1 and 2 if the value toggles between 1.4 and 1.6 mm for example.
Proposed software solution:
Assuming we can not change the hardware, I would like to add some hysteresis to the rounding of values to avoid this flicker. Such that:
If the value is currently at 1mm, it can go to 2mm only if the raw value is 1.8 or higher.
Likewise, if the current value is 1mm, it can go to 0mm only if the raw value is 0.2 or lower.
I wrote the following simple app to test my solution. Please let me know if I am on the right track, or if you have any advice.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace PDFSHysteresis
{
class Program
{
static void Main(string[] args)
{
double test = 0;
int curr = 0;
Random random = new Random();
for (double i = 0; i < 100; i++)
{
test = test + random.Next(-1, 2) + Math.Round((random.NextDouble()), 3);
curr = HystRound(test, curr, 0.2);
Console.WriteLine("{0:00.000} - {1}", test, curr);
}
Console.ReadLine();
}
static int HystRound(double test, int curr, double margin)
{
if (test > curr + 1 - margin && test < curr + 2 - margin)
{
return curr + 1;
}
else if (test < curr - 1 + margin && test > curr - 2 + margin)
{
return curr - 1;
}
else if (test >= curr - 1 + margin && test <= curr + 1 - margin)
{
return curr;
}
else
{
return HystRound(test, (int)Math.Floor(test), margin);
}
}
}
}
Sample output:
Raw HystRound
====== =========
00.847 1
00.406 1
01.865 2
01.521 2
02.802 3
02.909 3
02.720 3
04.505 4
06.373 6
06.672 6
08.444 8
09.129 9
10.870 11
10.539 11
12.125 12
13.622 13
13.598 13
14.141 14
16.023 16
16.613 16
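For experimenting with the margin outside the C# test app, here is a Python sketch of the same rounding rule. It is a non-recursive port written for illustration: the final fallback does directly what the recursive call on floor(test) ends up doing, assuming 0 < margin < 1.

```python
import math

def hyst_round(raw, current, margin=0.2):
    """Round raw to an integer, keeping `current` unless raw leaves its hysteresis band."""
    low = current - 1 + margin
    high = current + 1 - margin
    if low <= raw <= high:
        return current                        # still inside the band: no change
    if high < raw < current + 2 - margin:
        return current + 1                    # moved up by one step
    if current - 2 + margin < raw < low:
        return current - 1                    # moved down by one step
    base = math.floor(raw)                    # larger jump: restart from the floor
    return base + 1 if raw > base + 1 - margin else base

raws = [0.847, 0.406, 1.865, 1.521, 2.802, 2.909, 2.720, 4.505, 6.373, 6.672]
current = 0
for raw in raws:
    current = hyst_round(raw, current)
    print(f"{raw:06.3f} - {current}")
```

Fed with the raw values from the sample output above, starting from 0, it reproduces the HystRound column.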
How about using the average of readings for the last N seconds, where N could be fairly small / sub-second depending on your sample rate?
You can use a simple linear average, or something more complex, depending on your needs. Several moving average algorithms are detailed on Wikipedia:
http://en.wikipedia.org/wiki/Moving_average
Depending on your sensitivity / responsiveness needs, you could reset the average if a new reading exceeds the running average by X%.
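A minimal Python sketch of this idea: a simple moving average over the last few readings that is reset when a new reading deviates from the running average by more than a given fraction. The class name, the window size and the 20% reset threshold are assumptions chosen for the example.

```python
from collections import deque

class SmoothedReading:
    """Simple moving average that resets on a large jump."""

    def __init__(self, window=8, reset_fraction=0.2):
        self.samples = deque(maxlen=window)   # keeps only the newest `window` readings
        self.reset_fraction = reset_fraction

    def update(self, reading):
        if self.samples:
            average = sum(self.samples) / len(self.samples)
            # A reading far from the running average is treated as real movement.
            if abs(reading - average) > self.reset_fraction * abs(average):
                self.samples.clear()
        self.samples.append(reading)
        return sum(self.samples) / len(self.samples)

smoother = SmoothedReading(window=7)
for value in [39, 40, 39, 40, 39, 38, 40]:
    shown = round(smoother.update(value))
print(shown)  # 39
```

The noisy sequence from the question averages to about 39.3, so the display settles on 39, while a genuine move (say to 10 mm) resets the window and shows up immediately.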
I had to deal with something similar some time ago, where I had to read the voltage output from a circuit and display a graph on a computer screen. The bottom line is that this really depends on your system requirements. If the requirement is "1mm" accuracy, then there is nothing you can really do. Otherwise, as mentioned above, there are several methods that can help you lessen the flickering. You can:
Calculate the average of these values over a certain period of time that the user can configure.
Allow the user to set a "sensitivity threshold". This threshold can be used to decide whether to consider the new value as valid or not. In your example, the threshold could be set to 2mm, in which case values such as 39, 40, 39, 38 would read as 39mm.
Also, have you thought about putting an external stabilizer between your application and the hardware itself?
I think Gareth Rees gave an excellent answer to a very similar question:
how to prevent series of integers to have the same value to often

Proportionately distribute (prorate) a value across a set of values

I have a need to write code that will prorate a value across a list, based on the relative weights of "basis" values in the list. Simply dividing the "basis" values by the sum of the "basis" values and then multiplying the factor by the original value to prorate works to a certain degree:
proratedValue = (basis / basisTotal) * prorationAmount;
However, the result of this calculation must then be rounded to integer values. The effect of the rounding means that the sum of proratedValue for all items in the list may differ from the original prorationAmount.
Can anyone explain how to apply a "lossless" proration algorithm that proportionately distributes a value across a list as accurately as possible, without suffering from rounding errors?
Simple algorithm sketch here...
1. Have a running total which starts at zero.
2. Do your standard "divide basis by total basis, then multiply by proportion amount" for the first item.
3. Store the original value of the running total elsewhere, then add the amount you just calculated in #2.
4. Round both the old value and the new value of the running total to integers (don't modify the existing values, round them into separate variables), and take the difference.
5. The number calculated in step 4 is the value assigned to the current basis.
6. Repeat steps #2-5 for each basis.
This is guaranteed to have the total amount prorated equal to the input prorate amount, because you never actually modify the running total itself (you only take rounded values of it for other calculations, you don't write them back). What would have been an issue with integer rounding before is now dealt with, since the rounding error will add up over time in the running total and eventually push a value across the rounding threshold in the other direction.
Basic example:
Input basis: [0.2, 0.3, 0.3, 0.2]
Total prorate: 47
----
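R is used to indicate the running total here; int() denotes rounding to the nearest integer (the tie at 23.5 is rounded down to 23 in this example):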
R = 0
First basis:
oldR = R [0]
R += (0.2 / 1.0 * 47) [= 9.4]
results[0] = int(R) - int(oldR) [= 9]
Second basis:
oldR = R [9.4]
R += (0.3 / 1.0 * 47) [+ 14.1, = 23.5 total]
results[1] = int(R) - int(oldR) [23-9, = 14]
Third basis:
oldR = R [23.5]
R += (0.3 / 1.0 * 47) [+ 14.1, = 37.6 total]
results[2] = int(R) - int(oldR) [38-23, = 15]
Fourth basis:
oldR = R [37.6]
R += (0.2 / 1.0 * 47) [+ 9.4, = 47 total]
results[3] = int(R) - int(oldR) [47-38, = 9]
9+14+15+9 = 47
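The running-total idea can be sketched in Python as follows. This is an illustration, not the author's code: the function name is made up, the running total is recomputed from the cumulative basis rather than accumulated (so the last value is exactly the input amount), and ties are rounded up, which is why the 23.5 tie in the example splits as 9, 15, 14, 9 here instead of 9, 14, 15, 9.

```python
import math

def prorate(bases, amount):
    """Distribute an integer amount across bases; the results always sum to amount."""
    total = sum(bases)
    results = []
    cumulative = 0.0
    previous_rounded = 0
    for basis in bases:
        cumulative += basis
        running = amount * (cumulative / total)     # running total after this basis
        rounded = math.floor(running + 0.5)         # round half up
        results.append(rounded - previous_rounded)  # difference of the rounded totals
        previous_rounded = rounded
    return results

shares = prorate([0.2, 0.3, 0.3, 0.2], 47)
print(shares, sum(shares))
```

Because only differences of the rounded running total are handed out, the shares telescope to the rounded final total, which is the amount itself.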
TL;DR: an algorithm with the best accuracy of those tested (about 20% better), at about 70% slower execution.
I evaluated the algorithms presented in the accepted answer here, as well as an answer to a Python question of a similar nature.
Distribute 1 - based on Amber's algorithm
Distribute 2 - based on John Machin's algorithm
Distribute 3 - see below
Distribute 4 - optimized version of Distribute 3 (eg. removed LINQ, used arrays)
Testing results (10,000 iterations)
Algorithm | Avg Abs Diff (x lowest) | Time (x lowest)
------------------------------------------------------------------
Distribute 1 | 0.5282 (1.1992) | 00:00:00.0906921 (1.0000)
Distribute 2 | 0.4526 (1.0275) | 00:00:00.0963136 (1.0620)
Distribute 3 | 0.4405 (1.0000) | 00:00:01.1689239 (12.8889)
Distribute 4 | 0.4405 (1.0000) | 00:00:00.1548484 (1.7074)
Distribute 1's average error is 19.9% higher than that of Distribute 3 and 4, while Distribute 4 takes 70.7% longer to run than Distribute 1, as expected.
Distribute 3
Makes best effort to be as accurate as possible in distributing amount.
Distribute weights as normal
Increment weights with highest error until actual distributed amount equals expected amount
Sacrifices speed for accuracy by making more than one pass through the loop.
public static IEnumerable<int> Distribute3(IEnumerable<double> weights, int amount)
{
    var totalWeight = weights.Sum();
    // Pair each weight's floored share with its exact (fractional) share.
    var query = from w in weights
                let fraction = amount * (w / totalWeight)
                let integral = (int)Math.Floor(fraction)
                select Tuple.Create(integral, fraction);
    var result = query.ToList();
    var added = result.Sum(x => x.Item1);
    // Hand out the remaining units one at a time to the entry with the largest shortfall.
    while (added < amount)
    {
        var maxError = result.Max(x => x.Item2 - x.Item1);
        var index = result.FindIndex(x => (x.Item2 - x.Item1) == maxError);
        result[index] = Tuple.Create(result[index].Item1 + 1, result[index].Item2);
        added += 1;
    }
    return result.Select(x => x.Item1);
}
Distribute 4
public static IEnumerable<int> Distribute4(IEnumerable<double> weights, int amount)
{
    var totalWeight = weights.Sum();
    var length = weights.Count();
    var actual = new double[length];
    var error = new double[length];
    var rounded = new int[length];
    var added = 0;
    var i = 0;
    // First pass: floor every share and record how much each one fell short.
    foreach (var w in weights)
    {
        actual[i] = amount * (w / totalWeight);
        rounded[i] = (int)Math.Floor(actual[i]);
        error[i] = actual[i] - rounded[i];
        added += rounded[i];
        i += 1;
    }
    // Remaining passes: give one unit at a time to the entry with the largest error.
    while (added < amount)
    {
        var maxError = 0.0;
        var maxErrorIndex = -1;
        for (var e = 0; e < length; ++e)
        {
            if (error[e] > maxError)
            {
                maxError = error[e];
                maxErrorIndex = e;
            }
        }
        rounded[maxErrorIndex] += 1;
        error[maxErrorIndex] -= 1;
        added += 1;
    }
    return rounded;
}
Test Harness
static void Main(string[] args)
{
    Random r = new Random();
    Stopwatch[] time = new[] { new Stopwatch(), new Stopwatch(), new Stopwatch(), new Stopwatch() };
    double[][] results = new[] { new double[Iterations], new double[Iterations], new double[Iterations], new double[Iterations] };
    for (var i = 0; i < Iterations; ++i)
    {
        double[] weights = new double[r.Next(MinimumWeights, MaximumWeights)];
        for (var w = 0; w < weights.Length; ++w)
        {
            weights[w] = (r.NextDouble() * (MaximumWeight - MinimumWeight)) + MinimumWeight;
        }
        var amount = r.Next(MinimumAmount, MaximumAmount);
        var totalWeight = weights.Sum();
        var expected = weights.Select(w => (w / totalWeight) * amount).ToArray();
        Action<int, DistributeDelgate> runTest = (resultIndex, func) =>
        {
            time[resultIndex].Start();
            var result = func(weights, amount).ToArray();
            time[resultIndex].Stop();
            var total = result.Sum();
            if (total != amount)
                throw new Exception("Invalid total");
            var diff = expected.Zip(result, (e, a) => Math.Abs(e - a)).Sum() / amount;
            results[resultIndex][i] = diff;
        };
        runTest(0, Distribute1);
        runTest(1, Distribute2);
        runTest(2, Distribute3);
        runTest(3, Distribute4);
    }
}
The problem you have is to define what an "acceptable" rounding policy is, or in other words, what it is you are trying to minimize. Consider first this situation: you have only 2 identical items in your list, and are trying to allocate 3 units. Ideally, you would want to allocate the same amount to each item (1.5), but that is clearly not going to happen. The "best" you could do is likely to allocate 1 and 2, or 2 and 1. So
there might be multiple solutions to each allocation
identical items may not receive an identical allocation
Then, I chose 1 and 2 over 0 and 3 because I assume that what you want is to minimize the difference between the perfect allocation, and the integer allocation. This might not be what you consider "a good allocation", and this is a question you need to think about: what would make an allocation better than another one?
One possible value function could be to minimize the "total error", i.e. the sum of the absolute values of the differences between your allocation and the "perfect", unconstrained allocation.
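As a rough illustration of that "total error" value function, the sketch below scores an integer allocation against the unconstrained proportional one. The function name `total_error` is hypothetical, and the amount being split is assumed to be the sum of the allocation.

```python
def total_error(weights, allocation):
    # Sum of absolute differences between the integer allocation and the
    # "perfect" proportional allocation of the same total amount.
    amount = sum(allocation)
    total_weight = sum(weights)
    ideal = [w / total_weight * amount for w in weights]
    return sum(abs(i - a) for i, a in zip(ideal, allocation))


# Two identical items sharing 3 units: the ideal share is 1.5 each.
print(total_error([1, 1], [1, 2]))  # 1.0
print(total_error([1, 1], [0, 3]))  # 3.0
```

Under this metric the 1 and 2 split scores better than 0 and 3, which matches the intuition in the example above; a different metric (for instance the largest single error) could rank allocations differently.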
It sounds to me that something inspired by Branch and Bound could work, but it's non trivial.
Assuming that Dav's solution always produces an allocation that satisfies the constraint (which I'll trust is the case), I assume that it is not guaranteed to give you the "best" solution, "best" defined by whatever distance/fit metric you end up adopting. My reason for this is that this is a greedy algorithm, which in integer programming problems can lead you to solutions which are far from the optimal solution. But if you can live with a "somewhat correct" allocation, then I say, go for it! Doing it "optimally" doesn't sound trivial.
Best of luck!
Ok. I'm pretty certain that the original algorithm (as written) and the code posted (as written) don't quite handle the test case outlined by @Mathias.
My intended use of this algorithm is a slightly more specific application. Rather than calculating the % using (@amt / @SumAmt) as shown in the original question, I have a fixed $ amount that needs to be split or spread across multiple items based on a % split defined for each of those items. The split % sums to 100%; however, straight multiplication often results in decimals that (when forced to round to whole $) don't add up to the total amount that I'm splitting apart. This is the core of the problem.
I'm fairly certain that the original answer from @Dav doesn't work in cases where (as @Mathias described) the rounded values are equal across multiple slices. This problem with the original algorithm and code can be summed up with one test case:
Take $100 and split it 3 ways using 33.333333% as your percentage.
Using the code posted by @jtw (assuming this is an accurate implementation of the original algorithm) yields the incorrect answer of allocating $33 to each item (resulting in an overall sum of $99), so it fails the test.
I think a more accurate algorithm might be:
Have a running total which starts at 0
For each item in the group:
Calculate the un-rounded allocation amount as ( [Amount to be Split] * [% to Split] )
Calculate the cumulative Remainder as [Remainder] + ( [UnRounded Amount] - [Rounded Amount] )
If Round( [Remainder], 0 ) >= 1 OR the current item is the LAST ITEM in the list, then set the item's allocation = [Rounded Amount] + Round( [Remainder], 0 ) and subtract Round( [Remainder], 0 ) from [Remainder]
else set item's allocation = [Rounded Amount]
Repeat for next item
Implemented in T-SQL, it looks like this:
-- Start of Code --
Drop Table #SplitList
Create Table #SplitList ( idno int , pctsplit decimal(5, 4), amt int , roundedAmt int )
-- Test Case #1
--Insert Into #SplitList Values (1, 0.3333, 100, 0)
--Insert Into #SplitList Values (2, 0.3333, 100, 0)
--Insert Into #SplitList Values (3, 0.3333, 100, 0)
-- Test Case #2
--Insert Into #SplitList Values (1, 0.20, 57, 0)
--Insert Into #SplitList Values (2, 0.20, 57, 0)
--Insert Into #SplitList Values (3, 0.20, 57, 0)
--Insert Into #SplitList Values (4, 0.20, 57, 0)
--Insert Into #SplitList Values (5, 0.20, 57, 0)
-- Test Case #3
--Insert Into #SplitList Values (1, 0.43, 10, 0)
--Insert Into #SplitList Values (2, 0.22, 10, 0)
--Insert Into #SplitList Values (3, 0.11, 10, 0)
--Insert Into #SplitList Values (4, 0.24, 10, 0)
-- Test Case #4
Insert Into #SplitList Values (1, 0.50, 75, 0)
Insert Into #SplitList Values (2, 0.50, 75, 0)
Declare @R Float
Declare @Results Float
Declare @unroundedAmt Float
Declare @idno Int
Declare @roundedAmt Int
Declare @amt Float
Declare @pctsplit Float
declare @rowCnt int
Select @R = 0
select @rowCnt = 0
-- Define the cursor
Declare SplitList Cursor For
Select idno, pctsplit, amt, roundedAmt From #SplitList Order By amt Desc
-- Open the cursor
Open SplitList
-- Assign the values of the first record
Fetch Next From SplitList Into @idno, @pctsplit, @amt, @roundedAmt
-- Loop through the records
While @@FETCH_STATUS = 0
Begin
    -- Get derived Amounts from cursor
    select @unroundedAmt = ( @amt * @pctsplit )
    select @roundedAmt = Round( @unroundedAmt, 0 )
    -- Remainder
    Select @R = @R + @unroundedAmt - @roundedAmt
    select @rowCnt = @rowCnt + 1
    -- Magic Happens! (aka Secret Sauce)
    if ( round(@R, 0 ) >= 1 ) or ( @@CURSOR_ROWS = @rowCnt ) Begin
        select @Results = @roundedAmt + round( @R, 0 )
        select @R = @R - round( @R, 0 )
    End
    else Begin
        Select @Results = @roundedAmt
    End
    If Round(@Results, 0) <> 0
    Begin
        Update #SplitList Set roundedAmt = @Results Where idno = @idno
    End
    -- Assign the values of the next record
    Fetch Next From SplitList Into @idno, @pctsplit, @amt, @roundedAmt
End
-- Close the cursor
Close SplitList
Deallocate SplitList
-- Now do the check
Select * From #SplitList
Select Sum(roundedAmt), max( amt ),
case when max(amt) <> sum(roundedamt) then 'ERROR' else 'OK' end as Test
From #SplitList
-- End of Code --
Which yields a final result set for the test case of:
idno pctsplit amt roundedAmt
1 0.3333 100 33
2 0.3333 100 34
3 0.3333 100 33
As near as I can tell (and I've got several test cases in the code), this handles all of these situations pretty gracefully.
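For readers without a SQL Server instance handy, the same remainder-carrying logic can be sketched in Python. This is an approximation of the T-SQL above, not a port of it: the names `split_amount` and `round0` are hypothetical, `Decimal` with `ROUND_HALF_UP` is assumed to stand in for T-SQL's `Round(x, 0)`, and the items are processed in the order given.

```python
from decimal import Decimal, ROUND_HALF_UP


def round0(x):
    # Round to a whole number, ties away from zero (stand-in for Round(x, 0)).
    return int(x.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def split_amount(amount, pct_splits):
    remainder = Decimal(0)  # cumulative rounding error carried forward
    results = []
    for i, pct in enumerate(pct_splits):
        unrounded = Decimal(amount) * Decimal(str(pct))
        rounded = round0(unrounded)
        remainder += unrounded - rounded
        carry = round0(remainder)
        is_last = i == len(pct_splits) - 1
        # Release the carried remainder once it reaches a whole unit,
        # or unconditionally on the last item.
        if carry >= 1 or is_last:
            rounded += carry
            remainder -= carry
        results.append(rounded)
    return results


print(split_amount(100, [0.3333, 0.3333, 0.3333]))  # [33, 34, 33]
print(split_amount(75, [0.50, 0.50]))               # [38, 37]
```

The first call reproduces the result set shown above for test case #1, and the second corresponds to test case #4.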
This is an apportionment problem, for which there are many known methods. All have certain pathologies: the Alabama paradox, the population paradox, or a failure of the quota rule. (Balinski and Young proved that no method can avoid all three.) You'll probably want one that follows the quota rule and avoids the Alabama paradox; the population paradox isn't as much of a concern since there's not much difference in the number of days per month between different years.
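To make the Alabama paradox concrete, here is a small sketch using the largest remainder (Hamilton) method. That method is not named in this answer and is picked purely as an illustration; the function name `largest_remainder` and the weights 6, 6, 2 are assumptions for the demo, and integer weights are assumed.

```python
from fractions import Fraction
from math import floor


def largest_remainder(weights, seats):
    total = sum(weights)
    # Exact proportional share for each weight.
    quotas = [Fraction(w * seats, total) for w in weights]
    result = [floor(q) for q in quotas]
    leftover = seats - sum(result)
    # Hand the leftover units to the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda i: quotas[i] - result[i],
                   reverse=True)
    for i in order[:leftover]:
        result[i] += 1
    return result


print(largest_remainder([6, 6, 2], 10))  # [4, 4, 2]
print(largest_remainder([6, 6, 2], 11))  # [5, 5, 1]
```

Raising the total from 10 to 11 makes the third item's share drop from 2 to 1, which is the Alabama paradox; every share still stays within one unit of its exact quota, so the quota rule holds.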
I think proportional distributions is the answer:
http://www.sangakoo.com/en/unit/proportional-distributions-direct-and-inverse
