Determining % of time above a certain value in a dataset - C#

I have a dataset of voltages (sampled every 500ms). Let's say it looks something like this (in an array):
0ms -> 1.4v
500ms -> 1.3v
1000ms -> 1.2v
1500ms -> 1.5v
2000ms -> 1.3v
2500ms -> 1.3v
3000ms -> 1.2v
3500ms -> 1.3v
Assuming the transition between readings is linear (i.e. 250ms = 1.35v), how would I go about calculating the total % of time that the reading is at or above 1.3v?
I was initially going to just take the % of values that are >= 1.3v (i.e. 6/8 in the sample array), however this only works if the angle between points is 45 degrees. I am assuming I have to do something like create a line from point 1 to point 2 and find its intercept with the base line (1.3v), then do the same for point 2 and point 3, find the distance between both intersections (say 700ms), then repeat for all points and express the result as a % of the total sample time.
EDIT
Maybe I wasn't clear when I initially asked. I need help identifying how I can perform these calculations, i.e. objects/classes that I can use to virtually graph these lines and perform these calculations, or any 3rd-party math packages that might offer these capabilities.

The important part is not to think in data points, but in intervals. Starting with two float variables, above and below, both initialized to 0, every interval (e.g. 0-500, 500-1000, ...) falls into one of three cases:
Trivial: Both start and end point are below your threshold - below += 1
Trivial: Both start and end point are above your threshold - above += 1
Interesting: One point is below and one above your threshold. Let's call the smaller value min and the larger value max. Now we do above += (max-threshold)/(max-min) and below += (threshold-min)/(max-min), so we linearly distribute this interval between both states.
Finally, normalize the results by dividing both above and below by the number of intervals. This gives you a pair of numbers that represent the fractions, i.e. they add up to 1 (modulo rounding errors). Of course, multiplying by 100 gives you the percentages.
EDIT
@phoog pointed out in the comments that I did not mention an "equal" case. This is by design, as your question already covers it: you chose >= as the comparison, so I definitely meant to use the same comparison here.
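As a minimal C# sketch of this interval approach, assuming at least two evenly spaced samples; the method and variable names are mine, not from the question:
// Fraction of time at or above a threshold, assuming linear interpolation
// between evenly spaced samples. Sketch only; names are illustrative.
static double FractionAboveThreshold(double[] readings, double threshold)
{
    double above = 0, below = 0;
    for (int i = 0; i < readings.Length - 1; i++)
    {
        double a = readings[i], b = readings[i + 1];
        if (a >= threshold && b >= threshold) above += 1;       // trivial: whole interval above
        else if (a < threshold && b < threshold) below += 1;    // trivial: whole interval below
        else
        {
            // interesting: the interval crosses the threshold
            double min = Math.Min(a, b), max = Math.Max(a, b);
            above += (max - threshold) / (max - min);
            below += (threshold - min) / (max - min);
        }
    }
    int intervals = readings.Length - 1;
    return above / intervals;                                   // multiply by 100 for a percentage
}

// Usage with the sample data from the question:
// var fraction = FractionAboveThreshold(new[] { 1.4, 1.3, 1.2, 1.5, 1.3, 1.3, 1.2, 1.3 }, 1.3);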

If I've understood the problem correctly, you can use a class like this to hold each entry:
public class DataEntry
{
    public DataEntry(int time, double reading)
    {
        Time = time;
        Reading = reading;
    }

    public int Time { get; set; }
    public double Reading { get; set; }
}
And then the following LINQ statement to get the segments above 1.3:
var entries = new List<DataEntry>()
{
    new DataEntry(0, 1.4),
    new DataEntry(500, 1.3),
    new DataEntry(1000, 1.2),
    new DataEntry(1500, 1.5),
    new DataEntry(2000, 1.3),
    new DataEntry(2500, 1.3),
    new DataEntry(3000, 1.2),
    new DataEntry(3500, 1.3)
};

double totalTime = entries
    .OrderBy(e => e.Time)
    .Take(entries.Count - 1)
    .Where((t, i) => t.Reading >= 1.3 && entries[i + 1].Reading >= 1.3)
    .Sum(t => 500);

var perct = (totalTime / entries.Max(e => e.Time));
This should give you the 500ms segments that remained above 1.3.
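If you also want to credit the partial intervals the question describes (linear interpolation to the 1.3v crossing), here is a hedged sketch building on the entries list above; this is my own variation, not part of the answer:
// Sketch: credit each interval with the proportion of time spent >= 1.3,
// interpolating linearly where the reading crosses the threshold.
const double threshold = 1.3;
double timeAbove = 0;
var ordered = entries.OrderBy(e => e.Time).ToList();
for (int i = 0; i < ordered.Count - 1; i++)
{
    double a = ordered[i].Reading, b = ordered[i + 1].Reading;
    double dt = ordered[i + 1].Time - ordered[i].Time;
    if (a >= threshold && b >= threshold) timeAbove += dt;      // whole interval above
    else if (a >= threshold || b >= threshold)
    {
        // interval crosses the threshold: take the fraction above it
        double max = Math.Max(a, b), min = Math.Min(a, b);
        timeAbove += dt * (max - threshold) / (max - min);
    }
}
double percentAbove = 100.0 * timeAbove / (ordered.Last().Time - ordered.First().Time);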

Compare if elements are almost equal in a list in C# .NET

I am still very much a beginner in C# and .NET and just need to do this simple test.
var odds = new System.Collections.Generic.List<double>();
// here is the code which adds the values to the list
foreach (var odd in odds)
{
    System.Console.WriteLine(odd);
}
and the output is something like that:
13.098252624859418
14.098252624859349
13.098252624859577
13.098252624853423
14.098252624859398
So I would like to compare all the values inside the list and check whether they are almost equal. That means a small difference between the numbers (such as 13 and 14) is still acceptable, as long as that difference is at most 2.
Check the difference between the maximum value and the minimum value in the list against a tolerance value (2 in your case). For example:
double delta = 2;
// getting largest element
var maxNum = odds.Max();
// getting smallest element
var minNum = odds.Min();
var almostEqual = maxNum - minNum <= delta;
You'll need to do it manually, as is recommended for any floating-point comparison (floating-point math is unintuitive). Doing that is quite simple, something like this:
var a = 13.098252624859418;
var b = 14.098252624859398;
// define your acceptable range, i.e. 1.0 means numbers up to 1.0 larger or smaller count as equal to one another
var delta = 1.0;
var areNearlyEqual = Math.Abs(a - b) <= delta; // true
Now if you want to check if every element in a List is nearly equal to every other element, there is a naïve and more "complicated" solution, I'll start with the naïve one:
(Don't actually use this implementation, this is for illustration purposes of how to check equality of all items in a list which aren't just numbers)
var allAreNearlyEqual = true; // Let's start off assuming all are equal
foreach (var x in odds)
{
    if (!allAreNearlyEqual)
        break;

    foreach (var y in odds)
    {
        if (Math.Abs(x - y) > delta)
            allAreNearlyEqual = false;
    }
}
Console.WriteLine(allAreNearlyEqual);
As you can see, we need to iterate over every element in the list (x) and compare it to every other element in the list (y). There is an easier-to-read (and also faster*) version of this:
var max = odds.Max();
var min = odds.Min();
if (Math.Abs(max - min) <= delta)
    Console.WriteLine("All items are nearly equal");
else
    Console.WriteLine("Not all items are nearly equal");
(This takes advantage of the fact that all other elements between the min and max are also close enough to be nearly equal, if the min and max are)
You can check out the implementation for Max here to see how they do it, but basically it's just a foreach loop which returns the highest value found.
*The second version is faster because it's O(2N) (effectively linear), whereas the first version is O(N^2). I added the first version to illustrate how you could do the same thing on a list of objects which aren't just numbers.

or-tools - Compute the stdev from a SumArray()

I need to generate plannings for employees using Google's Optimization Tools.
One of the constraints would be that every employee has approximately the same amount of working hours.
Thus, I want to aggregate in a list how many hours each employee is working, and then minimize the standard deviation of this list.
var workingTimes = new List<SumArray>();
foreach (var employee in employees) {
    // Gather the duration of each task the employee is
    // assigned to in a list
    // o.IsAssigned is an IntVar and task.Duration is an int
    var allDurations = shifts.Where(o => o.Employee == employee.Name)
        .Select(o => o.IsAssigned * task[o.Task].Duration);

    // Total time the employee is working
    var workTime = new SumArray(allDurations);
    workingTimes.Add(workTime);
}
Now I want to minimize the stdev of workingTimes. I tried the following:
IntegerExpression workingTimesMean = new SumArray(workingTimes) * (1/workingTimes.Count);
var gaps = workingTimes.Select(o => (o - workingTimesMean)*(o - workingTimesMean));
var stdev = new SumArray(gaps) * (1/gaps.Count());
model.Minimize(stdev);
But the LINQ query at the 2nd line of the last code snippet is throwing me an error:
Can't apply operator * to IntegerExpression and IntegerExpression
How can I compute the standard deviation of a Google.OrTools.Sat.SumArray?
The 'natural' API only supports linear expressions.
You need to use the AddProductEquality() API.
Please note that 1 / gaps.Count() will always return 0 (we are in integer arithmetic).
So you need to scale everything up.
Personally, I would just minimize the unscaled sum of abs(val - average). No need to divide by the number of elements.
Just check that the computation of the average has the right precision (once again, we are in integer arithmetic).
You could also consider just minimizing max(abs(val - average)). This is simpler and may be good enough.
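As a hedged sketch of that suggestion (minimizing the unscaled sum of abs(val - average)) with the CP-SAT C# API: the method names (NewIntVar, AddAbsEquality, LinearExpr.Sum) are my assumption about the current Google.OrTools.Sat surface, and totalTaskDuration, employeeCount, horizon and workTimes are placeholders for values from your model:
// Sketch: minimize the unscaled sum of |workTime - average| in integer arithmetic.
// `model` is the CpModel, `workTimes` are the per-employee totals as IntVars, and
// `horizon` is an upper bound on any employee's total working time.
long average = totalTaskDuration / employeeCount;  // integer average; scale everything up first if you need more precision

var deviations = new List<IntVar>();
foreach (IntVar workTime in workTimes)
{
    IntVar dev = model.NewIntVar(0, horizon, "dev");   // dev == |workTime - average|
    model.AddAbsEquality(dev, workTime - average);
    deviations.Add(dev);
}

model.Minimize(LinearExpr.Sum(deviations));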

Check if int is 10, 100, 1000, ...

I have a part in my application which needs to do something (=> add a padding 0 in front of other numbers) when a specified number gains an additional digit, meaning it reaches 10, 100, 1000 and so on...
At the moment I use the following logic for that:
public static bool IsNewDigit(this int number)
{
    var numberString = number.ToString();
    return numberString.StartsWith("1")
        && numberString.Substring(1).All(c => c == '0');
}
Then I can do:
if (number.IsNewDigit()) { /* add padding 0 to other numbers */ }
This seems like a "hack" to me using the string conversion.
Is there something something better (maybe even "built-in") to do this?
UPDATE:
One example where I need this:
I have an item with the following (simplified) structure:
public class Item
{
    public int Id { get; set; }
    public int ParentId { get; set; }
    public int Position { get; set; }
    public string HierarchicPosition { get; set; }
}
HierarchicPosition is the item's own position (with the padding) appended to its parent's HierarchicPosition. E.g. an item which is the 3rd child of 12 under an item at position 2 has 2.03 as its HierarchicPosition. This can as well be something more complicated like 011.7.003.2.02.
This value is then used for sorting the items very easily in a "tree-view" like structure.
Now I have an IQueryable<Item> and want to add one item as the last child of another item. To avoid needing to recreate every HierarchicPosition, I would like to detect (with the logic in question) whether the new position adds a new digit:
Item newItem = GetNewItem();
IQueryable<Item> items = db.Items;

var maxPosition = items.Where(i => i.ParentId == newItem.ParentId)
    .Max(i => i.Position);
newItem.Position = maxPosition + 1;

if (newItem.Position.IsNewDigit())
    UpdateAllPositions(items.Where(i => i.ParentId == newItem.ParentId));
else
    newItem.HierarchicPosition = GetHierarchicPosition(newItem);
UPDATE #2:
I query this position string from the DB like:
var items = db.Items.Where(...)
    .OrderBy(i => i.HierarchicPosition)
    .Skip(pageSize * pageNumber).Take(pageSize);
Because of this I can not use an IComparer (or something else which sorts "via code").
This will return items with HierarchicPosition like (pageSize = 10):
03.04
03.05
04
04.01
04.01.01
04.01.02
04.02
04.02.01
04.03
05
UPDATE #3:
I like the alternative solution with the double values, but I have some "more complicated cases" like the following that I am not sure I can solve with that:
I am building (one part of many) an image gallery, which has Categories and Images. There a category can have a parent and multiple children, and each image belongs to a category (I called them Holder and Assets in my logic - so each image has a holder and each category can have multiple assets). These images are sorted first by the category's position and then by their own position. This I do by combining the HierarchicPosition like HolderHierarchicPosition#ItemHierarchicPosition. So in a category which has 02.04 as its position and 120 images, the 3rd image would get 02.04#003.
I have even some cases with "three levels" (or maybe more in the future) like 03.1#02#04.
Can I adapt the "double solution" to suport such scenarios?
P.S.: I am also open to other solution for my base problem.
You could check if the base-10 logarithm of the number is an integer (10 -> 1, 100 -> 2, 1000 -> 3, ...).
This could also simplify your algorithm a bit in general. Instead of adding one 0 of padding every time you find something bigger, simply keep track of the maximum number you see, then take length = floor(log10(number))+1 and make sure everything is padded to length. This part does not suffer from the floating point arithmetic issues like the comparison to integer does.
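A small sketch of both ideas, assuming positive inputs; the helper names are mine, and note that the raw log10 integer check can misfire due to floating-point rounding, which is why the length-based padding is the safer half:
// Hypothetical helpers illustrating the two ideas above.
public static bool IsPowerOfTen(int number) =>
    number >= 10 && Math.Log10(number) % 1 == 0;   // beware floating-point rounding for large inputs

// Safer: pad everything to the width of the largest number instead of
// reacting to the moment a new digit appears. Assumes maxNumber > 0.
public static string PadToWidthOf(int number, int maxNumber)
{
    int width = (int)Math.Floor(Math.Log10(maxNumber)) + 1;  // digit count of maxNumber
    return number.ToString().PadLeft(width, '0');
}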
From what you describe, it looks like your HierarchicPosition should maintain an order of items, and you run into the problem that when you have the ids 1..9 and add a 10, you'll get the order 1,10,2,3,4,5,6... somewhere, and therefore want to pad left to 01,02,03,...,10 - correct?
If I'm right, please have a look at this first: https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem
Because what you are trying to do is a workaround that solves the problem in a certain way - but there might be more efficient ways to actually solve it (therefore it would have been better to ask about your actual problem rather than the solution you are trying to implement).
See here for a solution using a custom IComparer to sort strings (that are actually numbers) in a natural way: http://www.codeproject.com/Articles/11016/Numeric-String-Sort-in-C
Update regarding your update:
By providing a sorting "String" like you do, you can insert an element "somewhere" without having ALL subsequent items reindexed, as would be required for an integer value. (This seems to be the purpose.)
Instead of building up a complex "String", you could use a double value to achieve the very same result:
If you insert an item somewhere between 2 existing items, all you have to do is : this.sortingValue = (prior.sortingValue + next.sortingValue) / 2 and handle the case when you are inserting at the end of the list.
Let's assume you add Elements in the following order:
1 First Element // pick a double value for sorting - 100.00 for example. -> 100.00
2 Next Element // this is the list end - lets just add another 100.00 -> 200.00
1.1 Child // this should go "in between": (100+200)/2 = 150.00
1.2 Another // between 1.1 and 2 : (150+200)/2 = 175
When you now simply sort by that double field, the order would be:
100.00 -> 1
150.00 -> 1.1
175.00 -> 1.2
200.00 -> 2
Want to add 1.1.1? Great: position = (150.00 + 175.00) / 2;
You could simply multiply all values by 10 whenever your NEW value hits x.5 to ensure you are not running out of decimal places (but you don't have to - having .5, .25, .125, ... does not hurt the sorting):
So, after adding the 1.1.1, which would be 162.5, multiply all by 10:
1000.00 -> 1
1500.00 -> 1.1
1625.00 -> 1.1.1
1750.00 -> 1.2
2000.00 -> 2
So, whenever you move an item around, you only need to recalculate the position of n by looking at n-1 and n+1.
Depending on the expected number of children per entry, you could start with "1000.00", "10,000" or whatever matches best.
What I didn't take into account: when you want to move "2" to the top, you would need to recalculate all children of "2" to have a value somewhere between the sorting value of "2" and the now "next" item... Could cause some headaches :)
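A minimal sketch of the midpoint insertion described above (the method name and the spacing constant are my own illustration):
// Sketch: assign a sort value between two neighbours, or past the end of the list.
// `sortedValues` is assumed to be ordered ascending already.
static double NewSortValue(IList<double> sortedValues, int insertIndex)
{
    const double step = 100.0;                        // spacing used when appending
    if (sortedValues.Count == 0) return step;         // first element
    if (insertIndex >= sortedValues.Count)            // append at the end
        return sortedValues[sortedValues.Count - 1] + step;
    if (insertIndex == 0)                             // insert before the first element
        return sortedValues[0] / 2;
    // insert between two existing neighbours: take the midpoint
    return (sortedValues[insertIndex - 1] + sortedValues[insertIndex]) / 2;
}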
The solution with "double" values has some limitations, but will work for smaller sets of groups. However you are talking about "Groups, subgroups, and pictures with counts of 100" - so another solution would be preferable:
First, you should refactor your database: Currently you are trying to "squeeze" a Tree into a list (datatables are basically lists)
To really reflect the complex layout of a tree with an infinite depth, you should use 2 tables and implement the composite pattern.
Then you can use a recursive approach to get a category, its subcategory, [...] and finally the elements of that category.
With that, you only need to provide a position for each leaf within its current node.
Rearranging leaves will not affect any leaf of another node, or any node.
Rearranging nodes will not affect any subnode or leaf of that node.
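A rough sketch of what the two tables of that composite layout might map to in code; this is purely my own illustration, not something from the answer:
// Sketch: a node table (categories) referencing itself, plus a leaf table (images)
// that only stores its position within its own node.
public class CategoryNode
{
    public int Id { get; set; }
    public int? ParentId { get; set; }          // null for root nodes
    public int PositionInParent { get; set; }   // order among siblings only
    public List<CategoryNode> Children { get; set; } = new List<CategoryNode>();
    public List<ImageLeaf> Images { get; set; } = new List<ImageLeaf>();
}

public class ImageLeaf
{
    public int Id { get; set; }
    public int CategoryId { get; set; }
    public int PositionInCategory { get; set; } // order within its node only
}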
You could check the sum of the squares of all digits of the input. 10, 100, 1000 have something in common: if you take the sum of the squares of all the digits, it equals one:
10
1^2 + 0^2 = 1
100
1^2 + 0^2 + 0^2 = 1
and so forth.
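A quick sketch of that digit-square-sum check; note it also returns true for 1 itself, which may or may not be what you want:
// Sketch: true when the sum of the squares of the digits equals 1,
// which holds exactly for 1, 10, 100, 1000, ...
public static bool DigitSquareSumIsOne(int number)
{
    int sum = 0;
    for (int n = Math.Abs(number); n > 0; n /= 10)
    {
        int digit = n % 10;
        sum += digit * digit;
    }
    return sum == 1;
}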

How to transform one list to another semi-aggregated list via LINQ?

I'm a LINQ beginner, so I'm just looking for someone to let me know if the following is possible to implement with LINQ and, if so, some pointers on how it could be achieved.
I want to transform one financial time series list into another, where the second list will be the same length as or shorter than the first (usually it will be shorter, i.e., it becomes a new list where the elements themselves represent an aggregation of information from one or more elements of the 1st list). How it collapses one list into the other depends on the data in the first list. The algorithm needs to track a calculation that gets reset whenever a new element is added to the second list. It may be easier to describe via an example:
List 1 (time ordered from beginning to end series of closing prices and volume):
{P=7,V=1}, {P=10,V=2}, {P=10,V=1}, {P=10,V=3}, {P=11,V=5}, {P=12,V=1}, {P=13,V=2}, {P=17,V=1}, {P=15,V=4}, {P=14,V=10}, {P=14,V=8}, {P=10,V=2}, {P=9,V=3}, {P=8,V=1}
List 2 (series of open/close price ranges and summation of volume for such range period using these 2 param settings to transform list 1 to list 2: param 1: Price Range Step Size = 3, param 2: Price Range Reversal Step Size = 6):
{O=7,C=10,V=1+2+1}, {O=10,C=13,V=3+5+1+2}, {O=13,C=16,V=0}, {O=16,C=10,V=1+4+10+8+2}, {O=10,C=8,V=3+1}
In list 2, I explicitly am showing the summation of the V attributes from list 1. But V is just a long, so it would just be one number in reality. So how this works is: the opening time series price is 7. Then we are looking for the first price that is a delta of 3 away from 7 (via param 1). In list 1, as we move through the list, the next step is an upward move to 10, and thus we've established an "up trend". So now we build our first element in list 2 with Open=7, Close=10 and sum up the Volume of all bars used in the first list to get to this first step in list 2. Now, the next element's starting point is 10. To build another up step, we need to advance another 3 upwards (param 1), or we could reverse and go downwards 6 (param 2). With the data from list 1, we reach 13 first, so that builds our second element in list 2 and sums up all the V attributes used to get to this step. We continue this process until the end of list 1.
Note the gap jump that happens in list 1. We still want to create a step element of {O=13,C=16,V=0}. The V of 0 is simply stating that we have a range move that went through this step but had a Volume of 0 (no actual prices from list 1 occurred here - the price was above it, but we want to build the set of steps that led to it).
The second-to-last entry in list 2 represents the reversal from up to down.
The final entry in list 2 just uses the final close from list 1, even though it really hasn't finished establishing a full range step yet.
Thanks for any pointers on how this could potentially be done via LINQ, if at all.
My first thought is, why try to use LINQ on this? It seems like a better situation for making a new Enumerable using the yield keyword to partially process and then spit out an answer.
Something along the lines of this:
public struct PricePoint
{
    public ulong price;
    public ulong volume;
}

public struct RangePoint
{
    public ulong open;
    public ulong close;
    public ulong volume;
}

// STEP and REVERSAL_STEP correspond to param 1 and param 2 from the question.
public static IEnumerable<RangePoint> calculateRanges(IEnumerable<PricePoint> pricePoints)
{
    if (pricePoints.Count() > 0)
    {
        ulong open = pricePoints.First().price;
        ulong volume = pricePoints.First().volume;
        foreach (PricePoint pricePoint in pricePoints.Skip(1))
        {
            volume += pricePoint.volume;
            if (pricePoint.price > open)
            {
                if ((pricePoint.price - open) >= STEP)
                {
                    // We have established an up-trend.
                    RangePoint rangePoint;
                    rangePoint.open = open;
                    rangePoint.close = pricePoint.price;
                    rangePoint.volume = volume;
                    open = pricePoint.price;
                    volume = 0;
                    yield return rangePoint;
                }
            }
            else
            {
                if ((open - pricePoint.price) >= REVERSAL_STEP)
                {
                    // We have established a reversal.
                    RangePoint rangePoint;
                    rangePoint.open = open;
                    rangePoint.close = pricePoint.price;
                    rangePoint.volume = volume;
                    open = pricePoint.price;
                    volume = 0;
                    yield return rangePoint;
                }
            }
        }

        RangePoint lastPoint;
        lastPoint.open = open;
        lastPoint.close = pricePoints.Last().price;
        lastPoint.volume = volume;
        yield return lastPoint;
    }
}
This isn't yet complete. For instance, it doesn't handle gapping, and there is an unhandled edge case where the last data point might be consumed, but it will still process a "lastPoint". But it should be enough to get started.
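A small usage sketch with the question's sample data, assuming STEP = 3 and REVERSAL_STEP = 6 are defined as constants visible to calculateRanges:
// Sketch: feed the sample series from the question into calculateRanges.
var prices = new List<PricePoint>
{
    new PricePoint { price = 7,  volume = 1 },
    new PricePoint { price = 10, volume = 2 },
    new PricePoint { price = 10, volume = 1 },
    new PricePoint { price = 10, volume = 3 },
    new PricePoint { price = 11, volume = 5 },
    new PricePoint { price = 12, volume = 1 },
    new PricePoint { price = 13, volume = 2 },
    // ... remaining points from list 1
};

foreach (var range in calculateRanges(prices))
    Console.WriteLine($"O={range.open}, C={range.close}, V={range.volume}");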

Calculate Time Remaining

What's a good algorithm for determining the remaining time for something to complete? I know how many total lines there are and how many have completed already; how should I estimate the time remaining?
Why not?
(TimeTaken / linesProcessed) * LinesLeft = TimeLeft
TimeLeft will then be expressed in whatever unit of time timeTaken is.
Edit:
Thanks for the comment, you're right, this should be:
(TimeTaken / linesProcessed) * linesLeft = timeLeft
so we have
(10 / 100) * 200 = 20 Seconds now 10 seconds go past
(20 / 100) * 200 = 40 Seconds left now 10 more seconds and we process 100 more lines
(30 / 200) * 100 = 15 Seconds and now we all see why the copy file dialog jumps from 3 hours to 30 minutes :-)
I'm surprised no one has answered this question with code!
The simple way to calculate time, as answered by @JoshBerke, can be coded as follows:
DateTime startTime = DateTime.Now;
for (int index = 0, count = lines.Count; index < count; index++) {
    // Do the processing
    ...

    // Calculate the time remaining:
    TimeSpan timeRemaining = TimeSpan.FromTicks(DateTime.Now.Subtract(startTime).Ticks * (count - (index+1)) / (index+1));

    // Display the progress to the user
    ...
}
This simple example works great for simple progress calculation.
However, for a more complicated task, there are many ways this calculation could be improved!
For example, when you're downloading a large file, the download speed could easily fluctuate. To calculate the most accurate "ETA", a good algorithm would be to only consider the past 10 seconds of progress. Check out ETACalculator.cs for an implementation of this algorithm!
ETACalculator.cs is from Progression -- an open source library that I wrote. It defines a very easy-to-use structure for all kinds of "progress calculation". It makes it easy to have nested steps that report different types of progress. If you're concerned about Perceived Performance (as @JoshBerke suggested), it will help you immensely.
Make sure to manage perceived performance.
Although all the progress bars took exactly the same amount of time in the test, two characteristics made users think the process was faster, even if it wasn't:
progress bars that moved smoothly towards completion
progress bars that sped up towards the end
Not to revive a dead question but I kept coming back to reference this page.
You could create an extension method on the Stopwatch class to get functionality that would get an estimated remaining time span.
static class StopWatchUtils
{
    /// <summary>
    /// Gets the estimated time to completion.
    /// </summary>
    /// <param name="sw"></param>
    /// <param name="counter"></param>
    /// <param name="counterGoal"></param>
    /// <returns></returns>
    public static TimeSpan GetEta(this Stopwatch sw, int counter, int counterGoal)
    {
        /* This is based off of:
         * (TimeTaken / linesProcessed) * linesLeft = timeLeft
         * so we have
         * (10/100) * 200 = 20 seconds; now 10 seconds go past
         * (20/100) * 200 = 40 seconds left; now 10 more seconds and we process 100 more lines
         * (30/200) * 100 = 15 seconds, and now we all see why the copy file dialog jumps from 3 hours to 30 minutes :-)
         *
         * pulled from http://stackoverflow.com/questions/473355/calculate-time-remaining/473369#473369
         */
        if (counter == 0) return TimeSpan.Zero;
        float elapsedMin = ((float)sw.ElapsedMilliseconds / 1000) / 60;
        float minLeft = (elapsedMin / counter) * (counterGoal - counter);
        TimeSpan ret = TimeSpan.FromMinutes(minLeft);
        return ret;
    }
}
Example:
int y = 500;
Stopwatch sw = new Stopwatch();
sw.Start();
for (int x = 0; x < y; x++)
{
    //do something
    Console.WriteLine("{0} time remaining", sw.GetEta(x, y).ToString());
}
Hopefully it will be of some use to somebody.
EDIT:
It should be noted that this is most accurate when each loop iteration takes the same amount of time.
Edit 2:
Instead of subclassing I created an extension method.
Generally, you know three things at any point in time while processing:
How many units/chunks/items have been processed up to that point in time (A).
How long it has taken to process those items (B).
The number of remaining items (C).
Given those items, the estimate of the remaining time (assuming the time to process an item is roughly constant) will be
B * C / A
I made this and it works quite well. Feel free to change the method signature according to your variable types, or the return type; you would probably like to get the TimeSpan object or just the seconds...
/// <summary>
/// Calculates the ETA.
/// </summary>
/// <param name="processStarted">When the process started</param>
/// <param name="totalElements">How many items are being processed</param>
/// <param name="processedElements">How many items are done</param>
/// <returns>A string representing the time left</returns>
private string CalculateEta(DateTime processStarted, int totalElements, int processedElements)
{
    // Elapsed time is now minus start; note this divides by zero if called
    // before at least one element has been processed in the first second.
    int itemsPerSecond = processedElements / (int)(DateTime.Now - processStarted).TotalSeconds;
    int secondsRemaining = (totalElements - processedElements) / itemsPerSecond;
    return new TimeSpan(0, 0, secondsRemaining).ToString();
}
You will need to initialize a DateTime variable when the processing starts and pass it to the method on each iteration.
Bear in mind that your window will probably be locked up if the process is quite long, so when you place the return value into a control, don't forget to call its .Refresh() method.
If you are using threads, then you can set the text using the Invoke(Action) method; wrapping that call in an extension method makes it easy to do.
If you use a console application, then you should not have problems displaying the output line by line.
Hope it helps someone.
It depends greatly on what the "something" is. If you can assume that the amount of time to process each line is similar, you can do a simple calculation:
TimePerLine = Elapsed / LinesProcessed
TotalTime = TimePerLine * TotalLines
TimeRemaining = TotalTime - LinesProcessed * TimePerLine
There is no standard algorithm I know of; my suggestion would be:
Create a variable to save the %.
Calculate the complexity of the task you wish to track (or an estimate of it).
Add increments to the % from time to time as you see fit given the complexity.
You have probably seen programs where the load bar runs much faster at one point than at another. Well, that's pretty much because this is how they do it (though they probably just put increments at regular intervals in the main wrapper).
Where time$("ms") represents the current time in milliseconds since 00:00:00.00, and lof represents the total lines to process, and x represents the current line:
if Ln>0 then
Tn=Tn+time$("ms")-Ln 'grand total of all laps
Rn=Tn*(lof-x)/x^2 'estimated time remaining in seconds
end if
Ln=time$("ms") 'start lap time (current time)
That really depends on what is being done... lines are not enough unless each individual line takes the same amount of time.
The best way (if your lines are not similar) would probably be to look at logical sections of the code find out how long each section takes on average, then use those average timings to estimate progress.
If you know the percentage completed, and you can simply assume that the time scales linearly, something like
timeLeft = timeSoFar * (1/Percentage)
might work.
I already knew the percentage complete & time elapsed, so this helped me:
TimeElapsed * ((100 - %complete) / %complete) = TimeRemaining
I then updated this value every time %complete changed, giving me a constantly updating ETA.
There are 2 ways of showing time:
Time elapsed and overall time needed:
elapsed will increase, but the total time needed will likely remain stable (if the per-second rate is stable).
Time elapsed and time left:
so Time Left = Total Needed - Elapsed
My idea/formula is more like this:
Processed - updated by the running thread, from 0 to Total.
I have a timer with a 1000ms interval that calculates the number processed per second:
processedPerSecond = Processed - lastTickProcessed;
lastTickProcessed = Processed; //store state from past call
processedPerSecond and lastTickProcessed are global variables declared outside the timer method.
Now if we would like to know how many seconds are required to complete the processing (assuming an ideally constant rate):
totalSecondsNeeded = TotalLines / PerSecond
but we want to show case 2, Time Left, so:
TimeLeftSeconds = (TotalLines - Processed) / PerSecond
TimeSpan remaining = new TimeSpan(0, 0, (transactions.Count - Processed) / processedPerSecond);
labelTimeRemaining.Text = remaining.ToString(@"hh\:mm\:ss");
Of course TimeLeftSeconds will "jump" if PerSecond jumps, so if past PerSecond was 10 then 30 then back to 10, the user will see it.
There is a way to calculate an average, but it may not show the real time left if the process speeds up at the end:
int perSecond = (int)Math.Ceiling((processed / (decimal)timeElapsed.TotalSeconds)); //average not in past second
So it may be up to the developer to pick the method that will be most accurate based on a prediction of how "jumpy" the processing is.
We could also calculate and save each PerSecond value, then take the last 10 seconds and average them, but in this case the user will have to wait 10 seconds to see the first calculation,
or we could show the time left starting from the first PerSecond value and then progressively average over up to the last 10 PerSecond values (see the sketch below).
I hope my "jumpy" thoughts will help someone build something satisfying.
How about this....
I used this to walk through a set of records (rows in an Excel file, in one case)
L is the current row number
X is the total number of rows
dat_Start is set to Now() when the routine begins
Debug.Print Format((L / X), "percent") & vbTab & "Time to go:" & vbTab & Format((DateDiff("n", dat_Start, Now) / L) * (X - L), "00") & ":" & Format(((DateDiff("s", dat_Start, Now) / L) * (X - L)) Mod 60, "00")
PowerShell function
function CalculateEta([datetime]$processStarted, [long]$totalElements, [long]$processedElements) {
    $itemsPerSecond = $processedElements / [DateTime]::Now.Subtract($processStarted).TotalSeconds
    $secondsRemaining = ($totalElements - $processedElements) / $itemsPerSecond
    return [TimeSpan]::FromSeconds($secondsRemaining)
}
I prefer System.Threading.Timer to System.Diagnostics.Stopwatch.
System.Threading.Timer, which executes a single callback method on a
thread pool thread
The following code is an example of calculating elapsed time with Threading.Timer.
public class ElapsedTimeCalculator : IDisposable
{
    private const int ValueToInstantFire = 0;
    private readonly Timer timer;
    private readonly DateTime initialTime;

    public ElapsedTimeCalculator(Action<TimeSpan> action)
    {
        timer = new Timer(new TimerCallback(_ => action(ElapsedTime)));
        initialTime = DateTime.UtcNow;
    }

    // Use Timeout.Infinite if you don't want to set a period.
    public void Fire() => timer.Change(ValueToInstantFire, Timeout.Infinite);

    public void Dispose() => timer?.Dispose();

    private TimeSpan ElapsedTime => DateTime.UtcNow - initialTime;
}
BTW You can use System.Reactive.Concurrency.IScheduler (scheduler.Now.UtcDateTime) instead of using DateTime directly, if you would like to mock and virtualize the datetime for unit tests.
public class PercentageViewModel : IDisposable
{
    private readonly ElapsedTimeCalculator elapsedTimeCalculator;

    public PercentageViewModel()
    {
        elapsedTimeCalculator = new ElapsedTimeCalculator(CalculateTimeRemaining);
    }

    // Use it where you would like to estimate time remaining.
    public void UpdatePercentage(double percent)
    {
        Percent = percent;
        elapsedTimeCalculator.Fire();
    }

    private void CalculateTimeRemaining(TimeSpan timeElapsed)
    {
        var timeRemainingInSecond = GetTimePerPercentage(timeElapsed.TotalSeconds) * GetRemainingPercentage;
        // Work with the calculated time...
    }

    public double Percent { get; set; }

    public void Dispose() => elapsedTimeCalculator.Dispose();

    private double GetTimePerPercentage(double elapsedTime) => elapsedTime / Percent;

    private double GetRemainingPercentage => 100 - Percent;
}
In Python:
First create an array with the time taken per entry, then calculate the remaining elements and multiply by the average time taken.
from datetime import datetime
import time

# create average function**
def average(total):
    return float(sum(total)) / max(len(total), 1)

# create array
time_elapsed = []

# capture starting time
start_time = datetime.now()

# do stuff

# capture ending time
end_time = datetime.now()

# get the total seconds from the captured time (important between two days)
time_in_seconds = (end_time - start_time).total_seconds()

# append the time to an array
time_elapsed.append(time_in_seconds)

# calculate the remaining elements, then multiply by the average time taken
est_time_left = (LastElement - Processed) * average(time_elapsed)

print(f"Estimated time left: {time.strftime('%H:%M:%S', time.gmtime(est_time_left))}")
** timeit() with k=5000 random elements and number=1000
def average2(total):
    avg = 0
    for e in total: avg += e
    return avg / max(len(total), 1)
>> timeit average 0.014467999999999925
>> timeit average2 0.08711790000000003
>> timeit numpy.mean: 0.16030989999999967
>> timeit numpy.average: 0.16210096000000003
>> timeit statistics.mean: 2.8182458
