Aggregate value takes very long time - c#

I have one big list of 15-minute values covering around a year, and I would like to aggregate them into hours. I am doing it in a very simple way:
for (; from <= to; from = from.AddHours(1))
{
List<DataPoint> valuesToAgregate = data.Where(x => x.TimeStamp >= from && x.TimeStamp < from.AddHours(1)).ToList();
dailyInputData.Add(valuesToAgregate.Sum(x=>x.Val));
}
This takes a long time, around 30 seconds for 35k values. Is there any way to optimize it? Maybe by sorting the list, adding some kind of index to it, or grouping instead of using a for loop?

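If you sort the list by TimeStamp first, a single pass over it is enough, and this will run much faster. Example:
// Sort once so that each hour's values are contiguous.
var orderedData = data.OrderBy(item => item.TimeStamp).ToList();
int firstIndex = 0;
var from = orderedData.First().TimeStamp;
var to = orderedData.Last().TimeStamp;
// <= so that the hour containing the last value is not skipped
while (from <= to)
{
    var sum = 0; // use double or decimal here if Val is not an int
    var newTo = from.AddHours(1);
    // Consume every value that falls before the end of the current hour.
    while (firstIndex < orderedData.Count && orderedData[firstIndex].TimeStamp < newTo)
    {
        sum += orderedData[firstIndex].Val;
        ++firstIndex;
    }
    dailyInputData.Add(sum);
    from = newTo;
}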

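// List<T>.Sort sorts in place and returns void, so it cannot be assigned back.
data.Sort((a, b) => a.TimeStamp.CompareTo(b.TimeStamp));
var counter = 0; // running sum for the current hour; use double or decimal if Val is not an int
var boundary = from.AddHours(1);
foreach (var d in data)
{
    // Crossed into a later hour: flush the running sum, then start over.
    while (d.TimeStamp >= boundary)
    {
        dailyInputData.Add(counter);
        counter = 0;
        boundary = boundary.AddHours(1);
    }
    counter += d.Val;
}
dailyInputData.Add(counter); // flush the last hour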
The problem lies in the logic:
the list is scanned from start to end on every iteration to find the candidate values (your Where clause)
the candidate values are copied into another temporary list
the temporary list is then scanned from start to end to calculate the sum
The fastest approach:
sort the list
go through the items once; if an item belongs to the current group, add it to the counter; otherwise you have jumped to a new group, so flush the counter to record the value and start it over again
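The question also asks about grouping. Here is a minimal sketch of that alternative, assuming a hypothetical DataPoint type with a DateTime TimeStamp and a double Val (the real definition is not shown in the question). Each timestamp is truncated to the start of its hour and used as the grouping key, so the list is walked once instead of once per hour. Unlike the loops above, hours that contain no values produce no entry at all, which may or may not be what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's DataPoint; the real type is not shown.
public sealed class DataPoint
{
    public DateTime TimeStamp { get; set; }
    public double Val { get; set; }
}

public static class HourlyAggregator
{
    // Sums Val per clock hour and returns the totals ordered by hour.
    public static List<KeyValuePair<DateTime, double>> SumByHour(IEnumerable<DataPoint> data)
    {
        return data
            // Truncate each timestamp to the start of its hour to form the group key.
            .GroupBy(p => new DateTime(p.TimeStamp.Year, p.TimeStamp.Month,
                                       p.TimeStamp.Day, p.TimeStamp.Hour, 0, 0))
            .OrderBy(g => g.Key)
            .Select(g => new KeyValuePair<DateTime, double>(g.Key, g.Sum(p => p.Val)))
            .ToList();
    }
}
```

Calling HourlyAggregator.SumByHour(data) returns one key/value pair per hour that has data, keyed by the start of that hour.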

Related

Increase performance of timeinterval calculation

I have the code fragment below (short version first, complete version after), which loops over lots of records that are in chronological order. The number of records ranges from hundreds of thousands to millions. I need to compare the time interval between successive records and determine the difference in minutes to decide on some action and set a value. This is the performance bottleneck of the whole application, so I need to do something. The profiler clearly shows that
(DayList[nextIndex].ThisDate - entry.ThisDate).Minutes
is the bottleneck of the bottleneck. When this is solved, the next bottleneck will be the date call in the DayList creation:
List<MonthfileValue> DayList = thisList.Where(x => x.ThisDate.Date == i.Date).ToList();
Those two lines roughly take 60% - 70% of all CPU.
So the question is: how can I increase performance (dramatically) or should I abandon this road completely (because this performance is unacceptable)?
for ( DateTime i=startdate; i<=enddate; i=i.AddDays(1) )
{
int nextIndex = 0;
List<MonthfileValue> DayList = thisList.Where(x => x.ThisDate.Date == i.Date).ToList();
foreach (MonthfileValue entry in DayList)
{
if (++nextIndex < DayList.Count - 1)
{
IntervalInMinutes = (DayList[nextIndex].ThisDate - entry.ThisDate).Minutes;
}
// do some calculations
}
// do some calculations
}
The complete version is below:
for ( DateTime i=startdate; i<=enddate; i=i.AddDays(1) )
{
int nextIndex = 0;
DaySolarValues tmp = new DaySolarValues();
List<MonthfileValue> DayList = thisList.Where(x => x.ThisDate.Date == i.Date).ToList();
foreach (MonthfileValue entry in DayList)
{
if (++nextIndex < DayList.Count - 1)
{
OldIntervalInMinutes = IntervalInMinutes;
IntervalInMinutes = (DayList[nextIndex].ThisDate - entry.ThisDate).Minutes;
if (IntervalInMinutes > 30)
{
IntervalInMinutes = OldIntervalInMinutes; //reset the value and try again
continue; // If more than 30 minutes, then skip this data
}
else if (IntervalInMinutes != OldIntervalInMinutes)
{
// Log some message and continue
}
}
tmp.SolarHours += entry.SolarRad / entry.SolarTheoreticalMax >= SunThreshold ? IntervalInMinutes : 0;
tmp.SolarEnergy += entry.SolarRad * IntervalInMinutes * 60;
tmp.SunUpTimeInMinutes += IntervalInMinutes;
}
tmp.SolarHours /= 60;
tmp.SolarEnergy /= 3600;
tmp.ThisDate = i;
DailySolarValuesList.Add(tmp);
}
I can clearly see that the Where(...) call steals performance.
As a first step, I would try this:
var dayLookup = thisList.ToLookup(x => x.ThisDate.Date);
for ( DateTime currentDate =startdate; currentDate <=enddate; currentDate = currentDate.AddDays(1) )
{
int nextIndex = 0;
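// The lookup indexer returns an IEnumerable<MonthfileValue> (empty if the day has no records), so materialize it.
List<MonthfileValue> DayList = dayLookup[currentDate.Date].ToList();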
...
}
This way you build a hash lookup once, before the loop, so getting the DayList for each day becomes a much cheaper operation.
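To make the idea concrete, here is a minimal, self-contained sketch. It assumes a hypothetical Reading type in place of MonthfileValue and input that is already in chronological order, as the question states. It uses TimeSpan.TotalMinutes rather than Minutes, because Minutes is only the 0 to 59 component of the span (a gap of 1 hour 10 minutes has Minutes == 10). Whether the full span is what the asker intended is an assumption on my part.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MonthfileValue; only the timestamp matters here.
public sealed class Reading
{
    public DateTime ThisDate { get; set; }
}

public static class DayIntervals
{
    // For each calendar day, returns the gaps in minutes between successive readings.
    // Assumes the readings are already in chronological order.
    public static Dictionary<DateTime, List<double>> GapsPerDay(IEnumerable<Reading> readings)
    {
        // Built once: a single pass over the records instead of one Where(...) scan per day.
        ILookup<DateTime, Reading> byDay = readings.ToLookup(r => r.ThisDate.Date);
        var result = new Dictionary<DateTime, List<double>>();
        foreach (IGrouping<DateTime, Reading> day in byDay)
        {
            List<Reading> dayList = day.ToList();
            var gaps = new List<double>();
            for (int i = 0; i + 1 < dayList.Count; i++)
            {
                // TotalMinutes is the whole span expressed in minutes.
                gaps.Add((dayList[i + 1].ThisDate - dayList[i].ThisDate).TotalMinutes);
            }
            result[day.Key] = gaps;
        }
        return result;
    }
}
```

The per-day calculations from the question would go where the gaps are collected; the point of the sketch is only that the grouping cost is paid once.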

Scheduler algorithm that fills the remaining hours to work

I need to generate all possible values for a scheduler that works like this:
Some hours of the week may already be chosen.
The work week is defined by a pattern such as "???????", in which the question marks can be replaced.
Given a required number of hours, I need to replace the question marks with digits so that the sum of the scheduled hours matches the hours to be worked in the week, returning a string array with all possible schedules, ordered lexicographically.
Example:
pattern = "08??840",
required_week_hours= 24
In this example, there are only 4 hours left to work.
calling this:
function List<String> GenerateScheduler(int workHours, int dayHours, string pattern){}
public static void Main(){
GenerateScheduler(24, 4, "08??840");
}
This would return the following list of strings:
0804840
0813840
.......
.......
0840840
I'm not very familiar with algorithms; which one could I use to solve this problem?
This sounds like a problem where you have to generate every ordered list of a fixed number of values that sum to a given total. First, sum up the hours you already know and subtract them from the required total. Then count the ? characters, that is, the number of shifts/days you do not know about. Using these parameters, the solution looks like this:
public List<string> GenerateScheduler(int workHours, int dayHours, string pattern){
int remainingSum = workHours;
int unknownCount = 0;
// first iterate through the pattern to know how many ? characters there are
// as well as the number of hours remaining
for (int i = 0; i < pattern.Length; i++) {
if (pattern[i] == '?') {
unknownCount++;
}
else {
remainingSum -= pattern[i] - '0';
}
}
List<List<int>> permutations = new List<List<int>>();
// get all the lists of work shifts that sum to the remaining number of hours
// the number of work shifts in each list is the number of ? characters in pattern
GeneratePermutations(permutations, remainingSum, unknownCount);
// after getting all the permutations, we need to iterate through the pattern
// for each permutation to construct a list of schedules to return
List<string> schedules = new List<string>();
foreach (List<int> permutation in permutations) {
StringBuilder newSchedule = new StringBuilder();
int permCount = 0;
for (int i = 0; i < pattern.Length; i++) {
if (pattern[i] == '?') {
newSchedule.Append(permutation[permCount]);
permCount++;
}
else {
newSchedule.Append(pattern[i]);
}
}
schedules.Add(newSchedule.ToString());
}
return schedules;
}
public void GeneratePermutations(List<List<int>> permutations, int workHours, int unknownCount) {
for (int i = 0; i <= workHours; i++) {
List<int> permutation = new List<int>();
permutation.Add(i);
GeneratePermutationsHelper(permutations, permutation, workHours - i, unknownCount - 1);
}
}
public void GeneratePermutationsHelper(List<List<int>> permutations, List<int> permutation, int remainingHours, int remainingShifts){
if (remainingShifts == 0 && remainingHours == 0) {
permutations.Add(permutation);
return;
}
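// remainingHours == 0 must fall through so that the remaining shifts can be filled with zeros (e.g. 0840840)
if (remainingHours < 0 || remainingShifts == 0) {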
return;
}
for (int i = 0; i <= remainingHours; i++) {
List<int> newPermutation = new List<int>(permutation);
newPermutation.Add(i);
GeneratePermutationsHelper(permutations, newPermutation, remainingHours - i, remainingShifts - 1);
}
}
This can be a lot to digest so I will briefly go over how the permutation recursive helper function works. The parameters go as follows:
a list containing all the permutations
the current permutation being examined
the remaining number of hours needed to reach the total work hour count
the number of remaining shifts (basically number of '?' - permutation.Count)
First, we check to see if the current permutation meets the criteria that the total of its work hours equals the amount of hours remaining needed to complete the pattern and the number of shifts in the permutation equals the number of question marks in the pattern. If it does, then we add this permutation to the list of permutations. If it doesn't, we check to see if the total amount of work hours surpasses the amount of hours remaining or if the number of shifts has reached the number of question marks in the pattern. If so, then the permutation is not added. However, if we can still add more shifts, we will run a loop from i = 0 to remainingHours and make a copy of the permutation while adding i to this copied list in each iteration of the loop. Then, we will adjust the remaining hours and remaining shifts accordingly before calling the helper function recursively with the copied permutation.
Lastly, we can use these permutations to create a list of new schedules, replacing the ? characters in the pattern with the numbers from each permutation.
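For comparison, here is a compact sketch of the same recursive idea that writes the digits straight into a character buffer instead of building intermediate lists. The names ScheduleSketch, Generate and Fill are made up for illustration, and it assumes each day holds a single digit (0 to 9), since every position in the pattern is one character. Trying the digits in ascending order, leftmost question mark first, yields the schedules in lexicographic order without a separate sort.

```csharp
using System;
using System.Collections.Generic;

public static class ScheduleSketch
{
    // Returns every way to replace '?' with a digit so that all digits sum to workHours.
    public static List<string> Generate(int workHours, string pattern)
    {
        int remaining = workHours;
        var slots = new List<int>(); // positions of the '?' characters
        for (int i = 0; i < pattern.Length; i++)
        {
            if (pattern[i] == '?') slots.Add(i);
            else remaining -= pattern[i] - '0'; // hours that are already fixed
        }
        var results = new List<string>();
        Fill(pattern.ToCharArray(), slots, 0, remaining, results);
        return results;
    }

    private static void Fill(char[] buffer, List<int> slots, int slot, int remaining, List<string> results)
    {
        if (slot == slots.Count)
        {
            // Every '?' is filled; keep the schedule only if the hours add up exactly.
            if (remaining == 0) results.Add(new string(buffer));
            return;
        }
        // Assumption: one digit per day, so a single slot can take at most 9 hours.
        int max = Math.Min(9, remaining);
        for (int d = 0; d <= max; d++)
        {
            buffer[slots[slot]] = (char)('0' + d);
            Fill(buffer, slots, slot + 1, remaining - d, results);
        }
    }
}
```

With the example from the question, ScheduleSketch.Generate(24, "08??840") produces 0804840, 0813840, 0822840, 0831840 and 0840840.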
As per the OP, you already know the remaining hours, which I assume are given by the parameter dayHours. Breaking the problem down further, you need to replace the '?' characters with digits so that the sum of the new digits equals the remaining hours (dayHours).
You can do the following.
public IEnumerable<string> GenerateScheduler(int totalHours,int remainingHours,string replacementString)
{
var numberOfPlaces = replacementString.Count(x => x == '?');
// Every candidate is a number with at most numberOfPlaces digits: 0 .. 10^numberOfPlaces - 1
var candidateCount = (int)Math.Pow(10, numberOfPlaces);
var combinations = Enumerable.Range(0, candidateCount)
.Where(x=> SumOfDigit(x) == remainingHours).Select(x=>x.ToString().PadLeft(numberOfPlaces,'0').ToCharArray());
foreach(var item in combinations)
{
var i = 0;
yield return Regex.Replace(replacementString, "[?]", (m) => {return item[i++].ToString(); });
}
}
int SumOfDigit(int value)
{
int sum = 0;
while (value != 0)
{
int remainder;
value = Math.DivRem(value, 10, out remainder);
sum += remainder;
}
return sum;
}

compare two items in list, and then split into smaller list at index

I have a list of locations. I need to split the list wherever the distance between two consecutive locations is greater than, say, 30.
I can loop through the list and get the distance between each pair of locations; I am just not sure what the best approach is to split the list. I have read answers that break a list into chunks of a set size, but in my case the size varies depending on the distance between locations.
This could be really simple and I just can't see it. What I have so far is below. The code that compares the two items is pretty straightforward; it is purely the splitting of the list that I am stuck on. Currently my code would not include all the items from the original list: it would exclude the items before the first GetRange.
var unkownSegments = grouped.Where(x => x.ActivityType == null);
foreach (var group in unkownSegments)
{
var tempLists = new List<List<LocationResult>>();
for (int i = 0; i < group.Items.Count - 1; i++)
{
var point1 = group.Items[i];
var point2 = group.Items[i + 1];
var sCoord = new GeoCoordinate(point1.Lat, point1.Long);
var eCoord = new GeoCoordinate(point2.Lat, point2.Long);
var distance = sCoord.GetDistanceTo(eCoord);
if(distance > 30)
{
var tempList = group.Items.GetRange(i, group.Items.Count - i);
tempLists.Add(tempList);
}
}
}
Thank you for any help or suggestions.
To create a range (using the GetRange() method), you need to know where it begins and how many items it contains. If the distance between Item[i] and Item[i+1] is greater than 30, you know where the range ends: at index i. But you don't know where it begins (except for the first range, which begins at 0), because the beginning depends on the end of the previous range. So you need to introduce a new variable (called rangeStart in my example below) that holds this information. It starts with the value 0 (where the first range always begins), and you update its value whenever you add a new range (the next range always starts at index i+1).
After the for loop finishes, some points will remain, so you need to add those points as the last range. The whole method can then look like this:
var unkownSegments = grouped.Where(x => x.ActivityType == null);
foreach (var group in unkownSegments)
{
var tempLists = new List<List<LocationResult>>();
//This variable keeps track of the beginning of the next range
var rangeStart = 0;
for (int i = 0; i < group.Items.Count - 1; i++)
{
var point1 = group.Items[i];
var point2 = group.Items[i + 1];
var sCoord = new GeoCoordinate(point1.Lat, point1.Long);
var eCoord = new GeoCoordinate(point2.Lat, point2.Long);
var distance = sCoord.GetDistanceTo(eCoord);
if(distance > 30)
{
var tempList = group.Items.GetRange(rangeStart, i - rangeStart + 1);
tempLists.Add(tempList);
rangeStart = i + 1;//Next range will begin on the following item
}
}
if (group.Items.Count - rangeStart > 0)
{
//Add all remaining (not yet added) points as the last range.
var tempList = group.Items.GetRange(rangeStart, group.Items.Count - rangeStart);
tempLists.Add(tempList);
}
}
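The same bookkeeping can be pulled out into a small reusable helper. This is only a sketch: GapSplitter and SplitWhen are names invented here, and the break condition is passed in as a delegate so that the helper does not depend on GeoCoordinate. Instead of tracking rangeStart, it grows the current segment item by item and starts a new one whenever the condition holds between two neighbours.

```csharp
using System;
using System.Collections.Generic;

public static class GapSplitter
{
    // Splits items into consecutive segments, starting a new segment
    // whenever isBreak(previous, current) returns true.
    public static List<List<T>> SplitWhen<T>(IReadOnlyList<T> items, Func<T, T, bool> isBreak)
    {
        var segments = new List<List<T>>();
        if (items.Count == 0) return segments;

        var current = new List<T> { items[0] };
        for (int i = 1; i < items.Count; i++)
        {
            if (isBreak(items[i - 1], items[i]))
            {
                // Gap found: close the current segment and open a new one.
                segments.Add(current);
                current = new List<T>();
            }
            current.Add(items[i]);
        }
        segments.Add(current); // the last segment is never closed inside the loop
        return segments;
    }
}
```

With the types from the question, the call would look something like GapSplitter.SplitWhen(group.Items, (a, b) => new GeoCoordinate(a.Lat, a.Long).GetDistanceTo(new GeoCoordinate(b.Lat, b.Long)) > 30).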

Adding the calculated sum of a list to another list

I have a list of int. How can I sum all the numbers in the list and get the result?
List<int> FineListy = new List<int>();
Your code has a number of issues.
List<int> FineListy = new List<int>();
for (int i = 0; i < fineList.Count(); i++)
{
if (fineList[i] > 0)
{
FineListy.Add((fineList[i] += fineList[i]));
}
}
Firstly: C# naming conventions are such that local variables should start with lowercase letters and be camelCased. I recommend naming your variables totalsList and fineList respectively. For the sake of simplicity, I will use your current naming conventions below.
Next, you're doing FineListy.Add(fineList[i] += fineList[i]); which is the same as:
fineList[i] = fineList[i] * 2;
FineListy.Add(fineList[i]);
And you're doing it in a loop, so you will simply get a list of all items multiplied by 2.
Now you could fix this like so:
int total = 0;
for (int i = 0; i < fineList.Count; ++i)
{
if (fineList[i] > 0)
{
total += fineList[i];
}
}
FineListy.Add(total);
But you can use LINQ to do the same in a single line (I've split it across multiple lines to make it easier to read):
var total = fineList
.Where(v => v > 0)
.Sum();
FineListy.Add(total);
Or simply:
FineListy.Add(fineList.Where(v => v > 0).Sum());
If you have a list of int, why not use the Sum function?
int sum = FineListy.Sum();
This will add up all the numbers and give you the expected result. Now I see that you do an if check to make sure the number is greater than 0. So create a new list and add a number to it only if it is greater than 0:
List<int> NewList = new List<int>();
foreach (var number in IntegerList)
{
if (number > 0)
{
NewList.Add(number);
}
}
Finally, get the total:
int sum = NewList.Sum();
Or a one-line LINQ solution:
var result = fineList.Where(a => a > 0).Sum();
NewList.Add(result);
Yes, it doubles, because that's what you do here:
FineListy.Add((fineList[i] += fineList[i]));
You are saying: "Add fineList[i] + fineList[i] to my resulting collection FineListy".
So you get all elements doubled.
If you want to add up the values, you don't need a list, just a variable to store the total and the LINQ .Sum() method.
P.S.
I mean either one of these two (as others suggested):
FineListy.Add(fineList.Sum());
Or
int sum = fineList.Sum();

How to increment the index of a list for saving the next number in next index of list

I want to save the results of the zarb function, which is called 1000 times, in a list of size 1000. I think I have to increase the index of the list after every calculation to avoid saving the next result at the same index as the previous one. How can I do that?
var results = new List<float>(1000);
for (int z = 0; z < 1000; z++)
{
results.Add(zarb(sc,z));
//increase the index of results
}
foreach (var resultwithindex in results.Select((r, index) => new { result = r, Index = index }).OrderByDescending(r => r.result).Take(20))
{
MessageBox.Show(string.Format("{0}: {1}", resultwithindex.Index, resultwithindex.result));
}
Zarb function
public float zarb(int userid, int itemid)
{
float[] u_f = a[userid];
float[] i_f = b[itemid];
for (int i = 0; i < u_f.Length; i++)
{
result += u_f[i] * i_f[i];
}
return result;
}
No you don't. The Add method (surprisingly) adds an item into the list. It doesn't replace anything. You should read MSDN documentation for List<T>. Also, don't be afraid of trying and seeing the results before asking—you'll save time.
Maybe I don't understand the question, but you do not need an index for the list. The Add method will deal with it.
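A quick way to convince yourself, as a small sketch (AddDemo and Collect are names made up for this example, and the stored value is a placeholder for the zarb result): the number passed to the List<float> constructor is only a capacity hint, Count starts at 0, and each Add call appends at the next free index.

```csharp
using System.Collections.Generic;

public static class AddDemo
{
    // Fills a list with count values; the value for step z ends up at index z.
    public static List<float> Collect(int count)
    {
        var results = new List<float>(count); // capacity hint only; Count is still 0 here
        for (int z = 0; z < count; z++)
        {
            results.Add(z * 0.5f); // placeholder for zarb(sc, z); appended at index z
        }
        return results;
    }
}
```

After AddDemo.Collect(1000), the list has Count == 1000 and the result of step z sits at results[z], with no manual index handling.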
