Accumulate values of a list - c#

I have a list in which each object has two fields:
Date as DateTime
Estimated as double.
I have some values like this:
01/01/2019 2
01/02/2019 3
01/03/2019 4
... and so on.
I need to generate another list, same format, but accumulating the Estimated field, date by date. So the result must be:
01/01/2019 2
01/02/2019 5 (2+3)
01/03/2019 9 (5+4)
... and so on.
Right now, I'm calculating it in a for loop:
for (int iI = 0; iI < SData.TotalDays; iI++)
{
DateTime oCurrent = SData.ProjectStart.AddDays(iI);
oRet.Add(new GraphData(oCurrent, GetProperEstimation(oCurrent)));
}
Then, I can execute a Linq Sum for all the dates prior or equal to the current date:
private static double GetProperEstimation(DateTime pDate)
{
return Data.Where(x => x.Date.Date <= pDate.Date).Sum(x => x.Estimated);
}
It works, but the problem is that it is ABSOLUTELY slow, taking more than 1 minute for a 271-element list.
Is there a better way to do this?
Thanks in advance.

You can write a simple LINQ-like extension method that accumulates values. This version is generalized to allow different input and output types:
static class ExtensionMethods
{
public static IEnumerable<TOut> Accumulate<TIn, TOut>(this IEnumerable<TIn> source, Func<TIn,double> getFunction, Func<TIn,double,TOut> createFunction)
{
double accumulator = 0;
foreach (var item in source)
{
accumulator += getFunction(item);
yield return createFunction(item, accumulator);
}
}
}
Example usage:
public static void Main()
{
var list = new List<Foo>
{
new Foo { Date = new DateTime(2018,1,1), Estimated = 1 },
new Foo { Date = new DateTime(2018,1,2), Estimated = 2 },
new Foo { Date = new DateTime(2018,1,3), Estimated = 3 },
new Foo { Date = new DateTime(2018,1,4), Estimated = 4 },
new Foo { Date = new DateTime(2018,1,5), Estimated = 5 }
};
var accumulatedList = list.Accumulate
(
(item) => item.Estimated, //Given an item, get the value to be summed
(item, sum) => new { Item = item, Sum = sum } //Given an item and the sum, create an output element
);
foreach (var item in accumulatedList)
{
Console.WriteLine("{0:yyyy-MM-dd} {1}", item.Item.Date, item.Sum);
}
}
Output:
2018-01-01 1
2018-01-02 3
2018-01-03 6
2018-01-04 10
2018-01-05 15
This approach will only require one iteration over the set so should perform much better than a series of sums.

This is exactly the job of MoreLinq.Scan:
var newModels = list.Scan((x, y) => new MyModel(y.Date, x.Estimated + y.Estimated));
newModels will have the values you want.
In (x, y), x is the previously accumulated item (it carries the running sum so far) and y is the current item in the enumeration.
Why is your query slow?
Because Where iterates your collection from the beginning every time you call it, so the number of operations grows quadratically rather than linearly: 1 + 2 + 3 + ... + n = (n^2)/2 + n/2.

You can try this. Simple yet effective.
var i = 0.0; // double, to match the type of Estimated
var result = myList.Select(x => new MyObject
{
Date = x.Date,
Estimated = i = i + x.Estimated
}).ToList();
Edit: try it this way:
.Select(x => new GraphData(x.Date, i = i + x.Estimated))

I will assume that what you described is really what you need.
Algorithm
Create a list or array from the original values, ordered by date ascending, then accumulate in a single pass:
double sumValues = 0;
foreach (var x in collection.OrderBy(v => v.Date))
{
sumValues += x.Estimated; // accumulates all past values plus the current value
oRet.Add(new GraphData(x.Date, sumValues));
}
The first step (ordering the values) is the most important. The foreach itself will be very fast.
See sort.

Related

Filtering an array of objects to remove the ones that don't have the greatest value for a property

How can I filter an array of objects to remove the ones that don't have the greatest value for Age, grouped by IMCB first?
class Program
{
static void Main(string[] args)
{
Container[] containers = buildContainers();
// how can I get an array of containers that only contains the IMCB with the greatest Age.
// eg: [ {IMCB = "123456", Age = 3, Name = "third"}, {IMCB = "12345", Age = 4, Name = "fourth"} ]
}
static Container[] buildContainers()
{
List<Container> containers = new List<Container>();
containers.Add(new Container() { IMCB = "123456", Age = 1, Name = "first" });
containers.Add(new Container() { IMCB = "123456", Age = 3, Name = "third" });
containers.Add(new Container() { IMCB = "12345", Age = 2, Name = "second" });
containers.Add(new Container() { IMCB = "123456", Age = 2, Name = "second" });
containers.Add(new Container() { IMCB = "12345", Age = 4, Name = "fourth" });
return containers.ToArray();
}
}
class Container
{
public string Name { get; set; }
public string IMCB { get; set; }
public int Age { get; set; }
}
Since you want to select elements in the source array that have a Property value that matches to the maximum value of that Property in a group of elements that have a common value in another Property, you can:
Group the elements in the array specifying the Property that defines a group in this context.
Select elements in each Group where the value of another Property matches the maximum value of the same Property in the Group. Here, SelectMany() is used to flatten the results of the selection into a single sequence, otherwise you'd get an IEnumerable<Container>[] instead of a Container[].
Return an array (to match the source collection type) of the resulting elements.
Extra: the resulting elements may need to be ordered in some way, e.g., by the Property that created the groupings and/or the Property that selects the maximum value in each Group.
// [...]
Container[] containers = buildContainers();
// [...]
var filteredOnMaxAgePerGroup = containers
.GroupBy(cnt => cnt.IMCB)
.SelectMany(grp => grp
.Where(elm => elm.Age == grp.Max(val => val.Age)))
.ToArray();
To order the results by the Grouping Property (IMCB, here), add OrderBy() before ToArray():
.OrderBy(elm => elm.IMCB)
To order by the Property that defines the maximum value (Age, here), add OrderBy() or OrderByDescending() and ThenBy() or ThenByDescending(), or a combination of these, depending on what fits better, before ToArray():
.OrderBy(elm => elm.Age)
// Or in descending order
.OrderByDescending(elm => elm.Age)
// or, to define a sub-order based on the Group name
.OrderBy(elm => elm.Age).ThenBy(elm => elm.IMCB)
// or
.OrderByDescending(elm => elm.Age).ThenBy(elm => elm.IMCB)
@Jimi's answer works, but shorter code is not always better code. For LINQ to Objects you do not want to calculate the maximum value for every item in the group; it is better to take it out of the loop:
var selection = containers
.GroupBy(cnt => cnt.IMCB)
.SelectMany(grp =>
{
var max = grp.Max(v => v.Age);
return grp.Where(elm => elm.Age == max);
})
.ToArray();
The difference is an O(n^2) algorithm versus an O(n) one, and that may be the difference between 50 hours of calculation and 0.5 seconds of calculation.
Count    O(n)               O(n^2)                 Test
1        0,0000069 seconds  0,0000048 seconds      OK
10       0,0000211 seconds  0,0000319 seconds      OK
100      0,0000492 seconds  0,0020465 seconds      OK
1000     0,0004217 seconds  0,1992285 seconds      OK
10000    0,0041992 seconds  19,7042282 seconds     OK
100000   0,0405747 seconds  2012,1564200 seconds   OK
1000000  0,4202187 seconds  did not finish, estimated 200000 seconds
The test code:
static void Test(int count)
{
List<Container> containers = new List<Container>();
var tmp = buildContainers();
for (int i = 0; i < count; ++i)
{
containers.AddRange(tmp);
}
Console.Write(count);
var st = new System.Diagnostics.Stopwatch();
st.Start();
var selection = containers
.GroupBy(cnt => cnt.IMCB)
.SelectMany(grp =>
{
var max = grp.Max(v => v.Age);
return grp.Where(elm => elm.Age == max);
})
.ToArray();
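st.Stop();
// Elapsed is already in time units, so it does not depend on Stopwatch.Frequency the way raw ElapsedTicks does
Console.Write("\tO(n): " + st.Elapsed.TotalSeconds.ToString("0.0000000") + " seconds");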
st = new System.Diagnostics.Stopwatch();
st.Start();
var result = containers
.GroupBy(cnt => cnt.IMCB)
.SelectMany(grp => grp.Where(elm => elm.Age == grp.Max(v => v.Age)))
.ToArray();
st.Stop();
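// Elapsed is already in time units, so it does not depend on Stopwatch.Frequency the way raw ElapsedTicks does
Console.Write("\tO(n^2): " + st.Elapsed.TotalSeconds.ToString("0.0000000") + " seconds");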
Console.WriteLine("\tTest: " + (result.SequenceEqual(selection) ? "OK" : "ERROR"));
}
static void Main(string[] args)
{
Test(1);
Test(10);//warmup
Console.Clear();
Test(1);
Test(10);
Test(100);
Test(1000);
Test(10000);
Test(100000);
Test(1000000);
Console.ReadKey();
}
First, figure out what the greatest age is:
var maxAge = containers.Max( x => x.Age );
Then select non-matching items:
var result = containers.Where( x => x.Age < maxAge );

C# Join multiple collections into one

I have a problem with joining multiple collections into one.
I need to combine collections with data from many sensors into one, so that the output file has the values from all sensors for each time. For example, if one sensor has no data for a given time, the file should be filled with 0.
Please help me, I am desperate.
public class MeasuredData
{
public DateTime Time { get; }
public double Value { get; }
public MeasuredData(DateTime time, double value)
{
Time = time;
Value = value;
}
}
If you have multiple variables containing List<MeasuredData>, one for each sensor, you can group them in an array and then query them.
First, you need an extension method to round the DateTimes, per @jdweng, if you aren't already canonicalizing them as you acquire them.
public static DateTime Round(this DateTime dt, TimeSpan rnd) {
if (rnd == TimeSpan.Zero)
return dt;
else {
var ansTicks = dt.Ticks + Math.Sign(dt.Ticks) * rnd.Ticks / 2;
return new DateTime(ansTicks - ansTicks % rnd.Ticks);
}
}
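As a quick check of how this rounding behaves, here is a minimal sketch that repeats the same Round logic and applies it to two made-up readings. The class names DateTimeRounding and RoundDemo and the sample timestamps are assumptions for illustration only.

```csharp
using System;

public static class DateTimeRounding
{
    // Same rounding logic as the extension method above.
    public static DateTime Round(this DateTime dt, TimeSpan rnd)
    {
        if (rnd == TimeSpan.Zero)
            return dt;
        var ansTicks = dt.Ticks + Math.Sign(dt.Ticks) * rnd.Ticks / 2;
        return new DateTime(ansTicks - ansTicks % rnd.Ticks);
    }
}

public static class RoundDemo
{
    public static void Main()
    {
        var roundTo = TimeSpan.FromSeconds(1);
        // Hypothetical readings that fall on either side of the half-second mark.
        var a = new DateTime(2019, 1, 1, 12, 0, 0, 400);
        var b = new DateTime(2019, 1, 1, 12, 0, 0, 600);

        Console.WriteLine(a.Round(roundTo) == new DateTime(2019, 1, 1, 12, 0, 0)); // True (rounded down)
        Console.WriteLine(b.Round(roundTo) == new DateTime(2019, 1, 1, 12, 0, 1)); // True (rounded up)
    }
}
```

Readings from different sensors that land within the same rounded second end up with the same key, which is what lets them be joined on time.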
Now you can create an array of the sensor reading Lists:
var sensorData = new[] { sensor0, sensor1, sensor2, sensor3 };
Then you can extract all the rounded times to create the left hand side of the table:
var roundTo = TimeSpan.FromSeconds(1);
var times = sensorData.SelectMany(sdl => sdl.Select(md => md.Time.Round(roundTo)))
.Distinct()
.Select(t => new { Time = t, Measurements = Enumerable.Empty<MeasuredData>() });
Then you can join each sensor to the table:
foreach (var oneSensorData in sensorData)
times = times.GroupJoin(oneSensorData, t => t.Time, md => md.Time.Round(roundTo),
(t, mdj) => new { t.Time, Measurements = t.Measurements.Concat(mdj) });
Finally, you can convert each row to the time and a List of measurements ordered by time:
var ans = times.Select(tm => new { tm.Time, Measurements = tm.Measurements.ToList() })
.OrderBy(tm => tm.Time);
If you wanted to flatten the List of measurements out to fields in the answer, you would need to do that manually with another Select.
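One possible way to do that flattening, with a 0 for sensors that have no reading at a given time, is sketched below. This is not the answer's code: SensorTable and Build are hypothetical names, the times are assumed to be rounded already, and when a sensor reports the same time twice the sketch simply keeps the last value.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class MeasuredData
{
    public DateTime Time { get; }
    public double Value { get; }
    public MeasuredData(DateTime time, double value) { Time = time; Value = value; }
}

public static class SensorTable
{
    // One row per distinct time; each row holds one value per sensor (0 when missing).
    public static List<KeyValuePair<DateTime, double[]>> Build(IList<List<MeasuredData>> sensorData)
    {
        // One time-to-value dictionary per sensor; the last reading wins on duplicate times.
        var lookups = sensorData
            .Select(sd => sd.GroupBy(md => md.Time)
                            .ToDictionary(g => g.Key, g => g.Last().Value))
            .ToList();

        return sensorData
            .SelectMany(sd => sd.Select(md => md.Time))
            .Distinct()
            .OrderBy(t => t)
            .Select(t => new KeyValuePair<DateTime, double[]>(
                t,
                lookups.Select(l => l.TryGetValue(t, out var v) ? v : 0.0).ToArray()))
            .ToList();
    }

    public static void Main()
    {
        var t0 = new DateTime(2019, 1, 1, 12, 0, 0);
        // Hypothetical data: sensor1 has no reading at t0.
        var sensor0 = new List<MeasuredData> { new MeasuredData(t0, 1.5), new MeasuredData(t0.AddSeconds(1), 2.5) };
        var sensor1 = new List<MeasuredData> { new MeasuredData(t0.AddSeconds(1), 7.0) };

        foreach (var row in Build(new[] { sensor0, sensor1 }))
        {
            var time = row.Key.ToString("HH:mm:ss", CultureInfo.InvariantCulture);
            var values = string.Join(";", row.Value.Select(v => v.ToString(CultureInfo.InvariantCulture)));
            Console.WriteLine(time + ";" + values);
        }
    }
}
```

Each row's array has the same length as the number of sensors, so the columns line up when the rows are written to a file.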
Assuming you have something to join on, you can use Enumerable.Join:
var result = collection1.Join(collection2,
/* whatever your join is */ x => x.id,
y => y.id,
(a, b) => new { x = a, y = b });
foreach (var obj in result)
{
Console.WriteLine($"{obj.x.id}, {obj.y.id}");
}
This prints the ids of the two objects, but you could access any of their members. The link is probably more helpful, but you didn't give us much info.

How do I get total Qty using one linq query?

I have two linq queries, one to get confirmedQty and another one is to get unconfirmedQty.
There is a condition for getting unconfirmedQty. It should be average instead of sum.
result = Sum(confirmedQty) + Avg(unconfirmedQty)
Is there any way to just write one query and get the desired result instead of writing two separate queries?
My Code
class Program
{
static void Main(string[] args)
{
List<Item> items = new List<Item>(new Item[]
{
new Item{ Qty = 100, IsConfirmed=true },
new Item{ Qty = 40, IsConfirmed=false },
new Item{ Qty = 40, IsConfirmed=false },
new Item{ Qty = 40, IsConfirmed=false },
});
int confirmedQty = Convert.ToInt32(items.Where(o => o.IsConfirmed == true).Sum(u => u.Qty));
int unconfirmedQty = Convert.ToInt32(items.Where(o => o.IsConfirmed != true).Average(u => u.Qty));
//Output => Total : 140
Console.WriteLine("Total : " + (confirmedQty + unconfirmedQty));
Console.Read();
}
public class Item
{
public int Qty { get; set; }
public bool IsConfirmed { get; set; }
}
}
Actually, the accepted answer enumerates your items collection 2N + 1 times, and it adds unnecessary complexity to your original solution. If I met this piece of code
(from t in items
let confirmedQty = items.Where(o => o.IsConfirmed == true).Sum(u => u.Qty)
let unconfirmedQty = items.Where(o => o.IsConfirmed != true).Average(u => u.Qty)
let total = confirmedQty + unconfirmedQty
select new { tl = total }).FirstOrDefault();
it would take some time to understand what type of data you are projecting items to. Yes, this query is a strange projection. It creates a SelectIterator to project each item of the sequence, then it creates some range variables, which involves iterating items twice, and finally it selects the first projected item. Basically, you have wrapped your original queries in an additional useless query:
items.Select(i => {
var confirmedQty = items.Where(o => o.IsConfirmed).Sum(u => u.Qty);
var unconfirmedQty = items.Where(o => !o.IsConfirmed).Average(u => u.Qty);
var total = confirmedQty + unconfirmedQty;
return new { tl = total };
}).FirstOrDefault();
The intent is hidden deep in the code and you still have the same two nested queries. What can you do here? You can simplify your two queries, make them more readable and show your intent clearly:
int confirmedTotal = items.Where(i => i.IsConfirmed).Sum(i => i.Qty);
// NOTE: Average will throw an exception if there are no unconfirmed items!
double unconfirmedAverage = items.Where(i => !i.IsConfirmed).Average(i => i.Qty);
int total = confirmedTotal + (int)unconfirmedAverage;
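If the exception mentioned in the NOTE is a concern, one option (a sketch, not part of the original answer) is to put DefaultIfEmpty(0) in front of Average so that an empty set of unconfirmed items contributes 0. The SafeTotal class and its Total method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Qty { get; set; }
    public bool IsConfirmed { get; set; }
}

public static class SafeTotal
{
    public static int Total(IEnumerable<Item> items)
    {
        int confirmedTotal = items.Where(i => i.IsConfirmed).Sum(i => i.Qty);
        // DefaultIfEmpty(0) yields a single 0 when there are no unconfirmed items,
        // so Average returns 0 instead of throwing InvalidOperationException.
        double unconfirmedAverage = items.Where(i => !i.IsConfirmed)
                                         .Select(i => i.Qty)
                                         .DefaultIfEmpty(0)
                                         .Average();
        return confirmedTotal + (int)unconfirmedAverage;
    }

    public static void Main()
    {
        var onlyConfirmed = new List<Item> { new Item { Qty = 100, IsConfirmed = true } };
        Console.WriteLine(Total(onlyConfirmed)); // 100

        var mixed = new List<Item>
        {
            new Item { Qty = 100, IsConfirmed = true },
            new Item { Qty = 40, IsConfirmed = false },
            new Item { Qty = 40, IsConfirmed = false },
            new Item { Qty = 40, IsConfirmed = false },
        };
        Console.WriteLine(Total(mixed)); // 140
    }
}
```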
If performance is more important than readability, then you can calculate the total in a single pass over the items (moved to an extension method for readability):
public static int Total(this IEnumerable<Item> items)
{
int confirmedTotal = 0;
int unconfirmedTotal = 0;
int unconfirmedCount = 0;
foreach (var item in items)
{
if (item.IsConfirmed)
{
confirmedTotal += item.Qty;
}
else
{
unconfirmedCount++;
unconfirmedTotal += item.Qty;
}
}
if (unconfirmedCount == 0)
return confirmedTotal;
// NOTE: Will not throw if there are no unconfirmed items
return confirmedTotal + unconfirmedTotal / unconfirmedCount;
}
Usage is simple:
items.Total();
BTW, the second solution from the accepted answer is not correct. It's just a coincidence that it returns the correct value, because all of your unconfirmed items have equal Qty. That solution calculates the sum instead of the average. A solution with grouping would look like:
var total =
items.GroupBy(i => i.IsConfirmed)
.Select(g => g.Key ? g.Sum(i => i.Qty) : (int)g.Average(i => i.Qty))
.Sum();
Here you group the items into two groups, confirmed and unconfirmed. Then you calculate either the sum or the average based on the group key, and add up the two group values. This solution is neither readable nor efficient, but it is correct.

Can I use an anonymous type in a List<T> instead of a helper class?

I need a list with some objects for calculation.
my current code looks like this
private class HelperClass
{
public DateTime TheDate {get;set;}
public TimeSpan TheDuration {get;set;}
public bool Enabled {get;set;}
}
private TimeSpan TheMethod()
{
// create entries for every date
var items = new List<HelperClass>();
foreach(DateTime d in GetAllDatesOrdered())
{
items.Add(new HelperClass { TheDate = d, Enabled = GetEnabled(d), });
}
// calculate the duration for every entry
for (int i = 0; i < items.Count; i++)
{
var item = items[i];
if (i == items.Count -1) // the last one
item.TheDuration = DateTime.Now - item.TheDate;
else
item.TheDuration = items[i+1].TheDate - item.TheDate;
}
// calculate the total duration and return the result
var result = TimeSpan.Zero;
foreach(var item in items.Where(x => x.Enabled))
result = result.Add(item.TheDuration);
return result;
}
Now I find it a bit ugly just to introduce a type for my calculation (HelperClass).
My first approach was to use Tuple<DateTime, TimeSpan, bool> like I usually do this but since I need to modify the TimeSpan after creating the instance I can't use Tuple since Tuple.ItemX is readonly.
I thought about an anonymous type, but I can't figure out how to init my List
var item1 = new { TheDate = DateTime.Now,
TheDuration = TimeSpan.Zero, Enabled = true };
var items = new List<?>(); // How to declare this ???
items.Add(item1);
Using a projection looks like the way forward to me - but you can compute the durations as you go, by "zipping" your collection with itself, offset by one. You can then do the whole method in one query:
// Materialize the result to avoid computing possibly different sequences
var allDatesAndNow = GetAllDatesOrdered().Concat(new[] { DateTime.Now })
.ToList();
return allDatesAndNow.Zip(allDatesAndNow.Skip(1),
(x, y) => new { Enabled = GetEnabled(x),
Duration = y - x })
.Where(x => x.Enabled)
.Aggregate(TimeSpan.Zero, (t, pair) => t + pair.Duration);
The Zip call pairs up each date with its subsequent one, converting each pair of values into a duration and an enabled flag. The Where call filters out disabled pairs. The Aggregate call sums the durations from the resulting pairs.
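To make the pairing concrete, here is a small self-contained sketch of the same Zip, Where and Aggregate chain on fixed data. ZipDemo, EnabledDuration, the sample dates, the fixed "now" value and the enabled rule are all assumptions made up for the example; they stand in for GetAllDatesOrdered(), DateTime.Now and GetEnabled().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipDemo
{
    // Sums the gaps that start at an enabled date; `now` closes the last gap.
    public static TimeSpan EnabledDuration(
        IEnumerable<DateTime> orderedDates, Func<DateTime, bool> isEnabled, DateTime now)
    {
        var all = orderedDates.Concat(new[] { now }).ToList();
        return all.Zip(all.Skip(1), (x, y) => new { Enabled = isEnabled(x), Duration = y - x })
                  .Where(p => p.Enabled)
                  .Aggregate(TimeSpan.Zero, (t, p) => t + p.Duration);
    }

    public static void Main()
    {
        var dates = new[]
        {
            new DateTime(2019, 1, 1),
            new DateTime(2019, 1, 3),
            new DateTime(2019, 1, 4),
        };
        var now = new DateTime(2019, 1, 6);               // fixed stand-in for DateTime.Now
        Func<DateTime, bool> isEnabled = d => d.Day != 3; // hypothetical rule: 3 January is disabled

        // Pairs: (1st to 3rd, 2 days, enabled), (3rd to 4th, 1 day, disabled), (4th to 6th, 2 days, enabled)
        Console.WriteLine(EnabledDuration(dates, isEnabled, now).TotalDays); // 4
    }
}
```

Because Zip stops at the shorter sequence, pairing the list with itself skipped by one yields exactly one pair per original date.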
You could do it with LINQ like:
var itemsWithoutDuration = GetAllDatesOrdered()
.Select(d => new { TheDate = d, Enabled = GetEnabled(d) })
.ToList();
var items = itemsWithoutDuration
.Select((it, k) => new { TheDate = it.TheDate, Enabled = it.Enabled,
TheDuration = (k == (itemsWithoutDuration.Count - 1) ? DateTime.Now : itemsWithoutDuration[k+1].TheDate) - it.TheDate })
.ToList();
But by that point the Tuple is both more readable and more concise!

LINQ to SQL and a running total on ordered results

I want to display a customer's accounting history in a DataGridView and I want to have a column that displays the running total for their balance. The old way I did this was by getting the data, looping through the data, and adding rows to the DataGridView one-by-one and calculating the running total at that time. Lame. I would much rather use LINQ to SQL, or LINQ if not possible with LINQ to SQL, to figure out the running totals so I can just set DataGridView.DataSource to my data.
This is a super-simplified example of what I'm shooting for. Say I have the following class.
class Item
{
public DateTime Date { get; set; }
public decimal Amount { get; set; }
public decimal RunningTotal { get; set; }
}
I would like a L2S, or LINQ, statement that could generate results that look like this:
Date Amount RunningTotal
12-01-2009 5 5
12-02-2009 -5 0
12-02-2009 10 10
12-03-2009 5 15
12-04-2009 -15 0
Notice that there can be multiple items with the same date (12-02-2009). The results should be sorted by date before the running totals are calculated. I'm guessing this means I'll need two statements, one to get the data and sort it and a second to perform the running total calculation.
I was hoping Aggregate would do the trick, but it doesn't work like I was hoping. Or maybe I just couldn't figure it out.
This question seemed to be going after the same thing I wanted, but I don't see how the accepted/only answer solves my problem.
Any ideas on how to pull this off?
Edit
Combining the answers from Alex and DOK, this is what I ended up with:
decimal runningTotal = 0;
var results = FetchDataFromDatabase()
.OrderBy(item => item.Date)
.Select(item => new Item
{
Amount = item.Amount,
Date = item.Date,
RunningTotal = runningTotal += item.Amount
});
Using closures and anonymous method:
List<Item> myList = FetchDataFromDatabase();
decimal currentTotal = 0;
var query = myList
.OrderBy(i => i.Date)
.Select(i =>
{
currentTotal += i.Amount;
return new {
Date = i.Date,
Amount = i.Amount,
RunningTotal = currentTotal
};
}
);
foreach (var item in query)
{
//do with item
}
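One thing to keep in mind with the closure approach is that the query is deferred: the lambda, including its side effect on the captured variable, runs again each time the query is enumerated. The sketch below illustrates this with made-up amounts; DeferredDemo and EnumerateTwice are hypothetical names. Calling ToList() once and reusing the list (for example as a DataSource) avoids the drift.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the running totals produced by the first and second enumeration
    // of the same deferred query.
    public static (List<decimal> First, List<decimal> Second) EnumerateTwice(IEnumerable<decimal> amounts)
    {
        decimal runningTotal = 0;
        var query = amounts.Select(a => runningTotal += a); // nothing runs yet

        var first = query.ToList();  // runs the lambda once per element
        var second = query.ToList(); // runs it again; runningTotal was never reset
        return (first, second);
    }

    public static void Main()
    {
        var (first, second) = EnumerateTwice(new[] { 5m, -5m, 10m });
        Console.WriteLine(string.Join(", ", first));  // 5, 0, 10
        Console.WriteLine(string.Join(", ", second)); // 15, 10, 20
    }
}
```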
How about this: (credit goes to this source)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
delegate string CreateGroupingDelegate(int i);
static void Main(string[] args)
{
List<int> list = new List<int>() { 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 69, 2007};
int running_total = 0;
var result_set =
from x in list
select new
{
num = x,
running_total = (running_total = running_total + x)
};
foreach (var v in result_set)
{
Console.WriteLine( "list element: {0}, total so far: {1}",
v.num,
v.running_total);
}
Console.ReadLine();
}
}
}
In case this hasn't been answered yet, I have a solution that I have been using in my projects. This is pretty similar to an Oracle partitioned group. The key is to have the where clause in the running total match the orig list, then group it by date.
var itemList = GetItemsFromDBYadaYadaYada();
var withRuningTotals = from i in itemList
select new {i.Date, i.Amount,
RunningTotal = itemList.Where( x=> x.Date == i.Date).
GroupBy(x=> x.Date).
Select(DateGroup=> DateGroup.Sum(x=> x.Amount)).Single()};
Aggregate can be used to obtain a running total as well:
var src = new [] { 1, 4, 3, 2 };
var running = src.Aggregate(new List<int>(), (a, i) => {
a.Add(a.Count == 0 ? i : a.Last() + i);
return a;
});
Most of the other answers to this, which properly set the running totals within the objects, rely on a side-effect variable, which is not in the spirit of functional coding and the likes of .Aggregate(). This solution eliminates the side-effect variable.
(NB - This solution will run on the client as with other answers, and so may not be optimal for what you require.)
var results = FetchDataFromDatabase()
.OrderBy(item => item.Date)
.Aggregate(new List<Item>(), (list, i) =>
{
var item = new Item
{
Amount = i.Amount,
Date = i.Date,
RunningTotal = i.Amount + (list.LastOrDefault()?.RunningTotal ?? 0)
};
return list.Append(item).ToList();
// Or, possibly more efficient:
// list.Add(item);
// return list;
});
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
var list = new List<int>{1, 5, 4, 6, 8, 11, 3, 12};
int running_total = 0;
list.ForEach(x=> Console.WriteLine(running_total = x+running_total));
}
}
