Aggregate and group by 15-second intervals using C# LINQ

I have the following DataTable:
SAMPLE_TIME CPU
-----------------------------
14:59:32 3
14:59:20 2
14:59:14 9
14:58:57 2
14:58:48 1
What I want is to sum "CPU" over 15-second intervals and write the average to a new DataTable.
So, I want to get the following result using LINQ:
SAMPLE_TIME CPU
-----------------------------
14:59:32 0.33
14:59:17 0.6
14:59:02 0.2
I tried the following, but I can't find a way to make it work:
dtTA = (from dr1 in dtTA.AsEnumerable()
group dr1 by dr1.Field<DateTime>("SAMPLE_TIME") into g
select new
{
ST = g.Key,
CPU = g.Sum(h => h.Field<double>("CPU")),
}).ToDataTable();
What should I change?

You can create a function which truncates DateTimes to 15 seconds precision.
private static DateTime By15Seconds(DateTime d)
{
long fifteenSeconds = TimeSpan.FromSeconds(15).Ticks;
return new DateTime((d.Ticks / fifteenSeconds) * fifteenSeconds);
}
Then use it like this
dtTA = (from dr1 in dtTA.AsEnumerable()
group dr1 by By15Seconds(dr1.Field<DateTime>("SAMPLE_TIME")) into g
select new {
ST = g.Key,
CPU = g.Sum(h => h.Field<double>("CPU")) / 15.0,
}).ToDataTable();
Note: This creates 15-second blocks starting at 00:00:00. If you want another start value for the seconds, you can first subtract this value, do the truncation, and finally re-add this value. This is done in this generalized extension method:
public static DateTime BySeconds(this DateTime d, int blockSize, int startAt = 0)
{
long blockTicks = TimeSpan.FromSeconds(blockSize).Ticks;
long startTicks = TimeSpan.FromSeconds(startAt).Ticks;
return new DateTime(((d.Ticks - startTicks) / blockTicks * blockTicks) + startTicks);
}
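As a rough illustration of the truncation above, here is a minimal sketch that applies the same arithmetic to the sample rows from the question. The in-memory tuple array, the date 2020-01-01, and the plain (non-extension) local function are assumptions made only so the example is self-contained; the original works on a DataTable.

```csharp
using System;
using System.Linq;

// Same arithmetic as BySeconds above, written as a local function for this sketch.
static DateTime BySeconds(DateTime d, int blockSize, int startAt = 0)
{
    long blockTicks = TimeSpan.FromSeconds(blockSize).Ticks;
    long startTicks = TimeSpan.FromSeconds(startAt).Ticks;
    return new DateTime((d.Ticks - startTicks) / blockTicks * blockTicks + startTicks);
}

// Hypothetical stand-in for the DataTable rows (the date part is made up).
var samples = new[]
{
    (Time: new DateTime(2020, 1, 1, 14, 59, 32), Cpu: 3.0),
    (Time: new DateTime(2020, 1, 1, 14, 59, 20), Cpu: 2.0),
    (Time: new DateTime(2020, 1, 1, 14, 59, 14), Cpu: 9.0),
    (Time: new DateTime(2020, 1, 1, 14, 58, 57), Cpu: 2.0),
    (Time: new DateTime(2020, 1, 1, 14, 58, 48), Cpu: 1.0),
};

// Group by the truncated timestamp, then divide each block's sum by 15 seconds.
var blocks = samples
    .GroupBy(s => BySeconds(s.Time, 15))
    .Select(g => (Start: g.Key, Average: g.Sum(s => s.Cpu) / 15.0))
    .OrderByDescending(b => b.Start)
    .ToList();

foreach (var b in blocks)
    Console.WriteLine($"{b.Start:HH:mm:ss} {b.Average:0.00}");
```

With these rows the blocks start at 14:59:30, 14:59:15, 14:59:00 and 14:58:45. That is four blocks rather than the three windows in the question, because the blocks are aligned to 00:00:00 and not to the latest sample; passing a suitable startAt shifts the alignment.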

Related

Accumulate values of a list

I have a list in which each object has two fields:
Date as DateTime
Estimated as double.
I have some values like this:
01/01/2019 2
01/02/2019 3
01/03/2019 4
... and so on.
I need to generate another list, same format, but accumulating the Estimated field, date by date. So the result must be:
01/01/2019 2
01/02/2019 5 (2+3)
01/03/2019 9 (5+4) ... and so.
Right now, I'm calculating it in a for loop:
for (int iI = 0; iI < SData.TotalDays; iI++)
{
DateTime oCurrent = SData.ProjectStart.AddDays(iI);
oRet.Add(new GraphData(oCurrent, GetProperEstimation(oCurrent)));
}
Then, I can execute a Linq Sum for all the dates prior or equal to the current date:
private static double GetProperEstimation(DateTime pDate)
{
return Data.Where(x => x.Date.Date <= pDate.Date).Sum(x => x.Estimated);
}
It works, but the problem is that it is ABSOLUTELY slow, taking more than 1 minute for a 271-element list.
Is there a better way to do this?
Thanks in advance.
You can write a simple LINQ-like extension method that accumulates values. This version is generalized to allow different input and output types:
static class ExtensionMethods
{
public static IEnumerable<TOut> Accumulate<TIn, TOut>(this IEnumerable<TIn> source, Func<TIn,double> getFunction, Func<TIn,double,TOut> createFunction)
{
double accumulator = 0;
foreach (var item in source)
{
accumulator += getFunction(item);
yield return createFunction(item, accumulator);
}
}
}
Example usage:
public static void Main()
{
var list = new List<Foo>
{
new Foo { Date = new DateTime(2018,1,1), Estimated = 1 },
new Foo { Date = new DateTime(2018,1,2), Estimated = 2 },
new Foo { Date = new DateTime(2018,1,3), Estimated = 3 },
new Foo { Date = new DateTime(2018,1,4), Estimated = 4 },
new Foo { Date = new DateTime(2018,1,5), Estimated = 5 }
};
var accumulatedList = list.Accumulate
(
(item) => item.Estimated, //Given an item, get the value to be summed
(item, sum) => new { Item = item, Sum = sum } //Given an item and the sum, create an output element
);
foreach (var item in accumulatedList)
{
Console.WriteLine("{0:yyyy-MM-dd} {1}", item.Item.Date, item.Sum);
}
}
Output:
2018-01-01 1
2018-01-02 3
2018-01-03 6
2018-01-04 10
2018-01-05 15
This approach requires only one iteration over the set, so it should perform much better than a series of sums.
This is exactly the job of MoreLinq.Scan:
var newModels = list.Scan((x, y) => new MyModel(y.Date, x.Estimated + y.Estimated));
newModels will hold the values you want.
In (x, y), x is the previous (already accumulated) item and y is the current item in the enumeration.
Why is your query slow?
Because Where iterates your collection from the beginning every time you call it, so the number of operations grows quadratically, not linearly: 1 + 2 + 3 + ... + n = (n^2)/2 + n/2.
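To make the cost difference concrete, the sketch below computes the same running totals twice: once by re-scanning the list for every date, as in the question, and once by carrying a running total forward in a single pass. The tuple list and its values are hypothetical sample data, not the asker's.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: one estimate per day (2, 3, 4, 5, 6).
var data = Enumerable.Range(0, 5)
    .Select(i => (Date: new DateTime(2019, 1, 1).AddDays(i), Estimated: (double)(i + 2)))
    .ToList();

// Quadratic: every element triggers a fresh scan of the whole list.
var slow = data
    .Select(d => data.Where(x => x.Date <= d.Date).Sum(x => x.Estimated))
    .ToList();

// Linear: one pass over the date-ordered list, carrying the total forward.
var fast = new List<double>();
double runningTotal = 0;
foreach (var d in data.OrderBy(x => x.Date))
{
    runningTotal += d.Estimated;
    fast.Add(runningTotal);
}

Console.WriteLine(string.Join(", ", fast));
```

Both lists contain the same totals; only the single-pass version keeps the work proportional to the number of elements.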
You can try this. Simple yet effective.
var i = 0.0; // double, because Estimated is a double
var result = myList.Select(x => new MyObject
{
Date = x.Date,
Estimated = i = i + x.Estimated
}).ToList();
Edit: try it this way:
.Select(x => new GraphData(x.Date, i = i + x.Estimated))
I will assume that what you described is really what you need.
Algorithm
Create a list or array of the original values ordered by date ascending, then accumulate:
double sumValues = 0;
foreach (var x in collection) {
sumValues += x.Estimated; // accumulates all past values plus the present value
oRet.Add(new GraphData(x.Date, sumValues));
}
The first step (ordering the values) is the most important. The foreach itself will be very fast.
see sort

Use Linq let sum to calculate an average

How can I calculate the sum and the average of time differences using LINQ's let clause? The code below returns only the list of time differences.
var query1 = from c in DBCollection.Find(Query_Collection).ToList()
let DtCreateDate = Convert.ToDateTime(c["CreatedDate"])
let DtModifiedDate = Convert.ToDateTime(c["LastModifiedDate"])
let difference = (DtModifiedDate - DtCreateDate).TotalSeconds select new { difference };
By doing the following:
var query1 = from c in DBCollection.Find(Query_Collection).ToList()
let DtCreateDate = Convert.ToDateTime(c["CreatedDate"])
let DtModifiedDate = Convert.ToDateTime(c["LastModifiedDate"])
let difference = (DtModifiedDate - DtCreateDate).TotalSeconds
let averageSum = new DateTime((DtCreateDate.Ticks + DtModifiedDate.Ticks) / 2).AddSeconds(difference) // midpoint of the two dates, plus the difference
select new { difference, averageSum };
Above, the difference between the two given dates is saved in the variable 'difference'.
I have added another variable called 'averageSum', which stores the average (midpoint) of the two dates and then adds the difference to it.
I don't understand how you calculate the average for every row, but to accumulate custom values you can use the Aggregate method as shown below:
var totals = DBCollection.Find(Query_Collection).ToList()
.Select(e => new {DtCreateDate = Convert.ToDateTime(e["CreatedDate"]), DtModifiedDate = Convert.ToDateTime(e["LastModifiedDate"])})
.Select(e => new {Diff = (e.DtModifiedDate - e.DtCreateDate).TotalSeconds, Av = 0 /* Calculate average for the row */} )
.Aggregate(new {Diff = (double) 0, Av = 0}, (acc, cur) => new {Diff = acc.Diff + cur.Diff, Av = acc.Av + cur.Av});
The result of the statement is a single object which contains the sums of the differences and of the averages over the whole list.
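If what is wanted is simply the total and the mean of the per-row differences, one possible reading of the question, a sketch could look like the following. The tuple array stands in for the documents returned by DBCollection.Find and is purely hypothetical; the let clause mirrors the one in the question.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the documents returned by the database query.
var docs = new[]
{
    (Created: new DateTime(2020, 1, 1, 10, 0, 0), Modified: new DateTime(2020, 1, 1, 10, 0, 30)),
    (Created: new DateTime(2020, 1, 1, 11, 0, 0), Modified: new DateTime(2020, 1, 1, 11, 1, 0)),
    (Created: new DateTime(2020, 1, 1, 12, 0, 0), Modified: new DateTime(2020, 1, 1, 12, 1, 30)),
};

// Project each document to its time difference in seconds, materializing once.
var differences = (from d in docs
                   let difference = (d.Modified - d.Created).TotalSeconds
                   select difference).ToList();

// Sum and Average are computed over the whole sequence, not per row.
double totalSeconds = differences.Sum();
double averageSeconds = differences.Average();

Console.WriteLine($"{totalSeconds} {averageSeconds}");
```

The let clause only computes a per-row value; the aggregation has to happen on the resulting sequence, which is why Sum and Average are called after the query.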

Linq query to find average time difference per group

I have a datatable that I want to query to get the average time difference per groups of Case ID. My data looks as follows.
Name Case ID Incept Time Edit Time
---------------------------------------------------------------------------
Blue 1 2017-02-26T02:35:49-04:00 2017-03-26T02:35:49-04:00
Blue 1 2017-02-26T02:34:49-04:00 2017-04-26T02:35:49-04:00
Blue 1 2017-02-26T02:33:49-04:00 2017-05-26T02:35:49-04:00
Blue 2 2017-02-26T02:32:49-04:00 2017-06-26T02:35:49-04:00
Blue 2 2017-02-26T02:31:49-04:00 2017-07-26T01:35:49-04:00
Blue 2 2017-02-26T02:30:49-04:00 2017-08-26T03:35:49-04:00
Red 5 2017-02-26T02:25:49-04:00 2017-09-26T04:35:49-04:00
Red 5 2017-02-26T02:15:49-04:00 2017-10-26T05:35:49-04:00
Red 1 2017-02-26T02:05:49-04:00 2017-11-26T02635:49-04:00
Red 1 2017-02-26T01:35:49-04:00 2017-12-26T02:35:49-04:00
Red 5 2017-02-26T05:35:49-04:00 2017-12-27T02:35:49-04:00
So far I have the following query which can get into each group of Case ID and get the min and max values.
private IEnumerable<DataRow> _data;
var query =
from data in this._data
group data by data.Field<string>("Name") into groups
select new
{
formName = groups.Key,
caseDiffs =
from d in groups
group d by d.Field<string>("Case ID") into grps
select new
{
min = grps.Min(t =>
DateTimeOffset.ParseExact(t.Field<string>("Incept Time"), "yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture)
),
max = grps.Max(t =>
DateTimeOffset.ParseExact(t.Field<string>("Edit Time"), "yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture)
)
}
};
My questions are
1) is it possible to include the difference between the min and max values (per case ID group) to the query
2) At the end how can I get the averages calculated like the diagram below
UPDATED to reflect your changed question...
I've split this into three separate queries so that you can read it more easily (you can combine if you want):
//convert the data using a projection query
var query1 = from data in _data
let inceptTime = DateTimeOffset.ParseExact(data.Field<string>("Incept Time"), "yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture)
let editTime = DateTimeOffset.ParseExact(data.Field<string>("Edit Time"), "yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture)
let difference = editTime - inceptTime
select new
{
name = data.Field<string>("Name"),
caseId = data.Field<string>("Case ID"),
inceptTime,
editTime,
difference
};
//group by caseID (also by NAME, but that won't matter for this grouping and is needed in query3)
var query2 = from data in query1
group data by new { data.caseId, data.name } into groups
let min = groups.Min(x => x.inceptTime)
let max = groups.Max(x => x.editTime)
select new
{
name = groups.Key.name,
caseId = groups.Key.caseId,
min,
max,
diff = max - min
};
//now group by name
var query3 = from data in query2
group data by new { data.name } into groups
select new
{
name = groups.Key.name,
minDiff = groups.Min(x => x.diff),
maxDiff = groups.Max(x => x.diff),
avgDiff = new TimeSpan((long)groups.Average(x => x.diff.Ticks)),
};
NOTE: The "edit time" for the 9th record is in an invalid format
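The avgDiff line in query3 relies on averaging tick counts, since Enumerable.Average has no TimeSpan overload. A minimal sketch of that step on its own, using made-up names and durations rather than the table above:

```csharp
using System;
using System.Linq;

// Hypothetical per-case differences, already computed (as query2 would produce).
var rows = new[]
{
    (Name: "Blue", Diff: TimeSpan.FromHours(1)),
    (Name: "Blue", Diff: TimeSpan.FromHours(3)),
    (Name: "Red",  Diff: TimeSpan.FromHours(4)),
};

// Average the Ticks (a long) per name, then convert the result back to a TimeSpan.
var averages = rows
    .GroupBy(r => r.Name)
    .ToDictionary(
        g => g.Key,
        g => new TimeSpan((long)g.Average(r => r.Diff.Ticks)));

foreach (var pair in averages)
    Console.WriteLine($"{pair.Key}: {pair.Value}");
```

The cast to long truncates any fractional tick, which is below the resolution of TimeSpan anyway.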
You just need to define a few let variables in your LINQ query. About 3 more lines, in fact. Your grouping LINQ should look like this:
var query =
from data in this._data
group data by data.Field<string>("Name") into groups
select new
{
formName = groups.Key,
caseDiffs = from d in groups group d by d.Field<string>("Case ID") into grps
// three variables here, so that you can do the
// date math that you require!
let minDt = grps.Min(t =>
DateTimeOffset.ParseExact(t.Field<string>("Incept Time"),
"yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture))
let maxDt = grps.Max(t =>
DateTimeOffset.ParseExact(t.Field<string>("Edit Time"),
"yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture))
let diffInSecs = (maxDt - minDt).TotalSeconds
select new
{
min = minDt,
max = maxDt,
diff = diffInSecs
}
};
Hope that helps!

ASP.NET MVC Filter datetime by weeks

I've got a Web API and a Get method, returning a query:
var query = from results in context.Table
where results.Date>= startDate && results.Date <= endDate
select new
{
Week = { this is where I need a method to group by weeks },
Average = results.Where(x => x.Number).Average()
}
return query.ToList();
I want to calculate the average for each 7 days (that being the first week).
Example:
Average 1 ... day 7 (Week 1)
Average 2 ... day 14 (Week 2)
How can I do that? Being given an interval of datetimes, to filter it by weeks (not week of year)
Try this (not tested with tables)
var avgResult = context.QuestionaireResults
.Where(r => (r.DepartureDate >= startDate && r.DepartureDate <= endDate)).ToList()
.GroupBy( g => (Decimal.Round(g.DepartureDate.Day / 7)+1))
.Select( g => new
{
Week = g.Key,
Avg = g.Average(n => n.Number)
});
You will need to group by the number of days, since a reference date, divided by 7, so
.GroupBy(x => Math.Floor(((x.DepartureDate - new DateTime(1980,1,1)).TotalDays + 2) / 7))
Subtracting "Jan 1, 1980" from your departure date, gives you a TimeSpan object with the difference between the two dates. The TotalDays property of that timespan gives you timespan in days. Adding 2 corrects for the fact that "Jan 1, 1980" was a Tuesday. Dividing by 7 gives you the number of weeks since then. Math.Floor rounds it down, so that you get a consistent integer for the week, given any day of the week or portion of days within the week.
You could simplify a little by picking a reference date that is a Sunday (assuming that is your "first day of the week"), so you don't have to add 2 to correct. Like so:
.GroupBy(x => Math.Floor(((x.DepartureDate - new DateTime(1979,12,30)).TotalDays) / 7))
If you are sure that your data all falls within a single calendar year, you could maybe use the Calendar.GetWeekOfYear method to figure out the week, but I am not sure it would be any simpler.
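Since the question asks for weeks counted from the start of the interval rather than calendar weeks, the same divide-by-7 idea can also use startDate as the reference date. The sketch below is a hedged illustration on in-memory data; the tuple list, the dates and the Number values are assumptions, and whether a given LINQ provider can translate this expression to SQL is not something the thread confirms.

```csharp
using System;
using System.Linq;

var startDate = new DateTime(2016, 9, 1);

// Hypothetical results: 14 consecutive days with Number = 0..13.
var results = Enumerable.Range(0, 14)
    .Select(i => (Date: startDate.AddDays(i), Number: (double)i))
    .ToList();

// Week 1 covers days 0-6 after startDate, week 2 covers days 7-13, and so on.
var weekly = results
    .GroupBy(r => (r.Date.Date - startDate.Date).Days / 7 + 1)
    .Select(g => (Week: g.Key, Average: g.Average(r => r.Number)))
    .OrderBy(w => w.Week)
    .ToList();

foreach (var w in weekly)
    Console.WriteLine($"Week {w.Week}: {w.Average}");
```

Integer division does the flooring here, so no Math.Floor call is needed as long as every date is on or after startDate.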
Why not write a stored procedure? There may be some limits on your flexibility with LINQ, because GroupBy normally groups by a value (the value of the referenced "thing"), so you can group by State or by Age. But I guess you can group by week too... (new thought)
Add a property called EndOfWeek. For example, if the week ends on Sunday, then EndOfWeek = 9.2.16 for this week, whereas last week was 8.28.16, etc. Then you can easily group, but you still have to arrange the data.
I know I didn't answer the question, but I hope I sparked some ideas that help you solve the problem.
--------- UPDATED ----------------
Simple solution: loop through your records and, for each record, determine its EndOfWeek. After this you have a groupable value and can easily group by EndOfWeek. Now, @MikeMcCaughan, please tell me how this doesn't work. Is it illogical to extend an object? What are you talking about?
------------ HERE IS THE CODE ----------------
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace SandboxConsole
{
class Program
{
static void Main(string[] args)
{
var t = new Transactions();
List<Transactions> transactions = t.GetTransactions();
// Now let's add a Weeks end date so we can determine the average per week
foreach(var transaction in transactions)
{
var transactionDayOfWeek = transaction.TransactionDate;
int daysUntilEndOfWeek_Sat = ((int)DayOfWeek.Saturday - (int)transactionDayOfWeek.DayOfWeek + 7) % 7;
transaction.Newly_Added_Property_To_Group_By_Week_To_Get_Averages = transactionDayOfWeek.AddDays(daysUntilEndOfWeek_Sat).ToShortDateString();
//Console.WriteLine("{0} {")
}
foreach(var weekEnd in transactions.GroupBy(tt => tt.Newly_Added_Property_To_Group_By_Week_To_Get_Averages))
{
decimal weekTotal = 0;
foreach(var trans in weekEnd)
{
weekTotal += trans.Amount;
}
var weekAverage = weekTotal / 7;
Console.WriteLine("Week End: {0} - Avg {1}", weekEnd.Key.ToString(), weekAverage.ToString("C"));
}
Console.ReadKey();
}
}
class Transactions
{
public int Id { get; set; }
public string SomeOtherProp { get; set; }
public DateTime TransactionDate { get; set; }
public decimal Amount { get; set; }
public string Newly_Added_Property_To_Group_By_Week_To_Get_Averages { get; set; }
public List<Transactions> GetTransactions()
{
var results = new List<Transactions>();
for(var i = 0; i<100; i++)
{
results.Add(new Transactions
{
Id = i,
SomeOtherProp = "Customer " + i.ToString(),
TransactionDate = GetRandomDate(i),
Amount = GetRandomAmount()
});
}
return results;
}
public DateTime GetRandomDate(int i)
{
Random gen = new Random();
DateTime startTime = new DateTime(2016, 1, 1);
int range = (DateTime.Today - startTime).Days + i;
return startTime.AddDays(gen.Next(range));
}
public int GetRandomAmount()
{
Random rnd = new Random();
int amount = rnd.Next(1000, 10000);
return amount;
}
}
}

Aggregate data in DataTable in time intervals (5 minutes)

I have a DataTable
DataTable dt = new DataTable();
dt.Columns.Add("ts");
dt.Columns.Add("agent");
dt.Columns.Add("host");
dt.Columns.Add("metric");
dt.Columns.Add("val");
My data comes in at 15-second intervals, and I need the MAX "val" over each 5-minute period for each host/agent/metric (including the 5-minute timestamp indicator).
This is the closest thing that I have:
var q1 = from r in dt.Rows.Cast<DataRow>()
let ts = Convert.ToDateTime(r[0].ToString())
group r by new DateTime(ts.Year, ts.Month, ts.Day, ts.Hour, ts.Minute, ts.Second)
into g
select new
{
ts = g.Key,
agentName = g.Select(r => r[1].ToString()),
Sum = g.Sum(r => (int.Parse(r[4].ToString()))),
Average = g.Average(r => (int.Parse(r[4].ToString()))),
Max = g.Max(r => (int.Parse(r[4].ToString())))
};
Pretty lousy
To group the times by five minute intervals we can simply divide the Ticks in the time by the size of our interval, which we can pre-compute. In this case, it's the number of ticks in five minutes:
long ticksInFiveMinutes = TimeSpan.TicksPerMinute * 5;
The query then becomes:
var query = from r in dt.Rows.Cast<DataRow>()
let ts = Convert.ToDateTime(r[0].ToString())
let agent = r[1].ToString()
let host = r[2].ToString()
group r by new { ticks = ts.Ticks / ticksInFiveMinutes, agent, host }
into g
let key = new DateTime(g.Key.ticks * ticksInFiveMinutes)
select new
{
ts = key,
agentName = g.Select(r => r[1].ToString()),
Sum = g.Sum(r => (int.Parse(r[4].ToString()))),
Average = g.Average(r => (int.Parse(r[4].ToString()))),
Max = g.Max(r => (int.Parse(r[4].ToString())))
};
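To show the tick-division grouping end to end, here is a small sketch that takes the maximum per five-minute bucket for each agent/host/metric combination. It uses a hypothetical in-memory tuple array with typed fields instead of the string-typed DataTable, and value tuples instead of anonymous types as the grouping key; both choices are assumptions made for brevity.

```csharp
using System;
using System.Linq;

long ticksInFiveMinutes = TimeSpan.TicksPerMinute * 5;

// Hypothetical rows standing in for the DataTable (ts, agent, host, metric, val).
var rows = new[]
{
    (Ts: new DateTime(2020, 1, 1, 10, 0, 15), Agent: "a1", Host: "h1", Metric: "cpu", Val: 10),
    (Ts: new DateTime(2020, 1, 1, 10, 4, 45), Agent: "a1", Host: "h1", Metric: "cpu", Val: 40),
    (Ts: new DateTime(2020, 1, 1, 10, 5, 0),  Agent: "a1", Host: "h1", Metric: "cpu", Val: 20),
    (Ts: new DateTime(2020, 1, 1, 10, 1, 0),  Agent: "a2", Host: "h1", Metric: "cpu", Val: 7),
};

// The bucket index is the tick count divided by the interval size (integer division).
var maxima = rows
    .GroupBy(r => (Bucket: r.Ts.Ticks / ticksInFiveMinutes, r.Agent, r.Host, r.Metric))
    .Select(g => (
        Ts: new DateTime(g.Key.Bucket * ticksInFiveMinutes), // start of the 5-minute bucket
        g.Key.Agent,
        g.Key.Host,
        g.Key.Metric,
        Max: g.Max(r => r.Val)))
    .OrderBy(m => m.Ts).ThenBy(m => m.Agent)
    .ToList();

foreach (var m in maxima)
    Console.WriteLine($"{m.Ts:HH:mm} {m.Agent} {m.Host} {m.Metric} {m.Max}");
```

Multiplying the bucket index back by the interval size recovers the timestamp at which each five-minute window starts.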
How about the following approach...
Define a custom hash-code method:
public DateTime Arrange5Min(DateTime value)
{
var stamp = value; // DateTime has no "timestamp" member; use the value itself
stamp = stamp.AddMinutes(-(stamp.Minute % 5));
stamp = stamp.AddMilliseconds(-stamp.Millisecond - 1000 * stamp.Second);
return stamp;
}
public int MyGetHashCode(DataRow r)
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + r[1].ToString().GetHashCode();
hash = hash * 23 + r[2].ToString().GetHashCode();
hash = hash * 23 + r[3].ToString().GetHashCode();
var stamp = Arrange5Min(Convert.ToDateTime(r[0].ToString()));
hash = hash * 23 + stamp.GetHashCode();
return hash;
}
}
borrowed from here: What is the best algorithm for an overridden System.Object.GetHashCode? and LINQ aggregate and group by periods of time
Then use the function in LINQ:
var q1 = from r in dt.Rows.Cast<DataRow>()
group r by MyGetHashCode(r)
into g
let intermediate = new {
Row = g.First(),
Max = g.Max(v => (int.Parse(v[4].ToString())))
}
select
new {
Time = Arrange5Min(Convert.ToDateTime(intermediate.Row[0].ToString())),
Host = intermediate.Row[2].ToString(),
Agent = intermediate.Row[1].ToString(),
Metric = intermediate.Row[3].ToString(),
Max = intermediate.Max
};
