Aggregate data in DataTable in time intervals (5 minutes) - c#

I have a DataTable
DataTable dt = new DataTable();
dt.Columns.Add("ts");
dt.Columns.Add("agent");
dt.Columns.Add("host");
dt.Columns.Add("metric");
dt.Columns.Add("val");
My data arrives at 15-second intervals, and I need the MAX "val" over each 5-minute period for each host/agent/metric (including the timestamp that marks the 5-minute period).
This is the closest thing that I have:
var q1 = from r in dt.Rows.Cast<DataRow>()
let ts = Convert.ToDateTime(r[0].ToString())
group r by new DateTime(ts.Year, ts.Month, ts.Day, ts.Hour, ts.Minute, ts.Second)
into g
select new
{
ts = g.Key,
agentName = g.Select(r => r[1].ToString()),
Sum = g.Sum(r => (int.Parse(r[4].ToString()))),
Average = g.Average(r => (int.Parse(r[4].ToString()))),
Max = g.Max(r => (int.Parse(r[4].ToString())))
};
Pretty lousy

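To group the times into five-minute intervals, divide each timestamp's Ticks by the size of the interval in ticks, which we can pre-compute. Integer division discards the remainder, so every timestamp in the same five-minute window yields the same quotient. In this case, the interval is the number of ticks in five minutes: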
long ticksInFiveMinutes = TimeSpan.TicksPerMinute * 5;
The query then becomes:
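var query = from r in dt.Rows.Cast<DataRow>()
let ts = Convert.ToDateTime(r[0].ToString())
// Pull out the key columns so they can be used in the group key
// (metric is included because the question asks per host/agent/metric).
let agent = r[1].ToString()
let host = r[2].ToString()
let metric = r[3].ToString()
group r by new { ticks = ts.Ticks / ticksInFiveMinutes, agent, host, metric }
into g
// Multiply back to get the start of the five-minute interval.
let key = new DateTime(g.Key.ticks * ticksInFiveMinutes)
select new
{
ts = key,
agentName = g.Key.agent,
hostName = g.Key.host,
metricName = g.Key.metric,
Sum = g.Sum(r => (int.Parse(r[4].ToString()))),
Average = g.Average(r => (int.Parse(r[4].ToString()))),
Max = g.Max(r => (int.Parse(r[4].ToString())))
};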
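A minimal, self-contained sketch of the same tick-division idea, assuming an in-memory table with the question's column names. The class name FiveMinuteBuckets, the helper names, and the sample rows are illustrative assumptions, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

public static class FiveMinuteBuckets
{
    // Number of ticks in one five-minute bucket.
    private static readonly long TicksPerBucket = TimeSpan.TicksPerMinute * 5;

    // Rounds a timestamp down to the start of its five-minute bucket.
    public static DateTime Floor(DateTime ts) =>
        new DateTime(ts.Ticks / TicksPerBucket * TicksPerBucket);

    // Builds a tiny sample table (made-up data, string columns as in the question).
    public static DataTable Sample()
    {
        var dt = new DataTable();
        foreach (var name in new[] { "ts", "agent", "host", "metric", "val" })
            dt.Columns.Add(name);
        dt.Rows.Add("2020-01-01 10:01:15", "a1", "h1", "cpu", "10");
        dt.Rows.Add("2020-01-01 10:04:45", "a1", "h1", "cpu", "30");
        dt.Rows.Add("2020-01-01 10:05:00", "a1", "h1", "cpu", "20");
        return dt;
    }

    // One result per (bucket, agent, host, metric) with the maximum "val".
    public static List<(DateTime Bucket, string Agent, string Host, string Metric, int Max)>
        MaxPerBucket(DataTable dt) =>
        (from r in dt.Rows.Cast<DataRow>()
         let ts = DateTime.Parse(r["ts"].ToString(), CultureInfo.InvariantCulture)
         group r by new
         {
             Bucket = Floor(ts),
             Agent = r["agent"].ToString(),
             Host = r["host"].ToString(),
             Metric = r["metric"].ToString()
         } into g
         orderby g.Key.Bucket
         select (g.Key.Bucket, g.Key.Agent, g.Key.Host, g.Key.Metric,
                 g.Max(row => int.Parse(row["val"].ToString(), CultureInfo.InvariantCulture))))
        .ToList();

    public static void Main()
    {
        foreach (var x in MaxPerBucket(Sample()))
            Console.WriteLine($"{x.Bucket:o} {x.Agent} {x.Host} {x.Metric} {x.Max}");
    }
}
```

Grouping by an anonymous type compares all key members for equality, so rows only share a group when the bucket, agent, host, and metric all match.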

How about the following approach...
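Define a helper that rounds a timestamp down to five minutes, and a GetHashCode-style method that combines it with the agent, host, and metric:
public DateTime Arrange5Min(DateTime value)
{
// Round down to the start of the five-minute interval.
var stamp = value;
stamp = stamp.AddMinutes(-(stamp.Minute % 5));
stamp = stamp.AddMilliseconds(-stamp.Millisecond - 1000 * stamp.Second);
return stamp;
}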
public int MyGetHashCode(DataRow r)
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + r[1].ToString().GetHashCode();
hash = hash * 23 + r[2].ToString().GetHashCode();
hash = hash * 23 + r[3].ToString().GetHashCode();
var stamp = Arrange5Min(Convert.ToDateTime(r[0].ToString()));
hash = hash * 23 + stamp.GetHashCode();
return hash;
}
}
Borrowed from "What is the best algorithm for an overridden System.Object.GetHashCode?" and "LINQ aggregate and group by periods of time".
Then use the function in the LINQ query:
var q1 = from r in dt.Rows.Cast<DataRow>()
// Caveat: different keys can produce the same hash code, which would merge their rows.
group r by MyGetHashCode(r)
into g
let intermediate = new {
Row = g.First(),
Max = g.Max(v => int.Parse(v[4].ToString()))
}
select
new {
Time = Arrange5Min(Convert.ToDateTime(intermediate.Row[0].ToString())),
Host = intermediate.Row[2].ToString(),
Agent = intermediate.Row[1].ToString(),
Metric = intermediate.Row[3].ToString(),
Max = intermediate.Max
};

Related

Aggregate and Group by over 15 second using c# linq

I have the following DataTable:
SAMPLE_TIME CPU
-----------------------------
14:59:32 3
14:59:20 2
14:59:14 9
14:58:57 2
14:58:48 1
What I want is to sum the values over 15-second intervals and put the averages into a new DataTable.
So, I want to get the following result using LINQ:
SAMPLE_TIME CPU
-----------------------------
14:59:32 0.33
14:59:17 0.6
14:59:02 0.2
I tried the query below, but I can't find a way to make it work:
dtTA = (from dr1 in dtTA.AsEnumerable()
group dr1 by dr1.Field<DateTime>("SAMPLE_TIME") into g
select new
{
ST = g.Key,
CPU = g.Sum(h => h.Field<double>("CPU")),
}).ToDataTable();
What should I change in it?
You can create a function which truncates DateTimes to 15 seconds precision.
private static DateTime By15Seconds(DateTime d)
{
long fifteenSeconds = TimeSpan.FromSeconds(15).Ticks;
return new DateTime((d.Ticks / fifteenSeconds) * fifteenSeconds);
}
Then use it like this
dtTA = (from dr1 in dtTA.AsEnumerable()
group dr1 by By15Seconds(dr1.Field<DateTime>("SAMPLE_TIME")) into g
select new {
ST = g.Key,
CPU = g.Sum(h => h.Field<double>("CPU")) / 15.0,
}).ToDataTable();
Note: This creates 15-second blocks starting at 00:00:00. If you want another start value for the seconds, you can first subtract this value, do the truncation, and finally add the value back. This is done in this generalized extension method:
public static DateTime BySeconds(this DateTime d, int blockSize, int startAt = 0)
{
long blockTicks = TimeSpan.FromSeconds(blockSize).Ticks;
long startTicks = TimeSpan.FromSeconds(startAt).Ticks;
return new DateTime(((d.Ticks - startTicks) / blockTicks * blockTicks) + startTicks);
}
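As a rough illustration of how the start offset shifts the block boundaries, here is a self-contained sketch that wraps the extension method in a static class and groups the question's sample values by block. The class name TimeBlocks, the SumPerBlock helper, and the date 2020-01-01 are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimeBlocks
{
    // Truncates d to the start of its block; blocks begin startAt seconds past a multiple of blockSize.
    public static DateTime BySeconds(this DateTime d, int blockSize, int startAt = 0)
    {
        long blockTicks = TimeSpan.FromSeconds(blockSize).Ticks;
        long startTicks = TimeSpan.FromSeconds(startAt).Ticks;
        return new DateTime(((d.Ticks - startTicks) / blockTicks * blockTicks) + startTicks);
    }

    // The sample values from the question, placed on an arbitrary date.
    public static List<(DateTime Time, double Cpu)> Sample() => new List<(DateTime, double)>
    {
        (new DateTime(2020, 1, 1, 14, 59, 32), 3),
        (new DateTime(2020, 1, 1, 14, 59, 20), 2),
        (new DateTime(2020, 1, 1, 14, 59, 14), 9),
        (new DateTime(2020, 1, 1, 14, 58, 57), 2),
        (new DateTime(2020, 1, 1, 14, 58, 48), 1),
    };

    // Sums the values that fall into each block, keyed by block start.
    public static Dictionary<DateTime, double> SumPerBlock(
        IEnumerable<(DateTime Time, double Cpu)> samples, int blockSize, int startAt = 0) =>
        samples.GroupBy(s => s.Time.BySeconds(blockSize, startAt))
               .ToDictionary(g => g.Key, g => g.Sum(s => s.Cpu));

    public static void Main()
    {
        foreach (var pair in SumPerBlock(Sample(), 15).OrderBy(p => p.Key))
            Console.WriteLine($"{pair.Key:o} {pair.Value}");
    }
}
```

With startAt = 0, the sample at 14:59:20 lands in the block starting at 14:59:15; with startAt = 2 it lands in the block starting at 14:59:17.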

Use Linq let sum to calculate an average

How can I calculate the sum of the time differences and their average using LINQ's let clause? I tried the code below, but it returns only the list of time differences.
var query1 = from c in DBCollection.Find(Query_Collection).ToList()
let DtCreateDate = Convert.ToDateTime(c["CreatedDate"])
let DtModifiedDate = Convert.ToDateTime(c["LastModifiedDate"])
let difference = (DtModifiedDate - DtCreateDate).TotalSeconds select new { difference };
You can do the following:
var query1 = from c in DBCollection.Find(Query_Collection).ToList()
let DtCreateDate = Convert.ToDateTime(c["CreatedDate"])
let DtModifiedDate = Convert.ToDateTime(c["LastModifiedDate"])
let difference = (DtModifiedDate - DtCreateDate).TotalSeconds
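let averageSum = DtCreateDate.AddSeconds(difference / 2).AddSeconds(difference) // DateTime values cannot be added or divided directly, so take the midpoint of the two dates and then add the difference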
select new { difference, averageSum };
Above, the 'difference' between two given dates is saved in the variable difference.
I have added another variable called 'averageSum', that now stores the value of the average between the two dates, and then adds the difference to the average.
I don't understand how you calculate the average for every row, but to accumulate custom values you can use the Aggregate method as shown below:
var query1 = DBCollection.Find(Query_Collection).ToList()
.Select(e => new {DtCreateDate = Convert.ToDateTime(e["CreatedDate"]), DtModifiedDate = Convert.ToDateTime(e["LastModifiedDate"])})
.Select(e => new {Diff = (e.DtModifiedDate - e.DtCreateDate).TotalSeconds, Av = 0 /* Calculate average for the row */} )
.Aggregate(new {Diff = (double) 0, Av = 0}, (acc, cur) => new {Diff = acc.Diff + cur.Diff, Av = acc.Av + cur.Av});
The result of the statement is a single object that contains the sums of the difference and the average over the whole list.
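If the goal is simply the total and the mean of the time differences, a let clause followed by Sum and Average may be enough. The sketch below assumes the records are already available as (Created, Modified) pairs; the class name DurationStats and the method Summarize are hypothetical, and the database lookup from the question is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DurationStats
{
    // Returns the total and the average of (Modified - Created) in seconds.
    public static (double Total, double Average) Summarize(
        IEnumerable<(DateTime Created, DateTime Modified)> records)
    {
        var differences = (from r in records
                           let difference = (r.Modified - r.Created).TotalSeconds
                           select difference).ToList();

        // Enumerable.Average throws on an empty sequence, so guard against it.
        return differences.Count == 0
            ? (0.0, 0.0)
            : (differences.Sum(), differences.Average());
    }

    public static void Main()
    {
        var start = new DateTime(2020, 1, 1, 10, 0, 0);
        var records = new[]
        {
            (start, start.AddSeconds(30)),
            (start, start.AddSeconds(90)),
        };
        var stats = Summarize(records);
        Console.WriteLine($"total={stats.Total} average={stats.Average}");
    }
}
```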

Linq query to find average time difference per group

I have a datatable that I want to query to get the average time difference per groups of Case ID. My data looks as follows.
Name Case ID Incept Time Edit Time
---------------------------------------------------------------------------
Blue 1 2017-02-26T02:35:49-04:00 2017-03-26T02:35:49-04:00
Blue 1 2017-02-26T02:34:49-04:00 2017-04-26T02:35:49-04:00
Blue 1 2017-02-26T02:33:49-04:00 2017-05-26T02:35:49-04:00
Blue 2 2017-02-26T02:32:49-04:00 2017-06-26T02:35:49-04:00
Blue 2 2017-02-26T02:31:49-04:00 2017-07-26T01:35:49-04:00
Blue 2 2017-02-26T02:30:49-04:00 2017-08-26T03:35:49-04:00
Red 5 2017-02-26T02:25:49-04:00 2017-09-26T04:35:49-04:00
Red 5 2017-02-26T02:15:49-04:00 2017-10-26T05:35:49-04:00
Red 1 2017-02-26T02:05:49-04:00 2017-11-26T02635:49-04:00
Red 1 2017-02-26T01:35:49-04:00 2017-12-26T02:35:49-04:00
Red 5 2017-02-26T05:35:49-04:00 2017-12-27T02:35:49-04:00
So far I have the following query which can get into each group of Case ID and get the min and max values.
private IEnumerable<DataRow> _data;
var query =
from data in this._data
group data by data.Field<string>("Name") into groups
select new
{
formName = groups.Key,
caseDiffs =
from d in groups
group d by d.Field<string>("Case ID") into grps
select new
{
min = grps.Min(t =>
DateTimeOffset.ParseExact(t.Field<string>("Incept Time"), "yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture)
),
max = grps.Max(t =>
DateTimeOffset.ParseExact(t.Field<string>("Edit Time"), "yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture)
)
}
};
My questions are:
1) Is it possible to include the difference between the min and max values (per Case ID group) in the query?
2) At the end, how can I get the averages calculated like the diagram below?
UPDATED to reflect your changed question...
I've split this into three separate queries so that you can read it more easily (you can combine if you want):
//convert the data using a projection query
var query1 = from data in _data
let inceptTime = DateTimeOffset.ParseExact(data.Field<string>("Incept Time"), "yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture)
let editTime = DateTimeOffset.ParseExact(data.Field<string>("Edit Time"), "yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture)
let difference = editTime - inceptTime
select new
{
name = data.Field<string>("Name"),
caseId = data.Field<string>("Case ID"),
inceptTime,
editTime,
difference
};
//group by caseID (also by NAME, but that won't matter for this grouping and is needed in query3)
var query2 = from data in query1
group data by new { data.caseId, data.name } into groups
let min = groups.Min(x => x.inceptTime)
let max = groups.Max(x => x.editTime)
select new
{
name = groups.Key.name,
caseId = groups.Key.caseId,
min,
max,
diff = max - min
};
//now group by name
var query3 = from data in query2
group data by new { data.name } into groups
select new
{
name = groups.Key.name,
minDiff = groups.Min(x => x.diff),
maxDiff = groups.Max(x => x.diff),
avgDiff = new TimeSpan((long)groups.Average(x => x.diff.Ticks)),
};
NOTE: The "edit time" for the 9th record is in an invalid format
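The averaging step works because Enumerable.Average has no overload for TimeSpan, so the durations are averaged as tick counts and converted back. A small sketch of just that step, with a hypothetical class name TimeSpanAverage and a Parse helper that uses the three-letter zzz offset specifier:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class TimeSpanAverage
{
    // Parses timestamps shaped like the sample data, e.g. 2017-02-26T02:35:49-04:00.
    public static DateTimeOffset Parse(string text) =>
        DateTimeOffset.ParseExact(text, "yyyy-MM-dd'T'HH:mm:sszzz", CultureInfo.InvariantCulture);

    // Averages durations by averaging their tick counts.
    public static TimeSpan Average(IEnumerable<TimeSpan> spans) =>
        new TimeSpan((long)spans.Average(s => s.Ticks));

    public static void Main()
    {
        var diff = Parse("2017-03-26T02:35:49-04:00") - Parse("2017-02-26T02:35:49-04:00");
        var average = Average(new[] { TimeSpan.FromHours(1), TimeSpan.FromHours(3) });
        Console.WriteLine($"{diff} {average}");
    }
}
```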
You just need to define a few let variables in your LINQ query. About 3 more lines, in fact. Your grouping LINQ should look like this:
var query =
from data in this._data
group data by data.Field<string>("Name") into groups
select new
{
formName = groups.Key,
caseDiffs = from d in groups group d by d.Field<string>("Case ID") into grps
// three variables here, so that you can do the
// date math that you require!
let minDt = grps.Min(t =>
DateTimeOffset.ParseExact(t.Field<string>("Incept Time"),
"yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture))
let maxDt = grps.Max(t =>
DateTimeOffset.ParseExact(t.Field<string>("Edit Time"),
"yyyy-MM-ddTHH:mm:sszzzz", CultureInfo.InvariantCulture))
let diffInSecs = (maxDt - minDt).TotalSeconds
select new
{
min = minDt,
max = maxDt,
diff = diffInSecs
}
};
Hope that helps!

How to perform operation on grouped records?

These are my records:
Id EmpName Stats
1 Abc 1000
1 Abc 3000
1 Abc 2000
2 Pqr 4000
2 Pqr 5000
2 Pqr 6000
2 Pqr 7000
I am trying to group by the Id field, and after grouping I want output like this:
Expected output:
Id EmpName Stats
1 Abc 3000
2 Pqr 3000
For 1st output record calculation is like this:
3000 - 1000=2000 (i.e subtract highest - lowest from 1st and 2nd records)
3000 - 2000=1000 (i.e subtract highest - lowest from 2nd and 3rd records)
Total=2000 + 1000 =3000
For 2nd output record calculation is like this:
5000 - 4000=1000 (i.e subtract highest - lowest from first two records)
6000 - 5000=1000
7000 - 6000=1000
Total=1000 + 1000 + 1000=3000
This is a sample fiddle I have created: Fiddle
So far I have managed to group the records by Id, but how do I perform this calculation on the grouped records?
You can use the Aggregate method overload that allows you to maintain custom accumulator state.
In your case, we'll be maintaining the following:
decimal Sum; // Current result
decimal Prev; // Previous element Stats (zero for the first element)
int Index; // The index of the current element
The Index is basically needed just to avoid accumulating the first element Stats into the result.
And here is the query:
var result = list.GroupBy(t => t.Id)
.Select(g => new
{
ID = g.Key,
Name = g.First().EmpName,
Stats = g.Aggregate(
new { Sum = 0m, Prev = 0m, Index = 0 },
(a, e) => new
{
Sum = (a.Index < 2 ? 0 : a.Sum) + Math.Abs(e.Stats - a.Prev),
Prev = e.Stats,
Index = a.Index + 1
}, a => a.Sum)
}).ToList();
Edit: As requested in the comments, here is the foreach equivalent of the above Aggregate usage:
static decimal GetStats(IEnumerable<Employee> g)
{
decimal sum = 0;
decimal prev = 0;
int index = 0;
foreach (var e in g)
{
sum = (index < 2 ? 0 : sum) + Math.Abs(e.Stats - prev);
prev = e.Stats;
index++;
}
return sum;
}
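A different technique from the Aggregate version above, shown only as an alternative sketch: pair each element with its successor using Zip and Skip(1), then sum the absolute differences. The class name ConsecutiveDiffs and the tuple-based row shape are assumptions; the original Employee class is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConsecutiveDiffs
{
    // Sums |next - prev| over every pair of neighbouring values.
    public static decimal SumOfSteps(IEnumerable<decimal> values)
    {
        var list = values.ToList();
        // Zip stops at the shorter sequence, so a list of n items yields n - 1 pairs.
        return list.Zip(list.Skip(1), (prev, next) => Math.Abs(next - prev)).Sum();
    }

    // Applies SumOfSteps to each Id group, keeping the rows in their original order.
    public static Dictionary<int, decimal> StepsPerId(
        IEnumerable<(int Id, string EmpName, decimal Stats)> rows) =>
        rows.GroupBy(r => r.Id)
            .ToDictionary(g => g.Key, g => SumOfSteps(g.Select(r => r.Stats)));

    public static void Main()
    {
        var rows = new (int Id, string EmpName, decimal Stats)[]
        {
            (1, "Abc", 1000m), (1, "Abc", 3000m), (1, "Abc", 2000m),
            (2, "Pqr", 4000m), (2, "Pqr", 5000m), (2, "Pqr", 6000m), (2, "Pqr", 7000m),
        };
        foreach (var pair in StepsPerId(rows))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

An empty or single-element group produces no pairs, so its sum is zero without any index bookkeeping.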
Firstly, as mentioned in my comment, this can be done using a single LINQ query, but it would have many complications, one being unreadable code.
Using a simple foreach on the IGrouping list:
Updated (handles dynamic group length):
var list = CreateData();
var groupList = list.GroupBy(t => t.Id);
var finalList = new List<Employee>();
//Iterate on the groups
foreach(var grp in groupList){
var part1 = grp.Count()/2;
var part2 = (int)Math.Ceiling((double)grp.Count()/2);
var firstSet = grp.Select(i=>i.Stats).Take(part2);
var secondSet = grp.Select(i=>i.Stats).Skip(part1).Take(part2);
var total = (firstSet.Max() - firstSet.Min()) + (secondSet.Max() - secondSet.Min());
finalList.Add(new Employee{
Id = grp.Key,
EmpName = grp.FirstOrDefault().EmpName,
Stats = total
});
}
Note:
You can optimize the logic used to get the data for the calculation.
More complicated logic would be needed to divide the group into equal parts when the group size is not fixed.
Updated Fiddle
The LINQ way:
var list = CreateData();
var groupList = list.GroupBy(t => t.Id);
var testLinq = (from l in list
group l by l.Id into grp
let part1 = grp.Count()/2
let part2 = (int)Math.Ceiling((double)grp.Count()/2)
let firstSet = grp.Select(i=>i.Stats).Take(part2)
let secondSet = grp.Select(i=>i.Stats).Skip(part1).Take(part2)
select new Employee{
Id = grp.Key,
EmpName = grp.FirstOrDefault().EmpName,
Stats = (firstSet.Max() - firstSet.Min()) + (secondSet.Max() - secondSet.Min())
}).ToList();

Get n strings from Linq

I am trying to pass the "NewNumber" ints to var q five at a time. Say UniqueNumbers returns 20 records: I would like to get 1-5, 6-10, 11-15, 16-20, and have Number1 = 1, Number2 = 2, Number3 = 3, Number4 = 4, Number5 = 5 passed to var q the first time, followed by Number1 = 6, Number2 = 7, Number3 = 8, Number4 = 9, Number5 = 10, and so on.
var UniqueNumbers =
from t in Numbers
group t by new { t.Id } into g
select new
{
NewNumber = g.Key.Id,
};
UniqueNumbers.Skip(0).Take(5);
var q = new SolrQueryInList("NewNumber1", "NewNumber2","NewNumber3","NewNumber4","NewNumber5");
If you have a list of items, you can easily separate them into groups of five like this:
int count = 0;
var groupsOfFive =
from t in remaining
group t by count++ / 5 into g
select new { Key=g.Key, Numbers = g };
And then:
foreach (var g in groupsOfFive)
{
var parms = g.Numbers.Select(n => n.ToString()).ToArray();
var q = new SolrQueryInList(parms[0], parms[1], parms[2], parms[3], parms[4]);
}
I think what you want is some variation on that.
Edit
Another way to do it, if for some reason you don't want to do the grouping, would be:
var items = remaining.Select(n => n.ToString()).ToArray();
for (int current = 0; current < items.Length; current += 5)
{
var q = new SolrQueryInList(
items[current],
items[current+1],
items[current+2],
items[current+3],
items[current+4]);
}
Both of these assume that the number of items is evenly divisible by 5. If it's not, you have to handle the possibility of not enough parameters.
Try something like this:
for (int i = 0; i < UniqueNumbers.Count() / 5; i++)
{
// Gets the next 5 numbers
var group = UniqueNumbers.Skip(i * 5).Take(5);
// Convert the numbers to strings
var stringNumbers = group.Select(n => n.ToString()).ToList();
// Pass the numbers into the method
var q = new SolrQueryInList(stringNumbers[0], stringNumbers[1], ...
}
You'll have to figure out how to manage boundary conditions, like if UniqueNumbers.Count is not divisible by 5. You might also be able to modify SolrQueryInList to take a list of numbers so that you don't have to index into the list 5 times for that call.
EDIT:
Jim Mischel pointed out that looping over a Skip operation gets expensive fast. Here's a variant that keeps your place, rather than starting at the beginning of the list every time:
var remaining = UniqueNumbers;
while(remaining.Any())
{
// Gets the next 5 numbers
var group = remaining.Take(5);
// Convert the numbers to strings
var stringNumbers = group.Select(n => n.ToString()).ToList();
// Pass the numbers into the method
var q = new SolrQueryInList(stringNumbers[0], stringNumbers[1], ...
// Update the starting spot
remaining = remaining.Skip(5);
}
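On .NET 6 or later, Enumerable.Chunk splits a sequence into fixed-size arrays in a single pass, with a shorter final array when the count is not divisible by the size. The sketch below assumes plain ints as input; the class name Batching and the method InBatches are hypothetical, and the call to SolrQueryInList is left out because its accepted parameters are not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Batching
{
    // Converts the numbers to strings and splits them into arrays of at most `size` items.
    // Requires .NET 6 or later for Enumerable.Chunk.
    public static List<string[]> InBatches(IEnumerable<int> numbers, int size) =>
        numbers.Select(n => n.ToString())
               .Chunk(size)
               .ToList();

    public static void Main()
    {
        foreach (var batch in InBatches(Enumerable.Range(1, 12), 5))
        {
            // Each batch would supply the parameters for one query;
            // the last batch may hold fewer than five items.
            Console.WriteLine(string.Join(",", batch));
        }
    }
}
```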
