How to get hourly averages of my data - c#

I have a table called 'Samples' with the following columns: Id, Type, Quantity, Unit, SampleTime.
I have the following statement that calculates the daily weight average:
weightchart.data = db.Samples
.Where(n => n.Type == "weight")
.GroupBy(n => DbFunctions.TruncateTime(n.SampleTime))
.Select(item => new ChartData()
{
timestamp = item.Key,
average = item.Average(a => a.Quantity),
max = item.Max(a => a.Quantity),
min = item.Min(a => a.Quantity),
count = item.Count()
}).OrderBy(n => n.timestamp).ToList();
charts.Add(weightchart);
Now I would like to get the hourly average. I can't figure out how to do this. I'm new to c#, linq and entities.
Update: I've updated my statement to this:
weightchart.data = db.Samples.Where(n => n.Type == "weight").GroupBy(n => new { Date = n.SampleTime.Date, Hour = n.SampleTime.Hour }).Select(item => new ChartData()
{
timestamp = new DateTime(item.Key.Date.Year, item.Key.Date.Month, item.Key.Date.Day, item.Key.Date.Hour, 0, 0),
average = item.Average(a => a.Quantity),
max = item.Max(a => a.Quantity),
min = item.Min(a => a.Quantity),
count = item.Count()
}).OrderBy(n => n.timestamp).ToList();
charts.Add(weightchart);
However I get this exception:
An exception of type 'System.NotSupportedException' occurred in EntityFramework.SqlServer.dll but was not handled in user code
Additional information: The specified type member 'Date' is not supported in LINQ to Entities. Only initializers, entity members, and entity navigation properties are supported.

see update below
Assuming SampleTime is a System.DateTime, you could change the GroupBy to .GroupBy(n => n.SampleTime.Hour), which lets you compute the average per hour. Be aware that DateTime.Hour returns the hour of day, from 0 to 23, so samples taken in the same hour on different days end up in the same group.
Try something like the following. I've simplified it slightly, but once you've verified that it works, you can reshape it into the full query you are after. (It's hard for me to test, because I'd have to construct sample data first.) An explanation for what is going on (a LINQ to Entities limitation: the provider cannot translate DateTime.Date to SQL) is here, and a helpful hint on how to get around the limitation more easily via AsEnumerable() is here. Note that AsEnumerable() moves the grouping into memory, so every row that matches the Where filter is loaded from the database first.
// AsEnumerable() switches to LINQ to Objects, where .Date is supported
var dataAsList = (from n in db.Samples.Where(n => n.Type == "weight").AsEnumerable()
group n by new { Date = n.SampleTime.Date, Hour = n.SampleTime.Hour } into g
select new
{
// the grouping key is exposed through g.Key
dategrouping = g.Key.Date,
hourgrouping = g.Key.Hour,
average = g.Average(a => a.Quantity),
max = g.Max(a => a.Quantity),
min = g.Min(a => a.Quantity),
count = g.Count()
}).ToList();
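To group by calendar hour rather than by hour of day, the date and the hour can be combined into a single key. The following is a minimal in-memory sketch (LINQ to Objects, not LINQ to Entities), assuming a hypothetical Sample class that mirrors the columns from the question; HourlyStats and ByHour are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the 'Samples' entity from the question.
public class Sample
{
    public string Type { get; set; } = "";
    public double Quantity { get; set; }
    public DateTime SampleTime { get; set; }
}

public static class HourlyStats
{
    // Groups in-memory samples into (date, hour) buckets and averages each bucket.
    public static List<(DateTime Hour, double Average, int Count)> ByHour(
        IEnumerable<Sample> samples, string type)
    {
        return samples
            .Where(s => s.Type == type)
            // Date plus Hour keeps 09:00 on one day apart from 09:00 on the next.
            .GroupBy(s => s.SampleTime.Date.AddHours(s.SampleTime.Hour))
            .Select(g => (Hour: g.Key,
                          Average: g.Average(s => s.Quantity),
                          Count: g.Count()))
            .OrderBy(r => r.Hour)
            .ToList();
    }
}
```

Called as HourlyStats.ByHour(samples, "weight"), it returns one entry per (date, hour) pair that has data; hours without samples produce no entry.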

Related

Group datetime by an interval of minutes

I'm trying to group my list using linq by an interval of 30 minutes.
Let’s say we have this list:
X called at 10:00 AM
Y called at 10:10 AM
Y called at 10:20 AM
Y called at 10:35 AM
X called at 10:40 AM
Y called at 10:45 AM
What I need is to group these items into 30-minute windows, per user, like so:
X called at 10:00 AM
Y called 3 times between 10:10 AM and 10:35 AM
X called at 10:40 AM
Y called at 10:45 AM
Here's what I'm using with LINQ:
myList
.GroupBy(i => i.caller, (k, g) => g
.GroupBy(i => (long)new TimeSpan(Convert.ToDateTime(i.date).Ticks - g.Min(e => Convert.ToDateTime(e.date)).Ticks).TotalMinutes / 30)
.Select(g => new
{
count = g.Count(),
obj = g
}));
I need the result in one list, but instead I'm getting nested lists, which need multiple foreach loops to extract.
Any help is much appreciated!
I think you are looking for SelectMany which will unwind one level of grouping:
var ans = myList
.GroupBy(c => c.caller, (caller, cg) => new { Key = caller, MinDateTime = cg.Min(c => c.date), Calls = cg })
.SelectMany(cg => cg.Calls.GroupBy(c => (int)(c.date - cg.MinDateTime).TotalMinutes / 30))
.OrderBy(cg => cg.Min(c => c.date))
.ToList();
Note: The first GroupBy computes each caller's minimum DateTime once and carries it along, as a minor efficiency improvement, so the minimum isn't recomputed for every call in the group.
Note 2: The (int) conversion creates the buckets - otherwise, .TotalMinutes returns a double and the division by 30 just gives you a (unique) fractional answer and you get no grouping into buckets.
By modifying the initial code (again for minor efficiency), you can reformat the answer to match your textual result:
var ans = myList
.GroupBy(c => c.caller, (caller, cg) => new { Key = caller, MinDateTime = cg.Min(c => c.date), Calls = cg })
.SelectMany(cg => cg.Calls.GroupBy(c => (int)(c.date - cg.MinDateTime).TotalMinutes / 30), (bucket, cg) => new { FirstCall = cg.MinBy(c => c.date), Calls = cg })
.OrderBy(fcc => fcc.FirstCall.date)
.ToList();
var ans2 = ans.Select(fcc => new { Caller = fcc.FirstCall.caller, FirstCallDateTime = fcc.FirstCall.date, LastCallDateTime = fcc.Calls.Max(c => c.date), Count = fcc.Calls.Count() })
.ToList();
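The approach above can be tried end to end without a database or extra libraries. This is a minimal sketch under assumptions: Call is a hypothetical class with a DateTime Date property (the question appears to store the date in a form that needs Convert.ToDateTime), and CallWindows.Summarize is an illustrative name. It uses Min and Max instead of MinBy so that it does not depend on a particular framework version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one list item from the question.
public class Call
{
    public string Caller { get; set; } = "";
    public DateTime Date { get; set; }
}

public static class CallWindows
{
    // Buckets each caller's calls into windows measured from that caller's first call,
    // then flattens everything into a single list ordered by window start.
    public static List<(string Caller, DateTime First, DateTime Last, int Count)> Summarize(
        IEnumerable<Call> calls, int windowMinutes = 30)
    {
        return calls
            .GroupBy(c => c.Caller)
            .SelectMany(perCaller =>
            {
                // Computed once per caller rather than once per call.
                var first = perCaller.Min(c => c.Date);
                return perCaller
                    // The (int) cast truncates the minutes, so the division is an
                    // integer division that yields the bucket index.
                    .GroupBy(c => (int)(c.Date - first).TotalMinutes / windowMinutes)
                    .Select(w => (Caller: perCaller.Key,
                                  First: w.Min(c => c.Date),
                                  Last: w.Max(c => c.Date),
                                  Count: w.Count()));
            })
            .OrderBy(w => w.First)
            .ToList();
    }
}
```

With the six calls from the question as input, this yields four entries, the second of which is Y with three calls between 10:10 and 10:35.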
Instead of grouping by a DateTime, try grouping by a key derived from the date.
string GetTimeBucketId(DateTime time) {
// Minute / 30 is 0 for minutes 0-29 and 1 for minutes 30-59
return $"{time.Year}-{time.Month}-{time.Day}T{time.Hour}-{time.Minute / 30}";
}
myList
.GroupBy(i => GetTimeBucketId(Convert.ToDateTime(i.date)))
.Select(g => new { Count = g.Count(), Key = g.Key });
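A variation on the derived-key idea is to round each timestamp down to the start of its interval and group by that value, which keeps the key a DateTime instead of a string. This is a minimal sketch, assuming fixed clock-aligned windows (10:00 to 10:29, 10:30 to 10:59, and so on) rather than windows that start at each caller's first call; TimeBuckets and Floor are hypothetical names.

```csharp
using System;

public static class TimeBuckets
{
    // Rounds 'time' down to the nearest multiple of 'interval'.
    // 'interval' is assumed to be positive and to divide a day evenly
    // (for example 30 minutes), so buckets line up with the clock.
    public static DateTime Floor(DateTime time, TimeSpan interval)
    {
        long remainder = time.Ticks % interval.Ticks;
        return new DateTime(time.Ticks - remainder, time.Kind);
    }
}
```

If the date member is already a DateTime, a grouping key could then be written as TimeBuckets.Floor(i.date, TimeSpan.FromMinutes(30)).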

Slow EF query grouping data by Month/Year

I have a record set of approximately 1 million records. I'm trying to query the records to report monthly figures.
The following MySQL query executes in about 0.3 seconds
SELECT SUM(total), MONTH(create_datetime), YEAR(create_datetime)
FROM orders GROUP BY MONTH(create_datetime), YEAR(create_datetime)
However, I am unable to figure out an Entity Framework lambda expression that executes anywhere near as fast.
The only statement I have come up with that actually works is
var monthlySales = db.Orders
.Select(c => new
{
Total = c.Total,
CreateDateTime = c.CreateDateTime
})
.GroupBy(c => new { c.CreateDateTime.Year, c.CreateDateTime.Month })
.Select(c => new
{
CreateDateTime = c.FirstOrDefault().CreateDateTime,
Total = c.Sum(d => d.Total)
})
.OrderBy(c => c.CreateDateTime)
.ToList();
But it is horribly slow.
How can I get this query to execute as quickly as it does directly in MySQL?
When you call .ToList() in the middle of a query (before the grouping), EF loads all orders from the database into memory and then does the grouping in C#. Depending on the amount of data in your table, that can take a while, and I think this is why your query is so slow.
Try to rewrite your query so that only one expression enumerates the results (ToList, ToArray, AsEnumerable).
Try this:
var monthlySales = (from c in db.Orders
group c by new { y = c.CreateDateTime.Year, m = c.CreateDateTime.Month } into g
select new {
Total = g.Sum(t => t.Total),
Year = g.Key.y,
Month = g.Key.m }).ToList();
I came across this setup which executes quickly
var monthlySales = db.Orders
.GroupBy(c => new { Year = c.CreateDateTime.Year, Month = c.CreateDateTime.Month })
.Select(c => new
{
Month = c.Key.Month,
Year = c.Key.Year,
Total = c.Sum(d => d.Total)
})
.OrderByDescending(a => a.Year)
.ThenByDescending(a => a.Month)
.ToList();

Get top n rows and sum the rest and call it others in Entity Framework linq lambda query

My data structure:
BrowserName(Name) Count(Y)
MSIE9 7
MSIE10 8
Chrome 10
Safari 11
-- and so on------
What I'm trying to do is get the top 10 and then get the sum of rest and call it 'others'.
I'm trying to get the others as below, but I'm getting an error:
Data.OrderBy(o => o.count).Skip(10)
.Select(r => new downModel { modelname = "Others", count = r.Sum(w => w.count) }).ToList();
The error is at 'r.Sum(w => w.count)' and it says
downModel does not contain a definition of Sum
The downModel just has string 'modelname' and int 'count'.
Any help is sincerely appreciated.
Thanks
It should be possible to get the whole result - the top ten and the accumulated "others" - in a single database query like so:
var downModelList = context.Data
.OrderByDescending(d => d.Count)
.Take(10)
.Select(d => new
{
Name = d.Name,
Count = d.Count
})
.Concat(context.Data
.OrderByDescending(d => d.Count)
.Skip(10)
.Select(d => new
{
Name = "Others",
Count = d.Count
}))
.GroupBy(x => x.Name)
.Select(g => new downModel
{
modelname = g.Key,
count = g.Sum(x => x.Count)
})
.ToList();
If you want to create just one model, then get the sum first and create your object:
var count = Data.OrderBy(o => o.count).Skip(10).Sum(x => x.count);
var model = new downModel { modelname = "Others", count = count };
Btw, OrderBy performs a sort in ascending order. If you want to get (or Skip) top results you need to use OrderByDescending.
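For data that is already in memory, the same "top n plus Others" result can be built in a few steps. This is a minimal sketch, assuming a hypothetical BrowserCount class in place of downModel; TopN.WithOthers is an illustrative name, and the "Others" row is only added when something is left over.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's downModel.
public class BrowserCount
{
    public string Name { get; set; } = "";
    public int Count { get; set; }
}

public static class TopN
{
    // Returns the n largest entries, followed by one "Others" entry
    // that holds the sum of everything else.
    public static List<BrowserCount> WithOthers(IEnumerable<BrowserCount> data, int n)
    {
        // Descending order, so Take(n) really is the top n.
        var ordered = data.OrderByDescending(d => d.Count).ToList();
        var result = ordered.Take(n).ToList();

        if (ordered.Count > n)
        {
            result.Add(new BrowserCount
            {
                Name = "Others",
                Count = ordered.Skip(n).Sum(d => d.Count)
            });
        }
        return result;
    }
}
```

With the four sample rows from the question and n = 2, the result is Safari (11), Chrome (10), and Others (15).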

Averaging with Linq while ignoring 0s cleanly

I have a linq statement that averages the rows in a DataTable and displays them on a chart, grouped by date and time of day.
There is one big problem: many 0 values are returned, because particular times of day simply don't have anything going on. These are skewing my averages something awful.
Different times of day may have 0s in different columns, so I can't just delete each row with a 0 in the columns (cleanly), as I would end up with no rows left in the datatable, or at least I can't think of a clean way to do it in any case.
This is what I have:
var results = from row2 in fiveDayDataTable.AsEnumerable()
group row2 by ((DateTime)row2["TheDate"]).TimeOfDay
into g
select new
{
Time = g.Key,
AvgItem1 = g.Average(x => (int)x["Item1"]),
AvgItem2 = g.Average(x => (int)x["Item2"]),
AvgItem3 = g.Average(x => (int)x["Item3"]),
AvgItem4 = g.Average(x => (int)x["Item4"]),
AvgItem5 = g.Average(x => (int)x["Item5"]),
};
I don't know if this is possible, so I figured I would ask- is there a way to do the average without the 0s?
Thank you!
Sure you can filter out the zeros:
AvgItem1 = g.Select(x => (int)x["Item1"]).Where(x => x != 0).Average(),
AvgItem2 = g.Select(x => (int)x["Item2"]).Where(x => x != 0).Average(),
AvgItem3 = g.Select(x => (int)x["Item3"]).Where(x => x != 0).Average(),
AvgItem4 = g.Select(x => (int)x["Item4"]).Where(x => x != 0).Average(),
AvgItem5 = g.Select(x => (int)x["Item5"]).Where(x => x != 0).Average(),
If your result set (after the Where) might be empty, you might need to call DefaultIfEmpty.
AvgItem1 = g.Select(x => (int)x["Item1"]).Where(x => x != 0).DefaultIfEmpty(0).Average(),
This will return a non-empty result set so your Average will be able to work with it.
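The effect of DefaultIfEmpty is easy to see on plain integer sequences. This is a minimal sketch with a hypothetical helper name (ZeroFreeAverage.AverageNonZero); it reports 0 when nothing but zeros is present, which is a choice made for this example rather than the only sensible one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZeroFreeAverage
{
    // Averages the non-zero values. Average() throws InvalidOperationException
    // on an empty sequence of int, so DefaultIfEmpty(0) supplies a single 0
    // when the filter removes everything.
    public static double AverageNonZero(IEnumerable<int> values)
    {
        return values
            .Where(v => v != 0)
            .DefaultIfEmpty(0)
            .Average();
    }
}
```

For the values 0, 4, 0, 6 this gives 5 instead of the 2.5 that a plain Average() would report.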
Since you have a lot of repetition, you could consider refactoring your average logic into a separate method or anonymous function:
Func<IEnumerable<DataRow>, string, double> avg =
(g, name) => g.Select(x => (int)x[name]).Where(x => x != 0).Average();
var results = from row2 in fiveDayDataTable.AsEnumerable()
group row2 by ((DateTime)row2["TheDate"]).TimeOfDay
into g
select new
{
Time = g.Key,
AvgItem1 = avg(g, "Item1"),
AvgItem2 = avg(g, "Item2"),
AvgItem3 = avg(g, "Item3"),
AvgItem4 = avg(g, "Item4"),
AvgItem5 = avg(g, "Item5"),
};
Add a Where to each query just before the Average in which you ensure that the item is not equal to zero.

LINQ: how to get a group of a table ordering with a related table?

I have a question about the IGrouping object that results from a LINQ query where I use a "group by" clause.
I have two tables in the database, Products and Responses, with a 1-to-* relationship. The Responses table has a column called FinalRate, which is the rating of the product. A product can have n responses (ratings).
I want to get the Products ordered by the sum of FinalRate divided by the number of ratings, that is, ordered by average rating from highest to lowest.
As can be seen in the code (at the end of the question), I get the responses first. To sum all the FinalRate values and divide them by the count, I use a group.
There are two problems with the code, even though the current code works:
1. I tried to get the Products in a single query, but it is impossible because I cannot use the Products table in the group and then use the Responses table in the "orderby". Also, LINQ only lets you group one table; it is impossible to write "group prod, response".
I couldn't express this SQL statement in LINQ:
select prod.ProductID,prod.Commercial_Product_Name,prod.Manufacturer_Name,
prod.ProductImageUrl
from rev_product prod
inner join rev_response res on res.AtProductid=prod.ProductID
group by prod.ProductID,prod.Commercial_Product_Name,prod.Manufacturer_Name
,prod.ProductImageUrl
order by (sum(res.FinalRate)/count(res.AtProductid))
I tried this:
var gruposproductos = (from prod in ctx.Products
join res in ctx.Responses on prod.ProductID equals res.AtProductId
group prod by prod.ProductID into g
orderby (g.Sum(ra =>ra.FinalRate)/g.Count())
descending select g).Take(2);
But as I said, the "orderby (g.Sum..." gives an error, because "into g" groups the Products table, not the Responses table.
So this is why my final code doesn't get the products in the same LINQ statement.
2. Having accepted this, the problem is that I get an IGrouping, but not a list of Responses that I can iterate without the two nested foreach loops in the code. I wanted only one loop, as you would have with a "List" object.
It is not a very elegant method, but it works. Moreover, I have to make sure that in the second loop each product is added only once.
Any better code?
var groupproducts = (from res in ctx.Responses
group res by res.AtProductId into g
orderby (g.Sum(ra =>ra.FinalRate)/g.Count())
descending select g).Take(2).ToList();
List<Product> theproducts = new List<Product>();
foreach (var groupresponse in groupproducts)
{
foreach (var response in groupresponse)
{
var producttemp= (from prod in ctx.Products
where prod.ProductID == response.AtProductId
select prod).First();
theproducts.Add(producttemp);
}
}
}
FINAL SOLUTION (thx a lot @Daniel)
var productsanonymtype = ctx.Products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Count() == 0 ? 0 : x.Responses.Select(r => (double)r.FinalRate).Sum() / x.Responses.Count()
}).OrderByDescending(x => x.AverageRating);
List<Product> products = new List<Product>();
foreach (var prod in productsanonymtype)
{
products.Add(prod.Product);
}
Try this:
products.Select(x => new
{
Product = x,
// inner lambda parameter renamed so it doesn't clash with the outer x
AverageRating = x.Responses.Sum(r => r.FinalRate) /
x.Responses.Count()
});
The Sum overload I am using is not implemented in all providers. If that's a problem for you, you can use this alternate version:
products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Select(r => r.FinalRate)
.Sum() /
x.Responses.Count()
});
If there is no navigation property from product to its responses you should first try to fix that. If you can't you can use this version:
products.Join(responses, x => x.Id, x => x.ProductId,
(p, r) => new { Product = p, Response = r })
.GroupBy(x => x.Product)
.Select(g => new { Product = g.Key,
AverageRating = g.Select(x => x.Response.FinalRate)
.Sum() /
g.Count()
});
Assuming FinalRate is an int, both methods will calculate the average rating with an int, i.e. there will be no 4.5 rating. And there will be no rounding, i.e. an actual average rating of 4.9 will result in 4. You can fix that by casting one of the operands of the division to double.
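The truncation is easy to reproduce in isolation. This is a minimal sketch on plain integer lists; RatingMath and its two methods are hypothetical names used only for this illustration, and both assume a non-empty input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RatingMath
{
    // int / int: the fractional part of the quotient is discarded.
    public static int IntAverage(IReadOnlyCollection<int> rates)
    {
        return rates.Sum() / rates.Count;
    }

    // Casting one operand to double makes the whole division floating-point.
    public static double DoubleAverage(IReadOnlyCollection<int> rates)
    {
        return (double)rates.Sum() / rates.Count;
    }
}
```

For the ratings 4 and 5, IntAverage returns 4 while DoubleAverage returns 4.5.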
Another problem is the case with no ratings so far. The code above will result in an exception in this case. If that's a problem for you, you can change the calculation to this:
AverageRating = g.Count() == 0
? 0
: g.Select(x => (double)x.Response.FinalRate).Sum() / g.Count()
ctx.Products.GroupBy(x => new {
ProductId = x.ProductID,
FinalRate = x.Responses.Sum(y => y.FinalRate),
CountProductId = x.Responses.Count
})
.OrderBy(x => x.Key.FinalRate / x.Key.CountProductId);
And here with the projection.....
ctx.Products.Select(x => new {
ProductID = x.ProductID,
Commercial_Product_Name = x.Commercial_Product_Name,
Manufacturer_Name = x.Manufacturer_Name,
ProductImageUrl = x.ProductImageUrl,
FinalRate = x.Responses.Sum(y => y.FinalRate),
CountProductId = x.Responses.Count
})
.GroupBy(x => new {
ProductId = x.ProductID,
FinalRate = x.FinalRate,
CountProductId = x.CountProductId
})
.OrderBy(x => x.Key.FinalRate / x.Key.CountProductId);
