I've done a bit of research on this, and the best I've found so far is to call AsEnumerable() on the whole dataset, so that the grouping occurs in LINQ to Objects rather than on the DB. I'm using the latest EF.
My working (but very slow) code is:
var trendData =
from d in ExpenseItemsViewableDirect.AsEnumerable()
group d by new {Period = d.Er_Approved_Date.Year.ToString() + "-" + d.Er_Approved_Date.Month.ToString("00") } into g
select new
{
Period = g.Key.Period,
Total = g.Sum(x => x.Item_Amount),
AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
};
This gives me months in format YYYY-MM, along with the total amount and average amount. However it takes several minutes every time.
My other workaround is to do an update query in SQL so I have a YYYYMM field to group natively by. Changing the DB isn't an easy fix however so any suggestions would be appreciated.
The thread I found the above code idea (http://stackoverflow.com/questions/1059737/group-by-weeks-in-linq-to-entities) mentions 'waiting until .NET 4.0'. Is there anything recently introduced that helps in this situation?
The reason for the poor performance is that the whole table is fetched into memory (AsEnumerable()). You can instead group by Year and Month on the server, like this:
var trendData =
(from d in ExpenseItemsViewableDirect
group d by new {
Year = d.Er_Approved_Date.Year,
Month = d.Er_Approved_Date.Month
} into g
select new
{
Year = g.Key.Year,
Month = g.Key.Month,
Total = g.Sum(x => x.Item_Amount),
AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
}
).AsEnumerable()
.Select(g=>new {
Period = g.Year + "-" + g.Month.ToString("00"), // zero-padded month, giving YYYY-MM
Total = g.Total,
AveragePerTrans = g.AveragePerTrans
});
edit
The original query in my answer tried to concatenate an int and a string, which EF cannot translate into SQL. I could use the SqlFunctions class, but the query gets kind of ugly. So I added AsEnumerable() after the grouping, which means EF executes the grouping query on the server and returns the year, month, etc., while the custom projection (everything after AsEnumerable()) runs over objects in memory.
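To illustrate just the client-side half of that split, here is a minimal sketch. It assumes the server has already returned one row per (Year, Month); the `grouped` array and its values are made up for illustration and stand in for what comes back after AsEnumerable().

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the rows returned by the server-side grouping.
var grouped = new[]
{
    new { Year = 2012, Month = 3,  Total = 150.00m, AveragePerTrans = 50.00m },
    new { Year = 2012, Month = 11, Total = 80.50m,  AveragePerTrans = 40.25m },
};

// Client-side projection: string formatting happens in memory, on a handful
// of already-aggregated rows, so EF never has to translate it.
var trend = grouped
    .Select(g => new
    {
        Period = g.Year + "-" + g.Month.ToString("00"), // zero-padded YYYY-MM
        g.Total,
        g.AveragePerTrans
    })
    .ToList();

foreach (var row in trend)
    Console.WriteLine($"{row.Period}: total {row.Total}, average {row.AveragePerTrans}");
```

Only the aggregated rows cross the wire, so the formatting cost is negligible compared with pulling the whole table.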
When it comes to grouping by month, I prefer to do it this way:
var sqlMinDate = (DateTime) SqlDateTime.MinValue;
var trendData = ExpenseItemsViewableDirect
.GroupBy(x => SqlFunctions.DateAdd("month", SqlFunctions.DateDiff("month", sqlMinDate, x.Er_Approved_Date), sqlMinDate))
.Select(g => new
{
Period = g.Key // DateTime type
});
This keeps the DateTime type in the grouping result.
Similarly to what cryss wrote, I am doing the following for EF. Note that we have to use EntityFunctions for the query to work with all DB providers supported by EF; SqlFunctions only works with SQL Server. (In EF 6, EntityFunctions is obsolete and DbFunctions takes its place.)
var sqlMinDate = (DateTime) SqlDateTime.MinValue;
(from x in ExpenseItemsViewableDirect
let month = EntityFunctions.AddMonths(sqlMinDate, EntityFunctions.DiffMonths(sqlMinDate, x.Er_Approved_Date))
group x by month
into g
select new
{
Period = g.Key,
Total = g.Sum(x => x.Item_Amount),
AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
}).Dump();
A taste of generated SQL (from a similar schema):
-- Region Parameters
DECLARE @p__linq__0 DateTime2 = '1753-01-01 00:00:00.0000000'
DECLARE @p__linq__1 DateTime2 = '1753-01-01 00:00:00.0000000'
-- EndRegion
SELECT
1 AS [C1],
[GroupBy1].[K1] AS [C2],
[GroupBy1].[A1] AS [C3]
FROM ( SELECT
[Project1].[C1] AS [K1],
FROM ( SELECT
DATEADD (month, DATEDIFF (month, @p__linq__1, [Extent1].[CreationDate]), @p__linq__0) AS [C1]
FROM [YourTable] AS [Extent1]
) AS [Project1]
GROUP BY [Project1].[C1]
) AS [GroupBy1]
Related
I'm trying to implement the following query in LINQ, but I can't find a solution:
SQL:
SELECT COUNT(*) AS AmountMonths
FROM (SELECT SUBSTRING(CONVERT(NVARCHAR(12), pay_date, 112), 1, 6) AS Month
FROM #tmp
GROUP BY SUBSTRING(CONVERT(NVARCHAR(12), pay_date, 112), 1, 6)) AS AmountMonths
What I need is the number of months in which the clients made payments, bearing in mind that there may be months in which no payments were made.
In C# I tried the following:
int amountMonths = payDetail.GroupBy(x => Convert.ToDateTime(x.PayDate)).Count();
and
int amountMonths = payDetail.GroupBy(x => Convert.ToDateTime(x.PayDate).Month).Count();
But I am not getting the expected result.
(Assuming you're using EF Core)
You're almost there. You could do:
var amountMonths = context.AmountMonths.GroupBy(c => new { c.PayDate.Year, c.PayDate.Month }).Count();
This will translate to something like:
SELECT COUNT(*)
FROM (
SELECT DATEPART(year, [a].[PayDate]) AS [a]
FROM [AmountMonths] AS [a]
GROUP BY DATEPART(year, [a].[PayDate]), DATEPART(month, [a].[PayDate])
) AS [t]
which I'd find preferable to creating a string and chopping it up. EOMONTH isn't a standard mapped function, alas; otherwise it could be used to convert a date to month-level granularity.
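To see why grouping on Month alone gives the wrong count, here is a small in-memory sketch. The sample dates are made up, and it runs in LINQ to Objects rather than against a database, so it only demonstrates the grouping key, not the SQL translation.

```csharp
using System;
using System.Linq;

// Hypothetical sample payments; PayDate mirrors the property used above.
var payDetail = new[]
{
    new { PayDate = new DateTime(2020, 1, 5) },
    new { PayDate = new DateTime(2020, 1, 20) },
    new { PayDate = new DateTime(2020, 3, 2) },
    new { PayDate = new DateTime(2021, 1, 9) },
};

// Month alone merges January 2020 with January 2021.
int byMonthOnly = payDetail.GroupBy(p => p.PayDate.Month).Count();

// Year + Month keeps them apart; anonymous types compare by value,
// so they work as a composite grouping key.
int byYearAndMonth = payDetail
    .GroupBy(p => new { p.PayDate.Year, p.PayDate.Month })
    .Count();

Console.WriteLine($"{byMonthOnly} vs {byYearAndMonth}");
```

With this sample data the month-only key finds 2 groups, while the year-and-month key finds the 3 distinct months in which payments were made.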
I have a Blazor Web Application that has been working in the field for a few months. I want to extend the DB querying to group similar "Detections".
It was originally written for .NET 5, and just today I updated it to .NET 6 to try to get this working.
I would like to know how to get the results ordered by TimeStamp (a DateTime property). I have a working example with an in-memory DB, but production will be in SQL Server. I am not that great in SQL, but I have played around with it for a while in Management Studio with no luck.
Commenting out the OrderByDescending() groups things properly, but the results are not in the correct order. It seems the EF translation process is completely removing that line, it makes no difference in the generated query or the result set.
var results = context.Detections
//Line below makes no change ignored by SQL Server. Works when using in memory DB.
//.OrderByDescending(det => det.TimeStamp)
.GroupBy(det => new
{
Year = det.TimeStamp.Year,
Month = det.TimeStamp.Month,
Day = det.TimeStamp.Day,
Hour = det.TimeStamp.Hour,
})
.Select(grp => new
{
Count = grp.Count(),
Detection = grp.OrderByDescending(det => det.TimeStamp).First(),
})
//The following line will not translate
//.OrderByDescending(det => det.Detection.TimeStamp)
.ToList();
If any of this matters:
Visual Studio 2022 (4.8.04084)
.Net 6.0
SQL Server 2019 (15.0.2080.9)
*All NuGet packages related to EF have been updated to 6.0
Edit for clarification
The above code segment produces the following SQL query.
SELECT [t].[c], [t0].[Id], [t0].[TimeStamp]
FROM (
SELECT COUNT(*) AS [c], DATEPART(year, [d].[TimeStamp]) AS [c0], DATEPART(month, [d].[TimeStamp]) AS [c1], DATEPART(day, [d].[TimeStamp]) AS [c2], DATEPART(hour, [d].[TimeStamp]) AS [c3]
FROM [Detections] AS [d]
WHERE [d].[TimeStamp] > DATEADD(day, CAST(-16.0E0 AS int), GETUTCDATE())
GROUP BY DATEPART(year, [d].[TimeStamp]), DATEPART(month, [d].[TimeStamp]), DATEPART(day, [d].[TimeStamp]), DATEPART(hour, [d].[TimeStamp])
) AS [t]
OUTER APPLY (
SELECT TOP(1) [d0].[Id], [d0].[TimeStamp]
FROM [Detections] AS [d0]
WHERE ([d0].[TimeStamp] > DATEADD(day, CAST(-30.0E0 AS int), GETUTCDATE())) AND (((([t].[c0] = DATEPART(year, [d0].[TimeStamp])) AND ([t].[c1] = DATEPART(month, [d0].[TimeStamp]))) AND ([t].[c2] = DATEPART(day, [d0].[TimeStamp]))) AND ([t].[c3] = DATEPART(hour, [d0].[TimeStamp])))
ORDER BY [d0].[TimeStamp] DESC
) AS [t0]
It produces results similar to the following. Notice not sorted by time.
1 628591 2021-11-02 14:34:06.0442966
10 628601 2021-11-12 05:43:27.7015291
150 628821 2021-11-12 21:59:27.6444236
20 628621 2021-11-12 06:17:13.7798282
50 628671 2021-11-12 15:17:23.8893856
If I add ORDER BY [t0].TimeStamp DESC at the end of that SQL query in Management Studio I get the results I am looking for (see below). I just need to know how to write that in LINQ.
150 628821 2021-11-12 21:59:27.6444236
50 628671 2021-11-12 15:17:23.8893856
20 628621 2021-11-12 06:17:13.7798282
10 628601 2021-11-12 05:43:27.7015291
1 628591 2021-11-02 14:34:06.0442966
Adding .OrderByDescending(det => det.Detection.TimeStamp) at the end before ToList() was my first thought, but that "could not be translated". I will need to do some pagination with these results so I would really like to do the sorting in SQL.
GroupBy has to impose its own ordering, so it is not totally unexpected that an OrderBy placed before it is ignored.
Move the ordering to below the grouping:
var results = context.Detections
//.OrderByDescending(det => det.TimeStamp)
.GroupBy(det => new
{
Year = det.TimeStamp.Year,
Month = det.TimeStamp.Month,
Day = det.TimeStamp.Day,
Hour = det.TimeStamp.Hour,
})
// .OrderByDescending(grp => grp.Key) // may have to split into y/m/d/h again
.OrderByDescending(grp => grp.Key.Year)
.ThenByDescending( grp => grp.Key.Month)
.ThenByDescending( grp => grp.Key.Day)
.ThenByDescending( grp => grp.Key.Hour)
.Select(grp => new
{
Count = grp.Count(),
Detection = grp.OrderByDescending(det => det.TimeStamp).First(),
})
.ToList();
When EF supports it, the Ordering and Grouping might become a little easier with
.GroupBy(det => new
{
Date = det.TimeStamp.Date,
Hour = det.TimeStamp.Hour,
})
For anyone looking at this in the future.
I was able to make this work by declaring and populating a TimeStamp property and using the OrderByDescending() at the end. I am not sure if this is the best solution, but it did solve my problem.
var results = context.Detections
.GroupBy(det => new
{
Year = det.TimeStamp.Year,
Month = det.TimeStamp.Month,
Day = det.TimeStamp.Day,
Hour = det.TimeStamp.Hour,
})
.Select(grp => new
{
Count = grp.Count(),
TimeStamp = grp.OrderByDescending(det => det.TimeStamp).First().TimeStamp,
Detection = grp.OrderByDescending(det => det.TimeStamp).First(),
})
.OrderByDescending(det => det.TimeStamp)
.ToList();
I have a table function which returns table names and number of entries within that table :
CREATE FUNCTION [dbo].[ufnGetLookups] ()
RETURNS
@lookupsWithItemCounts TABLE
(
[Name] VARCHAR(100),
[EntryCount] INT
)
AS
BEGIN
INSERT INTO @lookupsWithItemCounts([Name],[EntryCount])
VALUES
('Table1', (SELECT COUNT(*) FROM Table1)),
('Table2', (SELECT COUNT(*) FROM Table2)),
('Table3', (SELECT COUNT(*) FROM Table3))
RETURN;
END
What would be the LINQ equivalent of the simple function above? Note that I want to get the result in a single round trip, and the speed of the operation is quite important to me. If the converted LINQ to SQL turns out to generate bulky SQL with a performance hit, I would rather stick with my existing user-defined function and forget about the LINQ equivalent.
You can do that with a UNION query. EG
var q = db.Books.GroupBy(g => "Books").Select(g => new { Name = g.Key, EntryCount = g.Count() })
.Union(db.Authors.GroupBy(g => "Authors").Select(g => new { Name = g.Key, EntryCount = g.Count() }));
var r = q.ToList();
Not an EF guy, and not sure if this would be more performant.
Select TableName = o.name
,RowCnt = sum(p.Rows)
From sys.objects as o
Join sys.partitions as p on o.object_id = p.object_id
Where o.type = 'U'
and o.is_ms_shipped = 0x0
and index_id < 2 -- 0:Heap, 1:Clustered
--and o.name in ('Table1','Table2','Table3' ) -- Include (or not) your own filter
Group By o.schema_id,o.name
Note: Wish I could recall the source of this, but I've used it in my discovery process.
Currently I have a SQL query like this:
select tt.userId, count(tt.userId) from (SELECT userId,COUNT(userId) as cou
FROM [dbo].[users]
where createdTime> DATEADD(wk,-1,GETDATE())
group by userId,DATEPART(minute,createdTime)/5) tt group by tt.userId
Now I have the data in a DataTable, and I need to convert the above query to LINQ and execute it against the DataTable. I have been unable to do so; can anybody help me out?
This is what the query does: it groups each user's rows into 5-minute time slots and then counts the number of time slots per user.
Note : I am not able to use Linqer to create the Linq queries because this table does not exist in the database, it's a virtual one created dynamically.
It's a somewhat complex query; here is my best attempt at making it work.
var result = table.AsEnumerable().Where(u=> u.Field<DateTime>("createdTime") > DateTime.Now.AddDays(-7)) //subtract a week
.GroupBy(g=> new { userid = g.Field<string>("userId") , span = g.Field<DateTime>("createdTime").Minute / 5 }) // 5-minute slot, as in the SQL
.Select(g=> new { userid = g.Key.userid, count = g.Count()})
.GroupBy(g=> g.userid ).Select(s=> new {userid = s.Key, count = s.Count()});
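As a variation on the idea, here is a self-contained sketch that buckets on the full timestamp (ticks divided by five minutes) rather than on minute-of-hour only, so 11:01 and 12:01 land in different slots. That is a deliberate difference from the SQL in the question. The table contents and the fixed reference time `now` are made up so the result is reproducible; in real code you would use DateTime.Now. On .NET Framework, AsEnumerable() and Field<T>() need a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical DataTable with the two columns used in the question.
var table = new DataTable();
table.Columns.Add("userId", typeof(string));
table.Columns.Add("createdTime", typeof(DateTime));

// Fixed reference time (an assumption for this sketch).
var now = new DateTime(2024, 1, 10, 12, 0, 0);
table.Rows.Add("alice", new DateTime(2024, 1, 10, 11, 1, 0)); // slot 11:00
table.Rows.Add("alice", new DateTime(2024, 1, 10, 11, 3, 0)); // slot 11:00 again
table.Rows.Add("alice", new DateTime(2024, 1, 10, 11, 7, 0)); // slot 11:05
table.Rows.Add("bob",   new DateTime(2024, 1, 10, 11, 1, 0)); // slot 11:00
table.Rows.Add("bob",   new DateTime(2024, 1, 1, 11, 1, 0));  // older than a week, filtered out

long slotTicks = TimeSpan.FromMinutes(5).Ticks;
DateTime cutoff = now.AddDays(-7);

var slotsPerUser = table.AsEnumerable()
    .Where(r => r.Field<DateTime>("createdTime") > cutoff)
    .Select(r => new
    {
        UserId = r.Field<string>("userId"),
        Slot = r.Field<DateTime>("createdTime").Ticks / slotTicks // 5-minute bucket index
    })
    .Distinct()                       // one entry per (user, slot)
    .GroupBy(x => x.UserId)
    .ToDictionary(g => g.Key, g => g.Count());

foreach (var kv in slotsPerUser)
    Console.WriteLine($"{kv.Key}: {kv.Value} slot(s)");
```

With this sample data, alice ends up with 2 slots and bob with 1.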
This SQL can be rewritten like this
SELECT
COUNT(U.UserId),
U.UserId
FROM USERS U WHERE createdTime> DATEADD(wk,-1,GETDATE())
GROUP BY U.UserId,
DATEPART(MONTH, U.[createdTime]),
DATEPART(DAY, U.[createdTime]),
DATEPART(HOUR, U.[createdTime]),
(DATEPART(MINUTE, U.[createdTime]) / 5)
And its corresponding Linq for DataTable would be
var users = myDataTable.AsEnumerable()
.Select(r=> new {
UserId = r.Field<int>("UserId"),
CreatedTime = r.Field<DateTime>("createdTime")
}).ToList();
var groupedUsersResult = from user in users
where user.CreatedTime > DateTime.Now.AddDays(-7) // only rows from the last week
group user by new { user.CreatedTime.Year, user.CreatedTime.Month, user.CreatedTime.Day, user.CreatedTime.Hour, Minute = user.CreatedTime.Minute / 5, user.UserId }
into groupedUsers
select groupedUsers;
I would suggest using LINQPad. It will help you a lot in writing LINQ queries.
https://www.linqpad.net/
UPDATE
Thanks to @usr I have got this down to ~3 seconds simply by changing
.Select(
log => log.OrderByDescending(
d => d.DateTimeUTC
).FirstOrDefault()
)
to
.Select(
log => log.OrderByDescending(
d => d.Id
).FirstOrDefault()
)
I have a database with two tables - Logs and Collectors - which I am using Entity Framework to read. There are 86 collector records and each one has 50000+ corresponding Log records.
I want to get the most recent log record for each collector which is easily done with this SQL
SELECT CollectorLogModels_1.Status, CollectorLogModels_1.NumericValue,
CollectorLogModels_1.StringValue, CollectorLogModels_1.DateTimeUTC,
CollectorSettingsModels.Target, CollectorSettingsModels.TypeName
FROM
(SELECT CollectorId, MAX(Id) AS Id
FROM CollectorLogModels GROUP BY CollectorId) AS RecentLogs
INNER JOIN CollectorLogModels AS CollectorLogModels_1
ON RecentLogs.Id = CollectorLogModels_1.Id
INNER JOIN CollectorSettingsModels
ON CollectorLogModels_1.CollectorId = CollectorSettingsModels.Id
This takes ~2 seconds to execute.
The closest I have been able to get with LINQ is the following:
var logs = context.Logs.Include(co => co.Collector)
.GroupBy(
log => log.CollectorId, log => log
)
.Select(
log => log.OrderByDescending(
d => d.DateTimeUtc
).FirstOrDefault()
)
.Join(
context.Collectors,
(l => l.CollectorId),
(c => c.Id),
(l, c) => new
{
c.Target,
DateTimeUTC = l.DateTimeUtc,
l.Status,
l.StringValue,
CollectorName = c.TypeName
}
).OrderBy(
o => o.Target
).ThenBy(
o => o.CollectorName
)
;
This produces the results I want but takes ~35 seconds to execute.
This becomes the following SQL
SELECT
[Distinct1].[CollectorId] AS [CollectorId],
[Extent3].[Target] AS [Target],
[Limit1].[DateTimeUtc] AS [DateTimeUtc],
[Limit1].[Status] AS [Status],
[Limit1].[StringValue] AS [StringValue],
[Extent3].[TypeName] AS [TypeName]
FROM (SELECT DISTINCT
[Extent1].[CollectorId] AS [CollectorId]
FROM [dbo].[CollectorLogModels] AS [Extent1] ) AS [Distinct1]
OUTER APPLY (SELECT TOP (1) [Project2].[Status] AS [Status], [Project2].[StringValue] AS [StringValue], [Project2].[DateTimeUtc] AS [DateTimeUtc], [Project2].[CollectorId] AS [CollectorId]
FROM ( SELECT
[Extent2].[Status] AS [Status],
[Extent2].[StringValue] AS [StringValue],
[Extent2].[DateTimeUtc] AS [DateTimeUtc],
[Extent2].[CollectorId] AS [CollectorId]
FROM [dbo].[CollectorLogModels] AS [Extent2]
WHERE [Distinct1].[CollectorId] = [Extent2].[CollectorId]
) AS [Project2]
ORDER BY [Project2].[DateTimeUtc] DESC ) AS [Limit1]
INNER JOIN [dbo].[CollectorSettingsModels] AS [Extent3] ON [Limit1].[CollectorId] = [Extent3].[Id]
ORDER BY [Extent3].[Target] ASC, [Extent3].[TypeName] ASC
How can I get performance closer to what is achievable with SQL alone?
Your original SQL picks the row with MAX(Id) per collector, so the DateTimeUTC it returns can come from a different row than the one with the latest DateTimeUTC. That's probably a bug. The EF query does not have that problem. The two are not semantically identical; the EF version is a harder query.
If you rewrite the EF query to be structurally the same as the SQL query you'll get identical performance. I see nothing here that EF would not support.
Compute the max(id) with EF as well and join on that.
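A rough sketch of that shape, written against in-memory arrays so it runs on its own. The sample rows are made up; against EF you would substitute context.Logs and context.Collectors for the arrays, and whether the generated SQL matches the hand-written query should be checked in a profiler rather than assumed.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for context.Logs and context.Collectors.
var logs = new[]
{
    new { Id = 1, CollectorId = 10, Status = "OK" },
    new { Id = 2, CollectorId = 20, Status = "OK" },
    new { Id = 3, CollectorId = 10, Status = "Failed" },
    new { Id = 4, CollectorId = 20, Status = "Warning" },
};
var collectors = new[]
{
    new { Id = 10, Target = "server-a", TypeName = "Ping" },
    new { Id = 20, Target = "server-b", TypeName = "Disk" },
};

// Step 1: MAX(Id) per collector, mirroring the RecentLogs subquery.
var recentIds = logs
    .GroupBy(l => l.CollectorId)
    .Select(g => g.Max(l => l.Id));

// Step 2: join back to the logs on Id, then to the collectors.
var latest = (from id in recentIds
              join l in logs on id equals l.Id
              join c in collectors on l.CollectorId equals c.Id
              orderby c.Target, c.TypeName
              select new { c.Target, CollectorName = c.TypeName, l.Status })
             .ToList();

foreach (var row in latest)
    Console.WriteLine($"{row.Target} {row.CollectorName} {row.Status}");
```

This avoids the per-group OrderByDescending().FirstOrDefault(), which is what produced the OUTER APPLY in the slow query.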
I had the exact same issue, and I solved it by adding indexes.
A query of mine used to take 45 seconds to complete; after adding them it completes in less than a second.