Why is GroupBy translated into ORDER BY by Entity Framework? - c#

I am using Entity Framework in .NET 7.
I have 3 entities:
Course that contains a ProfessorId among other things
Grade that has a CourseId among other things
Professor
I want to get all the courses that are assigned to a professor and have at least 1 grade associated with them, and group them into a Dictionary<string, List<CourseViewModel>> where the string key is the semester.
I have written the following LINQ query:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderBy(course => course.Key)
.ToDictionary(group => group.Key, group => group.ToList());
When that executes I get an exception saying it can't be translated.
If I remove the OrderBy and keep only the GroupBy, it works and the translated SQL in Microsoft SQL Server is:
SELECT [c].[Semester], [c].[Title]
FROM [Courses] AS [c]
WHERE [c].[ProfessorId] = @__professorId_0
AND EXISTS (SELECT 1
FROM [Grades] AS [g]
WHERE [c].[Id] = [g].[CourseId])
ORDER BY [c].[Semester]
As you can see, it adds ORDER BY anyway, even though I removed OrderBy() and kept only GroupBy(). Can someone explain why that is? What if I wanted to order descending; would that be possible? The odd thing is that if I remove GroupBy(), keep only OrderBy(), and replace ToDictionary with ToList, it works and the exact same query is produced (only now I can't really use the results without further processing).

LINQ GroupBy :
Groups the elements of a sequence.
SQL GROUP BY :
A SELECT statement clause that divides the query result into groups of rows, usually by performing one or more aggregations on each group. The SELECT statement returns one row per group.
They aren't equivalent. The main difference is that LINQ GroupBy returns a collection of elements per key, whereas SQL GROUP BY returns ONE row per key.
If the projection asks for ONE element per key, EF Core translates LINQ GroupBy to SQL GROUP BY:
// Get the number of courses per semester
context
.Courses
.GroupBy(c => c.Semester)
.Select(cs => new { Semester = cs.Key, Count = cs.Count() })
.ToList();
Translated to :
SELECT [c].[Semester], COUNT(*) AS [Count]
FROM [Courses] AS [c]
GROUP BY [c].[Semester]
But if the projection asks for several elements per key, EF Core translates LINQ GroupBy to SQL ORDER BY and does the grouping itself on the client:
context
.Courses
.Select(c => new { c.Id, c.Semester })
.GroupBy(c => c.Semester)
.ToDictionary(cs => cs.Key, cs => cs.ToList());
Translated to :
SELECT [c].[Semester], [c].[Id]
FROM [Courses] AS [c]
ORDER BY [c].[Semester]
If the result is:
Semester | Id
2023 S1  | 1
2023 S1  | 4
2023 S2  | 2
...      | ...
Then EF Core reads the rows like this:
Read the first row: Semester is "2023 S1".
There is no group yet,
so create a group and add the row to it.
Read the second row: Semester is "2023 S1".
The key is the same as for the previous row,
so add the row to the current group.
Read the third row: Semester is "2023 S2".
The key differs from the previous row's key,
so create a new group and add the row to it.
And so on...
This is why the rows are sorted: rows with the same key arrive next to each other, so the groups can be built in a single pass.
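The row-by-row reading described above can be sketched in plain C#. This is only an illustration of the idea, not EF Core's actual implementation; the helper name GroupSorted and the sample rows are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rows as they might arrive from the database, already sorted by Semester.
var rows = new List<(string Semester, int Id)>
{
    ("2023 S1", 1),
    ("2023 S1", 4),
    ("2023 S2", 2),
};

// Hypothetical helper: groups adjacent rows in one pass.
// It is only correct because the input is sorted by the grouping key.
static List<(string Key, List<int> Ids)> GroupSorted(IEnumerable<(string Semester, int Id)> sorted)
{
    var groups = new List<(string Key, List<int> Ids)>();
    foreach (var row in sorted)
    {
        // The very first row, or a row whose key differs from the previous one, starts a new group.
        if (groups.Count == 0 || groups[^1].Key != row.Semester)
        {
            groups.Add((row.Semester, new List<int>()));
        }
        // Add the row to the current (last) group.
        groups[^1].Ids.Add(row.Id);
    }
    return groups;
}

var grouped = GroupSorted(rows);
foreach (var g in grouped)
{
    Console.WriteLine($"{g.Key}: {string.Join(", ", g.Ids)}");
}
```

If the rows were not sorted, the same key could show up again later and this single pass would wrongly start a second group for it.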
As for the error, I don't know why EF Core can't translate it. The query looks legitimate; perhaps this case simply isn't implemented yet.
As for what you are trying to do, converting a sorted sequence of groupings into a dictionary is odd, because a Dictionary has no defined order. You sort the elements and then put them into an unordered container.
If a Dictionary appears sorted, that is a coincidence, not a feature. Internally, the dictionary places entries according to the key's hash code, and the enumeration order is unspecified. It often looks ordered, but not every time.
If you want a sorted dictionary, you can use SortedDictionary. But it can be tricky if you need a custom sort rule, like:
context
.Courses
.Select(c => new { c.Id, c.Semester })
.GroupBy(c => c.Semester)
.ToImmutableSortedDictionary(cs => cs.Key, cs => cs.ToList(), new ReverseComparer<string>());
public class ReverseComparer<T> : IComparer<T>
{
private IComparer<T> _comparer = Comparer<T>.Default;
public int Compare(T? x, T? y)
{
return _comparer.Compare(x, y) * -1;
}
}
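As a rough sketch of the same idea without the immutable collection, the grouping and the descending key order can both be handled in memory once the rows have been fetched. The sample data below is hypothetical and stands in for the result of the database query; with EF Core this would be whatever you get back after materializing the flat projection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for rows already fetched from the database (hypothetical sample data).
var courses = new List<(string Title, string Semester)>
{
    ("Algebra", "2023 S1"),
    ("Databases", "2023 S2"),
    ("Physics", "2023 S1"),
};

// Comparer that inverts ordinal string comparison, so keys sort descending.
var descending = Comparer<string>.Create((x, y) => string.CompareOrdinal(y, x));

// Group in memory, then copy the groups into a dictionary that keeps its keys sorted.
var bySemester = new SortedDictionary<string, List<string>>(descending);
foreach (var group in courses.GroupBy(c => c.Semester))
{
    bySemester[group.Key] = group.Select(c => c.Title).ToList();
}

foreach (var pair in bySemester)
{
    Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
}
```

Unlike Dictionary, SortedDictionary enumerates its entries in the order defined by its comparer, so the descending order survives after the query.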

The exception you are encountering is most likely due to the fact that the OrderBy clause cannot be translated into SQL by Entity Framework. The OrderBy clause is executed in memory after the data has been retrieved from the database, which is why it works when you remove it and keep only the GroupBy clause.
However, if you want to order the dictionary by descending, you can simply call the Reverse method on the ToDictionary result:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderByDescending(course => course.Key)
.ToDictionary(group => group.Key,
group => group.ToList())
.Reverse();
This way, the dictionary will be sorted in descending order based on the semester.
Give this a try and let me know how it works for you.
EDIT:
Converting the IEnumerable back to a Dictionary should work like this:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderByDescending(course => course.Key)
.ToDictionary(group => group.Key,
group => group.ToList())
.Reverse()
.ToDictionary(pair => pair.Key,
pair => pair.Value);

Related

orderby not working before groupby in entity framework core [duplicate]

I want to order users, then group them by GroupCode and get the first item of each group.
I have to use Take because the number of users is large.
I use this code. It works fine, but the OrderBy has no effect.
public class User
{
public int Id { get; set; }
public int GroupCode { get; set; }
public DateTime CreatedDateTime { get; set; }
}
var query = await _context.Users
.OrderByDescending(s => s.CreatedDateTime)
.GroupBy(s => s.GroupCode)
.Select(g => g.First())
.Take(10)
.ToListAsync();
EF Core translates the query to SQL and in SQL ORDER BY cannot precede GROUP BY.
Example:
SELECT *
FROM t
ORDER BY a
GROUP BY b
Oracle: SQL Error: ORA-00933: SQL command not properly ended (Line 4, Column 1)
SQL Server: Incorrect syntax near the keyword 'group'.
You can do this in LINQ to Objects, though. The documentation says:
The IGrouping<TKey,TElement> objects are yielded in an order based on the order of the elements in source that produced the first key of each IGrouping<TKey,TElement>. Elements in a grouping are yielded in the order that the elements that produced them appear in source.
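A small in-memory sketch of that documented behavior, using made-up users rather than a database, might look like this. It only applies to LINQ to Objects; it says nothing about how EF Core translates the same operators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory users: (Id, GroupCode, CreatedDateTime).
var users = new List<(int Id, int GroupCode, DateTime CreatedDateTime)>
{
    (1, 10, new DateTime(2023, 1, 1)),
    (2, 20, new DateTime(2023, 3, 1)),
    (3, 10, new DateTime(2023, 2, 1)),
};

// In LINQ to Objects, GroupBy keeps the source order inside each group,
// so after sorting newest-first, First() yields the newest user per GroupCode.
var newestPerGroup = users
    .OrderByDescending(u => u.CreatedDateTime)
    .GroupBy(u => u.GroupCode)
    .Select(g => g.First())
    .ToList();

foreach (var u in newestPerGroup)
{
    Console.WriteLine($"Group {u.GroupCode}: user {u.Id}");
}
```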
As @Ivan Stove points out: move OrderBy before First():
List<User> query = await _context.Users
.GroupBy(s => s.GroupCode)
.OrderBy(g => g.Key)
.Take(10)
.Select(g => g.OrderByDescending(s => s.CreatedDateTime).First())
.ToListAsync();
I also think that you need to order twice: once to get the groups in the right order, so that Take(10) returns the first 10 GroupCodes, and once more to get the User with the latest CreatedDateTime in each group. Unless you need something else, in which case please explain exactly what you need.
Your query has no analogue in the SQL world; ORDER BY followed by grouping is prohibited. While it works with LINQ to Objects, with EF Core you have to follow its rules.
Rewrite this query in the following way:
var query = await _context.Users
.GroupBy(s => s.GroupCode)
.Select(g => g.OrderByDescending(s => s.CreatedDateTime).First())
.Take(10)
.ToListAsync();

Linq OrderBy then GroupBy - group by unexpectedly changes order of list

When I run this expression, I can see that the list is correctly ordered by the highest ActionIds:
var list = db.Actions
.Where(z => z.RunId == RunId)
.OrderByDescending(w => w.ActionId)
.ToList();
I only want to select the highest ActionId for each ActionName, so I now do:
var list = db.Actions
.Where(z => z.RunId == RunId)
.OrderByDescending(w => w.ActionId)
.GroupBy(c => new
{
c.ActionName,
c.MachineNumber,
})
.Select(y => y.FirstOrDefault())
.ToList();
When I look at the contents of list, it hasn't selected the ActionName/MachineNumber with the highest ActionId, which I assumed would be the case by ordering and then using FirstOrDefault().
Any idea where I'm going wrong? I want to group the records by ActionName and MachineNumber, and then pick the record with the highest ActionId for each group.
Instead of grouping an ordered collection, group the collection first, and then select the record with the highest ID for each of the groups. GroupBy is not guaranteed to preserve the order in each group in LINQ to SQL - it depends on your database server.
var list = db.Actions.Where(z => z.RunId == RunId).GroupBy(c => new
{
c.ActionName,
c.MachineNumber,
})
.Select(y => y.OrderByDescending(z => z.ActionId).FirstOrDefault()).ToList();

c# LINQ AsEnumerable Groupby Where

I have a datatable that I am returning to the UI layer.
I have multiple records with the same FirstId value. A few may have a value in FieldOne. I only want to group the records where FieldOne is null.
I tried the following LINQ statement with .Where and .GroupBy, but the .Where removes all the records with values in FieldOne before the GroupBy runs. In the UI grid, the records with FieldOne values are missing. I want to group only the records with empty FieldOne values and still keep the records with FieldOne values. Thanks.
MyData.AsEnumerable()
.Where(f => f.Field<string>("FieldOne") == null)
.GroupBy(r => new { pp1 = r.Field<int>("FirstId") })
.Select(g => g.First())
.CopyToDataTable();
You could make an artificial grouping key:
.GroupBy(
r => new { pp1 = r.Field<string>("FieldOne") == null ? -1 : r.Field<int>("FirstId") })
Here, I used -1 as a hack to create a separate group. Make sure this int value is not in use. You could also solve this precisely but hopefully this is OK.
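One way to do this more precisely, sketched here with plain tuples instead of DataRow objects, is to give every row that has a FieldOne value its own unique key (its position in the sequence), while rows with a null FieldOne share one key per FirstId. The sample rows and the use of the row index are assumptions made for this illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows: (FirstId, FieldOne). FieldOne may be null.
var rows = new List<(int FirstId, string? FieldOne)>
{
    (1, null),
    (1, null),
    (1, "a"),
    (2, null),
    (1, "b"),
};

var result = rows
    // Pair each row with its position so non-null rows can get a unique key.
    .Select((row, index) => (row, index))
    // Null rows share the key (FirstId, -1); other rows use (FirstId, index), which is unique.
    .GroupBy(x => x.row.FieldOne == null ? (x.row.FirstId, -1) : (x.row.FirstId, x.index))
    .Select(g => g.First().row)
    .ToList();

foreach (var r in result)
{
    Console.WriteLine($"{r.FirstId}: {r.FieldOne ?? "(null)"}");
}
```

Because the index is never negative, the -1 marker cannot collide with a real position, so no FirstId value has to be reserved.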

Casting Nhibernate result into IDictionary<string,int>

I am trying to convert the result of the query into IDictionary
Here string will contain orderId and the int will contain the TradedQuantity
The query below should join three objects Order, OrderRevision and OrderEvent.
1 Order can have many orderRevisions
1 OrderRevision can have many orderEvents
What the query is trying to do is to inner join three objects and get all order objects whose order id matches the list of orderids supplied to it. Then it does a group by based on orderId and gets the latest TradedQuantity from orderEvents object. LatestTradedQuantity will be the TradedQuantityFrom latest OrderEvent. For now the latest orderevent can be regarded as the one that has highest OrderEventId value.
OrderRevision revisionAlias = null;
Order orderAlias = null;
var query =
Session.QueryOver<OrderEvent>()
.JoinAlias(oe => oe.OrderRevision,() => revisionAlias)
.JoinAlias(oe => oe.OrderRevision.Order,() => orderAlias)
.Where(x => x.OrderRevision.Order.SourceSystem.Name.ToLower() == sourceSystem.ToLower())
.WhereRestrictionOn(x => x.OrderRevision.Order.Id).IsIn(orderIds.ToList())
.SelectList(list => list.SelectGroup(x => x.OrderRevision.Order.SourceOrderIdentifier)
.SelectMax(x => x.Id).Select(x => x.TradedQuantity))
.Select(x => new KeyValuePair<string, int?>(x.OrderRevision.Order.SourceOrderIdentifier, x.TradedQuantity)
);
This query does not do what it is supposed to. Could you please help and let me know how the result can be cast into an IDictionary?
You have tagged your question with linq-to-nhibernate, so I guess using it instead of queryover would suit you. With Linq, use a sub-query for selecting the "max" order events ids for each order, then query them and project them to a dictionary.
using System.Linq;
using NHibernate.Linq;
...
var orderEventsIdsQuery = Session.Query<OrderEvent>()
.Where(oe => orderIds.Contains(oe.OrderRevision.Order.Id))
.GroupBy(oe => oe.OrderRevision.Order.SourceOrderIdentifier,
(soi, oes) => oes.Max(oe => oe.Id));
var result = Session.Query<OrderEvent>()
.Where(oe => orderEventsIdsQuery.Contains(oe.Id))
.ToDictionary(oe => oe.OrderRevision.Order.SourceOrderIdentifier,
oe => oe.TradedQuantity);
This should do the job. I do not use QueryOver and I will not try to give an answer for doing it with QueryOver.

LINQ to SQL: select array of arrays of integers in one query

I have a table of users, grouped into sessions. I would like to select an array for each user that consists of the number of tasks they have in each session:
var taskCounts =
from session in gzClasses.e_userLongSessions
orderby session.workerId ascending, session.startTime ascending
group session by session.workerId
into record
select record.Select(s => s.totalTasks).ToArray();
int[][] result = taskCounts.ToArray();
The above theoretically works, but it results in a separate SQL query for each user. Since the database is not local, this takes quite a long time. Is there a way to grab all the data in one query and reduce the overhead of running a bunch of individual queries?
At the same time, I'd like to ensure that it's efficient by only transmitting the totalTasks integer values over the wire, instead of sending the entire database record.
Put another way, I'd like to grab a set of grouped integers from a remote database all in one query, and have them arranged into arrays in a C# program. It sounds pretty simple in principle, but I'm having a hard time getting LINQ to do this.
Depending on how many records you get back, you could return a minimal amount of data and do the grouping part in memory (pretty quickly since it'll already be sorted):
Using method syntax:
gzClasses.e_userLongSessions
.OrderBy(s => s.workerId)
.ThenBy(s => s.startTime)
.Select(s => new { s.workerId, s.totalTasks })
.ToList()
.GroupBy(x => x.workerId)
.Select(g => g.Select(x => x.totalTasks).ToArray())
.ToArray();
Using query syntax:
var query = from session in gzClasses.e_userLongSessions
orderby session.workerId ascending, session.startTime ascending
select new { Id = session.workerId, session.totalTasks };
var taskCounts = from worker in query.ToList()
group worker by worker.Id into g
select g.Select(x => x.totalTasks).ToArray();
var result = taskCounts.ToArray();
I see the same behavior in Linqpad (Linq-to-SQL default), but I feel as though I've seen Linq-to-Entities handle a GroupBy followed by a group.Select(x => x.Something) without resulting in an n+1 query...Could be imagining things though (not sure what the SQL would look like to achieve that).
Wouldn't a Dictionary be more useful than an array of arrays?
gzClasses.e_userLongSessions.OrderBy(s => s.workerId)
.ThenBy(s => s.startTime)
.GroupBy(s => s.workerId, t => t.totalTasks).ToArray()
.ToDictionary(g => g.Key, h => h.ToArray());
That should return a Dictionary, with the workerId as the key, and an array of the number of tasks as the value.
Maybe if you replace "from session in gzClasses.e_userLongSessions" by "from session in gzClasses.e_userLongSessions.AsEnumerable()"?
The position of your ToArray() causes LINQ to be more eager than it should be, resulting in your many queries. I think this will result in just one query:
var taskCounts =
from session in gzClasses.e_userLongSessions
orderby session.workerId ascending, session.startTime ascending
group session by session.workerId
into record
select record.Select(s => s.totalTasks);
int[][] result = taskCounts.Select(x => x.ToArray()).ToArray();
