I have a table of users, grouped into sessions. I would like to select an array for each user that consists of the number of tasks they have in each session:
var taskCounts =
from session in gzClasses.e_userLongSessions
orderby session.workerId ascending, session.startTime ascending
group session by session.workerId
into record
select record.Select(s => s.totalTasks).ToArray();
int[][] result = taskCounts.ToArray();
The above theoretically works, but it results in a separate SQL query for each user, as shown in the image below. Since the database is not local, this takes quite a long time. Is there a way to grab all the data in one query and reduce the overhead of running a bunch of individual queries?
At the same time, I'd like to ensure that it's efficient by only transmitting the totalTasks integer values over the wire, instead of sending the entire database record.
Put another way, I'd like to grab a set of grouped integers from a remote database all in one query, and have them arranged into arrays in a C# program. It sounds pretty simple in principle, but I'm having a hard time getting LINQ to do this.
Depending on how many records you get back, you could return a minimal amount of data and do the grouping part in memory (pretty quickly since it'll already be sorted):
Using method syntax:
gzClasses.e_userLongSessions
.OrderBy(s => s.workerId)
.ThenBy(s => s.startTime)
.Select(s => new { s.workerId, s.totalTasks })
.ToList()
.GroupBy(x => x.workerId)
.Select(g => g.Select(x => x.totalTasks).ToArray())
.ToArray();
Using query syntax:
var query = from session in gzClasses.e_userLongSessions
orderby session.workerId ascending, session.startTime ascending
select new { Id = session.workerId, session.totalTasks };
var taskCounts = from worker in query.ToList()
group worker by worker.Id into g
select g.Select(x => x.totalTasks).ToArray();
var result = taskCounts.ToArray();
I see the same behavior in Linqpad (Linq-to-SQL default), but I feel as though I've seen Linq-to-Entities handle a GroupBy followed by a group.Select(x => x.Something) without resulting in an n+1 query...Could be imagining things though (not sure what the SQL would look like to achieve that).
Wouldn't a Dictionary be more useful than an array of arrays?
gzClasses.e_userLongSessions.OrderBy(s => s.workerId)
.ThenBy(s => s.startTime)
.GroupBy(s => s.workerId, t => t.totalTasks).ToArray()
.ToDictionary(g => g.Key, h => h.ToArray());
That should return a Dictionary, with the workerId as the key, and an array of the number of tasks as the value.
Maybe if you replace "from session in gzClasses.e_userLongSessions" by "from session in gzClasses.e_userLongSessions.AsEnumerable()"?
The position of your ToArray() causes LINQ to be more eager than it should be, resulting in your many queries. I think this will result in just one query:
var taskCounts =
from session in gzClasses.e_userLongSessions
orderby session.workerId ascending, session.startTime ascending
group session by session.workerId
into record
select record.Select(s => s.totalTasks);
int[][] result = taskCounts.Select(x => x.ToArray()).ToArray();
I am using Entity Framework in .NET 7.
I have 3 entities:
Course that contains a ProfessorId among other things
Grade that has a CourseId among other things
Professor
I want to get all the courses that are assigned to a professor and have at least 1 grade associated with them and filter them in a Dictionary<string, CourseViewModel> where string is the semester.
I have written the following LINQ query:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderBy(course => course.Key)
.ToDictionary(group => group.Key, group => group.ToList());
When that executes I get an exception saying it can't be translated.
If I remove the OrderBy and keep only the GroupBy, it works and the translated SQL in Microsoft SQL Server is:
SELECT [c].[Semester], [c].[Title]
FROM [Courses] AS [c]
WHERE [c].[ProfessorId] = @__professorId_0
AND EXISTS (SELECT 1
FROM [Grades] AS [g]
WHERE [c].[Id] = [g].[CourseId])
ORDER BY [c].[Semester]
As you can see, it adds ORDER BY anyway, even though I have removed it and kept only GroupBy(). Can someone explain why that is? If I wanted to order by descending, would that be possible? The weird thing is that if I remove GroupBy(), keep only OrderBy(), and replace the ToDictionary with ToList, it works and the exact same query is produced (only now I can't really use the results without further processing).
LINQ GroupBy :
Groups the elements of a sequence.
SQL GROUP BY :
A SELECT statement clause that divides the query result into groups of rows, usually by performing one or more aggregations on each group. The SELECT statement returns one row per group.
They aren't equivalent. The main difference is that LINQ GroupBy returns a collection of elements per key, whereas SQL GROUP BY returns ONE element (row) per key.
If the projection asks for ONE element per key, then EF Core translates LINQ GroupBy to SQL GROUP BY:
// Get the number of courses per semester
context
.Courses
.GroupBy(c => c.Semester)
.Select(cs => new { Semester = cs.Key, Count = cs.Count() })
.ToList();
Translated to :
SELECT [c].[Semester], COUNT(*) AS [Count]
FROM [Courses] AS [c]
GROUP BY [c].[Semester]
But if the projection asks for several elements per key, then EF Core translates LINQ GroupBy to SQL ORDER BY and performs the grouping itself.
context
.Courses
.Select(c => new { c.Id, c.Semester })
.GroupBy(c => c.Semester)
.ToDictionary(cs => cs.Key, cs => cs.ToList());
Translated to :
SELECT [c].[Semester], [c].[Id]
FROM [Courses] AS [c]
ORDER BY [c].[Semester]
If the result is:
Semester | Id
2023 S1  | 1
2023 S1  | 4
2023 S2  | 2
...      | ...
Then EF Core reads the rows like this:
Read the first row: Semester is "2023 S1".
There is no group yet,
so create a group and add the row to it.
Read the second row: Semester is "2023 S1".
The key is the same as the previous row's,
so add the row to the current group.
Read the third row: Semester is "2023 S2".
The key differs from the previous row's,
so create a new group and add the row to it.
And so on...
This is why the sorting matters: rows with the same key arrive next to each other.
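The row-by-row reading described above can be sketched in plain C#. This is only an illustration of the idea, not EF Core's actual implementation; the helper name GroupSortedRows and the sample rows are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rows as they might arrive from the database, already ordered by Semester.
var rows = new List<(string Semester, int Id)>
{
    ("2023 S1", 1),
    ("2023 S1", 4),
    ("2023 S2", 2),
};

// Hypothetical helper: builds the groups in a single pass by comparing each
// row's key with the key of the current (last) group.
// It is only correct when the rows are sorted by key.
static List<(string Key, List<int> Ids)> GroupSortedRows(
    IEnumerable<(string Semester, int Id)> sortedRows)
{
    var groups = new List<(string Key, List<int> Ids)>();
    foreach (var row in sortedRows)
    {
        // Start a new group when there is none yet or when the key changes.
        if (groups.Count == 0 || groups[^1].Key != row.Semester)
        {
            groups.Add((row.Semester, new List<int>()));
        }
        groups[^1].Ids.Add(row.Id);
    }
    return groups;
}

var result = GroupSortedRows(rows);
foreach (var g in result)
{
    Console.WriteLine($"{g.Key}: {string.Join(", ", g.Ids)}");
}
```

If the rows were not sorted, the same key could show up again later and produce a second group for it, which is why the ORDER BY is needed.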
About the error, I don't know why EF Core can't translate it. The query looks legitimate. Maybe it is simply not implemented at this time.
About what you are trying to do: you convert a sorted grouping into a dictionary. This is odd because a dictionary has no defined order, so the query sorts the elements and then puts them into an unordered container.
If the Dictionary seems sorted, it's a coincidence, not a feature. Internally, the dictionary arranges elements by the key's hash code, which often happens to look like the sorted order... but not every time.
If you want a sorted dictionary, you can use SortedDictionary. But it can be tricky if you need a custom sort rule, like:
context
.Courses
.Select(c => new { c.Id, c.Semester })
.GroupBy(c => c.Semester)
.ToImmutableSortedDictionary(cs => cs.Key, cs => cs.ToList(), new ReverseComparer<string>());
public class ReverseComparer<T> : IComparer<T>
{
private IComparer<T> _comparer = Comparer<T>.Default;
public int Compare(T? x, T? y)
{
return _comparer.Compare(x, y) * -1;
}
}
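As a smaller in-memory variant of the same idea, here is a sketch that fills a SortedDictionary with a descending comparer built through Comparer<string>.Create. The sample data is made up, and ordinal string comparison is an assumption; real semester labels may need a different sort rule.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample rows standing in for the query result.
var courses = new[]
{
    (Id: 1, Semester: "2023 S1"),
    (Id: 4, Semester: "2023 S1"),
    (Id: 2, Semester: "2023 S2"),
};

// Comparer that reverses ordinal string order, so later semesters come first.
var descending = Comparer<string>.Create((x, y) => string.CompareOrdinal(y, x));

// SortedDictionary keeps its keys ordered according to the comparer,
// unlike Dictionary, which makes no ordering promise.
var bySemester = new SortedDictionary<string, List<int>>(descending);
foreach (var group in courses.GroupBy(c => c.Semester))
{
    bySemester[group.Key] = group.Select(c => c.Id).ToList();
}

foreach (var pair in bySemester)
{
    Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
}
```

Enumerating bySemester visits "2023 S2" before "2023 S1", regardless of the order in which the groups were inserted.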
The exception you are encountering is most likely due to the fact that the OrderBy clause cannot be translated into SQL by Entity Framework. The OrderBy clause is executed in memory after the data has been retrieved from the database, which is why it works when you remove it and keep only the GroupBy clause.
However, if you want to order the dictionary by descending, you can simply call the Reverse method on the ToDictionary result:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderByDescending(course => course.Key)
.ToDictionary(group => group.Key,
group => group.ToList())
.Reverse();
This way, the dictionary will be sorted in descending order based on the semester.
Give this a try and let me know how it works for you.
EDIT:
Converting the IEnumerable back to a Dictionary should work like this:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderByDescending(course => course.Key)
.ToDictionary(group => group.Key,
group => group.ToList())
.Reverse()
.ToDictionary(pair => pair.Key,
pair => pair.Value);
I have a problem getting the last row in each group. I am using a LINQ query to retrieve the groups.
Here is my LINQ query.
return View(db.tblMsgs.OrderByDescending(a => a.Id)
.GroupBy(a => new { a.Sender, a.Receiver }).Select(x => x.FirstOrDefault())
.Where(a => a.Receiver == username).ToList());
Using FirstOrDefault() I get the first row in each group.
Using LastOrDefault() I get a run-time exception.
I have been running into this too, for some time now.
After a little research I found that the only approach that works as it should is to reverse the order of the list you want the last item of, and then take the first item.
The reason is that SQL has no SELECT BOTTOM statement, only SELECT TOP. Hence the LastOrDefault query cannot be translated into SQL.
One way of doing this is to use the OrderByDescending method.
return View(db.tblMsgs.OrderByDescending(a => a.Id)
.GroupBy(a => new { a.Sender, a.Receiver }).Select(x => x.OrderByDescending(y => y.SomeAttribute).FirstOrDefault())
.Where(a => a.Receiver == username).ToList());
Edit:
The only thing you should be careful about is the column to order by. It can be the id field if it is an auto-incremented number, or the date the row was added (preferably generated by the server, or it can cause problems).
I have the following simple table with ID, ContactId and Comment.
I want to select records and GroupBy contactId. I used this LINQ extension method statement:
Mains.GroupBy(l => l.ContactID)
.Select(g => g.FirstOrDefault())
.ToList()
It returns record 1 and 4. How can I use LINQ to get the ContactID with the highest ID? (i.e. return 3 and 6)
You can order your items:
Mains.GroupBy(l => l.ContactID)
.Select(g=>g.OrderByDescending(c=>c.ID).FirstOrDefault())
.ToList()
Use OrderByDescending on the items in the group:
Mains.GroupBy(l => l.ContactID)
.Select(g => g.OrderByDescending(l => l.ID).First())
.ToList();
Also, there is no need for FirstOrDefault when selecting an item from the group; a group will always have at least one item so you can use First() safely.
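To see the effect on the data from the question, here is a small LINQ to Objects sketch. The ContactID values 100 and 200 and the comments are invented; only the shape (IDs 1 to 6, three rows per contact) follows the question.

```csharp
using System;
using System.Linq;

// Invented sample rows: IDs 1-3 belong to one contact, 4-6 to another.
var mains = new[]
{
    (ID: 1, ContactID: 100, Comment: "first"),
    (ID: 2, ContactID: 100, Comment: "second"),
    (ID: 3, ContactID: 100, Comment: "third"),
    (ID: 4, ContactID: 200, Comment: "fourth"),
    (ID: 5, ContactID: 200, Comment: "fifth"),
    (ID: 6, ContactID: 200, Comment: "sixth"),
};

// Within each group, sort by ID descending and take the first row,
// which is the row with the highest ID for that contact.
var latest = mains
    .GroupBy(m => m.ContactID)
    .Select(g => g.OrderByDescending(m => m.ID).First())
    .ToList();

foreach (var row in latest)
{
    Console.WriteLine($"ContactID {row.ContactID}: ID {row.ID}");
}
```

This yields the rows with ID 3 and ID 6, where the unsorted FirstOrDefault() version returned 1 and 4.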
Perhaps selecting with Max instead of OrderByDescending could improve performance (I'm not sure how it's implemented internally, so it needs to be tested):
var ids = Mains.GroupBy(l => l.ContactID)
.Select(g => g.Max(x => x.ID));
var result = Mains.Where(m => ids.Contains(m.ID));
I assume it could result in a query that takes the MAX per group and then does SELECT * FROM ... WHERE id IN ({max ids here}), which could be significantly faster than OrderByDescending.
Feel free to correct me if I'm not right.
OrderByDescending
Mains.GroupBy(l => l.ContactID)
.Select(g=>g.OrderByDescending(c=>c.ID).FirstOrDefault())
.ToList()
is your best solution
It orders by the highest ID descending, which is pretty obvious from the name.
You could use MoreLinq like this for a shorter solution:
Mains.OrderByDescending(i => i.ID).DistinctBy(l => l.ContactID).ToList();
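On .NET 6 or later, Enumerable.DistinctBy is built in, so for in-memory sequences the same idea works without MoreLinq. The sketch below uses invented sample rows and relies on DistinctBy keeping the first element it sees for each key, which is how the LINQ to Objects implementation behaves. Whether a particular database provider can translate this call is a separate question that the sketch does not address.

```csharp
using System;
using System.Linq;

// Invented sample rows, two contacts with three rows each.
var mains = new[]
{
    (ID: 1, ContactID: 100),
    (ID: 2, ContactID: 100),
    (ID: 3, ContactID: 100),
    (ID: 4, ContactID: 200),
    (ID: 5, ContactID: 200),
    (ID: 6, ContactID: 200),
};

// Sort so that the highest ID comes first, then keep one row per ContactID.
var latest = mains
    .OrderByDescending(m => m.ID)
    .DistinctBy(m => m.ContactID)
    .ToList();

foreach (var row in latest)
{
    Console.WriteLine($"ContactID {row.ContactID}: ID {row.ID}");
}
```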
I have 2 LINQ queries here, and I just want to know which of them is the proper and faster one to use.
Sample I
var GetUSer = (from UserItem in dbs.users
where UserItem.UserID == UserID
select new User(UserItem))
.OrderBy(item => item.FirstName)
.Skip(0)
.Take(10)
.ToList();
Sample II
var GetUSer = (from UserItem in dbs.user
.Where(item => item.UserID == UserID)
.OrderBy(item => item.FirstName)
.Skip(0)
.Take(10)
.AsEnumerable()
select new User(UserItem)).ToList();
Although they both work well, I just want to know which is best.
The second one is better. The first one does the select and then the filtering, meaning it has to get the data from the database first to turn it into a User object, and only then filters.
The second one will do the query on the DB side, then turn the results into User objects.
The first one can be fixed by moving the select to just before the ToList().
Between those two, I would prefer the first (for readability, you'd need to switch some things around if you want the whole query to execute in the database). If they both work, it's up to you though.
Personally, I don't like mixing query syntax with lambda syntax if I don't have to, and I prefer lambda. I would write it something like:
var GetUsers = db.user
.Where(u => u.UserID == UserID)
.OrderBy(u => u.FirstName)
.Take(10)
.Select(u => new User(u))
.ToList();
This uses a single syntax, queries as much as possible in the database, and leaves out any superfluous calls.
Say I have a static list of Ids in a particular order:
List<int> ordered = new List<int> {7,2,3,4,5};
And I would like to select the items out of the database maintaining that order.
The trivial:
var posts = ( from p in Current.DB.Posts
where ordered.Contains(p.Id)
select p).ToList();
Comes back fast, but out of order.
How do I select these posts out of the db and maintain the order in an elegant and efficient way?
If you don't explicitly include an order-by clause, you only really have a set - any ordering is purely convenience and will usually happen to be on the clustered index - but IIRC this is not guaranteed (and I imagine things like the server choosing to use parallelism would throw this out)
Include an order-by; either at the DB, or at the client. Alternatively, throw the results into a dictionary:
var dict = Current.DB.Posts.Where(p => ordered.Contains(p.Id))
.ToDictionary(p => p.Id);
then you can pull out the one you need at will, ignoring order.
Here's a combination of Marc's answer and your answer:
var dict = Current.DB.Posts.Where(p => ordered.Contains(p.Id))
.ToDictionary(p => p.Id);
return ordered.Select(id => dict[id]).ToList();
Since it omits the OrderBy step, I suspect that it will be a bit more efficient. It's certainly a bit prettier.
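One caveat with the dict[id] lookup: the Dictionary indexer throws a KeyNotFoundException when an Id in the list has no matching row. A minimal in-memory sketch that skips missing Ids with TryGetValue could look like this. The fetched array is an invented stand-in for the rows returned by the database, and silently skipping missing Ids is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var ordered = new List<int> { 7, 2, 3, 4, 5 };

// Invented stand-in for the database result: arbitrary order, and Id 4 is missing.
var fetched = new[]
{
    (Id: 2, Title: "two"),
    (Id: 5, Title: "five"),
    (Id: 7, Title: "seven"),
    (Id: 3, Title: "three"),
};

// Index the fetched rows by Id, then walk the Ids in the requested order.
var byId = fetched.ToDictionary(p => p.Id);
var posts = new List<(int Id, string Title)>();
foreach (var id in ordered)
{
    // TryGetValue avoids the exception the indexer would throw for Id 4.
    if (byId.TryGetValue(id, out var post))
    {
        posts.Add(post);
    }
}

Console.WriteLine(string.Join(", ", posts.Select(p => p.Id)));
```

The resulting Ids come out as 7, 2, 3, 5, following the order of the list rather than the order in which the rows were fetched.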
We ended up going with:
var reverseIndex = ordered.Select((id, index) => new { Id = id, Index = index }).ToDictionary(pair => pair.Id, s => s.Index);
model.Posts = Current.DB.Posts
.Where(p => postIds.Contains(p.Id))
.AsEnumerable()
.OrderBy(p => reverseIndex[p.Id] )
.ToList();
Ugly, yet reasonably efficient for large lists.
You could project the List<int> onto your list of posts.
var posts = ( from p in Current.DB.Posts
where ordered.Contains(p.Id)
select p).ToList();
return ordered.Select(o => posts.Single(post => post.Id == o)).ToList();
You could also do this at the database retrieval level but you'd be doing multiple select statements
ordered.Select(o => Current.DB.Posts.Single(post => post.Id == o)).ToList();