How to use Max and Group By in LINQ - c#

I am trying to use this line:
return ObjectContext.TH_Proposal.GroupBy.where(/* condition*/).(y => y.ProposalID).Max(y =>y.ProposalDate);
I want a result like this, using LINQ:
select *
from TH_Proposal
where ProposalDate in (select max(ProposalDate)from TH_Proposal group by ProposalID)

You can do it like this:
ObjectContext.TH_Proposal.GroupBy(
p => p.ProposalID,
(id, g) => g.OrderByDescending(p => p.ProposalDate).First());
This groups the entries by ProposalID and selects the one with the highest ProposalDate.
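As a minimal in-memory sketch of the same idea, the following runs the query with LINQ to Objects instead of an ObjectContext. The Proposal class, the LatestPerId helper, and the sample dates are assumptions made for illustration; only the GroupBy call mirrors the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the TH_Proposal entity.
public class Proposal
{
    public int ProposalID { get; set; }
    public DateTime ProposalDate { get; set; }
}

public static class LatestProposalDemo
{
    // Returns one proposal per ProposalID: the one with the latest ProposalDate.
    public static List<Proposal> LatestPerId(IEnumerable<Proposal> proposals)
    {
        return proposals
            .GroupBy(
                p => p.ProposalID,
                (id, g) => g.OrderByDescending(p => p.ProposalDate).First())
            .ToList();
    }

    public static void Main()
    {
        // Sample data invented for the example.
        var data = new List<Proposal>
        {
            new Proposal { ProposalID = 1, ProposalDate = new DateTime(2020, 1, 1) },
            new Proposal { ProposalID = 1, ProposalDate = new DateTime(2020, 3, 1) },
            new Proposal { ProposalID = 2, ProposalDate = new DateTime(2020, 2, 1) },
        };

        foreach (var p in LatestPerId(data))
        {
            Console.WriteLine($"{p.ProposalID}: {p.ProposalDate:yyyy-MM-dd}");
        }
    }
}
```

Whether a database provider translates this into an efficient query depends on the provider and version, so the generated SQL is worth checking.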

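This is much easier in LINQ than in SQL, since LINQ can take the first item of each group. Sort descending so that the first item of each group is the one with the latest ProposalDate:
return
ObjectContext.TH_Proposal
.OrderByDescending(y => y.ProposalDate)
.GroupBy(y => y.ProposalID)
.Select(g => g.First());
Note that LINQ to Objects preserves the source order within each group, but a database provider is not guaranteed to, so ordering inside the group is the safer form.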

Related

How to select last record in a LINQ GroupBy clause

I have the following simple table with ID, ContactId and Comment.
I want to select records and GroupBy contactId. I used this LINQ extension method statement:
Mains.GroupBy(l => l.ContactID)
.Select(g => g.FirstOrDefault())
.ToList()
It returns record 1 and 4. How can I use LINQ to get the ContactID with the highest ID? (i.e. return 3 and 6)
You can order your items:
Mains.GroupBy(l => l.ContactID)
.Select(g=>g.OrderByDescending(c=>c.ID).FirstOrDefault())
.ToList()
Use OrderByDescending on the items in the group:
Mains.GroupBy(l => l.ContactID)
.Select(g => g.OrderByDescending(l => l.ID).First())
.ToList();
Also, there is no need for FirstOrDefault when selecting an item from the group; a group will always have at least one item so you can use First() safely.
Selecting with Max instead of OrderByDescending might improve performance (I'm not sure how it is translated internally, so it needs to be tested):
var grouped = Mains.GroupBy(l => l.ContactID);
var ids = grouped.Select(g => g.Max(x => x.ID));
var result = Mains.Where(m => ids.Contains(m.ID));
I assume this could produce a query that computes the MAX per group and then runs SELECT * FROM ... WHERE id IN ({max ids here}), which could be significantly faster than OrderByDescending.
Feel free to correct me if I'm not right.
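A minimal in-memory sketch of the Max-based idea is shown below. The MainRow class, the LastPerContact helper, and the sample rows are assumptions (the question's table is not reproduced here); the rows are chosen so that taking the first of each group would give IDs 1 and 4, as the question describes. It says nothing about how a database provider would translate the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the Mains table.
public class MainRow
{
    public int ID { get; set; }
    public int ContactID { get; set; }
    public string Comment { get; set; }
}

public static class MaxPerGroupDemo
{
    // Step 1: collect the highest ID of each ContactID group.
    // Step 2: keep only the rows whose ID is one of those maxima.
    public static List<MainRow> LastPerContact(IEnumerable<MainRow> rows)
    {
        var list = rows.ToList();
        var maxIds = new HashSet<int>(
            list.GroupBy(r => r.ContactID)
                .Select(g => g.Max(r => r.ID)));
        return list.Where(r => maxIds.Contains(r.ID)).ToList();
    }

    public static void Main()
    {
        // Sample data invented for the example.
        var rows = new List<MainRow>
        {
            new MainRow { ID = 1, ContactID = 10, Comment = "a" },
            new MainRow { ID = 2, ContactID = 10, Comment = "b" },
            new MainRow { ID = 3, ContactID = 10, Comment = "c" },
            new MainRow { ID = 4, ContactID = 20, Comment = "d" },
            new MainRow { ID = 5, ContactID = 20, Comment = "e" },
            new MainRow { ID = 6, ContactID = 20, Comment = "f" },
        };

        foreach (var r in LastPerContact(rows))
        {
            Console.WriteLine($"{r.ID} (contact {r.ContactID})");
        }
    }
}
```

This relies on ID being unique across the whole table; if it were only unique within a contact, the Contains filter could match rows from other groups.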
OrderByDescending
Mains.GroupBy(l => l.ContactID)
.Select(g=>g.OrderByDescending(c=>c.ID).FirstOrDefault())
.ToList()
is your best solution
It orders by the highest ID descending, which is pretty obvious from the name.
You could use MoreLinq like this for a shorter solution:
Mains.OrderByDescending(i => i.ID).DistinctBy(l => l.ContactID).ToList();

What can I do to improve the speed of this query?

I have a linq query that returns the last page a user looked at based on a table of page hits. The fields are simply TimeStamp, UserID and URL which are logged from user activity. The query looks like this:
public static IQueryable GetUserStatus()
{
var ctx = new AppEntities();
var currentPageHits = ctx.Pagehits
.GroupBy(x => x.UserID)
.Select(x => x.Where(y => y.TimeStamp == x.Max(z => z.TimeStamp)))
.SelectMany(x => x);
return currentPageHits.OrderByDescending(o => o.TimeStamp);
}
The query works perfectly but runs slowly. Our DBA assures us that the table has indexes in all the right places and that the trouble must be with the query.
Is there anything inherently wrong or BAD with this, or is there a more efficient way of getting the same results?
You could try:
var currentPageHits2 = ctx.Pagehits
.GroupBy(x => x.UserID)
.Select(x => x.OrderByDescending(y => y.TimeStamp).First())
.OrderByDescending(x => x.TimeStamp);
But the speed should be the same.
Note that there is a subtle difference between this query and yours: with yours, if a UserID has two PageHits sharing the same maximum TimeStamp, two rows are returned; with this one, only one is returned.
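The difference in tie handling can be seen with a small in-memory sketch. The PageHit class, the two helper methods, and the sample hits are assumptions for illustration; the query shapes follow the two variants discussed above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Pagehits row.
public class PageHit
{
    public int UserID { get; set; }
    public DateTime TimeStamp { get; set; }
    public string URL { get; set; }
}

public static class TieDemo
{
    // Keeps every hit whose TimeStamp equals the user's maximum (ties are kept).
    public static List<PageHit> LatestWithTies(IEnumerable<PageHit> hits)
    {
        return hits
            .GroupBy(h => h.UserID)
            .SelectMany(g => g.Where(h => h.TimeStamp == g.Max(x => x.TimeStamp)))
            .ToList();
    }

    // Keeps exactly one hit per user, even when several share the maximum.
    public static List<PageHit> LatestSingle(IEnumerable<PageHit> hits)
    {
        return hits
            .GroupBy(h => h.UserID)
            .Select(g => g.OrderByDescending(h => h.TimeStamp).First())
            .ToList();
    }

    public static void Main()
    {
        var t = new DateTime(2024, 1, 1, 12, 0, 0);
        // User 1 has two hits with the same latest TimeStamp.
        var hits = new List<PageHit>
        {
            new PageHit { UserID = 1, TimeStamp = t, URL = "/a" },
            new PageHit { UserID = 1, TimeStamp = t, URL = "/b" },
            new PageHit { UserID = 2, TimeStamp = t.AddMinutes(-5), URL = "/c" },
        };

        Console.WriteLine(LatestWithTies(hits).Count); // 3
        Console.WriteLine(LatestSingle(hits).Count);   // 2
    }
}
```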
So you are trying to implement DENSE_RANK() OVER (PARTITION BY UserID ORDER BY TimeStamp DESC) with LINQ, that is, to get all of the latest records per user group according to the TimeStamp? You could try:
public static IQueryable GetUserStatus()
{
var ctx = new AppEntities();
var currentPageHits = ctx.Pagehits
.GroupBy(x => x.UserID)
.SelectMany(x => x.GroupBy(y => y.TimeStamp).OrderByDescending(g=> g.Key).FirstOrDefault())
.OrderByDescending(x => x.TimeStamp);
return currentPageHits;
}
This groups each user group by TimeStamp and then takes the latest group (one or more records in case of ties). The SelectMany flattens the groups into records. I think this is more efficient than your query.

LINQ: Select all from each group except the first item

It is easy to select the first of each group:
var firstOfEachGroup = dbContext.Measurements
.OrderByDescending(m => m.MeasurementId)
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
.Select(g => g.First());
But...
Question: how can I select all from each group except the first item?
var everythingButFirstOfEachGroup = dbContext.Measurements
.OrderByDescending(m => m.MeasurementId)
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
.Select( ...? );
Additional information:
My real goal is to delete all duplicates except the last (in a bulk way, i.e. not using an in-memory foreach), so after the previous query I want to use RemoveRange:
dbContext.Measurements.RemoveRange(everythingButFirstOfEachGroup);
So, if my question makes no sense, this information might be handy.
Use Skip(1) to skip the first record and select the rest.
Something like:
var everythingButFirstOfEachGroup = dbContext.Measurements
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
.SelectMany(g => g.OrderByDescending(r => r.MeasurementId).Skip(1));
See: Enumerable.Skip
If you do not need a flattened collection, replace SelectMany with Select in the code snippet.
IGrouping<K, V> implements IEnumerable<V>; you simply need to skip inside the select clause to apply it to each group:
.Select(g => g.Skip(1))
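A minimal in-memory sketch of skipping the first item of each group follows. The Measurement class, the AllButNewest helper, and the sample rows are assumptions for illustration; it orders each group by MeasurementId so that "first" means the newest row, and it does not show how RemoveRange or a database provider would behave.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Measurements entity.
public class Measurement
{
    public int MeasurementId { get; set; }
    public string SomeColumn { get; set; }
}

public static class SkipFirstDemo
{
    // For each SomeColumn value, returns every row except the one with the
    // highest MeasurementId. Groups with a single row contribute nothing,
    // because Skip(1) on a one-element sequence is empty.
    public static List<Measurement> AllButNewest(IEnumerable<Measurement> rows)
    {
        return rows
            .GroupBy(m => m.SomeColumn)
            .SelectMany(g => g.OrderByDescending(m => m.MeasurementId).Skip(1))
            .ToList();
    }

    public static void Main()
    {
        // Sample data invented for the example: "A" is duplicated three times.
        var rows = new List<Measurement>
        {
            new Measurement { MeasurementId = 1, SomeColumn = "A" },
            new Measurement { MeasurementId = 2, SomeColumn = "A" },
            new Measurement { MeasurementId = 3, SomeColumn = "A" },
            new Measurement { MeasurementId = 4, SomeColumn = "B" },
            new Measurement { MeasurementId = 5, SomeColumn = "C" },
        };

        foreach (var m in AllButNewest(rows))
        {
            Console.WriteLine($"{m.MeasurementId} ({m.SomeColumn})");
        }
    }
}
```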
You can always use .Distinct() to remove duplicates; presumably sorting or reverse-sorting and then applying .Distinct() would give you what you want.

How Can I Generate Nhibernate GROUP BY without SELECT the property

I'd like to solve this problem:
SELECT Max(Date)
FROM Table
GROUP BY SubId
(Then pass it as a SubQuery to mid-action so I can get the Id of the item in Table)
SELECT Id
FROM Table
WHERE Date in
[[[ previous request ]]]
(Then Get the full Table Item with other table join)
SELECT *
FROM Table
LEFT JOIN...
WHERE Id in
[[[ previous request ]]]
I tried this kind of request :
var subquery = QueryOver.Of<Table>(() => x)
.SelectList(list => list
.SelectMax(() => x.Date)
.SelectGroup(() => x.Sub.Id)
);
var filter = QueryOver.Of<Table>().WithSubquery.
WhereExists(subquery)
.Select(p => p.Id);
var result = Session.QueryOver<Table>().WithSubquery.WhereProperty(p => p.Id).In(filter).Left.JoinQueryOver(p => p.Sub).List();
But the problem is that I can't get the first query to return only the date.
Is there a better way to do this than with this kind of subquery? And is it possible in NHibernate to group by a property without selecting it?
Thanks!
I finally did it this way, and it generated the SQL I wanted. It was not exactly three subqueries, though: it was three queries, each filtering on a set of data (the arrays subquery and CorrespondingIds).
var subquery = Session.QueryOver<Table>(() => x)
.SelectList(list => list
.SelectMax(() => x.Date)
.SelectGroup(() => x.Sub.Id))
.List<object[]>().Select(p => p[0]).ToArray();
var CorrespondingIds = Session.QueryOver<Table>(() => x)
.WhereRestrictionOn(() => x.Date).IsIn(subquery)
.Select(p => p.Id).List<int>().ToArray();
var result = Session.QueryOver<Table>(() => x).WhereRestrictionOn(() => x.Id).IsIn(CorrespondingIds).Left.JoinQueryOver(p => p.Sub).List();

LINQ group by then order groups of result

I have a table that has the following 3 columns, ID, ShortCode, UploadDate.
I want to use LINQ to group the results by shortcode (and keep all the results) then order those groups and return a list.
I have the following:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList<PDFDocument>().
GroupBy(b=>b.ShortCode)
.SelectMany(b=>b).ToList<PDFDocument>()
I want to return all results, grouped by ShortCode, the items within each group sorted by UploadDate and the groups sorted so the one that has the most recent document in it first.
Does anyone know if this is even possible?
Try
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.AsEnumerable()
.OrderByDescending(d => d.UploadDate)
.GroupBy(d => d.ShortCode)
.SelectMany(g => g)
.ToList();
This should:
1. Order the items by upload date (descending, so newest first)
2. Then group them by short code, so within each group the items are still sorted
3. The groups are still in descending order, so no need to order again
4. Finally concatenate the results into a single list
If performance is an issue, you may be better off doing:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.AsEnumerable()
.GroupBy(d => d.ShortCode)
.Select(g => g.OrderByDescending(d => d.UploadDate))
.OrderByDescending(e => e.First().UploadDate)
.SelectMany(e => e)
.ToList();
which sorts the contents of each group separately rather than sorting everything first and then grouping.
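The group-then-sort variant can be tried on in-memory data with the sketch below. The PDFDocument shape, the NewestGroupsFirst helper, and the sample documents are assumptions for illustration; each sorted group is materialized with ToList so that it is not re-sorted when its first element is read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the PDFDocument rows (ID, ShortCode, UploadDate).
public class PDFDocument
{
    public int ID { get; set; }
    public string ShortCode { get; set; }
    public DateTime UploadDate { get; set; }
}

public static class GroupOrderDemo
{
    // Groups by ShortCode, sorts each group newest first, then sorts the
    // groups by their newest document and flattens back to a single list.
    public static List<PDFDocument> NewestGroupsFirst(IEnumerable<PDFDocument> docs)
    {
        return docs
            .GroupBy(d => d.ShortCode)
            .Select(g => g.OrderByDescending(d => d.UploadDate).ToList())
            .OrderByDescending(g => g[0].UploadDate)
            .SelectMany(g => g)
            .ToList();
    }

    public static void Main()
    {
        // Sample data invented for the example.
        var docs = new List<PDFDocument>
        {
            new PDFDocument { ID = 1, ShortCode = "A", UploadDate = new DateTime(2024, 1, 1) },
            new PDFDocument { ID = 2, ShortCode = "B", UploadDate = new DateTime(2024, 3, 1) },
            new PDFDocument { ID = 3, ShortCode = "A", UploadDate = new DateTime(2024, 2, 1) },
            new PDFDocument { ID = 4, ShortCode = "B", UploadDate = new DateTime(2024, 1, 15) },
        };

        // Group B comes first (newest document is 1 March), then group A.
        foreach (var d in NewestGroupsFirst(docs))
        {
            Console.WriteLine($"{d.ShortCode} {d.ID} {d.UploadDate:yyyy-MM-dd}");
        }
    }
}
```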
In fact, you don't want to group by short code, you want to order by them. So the following query should do the trick:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.OrderBy(b => b.ShortCode)
.ThenBy(b => b.UploadDate)
.ToList()
Edit
If you really want to use a GroupBy, you can do so this way:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.GroupBy(b => b.ShortCode)
.SelectMany(grouping => grouping.OrderBy(b => b.UploadDate))
.ToList()
But I discourage it. There is no point creating groups if you do not want groups in the first place!
Second edit
I did not realize you wanted the groups ordered by UploadDate too. That complicates the query a little:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.GroupBy(b => b.ShortCode)
.Select(grouping => grouping.OrderByDescending(b => b.UploadDate))
.OrderByDescending(grouping => grouping.First().UploadDate)
.SelectMany(grouping => grouping)
.ToList()
