Return list from grouped values - c#

I want to make a table of the top 10 bestselling items. However, when I use GroupBy in my query I can't return a list.
This is the SQL statement that I tested on my database:
SELECT itemId, SUM(amountOrdered) AS amount FROM order GROUP BY itemId ORDER BY amount desc LIMIT 10;
And this is how I approached it in my project:
public List<Order?> GetBestsellingItems()
{
var result = context.Order
.Include(p => p.Item)
.Where(p => p.ItemId== p.Item.ItemId)
.GroupBy(p => p.ItemId)
.OrderByDescending(g => g.Sum(p => p.AmountOrdered))
.Take(10)
.ToList();
return result;
}
The error I get is:
Cannot implicitly convert type 'System.Collections.Generic.List<System.Linq.IGrouping<int, Project.Models.Order>>' to 'System.Collections.Generic.List<Project.Models.Order?>'

You are close. Your grouping contains a collection of orders for each Item ID. Once you have reduced the list to the top 10 groups, you need to extract the desired items. Since the group key only contains the Item ID, you need to drill down into the collection of orders, pick one (any will do), and select the linked Item object.
The result will be a list of Items, not a list of Orders.
Try:
public List<Item> GetBestsellingItems()
{
var result = context.Order
.Include(p => p.Item)
.GroupBy(p => p.ItemId)
.OrderByDescending(g => g.Sum(p => p.AmountOrdered))
.Take(10)
.Select(g => g.First().Item)
.ToList();
return result;
}
I omitted the .Where(p => p.ItemId == p.Item.ItemId) operation, because I don't believe it does anything. If that test ever fails, you have some serious database integrity issues on your hands.
If you also want to return the calculated amounts along with the items, you will need to change the return type to a list of tuples or custom objects, and then change the .Select(...) to construct each instance from both the item and the calculated amount.
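As a rough illustration of returning the amounts together with the items, here is a minimal LINQ-to-Objects sketch. The in-memory list, the ItemName field and the tuple shape are assumptions made for the example and are not part of the original model. Against a real EF context you would likely project to an anonymous type or a small DTO class instead, because tuple literals are not allowed inside expression trees.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for context.Order: (ItemId, ItemName, AmountOrdered).
var orders = new List<(int ItemId, string ItemName, int AmountOrdered)>
{
    (1, "Pen", 5),
    (2, "Notebook", 20),
    (1, "Pen", 7),
    (3, "Eraser", 2),
    (2, "Notebook", 1),
};

// Group by item, sum the ordered amounts, and keep both the item data and the total.
List<(int ItemId, string ItemName, int Amount)> bestsellers = orders
    .GroupBy(o => (o.ItemId, o.ItemName))
    .Select(g => (g.Key.ItemId, g.Key.ItemName, Amount: g.Sum(o => o.AmountOrdered)))
    .OrderByDescending(x => x.Amount)
    .Take(10)
    .ToList();

foreach (var b in bestsellers)
    Console.WriteLine($"{b.ItemId} {b.ItemName}: {b.Amount}");
```

Projecting the sum once in Select and ordering by the projected value avoids writing the aggregate twice.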

Related

Linq OrderBy then GroupBy - group by unexpectedly changes order of list

When I run this expression, I can see that the list is correctly ordered by highest ActionId first:
var list = db.Actions
    .Where(z => z.RunId == RunId)
    .OrderByDescending(w => w.ActionId)
    .ToList();
I only want to select the highest ActionId's of each ActionName so I now do:
var list = db.Actions
    .Where(z => z.RunId == RunId)
    .OrderByDescending(w => w.ActionId)
    .GroupBy(c => new
    {
        c.ActionName,
        c.MachineNumber,
    })
    .Select(y => y.FirstOrDefault())
    .ToList();
When I look at the contents of list, it hasn't selected the ActionName/MachineNumber with the highest ActionId, which I assumed would be the case by ordering then using FirstOrDefault().
Any idea where I'm going wrong? I want to group the records by ActionName and MachineNumber, and then pick the record with the highest ActionId for each group.
Instead of grouping an ordered collection, group the collection first, and then select the record with the highest ID for each of the groups. GroupBy is not guaranteed to preserve the order in each group in LINQ to SQL - it depends on your database server.
var list = db.Actions.Where(z => z.RunId == RunId).GroupBy(c => new
{
c.ActionName,
c.MachineNumber,
})
.Select(y => y.OrderByDescending(z => z.ActionId).FirstOrDefault()).ToList();
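To see the group-first, order-inside-the-group pattern in isolation, here is a small in-memory sketch. The tuple list is a hypothetical stand-in for db.Actions, and the sample values are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for db.Actions: (ActionId, ActionName, MachineNumber).
var actions = new List<(int ActionId, string ActionName, int MachineNumber)>
{
    (1, "Start", 1),
    (2, "Stop", 1),
    (3, "Start", 1),
    (4, "Start", 2),
    (5, "Stop", 1),
};

// Group first, then pick the row with the highest ActionId inside each group.
var latestPerGroup = actions
    .GroupBy(a => (a.ActionName, a.MachineNumber))
    .Select(g => g.OrderByDescending(a => a.ActionId).First())
    .ToList();

foreach (var a in latestPerGroup)
    Console.WriteLine($"{a.ActionName}/{a.MachineNumber}: {a.ActionId}");
```

Because the ordering happens inside each group, the result no longer depends on whether the provider keeps the outer ordering through GroupBy.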

Field expression GroupBy not returning included objects

In this code:
var dbrepayments = _context.Repayments.Include("Loan").Include("Loan.Borrower").Include("Loan.LoanProduct")
.Where(c => c.PaidOn == null && c.DateOfRepayment <= today)
.GroupBy(c => c.Loan.Id, (key, g) => g.OrderByDescending(c => c.Id).FirstOrDefault())
.OrderBy(c => c.DateOfRepayment);
_context is of type ApplicationDbContext, which I use to get results from the database with the Code-First approach.
The problem is when I try to iterate through dbrepayments and get the value of Loan, Loan.Borrower, and Loan.LoanProduct objects they are showing as null. But when I remove GroupBy, these objects are returned correctly.
I'd wager the issue here is the element selector in your GroupBy statement:
(key, g) => g.OrderByDescending(c => c.Id).FirstOrDefault()
This didn't make a lot of sense when I first read it. You are taking repayments grouped by loan, but then selecting just the latest repayment for each loan, and finally ordering those latest repayments by date.
I believe this will give you the results you're looking for with the eager loaded relationships:
var dbrepayments = _context.Repayments.Include("Loan").Include("Loan.Borrower").Include("Loan.LoanProduct")
.Where(c => c.PaidOn == null && c.DateOfRepayment <= today)
.GroupBy(c => c.Loan.Id)
.Select(c => c.OrderByDescending(x => x.Id).FirstOrDefault())
.OrderBy(c => c.DateOfRepayment);
GroupBy will respect Include but if you are using a select expression, that overrides it. You cannot add Include inside the selector as that is working with IEnumerable of the expected results. Instead, group the results by loan as expected, but then Select from the results to get the latest repayment. This will give you a list of the latest repayments that you can then order.

How can I split a List<T> into two lists, one containing all duplicate values and the other containing the remainder?

I have a basic class for an Account (other properties removed for brevity):
public class Account
{
public string Email { get; set; }
}
I have a List<T> of these accounts.
I can remove duplicates based on the e-mail address easily:
var uniques = list.GroupBy(x => x.Email).Select(x => x.First()).ToList();
The list named 'uniques' now contains only one of each account based on e-mail address, any that were duplicates were discarded.
I want to do something a little different and split the list into two.
One list will contain only 'true' unique values, the other list will contain all duplicates.
For example the following list of Account e-mails:
unique#email.com
dupe#email.com
dupe#email.com
Would be split into two lists:
Unique
unique#email.com
Duplicates
dupe#email.com
dupe#email.com
I have been able to achieve this already by creating a list of unique values using the example at the top. I then use .Except() on the original list to get the differences which are the duplicates. Lastly I can loop over each duplicate to 'pop' it out of the unique list and move it to the duplicate list.
Here is a working example on .NET Fiddle
Can I split the list in a more efficient or syntactically sugary way?
I'd be happy to use a third party library if necessary but I'd rather just stick to pure LINQ.
I'm aware of CodeReview but feel the question also fits here.
var groups = list.GroupBy(x => x.Email)
.GroupBy(g => g.Count() == 1 ? 0 : 1)
.OrderBy(g => g.Key)
.Select(g => g.SelectMany(x => x))
.ToList();
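groups[0] will be the unique ones and groups[1] will be the non-unique ones, provided the list contains both kinds. If it contains only uniques or only duplicates, groups has a single element.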
var duplicates = list.GroupBy(x => x) // or x.Property if you are grouping by some property.
.Where(g => g.Count() > 1)
.SelectMany(g => g);
var uniques = list.GroupBy(x => x) // or x.Property if you are grouping by some property.
.Where(g => g.Count() == 1)
.SelectMany(g => g);
Alternatively, once you get one list, you can get the other one using Except:
var uniques = list.Except(duplicates);
// or
var duplicates = list.Except(uniques);
Another way to do it would be to get uniques, and then for duplicates simply get the elements in the original list that aren't in uniques.
IEnumerable<Account> uniques;
IEnumerable<Account> dupes;
dupes = list.Where(d =>
!(uniques = list.GroupBy(x => x.Email)
.Where(g => g.Count() == 1)
.SelectMany(u => u))
.Contains(d));
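A variant of the grouping answers above that groups only once is to partition the groups with ToLookup. This is a sketch under assumptions: plain strings stand in for Account.Email, and the variable names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Plain strings stand in for Account.Email in this sketch.
var emails = new List<string> { "unique#email.com", "dupe#email.com", "dupe#email.com" };

// Group once, then partition the groups by whether they hold exactly one element.
var split = emails
    .GroupBy(e => e)
    .ToLookup(g => g.Count() == 1);

// A lookup returns an empty sequence for a missing key, so both lists are always safe to read.
var uniqueEmails = split[true].SelectMany(g => g).ToList();
var duplicateEmails = split[false].SelectMany(g => g).ToList();

Console.WriteLine($"unique: {uniqueEmails.Count}, duplicates: {duplicateEmails.Count}");
```

For the sample input this leaves one entry in uniqueEmails and both copies of the duplicate in duplicateEmails.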

What can I do to improve the speed of this query?

I have a linq query that returns the last page a user looked at based on a table of page hits. The fields are simply TimeStamp, UserID and URL which are logged from user activity. The query looks like this:
public static IQueryable GetUserStatus()
{
var ctx = new AppEntities();
var currentPageHits = ctx.Pagehits
.GroupBy(x => x.UserID)
.Select(x => x.Where(y => y.TimeStamp == x.Max(z => z.TimeStamp)))
.SelectMany(x => x);
return currentPageHits.OrderByDescending(o => o.TimeStamp);
}
The query works perfectly but runs slowly. Our DBA assures us that the table has indexes in all the right places and that the trouble must be with the query.
Is there anything inherently wrong or BAD with this, or is there a more efficient way of getting the same results?
You could try:
var currentPageHits2 = ctx.Pagehits
.GroupBy(x => x.UserID)
.Select(x => x.OrderByDescending(y => y.TimeStamp).First())
.OrderByDescending(x => x.TimeStamp);
But the speed should be the same.
Note that there is a subtle difference between this query and yours... With yours, if a UserId has two "max TimeStamp" PageHits with the same TimeStamp, two "rows" will be returned, with this one only one will be returned.
So you are trying to implement DENSE_RANK() OVER (PARTITION BY UserID ORDER BY TimeStamp DESC) in LINQ and keep the top-ranked rows, that is, all of the latest records per user group according to the TimeStamp? You could try:
public static IQueryable GetUserStatus()
{
var ctx = new AppEntities();
var currentPageHits = ctx.Pagehits
.GroupBy(x => x.UserID)
.SelectMany(x => x.GroupBy(y => y.TimeStamp).OrderByDescending(g=> g.Key).FirstOrDefault())
.OrderByDescending(x => x.TimeStamp);
return currentPageHits;
}
So it groups each user group by TimeStamp and then takes the latest group (one or more records in case of ties). The SelectMany flattens the groups to records. I think this is more efficient than your query.
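The difference in tie handling between the two suggested queries can be checked with a small in-memory sketch. The tuple list is a hypothetical stand-in for ctx.Pagehits, and the sample data is invented. It says nothing about the speed of the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ctx.Pagehits: (UserID, TimeStamp, URL).
var hits = new List<(int UserID, DateTime TimeStamp, string URL)>
{
    (1, new DateTime(2024, 1, 1, 10, 0, 0), "/a"),
    (1, new DateTime(2024, 1, 1, 11, 0, 0), "/b"),
    (1, new DateTime(2024, 1, 1, 11, 0, 0), "/c"), // ties with "/b"
    (2, new DateTime(2024, 1, 1, 9, 0, 0), "/d"),
};

// Variant 1: exactly one row per user, even when timestamps tie.
var onePerUser = hits
    .GroupBy(h => h.UserID)
    .Select(g => g.OrderByDescending(h => h.TimeStamp).First())
    .OrderByDescending(h => h.TimeStamp)
    .ToList();

// Variant 2: every row that shares the user's latest timestamp.
var allLatest = hits
    .GroupBy(h => h.UserID)
    .SelectMany(g => g.GroupBy(h => h.TimeStamp).OrderByDescending(tg => tg.Key).First())
    .OrderByDescending(h => h.TimeStamp)
    .ToList();

Console.WriteLine($"{onePerUser.Count} vs {allLatest.Count}"); // prints "2 vs 3"
```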

LINQ group by then order groups of result

I have a table that has the following 3 columns, ID, ShortCode, UploadDate.
I want to use LINQ to group the results by shortcode (and keep all the results) then order those groups and return a list.
I have the following:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
    .ToList<PDFDocument>()
    .GroupBy(b => b.ShortCode)
    .SelectMany(b => b)
    .ToList<PDFDocument>()
I want to return all results, grouped by ShortCode, the items within each group sorted by UploadDate and the groups sorted so the one that has the most recent document in it first.
Does anyone know if this is even possible?
Try
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.AsEnumerable()
.OrderByDescending(d => d.UploadDate)
.GroupBy(d => d.ShortCode)
.SelectMany(g => g)
.ToList();
This should:
1. Order the items by upload date (descending, so newest first).
2. Then group them by short code, so within each group the items are still sorted.
3. Leave the groups in descending order of their newest item, so there is no need to order again.
4. Finally concatenate the results into a single list.
If performance is an issue, you may be better off doing
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.AsEnumerable()
.GroupBy(d => d.ShortCode)
.Select(g => g.OrderByDescending(d => d.UploadDate))
.OrderByDescending(e => e.First().UploadDate)
.SelectMany(e => e)
.ToList();
which sorts the contents of each group separately rather than sorting everything first and then grouping.
In fact, you don't want to group by short code, you want to order by them. So the following query should do the trick:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.OrderBy(b => b.ShortCode)
.ThenBy(b => b.UploadDate)
.ToList()
Edit
If you really want to use a GroupBy, you can do so this way:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.GroupBy(b => b.ShortCode)
.SelectMany(grouping => grouping.OrderBy(b => b.UploadDate))
.ToList()
But I discourage it. There is no point creating groups if you do not want groups in the first place!
Second edit
I did not realize you wanted the groups ordered by UploadDate too. That complicates the query a little:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.GroupBy(b => b.ShortCode)
.Select(grouping => grouping.OrderByDescending(b => b.UploadDate))
.OrderByDescending(grouping => grouping.First().UploadDate)
.SelectMany(grouping => grouping)
.ToList()
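The group-then-order-groups queries above can be tried on in-memory data as follows. This is a sketch with a hypothetical tuple list in place of PDFDocument. It also materializes each sorted group with ToList(), an addition not in the original answers, so that the lazy inner OrderByDescending is not evaluated a second time when the groups are ordered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the PDFDocument rows: (ID, ShortCode, UploadDate).
var docs = new List<(int ID, string ShortCode, DateTime UploadDate)>
{
    (1, "A", new DateTime(2024, 1, 1)),
    (2, "B", new DateTime(2024, 3, 1)),
    (3, "A", new DateTime(2024, 2, 1)),
    (4, "B", new DateTime(2023, 12, 1)),
    (5, "C", new DateTime(2024, 2, 15)),
};

var ordered = docs
    .GroupBy(d => d.ShortCode)
    // Sort inside each group, newest first, and materialize the sorted group once.
    .Select(g => g.OrderByDescending(d => d.UploadDate).ToList())
    // Order the groups by their newest document.
    .OrderByDescending(g => g[0].UploadDate)
    // Flatten back into a single list.
    .SelectMany(g => g)
    .ToList();

Console.WriteLine(string.Join(",", ordered.Select(d => d.ID))); // prints "2,4,5,3,1"
```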
