LINQ group by then order groups of result - c#

I have a table that has the following 3 columns, ID, ShortCode, UploadDate.
I want to use LINQ to group the results by shortcode (and keep all the results) then order those groups and return a list.
I have the following:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList<PDFDocument>()
.GroupBy(b=>b.ShortCode)
.SelectMany(b=>b).ToList<PDFDocument>()
I want to return all results grouped by ShortCode, with the items within each group sorted by UploadDate and the groups sorted so that the one containing the most recent document comes first.
Does anyone know if this is even possible?

Try
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.AsEnumerable()
.OrderByDescending(d => d.UploadDate)
.GroupBy(d => d.ShortCode)
.SelectMany(g => g)
.ToList();
This should:
Order the items by upload date (descending, so newest first)
Then group them by short code, so within each group the items are still sorted
Leave the groups in descending order of their newest item, because LINQ to Objects GroupBy yields groups in the order their keys first appear, so there is no need to order again
Finally concatenate the results into a single list
If performance is an issue you may be better off doing
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.AsEnumerable()
.GroupBy(d => d.ShortCode)
.Select(g => g.OrderByDescending(d => d.UploadDate))
.OrderByDescending(e => e.First().UploadDate)
.SelectMany(e => e)
.ToList();
which sorts the contents of each group separately rather than sorting everything first and then grouping.
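A minimal in-memory sketch of this second approach may help in checking the ordering. The `Doc` record and the `GroupThenOrder` helper are hypothetical stand-ins for `PDFDocument` and the query above; only the `ShortCode` and `UploadDate` names come from the question. Each group is materialised with `ToList()` so that reading its first element does not re-run the inner sort.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for PDFDocument with the three columns from the question.
public record Doc(int ID, string ShortCode, DateTime UploadDate);

public static class GroupOrdering
{
    // Groups by ShortCode, sorts each group newest first, then sorts the
    // groups by their newest document and flattens back to a single list.
    public static List<Doc> GroupThenOrder(IEnumerable<Doc> docs) =>
        docs.GroupBy(d => d.ShortCode)
            .Select(g => g.OrderByDescending(d => d.UploadDate).ToList())
            .OrderByDescending(g => g[0].UploadDate)
            .SelectMany(g => g)
            .ToList();
}
```

Every group has at least one element, so indexing `g[0]` is safe here.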

In fact, you don't want to group by short code, you want to order by it. So the following query should do the trick:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.OrderBy(b => b.ShortCode)
.ThenBy(b => b.UploadDate)
.ToList()
Edit
If you really want to use a GroupBy, you can do so this way:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.GroupBy(b => b.ShortCode)
.SelectMany(grouping => grouping.OrderBy(b => b.UploadDate))
.ToList()
But I discourage it. There is no point creating groups if you do not want groups in the first place!
Second edit
I did not realise you wanted the groups ordered by UploadDate too. That complicates the query a little:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.GroupBy(b => b.ShortCode)
.Select(grouping => grouping.OrderByDescending(b => b.UploadDate))
.OrderByDescending(grouping => grouping.First().UploadDate)
.SelectMany(grouping => grouping)
.ToList()

Related

How to select last record in a LINQ GroupBy clause

I have the following simple table with ID, ContactId and Comment.
I want to select records and GroupBy contactId. I used this LINQ extension method statement:
Mains.GroupBy(l => l.ContactID)
.Select(g => g.FirstOrDefault())
.ToList()
It returns record 1 and 4. How can I use LINQ to get the ContactID with the highest ID? (i.e. return 3 and 6)
You can order your items
Mains.GroupBy(l => l.ContactID)
.Select(g=>g.OrderByDescending(c=>c.ID).FirstOrDefault())
.ToList()
Use OrderByDescending on the items in the group:
Mains.GroupBy(l => l.ContactID)
.Select(g => g.OrderByDescending(l => l.ID).First())
.ToList();
Also, there is no need for FirstOrDefault when selecting an item from the group; a group will always have at least one item so you can use First() safely.
Selecting with Max instead of OrderByDescending might improve performance (I'm not sure how it is translated internally, so it needs to be tested):
var ids = Mains.GroupBy(l => l.ContactID).Select(g => g.Max(x => x.ID));
var result = Mains.Where(m => ids.Contains(m.ID));
I assume this could result in a query that takes the MAX per group and then does SELECT * FROM ... WHERE id IN ({max ids here}), which could be significantly faster than OrderByDescending.
Feel free to correct me if I'm not right.
OrderByDescending
Mains.GroupBy(l => l.ContactID)
.Select(g=>g.OrderByDescending(c=>c.ID).FirstOrDefault())
.ToList()
is your best solution
It orders by the highest ID descending, which is pretty obvious from the name.
You could use MoreLINQ like this for a shorter solution:
Mains.OrderByDescending(i => i.ID).DistinctBy(l => l.ContactID).ToList();
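For in-memory data on .NET 6 or later, `Enumerable.MaxBy` is another way to pick the row with the highest ID in each group without sorting. The sketch below assumes LINQ to Objects only; whether a database provider can translate `MaxBy` is not something the answers above establish. The `Main` record and `Latest` helper are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring the ID, ContactID and Comment columns.
public record Main(int ID, int ContactID, string Comment);

public static class LastPerContact
{
    // For each ContactID, keeps the single row with the highest ID.
    public static List<Main> Latest(IEnumerable<Main> rows) =>
        rows.GroupBy(r => r.ContactID)
            .Select(g => g.MaxBy(r => r.ID)!) // a group is never empty
            .ToList();
}
```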

Selecting distinct attributes where count isn't same

I have the following Linq query which selects a distinct list of attributes from all products:
products
.SelectMany(p => p.Attributes)
.Where(a => a.AttributeGroup.IsProductFilter)
.Distinct()
.ToList();
Each attribute is able to be assigned to each product, so I am only wanting a list of attributes where the number of attributes is less than the number of products (as they are used for filtering and there would be no change if the numbers were equal)
I'm not sure how to go about doing this - I thought I need to use GroupBy but wasn't sure how to get a list of attributes back:
IEnumerable<ProductAttribute> attributes = products.SelectMany(p => p.Attributes).Where(a => a.AttributeGroup.IsProductFilter);
return attributes.GroupBy(a => a.ID)
.Where(g => g.Count() < products.Count) // this is now an IEnumerable of group objects, so not sure how to get it back to an IEnumerable of attributes
Or this seemed a bit better
attributes.GroupBy(a => a)
.Where(g => g.Count() < products.Count)
.Select(g => g.ToList())
.Distinct()
.OrderBy(a => a.AttributeGroup.Order) // this doesn't work as a isn't an attribute
It's probably really simple but I'm not that great with Linq so any help solving this would be appreciated
I'm not sure, but doesn't SelectMany help here too?
return attributes.GroupBy(a => a.ID)
.Where(g => g.Count() < products.Count)
.SelectMany(g => g); // perhaps Distinct after
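If the goal is one attribute per surviving group rather than every duplicate, taking the first element of each group is one option, and it also makes the later `OrderBy` on `AttributeGroup.Order` possible. This is only a sketch: the record shapes below are assumptions reconstructed from the property names in the question, and `UsefulFilters` is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shapes, inferred from the property names used in the question.
public record AttributeGroup(bool IsProductFilter, int Order);
public record ProductAttribute(int ID, AttributeGroup AttributeGroup);
public record Product(List<ProductAttribute> Attributes);

public static class AttributeFilters
{
    // Returns one attribute per ID, keeping only filter attributes that are
    // assigned to fewer products than there are products in total.
    public static List<ProductAttribute> UsefulFilters(IReadOnlyCollection<Product> products) =>
        products.SelectMany(p => p.Attributes)
                .Where(a => a.AttributeGroup.IsProductFilter)
                .GroupBy(a => a.ID)
                .Where(g => g.Count() < products.Count)
                .Select(g => g.First())
                .OrderBy(a => a.AttributeGroup.Order)
                .ToList();
}
```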

LINQ expression: Specifying maximum groupby size

Is there an elegant way of doing the following in LINQ, or should I write an extension method for this?
I have a list of objects that need to be grouped by start date.
Let's say
09.00,
13.00,
13.00,
13.00,
15.00,
var groupedStartDates = startDate.GroupBy(x => x.StartDate);
I need the maximum group size to be 2.
Expected result is
var groupedStartDates = startDate.GroupBy(x => x.StartDate);
List
list1 {09.00}
list2 {13.00; 13.00}
list3 {13.00}
list4 {15.00}
After the initial grouping you can then group by the index (in the groups) divided by 2 to do a further grouping, then use SelectMany to flatten that back out.
var result = startDate.GroupBy(x => x.StartDate)
.SelectMany(grp => grp.Select((x,i) => new{x,i})
.GroupBy(a => a.i / 2)
.Select(sgrp => sgrp.Select(a => a.x)));
Here's a breakdown of what's going on. Note that curly brackets represent collections and square brackets represent objects with multiple properties.
Initial data
09.00, 13.00, 13.00, 13.00, 15.00
After GroupBy(x => x.StartDate)
[Key:09.00, {09.00}], [Key:13.00, {13.00, 13.00, 13.00}], [Key:15.00, {15.00}]
Now it's going to operate on each group, but I'll show the results for all of them at each step.
After the Select((x,i) => new{x,i})
{[x:09.00, i:0]}, {[x:13.00, i:0], [x:13.00, i:1], [x:13.00, i:2]}, {[x:15.00, i:0]}
After the GroupBy(a => a.i / 2)
{[Key:0, {[x:09.00, i:0]}]}, {[Key:0, {[x:13.00, i:0], [x:13.00, i:1]}], [Key:1, {[x:13.00, i:2]}]}, {[Key:0, {[x:15.00, i:0]}]}
After the .Select(sgrp => sgrp.Select(a => a.x))
{{09.00}}, {{13.00, 13.00}, {13.00}}, {{15.00}}
And finally the SelectMany will flatten that to:
{09.00}, {13.00, 13.00}, {13.00}, {15.00}
Note that each line represents a collection, but I didn't put curly braces around them as I felt it made it even harder to read.
Or with an extension method
public static IEnumerable<IEnumerable<T>> Bin<T>(this IEnumerable<T> items, int binSize)
{
return items
.Select((x,i) => new{x,i})
.GroupBy(a => a.i / binSize)
.Select(grp => grp.Select(a => a.x));
}
You can make it a little nicer.
var result = startDate
.GroupBy(x => x.StartDate)
.SelectMany(grp => grp.Bin(2));
Update: As of .NET 6 there is a new LINQ method, Chunk, that does the same thing as my Bin method above. So now you can do
var result = startDate
.GroupBy(x => x.StartDate)
.SelectMany(grp => grp.Chunk(2));
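As a self-contained sketch of the `Chunk` variant (requires .NET 6 or later), the helper below groups equal values and then caps each group at a given size. `GroupWithMaxSize` is a hypothetical name, and grouping on the value itself is a simplification of grouping on a `StartDate` property. Note that `Chunk` yields arrays (`T[]`) rather than lazy sequences.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GroupSizeLimit
{
    // Groups equal items together, then splits every group into chunks
    // holding at most maxSize items each.
    public static List<T[]> GroupWithMaxSize<T>(IEnumerable<T> items, int maxSize) =>
        items.GroupBy(x => x)
             .SelectMany(g => g.Chunk(maxSize))
             .ToList();
}
```

With the sample times from the question and a maximum size of 2, the three 13.00 entries end up split across two chunks.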
If I understand your question correctly, you can use Take:
var result= startDate.GroupBy(x => x.StartDate)
.Select(x => x.Take(2))
.ToList();
Each group will contain at most 2 members; any additional items in a group are dropped.

Ordering list by shared value of property

I have a list of objects with field UserID, Property:
I would like to order the list by the most shared property value. So if every user has Property = "Popular", that should come up first. If everyone but one user has Property = "Second", that should come up second in the list,
even if it is only used once for each user.
I could call Distinct() for each possible Property, but that doesn't seem efficient with many possible Property values.
You can use a grouping on Property, order the groups by the number of counts in each group and then flatten the list again using SelectMany():
var items = myList.GroupBy(x => x.Property)
.OrderByDescending(g => g.Count())
.SelectMany(g => g)
.ToList();
From your question it's not quite clear to me whether you want duplicates to show up or not, and whether you are interested in the UserID at all. If not, you can just select the keys of the groups to give you a List<string> of unique Property values in the desired order:
var props = myList.GroupBy(x => x.Property)
.OrderByDescending(g => g.Count())
.Select(g => g.Key)
.ToList();
Edit:
It seems like this is closer to what you are actually looking for: groups are ordered by the number of unique users that have a given property.
var props = myList.GroupBy(x => x.Property)
.OrderByDescending(g => g.Select(x=> x.UserID)
.Distinct()
.Count())
.Select(g => g.Key)
.ToList();
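To see why counting distinct users matters, here is a small sketch with in-memory data. The `Entry` record and `ByDistinctUsers` helper are hypothetical; only the `UserID` and `Property` names come from the question. A property repeated many times by a single user should rank below one shared by several users.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type with the two fields named in the question.
public record Entry(int UserID, string Property);

public static class SharedProperties
{
    // Orders property values by how many different users have them.
    public static List<string> ByDistinctUsers(IEnumerable<Entry> entries) =>
        entries.GroupBy(e => e.Property)
               .OrderByDescending(g => g.Select(e => e.UserID).Distinct().Count())
               .Select(g => g.Key)
               .ToList();
}
```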

LINQ DataTable Sum In Where Clause

I'm having a rough time figuring out exactly how to do a LINQ query on a DataTable and return a full row while having the WHERE clause test on a sum.
My code:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.Where(...)
.GroupBy(r => (decimal)r["AccountNumber"]));
There are multiple transactions for each AccountNumber, and I need to sum those together and determine whether they're less than a user inputted amount (For my purpose, it's called balanceGreaterThan). I can't find any examples to go by where someone has done this sort of thing.
Thanks in advance, SO.
Edit: My apologies -- The column I need to sum is called "Balance"
Edit 2: Final code
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => r.Field<string>("AccountNumber"))
.Where(g => g.Sum(r => r.Field<decimal>("Balance")) < balanceGreaterThan)
.SelectMany(g => g));
I had to change the GroupBy to use r.Field rather than r["AccountNumber"]
You're trying to filter the groups themselves (to find which groups have a large sum), not the rows that go into the groups.
Therefore, you need to put the Where() call after the GroupBy:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(g => g.Sum(r => r.Field<decimal>("Balance")) < balanceGreaterThan));
EDIT: If you want to get an IEnumerable<DataRow> back (as opposed to an IEnumerable<IGrouping<decimal, DataRow>>), you'll need to add .SelectMany(g => g).
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(g => g.Sum(r => (decimal)r["Balance"]) <= someInput)
.SelectMany(g => g));
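The whole pattern (group, filter on the group's sum, flatten back to rows) can be tried against a small hand-built DataTable. This is a sketch under assumptions: it follows the asker's final code in treating "AccountNumber" as a string column and "Balance" as a decimal column, and `RowsBelow` is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class BalanceFilter
{
    // Returns every row belonging to an account whose summed Balance
    // is below the given limit.
    public static List<DataRow> RowsBelow(DataTable table, decimal limit) =>
        table.AsEnumerable()
             .GroupBy(r => r.Field<string>("AccountNumber"))
             .Where(g => g.Sum(r => r.Field<decimal>("Balance")) < limit)
             .SelectMany(g => g)
             .ToList();
}
```

The result can be passed straight to `AddRange` on a `List<DataRow>`.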
I think you want something like:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(grp => grp.Sum(r => (decimal)r["amount"]) < balanceGreaterThan));
