LINQ: Select all from each group except the first item - c#

It is easy to select the first of each group:
var firstOfEachGroup = dbContext.Measurements
.OrderByDescending(m => m.MeasurementId)
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
.Select(g => g.First());
But...
Question: how can I select all from each group except the first item?
var everythingButFirstOfEachGroup = dbContext.Measurements
.OrderByDescending(m => m.MeasurementId)
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
.Select( ...? );
Additional information:
My real goal is to delete all duplicates except the last (in bulk, i.e. without an in-memory foreach), so after the previous query I want to use RemoveRange:
dbContext.Measurements.RemoveRange(everythingButFirstOfEachGroup);
So, if my question makes no sense, this information might be handy.

Use Skip(1) to skip the first record and select the rest.
Something like:
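var everythingButFirstOfEachGroup = dbContext.Measurements
.OrderByDescending(m => m.MeasurementId)
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
// Order inside each group by MeasurementId; every row in a group shares the same SomeColumn.
.SelectMany(g => g.OrderByDescending(m => m.MeasurementId).Skip(1));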
See: Enumerable.Skip
If you do not need a flattened collection then replace SelectMany with Select in code snippet.
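As a minimal in-memory sketch of the Skip(1) approach (LINQ to Objects, not a database query): the Measurement class and the sample rows below are assumptions made up for illustration, and whether the same expression translates to SQL depends on the provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity standing in for the rows of dbContext.Measurements.
class Measurement
{
    public int MeasurementId { get; set; }
    public string SomeColumn { get; set; } = "";
}

class Program
{
    static void Main()
    {
        var measurements = new List<Measurement>
        {
            new Measurement { MeasurementId = 1, SomeColumn = "A" },
            new Measurement { MeasurementId = 2, SomeColumn = "A" },
            new Measurement { MeasurementId = 3, SomeColumn = "B" },
            new Measurement { MeasurementId = 4, SomeColumn = "A" },
        };

        var everythingButFirstOfEachGroup = measurements
            .GroupBy(m => m.SomeColumn)
            // Sort each group so the highest MeasurementId comes first, then drop it.
            .SelectMany(g => g.OrderByDescending(m => m.MeasurementId).Skip(1))
            .ToList();

        foreach (var m in everythingButFirstOfEachGroup)
            Console.WriteLine($"{m.SomeColumn}: {m.MeasurementId}");
        // Prints "A: 2" then "A: 1"; group B has only one row, so nothing is selected from it.
    }
}
```

Because Skip(1) on a single-element group yields nothing, the Where(g => g.Count() > 1) filter is optional in this in-memory form.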

IGrouping<K, V> implements IEnumerable<V>; you simply need to skip inside the select clause to apply it to each group:
.Select(g => g.Skip(1))

You can always use .Distinct() to remove duplicates; presumably sorting or reverse-sorting and then applying .Distinct() would give you what you want.

Related

Linq get Distinct ToDictionary

I need help selecting only distinct entries based on i.Code.
There are duplicates, and so my expression throws the error "An item with the same key has already been added."
var myDictionary = dbContext.myDbTable
.Where(i => i.shoesize>= 4)
.OrderBy(i => i.Code)
.ToDictionary(i => i.Code, i => i);
I have tried Select and/or Distinct in different combinations, and also on their own, but I still get the same error:
var myDictionary= dbContext.myDbTable
.Where(i => i.shoesize>= 4)
.OrderBy(i => i.Code)
//.Select(i => i)
//.Distinct()
.ToDictionary(i => i.Code, i => i);
Can anybody help? C#
UPDATE: If there are multiple objects with the same code, I only want to add the first object (with that particular code) to myDictionary.
You can group by Code and select the first item from each group (which is equivalent to distinct):
var myDictionary = dbContext.myDbTable
.Where(i => i.shoesize >= 4) // filter
.GroupBy(x => x.Code) // group by Code
.Select(g => g.First()) // select 1st item from each group
.ToDictionary(i => i.Code, i => i);
You don't need the OrderBy, since a Dictionary is an unordered collection. If you need the entries kept sorted by key, you could use SortedDictionary.
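The GroupBy/First approach can be sketched in memory as follows. The tuple rows are hypothetical stand-ins for dbContext.myDbTable, so this only illustrates the LINQ to Objects behaviour, not how a database provider would translate it.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical rows; "X1" appears twice, which would break a plain ToDictionary.
        var rows = new[]
        {
            (Code: "X1", Shoesize: 5),
            (Code: "X1", Shoesize: 7),
            (Code: "Y2", Shoesize: 3),
            (Code: "Z3", Shoesize: 6),
        };

        var myDictionary = rows
            .Where(i => i.Shoesize >= 4)   // filter
            .GroupBy(i => i.Code)          // one group per Code
            .Select(g => g.First())        // keep the first row of each group
            .ToDictionary(i => i.Code, i => i);

        Console.WriteLine(myDictionary.Count);          // 2
        Console.WriteLine(myDictionary["X1"].Shoesize); // 5
    }
}
```

In LINQ to Objects, each group keeps its elements in source order, so First() here is the first matching row as it appeared in the input.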
It sounds like what you are looking for is .DistinctBy() (available since .NET 6), which lets you specify the property by which the elements of your collection are considered distinct:
var myDictionary= dbContext.myDbTable
.Where(i => i.shoesize>= 4)
.DistinctBy(i => i.Code)
.ToDictionary(i => i.Code, i => i);
Splitting the query in two and materializing a list first made it work, whereas the single bundled LINQ query did not. My guess is that First() needed the data to be in a list before it could be turned into a dictionary.
var firstLinq = dbContext.myDbTable
.Where(i => i.shoesize>= 4)
.ToList();
then
var finalLinq = firstLinq
.GroupBy(i => i.Code)
.Select(i => i.First())
.ToList()
.ToDictionary(i => i.Code, i => i);

Selecting distinct attributes where count isn't the same

I have the following Linq query which selects a distinct list of attributes from all products:
products
.SelectMany(p => p.Attributes)
.Where(a => a.AttributeGroup.IsProductFilter)
.Distinct()
.ToList();
Each attribute can be assigned to every product, so I only want the attributes whose number of occurrences is less than the number of products (they are used for filtering, and there would be no change if the numbers were equal).
I'm not sure how to go about this. I thought I needed GroupBy, but wasn't sure how to get a list of attributes back:
IEnumerable<ProductAttribute> attributes = products.SelectMany(p => p.Attributes).Where(a => a.AttributeGroup.IsProductFilter);
return attributes.GroupBy(a => a.ID)
.Where(g => g.Count() < products.Count) // this is now an IEnumerable of groups, so I'm not sure how to get back to an IEnumerable of attributes
Or this seemed a bit better
attributes.GroupBy(a => a)
.Where(g => g.Count() < products.Count)
.Select(g => g.ToList())
.Distinct()
.OrderBy(a => a.AttributeGroup.Order) // this doesn't work because a is a list, not an attribute
It's probably really simple, but I'm not that great with LINQ, so any help solving this would be appreciated.
I'm not sure, but doesn't SelectMany help here too?
return attributes.GroupBy(a => a.ID)
.Where(g => g.Count() < products.Count)
.SelectMany(g => g); // perhaps Distinct after
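One way to avoid the trailing Distinct is to take a single representative per group with g.First() instead of flattening with SelectMany. The sketch below assumes made-up data: each product is an array of (Id, Name) tuples standing in for ProductAttribute objects, and the names are hypothetical.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical products; attribute 1 is on every product, 2 and 3 are not.
        var products = new[]
        {
            new[] { (Id: 1, Name: "Brand"), (Id: 2, Name: "Colour") },
            new[] { (Id: 1, Name: "Brand"), (Id: 3, Name: "Size") },
            new[] { (Id: 1, Name: "Brand"), (Id: 2, Name: "Colour") },
        };

        var filterAttributes = products
            .SelectMany(p => p)
            .GroupBy(a => a.Id)
            // Keep attributes that are missing from at least one product.
            .Where(g => g.Count() < products.Length)
            // One representative per group, so the result is already distinct.
            .Select(g => g.First())
            .OrderBy(a => a.Name)
            .ToList();

        foreach (var a in filterAttributes)
            Console.WriteLine(a.Name);
        // Prints "Colour" then "Size"
    }
}
```

Since the result is a sequence of attributes again, a later OrderBy on an attribute property works as expected.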

Finding duplicate texts in IEnumerable<TextBox> collection

I have a collection of textboxes in my winform application.
I need help with a LINQ query to get the collection of TextBox controls (i.e. IEnumerable<TextBox>) that contain duplicate entries. I want to make use of LINQ.
This query I used, is returning just the duplicate entry. But I need all the duplicate entries.
var duplicates = emailAddressList.GroupBy(t => t.Text)
.Where(g => !string.IsNullOrEmpty(g.Key))
.SelectMany(grp => grp.Skip(1))
.ToList();
Can anyone help me see where I am going wrong?
Regards
This query I used, is returning just the duplicate entry. But I need
all the duplicate entries.
Check that g.Count() > 1 and use SelectMany(g => g) to get every member of each duplicate group, instead of only the duplicates without the first one.
var duplicates = emailAddressList
.GroupBy(t => t.Text)
.Where(g => !string.IsNullOrEmpty(g.Key) && g.Count() > 1)
.SelectMany(g => g)
.ToList();
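The difference between the two queries can be seen in a small sketch. Plain strings stand in for the TextBox.Text values here, which is an assumption made to keep the example runnable outside a WinForms project.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical text values; "a@x.com" is entered twice, and two boxes are empty.
        var texts = new[] { "a@x.com", "b@x.com", "a@x.com", "", "", "c@x.com" };

        // Skip(1) drops the first member of each group, so only the extra copies remain.
        var extrasOnly = texts
            .GroupBy(t => t)
            .Where(g => !string.IsNullOrEmpty(g.Key))
            .SelectMany(g => g.Skip(1))
            .ToList();

        // Count() > 1 plus SelectMany(g => g) keeps every member of each duplicate group.
        var allDuplicates = texts
            .GroupBy(t => t)
            .Where(g => !string.IsNullOrEmpty(g.Key) && g.Count() > 1)
            .SelectMany(g => g)
            .ToList();

        Console.WriteLine(extrasOnly.Count);    // 1
        Console.WriteLine(allDuplicates.Count); // 2
    }
}
```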

Remove Duplicates and Original from C# List

I have a List of custom types where I want to remove the duplicate and the original if a duplicate is found. Can only be one possible duplicate.
I can override Equals and GetHashCode and then use Distinct, but this only removes the duplicate. I need to remove both the original and the duplicate. Any ideas for something elegant, so I don't have to use a hammer?
You can use GroupBy, followed by Where (g => g.Count() == 1) to filter out all records that have duplicates:
var res = orig.GroupBy(x => x).Where(g => g.Count() == 1).Select(g => g.Key);
In order for this to work, you still need to override GetHashCode and Equals.
var itemsExistingExactlyOnce = list.GroupBy(x => x)
.Where(group => group.Count() == 1)
.Select(group => group.Key);
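A minimal sketch of this approach, including the equality overrides it relies on. The Item class and its Name property are hypothetical; a real type would compare whichever fields define a duplicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical custom type; two items are equal when their names match.
class Item
{
    public string Name { get; }
    public Item(string name) { Name = name; }

    public override bool Equals(object obj) => obj is Item other && Name == other.Name;
    public override int GetHashCode() => Name.GetHashCode();
}

class Program
{
    static void Main()
    {
        var list = new List<Item>
        {
            new Item("a"), new Item("b"), new Item("a"), new Item("c"),
        };

        // GroupBy uses Equals/GetHashCode, so both "a" items land in one group of size 2.
        var itemsExistingExactlyOnce = list
            .GroupBy(x => x)
            .Where(g => g.Count() == 1)
            .Select(g => g.Key);

        Console.WriteLine(string.Join(", ", itemsExistingExactlyOnce.Select(i => i.Name)));
        // Prints "b, c"
    }
}
```

Without the overrides, GroupBy would fall back to reference equality and every item would end up in its own group.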

LINQ DataTable Sum In Where Clause

I'm having a rough time figuring out exactly how to do a LINQ query on a DataTable and return a full row while having the WHERE clause test on a sum.
My code:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.Where(...)
.GroupBy(r => (decimal)r["AccountNumber"]));
There are multiple transactions for each AccountNumber, and I need to sum those together and determine whether they're less than a user inputted amount (For my purpose, it's called balanceGreaterThan). I can't find any examples to go by where someone has done this sort of thing.
Thanks in advance, SO.
Edit: My apologies -- The column I need to sum is called "Balance"
Edit 2: Final code
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => r.Field<string>("AccountNumber"))
.Where(g => g.Sum(r => r.Field<decimal>("Balance")) < balanceGreaterThan)
.SelectMany(g => g));
I had to change the GroupBy to use r.Field rather than r["AccountNumber"]
You're trying to filter the groups themselves (to find which groups have a sum below the threshold), not the rows that go into the groups.
Therefore, you need to put the Where() call after the GroupBy:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(g => g.Sum(r => r.Field<decimal>("Balance")) < balanceGreaterThan));
EDIT: If you want to get an IEnumerable<DataRow> back (as opposed to an IEnumerable<IGrouping<decimal, DataRow>>), you'll need to add .SelectMany(g => g).
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(g => g.Sum(r => (decimal)r["Balance"]) <= someInput)
.SelectMany(g => g));
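For reference, here is a self-contained sketch of the group-then-filter pattern on a DataTable. The table contents and the threshold value are made up, and the column types (string AccountNumber, decimal Balance) follow the question's final code rather than anything known about the real data.

```csharp
using System;
using System.Data;
using System.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical table with the two columns used in the question.
        var paymentTransactionResults = new DataTable();
        paymentTransactionResults.Columns.Add("AccountNumber", typeof(string));
        paymentTransactionResults.Columns.Add("Balance", typeof(decimal));
        paymentTransactionResults.Rows.Add("1001", 40m);
        paymentTransactionResults.Rows.Add("1001", 30m);
        paymentTransactionResults.Rows.Add("2002", 500m);

        decimal balanceGreaterThan = 100m; // assumed user input

        var transactionsToRemove = paymentTransactionResults
            .AsEnumerable()
            .GroupBy(r => r.Field<string>("AccountNumber"))
            // Filter whole groups by their summed balance.
            .Where(g => g.Sum(r => r.Field<decimal>("Balance")) < balanceGreaterThan)
            // Flatten the surviving groups back into individual rows.
            .SelectMany(g => g)
            .ToList();

        Console.WriteLine(transactionsToRemove.Count); // 2
    }
}
```

Account 1001 sums to 70 and is selected with both of its rows, while account 2002 sums to 500 and is left out.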
I think you want something like:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(grp => grp.Sum(r => (decimal)r["amount"]) < balanceGreaterThan)
.SelectMany(grp => grp));
