Selecting distinct attributes where the count isn't the same

I have the following LINQ query, which selects a distinct list of attributes from all products:
products
.SelectMany(p => p.Attributes)
.Where(a => a.AttributeGroup.IsProductFilter)
.Distinct()
.ToList();
Every attribute can be assigned to every product. The attributes are used for filtering, so an attribute that is assigned to all products is useless as a filter: selecting it would not change the results. I therefore only want the attributes whose usage count is less than the number of products.
I'm not sure how to go about this. I thought I needed GroupBy, but wasn't sure how to get a list of attributes back:
IEnumerable<ProductAttribute> attributes = products.SelectMany(p => p.Attributes).Where(a => a.AttributeGroup.IsProductFilter);
return attributes.GroupBy(a => a.ID)
.Where(g => g.Count() < products.Count) // this is now an IEnumerable of groups, so I'm not sure how to get back to an IEnumerable of attributes
Or this seemed a bit better:
attributes.GroupBy(a => a)
.Where(g => g.Count() < products.Count)
.Select(g => g.ToList())
.Distinct()
.OrderBy(a => a.AttributeGroup.Order) // this doesn't work because a is a list here, not an attribute
It's probably really simple, but I'm not that great with LINQ, so any help solving this would be appreciated.

I'm not sure, but wouldn't SelectMany help here too?
return attributes.GroupBy(a => a.ID)
.Where(g => g.Count() < products.Count)
.SelectMany(g => g); // perhaps followed by Distinct
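A minimal sketch of the grouping approach, assuming simplified stand-ins for the Product, ProductAttribute and AttributeGroup classes (the real ones are not shown in the question, so these shapes and the GetUsefulFilters name are hypothetical). It also assumes each product lists a given attribute at most once. Taking the first element of each group yields one attribute per ID, so no Distinct call is needed afterwards:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's types; only the members used below.
public class AttributeGroup
{
    public bool IsProductFilter { get; set; }
    public int Order { get; set; }
}

public class ProductAttribute
{
    public int ID { get; set; }
    public AttributeGroup AttributeGroup { get; set; }
}

public class Product
{
    public List<ProductAttribute> Attributes { get; set; } = new List<ProductAttribute>();
}

public static class AttributeFilters
{
    // Returns one attribute per ID, keeping only IDs that appear on fewer
    // products than there are products in total.
    public static List<ProductAttribute> GetUsefulFilters(List<Product> products)
    {
        return products
            .SelectMany(p => p.Attributes)
            .Where(a => a.AttributeGroup.IsProductFilter)
            .GroupBy(a => a.ID)
            .Where(g => g.Count() < products.Count)
            .Select(g => g.First())               // one representative per group
            .OrderBy(a => a.AttributeGroup.Order) // each element is an attribute again
            .ToList();
    }
}
```

Because Select(g => g.First()) turns each group back into a single attribute, the OrderBy on AttributeGroup.Order compiles, which was the sticking point in the second attempt in the question.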

Related

Finding duplicate texts in IEnumerable<TextBox> collection

I have a collection of textboxes in my winform application.
I need help with a LINQ query to get the collection of TextBox controls (i.e. an IEnumerable<TextBox>) that contain duplicate entries.
The query I used returns only the extra copies of each duplicated entry, but I need every entry that has a duplicate, including the first one:
var duplicates = emailAddressList.GroupBy(t => t.Text)
.Where(g => !string.IsNullOrEmpty(g.Key))
.SelectMany(grp => grp.Skip(1))
.ToList();
Can anyone help with where I am going wrong?
Regards

This query I used, is returning just the duplicate entry. But I need
all the duplicate entries.
Check that g.Count() > 1 and use SelectMany(g => g) to get every member of each duplicate group, rather than Skip(1), which drops the first member of each group.
var duplicates = emailAddressList
.GroupBy(t => t.Text)
.Where(g => !string.IsNullOrEmpty(g.Key) && g.Count() > 1)
.SelectMany(g => g)
.ToList();
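The difference between the two queries can be checked with a small sketch. The Entry class below is a hypothetical stand-in for TextBox, used only so the example runs without WinForms; the names Entry, DuplicateFinder and AllWithDuplicateText are assumptions, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for TextBox: only the Text property matters here.
public class Entry
{
    public string Name { get; set; }
    public string Text { get; set; }
}

public static class DuplicateFinder
{
    // Returns every entry whose non-empty text is shared with at least one other entry.
    public static List<Entry> AllWithDuplicateText(IEnumerable<Entry> entries)
    {
        return entries
            .GroupBy(e => e.Text)
            .Where(g => !string.IsNullOrEmpty(g.Key) && g.Count() > 1)
            .SelectMany(g => g) // flatten the whole group, first member included
            .ToList();
    }
}
```

With Skip(1) in place of the Count() check, a group of two identical entries would contribute only its second member, which matches the behaviour described in the question.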

LINQ: Select all from each group except the first item

It is easy to select the first of each group:
var firstOfEachGroup = dbContext.Measurements
.OrderByDescending(m => m.MeasurementId)
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
.Select(g => g.First());
But...
Question: how can I select all from each group except the first item?
var everythingButFirstOfEachGroup = dbContext.Measurements
.OrderByDescending(m => m.MeasurementId)
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
.Select( ...? );
Additional information:
My real goal is to delete all duplicates except the last (in a bulk way, i.e. not using an in-memory foreach), so after the previous query I want to use RemoveRange:
dbContext.Measurements.RemoveRange(everythingButFirstOfEachGroup);
So, if my question doesn't make sense, this information might be handy.

Use Skip(1) to skip the first record of each group and select the rest.
Something like:
var everythingButFirstOfEachGroup = dbContext.Measurements
.OrderByDescending(m => m.MeasurementId)
.GroupBy(m => new { m.SomeColumn })
.Where(g => g.Count() > 1)
.SelectMany(g => g.OrderByDescending(r => r.MeasurementId).Skip(1));
See: Enumerable.Skip
If you do not need a flattened collection, replace SelectMany with Select in the code snippet.

IGrouping<K, V> implements IEnumerable<V>; you simply need to skip inside the select clause to apply it to each group:
.Select(g => g.Skip(1))
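A minimal in-memory sketch of the Skip(1) idea, assuming a hypothetical Measurement class with the two properties named in the question. It runs as LINQ to Objects; whether a given Entity Framework provider can translate the same shape to SQL is not something this sketch shows:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity in the question.
public class Measurement
{
    public int MeasurementId { get; set; }
    public string SomeColumn { get; set; }
}

public static class MeasurementDuplicates
{
    // Keeps the row with the highest MeasurementId in each group and returns the rest.
    public static List<Measurement> AllButNewest(IEnumerable<Measurement> rows)
    {
        return rows
            .GroupBy(m => m.SomeColumn)
            // Sort inside each group so "first" is well defined, then drop it.
            .SelectMany(g => g.OrderByDescending(m => m.MeasurementId).Skip(1))
            .ToList();
    }
}
```

Groups with a single member contribute nothing after Skip(1), so the Count() > 1 filter is not needed in this version.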

You can always use .Distinct() to remove duplicates; presumably sorting or reverse-sorting and then applying .Distinct() would give you what you want.

Remove Duplicates and Original from C# List

I have a List of a custom type from which I want to remove both the duplicate and the original if a duplicate is found. There can be at most one duplicate of any item.
I can override Equals and GetHashCode and then use Distinct, but this only removes the duplicate. I need to remove both the original and the duplicate. Any ideas for something elegant so I don't have to use a hammer?

You can use GroupBy, followed by Where (g => g.Count() == 1) to filter out all records that have duplicates:
var res = orig.GroupBy(x => x).Where(g => g.Count() == 1).Select(g => g.Key);
In order for this to work, you still need to override GetHashCode and Equals.

var itemsExistingExactlyOnce = list.GroupBy(x => x)
.Where(group => group.Count() == 1)
.Select(group => group.Key);
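A short sketch of how the Equals and GetHashCode overrides fit together with the GroupBy query. The Item class and its Code property are hypothetical, since the question does not show the custom type, and the sketch assumes Code is never null:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical custom type: two items are equal when their codes match.
public sealed class Item : IEquatable<Item>
{
    public Item(string code) { Code = code; }

    public string Code { get; }

    public bool Equals(Item other) => other != null && Code == other.Code;

    public override bool Equals(object obj) => Equals(obj as Item);

    // Must agree with Equals, otherwise GroupBy puts equal items in different groups.
    public override int GetHashCode() => Code.GetHashCode();
}

public static class UniqueItems
{
    // Returns only the items that have no duplicate at all.
    public static List<Item> ExactlyOnce(IEnumerable<Item> items)
    {
        return items
            .GroupBy(x => x)
            .Where(g => g.Count() == 1)
            .Select(g => g.Key)
            .ToList();
    }
}
```

An alternative that avoids the overrides is to group by the identifying property directly, for example GroupBy(x => x.Code), and then select g.First() instead of g.Key.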

LINQ group by then order groups of result

I have a table that has the following 3 columns, ID, ShortCode, UploadDate.
I want to use LINQ to group the results by shortcode (and keep all the results) then order those groups and return a list.
I have the following:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList<PDFDocument>().
GroupBy(b=>b.ShortCode)
.SelectMany(b=>b).ToList<PDFDocument>()
I want to return all results, grouped by ShortCode, the items within each group sorted by UploadDate and the groups sorted so the one that has the most recent document in it first.
Does anyone know if this is even possible?

Try
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.AsEnumerable()
.OrderByDescending(d => d.UploadDate)
.GroupBy(d => d.ShortCode)
.SelectMany(g => g)
.ToList();
This should:
1. Order the items by upload date (descending, so newest first).
2. Group them by short code, so within each group the items are still sorted.
3. Leave the groups in descending order of their newest item, so there is no need to order again.
4. Finally, concatenate the results into a single list.
If performance is an issue, you may be better off doing
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.AsEnumerable()
.GroupBy(d => d.ShortCode)
.Select(g => g.OrderByDescending(d => d.UploadDate))
.OrderByDescending(e => e.First().UploadDate)
.SelectMany(e => e)
.ToList();
which sorts the contents of each group separately rather than sorting everything first and then grouping.
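The second variant can be sketched against plain in-memory data. PdfDoc is a hypothetical stand-in for PDFDocument with the three columns from the question, and each sorted group is materialized with ToList so that looking at its newest item does not re-run the sort:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for PDFDocument with the three columns from the question.
public class PdfDoc
{
    public int ID { get; set; }
    public string ShortCode { get; set; }
    public DateTime UploadDate { get; set; }
}

public static class DocumentOrdering
{
    // Sorts each group newest first, then puts the group with the newest document first.
    public static List<PdfDoc> NewestGroupsFirst(IEnumerable<PdfDoc> docs)
    {
        return docs
            .GroupBy(d => d.ShortCode)
            .Select(g => g.OrderByDescending(d => d.UploadDate).ToList())
            .OrderByDescending(g => g[0].UploadDate) // g[0] is the newest item in its group
            .SelectMany(g => g)
            .ToList();
    }
}
```

Every group produced by GroupBy has at least one element, so indexing g[0] is safe here.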

In fact, you don't want to group by short code; you want to order by it. So the following query should do the trick:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.OrderBy(b => b.ShortCode)
.ThenBy(b => b.UploadDate)
.ToList()
Edit
If you really want to use a GroupBy, you can do so this way:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.GroupBy(b => b.ShortCode)
.SelectMany(grouping => grouping.OrderBy(b => b.UploadDate))
.ToList()
But I discourage it. There is no point creating groups if you do not want groups in the first place!
Second edit
I did not realize you wanted the groups ordered by UploadDate too. That complicates the query a little:
rawData.Provider.CreateQuery<PDFDocument>(qb.rootExperession)
.ToList()
.GroupBy(b => b.ShortCode)
.Select(grouping => grouping.OrderByDescending(b => b.UploadDate))
.OrderByDescending(grouping => grouping.First().UploadDate)
.SelectMany(grouping => grouping)
.ToList()

LINQ DataTable Sum In Where Clause

I'm having a rough time figuring out exactly how to do a LINQ query on a DataTable and return a full row while having the WHERE clause test on a sum.
My code:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.Where(...)
.GroupBy(r => (decimal)r["AccountNumber"]));
There are multiple transactions for each AccountNumber, and I need to sum them and determine whether the total is less than a user-supplied amount (for my purposes, it's called balanceGreaterThan). I can't find any examples where someone has done this sort of thing.
Thanks in advance, SO.
Edit: My apologies -- The column I need to sum is called "Balance"
Edit 2: Final code
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => r.Field<string>("AccountNumber"))
.Where(g => g.Sum(r => r.Field<decimal>("Balance")) < balanceGreaterThan)
.SelectMany(g => g));
I had to change the GroupBy to use r.Field rather than r["AccountNumber"]

You're trying to filter the groups themselves (to find which groups have a sum below the threshold), not the rows that go into the groups.
Therefore, you need to put the Where() call after the GroupBy, with the comparison outside the Sum() call:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(g => g.Sum(r => r.Field<decimal>("Balance")) < balanceGreaterThan));
EDIT: If you want to get an IEnumerable<DataRow> back (as opposed to an IEnumerable<IGrouping<decimal, DataRow>>), you'll need to add .SelectMany(g => g) after the Where() call.
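Putting the pieces together, here is a minimal sketch that follows the asker's final code: it groups on a string AccountNumber column and sums a decimal Balance column. The class and method names (BalanceFilter, RowsForAccountsBelow) and the threshold parameter are hypothetical, and the column types are assumptions taken from that final code:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class BalanceFilter
{
    // Returns every row that belongs to an account whose summed Balance is below the threshold.
    public static List<DataRow> RowsForAccountsBelow(DataTable table, decimal threshold)
    {
        return table.AsEnumerable()
            .GroupBy(r => r.Field<string>("AccountNumber"))
            // The comparison sits outside Sum(): first total the group, then test the total.
            .Where(g => g.Sum(r => r.Field<decimal>("Balance")) < threshold)
            .SelectMany(g => g) // back from groups to individual rows
            .ToList();
    }
}
```

Field<T> unboxes to the column's actual type, so a direct cast such as (decimal)r["AccountNumber"] throws an InvalidCastException when the column holds strings, which is consistent with the asker having to switch to r.Field<string>.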

transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(g => g.Sum(r => (decimal)r["Balance"]) <= someInput)
.SelectMany(g => g));

I think you want something like:
transactionsToRemove.AddRange(paymentTransactionResults
.AsEnumerable()
.GroupBy(r => (decimal)r["AccountNumber"])
.Where(grp => grp.Sum(r => (decimal)r["Balance"]) < balanceGreaterThan)
.SelectMany(grp => grp));
