Retrieving non-duplicates from 2 Collections using LINQ

Background: I have two Collections of different types of objects with different name properties (both strings). Objects in Collection1 have a field called Name, objects in Collection2 have a field called Field.
I needed to compare these 2 properties, and get items from Collection1 where there is not a match in Collection2 based on that string property (Collection1 will always have a greater or equal number of items. All items should have a matching item by Name/Field in Collection2 when finished).
The question: I've found answers using Lists and they have helped me a little (for what it's worth, I'm using Collections). I did find this answer, which appears to be working for me; however, I would like to convert what I've done from query syntax (if that's what it's called?) to LINQ method syntax. See below:
//Query for results. This code is what I'm specifically trying to convert.
var result = (from item in Collection1
where !Collection2.Any(x => x.ColumnName == item.FieldName)
select item).ToList();
//** Remove items in result from Collection1**
//...
I'm really not at all familiar with either syntax (working on it), but I think I generally understand what this is doing. I'm struggling trying to convert this to LINQ syntax though and I'd like to learn both of these options rather than some sort of nested loop.
End goal after I remove the query results from Collection1: Collection1.Count == Collection2.Count and the following is true for each item in the collection: ItemFromCollection1.Name == SomeItemFromCollection2.Field (if that makes sense...)

You can convert this to LINQ methods like this:
var result = Collection1.Where(item => !Collection2.Any(x => x.ColumnName == item.FieldName))
.ToList();

Your first query is the opposite of what you asked for. It's finding records that don't have an equivalent. The following will return all records in Collection1 where there is an equivalent:
var results = Collection1.Where(c1 => Collection2.Any(c2 => c2.Field == c1.Name));
Please note that this isn't the fastest approach, especially if there is a large number of records in Collection2, because Collection2 is scanned once for every item in Collection1. You can find ways of speeding it up through HashSets or Lookups.
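As a rough sketch of the HashSet idea, here is one way the lookup could be done for the question's original goal (items in Collection1 with no match in Collection2). The Item1 and Item2 classes and the WithoutMatch helper are hypothetical stand-ins, since the question does not show the real types:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two item types in the question.
public class Item1 { public string Name { get; set; } }
public class Item2 { public string Field { get; set; } }

public static class NonMatchDemo
{
    // Returns the items of first whose Name has no matching Field in second.
    public static List<Item1> WithoutMatch(IEnumerable<Item1> first, IEnumerable<Item2> second)
    {
        // Build the set once; each Contains call is then O(1) on average
        // instead of a full scan of second for every item of first.
        var fields = new HashSet<string>(second.Select(x => x.Field));
        return first.Where(item => !fields.Contains(item.Name)).ToList();
    }

    public static void Main()
    {
        var first = new[] { new Item1 { Name = "A" }, new Item1 { Name = "B" }, new Item1 { Name = "C" } };
        var second = new[] { new Item2 { Field = "A" }, new Item2 { Field = "C" } };

        foreach (var item in WithoutMatch(first, second))
            Console.WriteLine(item.Name); // prints "B"
    }
}
```

Dropping the negation in the Where clause would return the matching items instead.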

If you want a list of the values that occur only once (the non-duplicates), do the following:
List<string> listNonDup = new List<String>{"6","1","2","4","6","5","1"};
var singles = listNonDup.GroupBy(n => n)
.Where(g => g.Count() == 1)
.Select(g => g.Key).ToList();
Yields: 2, 4, 5
If you want a list of the duplicated values instead, you can do the opposite:
var duplicatesxx = listNonDup.GroupBy(s => s)
.SelectMany(g => g.Skip(1)).ToList();

Related

Linq challenge: converting this piece of code from method chain to standard Linq

The challenge is about converting from method chain to standard linq a piece of code full of group by.
The context
To fully understand the topic here you can read the original question (with class definitions, sample data and so on): Linq: rebuild hierarchical data from the flattened list
Thanks to @Akash Kava, I've found the solution to my problem.
Chain method formulation
var macroTabs = flattenedList
.GroupBy(x => x.IDMacroTab)
.Select((x) => new MacroTab
{
IDMacroTab = x.Key,
Tabs = x.GroupBy(t => t.IDTab)
.Select(tx => new Tab {
IDTab = tx.Key,
Slots = tx.Select(s => new Slot {
IDSlot = s.IDSlot
}).ToList()
}).ToList()
}).ToList();
For the sake of learning, I tried to convert the method chain to the standard LINQ query formulation, but something is wrong.
What happens is similar to this..
My attempt to convert it to Linq standard syntax
var antiflatten = flattenedList
.GroupBy(x => x.IDMacroTab)
.Select(grouping => new MacroTab
{
IDMacroTab = grouping.Key,
Tabs = (from t in grouping
group grouping by t.IDTab
into group_tx
select new Tab
{
IDTab = group_tx.Key,
Slots = (from s in group_tx
from s1 in s
select new Slot
{
IDSlot = s1.IDSlot
}).ToList()
}).ToList()
});
The result in LinqPad
The classes and the sample data on NetFiddle:
https://dotnetfiddle.net/8mF1qI

This challenge helped me understand exactly what a LINQ group by returns (and how verbose the query syntax is with group by).
As LinqPad clearly shows, a group by returns a sequence of groups. A group is a very simple type that exposes just one property: a Key.
As this answer states, from the definition of IGrouping (IGrouping<out TKey, out TElement> : IEnumerable<TElement>, IEnumerable), the only way to access the content of the subgroups is to iterate through their elements (a foreach, another group by, a select, etc.).
Here is shown the Linq syntax formulation of the method chain.
And here is the source code on Fiddle
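A minimal sketch of what a query-syntax formulation of the method chain might look like. The class definitions are assumptions reconstructed from the property names used in the question (the real ones are in the linked fiddle), and FlatRow and Rebuild are hypothetical names. The key difference from the attempt in the question is that each group clause groups the range variable (group row by ...), not the enclosing grouping:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes, inferred from the properties used in the question.
public class Slot { public int IDSlot { get; set; } }
public class Tab { public int IDTab { get; set; } public List<Slot> Slots { get; set; } }
public class MacroTab { public int IDMacroTab { get; set; } public List<Tab> Tabs { get; set; } }
// Hypothetical name for one row of the flattened list.
public class FlatRow { public int IDMacroTab { get; set; } public int IDTab { get; set; } public int IDSlot { get; set; } }

public static class Antiflatten
{
    public static List<MacroTab> Rebuild(IEnumerable<FlatRow> flattenedList)
    {
        return (from row in flattenedList
                group row by row.IDMacroTab into macroGroup
                select new MacroTab
                {
                    IDMacroTab = macroGroup.Key,
                    // macroGroup is itself a sequence of rows, so it can be queried again.
                    Tabs = (from tabRow in macroGroup
                            group tabRow by tabRow.IDTab into tabGroup
                            select new Tab
                            {
                                IDTab = tabGroup.Key,
                                Slots = (from slotRow in tabGroup
                                         select new Slot { IDSlot = slotRow.IDSlot }).ToList()
                            }).ToList()
                }).ToList();
    }

    public static void Main()
    {
        var rows = new[]
        {
            new FlatRow { IDMacroTab = 1, IDTab = 1, IDSlot = 1 },
            new FlatRow { IDMacroTab = 1, IDTab = 1, IDSlot = 2 },
            new FlatRow { IDMacroTab = 1, IDTab = 2, IDSlot = 3 },
        };

        foreach (var macro in Rebuild(rows))
            foreach (var tab in macro.Tabs)
                Console.WriteLine($"{macro.IDMacroTab}/{tab.IDTab}: {tab.Slots.Count} slot(s)");
    }
}
```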
But let's go on and look at another solution:
What we usually do in SQL when we do a Group By is to list all the columns but the ones which have been grouped. With LINQ it is different: it still returns ALL the columns.
In this example we started with a dataset with 3 'columns' {IDMacroTab, IDTab, IDSlot}. We grouped by the first column, but LINQ returns the whole dataset unless we explicitly tell it otherwise.

Group by Linq setting properties

I'm working on a groupby query using Linq, but I want to set the value for a new property in combination with another list. This is my code:
var result = list1.GroupBy(f => f.Name)
.ToList()
.Select(b => new Obj
{
ClientName = b.Name,
Status = (AnotherClass.List().Where(a=>a.state_id=b.????).First()).Status
})
I know I'm using a group by, but I'm not sure how to access the value inside my b collection to compare it with a.state_id.
This snippet:
Status = (AnotherClass.List().Where(a=>a.state_id=b.????).First()).Status
I've done that before, but it was months ago and I don't remember the syntax. When I put a dot after b, I only have access to Key and the LINQ methods... What should the syntax be?

Issue in your code is happening here:
a=>a.state_id=b.????
Why ?
Check type of b here, it would be IGrouping<TKey,TValue>, which is because, post GroupBy on an IEnumerable, you get result as IEnumerable<IGrouping<TKey,TValue>>
What does that mean?
Think of a grouping operation in a database: when you GROUP BY a given key, the remaining selected columns need an aggregation operation, since there could be more than one record per key and that needs to be represented.
How it is represented in your code
Let's assume list1 has Type T objects
You grouped the data by Name property, which is part of Type T
There's no data projection, so for a given key the remaining data is collected as an IEnumerable<T> of grouped values
The result is in the format IEnumerable<IGrouping<TK, TV>>, where TK is the type of Name and TV is T; each IGrouping<TK, TV> is itself an IEnumerable<T> holding the grouped values
Let's check out some code, break your original code in following parts
var result = list1.GroupBy(f => f.Name) - result will be of type IEnumerable<IGrouping<string,T>>, where list1 is IEnumerable<T>
On doing result.Select(b => ...), b is of type IGrouping<string,T>
Further you can run Linq queries on b, as follows:
b.Key gives access to the Name key. There is no b.Value; instead, your options could be the following, or any other relevant LINQ operation:
a => b.Any(x => a.state_id == x.state_id) // true if any element of the group has a matching Id, or
a => a.state_id == b.Select(x => x.state_id).FirstOrDefault() // compares against the state_id of the first element (or the default value)
Thus you can create a final result, from the IGrouping<string,T>, as per the logical requirement / use case
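To make this concrete, here is a small sketch under some assumptions: the Client and State classes are hypothetical stand-ins for the element type of list1 and for whatever AnotherClass.List() returns, and it assumes every element in a group shares the same state_id, so the first element of the group is used for the comparison:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types; the question does not show the real ones.
public class Client { public string Name { get; set; } public int state_id { get; set; } }
public class State { public int state_id { get; set; } public string Status { get; set; } }
public class Obj { public string ClientName { get; set; } public string Status { get; set; } }

public static class GroupDemo
{
    public static List<Obj> Build(IEnumerable<Client> list1, IEnumerable<State> states)
    {
        return list1
            .GroupBy(f => f.Name)
            .Select(b => new Obj
            {
                // b is an IGrouping<string, Client>: Key is the Name,
                // and b itself enumerates the Clients in the group.
                ClientName = b.Key,
                // First() throws if the group's state_id has no match in states.
                Status = states.First(a => a.state_id == b.First().state_id).Status
            })
            .ToList();
    }

    public static void Main()
    {
        var list1 = new[]
        {
            new Client { Name = "Ann", state_id = 1 },
            new Client { Name = "Ann", state_id = 1 },
            new Client { Name = "Bob", state_id = 2 },
        };
        var states = new[]
        {
            new State { state_id = 1, Status = "Active" },
            new State { state_id = 2, Status = "Inactive" },
        };

        foreach (var o in Build(list1, states))
            Console.WriteLine($"{o.ClientName}: {o.Status}");
    }
}
```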

Remove items from a List which are in another List by a specific property?

I have two lists:
ListA with CancelledFlights
ListB with all Flights
I would like to remove all CancelledFlights from ListB with all flights compared not by the object but by the property FlightNumber. How could I achieve this with Lambda or LINQ? I know how to select in LINQ and lambda but not how to remove ...

You can use linq to do that
var cancelledFlightNumbers = ListA
.Select(x => x.FlightNumber)
.ToList();
var cancelledFlightsRemoved = ListB
.Where(x => !cancelledFlightNumbers.Contains(x.FlightNumber))
.ToList();
If you have too many items then you can use HashSet to improve performance
var cancelledFlightNumbers = new HashSet<int>(ListA.Select(x => x.FlightNumber));

There are two ways I can think of to do that, one with LINQ, one simply with List<T>'s methods:
The List<T> approach:
allFlights.RemoveAll(flight =>
cancelledFlights.Any(cancelled => cancelled.FlightNumber == flight.FlightNumber));
The LINQ approach:
// an IEqualityComparer<T> that can compare objects by FlightNumber.
var flightNumberComparer = new FlightNumberComparer();
allFlights = allFlights.Except(cancelledFlights, flightNumberComparer).ToList();
As you can see, for this simple scenario, RemoveAll is probably the simplest. It removes the items in-place on the original list rather than creating a new list, and will probably be easiest to write.
Remember that LINQ is a query language. It's not designed to have remove operations. At most, it can create a filtered view of a collection and create a new list based on that (like we did here). But LINQ, in itself, won't help you remove items from a list.
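The FlightNumberComparer used in the LINQ approach is not shown in the answer. A minimal sketch of what it might look like, assuming a hypothetical Flight class with an int FlightNumber property:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Flight type; the question does not show the real class.
public class Flight { public int FlightNumber { get; set; } }

// Treats two flights as equal when their FlightNumber values are equal.
public class FlightNumberComparer : IEqualityComparer<Flight>
{
    public bool Equals(Flight x, Flight y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.FlightNumber == y.FlightNumber;
    }

    // Must be consistent with Equals: equal flights produce equal hash codes.
    public int GetHashCode(Flight obj) => obj.FlightNumber.GetHashCode();
}

public static class FlightDemo
{
    public static List<Flight> RemoveCancelled(List<Flight> allFlights, List<Flight> cancelledFlights)
    {
        return allFlights.Except(cancelledFlights, new FlightNumberComparer()).ToList();
    }

    public static void Main()
    {
        var allFlights = new List<Flight>
        {
            new Flight { FlightNumber = 100 },
            new Flight { FlightNumber = 200 },
            new Flight { FlightNumber = 300 },
        };
        var cancelledFlights = new List<Flight> { new Flight { FlightNumber = 200 } };

        foreach (var flight in RemoveCancelled(allFlights, cancelledFlights))
            Console.WriteLine(flight.FlightNumber); // prints 100, then 300
    }
}
```

One difference from RemoveAll worth keeping in mind: Except has set semantics, so if allFlights contains several flights with the same FlightNumber, only one of them survives in the result.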

How to select all objects sharing a property value with a property value in list of objects?

I have a list of objects whose data I want to reload.
Like always, I have several options. I wanted just to select these items but encountered this "Additional information": Unable to create a constant value of type 'Item'. Only primitive types or enumeration types are supported in this context.
// (System.Collections.Generic.List<Item> selectedItems)
System.Collections.Generic.List<Item> items;
var q = from i in db.Items
where selectedItems.Any(s => s.Id == i.Id)
select i;
items = q.ToList();
The following yields the same error, as expected:
var q = db.Items.Where(i => selectedItems.Any(si => i.Id == si.Id));
items = q.ToList();
I could have reattached each of the objects and called reload, but then I would have to hit the db many times to reload their Navigation Properties (or maybe not, but I don't know how to avoid it).
The only "fine" solution I've found until now is selecting the Id's of selectedItems and then running with it like follows:
int[] itemIds = selectedItems.Select(i => i.Id).ToArray();
var q = db.Items.Where(i => itemIds.Any(iId => i.Id == iId)); //Of course `Contains` could be used instead of `Any` here, since `itemIds` is a simple array of integers
items = q.ToList();
But is it a necessity or there is a more straight forward, neat or proper way to accomplish this?

But is it a necessity or there is a more straight forward, neat or proper way to accomplish this?
Not that I can think of. EF will try to turn your where clause into SQL (which is not as easy as you'd think). When it parses the expression and encounters a call to Any on a collection of non-primitive types, it does not know how to generically convert that list into a list of values to put into an IN clause, and it gives you the error you quoted.
When the source collection is a collection of primitive or enumeration types, it can turn the source collection into a list of values and create an IN clause. Contains does the same thing (it is also shorter and closer to the intent, IMHO):
var q = db.Items.Where(i => itemIds.Contains(i.Id));

how to use linq to retrieve values from a 2 dimensional generic list

I have a generic List List[int, myClass], and I would like to find the smallest int value, and retrieve the items from the list that match this.
I am generating this from another LINQ statement
var traysWithExtraAisles = (from t in poolTrays
where t.TrayItems.Select(i=>i.Aisle)
.Any(a=> ! selectedAisles.Contains(a))
select new
{
count= t.TrayItems.Select(i=>i.Aisle)
.Count(a=> !selectedAisles.Contains(a)),
tray=t
}).ToList();
this gives me my anonymous List of [count, Tray], but now I want to figure out the smallest count, and return a sublist for all the counts that match this.
Can anyone help me out with this?

var smallestGroup = traysWithExtraAisles
.GroupBy(x => x.count)
.OrderBy(g => g.Key)
.First();
foreach(var x in smallestGroup)
{
var poolTray = x.tray;
}

You can use SelectMany to "flatten" your list. Meaning, combine all of the lists into one, then take the Min. So;
int minimum = poolTrays.SelectMany(x => x).Min(x => x.TheIntegerIWantMinOf);
Will give you the smallest value contained in the sub lists. I'm not entirely sure this is what you're asking for but if your goal is simply to find the smallest element in the collection then I would scrap the code you posted and use this instead.

Right, I now realise this is actually incredibly easy to do with a bit more fiddling around. I have gone with
int minCount = traysWithExtraAisles.Min(x=>x.count);
var minAislesList = (from t in traysWithExtraAisles
where t.count == minCount
select t).ToList();
I imagine it is probably possible to do this in one statement.

You can use GroupBy as answered by Tim... or OrderBy as follows:
var ordered = traysWithExtraAisles.OrderBy(x => x.count).ToList();
var result = ordered.TakeWhile((x, i) => i == 0 || x.count == ordered[i - 1].count);
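For comparison, the Min-then-filter approach from the earlier answer can be wrapped in a small reusable helper. This is only a sketch: AllWithMinimum is a hypothetical name, and tuples stand in for the anonymous [count, tray] items from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MinGroupDemo
{
    // Returns every item whose selected value equals the minimum over the source.
    public static List<T> AllWithMinimum<T>(IEnumerable<T> source, Func<T, int> selector)
    {
        var items = source.ToList();          // materialize once; we iterate twice below
        if (items.Count == 0) return items;   // Min would throw on an empty sequence
        int min = items.Min(selector);
        return items.Where(x => selector(x) == min).ToList();
    }

    public static void Main()
    {
        // Tuples stand in for the anonymous { count, tray } objects.
        var trays = new[] { (count: 2, tray: "A"), (count: 1, tray: "B"), (count: 1, tray: "C") };

        foreach (var t in AllWithMinimum(trays, x => x.count))
            Console.WriteLine(t.tray); // prints B, then C
    }
}
```

This makes two passes over the data and avoids sorting or grouping the whole list.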
