I am trying to learn advanced LINQ techniques. How could we, if possible with LINQ only, select the distinct items of a collection and merge their sub-items into a dictionary, struct, or dynamic ExpandoObject?
Let's say I have these two classes:
public class TestItem
{
public int Id;
public string Name;
public List<TestSubItem> SubItems;
}
public class TestSubItem
{
public string Name;
}
How can I create a single LINQ query (again, if possible) that selects all distinct TestItem objects based on the Name property and, if the same name is found twice, merges the two lists in the final result?
I know I could group the TestItem objects by name with the code below, but I'm stuck there:
var result = items.GroupBy(item => item.Name)
.ToList();
Thanks in advance!
A combination of a Select and an Aggregate on the groupings should get the job done. Finally, a ToDictionary call cleans everything up and gets rid of the potentially invalid ID field:
var result = items.GroupBy(item => item.Name)
.Select(g => g.Aggregate((i, j) =>
{
i.SubItems.AddRange(j.SubItems);
return i;
}))
.ToDictionary(k => k.Name, v => v.SubItems);
Alternatively, the query syntax is a bit more verbose, but I find it easier to read:
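var result = (from item in items
              group item by item.Name
              into g
              // Flatten the sub-items of every item in the group into one list.
              let subItems =
                  (from ti in g
                   from si in ti.SubItems
                   select si).ToList()
              // Use g.Key (a single string) as the dictionary key.
              select new { Name = g.Key, SubItems = subItems }).ToDictionary(k => k.Name, v => v.SubItems);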
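One caveat with the Aggregate version: it appends to the SubItems list of the first item in each group, so the source objects are modified. A minimal non-mutating sketch follows, assuming in-memory data; the tuples and sample values are hypothetical stand-ins for TestItem and TestSubItem:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: tuples stand in for TestItem, strings for TestSubItem.Name.
var items = new List<(int Id, string Name, List<string> SubItems)>
{
    (1, "A", new List<string> { "a1", "a2" }),
    (2, "B", new List<string> { "b1" }),
    (3, "A", new List<string> { "a3" }),
};

// Group by name, then flatten each group's sub-items into a fresh list.
// The lists owned by the source items are left untouched.
Dictionary<string, List<string>> merged = items
    .GroupBy(item => item.Name)
    .ToDictionary(g => g.Key, g => g.SelectMany(item => item.SubItems).ToList());

foreach (var pair in merged)
{
    Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
}
```

Because SelectMany builds a new list per group, the same query can be run repeatedly without the sub-item lists growing.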
I am using Entity Framework in .NET 7.
I have 3 entities:
Course that contains a ProfessorId among other things
Grade that has a CourseId among other things
Professor
I want to get all the courses that are assigned to a professor and have at least 1 grade associated with them, and group them in a Dictionary<string, List<CourseViewModel>> where the string key is the semester.
I have written the following LINQ query:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderBy(course => course.Key)
.ToDictionary(group => group.Key, group => group.ToList());
When that executes I get an exception saying it can't be translated.
If I remove the OrderBy and keep only the GroupBy, it works and the translated SQL in Microsoft SQL Server is:
SELECT [c].[Semester], [c].[Title]
FROM [Courses] AS [c]
WHERE [c].[ProfessorId] = @__professorId_0
AND EXISTS (SELECT 1
FROM [Grades] AS [g]
WHERE [c].[Id] = [g].[CourseId])
ORDER BY [c].[Semester]
As you can see, it adds ORDER BY anyway, even though I removed OrderBy() and kept only GroupBy(). Can someone explain why that is? What if I wanted to order by descending, would that be possible? The weird thing is that if I remove GroupBy(), keep only OrderBy(), and replace ToDictionary with ToList, it works and the exact same query is produced (only now I can't really use the results without further processing).
LINQ GroupBy :
Groups the elements of a sequence.
SQL GROUP BY :
A SELECT statement clause that divides the query result into groups of rows, usually by performing one or more aggregations on each group. The SELECT statement returns one row per group.
They aren't equivalent. The main difference is that LINQ GroupBy returns a collection of elements per key, whereas SQL GROUP BY returns ONE row per key.
If the projection asks for ONE element per key, then EF Core translates LINQ GroupBy to SQL GROUP BY:
// Get the number of courses per semester
context
.Courses
.GroupBy(c => c.Semester)
.Select(cs => new { Semester = cs.Key, Count = cs.Count() })
.ToList();
Translated to :
SELECT [c].[Semester], COUNT(*) AS [Count]
FROM [Courses] AS [c]
GROUP BY [c].[Semester]
But if the projection asks for several elements per key, then EF Core translates LINQ GroupBy to a SQL ORDER BY and does the grouping itself on the client.
context
.Courses
.Select(c => new { c.Id, c.Semester })
.GroupBy(c => c.Semester)
.ToDictionary(cs => cs.Key, cs => cs.ToList());
Translated to :
SELECT [c].[Semester], [c].[Id]
FROM [Courses] AS [c]
ORDER BY [c].[Semester]
If the result is:
Semester | Id
2023 S1 | 1
2023 S1 | 4
2023 S2 | 2
... | ...
Then EF Core reads it like this:
Read the first row: Semester is "2023 S1".
There is no group yet, so create a group and add the row to it.
Read the second row: Semester is "2023 S1".
The key is the same as the previous row's, so add the row to the current group.
Read the third row: Semester is "2023 S2".
The key differs from the previous row's, so create a new group and add the row to it.
And so on...
This is why the sorting matters: rows with the same key arrive next to each other, so the groups can be built in a single pass.
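The single-pass reading described above can be sketched in plain C#. This only illustrates the principle on an already sorted list and is not EF Core's actual implementation; the tuple shapes are assumptions and the sample rows are the ones listed above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rows as they would arrive from the database, already sorted by Semester.
var rows = new List<(string Semester, int Id)>
{
    ("2023 S1", 1),
    ("2023 S1", 4),
    ("2023 S2", 2),
};

var groups = new List<(string Key, List<int> Ids)>();
foreach (var row in rows)
{
    // Start a new group when there is none yet or when the key changes.
    if (groups.Count == 0 || groups[groups.Count - 1].Key != row.Semester)
    {
        groups.Add((row.Semester, new List<int>()));
    }
    // Otherwise (and also right after creating it) append to the current group.
    groups[groups.Count - 1].Ids.Add(row.Id);
}

foreach (var g in groups)
{
    Console.WriteLine($"{g.Key}: {string.Join(", ", g.Ids)}");
}
```

If the rows were not sorted, the same key could show up again later and this loop would wrongly open a second group for it, which is why the ORDER BY is emitted.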
About the error: I don't know why EF Core can't translate it. The query looks legitimate; maybe this case is simply not implemented yet.
About what you are trying to do, converting a sorted grouping into a dictionary: this is odd because a Dictionary has no defined order. The elements get sorted and are then put into an unordered container.
If the Dictionary seems sorted, it's a coincidence, not a feature. Internally, the dictionary organizes entries by the key's hash code, and its enumeration order is unspecified: it often matches the order you expect, but not every time.
If you want a sorted dictionary, you can use SortedDictionary (or ImmutableSortedDictionary, as in the example below). But it can be tricky if you need a custom sort rule, like:
context
.Courses
.Select(c => new { c.Id, c.Semester })
.GroupBy(c => c.Semester)
.ToImmutableSortedDictionary(cs => cs.Key, cs => cs.ToList(), new ReverseComparer<string>());
public class ReverseComparer<T> : IComparer<T>
{
private IComparer<T> _comparer = Comparer<T>.Default;
public int Compare(T? x, T? y)
{
// Swap the arguments instead of negating: negating int.MinValue overflows.
return _comparer.Compare(y, x);
}
}
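If a dedicated comparer class feels heavy, a similar result can probably be had with the mutable SortedDictionary and Comparer<T>.Create. This is a sketch on in-memory data; the tuples are hypothetical stand-ins for the Course entity, and ordinal string comparison is an assumption about how semesters should sort:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows standing in for Courses already loaded from the database.
var courses = new List<(int Id, string Semester)>
{
    (1, "2023 S1"),
    (2, "2023 S2"),
    (4, "2023 S1"),
};

// Descending order: compare y to x instead of x to y.
var descending = Comparer<string>.Create((x, y) => string.CompareOrdinal(y, x));

// Group on the client, then copy the groups into a dictionary that keeps its keys sorted.
var bySemester = new SortedDictionary<string, List<int>>(
    courses
        .GroupBy(c => c.Semester)
        .ToDictionary(g => g.Key, g => g.Select(c => c.Id).ToList()),
    descending);

foreach (var pair in bySemester)
{
    Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
}
```

Unlike a plain Dictionary, enumeration of a SortedDictionary follows the comparer, so the descending order is part of the container rather than an accident of insertion.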
The exception you are encountering is most likely due to the fact that the OrderBy clause cannot be translated into SQL by Entity Framework. The OrderBy clause is executed in memory after the data has been retrieved from the database, which is why it works when you remove it and keep only the GroupBy clause.
However, if you want to order the dictionary by descending, you can simply call the Reverse method on the ToDictionary result:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderByDescending(course => course.Key)
.ToDictionary(group => group.Key,
group => group.ToList())
.Reverse();
This way, the dictionary will be sorted in descending order based on the semester.
Give this a try and let me know how it works for you.
EDIT:
Converting the IEnumerable back to a Dictionary should work like this:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderByDescending(course => course.Key)
.ToDictionary(group => group.Key,
group => group.ToList())
.Reverse()
.ToDictionary(pair => pair.Key,
pair => pair.Value);
How to use OrderBy for shaping output in the same order as per the requested distinct list
public DataCollectionList GetLatestDataCollection(List<string> requestedDataPointList)
{
var dataPoints = _context.DataPoints.Where(c => requestedDataPointList.Contains(c.dataPointName))
.OrderBy(----------) //TODO: RE-ORDER IN THE SAME ORDER AS REQUESTED requestedDataPointList
.ToList();
dataPoints.ForEach(dp =>
{
....
});
}
Do the sorting on the client side:
public DataCollectionList GetLatestDataCollection(List<string> requestedDataPointList)
{
var dataPoints = _context.DataPoints.Where(c => requestedDataPointList.Contains(c.dataPointName))
.AsEnumerable()
.OrderBy(c => requestedDataPointList.IndexOf(c.dataPointName));
foreach (var dp in dataPoints)
{
....
}
}
NOTE: Also, I don't think ToList().ForEach() is ever better than foreach ().
I think the fastest method is to join the result back with the request list. This makes use of the fact that LINQ's join preserves the sort order of the first list:
var dataPoints = _context.DataPoints
.Where(c => requestedDataPointList.Contains(c.dataPointName))
.ToList();
var ordered = from n in requestedDataPointList
join dp in dataPoints on n equals dp.dataPointName
select dp;
foreach (var dataPoint in ordered)
{
...
}
This doesn't involve any sorting; the join does it all, and it runs in close to O(n) because LINQ's join builds a hash lookup of the second sequence.
Another fast method consists of creating a dictionary of sequence numbers:
var indexes = requestedDataPointList
.Select((n, i) => new { n, i }).ToDictionary(x => x.n, x => x.i);
var ordered = dataPoints.OrderBy(dp => indexes[dp.dataPointName]);
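Both approaches can be tried side by side on in-memory data. In this sketch the tuples are hypothetical stand-ins for the DataPoint entity, and the requested list is assumed to contain no duplicates (ToDictionary would throw on a repeated name):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var requestedDataPointList = new List<string> { "temp", "pressure", "flow" };

// Hypothetical rows, deliberately in a different order than requested.
var dataPoints = new List<(string dataPointName, double Value)>
{
    ("flow", 3.0),
    ("temp", 1.0),
    ("pressure", 2.0),
};

// Variant 1: the join keeps the order of the first (outer) sequence.
var viaJoin = (from n in requestedDataPointList
               join dp in dataPoints on n equals dp.dataPointName
               select dp).ToList();

// Variant 2: sort by the position each name has in the requested list.
var indexes = requestedDataPointList
    .Select((n, i) => new { n, i })
    .ToDictionary(x => x.n, x => x.i);
var viaIndex = dataPoints.OrderBy(dp => indexes[dp.dataPointName]).ToList();

Console.WriteLine(string.Join(", ", viaJoin.Select(dp => dp.dataPointName)));
Console.WriteLine(string.Join(", ", viaIndex.Select(dp => dp.dataPointName)));
```

The join silently drops requested names that have no data point, while the index lookup would throw if a data point's name were missing from the dictionary; here the Where filter in the original query rules that second case out.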
I'm working with a matrix filled with similarities between items. I save these as a list of objects in my database. The Similarity object looks like this:
public class Similarity
{
public virtual Guid MatrixId { get; set; } //The id of the matrix the similarity is in
public virtual Guid FirstIndex { get; set; } //The id of the item of the left side of the matrix
public virtual Guid SecondIndex { get; set; } //The id of the item of the top side of the matrix
public virtual double Similarity { get; set; } //The similarity
}
A user can review these items. I want to retrieve a list of items which are 'similar' to the items the user has reviewed. The problem is that I can't tell for sure whether the item's id is in the FirstIndex or the SecondIndex. I have written some code which does what I want, but I want to know if this is possible in one statement.
var itemsNotReviewed = Similarities.Where(x => !itemsReviewed.Contains(x.SecondIndex))
.GroupBy(x => x.SecondIndex)
.ToList();
itemsNotReviewed.AddRange(Similarities.Where(x => !itemsReviewed.Contains(x.FirstIndex))
.GroupBy(x => x.FirstIndex)
.ToList());
Where itemsReviewed is a list of guids of the items the user has reviewed and where Similarities is a list of all items which are similar to the items the user has reviewed. I retrieve that list with this function:
return (from Row in _context.SimilarityMatrix
where itemIds.Contains(Row.FirstIndex) || itemIds.Contains(Row.SecondIndex)
select Row)
.Distinct()
.ToList();
where itemIds is a list of guids of the items the user has reviewed.
Is there a way to group by either the first or second index based on the Where clause?
Please let me know if I should elaborate!
By my understanding, you have a list of Similarity which is guaranteed to contain items with either FirstIndex or SecondIndex contained in itemsReviewed list of Guid. And you need to take the elements (if any) with either index not contained in itemsReviewed (it could be only one of them due to the first constraint) and group by that index.
The straightforward LINQ translation of the above would be like this:
var itemsNotReviewed = Similarities
.Where(item => !itemsReviewed.Contains(item.FirstIndex) || !itemsReviewed.Contains(item.SecondIndex))
.GroupBy(item => !itemsReviewed.Contains(item.FirstIndex) ? item.FirstIndex : item.SecondIndex)
.ToList();
But it contains duplicate itemsReviewed.Contains checks, which hurt performance.
So a better variant is to introduce an intermediate variable, and the easiest way to do that is query syntax with a let clause:
var itemsNotReviewed =
(from item in Similarities
let index = !itemsReviewed.Contains(item.FirstIndex) ? 1 :
!itemsReviewed.Contains(item.SecondIndex) ? 2 : 0
where index != 0
group item by index == 1 ? item.FirstIndex : item.SecondIndex)
.ToList();
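A runnable sketch of the same idea on in-memory data follows. It deviates from the query above in two labeled ways: the let clause stores the chosen key directly as a nullable Guid instead of a 0/1/2 marker, and the reviewed ids live in a HashSet so each Contains check is a hash lookup. The tuples and the Score name are hypothetical stand-ins for the Similarity class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var a = Guid.NewGuid();
var b = Guid.NewGuid();
var c = Guid.NewGuid();
var d = Guid.NewGuid();

// The user has reviewed items a and b.
var itemsReviewed = new HashSet<Guid> { a, b };

var similarities = new List<(Guid FirstIndex, Guid SecondIndex, double Score)>
{
    (a, c, 0.9), // c is not reviewed (second index)
    (c, b, 0.8), // c is not reviewed (first index)
    (d, a, 0.7), // d is not reviewed (first index)
    (a, b, 0.5), // both reviewed, must be dropped
};

var itemsNotReviewed =
    (from item in similarities
     // Pick whichever index is not reviewed; null when both are reviewed.
     let key = !itemsReviewed.Contains(item.FirstIndex) ? (Guid?)item.FirstIndex
             : !itemsReviewed.Contains(item.SecondIndex) ? (Guid?)item.SecondIndex
             : null
     where key.HasValue
     group item by key.Value)
    .ToList();

foreach (var g in itemsNotReviewed)
{
    Console.WriteLine($"{g.Key}: {g.Count()} similar row(s)");
}
```

With this sample data the rows for c end up in one group regardless of which side of the matrix c appears on.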
I would go for changing the way you source the original list:
_context.SimilarityMatrix.Where(Row => itemIds.Contains(Row.FirstIndex) || itemIds.Contains(Row.SecondIndex))
.Select(r => new { r.MatrixId, r.FirstIndex, r.SecondIndex, r.Similarity, MatchingIndex = itemIds.Contains(r.FirstIndex) ? r.FirstIndex : r.SecondIndex })
.Distinct()
.ToList();
This way you only need to group by MatchingIndex:
var itemsNotReviewed = Similarities
    .GroupBy(x => x.MatchingIndex)
    .ToList();
Afterwards you may want to convert the anonymous objects back to your Similarity class, or just change the class to include MatchingIndex.
You can convert them to your Similarity type like this:
var itemsNotReviewed = Similarities
    .GroupBy(x => x.MatchingIndex)
    // An IGrouping is itself a sequence of its elements, so project with g.Select.
    .Select(g => new { g.Key, Values = g.Select(d => new Similarity { MatrixId = d.MatrixId, FirstIndex = d.FirstIndex, SecondIndex = d.SecondIndex, Similarity = d.Similarity }).ToList() })
    .ToList();
What about
(from x in Similarities
let b2 = !itemsReviewed.Contains(x.SecondIndex)
let b1 = !itemsReviewed.Contains(x.FirstIndex)
where b1 || b2
group x by b2 ? x.SecondIndex : x.FirstIndex into grp
select grp)
.ToList()
The let clause introduces a new temporary variable storing the boolean. You can of course inline the other function, too:
(from x in (from Row in _context.SimilarityMatrix
where itemIds.Contains(Row.FirstIndex) || itemIds.Contains(Row.SecondIndex)
select Row)
.Distinct()
.ToList()
let b2 = !itemsReviewed.Contains(x.SecondIndex)
let b1 = !itemsReviewed.Contains(x.FirstIndex)
where b1 || b2
group x by b2 ? x.SecondIndex : x.FirstIndex into grp
select grp)
.ToList()
If you wanted to use method syntax instead of query syntax, you would probably need to introduce some anonymous types:
Similarities
.Select(s => new
{
b2 = !itemsReviewed.Contains(s.SecondIndex),
b1 = !itemsReviewed.Contains(s.FirstIndex),
s
})
.Where(a => a.b1 || a.b2)
.GroupBy(a => a.b2 ? a.s.SecondIndex : a.s.FirstIndex, a => a.s) //edit: to get same semantics, you of course also need the element selector
.ToList()
I have two lists.
I need to remove from the first list the items that are not in the second list, and add the other elements.
foreach (var product in item.Products)
{
item.Products.Remove(product);
}
var newProducts = _catalogService.GetProductBaseItems(x => model.Products.Contains(x.Id))
.ToList();
foreach (var product in newProducts)
{
item.Products.Add(product);
}
You can use Enumerable.Except to find all items that are in the first list but not in the second. But since your Product class might not override Equals and GetHashCode to compare the IDs, you either have to do that, create a custom IEqualityComparer<Product>, or use the following approach:
IEnumerable<int> idsInFirstNotSecond = item.Products
.Select(x => x.Id)
.Except(newProducts.Select(x => x.Id));
var productsInFirstNotSecond = from p in item.Products
join id in idsInFirstNotSecond
on p.Id equals id
select p;
List<Product> completeListOfOldAndNew = productsInFirstNotSecond
.Concat(newProducts)
.ToList();
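The same result can also be sketched without the join by filtering against a set of IDs. This is an alternative under the assumption that both lists are in memory; the tuples and sample values are hypothetical stand-ins for Product:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for item.Products and newProducts.
var currentProducts = new List<(int Id, string Name)>
{
    (1, "old-1"),
    (2, "old-2"),
    (3, "old-3"),
};
var newProducts = new List<(int Id, string Name)>
{
    (2, "new-2"),
    (4, "new-4"),
};

// IDs of the second list, in a set for fast membership checks.
var newIds = new HashSet<int>(newProducts.Select(p => p.Id));

// Keep the products that are only in the first list, then append the second list.
var completeListOfOldAndNew = currentProducts
    .Where(p => !newIds.Contains(p.Id))
    .Concat(newProducts)
    .ToList();

Console.WriteLine(string.Join(", ", completeListOfOldAndNew.Select(p => p.Name)));
```

Comparing by Id sidesteps the Equals and GetHashCode question entirely, at the cost of building one extra set.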
I know that there are a brazillion examples of LINQ nested queries here on SO and elsewhere, but it just isn't making sense to me. If someone could explain this like I'm five, I'd be very appreciative. This is all pseudo-obfuscated, so please be patient with this contrived example.
I have an EF model that has these:
public class Car(){
public String Vin {get; private set;}
public string Type {get; private set;}
public List<Option> Options {get; set;}
}
public class Option(){
public string Name { get; set; }
}
and I get the collection of IQueryable<Car> cars from the repository.
Simple enough. The goal is:
List all of the car Types (say "truck", "suv", "minivan" etc)
Under each of these Types, have a sub-list of Option Names that exist on cars of that Type.
Under each Option, list the VIN of each car that has that option and is of that Type.
so the list would look like:
Truck
    Trailer Hitch
        vin1
        vin8
    Truck Nuts
        vin2
        vin3
        vin4
    Gun Rack
        vin1
    Folding Rear Seat
        vin2
        vin3
Minivan
    Swivel Seats
        vin6
    Dvd Player
        vin6
        vin10
    Folding Rear Seat
        vin6
        vin10
Suv
    Folding Rear Seat
        vin9
        vin5
You probably get the idea. I know that I can group cars by Type, like this:
from c in cars
group c by c.Type into g
but I think what I need to do is group Vin into Option and group that result into Type. I also think I might need to
cars.SelectMany(c => c.Options)
.Select(o => o.Name)
.Distinct();
to get a list of unique option Names, but I am not sure a) that this is the most efficient way to do this and b) how to incorporate this into my grouping query. I don't really understand how to write a nested grouping query to accomplish this - is the first group the outer group or the inner group?
My understanding of this is below remedial, so please, once again: explain like I'm five.
Thanks to all.
That's surely not a trivial query.
I would do it something like this:
cars.SelectMany(c => c.Options.Select(o => new { Car = c, Option = o.Name }))
.GroupBy(x => x.Car.Type)
.Select(x => new
{
Type = x.Key,
Options = x.GroupBy(y => y.Option, y => y.Car.Vin)
.Select(y => new { Option = y.Key,
Cars = y.ToList() } )
});
This query does the following:
We change the data we work on a little bit to be easier to handle: You want to have the options above the cars. That means that the end result will have each car possibly under multiple options, so what we really need is a list of (Option, Car) tuples.
We achieve this with
cars.SelectMany(c => c.Options.Select(o => new { Car = c, Option = o.Name }))
This basically says: For each car and each option select a new anonymous type with the car and the option name as properties. Let's name that anonymous type Ano1.
The result will be an IEnumerable<Ano1>.
This flat data is now grouped by the car type. This means that for each car type we have a list of Ano1 instances. So we now have a list of groups with each group having a list of Ano1 instances.
On this list of groups we issue a select. For each group (= car type) the select returns a new anonymous type with a property for the car type - so that information is not lost - and a property for the options.
The options from the previous step will eventually be a list of anonymous types with the properties Option and Cars. To get this data, we group all the Ano1 instances for our car type by the Option and select the VIN of the car as the element inside the group. These groups are then transformed into a new anonymous type with a property for the option name and a property for the cars.
The query is not trivial, and neither is the explanation. Please ask if something is not clear.
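To see the steps above produce the nested list, here is a runnable sketch on in-memory data. The tuples are hypothetical stand-ins for the Car and Option classes, only a few of the question's cars are modelled, and the extra ToList calls are added so the result can be indexed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var cars = new[]
{
    (Vin: "vin1", Type: "Truck", Options: new[] { "Trailer Hitch", "Gun Rack" }),
    (Vin: "vin8", Type: "Truck", Options: new[] { "Trailer Hitch" }),
    (Vin: "vin6", Type: "Minivan", Options: new[] { "Swivel Seats", "Dvd Player" }),
};

var tree = cars
    // Step 1: flatten to one (Car, Option) pair per option of each car.
    .SelectMany(c => c.Options.Select(o => new { Car = c, Option = o }))
    // Step 2: outer grouping by car type.
    .GroupBy(x => x.Car.Type)
    .Select(t => new
    {
        Type = t.Key,
        // Step 3: inner grouping by option, keeping only the VIN as element.
        Options = t.GroupBy(y => y.Option, y => y.Car.Vin)
                   .Select(o => new { Option = o.Key, Vins = o.ToList() })
                   .ToList(),
    })
    .ToList();

foreach (var type in tree)
{
    Console.WriteLine(type.Type);
    foreach (var option in type.Options)
    {
        Console.WriteLine("    " + option.Option);
        foreach (var vin in option.Vins)
        {
            Console.WriteLine("        " + vin);
        }
    }
}
```

This also answers the outer-versus-inner question: the first GroupBy in the chain is the outer group (Type), and the GroupBy nested inside the Select is the inner group (Option).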
This isn't going to be pretty, and I'm not sure L2E will handle this; you might need to select the entire list from the database and do this in L2O:
var result = cars.GroupBy(c => c.Type)
.Select(c => new {
Type = c.Key,
Options = c.SelectMany(x => x.Options)
.GroupBy(x => x.Name)
.Select(x => new {
Option = x.Key ,
Vins = c.Where(y => y.Options.Any(z => z.Name == x.Key)).Select(z => z.Vin)
})
});
Live example (Just the trucks modelled, but will work for all): http://rextester.com/OGD12123
This linq-to-entities query will do the trick
var query = from c in cars
group c by c.Type into g
select new {
Type = g.Key,
Options = from o in g.SelectMany(x => x.Options).Distinct()
select new {
o.Name,
Vins = from c in cars
where c.Options.Any(x => x.Name == o.Name)
where c.Type == g.Key
select c.Vin
}
};