Conditional GroupBy() in LINQ - c#

I'm working with a matrix filled with similarities between items. I save these as a list of objects in my database. The Similarity object looks like this:
public class Similarity
{
public virtual Guid MatrixId { get; set; } //The id of the matrix the similarity is in
public virtual Guid FirstIndex { get; set; } //The id of the item of the left side of the matrix
public virtual Guid SecondIndex { get; set; } //The id of the item of the top side of the matrix
public virtual double Similarity { get; set; } //The similarity
}
A user can review these items. I want to retrieve a list of items which are 'similar' to the items the user has reviewed. The problem is that I can't tell for sure whether the item's id is in FirstIndex or SecondIndex. I have written some code that does what I want, but I want to know whether this is possible in a single statement.
var itemsNotReviewed = Similarities.Where(x => !itemsReviewed.Contains(x.SecondIndex))
.GroupBy(x => x.SecondIndex)
.ToList();
itemsNotReviewed.AddRange(Similarities.Where(x => !itemsReviewed.Contains(x.FirstIndex))
.GroupBy(x => x.FirstIndex)
.ToList());
Where itemsReviewed is a list of guids of the items the user has reviewed and where Similarities is a list of all items which are similar to the items the user has reviewed. I retrieve that list with this function:
return (from Row in _context.SimilarityMatrix
where itemIds.Contains(Row.FirstIndex) || itemIds.Contains(Row.SecondIndex)
select Row)
.Distinct()
.ToList();
where itemIds is a list of guids of the items the user has reviewed.
Is there a way to group by either the first or second index based on the Where clause?
Please let me know if I should elaborate!

As I understand it, you have a list of Similarity objects, each guaranteed to have FirstIndex or SecondIndex (or both) contained in the itemsReviewed list of Guids. You need to take the elements (if any) where one index is not contained in itemsReviewed (it can only be one of them, due to the first constraint) and group by that index.
The straightforward LINQ translation of the above would be like this:
var itemsNotReviewed = Similarities
.Where(item => !itemsReviewed.Contains(item.FirstIndex) || !itemsReviewed.Contains(item.SecondIndex))
.GroupBy(item => !itemsReviewed.Contains(item.FirstIndex) ? item.FirstIndex : item.SecondIndex)
.ToList();
But it contains duplicate itemsReviewed.Contains checks, which negatively affect performance.
A better variant introduces an intermediate variable, and the easiest way to do that is query syntax with a let clause:
var itemsNotReviewed =
(from item in Similarities
let index = !itemsReviewed.Contains(item.FirstIndex) ? 1 :
!itemsReviewed.Contains(item.SecondIndex) ? 2 : 0
where index != 0
group item by index == 1 ? item.FirstIndex : item.SecondIndex)
.ToList();
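To see the conditional key at work, here is a small in-memory sketch. The tuple shape, the sample GUIDs and the variable names are made up for illustration (tuples stand in for the Similarity entity), and it assumes, as in the question, that every row has at least one reviewed index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: a and b are reviewed, c and d are not.
var a = Guid.NewGuid(); var b = Guid.NewGuid();
var c = Guid.NewGuid(); var d = Guid.NewGuid();
var itemsReviewed = new HashSet<Guid> { a, b };

// Tuples stand in for the Similarity entity (assumption, for brevity).
var similarities = new List<(Guid FirstIndex, Guid SecondIndex, double Score)>
{
    (a, c, 0.9),   // unreviewed item in SecondIndex
    (c, b, 0.4),   // unreviewed item in FirstIndex
    (a, d, 0.7),
    (a, b, 0.1),   // both reviewed: filtered out
};

var itemsNotReviewed =
    (from item in similarities
     let firstUnreviewed = !itemsReviewed.Contains(item.FirstIndex)
     let secondUnreviewed = !itemsReviewed.Contains(item.SecondIndex)
     where firstUnreviewed || secondUnreviewed
     // Group by whichever index has not been reviewed.
     group item by firstUnreviewed ? item.FirstIndex : item.SecondIndex)
    .ToList();

foreach (var g in itemsNotReviewed)
    Console.WriteLine($"{g.Key}: {g.Count()} row(s)");
```

Using a HashSet<Guid> for itemsReviewed keeps each Contains check cheap when the query runs in memory.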

I would go for changing the way you source the original list:
_context.SimilarityMatrix.Where(Row => itemIds.Contains(Row.FirstIndex) || itemIds.Contains(Row.SecondIndex))
.Select(r => new { r.MatrixId, r.FirstIndex, r.SecondIndex, r.Similarity, MatchingIndex = itemIds.Contains(r.FirstIndex) ? r.FirstIndex : r.SecondIndex })
.Distinct()
.ToList();
This way you only need to group by MatchingIndex.
var itemsNotReviewed = Similarities
.GroupBy(x => x.MatchingIndex)
.ToList();
You may want to convert the anonymous objects to your Similarity class afterwards, or just change the class to include the MatchingIndex.
You can convert them to your Similarity type like this:
var itemsNotReviewed = Similarities
.GroupBy(x => x.MatchingIndex)
.Select(g => new { g.Key, Values = g.Select(d => new Similarity { MatrixId = d.MatrixId, FirstIndex = d.FirstIndex, SecondIndex = d.SecondIndex, Similarity = d.Similarity }).ToList() })
.ToList();

What about
(from x in Similarities
let b2 = !itemsReviewed.Contains(x.SecondIndex)
let b1 = !itemsReviewed.Contains(x.FirstIndex)
where b1 || b2
group x by b2 ? x.SecondIndex : x.FirstIndex into grp
select grp)
.ToList()
The let clause introduces a new temporary variable storing the boolean. You can of course inline the other query, too:
(from x in (from Row in _context.SimilarityMatrix
where itemIds.Contains(Row.FirstIndex) || itemIds.Contains(Row.SecondIndex)
select Row)
.Distinct()
.ToList()
let b2 = !itemsReviewed.Contains(x.SecondIndex)
let b1 = !itemsReviewed.Contains(x.FirstIndex)
where b1 || b2
group x by b2 ? x.SecondIndex : x.FirstIndex into grp
select grp)
.ToList()
If you wanted to use method syntax instead of query syntax, you would probably need to introduce an anonymous type:
Similarities
.Select(s => new
{
b2 = !itemsReviewed.Contains(s.SecondIndex),
b1 = !itemsReviewed.Contains(s.FirstIndex),
s
})
.Where(a => a.b1 || a.b2)
.GroupBy(a => a.b2 ? a.s.SecondIndex : a.s.FirstIndex, a => a.s) //edit: to get the same semantics, you of course also need the element selector
.ToList()

Related

Why is GroupBy translated to ORDER BY by Entity Framework?

I am using Entity Framework in .NET 7.
I have 3 entities:
Course that contains a ProfessorId among other things
Grade that has a CourseId among other things
Professor
I want to get all the courses that are assigned to a professor and have at least 1 grade associated with them, and group them into a Dictionary<string, List<CourseViewModel>> where the string key is the semester.
I have written the following LINQ query:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderBy(course => course.Key)
.ToDictionary(group => group.Key, group => group.ToList());
When that executes I get an exception saying it can't be translated.
If I remove the OrderBy and keep only the GroupBy, it works and the translated SQL in Microsoft SQL Server is:
SELECT [c].[Semester], [c].[Title]
FROM [Courses] AS [c]
WHERE [c].[ProfessorId] = @__professorId_0
AND EXISTS (SELECT 1
FROM [Grades] AS [g]
WHERE [c].[Id] = [g].[CourseId])
ORDER BY [c].[Semester]
As you can see, it adds ORDER BY anyway, even though I have removed OrderBy() and kept only GroupBy(). Can someone explain why that is? What if I wanted to order descending, would that be possible? Also, the weird thing is that if I remove GroupBy(), keep only OrderBy(), and replace ToDictionary with ToList, it works and the exact same query is produced (only now I can't really use the results without further processing).
LINQ GroupBy :
Groups the elements of a sequence.
SQL GROUP BY :
A SELECT statement clause that divides the query result into groups of rows, usually by performing one or more aggregations on each group. The SELECT statement returns one row per group.
They aren't equivalent. The main difference is that LINQ GroupBy returns a collection per key, whereas SQL GROUP BY returns ONE row per key.
If the projection asks for ONE element per key, EF Core translates LINQ GroupBy to SQL GROUP BY:
// Get the number of courses per semester
context
.Courses
.GroupBy(c => c.Semester)
.Select(cs => new { Semester = cs.Key, Count = cs.Count() })
.ToList();
Translated to :
SELECT [c].[Semester], COUNT(*) AS [Count]
FROM [Courses] AS [c]
GROUP BY [c].[Semester]
But if the projection asks for several elements per key, EF Core translates LINQ GroupBy to SQL ORDER BY and does the grouping itself on the client.
context
.Courses
.Select(c => new { c.Id, c.Semester })
.GroupBy(c => c.Semester)
.ToDictionary(cs => cs.Key, cs => cs.ToList());
Translated to :
SELECT [c].[Semester], [c].[Id]
FROM [Courses] AS [c]
ORDER BY [c].[Semester]
If the result is:
Semester   Id
2023 S1    1
2023 S1    4
2023 S2    2
...        ...
EF Core then reads the rows like this:
Read the first row: Semester is "2023 S1".
There is no group yet, so create a group and add the row to it.
Read the second row: Semester is "2023 S1".
The key is the same as the previous row's, so add the row to the current group.
Read the third row: Semester is "2023 S2".
The key differs from the previous row's, so create a new group and add the row to it.
And so on...
This is why the sort matters: rows with the same key arrive consecutively, so each group can be built in a single pass.
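The row-by-row reading described above can be sketched in plain C#. This only illustrates the idea of grouping sorted rows in one pass; it is not EF Core's actual implementation, and the rows and names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Rows as they might come back from the ORDER BY [Semester] query (sample data).
var rows = new List<(string Semester, int Id)>
{
    ("2023 S1", 1),
    ("2023 S1", 4),
    ("2023 S2", 2),
};

var groups = new List<(string Key, List<int> Ids)>();
foreach (var row in rows)
{
    // Because the rows are sorted by key, a changed key means a new group starts.
    if (groups.Count == 0 || groups[^1].Key != row.Semester)
        groups.Add((row.Semester, new List<int>()));

    // Append the row to the group that is currently being built.
    groups[^1].Ids.Add(row.Id);
}

foreach (var (key, ids) in groups)
    Console.WriteLine($"{key}: {string.Join(", ", ids)}");
// prints:
// 2023 S1: 1, 4
// 2023 S2: 2
```

Without the sort, rows with the same key could be scattered through the result, and the reader would have to look up an existing group for every row.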
As for the error, I don't know why EF Core can't translate the query. It looks legitimate. Maybe this is simply not implemented yet.
As for what you are trying to do: converting a sorted sequence of groupings to a dictionary is odd, because a dictionary isn't ordered. The elements get sorted and are then put into an unordered container.
If the Dictionary seems sorted, that's a coincidence, not a feature. Internally, the dictionary organizes elements by the key's hash code, and the order in which it enumerates them is undefined.
If you want a sorted dictionary, you can use SortedDictionary or an immutable sorted dictionary. But it can be tricky if you need a custom sort rule, like:
context
.Courses
.Select(c => new { c.Id, c.Semester })
.GroupBy(c => c.Semester)
.ToImmutableSortedDictionary(cs => cs.Key, cs => cs.ToList(), new ReverseComparer<string>());
public class ReverseComparer<T> : IComparer<T>
{
private IComparer<T> _comparer = Comparer<T>.Default;
public int Compare(T? x, T? y)
{
return _comparer.Compare(y, x); // Swap the arguments to reverse the order.
}
}
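For comparison, a mutable SortedDictionary can be built the same way once the rows are in memory. This is a minimal sketch with hypothetical sample data; Comparer<string>.Create is used here in place of a dedicated comparer class, and the ordinal comparison is an assumption about how semesters should sort.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows, as if already loaded from the database.
var courses = new[]
{
    (Id: 1, Semester: "2023 S1"),
    (Id: 2, Semester: "2023 S2"),
    (Id: 4, Semester: "2023 S1"),
};

// Descending comparer: swap the arguments rather than negating the result.
var descending = Comparer<string>.Create((x, y) => string.CompareOrdinal(y, x));

// Group in memory, then copy the groups into a dictionary that keeps its keys sorted.
var bySemester = new SortedDictionary<string, List<int>>(
    courses.GroupBy(c => c.Semester)
           .ToDictionary(g => g.Key, g => g.Select(c => c.Id).ToList()),
    descending);

Console.WriteLine(string.Join(", ", bySemester.Keys)); // 2023 S2, 2023 S1
```

Unlike a plain Dictionary, the sorted variant enumerates its keys in comparer order, so the descending order survives later iteration.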
The exception most likely occurs because Entity Framework cannot translate an OrderBy applied to the groups produced by GroupBy when there is no aggregate projection. Without the OrderBy, EF Core fetches the rows and builds the groups on the client, which is why it works when you remove it and keep only the GroupBy clause.
However, if you want to order the dictionary by descending, you can simply call the Reverse method on the ToDictionary result:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderByDescending(course => course.Key)
.ToDictionary(group => group.Key,
group => group.ToList())
.Reverse();
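This yields the entries in reverse enumeration order. Note that Dictionary<TKey, TValue> does not guarantee any enumeration order, so a SortedDictionary is safer if the order matters.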
Give this a try and let me know how it works for you.
EDIT:
Converting the IEnumerable back to a Dictionary should work like this:
var professorGradedCourses = _dbContext.Courses
.Where(course => course.ProfessorId == professorId && course.Grades.Any())
.Select(course => new CourseViewModel
{
Title = course.Title,
Semester = course.Semester,
})
.GroupBy(course => course.Semester)
.OrderByDescending(course => course.Key)
.ToDictionary(group => group.Key,
group => group.ToList())
.Reverse()
.ToDictionary(pair => pair.Key,
pair => pair.Value);

Linq on List<object>

Existing legacy code is as follows:
List<object> myItems;
//myItems gets populated by a method call
foreach (object[] item in myItems)
{
string Id = item[0].ToString();
string Number = item[1].ToString();
//now do some processing if Number satisfies some criteria
}
I would like to convert this using LINQ to select all Ids that match a certain Number.
All suggestions would be appreciated.
Thanks.
Use Select() and Where()
bool IsSatisfyingNumber(String number) {
// True if number satisfies some criteria
}
List<String> matchingIds = myItems
.Cast<object[]>() // each element is really an object[], so cast before indexing
.Where(item => IsSatisfyingNumber(item[1].ToString()))
.Select(item => item[0].ToString())
.ToList();
The list myItems contains items of type object, where each item is actually an object[], so we need to cast to object[] first and then filter and select based on the number we are searching for.
string certainNumber = "1";
var myIds = myItems
.Where(o => ((object[]) o)[1].ToString() == certainNumber)
.Select(o => ((object[]) o)[0].ToString());
The equality operator on strings performs an ordinal (case-sensitive and culture-insensitive) comparison so change it in the Where... if you need some different kind of comparison in your case.
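If a different comparison is needed, string.Equals with an explicit StringComparison is one option. The sketch below uses hypothetical sample rows shaped like the legacy data and a case-insensitive ordinal match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows: each element is an object[] of { Id, Number }.
var myItems = new List<object>
{
    new object[] { "A1", "abc" },
    new object[] { "B2", "ABC" },
    new object[] { "C3", "xyz" },
};

string certainNumber = "abc";

var myIds = myItems
    .Cast<object[]>()
    // OrdinalIgnoreCase matches "abc" and "ABC"; == would match only "abc".
    .Where(row => string.Equals(row[1].ToString(), certainNumber, StringComparison.OrdinalIgnoreCase))
    .Select(row => row[0].ToString())
    .ToList();

Console.WriteLine(string.Join(", ", myIds)); // A1, B2
```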
Got it working and wanted to share the information:
var myIds =
(from item in myItems.Cast<object[]>()
select new
{ Id = item[0], Number = (string)item[1] }
)
.Where(x => x.Number == filtercondition)
.Select(x => (string)x.Id)
.ToList();

Group by some columns depending on values in Entity Framework

I have the following simple statement in my Entity Framework code:
query = query
.Where(c => c.NotificationType == NotificationType.AppMessage)
.GroupBy(c => c.ConversationId)
.Select(d => d.OrderByDescending(p => p.DateCreated).FirstOrDefault());
It simply finds the latest Notification based on a group by with conversationId and select latest. Easy.
However, this is ONLY what I want if c.NotificationType == NotificationType.AppMessage. If the notification type is anything other than AppMessage (c.NotificationType <> NotificationType.AppMessage), I just want the row itself. What I truly want to write is a magical statement such as:
query = query
.Where(c => (c.NotificationType <> NotificationType.AppMessage)
|| ((c.NotificationType == NotificationType.AppMessage)
.GroupBy(c => c.ConversationId)
.Select(d => d.OrderByDescending(p => p.DateCreated).FirstOrDefault()));
But this doesn't make sense because the GroupBy/Select is based on the first where statement.
How do I solve this?
The simplest way is to compose a UNION ALL query using Concat at the end of your original query:
query = query
.Where(c => c.NotificationType == NotificationType.AppMessage)
.GroupBy(c => c.ConversationId)
.Select(d => d.OrderByDescending(p => p.DateCreated).FirstOrDefault())
.Concat(query.Where(c => c.NotificationType != NotificationType.AppMessage));
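The same shape can be tried against in-memory data. This is a minimal sketch with made-up notifications, where Type 0 stands in for NotificationType.AppMessage and tuples stand in for the entity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical notifications; Type 0 plays the role of NotificationType.AppMessage.
var notifications = new List<(int Id, int Type, int ConversationId, DateTime DateCreated)>
{
    (1, 0, 10, new DateTime(2024, 1, 1)),
    (2, 0, 10, new DateTime(2024, 1, 5)),  // latest app message in conversation 10
    (3, 0, 20, new DateTime(2024, 1, 2)),  // only app message in conversation 20
    (4, 1, 10, new DateTime(2024, 1, 3)),  // other type: kept as-is
    (5, 1, 10, new DateTime(2024, 1, 4)),  // other type: kept as-is
};

// Latest app message per conversation.
var latestAppMessages = notifications
    .Where(n => n.Type == 0)
    .GroupBy(n => n.ConversationId)
    .Select(g => g.OrderByDescending(n => n.DateCreated).First());

// Append every notification of any other type, ungrouped.
var result = latestAppMessages
    .Concat(notifications.Where(n => n.Type != 0))
    .ToList();

Console.WriteLine(string.Join(", ", result.Select(n => n.Id))); // 2, 3, 4, 5
```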
public class EntityClass
{
public int NotificationType { get; set; }
public int ConversationId { get; set; }
public DateTime Created { get; set; }
public static EntityClass GetLastNotification(int convId)
{
var list = new List<EntityClass>(); // Fill the values
list = list
.GroupBy(i => i.ConversationId) // Group by ConversationId.
.ToDictionary(i => i.Key, n => n.ToList()) // Create dictionary.
.Where(i => i.Key == convId) // Filter by ConversationId.
.SelectMany(i => i.Value) // Project multiple lists to ONLY one list.
.ToList(); // Create list.
// Now, you can filter it:
// 0 - NotificationType.AppMessage
// I didn't get what exactly you want to filter there, but this should give you an idea.
var lastNotification = list.OrderByDescending(i => i.Created).FirstOrDefault(i => i.NotificationType == 0);
return lastNotification;
}
}
First, group the list by ConversationId with GroupBy. Next, create a dictionary from the result, keep only the requested ConversationId, and flatten it into a single list (SelectMany). At that point the list contains only records with the ConversationId you want.
The last part filters this list: you wanted the last notification with a certain NotificationType. This should work :)

C# Lambda? In list of class find highest property (int) where property (character) = 1

I have a List of Node where Node class has properties:
public int ID;
public MovingObject character;
I need to, maybe using Lambda, iterate the List and get the highest ID where character = X
I tried Linq extension methods GroupBy and OrderByDescending which does give me the highest ID but that leaves out where character = x. Any help please?
You can use the Where method to filter the collection down to the objects whose character is X.
var result = items.Where(item => item.character == X).OrderByDescending(item => item.ID).FirstOrDefault();
You should use Where to filter. Then just ordering and getting the first will find the item you want.
var value = myList.Where(x => object.Equals(x.Character, "h"))
.OrderByDescending(x => x.ID)
.FirstOrDefault();
// check for null
Yet another approach: keep only the nodes with the "X" character, find the highest ID among them, then look up the node with that ID.
The code:
var highestId = nodes.Where(n => n.Character == "X").Max(n => n.ID);
var highestNode = nodes.Single(n => n.ID == highestId);
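Two things to keep in mind with this approach: Max throws an InvalidOperationException when no node matches (for a non-nullable int), and Single throws if several nodes share the ID. A possible alternative, assuming .NET 6 or later and using a string in place of MovingObject for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical nodes; Character is a string here instead of MovingObject.
var nodes = new List<(int ID, string Character)>
{
    (1, "X"), (7, "Y"), (5, "X"), (3, "Z"),
};

// MaxBy (available from .NET 6) returns the element with the largest key in one pass.
var highestX = nodes.Where(n => n.Character == "X").MaxBy(n => n.ID);
Console.WriteLine(highestX.ID); // 5

// Projecting to int? makes Max return null for an empty sequence instead of throwing.
int? highestQ = nodes.Where(n => n.Character == "Q").Max(n => (int?)n.ID);
Console.WriteLine(highestQ.HasValue); // False
```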

LINQ select distinct items and subitems

I am trying to learn advanced LINQ techniques. How could we, if possible with LINQ only, select the distinct items of a collection and merge their sub-items into a dictionary/struct or a dynamic ExpandoObject?
Let's say I have these two classes:
public class TestItem
{
public int Id;
public string Name;
public List<TestSubItem> SubItems;
}
public class TestSubItem
{
public string Name;
}
How can I create a single LINQ query (again, if possible) that selects all distinct TestItems based on the Name property and, if a TestItem with the same name is found twice, merges the two lists in the final result?
I know I can group the TestItems by name with the code below, but I'm stuck there:
var result = items.GroupBy(item => item.Name)
.ToList();
Thanks in advance!
A combination of a Select and an Aggregate on the groupings should get the job done. Finally, a ToDictionary call cleans everything up and gets rid of the potentially invalid ID field:
var result = items.GroupBy(item => item.Name)
.Select(g => g.Aggregate((i, j) =>
{
i.SubItems.AddRange(j.SubItems);
return i;
}))
.ToDictionary(k => k.Name, v => v.SubItems);
Alternatively, the query syntax is a bit more verbose, but I find it easier to read:
var result = (from item in items
group item by item.Name
into g
let n = g.Key
let i =
from ti in g
from si in ti.SubItems
select si
select new { n, i }).ToDictionary(k => k.n, v => v.i);
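Note that the Aggregate version mutates the SubItems list of the first item in each group. If that side effect is unwanted, SelectMany can build a fresh list per name. A minimal sketch, with tuples and strings standing in for TestItem and TestSubItem (sample data is made up):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical items: (Id, Name, SubItems) stands in for TestItem.
var items = new List<(int Id, string Name, List<string> SubItems)>
{
    (1, "A", new List<string> { "a1" }),
    (2, "B", new List<string> { "b1" }),
    (3, "A", new List<string> { "a2", "a3" }),
};

// SelectMany flattens the sub-item lists of each group into a new list,
// leaving the original SubItems lists untouched.
var result = items
    .GroupBy(item => item.Name)
    .ToDictionary(g => g.Key, g => g.SelectMany(item => item.SubItems).ToList());

Console.WriteLine(string.Join(", ", result["A"])); // a1, a2, a3
```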
