Finding common items is evaluated locally - c#

Using Entity Framework Core 2.2 I have the following query:
IQueryable<User> users = _context.Users.AsNoTracking();
User user = await users
.Include(x => x.UserSkills)
.ThenInclude(x => x.Skill)
.FirstOrDefaultAsync(x => x.Id == 1);
var userSkills = user.UserSkills.ToList();
IQueryable<Lesson> lessons = _context.Lessons.AsNoTracking();
var test = lessons
.Where(x => x.IsEnabled)
.Where(x => x.LessonSkills.All(y => userSkills.Any(z => y.SkillId == z.SkillId)))
.ToList();
I want to get the lessons whose Lesson Skills are all contained in the user's User Skills.
When I run this query I get the following error:
Exception thrown: 'System.InvalidOperationException' in System.Private.CoreLib.dll:
'Error generated for warning 'Microsoft.EntityFrameworkCore.Query.QueryClientEvaluationWarning:
The LINQ expression 'where ([y].SkillId == [z].SkillId)' could not be translated and will be evaluated locally.'.
How can I change the query to solve this problem?
Update
I need to extend this query with an extra option (y.SkillLevelId <= z.SkillLevelId):
var test = lessons
.Where(x => x.IsEnabled)
.Where(x => x.LessonSkills.All(y => userSkills.Any(z =>
y.SkillId == z.SkillId
&&
y.SkillLevelId <= z.SkillLevelId)))
.ToList();

userSkills is an in-memory collection, and from my experience with EF6 and EF Core so far, the only construct involving in-memory collections that translates reliably is the Enumerable.Contains method on an in-memory collection of primitive values.
So the following solves the problem in question.
First (should be outside the query expression tree):
var userSkillIds = user.UserSkills.Select(x => x.SkillId);
Then instead of
.Where(x => x.LessonSkills.All(y => userSkills.Any(z => y.SkillId == z.SkillId)))
use the equivalent (but translatable):
.Where(x => x.LessonSkills.All(y => userSkillIds.Contains(y.SkillId)))
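To make the shape of the rewritten filter concrete, here is a minimal in-memory sketch. The Lesson and LessonSkill classes are hypothetical stand-ins for the entities in the question, and it runs on LINQ to Objects rather than EF Core, so it only illustrates the logic, not the SQL translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes in the question.
public class LessonSkill
{
    public int SkillId { get; set; }
}

public class Lesson
{
    public int Id { get; set; }
    public List<LessonSkill> LessonSkills { get; set; } = new List<LessonSkill>();
}

public static class ContainsDemo
{
    // Returns the ids of lessons whose skills are all found in userSkillIds.
    public static List<int> MatchingLessonIds(IEnumerable<Lesson> lessons, IEnumerable<int> userSkillIds)
    {
        // Materialize the primitive ids once, outside the query.
        var ids = userSkillIds.ToList();
        return lessons
            .Where(x => x.LessonSkills.All(y => ids.Contains(y.SkillId)))
            .Select(x => x.Id)
            .ToList();
    }

    public static List<Lesson> SampleLessons() => new List<Lesson>
    {
        new Lesson { Id = 1, LessonSkills = { new LessonSkill { SkillId = 1 }, new LessonSkill { SkillId = 2 } } },
        new Lesson { Id = 2, LessonSkills = { new LessonSkill { SkillId = 2 }, new LessonSkill { SkillId = 3 } } },
        new Lesson { Id = 3 } // no required skills
    };

    public static void Main()
    {
        var result = MatchingLessonIds(SampleLessons(), new[] { 1, 2 });
        Console.WriteLine(string.Join(",", result)); // 1,3
    }
}
```

Note that a lesson with no LessonSkills passes the All check, because All is true for an empty sequence.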
Update: If you can't use Contains, the options you have until EF Core starts supporting it are (1) the EntityFrameworkCore.MemoryJoin package (I personally haven't tested it, but the idea is interesting), (2) manually building an Or-based predicate with the Expression class (hard, and workable only for small in-memory collections), and (3) replacing the in-memory collection with a real IQueryable<>, for instance
var userSkills = users
.Where(x => x.Id == 1)
.SelectMany(x => x.UserSkills);
and use the original query.
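For option (2), a rough sketch of building the Or-based predicate by hand might look like the following. The LessonSkill class and the BuildMatch helper are hypothetical names used only for illustration, and the demo compiles the expression and runs it in memory. How the resulting expression is plugged into the outer lessons query, and whether a given EF Core version translates it, is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for the entity class in the question.
public class LessonSkill
{
    public int SkillId { get; set; }
    public int SkillLevelId { get; set; }
}

public static class OrPredicateDemo
{
    // Builds: y => false || (y.SkillId == s1 && y.SkillLevelId <= l1) || (y.SkillId == s2 && ...) ...
    public static Expression<Func<LessonSkill, bool>> BuildMatch(
        IEnumerable<(int SkillId, int SkillLevelId)> userSkills)
    {
        var y = Expression.Parameter(typeof(LessonSkill), "y");
        Expression body = Expression.Constant(false); // an empty collection matches nothing

        foreach (var (skillId, levelId) in userSkills)
        {
            var term = Expression.AndAlso(
                Expression.Equal(
                    Expression.Property(y, nameof(LessonSkill.SkillId)),
                    Expression.Constant(skillId)),
                Expression.LessThanOrEqual(
                    Expression.Property(y, nameof(LessonSkill.SkillLevelId)),
                    Expression.Constant(levelId)));
            body = Expression.OrElse(body, term);
        }

        return Expression.Lambda<Func<LessonSkill, bool>>(body, y);
    }

    public static void Main()
    {
        // The user has skill 1 at level 3 and skill 2 at level 1.
        var match = BuildMatch(new[] { (1, 3), (2, 1) }).Compile();
        Console.WriteLine(match(new LessonSkill { SkillId = 1, SkillLevelId = 2 })); // True
        Console.WriteLine(match(new LessonSkill { SkillId = 2, SkillLevelId = 2 })); // False
    }
}
```

The predicate grows by one term per in-memory item, which is why this approach only suits small collections.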

Related

How to only get the ID from EF Linq query

I am attempting to get a list of Order Ids from an EF LINQ query. The SQL query returns quickly, but I think EF is trying to create the full entity. I only want the Id of each order. It seems that it creates the whole entity and then reduces it to just the Id, which seems to be a complete waste of resources.
Orders are complex objects that include lots of child entities. I don't need anything but the Ids of the orders in the list. Orders are organized into OrderCollection, which is a many-to-many relationship.
The basic query in English is: get the order Ids that are in the specified order collection and have a cart date newer than the specified date, and return only the specified page (skip and take).
example:
_repo.Orders.Where(o => o.OrderCollection.Any(r => r.Id == RoutingRuleId)).ToList()
.Where(o => o.OrderDate >= StartDateTime)
.OrderBy(x => x.OrderDate )
.Skip(RecordsToSkipCount)
.Take(BatchSize).Select(x => new { x.Id }).ToArray();
The SQL runs in just 102ms for this in debug mode. But afterwards I see the memory go up to 4GB before failing. The batch size is only 100. It's as if it is grabbing everything.
I tried moving the select around, but that also failed, gave syntax errors, or made the SQL perform poorly (SQL taking 16 seconds).
Example
_repo.Orders.Select(x => new { x.Id, x.OrderCollection, x.OrderDate})
.Where(o => o.OrderCollection.Any(r => r.Id == RoutingRuleId)).ToList()
.Where(o => o.OrderDate >= StartDateTime)
.OrderBy(x => x.OrderDate )
.Skip(RecordsToSkipCount)
.Take(BatchSize).Select(x => new { x.Id }).ToArray();
The database has millions of records.
What you have is roughly:
List<Order> list = _repo.Orders
.Where(o => o.OrderCollection.Any(r => r.Id == RoutingRuleId))
.ToList();
list.Where(o => o.OrderDate >= StartDateTime)
.OrderBy(x => x.OrderDate )
.Skip(RecordsToSkipCount)
.Take(BatchSize)
.Select(x => new { x.Id })
.ToArray();
That first .ToList is forcing EF Core to load every order with a matching routing rule into memory. The rest of the expression is then using IEnumerable extension methods to process those results.
I think you want to rearrange that to:
// var is needed here: the projection yields an anonymous type, not Order
var query = _repo.Orders
.Where(o => o.OrderCollection.Any(r => r.Id == RoutingRuleId)
&& o.OrderDate >= StartDateTime)
.OrderBy(x => x.OrderDate )
.Skip(RecordsToSkipCount)
.Take(BatchSize)
.Select(x => new { x.Id });
var ids = query.ToArray();
Creating an IQueryable doesn't trigger EF Core to execute any SQL. An IQueryable is just a description of the query you would like to run. It is the .ToArray call that finally causes EF Core to compile and execute a SQL statement.
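The deferred execution described here can be observed without a database. The sketch below is an assumption-laden stand-in: it uses an in-memory array wrapped with AsQueryable instead of EF Core, and a hypothetical IsRecent filter that counts its own calls, so it shows when the query runs but not what SQL would be generated.

```csharp
using System;
using System.Linq;

public static class DeferredDemo
{
    // Counts how often the filter runs, so we can see when execution happens.
    public static int Evaluations;

    public static bool IsRecent(int year)
    {
        Evaluations++;
        return year >= 2020;
    }

    // Composes the query; nothing is enumerated here.
    public static IQueryable<int> BuildQuery()
    {
        var years = new[] { 2018, 2021, 2019, 2022, 2023 }.AsQueryable();
        return years
            .Where(y => IsRecent(y))
            .OrderBy(y => y)
            .Skip(1)
            .Take(1);
    }

    public static void Main()
    {
        var query = BuildQuery();
        Console.WriteLine(Evaluations); // 0, the filter has not run yet

        int[] page = query.ToArray();   // execution happens here
        Console.WriteLine(page[0]);     // 2022
        Console.WriteLine(Evaluations); // 5, one call per source element
    }
}
```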

Problem with LINQ query: Select first task from each goal

I'm looking for suggestions on how to write a query. For each Goal, I want to select the first Task (sorted by Task.Sequence), in addition to any tasks with ShowAlways == true. (My actual query is more complex, but this query demonstrates the limitations I'm running into.)
I tried something like this:
var tasks = (from a in DbContext.Areas
from g in a.Goals
from t in g.Tasks
let nextTaskId = g.Tasks.OrderBy(tt => tt.Sequence).Select(tt => tt.Id).DefaultIfEmpty(-1).FirstOrDefault()
where t.ShowAlways || t.Id == nextTaskId
select new CalendarTask
{
// Member assignment
}).ToList();
But this query appears to be too complex.
System.InvalidOperationException: 'Processing of the LINQ expression 'OrderBy<Task, int>(
source: MaterializeCollectionNavigation(Navigation: Goal.Tasks(< Tasks > k__BackingField, DbSet<Task>) Collection ToDependent Task Inverse: Goal, Where<Task>(
source: NavigationExpansionExpression
Source: Where<Task>(
source: DbSet<Task>,
predicate: (t0) => Property<Nullable<int>>((Unhandled parameter: ti0).Outer.Inner, "Id") == Property<Nullable<int>>(t0, "GoalId"))
PendingSelector: (t0) => NavigationTreeExpression
Value: EntityReferenceTask
Expression: t0
,
predicate: (i) => Property<Nullable<int>>(NavigationTreeExpression
Value: EntityReferenceGoal
Expression: (Unhandled parameter: ti0).Outer.Inner, "Id") == Property<Nullable<int>>(i, "GoalId"))),
keySelector: (tt) => tt.Sequence)' by 'NavigationExpandingExpressionVisitor' failed. This may indicate either a bug or a limitation in EF Core. See https://go.microsoft.com/fwlink/?linkid=2101433 for more detailed information.'
The problem is the line let nextTaskId =.... If I comment out that, there is no error. (But I don't get what I'm after.)
I'll readily admit that I don't understand the details of the error message. About the only other way I can think of to approach this is return all the Tasks and then sort and filter them on the client. But my preference is not to retrieve data I don't need.
Can anyone see any other ways to approach this query?
Note: I'm using the very latest version of Visual Studio and .NET.
UPDATE:
I tried a different, but less efficient approach to this query.
var tasks = (DbContext.Areas
.Where(a => a.UserId == UserManager.GetUserId(User) && !a.OnHold)
.SelectMany(a => a.Goals)
.Where(g => !g.OnHold)
.Select(g => g.Tasks.Where(tt => !tt.OnHold && !tt.Completed).OrderBy(tt => tt.Sequence).FirstOrDefault()))
.Union(DbContext.Areas
.Where(a => a.UserId == UserManager.GetUserId(User) && !a.OnHold)
.SelectMany(a => a.Goals)
.Where(g => !g.OnHold)
.Select(g => g.Tasks.Where(tt => !tt.OnHold && !tt.Completed && (tt.DueDate.HasValue || tt.AlwaysShow)).OrderBy(tt => tt.Sequence).FirstOrDefault()))
.Distinct()
.Select(t => new CalendarTask
{
Id = t.Id,
Title = t.Title,
Goal = t.Goal.Title,
CssClass = t.Goal.Area.CssClass,
DueDate = t.DueDate,
Completed = t.Completed
});
But this also produced an error:
System.InvalidOperationException: 'Processing of the LINQ expression 'Where<Task>(
source: MaterializeCollectionNavigation(Navigation: Goal.Tasks (<Tasks>k__BackingField, DbSet<Task>) Collection ToDependent Task Inverse: Goal, Where<Task>(
source: NavigationExpansionExpression
Source: Where<Task>(
source: DbSet<Task>,
predicate: (t) => Property<Nullable<int>>((Unhandled parameter: ti).Inner, "Id") == Property<Nullable<int>>(t, "GoalId"))
PendingSelector: (t) => NavigationTreeExpression
Value: EntityReferenceTask
Expression: t
,
predicate: (i) => Property<Nullable<int>>(NavigationTreeExpression
Value: EntityReferenceGoal
Expression: (Unhandled parameter: ti).Inner, "Id") == Property<Nullable<int>>(i, "GoalId"))),
predicate: (tt) => !(tt.OnHold) && !(tt.Completed))' by 'NavigationExpandingExpressionVisitor' failed. This may indicate either a bug or a limitation in EF Core. See https://go.microsoft.com/fwlink/?linkid=2101433 for more detailed information.'
This is a good example of why a full reproducible example is needed. When trying to reproduce the issue with similar entity models, I was either getting a different error about DefaultIfEmpty(-1) (apparently not supported; don't forget to remove it, the SQL query will work correctly without it) or no error when removing it.
Then I noticed a small deeply hidden difference in your error messages compared to mine, which led me to the cause of the problem:
MaterializeCollectionNavigation(Navigation: Goal.Tasks (<Tasks>k__BackingField, DbSet<Task>)
specifically the DbSet<Task> at the end (in my case it was ICollection<Task>). I realized that you used DbSet<T> type for collection navigation property rather than the usual ICollection<T>, IEnumerable<T>, List<T> etc., e.g.
public class Goal
{
// ...
public DbSet<Task> Tasks { get; set; }
}
Simply don't do that. DbSet<T> is a special EF Core class, meant to be used only from a DbContext to represent a database table, view, or raw SQL query result set. More importantly, DbSets are the only real EF Core query roots, so it's not surprising that such usage confuses the EF Core query translator.
So change it to some of the supported interfaces/classes (for instance, ICollection<Task>) and the original problem will be solved.
Then removing the DefaultIfEmpty(-1) will allow successfully translating the first query in question.
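As a sketch of the suggested model shape, the classes below use ICollection<Task> for the navigation property and run the first query's filter, minus DefaultIfEmpty(-1), over in-memory data. The property names Id, Sequence and ShowAlways are taken from the question, everything else (the NavigationDemo class, the sample data) is hypothetical, and LINQ to Objects is used here, so this checks the logic only, not the EF Core translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Task
{
    public int Id { get; set; }
    public int Sequence { get; set; }
    public bool ShowAlways { get; set; }
}

public class Goal
{
    // A plain collection type, not DbSet<Task>.
    public ICollection<Task> Tasks { get; set; } = new List<Task>();
}

public static class NavigationDemo
{
    // For each goal: the task with the lowest Sequence, plus any ShowAlways tasks.
    public static List<int> VisibleTaskIds(IEnumerable<Goal> goals) =>
        (from g in goals
         from t in g.Tasks
         let nextTaskId = g.Tasks.OrderBy(tt => tt.Sequence).Select(tt => tt.Id).FirstOrDefault()
         where t.ShowAlways || t.Id == nextTaskId
         select t.Id).ToList();

    public static List<Goal> SampleGoals() => new List<Goal>
    {
        new Goal { Tasks = {
            new Task { Id = 1, Sequence = 2 },
            new Task { Id = 2, Sequence = 1 },
            new Task { Id = 3, Sequence = 3, ShowAlways = true } } },
        new Goal { Tasks = { new Task { Id = 4, Sequence = 1 } } }
    };

    public static void Main()
    {
        Console.WriteLine(string.Join(",", VisibleTaskIds(SampleGoals()))); // 2,3,4
    }
}
```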
I don't have EF Core up and running, but are you able to split it up like this?
var allTasks = DbContext.Areas
.SelectMany(a => a.Goals)
.SelectMany(a => a.Tasks);
var always = allTasks.Where(t => t.ShowAlways);
var next = allTasks
.OrderBy(tt => tt.Sequence)
.Take(1);
var result = always
.Concat(next)
.Select(t => new
{
// Member assignment
})
.ToList();
Edit: Sorry, I'm not great with query syntax, maybe this does what you need?
var allGoals = DbContext.Areas
.SelectMany(a => a.Goals);
var allTasks = DbContext.Areas
.SelectMany(a => a.Goals)
.SelectMany(a => a.Tasks);
var always = allGoals
.SelectMany(a => a.Tasks)
.Where(t => t.ShowAlways);
var nextTasks = allGoals
.SelectMany(g => g.Tasks.OrderBy(tt => tt.Sequence).Take(1));
var result = always
.Concat(nextTasks)
.Select(t => new
{
// Member assignment
})
.ToList();
I would recommend you start by breaking up this query into individual parts. Try iterating through the Goals in a foreach with your Task logic inside. Add each new CalendarTask to a List that you defined ahead of time.
Overall, breaking this logic up and experimenting a bit will probably give you some insight into the limitations of Entity Framework Core.
I think we might separate the query into two steps. First, query each goal, get the task with the minimum Sequence, and store it (maybe with an anonymous type like {NextTaskId, Goal}). Then query that temporary data to get the result. For example:
Areas.SelectMany(x=>x.Goals)
.Select(g=>new {
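// ?. is not allowed inside expression trees, so project to a nullable Id instead (assumes Id is an int)
NextTaskId=g.Tasks.OrderBy(t=>t.Sequence).Select(t=>(int?)t.Id).FirstOrDefault(),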
Tasks=g.Tasks.Where(t=>t.ShowAlways)
})
.SelectMany(a=>a.Tasks,(a,task)=>new {
NextTaskId = a.NextTaskId,
Task = task
});
I tried to write the LINQ query, but I'm not sure about the result:
var tasks = ( from a in DbContext.Areas
from g in a.Goals
from t in g.Tasks
join oneTask in (from t in DbContext.Tasks
group t by t.Id into gt
select new {
Id = gt.Key,
Sequence = gt.Min(t => t.Sequence)
}) on new { t.Id, t.Sequence } equals new { oneTask.Id,oneTask.Sequence }
select new {Area = a, Goal = g, Task = t})
.Union(
from a in DbContext.Areas
from g in a.Goals
from t in g.Tasks
where t.ShowAlways
select new {Area = a, Goal = g, Task = t});
I currently don't have EF Core at hand, but do you really need to compare this much?
Wouldn't querying the tasks be sufficient?
If there is a navigation property or foreign key defined, I could imagine using something like this:
Tasks.Where(task => task.Sequence == Tasks.Where(t => t.GoalIdentity == task.GoalIdentity).Min(t => t.Sequence) || task.ShowAlways);

Include not working in LINQ query but works in LINQ Fluent API

The Problem
I have a LINQ query (against Entity Framework) that uses Include to include some navigation properties. One of those properties uses ThenInclude to include its own property collection. When I run the query, the first-level properties are included on the primary object, but the sub-collection (the one using ThenInclude) is always empty.
However, if I change the query to use Fluent API form, the query works and the sub-collection is actually included. Why does this work for the Fluent form and not the normal LINQ query?
Example
//FAIL - This returns Benefits but Benefits.Dates.Count = 0 on all Benefits
var list1 = (from s in _context.Subscribers
.Include(s => s.Dates)
.Include(s => s.Benefits)
.ThenInclude(b => b.Dates)
where s.Id == 13643
select new { benefits = s.Benefits }).ToList();
//SUCCESS - This returns Benefits and Benefits.Dates.Count is > 0 on the ones with Dates
var list2 = _context.Subscribers
.Include(s => s.Dates)
.Include(s => s.Benefits)
.ThenInclude(b => b.Dates)
.Where(s => s.Id == 13643)
.Select(s => new { benefits = s.Benefits}).ToList();
Am I mistaken that these queries should give the same output?
Update
I just tried manually linking things using LINQ and I am able to get Benefit dates included. Granted, it's not apples to apples, as the resulting set is different, but the point is that Include(b => b.Dates) seems to work in this case.
var list3 = (from s in _context.Subscribers.Include(s => s.Dates)
join b in _context.Benefits.Include(b => b.Dates) on s.Id equals b.SubscriberId
select new {benefits = b}).ToList();
I'm beginning to wonder if ThenInclude() may be a little more restrictive in where/when it can be used?
Update 2
I just noticed a warning in my Debug Output window that led me to this link about ignored includes. This seems to be on the right track as the Debug Output clearly indicates that these includes are being ignored.
What doesn't make sense, though, is that I'm actually selecting to anonymous in both examples and only one of them seems to ignore the includes. Why one and not the other?
This is just a guess: since you have Dates in both Subscribers and Benefits, there is a chance that you are not querying the correct model/entity. You can try confirming the entity type as shown below (assuming Benefit is your entity type).
var list1 = (from s in _context.Subscribers
.Include(s => s.Dates)
.Include(s => s.Benefits)
.ThenInclude(b => (b as Benefit).Dates)
where s.Id == 13643
select new { benefits = s.Benefits }).ToList();
var list2 = _context.Subscribers
.Include(s => s.Dates)
.Include(s => s.Benefits)
.ThenInclude(b => (b as Benefit).Dates) // <-- I suggest renaming s to b
.Where(s => s.Id == 13643)
.Select(s => new { benefits = s.Benefits}).ToList();

Visiting Entity Framework's Include method

I'm trying to visit Entity Framework's Include method using the QueryResultCache class, which is mentioned here. It's a very popular article, and a lot of query caching libraries use it.
When I try an expression like:
var exp1 = context.Products.Include(x => x.Tags)
.Where(x => x.Tags.Any(y => y.Name.Contains("Test")))
.Select(x => new {x.ProductId}).Expression;
with it, it produces this string:
value(System.Data.Entity.Core.Objects.ObjectQuery`1
[EfSecondLevelCaching.Test.Models.Product]).MergeAs(AppendOnly).IncludeSpan
(value(System.Data.Entity.Core.Objects.Span))
.Where(x => x.Tags.Any(y => y.Name.Contains("Test")))
.Select(x => new <>f__AnonymousType5`1(ProductId = x.ProductId))
As you can see, the result doesn't contain the parameters of Include method (x => x.Tags). So most of the linq caching libraries on the net can't create a valid unique query key for the EF queries. How can I fix this?
Edit:
If I remove the select method, it will produce:
value(System.Data.Entity.Core.Objects.ObjectQuery`1
[EfSecondLevelCaching.Test.Models.Product])
.MergeAs(AppendOnly)
.IncludeSpan(value(System.Data.Entity.Core.Objects.Span))
.Where(x => x.Tags.Any(y => y.Name.Contains("Test")))
So here there is no difference between Include(x=>x.Tags) and Include(x=>x.Users).
The query will only return what is in your Select expression. In this case Select(x => new {x.ProductId}) means that only a single field ProductId will be returned.
Your Include would have made a difference if you were returning Products as they contain Tags, but makes no difference if you just have ProductId.
See this MSDN article for more information on eager loading (Include ensures eager loading)

LINQ Lambda query 'select' not working with oData

I'm currently trying to understand some of the fundamentals with LINQ. I have been using LINQPad to query the Netflix OData source.
Source: http://odata.netflix.com/v2/Catalog/
I can't seem to select single properties when using a lambda query - the comprehension query works perfectly. I found a snippet of code that performs a more complex query using lambdas on the Netflix oData source, and this seems to work fine for returning one property of the entity.
// works fine
var compQuery = from t in Titles
where t.ReleaseYear == 2007
select new { t.Name };
compQuery.Dump();
// fails: "Can only specify query options (orderby, where, take, skip) after last navigation."
var lambdaQuery = Titles
.Where(t => t.ReleaseYear == 2007)
.Select(t => t.Name);
lambdaQuery.Dump();
// works fine - found on SO.
var lambdaQuery2 = People
.Expand("TitlesActedIn")
.Where(p => p.Name == "George Lucas")
.First()
.TitlesActedIn.Select(t => t.ShortName);
lambdaQuery2.Dump();
Could anyone shed some light as to why the basic lambda query is failing when asked to return one property?
OData doesn't have support for projecting to properties - you can work around this though:
var lambdaQuery = Titles
.Where(t => t.ReleaseYear == 2007)
.Select(x=> new { x.Name })
.AsEnumerable()
.Select(t => t.Name);
Using AsEnumerable() forces the last part of the query to be executed in Linq-to-Objects context (instead of an OData query) where the projection works just fine.
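The switch that AsEnumerable() makes can be seen in the static types alone. The following sketch uses a hypothetical Title class and an in-memory list wrapped with AsQueryable in place of the OData source, so it only shows where the query stops being an IQueryable, not how an OData provider behaves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the OData entity.
public class Title
{
    public string Name { get; set; }
    public int ReleaseYear { get; set; }
}

public static class AsEnumerableDemo
{
    public static IEnumerable<string> Names(IQueryable<Title> titles)
    {
        return titles
            .Where(t => t.ReleaseYear == 2007)
            .Select(t => new { t.Name }) // still part of the provider query
            .AsEnumerable()              // everything after this is LINQ to Objects
            .Select(t => t.Name);
    }

    public static IQueryable<Title> SampleTitles() => new List<Title>
    {
        new Title { Name = "A", ReleaseYear = 2007 },
        new Title { Name = "B", ReleaseYear = 2008 },
        new Title { Name = "C", ReleaseYear = 2007 }
    }.AsQueryable();

    public static void Main()
    {
        var names = Names(SampleTitles());
        Console.WriteLine(names is IQueryable<string>); // False
        Console.WriteLine(string.Join(",", names));     // A,C
    }
}
```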
Try this; it is the actual equivalent of your first query:
// fails: "Can only specify query options (orderby, where, take, skip) after last navigation."
var lambdaQuery = Titles
.Where(t => t.ReleaseYear == 2007)
.Select(t => new { t.Name });
lambdaQuery.Dump();
Using the answers given, I have run some tests and found some interesting things regarding execution time:
// Avg Execution Time: 5 seconds
var query1 = Titles
.Where(t => t.ReleaseYear == 2007)
.Select(t => new {t.Name});
query1.Dump();
// Avg Execution Time: 15 seconds
var query2 = Titles
.Where(t => t.ReleaseYear == 2007)
.AsEnumerable()
.Select(t => t.Name);
query2.Dump();
So am I right in thinking that in query 1, only the 'Name' property is being returned? Whereas in query 2, the 'AsEnumerable()' method is bringing back the entity with all property values, hence longer execution time?
