How to only get the ID from EF Linq query - c#

I am attempting to get a list of Order Ids from a EF linq query. The sql query is returning back quickly but I think the EF framework is trying to create the full entity. I only want the ID of the order. It seems that it creates the whole entity and then it parses it out to only the id. Which seems to be a complete waste of resources.
Orders are a complex object that includes lots of child entitys. I dont need anything but the Ids of the orders in the list. Orders are organized into OrderCollection which is a many to many relationship.
The basic query in English is get the order ids in the specified order collection and have a cart date newer then the date specified and only send the specified page (skip and take).
example:
_repo.Orders.Where(o => o.OrderCollection.Any(r => r.Id == RoutingRuleId)).ToList()
.Where(o => o.OrderDate >= StartDateTime)
.OrderBy(x => x.OrderDate )
.Skip(RecordsToSkipCount)
.Take(BatchSize).Select(x => new { x.Id }).ToArray();
The sql runs in just 102ms for this in debug mode. But afterwards I see the memory go to up 4GB before failing. The batchsize is only 100. Its like it grabbing everything.
I tried moving the select around but that failed also or gave syntax errors or poor performance in running the sql (SQL taking 16 seconds).
Example
_repo.Orders.Select(x => new { x.Id, x.OrderCollection, x.OrderDate})
.Where(o => o.OrderCollection.Any(r => r.Id == RoutingRuleId)).ToList()
.Where(o => o.OrderDate >= StartDateTime)
.OrderBy(x => x.OrderDate )
.Skip(RecordsToSkipCount)
.Take(BatchSize).Select(x => new { x.Id }).ToArray();
The database has millions of records.

What you have is roughly;
List<Order> list = _repo.Orders
.Where(o => o.OrderCollection.Any(r => r.Id == RoutingRuleId))
.ToList();
list.Where(o => o.OrderDate >= StartDateTime)
.OrderBy(x => x.OrderDate )
.Skip(RecordsToSkipCount)
.Take(BatchSize)
.Select(x => new { x.Id })
.ToArray();
That first .ToList is forcing EF Core to load every order with a matching routing rule into memory. The rest of the expression is then using IEnumerable extension methods to process those results.
I think you want to rearrange that to;
IQueryable<Order> query = _repo.Orders
.Where(o => o.OrderCollection.Any(r => r.Id == RoutingRuleId)
&& o.OrderDate >= StartDateTime)
.OrderBy(x => x.OrderDate )
.Skip(RecordsToSkipCount)
.Take(BatchSize)
.Select(x => new { x.Id });
query.ToArray();
Creating an IQueryable doesn't trigger EF Core to execute any SQL. An IQueryable is just a description of the query you would like to run. Then it's the .ToArray method that will finally cause EF Core to compile and execute an sql statement.

Related

Field expression GroupBy not returning included objects

In this code:
var dbrepayments = _context.Repayments.Include("Loan").Include("Loan.Borrower").Include("Loan.LoanProduct")
.Where(c => c.PaidOn == null && c.DateOfRepayment <= today)
.GroupBy(c => c.Loan.Id, (key, g) => g.OrderByDescending(c => c.Id).FirstOrDefault())
.OrderBy(c => c.DateOfRepayment);
_context is ApplicationDbContext type that I am using to get results from database using Code-First approach.
The problem is when I try to iterate through dbrepayments and get the value of Loan, Loan.Borrower, and Loan.LoanProduct objects they are showing as null. But when I remove GroupBy, these objects are returned correctly.
I'd wager the issue here is the element selector in your GroupBy statement:
(key, g) => g.OrderByDescending(c => c.Id).FirstOrDefault()
This didn't make a lot of sense when I first read it. You are taking repayments grouped by loan, but then trying to select just the last repayment for each loan? Followed by ordering those first repayments by date.
I believe this will give you the results you're looking for with the eager loaded relationships:
var dbrepayments = _context.Repayments.Include("Loan").Include("Loan.Borrower").Include("Loan.LoanProduct")
.Where(c => c.PaidOn == null && c.DateOfRepayment <= today)
.GroupBy(c => c.Loan.Id)
.Select(c => c.OrderByDescending(x => x.Id).FirstOrDefault())
.OrderBy(c => c.DateOfRepayment);
GroupBy will respect Include but if you are using a select expression, that overrides it. You cannot add Include inside the selector as that is working with IEnumerable of the expected results. Instead, group the results by loan as expected, but then Select from the results to get the latest repayment. This will give you a list of the latest repayments that you can then order.

Finding common items is evaluated locally

Using Entity Framework Core 2.2 I have the following query:
IQueryable<User> users = _context.Users.AsNoTracking();
User user = await users
.Include(x => x.UserSkills)
.ThenInclude(x => x.Skill)
.FirstOrDefaultAsync(x => x.Id == 1);
var userSkills = user.UserSkills.ToList();
IQueryable<Lesson> lessons = _context.Lessons.AsNoTracking();
var test = lessons
.Where(x => x.IsEnabled)
.Where(x => x.LessonSkills.All(y => userSkills.Any(z => y.SkillId == z.SkillId)))
.ToList();
I am looking to get User Skills contains all Lesson Skills.
When I run this query I get the following error:
Exception thrown: 'System.InvalidOperationException' in System.Private.CoreLib.dll:
'Error generated for warning 'Microsoft.EntityFrameworkCore.Query.QueryClientEvaluationWarning:
The LINQ expression 'where ([y].SkillId == [z].SkillId)' could not be translated and will be evaluated locally.'.
How to change the query to solve this problem?
Update
I need to extend this query with an extra option (y.SkillLevelId <= z.SkillLevelId):
var test = lessons
.Where(x => x.IsEnabled)
.Where(x => x.LessonSkills.All(y => userSkills.Any(z =>
y.SkillId == z.SkillId
&&
y.SkillLevelId <= z.SkillLevelId)))
.ToList();
userSkills is in-memory collection, and from my experience with EF6 and EF Core so far I can say that the only reliable translatable construct with in-memory collections is Enumerable.Contains method on primitive type in-memory collection.
So the following solves the problem is question.
First (should be outside the query expression tree):
var userSkillIds = user.UserSkills.Select(x => x.SkillId);
Then instead of
.Where(x => x.LessonSkills.All(y => userSkills.Any(z => y.SkillId == z.SkillId)))
use the equivalent (but translatable):
.Where(x => x.LessonSkills.All(y => userSkillIds.Contains(y.SkillId)))
Update: If you can't use Contains, the options you have until EF Core starts supporting it are (1) EntityFrameworkCore.MemoryJoin package (I personally haven't tested it, but the idea is interesting), (2) manually building Or based predicate with Expression class (hard and works for small memory collections) and (3) replace the memory collection with real IQueryable<>, for instance
var userSkills = users
.Where(x => x.Id == 1)
.SelectMany(x => x.UserSkills);
and use the original query.

How can I order in Linq Select Distinct if ordered by an Included entity?

I know that in Linq I have to do the OrderBy after doing a Select - Distinct, but I'm trying to order by an Included entity property that get lost after the Select.
For example:
var accounts = _context.AccountUser
.Include(o => o.Account)
.Where(o => o.UserId == userId || o.Account.OwnerId == userId)
.OrderByDescending(o => o.LastAccessed)
.Select(o => o.Account)
.Distinct();
As I'm doing the Where by an or of two different parameters, there is a good chance to obtain duplicated results. That's why I'm using the Distinct.
The problem here is that after I do the Select, I don't have the LastAccessed property anymore because it doesn't belong to the selected entity.
I thing the structure of the AccountUser and Account can be inferred from the query itself.
If you have the bi-directional navigation properties set up:
var accountsQuery = _context.AccountUser
.Where(o => o.UserId == userId || o.Account.OwnerId == userId)
.Select(o => o.Account)
.Distinct()
.OrderByDescending(a => a.AccountUser.LastAccessed);
When Selecting the Account you do not need .Include() Keep in mind that any related entities that you access off the Account will be lazy-loaded. I recommend using a .Select() to extract either a flattened view model or a view model hierarchy so that the SQL loads all needed fields rather than either eager-loading everything or tripping lazy-load calls.
Since LINQ doesn't implement DistinctBy and LINQ to SQL doesn't implement Distinct that takes an IEqualityComparer, you must substiture GroupBy+Select instead:
var accounts = _context.AccountUser
.Include(o => o.Account)
.Where(o => o.UserId == userId || o.Account.OwnerId == userId)
.GroupBy(o => o.Account).Select(og => og.First())
.OrderByDescending(o => o.LastAccessed)
.Select(o => o.Account);

Filtering on Include in EF Core

I'm trying to filter on the initial query. I have nested include leafs off a model. I'm trying to filter based on a property on one of the includes. For example:
using (var context = new BloggingContext())
{
var blogs = context.Blogs
.Include(blog => blog.Posts)
.ThenInclude(post => post.Author)
.ToList();
}
How can I also say .Where(w => w.post.Author == "me")?
Entity Framework core 5 is the first EF version to support filtered Include.
How it works
Supported operations:
Where
OrderBy(Descending)/ThenBy(Descending)
Skip
Take
Some usage examples (from the original feature request and the github commmit)
:
Only one filter allowed per navigation, so for cases where the same navigation needs to be included multiple times (e.g. multiple ThenInclude on the same navigation) apply the filter only once, or apply exactly the same filter for that navigation.
context.Customers
.Include(c => c.Orders.Where(o => o.Name != "Foo")).ThenInclude(o => o.OrderDetails)
.Include(c => c.Orders).ThenInclude(o => o.Customer)
or
context.Customers
.Include(c => c.Orders.Where(o => o.Name != "Foo")).ThenInclude(o => o.OrderDetails)
.Include(c => c.Orders.Where(o => o.Name != "Foo")).ThenInclude(o => o.Customer)
Another important note:
Collections included using new filter operations are considered to be loaded.
That means that if lazy loading is enabled, addressing one customer's Orders collection from the last example won't trigger a reload of the entire Orders collection.
Also, two subsequent filtered Includes in the same context will accumulate the results. For example...
context.Customers.Include(c => c.Orders.Where(o => !o.IsDeleted))
...followed by...
context.Customers.Include(c => c.Orders.Where(o => o.IsDeleted))
...will result in customers with Orders collections containing all orders.
Filtered Include and relationship fixup
If other Orders are loaded into the same context, more of them may get added to a customers.Orders collection because of relationship fixup. This is inevitable because of how EF's change tracker works.
context.Customers.Include(c => c.Orders.Where(o => !o.IsDeleted))
...followed by...
context.Orders.Where(o => o.IsDeleted).Load();
...will again result in customers with Orders collections containing all orders.
The filter expression
The filter expression should contain predicates that can be used as a stand-alone predicate for the collection. An example will make this clear. Suppose we want to include orders filtered by some property of Customer:
context.Customers.Include(c => c.Orders.Where(o => o.Classification == c.Classification))
It compiles, but it'll throw a very technical runtime exception, basically telling that o.Classification == c.Classification can't be translated because c.Classification can't be found. The query has to be rewritten using a back-reference from Order to Customer:
context.Customers.Include(c => c.Orders.Where(o => o.Classification == o.Customer.Classification))
The predicate o => o.Classification == o.Customer.Classification) is "stand alone" in the sense that it can be used to filter Orders independently:
context.Orders.Where(o => o.Classification == o.Customer.Classification) // No one would try 'c.Classification' here
This restriction may change in later EF versions than the current stable version (EF core 5.0.7).
What can (not) be filtered
Since Where is an extension method on IEnumerable it's clear that only collections can be filtered. It's not possible to filter reference navigation properties. If we want to get orders and only populate their Customer property when the customer is active, we can't use Include:
context.Orders.Include(o => o.Customer.Where( ... // obviously doesn't compile
Filtered Include vs filtering the query
Filtered Include has given rise to some confusion on how it affects filtering a query as a whole. The rule of the thumb is: it doesn't.
The statement...
context.Customers.Include(c => c.Orders.Where(o => !o.IsDeleted))
...returns all customers from the context, not only the ones with undeleted orders. The filter in the Include doesn't affect the number of items returned by the main query.
On the other hand, the statement...
context.Customers
.Where(c => c.Orders.Any(o => !o.IsDeleted))
.Include(c => c.Orders)
...only returns customers having at least one undeleted order, but having all of their orders in the Orders collections. The filter on the main query doesn't affect the orders per customer returned by Include.
To get customers with undeleted orders and only loading their undeleted orders, both filters are required:
context.Customers
.Where(c => c.Orders.Any(o => !o.IsDeleted))
.Include(c => c.Orders.Where(o => !o.IsDeleted))
Filtered Include and projections
Another area of confusion is how filtered Include and projections (select new { ... }) are related. The simple rule is: projections ignore Includes, filtered or not. A query like...
context.Customers
.Include(c => c.Orders)
.Select(c => new { c.Name, c.RegistrationDate })
...will generate SQL without a join to Orders. As for EF, it's the same as...
context.Customers
.Select(c => new { c.Name, c.RegistrationDate })
It gets confusing when the Include is filtered, but Orders are also used in the projection:
context.Customers
.Include(c => c.Orders.Where(o => !o.IsDeleted))
.Select(c => new
{
c.Name,
c.RegistrationDate,
OrderDates = c.Orders.Select(o => o.DateSent)
})
One might expect that OrderDates only contains dates from undeleted orders, but they contain the dates from all Orders. Again, the projection completely ignores the Include. Projection and Include are separate worlds.
How strictly they lead their own lives is amusingly demonstrated by this query:
context.Customers
.Include(c => c.Orders.Where(o => !o.IsDeleted))
.Select(c => new
{
Customer = c,
OrderDates = c.Orders.Select(o => o.DateSent)
})
Now pause for a moment and predict the outcome...
The not so simple rule is: projections don't always ignore Include. When there is an entity in the projection to which the Include can be applied, it is applied. That means that Customer in the projection contains its undeleted Orders, whereas OrderDates still contains all dates. Did you get it right?
Not doable.
There is an on-going discussion about this topic:
https://github.com/aspnet/EntityFramework/issues/1833
I'd suggest to look around for any of the 3rd party libraries listed there, ex.: https://github.com/jbogard/EntityFramework.Filters
You can also reverse the search.
{
var blogs = context.Author
.Include(author => author.posts)
.ThenInclude(posts => posts.blogs)
.Where(author => author == "me")
.Select(author => author.posts.blogs)
.ToList();
}
Not sure about Include() AND ThenInclude(), but it's simple to do that with a single include:
var filteredArticles =
context.NewsArticles.Include(x => x.NewsArticleRevisions)
.Where(article => article.NewsArticleRevisions
.Any(revision => revision.Title.Contains(filter)));
Hope this helps!
Although it's (still in discussion) not doable with EF Core, I've managed to do it using Linq to Entities over EF Core DbSet. In your case instead of:
var blogs = context.Blogs
.Include(blog => blog.Posts)
.ThenInclude(post => post.Author)
.ToList()
.. you'll have:
await (from blog in this.DbContext.Blogs
from bPost in blog.Posts
from bpAuthor in bPost.Author
where bpAuthor = "me"
select blog)
.ToListAsync();
I used below package
Use Z.EntityFramework.Plus
IncludeFilter and IncludeFilterByPath two methods are which you can use.
var list = context.Blogs.IncludeFilter(x => x.Posts.Where(y => !y.IsSoftDeleted))
.IncludeFilter(x => x.Posts.Where(y => !y.IsSoftDeleted)
.SelectMany(y => y.Comments.Where(z => !z.IsSoftDeleted)))
.ToList();
Here is the example https://dotnetfiddle.net/SK934m
Or you can do like this
GetContext(session).entity
.Include(c => c.innerEntity)
.Select(c => new Entity()
{
Name = c.Name,
Logo = c.Logo,
InnerEntity= c.InnerEntity.Where(s => condition).ToList()
})
Interesting case and it worked!!
If you have table/model user(int id, int? passwordId, ICollection<PwdHist> passwordHistoryCollection) where collection is history of passwords. Could be many or none.
And PwdHistory(int id, int UserId, user User). This has a quasi relationship via attributes.
Needed to get user, with related current password record, while leaving historical records behind.
User user = _userTable
.Include(u => u.Tenant)
.Include(u => u.PwdHistory.Where(p => p.Id == p.PwdUser.PasswordId))
.Where(u => u.UserName == userName)
.FirstOrDefault();
Most interesting part is .Include(u => u.PwdHistory.Where(p => p.Id == p.PwdUser.PasswordId))
works with user and many passwords
works with user and no passwords
works with no user
We can use by extension
public static IQueryable<TEntity> IncludeCondition<TEntity, TProperty>(this IQueryable<TEntity> query, Expression<Func<TEntity, TProperty>> predicate, bool? condition) where TEntity : class where TProperty : class
{
return condition == true ? query.Include(predicate) : query;
}
Usage;
_context.Tables.IncludeCondition(x => x.InnerTable, true)
This task can be accomplished with two queries. For example:
var query = _context.Employees
.Where(x =>
x.Schedules.All(s =>
s.ScheduleDate.Month != DateTime.UtcNow.AddMonths(1).Month &&
s.ScheduleDate.Year != DateTime.UtcNow.AddMonths(1).Year) ||
(x.Schedules.Any(s =>
s.ScheduleDate.Month == DateTime.UtcNow.AddMonths(1).Month &&
s.ScheduleDate.Year == DateTime.UtcNow.AddMonths(1).Year) &&
x.Schedules.Any(i => !i.ScheduleDates.Any())));
var employees = await query.ToListAsync();
await query.Include(x => x.Schedules)
.ThenInclude(x => x.ScheduleDates)
.SelectMany(x => x.Schedules)
.Where(s => s.ScheduleDate.Month == DateTime.UtcNow.AddMonths(1).Month &&
s.ScheduleDate.Year == DateTime.UtcNow.AddMonths(1).Year).LoadAsync();

How to implement SkipWhile with Linq to Sql without first loading the whole list into memory?

I need to order the articles stored in a database by descending publication date and then take the first 20 records after the article with Id == 100.
This is what I would like to do with Linq:
IQueryable<Article> articles =
db.Articles
.OrderByDescending(a => a.PublicationDate)
.SkipWhile(a => a.Id != 100)
.Take(20);
However, this generates a NotSupportedException because SkipWhile is not supported in Linq to Sql (see here).
A possible solution is to execute the query and then apply SkipWhile using Linq to Object:
IEnumerable<ArticleDescriptor> articles =
db.Articles
.OrderByDescending(a => a.PublicationDate)
.ToList()
.SkipWhile(a => a.Article.Id != 100)
.Take(20);
But this means I need to load the whole ordered list into memory first and then take 20 articles after the one with Id == 100.
Is there a way to avoid this huge memory consumption?
More in general, what is the best way to achieve this in SQL?
If, as I'm guessing from the column name, PublicationDate doesn't change, you can do this in two separate queries:
Establish the PublicationDate of the Article with Id == 100
Retrieve the 20 articles from that date onwards
Something like:
var thresholdDate = db.Articles.Single(a => a.Id == 100).PublicationDate;
var articles =
db.Articles
.Where(a => a.PublicationDate <= thresholdDate)
.OrderByDescending(a => a.PublicationDate)
.Take(20);
It might even be that LINQ to SQL can translate this:
var articles =
db.Articles
.Where(a => a.PublicationDate
<= db.Articles.Single(aa => aa.Id == 100).PublicationDate)
.OrderByDescending(a => a.PublicationDate)
.Take(20);
but that may be too complex for it. Try it and see.
You can try like this
var articles =
db.Articles
.Where(a => a.PublicationDate < db.Articles
.Where(aa => aa.Id==100)
.Select(aa => aa.PublicationDate)
.SingleOrDefault())
.OrderByDescending(a => a.PublicationDate)
.Take(20);
Isnt the solution to just add a where statement?
IQueryable<Article> articles = db.Articles.Where(a => a.id != 100).OrderByDescending(a => a.PublicationDate).Take(20);

Categories

Resources