Eager Loading with LinqKit and EFCore

B"H
C#'s inclusion of LINQ right in the language is one of its most powerful features.
It is also what makes Entity Framework such an enticing option for working with a database.
Unfortunately, due to typing restrictions, it's often difficult to create centralized expressions to use throughout your project/solution.
One answer to that problem is LinqKit. This was working great for me in EF 6.x.
However, since moving to EF Core, I am facing a showstopper.
Instead of properly compiled expressions that are converted into a single SQL statement, I am getting function-like expressions that don't include their sub-expressions in the SQL they generate. Instead, they create AsyncLinqOperatorProvider.EnumerableAdapters, which .NET tries to execute (outside of an async context) when you access the IEnumerables a few lines later.
So the question is: How do I get EF Core to execute the entire expression as one SQL statement and return a complete materialized object?
Given two classes
class OrderItemDTO
{
public string OrderItemName { get; set; }
}
class OrderDTO
{
public string OrderName { get; set; }
public ICollection<OrderItemDTO> OrderItems { get; set; }
}
I would like to create a global expression somewhere:
public static Expression<Func<Order, OrderDTO>> ToDTO = x => new OrderDTO
{
OrderName = x.Name,
OrderItems = x.Items.Select(y => new OrderItemDTO { OrderItemName = y.Name })
};
which I would then use somewhere as
var orders = await db.Orders.AsExpandable().Select(ToDTO).ToListAsync();
expecting a fully materialized OrderDTO.
Instead, what I get is a DTO whose OrderItems is an AsyncLinqOperatorProvider.EnumerableAdapter, which causes a race condition when it is enumerated.

Call AsQueryable() and then ToList() on your inner collections. EF will then be able to correctly treat this as a full projection:
public static Expression<Func<Order, OrderDTO>> ToDTO = x => new OrderDTO
{
OrderName = x.Name,
OrderItems = x.Items.AsQueryable().Select(y => new OrderItemDTO { OrderItemName = y.Name }).ToList()
};
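A minimal sketch of the fix above, run against LINQ to Objects so it can be tried without a database. The Order and OrderItem entity classes are assumptions (the question does not show them), and the in-memory run only illustrates the shape of the expression; it says nothing about the SQL a particular EF Core version will generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity types; the original question does not show them.
public class OrderItem { public string Name { get; set; } = ""; }
public class Order
{
    public string Name { get; set; } = "";
    public List<OrderItem> Items { get; set; } = new List<OrderItem>();
}

public class OrderItemDTO { public string OrderItemName { get; set; } = ""; }
public class OrderDTO
{
    public string OrderName { get; set; } = "";
    public ICollection<OrderItemDTO> OrderItems { get; set; } = new List<OrderItemDTO>();
}

public static class OrderProjections
{
    // The inner collection is projected through AsQueryable() and closed with
    // ToList(), so the DTO receives a concrete list rather than a lazy sequence.
    public static readonly Expression<Func<Order, OrderDTO>> ToDTO = x => new OrderDTO
    {
        OrderName = x.Name,
        OrderItems = x.Items.AsQueryable()
            .Select(y => new OrderItemDTO { OrderItemName = y.Name })
            .ToList()
    };
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for db.Orders.
        IQueryable<Order> orders = new List<Order>
        {
            new Order { Name = "A", Items = { new OrderItem { Name = "x" }, new OrderItem { Name = "y" } } }
        }.AsQueryable();

        List<OrderDTO> dtos = orders.Select(OrderProjections.ToDTO).ToList();
        foreach (OrderDTO dto in dtos)
        {
            Console.WriteLine(dto.OrderName + ": " + dto.OrderItems.Count + " item(s)");
        }
    }
}
```

With EF Core the same expression would be passed to db.Orders.Select(...) followed by ToListAsync(), as in the question.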

Related

LINQ Query optimalisation using EF6

I'm trying my hand at LINQ for the first time and just wanted to post a small question to make sure if this was the best way to go about it. I want a list of every value in a table. So far this is what I have, and it works, but is this the best way to go about collecting everything in a LINQ friendly way?
public static List<Table1> GetAllDatainTable()
{
List<Table1> Alldata = new List<Table1>();
using (var context = new EFContext())
{
Alldata = context.Tablename.ToList();
}
return Alldata;
}

For simple entities, that is, entities that have no references to other entities (navigation properties), your approach is essentially fine. It can be condensed down to:
public static List<Table1> GetAllDatainTable()
{
using (var context = new EFContext())
{
return context.Table1s.ToList();
}
}
However, in most real-world scenarios you are going to want to leverage things like navigation properties for the relationships between entities. I.e. an Order references a Customer with Address details, and contains OrderLines which each reference a Product, etc. Returning entities this way becomes problematic because any code that accepts the entities returned by a method like this should be getting either complete, or completable entities.
For instance if I have a method that returns an order, and I have various code that uses that order information: Some of that code might try to get info about the order's customer, other code might be interested in the products. EF supports lazy loading so that related data can be pulled if, and when needed, however that only works within the lifespan of the DbContext. A method like this disposes the DbContext so Lazy Loading is off the cards.
One option is to eager load everything:
using (var context = new EFContext())
{
var order = context.Orders
.Include(o => o.Customer)
.ThenInclude(c => c.Addresses)
.Include(o => o.OrderLines)
.ThenInclude(ol => ol.Product)
.Single(o => o.OrderId == orderId);
return order;
}
However, there are two drawbacks to this approach. Firstly, it means loading considerably more data every time we fetch an order. The consuming code may not care about the customer or order lines, but we've loaded it all anyway. Secondly, as systems evolve, new relationships may be introduced, and older code will not necessarily be updated to include them, leading to potential NullReferenceExceptions and bugs, or to performance issues as more and more related data gets included. The view or whatever is initially consuming this entity may not expect to reference these new relationships, but once you start passing entities around to views, from views, and to other methods, any code accepting an entity should be able to rely on the entity being complete or completable. It can be a nightmare to have an Order potentially loaded in various levels of "completeness", with code having to handle whether data is loaded or not. As a general recommendation, I advise against passing entities around outside the scope of the DbContext that loaded them.
The better solution is to leverage projection to populate view models, suited to your code's consumption, from the entities. WPF often utilizes the MVVM pattern, so this means using EF's Select method or AutoMapper's ProjectTo method to populate view models based on each consumer's needs. When your code works with view models containing only the data that the views need, rather than loading and populating whole entities, you can write far more efficient (faster) and more resilient queries to get data out.
If I have a view that lists orders with a created date, customer name, and list of products with quantities, we define a view model for the view:
[Serializable]
public class OrderSummary
{
public int OrderId { get; set; }
public string OrderNumber { get; set; }
public DateTime CreatedAt { get; set; }
public string CustomerName { get; set; }
public ICollection<OrderLineSummary> OrderLines { get; set; } = new List<OrderLineSummary>();
}
[Serializable]
public class OrderLineSummary
{
public int OrderLineId { get; set; }
public int ProductId { get; set; }
public string ProductName { get; set; }
public int Quantity { get; set; }
}
Then project the view models in the LINQ query:
using (var context = new EFContext())
{
var orders = context.Orders
// add filters & such /w Where() / OrderBy() etc.
.Select(o => new OrderSummary
{
OrderId = o.OrderId,
OrderNumber = o.OrderNumber,
CreatedAt = o.CreatedAt,
CustomerName = o.Customer.Name,
OrderLines = o.OrderLines.Select( ol => new OrderLineSummary
{
OrderLineId = ol.OrderLineId,
ProductId = ol.Product.ProductId,
ProductName = ol.Product.Name,
Quantity = ol.Quantity
}).ToList()
}).ToList();
return orders;
}
Note that we don't need to worry about eager loading related entities, and if later down the road an order or customer or such gains new relationships, the above query will continue to work, only being updated if the new relationship information is useful for the view(s) it serves. It can compose a faster, less memory intensive query fetching fewer fields to be passed over the wire from the database to the application, and indexes can be employed to tune this even further for high-use queries.
Update:
Additional performance tips: Generally avoid methods like GetAll*() as a lowest common denominator method. Far too many performance issues I come across with methods like this are in the form of:
var ordersToShip = GetAllOrders()
.Where(o => o.OrderStatus == OrderStatus.Pending)
.ToList();
foreach (var order in ordersToShip)
{
// do something that only needs order.OrderId.
}
Where GetAllOrders() returns List<Order> or IEnumerable<Order>. Sometimes there is code like GetAllOrders().Count() > 0 or such.
Code like this is extremely inefficient because GetAllOrders() fetches all records from the database and loads them into memory in the application, only for them to be filtered down or counted afterwards.
If you're following a path to abstract away the EF DbContext and entities into a service / repository through methods then you should ensure that the service exposes methods to produce efficient queries, or forgo the abstraction and leverage the DbContext directly where data is needed.
var orderIdsToShip = context.Orders
.Where(o => o.OrderStatus == OrderStatus.Pending)
.Select(o => o.OrderId)
.ToList();
var customerOrderCount = context.Customer
.Where(c => c.CustomerId == customerId)
.Select(c => c.Orders.Count())
.Single();
EF is extremely powerful and when selected to service your application should be embraced as part of the application to give the maximum benefit. I recommend avoiding coding to abstract it away purely for the sake of abstraction unless you are looking to employ unit testing to isolate the dependency on data with mocks. In this case I recommend leveraging a unit of work wrapper for the DbContext and the Repository pattern leveraging IQueryable to make isolating business logic simple.
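As a rough sketch of the repository idea above: the repository hands back an IQueryable, and each caller composes only the filter and projection it needs. IOrderRepository, InMemoryOrderRepository and ShippingQueries are hypothetical names, and the in-memory list stands in for a DbSet so the example runs without a database. With EF, GetOrders() would presumably return context.Orders, and the composed query would be translated to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum OrderStatus { Pending, Shipped }

public class Order
{
    public int OrderId { get; set; }
    public OrderStatus OrderStatus { get; set; }
}

// Hypothetical abstraction: exposes a composable query instead of a List<Order>.
public interface IOrderRepository
{
    IQueryable<Order> GetOrders();
}

// In-memory stand-in used here so the sketch runs without EF or a database.
public sealed class InMemoryOrderRepository : IOrderRepository
{
    private readonly List<Order> _orders;

    public InMemoryOrderRepository(IEnumerable<Order> orders)
    {
        _orders = orders.ToList();
    }

    public IQueryable<Order> GetOrders()
    {
        return _orders.AsQueryable();
    }
}

public static class ShippingQueries
{
    // Filter and projection are composed before anything is materialized.
    public static List<int> GetOrderIdsToShip(IOrderRepository repository)
    {
        return repository.GetOrders()
            .Where(o => o.OrderStatus == OrderStatus.Pending)
            .Select(o => o.OrderId)
            .ToList();
    }

    // Any() replaces the GetAllOrders().Count() > 0 pattern.
    public static bool HasPendingOrders(IOrderRepository repository)
    {
        return repository.GetOrders().Any(o => o.OrderStatus == OrderStatus.Pending);
    }
}

public static class Program
{
    public static void Main()
    {
        var repository = new InMemoryOrderRepository(new[]
        {
            new Order { OrderId = 1, OrderStatus = OrderStatus.Pending },
            new Order { OrderId = 2, OrderStatus = OrderStatus.Shipped },
            new Order { OrderId = 3, OrderStatus = OrderStatus.Pending }
        });

        Console.WriteLine(string.Join(",", ShippingQueries.GetOrderIdsToShip(repository)));
        Console.WriteLine(ShippingQueries.HasPendingOrders(repository));
    }
}
```

For unit tests, the in-memory implementation can double as the mock; the business logic in ShippingQueries does not change.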

EF Core 3.0 using Skip() and Take() in queries with nested collection projections

I'm using EF Core 3.0.1 in my project and I need to perform nested projections in sql instead of client-side.
public class Order
{
public int Id {get;set;}
public ICollection<Detail> Details {get;set;}
}
public class Detail
{
public string Comment {get;set;}
}
// Order and Detail has exact same dto's: OrderDto and DetailDto
public PagedCollectionResponse<OrderDto> Execute()
{
// consider I need to use .Count() somewhere inside this method after applying nested projection.
var orders = context.Orders
.Select(x => new OrderDto
{
Id = x.Id,
Details = x.Details.Select(d => new DetailDto
{
Comment = d.Comment
})
// initially, I had no ToList here, but this hasn't fixed the issue
.ToList()
});
var ordersCount = orders.Count(); // works fine, executing only one query in db
var ordersCount2 = orders.Skip(5).Take(1).Count(); // throws System.InvalidCastException: Unable to cast object of type
// 'Microsoft.EntityFrameworkCore.Query.SqlExpressions.SqlFunctionExpression' to
// type 'System.Linq.Expressions.ConstantExpression'
}
However, if I remove the collection projection, then both statements assigning ordersCount and ordersCount2 execute normally.
...
.Select(x => new OrderDto
{
Id = x.Id
});
var ordersCount = orders.Count(); // OK
var ordersCount2 = orders.Skip(5).Take(1).Count(); // OK
According to the post on EF Core projections I've found, it was a known issue in EF Core 2.0. Was it fixed?
VERSION WARNING (EF Core 2.0.0): At the time of writing nested collection projections only work in memory, defeating the purpose of using them in the first place. This will be fixed soon, so it’s still worth understanding. They DO still work for nested single element projections which we’ll learn about in the next section, so keep reading!
So, is there any known fix on how to make this work?

Entity Framework Core Generating SQL With Ambiguous Column Names

I am using Entity Framework Core 2.2.6. I'm going to try and make this question concise and apologies in advance if it ends up being a wall of text.
The error I am seeing is an ambiguous column name in the SQL Entity Framework Core generates.
So my situation is this: I have two entities with a many-to-one relationship. The "parent" entity implements an interface that has a property that is of type IChildEntity. Here are the interfaces:
public interface IParentEntity
{
IChildEntity Child { get; set; }
string Prop1 { get; set; }
string Prop2 { get; set; }
}
public interface IChildEntity
{
string ChildProp1 { get; set; }
string ChildProp2 { get; set; }
}
I am using EF Core's fluent API, and in order to set up the relationship between parent and child I am using a concrete type of ChildEntity and defining an IChildEntity property to conform to the interface, just passing things through to the concrete type:
public class ChildEntity : IChildEntity
{
public long ID {get; set;}
public string ChildProp1 { get; set; }
public string ChildProp2 { get; set; }
}
public class ParentEntity : IParentEntity
{
public long ID { get; set; }
public string Prop1 { get; set; }
public string Prop2 { get; set; }
public long ChildID { get; set; }
// Navigation property so EF Core can create the relationship
public ChildEntity MappedChild { get; private set; }
// this is to adhere to the interface
// just pass this through to the backing concrete instance
[NotMapped]
public IChildEntity Child
{
get => MappedChild;
set => MappedChild = (ChildEntity)value;
}
}
Then in OnModelCreating I set up the relationship like so:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<ParentEntity>()
.HasOne(e => e.MappedChild)
.WithMany()
.HasForeignKey(e => e.ChildID);
}
This works and the relationship gets set up as expected. However, I am finding that a query can generate SQL that results in an ambiguous column error in some database engines. Here is the example query:
MyContext.ParentEntity
.Include(p => p.MappedChild)
.Where(p => p.Prop1.Equals("somestring"))
.FirstOrDefault();
The SQL that gets generated is similar to:
SELECT p."ID", p."ChildID", p."Prop1", p."Prop1", "p.MappedChild"."ID", "p.MappedChild"."ChildProp1", "p.MappedChild"."ChildProp2"
FROM "ParentEntity" AS p
INNER JOIN "ChildEntity" AS "p.MappedChild" ON p."ChildID" = "p.MappedChild"."ID"
WHERE p."Prop1" = 'somestring'
ORDER BY "p.MappedChild"."ID"
LIMIT 1
The problem here is that we are selecting two columns with the name ID without aliasing them. Some databases will be OK with this, but some will not. A workaround for this is to run two separate queries to get the entity and the child entity:
var parent = MyContext.ParentEntity
.Where(p => p.Prop1.Equals("somestring"))
.FirstOrDefault();
MyContext.Entry(parent).Reference(p => p.MappedChild).Load();
But this is less than ideal, since it runs multiple queries and is a bit less elegant than just using Include().
Because this seems like such a common use case, and I couldn't find any bug reports against EF Core for this type of behavior, my suspicion is that I am doing something wrong that results in EF Core not aliasing column names for this type of query. I was thinking it could be the bit of trickery I have to do to ensure my entity implements its interface (this is something I can't change due to constraints in the codebase and other integrations), but the more I look at it the less likely that seems, since we are dealing directly with the "mapped" property in EF-related code, which is completely unaware of the interface.
My questions are: can anyone see something in my implementation that would cause this? Could anyone suggest a better workaround than what I have here? Any advice here would be appreciated. Thanks much.

This is an old Entity Framework bug that affects Oracle's database products, including MySQL and Oracle Database (12.1 and older).
I see the
ORA-00918: column ambiguously defined
error mostly when:
Selecting one entity while including its parent entity.
Selecting one entity that has a value object mapped with OwnsOne.
This error appears when using Find, First, FirstOrDefault, Last, Single and all other single-entity selector commands.
I tested many solutions and checked the generated SQL statements to find a workaround that adds no performance overhead:
// This the way of getting one entity from oracle 12.1 without throwing Oracle exception => ORA-00918: column ambiguously defined without any extra overhead
var entities = await dbSet.Where(x => x.Id == id).Take(1).ToListAsync();
var entity = entities.FirstOrDefault();
Another Sample:
var entities = await dbSet.OrderByDescending(x => x.Id).Take(1).ToListAsync();
var entity = entities.FirstOrDefault();
At the end of your IQueryable LINQ chain, add Take(1) and materialize with .ToList() or .ToListAsync() to execute the statement and fetch a list with one record. Then use an Enumerable single-entity selector (such as FirstOrDefault) to turn the list into an entity.
That’s all.
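If this workaround is needed in many places, it could be wrapped in an extension method. The following is only a sketch: FirstOrDefaultViaList is a hypothetical name, the synchronous ToList() is used so the example runs against an in-memory list, and an EF Core version would presumably call ToListAsync() instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleEntityQueryExtensions
{
    // Hypothetical helper: limits the query to one row, materializes a list,
    // and only then picks the single element on the client side.
    public static T FirstOrDefaultViaList<T>(this IQueryable<T> query) where T : class
    {
        List<T> rows = query.Take(1).ToList();
        return rows.FirstOrDefault();
    }
}

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for a DbSet<Person>.
        IQueryable<Person> people = new List<Person>
        {
            new Person { Id = 1, Name = "Ann" },
            new Person { Id = 2, Name = "Bob" }
        }.AsQueryable();

        Person found = people.Where(x => x.Id == 2).FirstOrDefaultViaList();
        Person missing = people.Where(x => x.Id == 99).FirstOrDefaultViaList();

        Console.WriteLine(found != null ? found.Name : "(none)");
        Console.WriteLine(missing != null ? missing.Name : "(none)");
    }
}
```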

Reusable Calculations For LINQ Projections In Entity Framework (Code First)

My domain model has a lot of complex financial data that is the result of fairly complex calculations on multiple properties of various entities. I generally include these as [NotMapped] properties on the appropriate domain model (I know, I know - there's plenty of debate around putting business logic in your entities - being pragmatic, it just works well with AutoMapper and lets me define reusable DataAnnotations - a discussion of whether this is good or not is not my question).
This works fine as long as I want to materialize the entire entity (and any other dependent entities, either via .Include() LINQ calls or via additional queries after materialization) and then map these properties to the view model after the query. The problem comes in when trying to optimize problematic queries by projecting to a view model instead of materializing the entire entity.
Consider the following domain models (obviously simplified):
public class Customer
{
public virtual ICollection<Holding> Holdings { get; private set; }
[NotMapped]
public decimal AccountValue
{
get { return Holdings.Sum(x => x.Value); }
}
}
public class Holding
{
public virtual Stock Stock { get; set; }
public int Quantity { get; set; }
[NotMapped]
public decimal Value
{
get { return Quantity * Stock.Price; }
}
}
public class Stock
{
public string Symbol { get; set; }
public decimal Price { get; set; }
}
And the following view model:
public class CustomerViewModel
{
public decimal AccountValue { get; set; }
}
If I attempt to project directly like this:
List<CustomerViewModel> customers = MyContext.Customers
.Select(x => new CustomerViewModel()
{
AccountValue = x.AccountValue
})
.ToList();
I end up with the following NotSupportedException: Additional information: The specified type member 'AccountValue' is not supported in LINQ to Entities. Only initializers, entity members, and entity navigation properties are supported.
Which is expected. I get it - Entity Framework can't convert the property getters into a valid LINQ expression. However, if I project using the exact same code but within the projection, it works fine:
List<CustomerViewModel> customers = MyContext.Customers
.Select(x => new CustomerViewModel()
{
AccountValue = x.Holdings.Sum(y => y.Quantity * y.Stock.Price)
})
.ToList();
So we can conclude that the actual logic is convertible to a SQL query (I.e., there's nothing exotic like reading from disk, accessing external variables, etc.).
So here's the question: is there any way at all to make logic that should be convertible to SQL reusable within LINQ to entity projections?
Consider that this calculation may be used within many different view models. Copying it to the projection in each action is cumbersome and error prone. What if the calculation changes to include a multiplier? We'd have to manually locate and change it everywhere it's used.
One thing I have tried is encapsulating the logic within an IQueryable extension:
public static IQueryable<CustomerViewModel> WithAccountValue(
this IQueryable<Customer> query)
{
return query.Select(x => new CustomerViewModel()
{
AccountValue = x.Holdings.Sum(y => y.Quantity * y.Stock.Price)
});
}
Which can be used like this:
List<CustomerViewModel> customers = MyContext.Customers
.WithAccountValue()
.ToList();
That works well enough in a simple contrived case like this, but it's not composable. Because the result of the extension is an IQueryable<CustomerViewModel> and not a IQueryable<Customer> you can't chain them together. If I had two such properties in one view model, one of them in another view model, and then the other in a third view model, I would have no way of using the same extension for all three view models - which would defeat the whole purpose. With this approach, it's all or nothing. Every view model has to have the exact same set of calculated properties (which is rarely the case).
Sorry for the long-winded question. I prefer to provide as much detail as possible to make sure folks understand the question and potentially help others down the road. I just feel like I'm missing something here that would make all of this snap into focus.

I did a lot of research on this the last several days because it's been a bit of a pain point in constructing efficient Entity Framework queries. I've found several different approaches that all essentially boil down to the same underlying concept. The key is to take the calculated property (or method), convert it into an Expression that the query provider knows how to translate into SQL, and then feed that into the EF query provider.
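To make the underlying concept concrete before looking at the libraries, here is a minimal hand-rolled sketch that uses only System.Linq.Expressions. It is not how any of the libraries below are implemented: ReplaceParameter, CustomerExpressions and ToViewModel are names invented for this illustration. The calculation is written once as an Expression and then spliced into a projection, so the query provider sees a single expression tree. The demo runs against an in-memory list; whether a given EF version translates the resulting tree is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Stock { public decimal Price { get; set; } }
public class Holding
{
    public int Quantity { get; set; }
    public Stock Stock { get; set; } = new Stock();
}
public class Customer { public List<Holding> Holdings { get; set; } = new List<Holding>(); }
public class CustomerViewModel { public decimal AccountValue { get; set; } }

// Swaps one lambda parameter for another expression while copying a tree.
public sealed class ReplaceParameter : ExpressionVisitor
{
    private readonly ParameterExpression _from;
    private readonly Expression _to;

    public ReplaceParameter(ParameterExpression from, Expression to)
    {
        _from = from;
        _to = to;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        return node == _from ? _to : base.VisitParameter(node);
    }
}

public static class CustomerExpressions
{
    // The reusable calculation, defined exactly once.
    public static readonly Expression<Func<Customer, decimal>> AccountValue =
        c => c.Holdings.Sum(h => h.Quantity * h.Stock.Price);

    // Builds: c => new CustomerViewModel { AccountValue = <inlined calculation> }
    public static Expression<Func<Customer, CustomerViewModel>> ToViewModel()
    {
        ParameterExpression c = Expression.Parameter(typeof(Customer), "c");
        Expression value = new ReplaceParameter(AccountValue.Parameters[0], c).Visit(AccountValue.Body);

        MemberInitExpression body = Expression.MemberInit(
            Expression.New(typeof(CustomerViewModel)),
            Expression.Bind(typeof(CustomerViewModel).GetProperty(nameof(CustomerViewModel.AccountValue)), value));

        return Expression.Lambda<Func<Customer, CustomerViewModel>>(body, c);
    }
}

public static class Program
{
    public static void Main()
    {
        IQueryable<Customer> customers = new List<Customer>
        {
            new Customer
            {
                Holdings =
                {
                    new Holding { Quantity = 2, Stock = new Stock { Price = 10m } },
                    new Holding { Quantity = 1, Stock = new Stock { Price = 5m } }
                }
            }
        }.AsQueryable();

        List<CustomerViewModel> models = customers.Select(CustomerExpressions.ToViewModel()).ToList();
        Console.WriteLine(models[0].AccountValue);
    }
}
```

The libraries below automate this splicing so that the projection can still be written as an ordinary lambda.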
I found the following libraries/code that attempted to solve this problem:
LINQ Expression Projection
http://www.codeproject.com/Articles/402594/Black-Art-LINQ-expressions-reuse and http://linqexprprojection.codeplex.com/
This library allows you to write your reusable logic directly as an Expression and then provides the conversion to get that Expression into your LINQ query (since the query can't directly use an Expression). The funny thing is that it'll be translated back to an Expression by the query provider. The declaration of your reusable logic looks like this:
private static Expression<Func<Project, double>> projectAverageEffectiveAreaSelector =
proj => proj.Subprojects.Where(sp => sp.Area < 1000).Average(sp => sp.Area);
And you use it like this:
var proj1AndAea =
ctx.Projects
.AsExpressionProjectable()
.Where(p => p.ID == 1)
.Select(p => new
{
AEA = Utilities.projectAverageEffectiveAreaSelector.Project<double>()
});
Notice the .AsExpressionProjectable() extension to set up projection support. Then you use the .Project<T>() extension on one of your Expression definitions to get the Expression into the query.
LINQ Translations
http://damieng.com/blog/2009/06/24/client-side-properties-and-any-remote-linq-provider and https://github.com/damieng/Linq.Translations
This approach is pretty similar to the LINQ Expression Projection concept except it's a little more flexible and has several points for extension. The trade off is that it's also a little more complex to use. Essentially you still define your reusable logic as an Expression and then rely on the library to convert that into something the query can use. See the blog post for more details.
DelegateDecompiler
http://lostechies.com/jimmybogard/2014/05/07/projecting-computed-properties-with-linq-and-automapper/ and https://github.com/hazzik/DelegateDecompiler
I found DelegateDecompiler via the blog post on Jimmy Bogard's blog. It has been a lifesaver. It works well, is well architected, and requires a lot less ceremony. It does not require you to define your reusable calculations as an Expression. Instead, it constructs the necessary Expression by using Mono.Reflection to decompile your code on the fly. It knows which properties, methods, etc. need to be decompiled by having you decorate them with ComputedAttribute or by using the .Computed() extension within the query:
class Employee
{
[Computed]
public string FullName
{
get { return FirstName + " " + LastName; }
}
public string LastName { get; set; }
public string FirstName { get; set; }
}
This can also be easily extended, which is a nice touch. For example, I set it up to look for the NotMapped data annotation instead of having to explicitly use the ComputedAttribute.
Once you've set up your entity, you just trigger decompilation by using the .Decompile() extension:
var employees = ctx.Employees
.Select(x => new
{
FullName = x.FullName
})
.Decompile()
.ToList();

You can encapsulate logic by creating a class that contains the original Entity and the additional calculated property. You then create helper methods that project to the class.
For example, if we were trying to calculate the tax for an Employee and a Contractor entity, we could do this:
//This is our container for our original entity and the calculated field
public class PersonAndTax<T>
{
public T Entity { get; set; }
public double Tax { get; set; }
}
public class PersonAndTaxHelper
{
// This is our middle translation class
// Each Entity will use a different way to calculate income
private class PersonAndIncome<T>
{
public T Entity { get; set; }
public int Income { get; set; }
}
// Income calculating methods
public static IQueryable<PersonAndTax<Employee>> GetEmployeeAndTax(IQueryable<Employee> employees)
{
var query = from x in employees
select new PersonAndIncome<Employee>
{
Entity = x,
Income = x.YearlySalary
};
return CalculateTax(query);
}
public static IQueryable<PersonAndTax<Contractor>> GetContractorAndTax(IQueryable<Contractor> contractors)
{
var query = from x in contractors
select new PersonAndIncome<Contractor>
{
Entity = x,
Income = x.Contracts.Sum(y => y.Total)
};
return CalculateTax(query);
}
// Tax calculation is defined in one place
private static IQueryable<PersonAndTax<T>> CalculateTax<T>(IQueryable<PersonAndIncome<T>> personAndIncomeQuery)
{
var query = from x in personAndIncomeQuery
select new PersonAndTax<T>
{
Entity = x.Entity,
Tax = x.Income * 0.3
};
return query;
}
}
// Our view model projections using the Tax property
var contractorViewModel = from x in PersonAndTaxHelper.GetContractorAndTax(context.Contractors)
select new
{
x.Entity.Name,
x.Entity.BusinessName,
x.Tax,
};
var employeeViewModel = from x in PersonAndTaxHelper.GetEmployeeAndTax(context.Employees)
select new
{
x.Entity.Name,
x.Entity.YearsOfService,
x.Tax,
};

Reusing Linq to Entities' Expression<Func<T, TResult>> in Select and Where calls

Suppose I have an entity object defined as
public partial class Article
{
public int Id
{
get;
set;
}
public string Text
{
get;
set;
}
public int UserId
{
get;
set;
}
}
Based on some properties of an Article, I need to determine if the article can be deleted by a given user. So I add a static method to do the checking. Something like:
public partial class Article
{
public static Expression<Func<Article, bool>> CanBeDeletedBy(int userId)
{
//Add logic to be reused here
return a => a.UserId == userId;
}
}
So now I can do
using(MyEntities e = new MyEntities())
{
//get the current user id
int currentUserId = 0;
e.Articles.Where(Article.CanBeDeletedBy(currentUserId));
}
So far so good. Now I want to reuse the logic in CanBeDeletedBy while doing a Select, something like:
using(MyEntities e = new MyEntities())
{
//get the current user id
int currentUserId = 0;
e.Articles.Select(a => new
{
Text = a.Text,
CanBeDeleted = ???
});
}
But no matter what I try, I can't use the expression in the Select method. I guess that if I can do
e.Articles.Select(a => new
{
Text = a.Text,
CanBeDeleted = a.UserId == currentUserId
});
then I should be able to use the same expression. I tried to compile the expression and call it by doing
e.Articles.Select(a => new
{
Text = a.Text,
CanBeDeleted = Article.CanBeDeletedBy(currentUserId).Compile()(a)
});
but it won't work either.
Any ideas on how to get this to work? Or if it isn't possible, what are the alternatives to reuse business logic in both places?
Thanks
Pedro

Re-using expression trees is a black art; you can do it, but you would need to switch a lot of code to reflection and you'd lose all the static checking. In particular, working with the anonymous types becomes a nightmare (although dynamic in 4.0 might be workable).
Further, if you cheat and use Expression.Invoke, then it isn't supported by all providers (most notably not on EF in .NET 3.5SP1).
Unless this is a major pain point, I'd leave it with duplication. Or do you need to re-use the expression tree?
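For illustration, here is roughly what the reflection-style approach could look like when the anonymous type is replaced by a named one. This is a sketch under assumptions, not a tested EF solution: ArticleRow and ArticleProjections.ToRow are hypothetical names, and the demo runs against an in-memory list. The predicate's body is reused directly in the projection by building the Select lambda over the predicate's own parameter, so no Expression.Invoke is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Article
{
    public int Id { get; set; }
    public string Text { get; set; } = "";
    public int UserId { get; set; }

    // The reusable rule from the question.
    public static Expression<Func<Article, bool>> CanBeDeletedBy(int userId)
    {
        return a => a.UserId == userId;
    }
}

// Hypothetical named result type, used instead of an anonymous type.
public class ArticleRow
{
    public string Text { get; set; } = "";
    public bool CanBeDeleted { get; set; }
}

public static class ArticleProjections
{
    // Builds: a => new ArticleRow { Text = a.Text, CanBeDeleted = <predicate body> }
    public static Expression<Func<Article, ArticleRow>> ToRow(int userId)
    {
        Expression<Func<Article, bool>> canDelete = Article.CanBeDeletedBy(userId);
        ParameterExpression a = canDelete.Parameters[0]; // reuse the predicate's parameter

        MemberInitExpression body = Expression.MemberInit(
            Expression.New(typeof(ArticleRow)),
            Expression.Bind(typeof(ArticleRow).GetProperty(nameof(ArticleRow.Text)),
                            Expression.Property(a, nameof(Article.Text))),
            Expression.Bind(typeof(ArticleRow).GetProperty(nameof(ArticleRow.CanBeDeleted)),
                            canDelete.Body));

        return Expression.Lambda<Func<Article, ArticleRow>>(body, a);
    }
}

public static class Program
{
    public static void Main()
    {
        IQueryable<Article> articles = new List<Article>
        {
            new Article { Id = 1, Text = "mine", UserId = 7 },
            new Article { Id = 2, Text = "theirs", UserId = 8 }
        }.AsQueryable();

        foreach (ArticleRow row in articles.Select(ArticleProjections.ToRow(7)))
        {
            Console.WriteLine(row.Text + " " + row.CanBeDeleted);
        }
    }
}
```

The property names are looked up through reflection at run time, which is the loss of static checking described above.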

What I did was use PredicateBuilder, which is a class in LinqKit, together with AsExpandable() (http://www.albahari.com/nutshell/linqkit.aspx) to build up expressions, and I stored them statically as
public readonly Expression<Func<T,bool>>
in a static class. Each expression built on a previous expression, thus reducing the amount of duplication.
As the previous answer by Marc Gravell suggests, this kind of thing is a black art, but thankfully a lot of the work has already been done by other people.
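The idea of expressions building on previous expressions can be sketched without LinqKit. The following is a hand-rolled approximation and not LinqKit's actual implementation: the And extension, ParameterRebinder, ArticleRules and the Published property are all invented for this example. It combines two predicates into one by rewriting the second predicate to use the first one's parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class PredicateExtensions
{
    // Rewrites references to one parameter so they point at another.
    private sealed class ParameterRebinder : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterRebinder(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _from ? _to : base.VisitParameter(node);
        }
    }

    // Combines two predicates into a single lambda: x => left(x) && right(x)
    public static Expression<Func<T, bool>> And<T>(
        this Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        ParameterExpression p = left.Parameters[0];
        Expression rightBody = new ParameterRebinder(right.Parameters[0], p).Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(Expression.AndAlso(left.Body, rightBody), p);
    }
}

public class Article
{
    public int Id { get; set; }
    public int UserId { get; set; }
    public bool Published { get; set; } // hypothetical property for the example
}

public static class ArticleRules
{
    public static readonly Expression<Func<Article, bool>> IsDraft = a => !a.Published;

    public static Expression<Func<Article, bool>> OwnedBy(int userId)
    {
        return a => a.UserId == userId;
    }

    // Builds on the two rules above instead of repeating their logic.
    public static Expression<Func<Article, bool>> CanBeDeletedBy(int userId)
    {
        return OwnedBy(userId).And(IsDraft);
    }
}

public static class Program
{
    public static void Main()
    {
        IQueryable<Article> articles = new List<Article>
        {
            new Article { Id = 1, UserId = 7, Published = false },
            new Article { Id = 2, UserId = 7, Published = true },
            new Article { Id = 3, UserId = 8, Published = false }
        }.AsQueryable();

        List<int> deletable = articles
            .Where(ArticleRules.CanBeDeletedBy(7))
            .Select(a => a.Id)
            .ToList();

        Console.WriteLine(string.Join(",", deletable));
    }
}
```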
