EF: Include with where clause [duplicate] - c#

This question already has answers here:
Filtering on Include in EF Core
(9 answers)
Closed 2 years ago.
The community reviewed whether to reopen this question 1 year ago and left it closed:
Original close reason(s) were not resolved
As the title suggest I am looking for a way to do a where clause in combination with an include.
Here is my situations:
I am responsible for the support of a large application full of code smells.
Changing too much code causes bugs everywhere so I am looking for the safest solution.
Let's say I have an object Bus and an object People(Bus has a navigation prop Collection of People).
In my Query I need to select all the Busses with only the Passengers that are awake. This is a simplistic dummy example
In the current code:
var busses = Context.Busses.Where(b=>b.IsDriving == true);
foreach(var bus in busses)
{
var passengers = Context.People.Where(p=>p.BusId == bus.Id && p.Awake == true);
foreach(var person in passengers)
{
bus.Passengers.Add(person);
}
}
After this code the Context is disposed and in the calling method the resulting Bus entities are Mapped to a DTO class (100% copy of Entity).
This code causes multiple calls to DB which is a No-Go, so I found this solution ON MSDN Blogs
This worked great when debugging the result but when the entities are mapped to the DTO (Using AutoMapper) I get an exception that the Context/Connection has been closed and that the object can't be loaded. (Context is always closed can’t change this :( )
So I need to make sure that the Selected Passengers are already loaded (IsLoaded on navigation property is also False). If I inspect the Passengers collection The Count also throws the Exception but there is also a collection on the Collection of Passegers called “wrapped related entities” which contain my filtered objects.
Is there a way to load these wrapped related entities into the whole collection?
(I can't change the automapper mapping config because this is used in the whole application).
Is there another way to Get the Active Passengers?
Any hint is welcome...
Edit
Answer of Gert Arnold doesn't work because the data isn't loaded eagerly.
But when I simplify it and delete the where it is loaded. This is realy strange since the execute sql returns all the passengers in both cases. So there must be a problem when putting the results back into the entity.
Context.Configuration.LazyLoadingEnabled = false;
var buses = Context.Busses.Where(b => b.IsDriving)
.Select(b => new
{
b,
Passengers = b.Passengers
})
.ToList()
.Select(x => x.b)
.ToList();
Edit2
After a lot of struggle the answer of Gert Arnold work!
As Gert Arnold suggested you need to disable Lazy Loading and Keep it OFF.
This will ask for some extra changes to the appliaction since the prev developer loved Lazy Loading -_-

This feature has now been added to Entity Framework core 5. For earlier versions you need a work-around (note that EF6 is an earlier version).
Entity Framework 6 work-around
In EF6, a work-around is to first query the required objects in a projection (new) and let relationship fixup do its job.
You can query the required objects by
Context.Configuration.LazyLoadingEnabled = false;
// Or: Context.Configuration.ProxyCreationEnabled = false;
var buses = Context.Busses.Where(b => b.IsDriving)
.Select(b => new
{
b,
Passengers = b.Passengers
.Where(p => p.Awake)
})
.AsEnumerable()
.Select(x => x.b)
.ToList();
What happens here is that you first fetch the driving buses and awake passengers from the database. Then, AsEnumerable() switches from LINQ to Entities to LINQ to objects, which means that the buses and passengers will be materialized and then processed in memory. This is important because without it EF will only materialize the final projection, Select(x => x.b), not the passengers.
Now EF has this feature relationship fixup that takes care of setting all associations between objects that are materialized in the context. This means that for each Bus now only its awake passengers are loaded.
When you get the collection of buses by ToList you have the buses with the passengers you want and you can map them with AutoMapper.
This only works when lazy loading is disabled. Otherwise EF will lazy load all passengers for each bus when the passengers are accessed during the conversion to DTOs.
There are two ways to disable lazy loading. Disabling LazyLoadingEnabled will re-activate lazy loading when it is enabled again. Disabling ProxyCreationEnabled will create entities that aren't capable of lazy loading themselves, so they won't start lazy loading after ProxyCreationEnabled is enabled again. This may be the best choice when the context lives longer than just this single query.
But... many-to-many
As said, this work-around relies on relationship fixup. However, as explained here by Slauma, relationship fixup doesn't work with many-to-many associations. If Bus-Passenger is many-to-many, the only thing you can do is fix it yourself:
Context.Configuration.LazyLoadingEnabled = false;
// Or: Context.Configuration.ProxyCreationEnabled = false;
var bTemp = Context.Busses.Where(b => b.IsDriving)
.Select(b => new
{
b,
Passengers = b.Passengers
.Where(p => p.Awake)
})
.ToList();
foreach(x in bTemp)
{
x.b.Pasengers = x.Passengers;
}
var busses = bTemp.Select(x => x.b).ToList();
...and the whole thing becomes even less appealing.
Third-party tools
There is a library, EntityFramework.DynamicFilters that makes this a lot easier. It allows you to define global filters for entities, that will subsequently be applied any time the entity is queried. In your case this could look like:
modelBuilder.Filter("Awake", (Person p) => p.Awake, true);
Now if you do...
Context.Busses.Where(b => b.IsDriving)
.Include(b => b.People)
...you'll see that the filter is applied to the included collection.
You can also enable/disable filters, so you have control over when they are applied. I think this is a very neat library.
There is a similar library from the maker of AutoMapper: EntityFramework.Filters
Entity Framework core work-around
Since version 2.0.0, EF-core has global query filters. These can be used to set predefined filter on entities that are to be included. Of course that doesn't offer the same flexibility as filtering Include on the fly.
Although global query filters are a great feature, so far the limitation is that a filter can't contain references to navigation properties, only to the root entity of a query. Hopefully in later version these filters will attain wider usage.

Now EF Core 5.0's Filter Include method now supports filtering of the entities included
var busses = _Context.Busses
.Include(b => b.Passengers
.Where(p => p.Awake))
.Where(b => b.IsDriving);

Disclaimer: I'm the owner of the project Entity Framework Plus
EF+ Query IncludeFilter feature allows filtering related entities.
var buses = Context.Busses
.Where(b => b.IsDriving)
.IncludeFilter(x => x.Passengers.Where(p => p.Awake))
.ToList();
Wiki: EF+ Query IncludeFilter

In my case the Include was an ICollection, and also did not want to return them, I just needed to get the main entities but filtered by the referenced entity. (in other words, Included entity), what I ended up doing is this. This will return list of Initiatives but filtered by InitiativeYears
return await _context.Initiatives
.Where(x => x.InitiativeYears
.Any(y => y.Year == 2020 && y.InitiativeId == x.Id))
.ToListAsync();
Here the Initiatives and the InitiativeYears has following relationship.
public class Initiative
{
public int Id { get; set; }
public string Name { get; set; }
public ICollection<InitiativeYear> InitiativeYears { get; set; }
}
public class InitiativeYear
{
public int Year { get; set; }
public int InitiativeId { get; set; }
public Initiative Initiative { get; set; }
}

For any one still curious about this. there builtin functionality for doing this in EF Core. using .Any inside of a where clause so the code would like similar to something like this
_ctx.Parent
.Include(t => t.Children)
.Where(t => t.Children.Any(t => /* Expression here */))

Related

How does Distinct manage to work on an IQueryable after mapping from entities to DTOs in EF Core?

I have a query similar to this that I am running against our database:
var result = await _context.MyEntities
.Select(x => new SubEntityDto { Id = x.SubEntity.Id })
.Distinct()
.ToListAsync();
The entity SubEntity can be the same entity set on multiple MyEntity.
Since I want a list of SubEntityDto with no duplicates, I run .Distinct() on the resulting IQueryable<SubEntityDto>. This is the SubEntityDto class:
public class SubEntityDto
{
public int Id { get; set; }
}
And this works, but I don't know how it manages to make the list distinct when working with the DTOs. Doesn't .Distinct() use the .Equals method in this scenario? And doesn't .Equals default to reference equality, which checks whether two instances are the same instance?
If I load the list from the database, and then do Distinct(), it doesn't work anymore, like this:
var subEntities = await _context.MyEntities
.Select(x => new SubEntityDto { Id = x.SubEntity.Id })
.ToListAsync();
var distinctSubEntities = subEntities.Distinct().ToList(); // Not distinct.
I'm thinking that it somehow manages to do this when creating the SQL for the SQL query, but can anyone tell me what is happening? I'm puzzled about the fact that it seems to keep track of the entites after they're mapped to DTOs.
As mentioned in the comments to your question, when you create your query it gets translated into SQL query by your provider. So in the first case it is Queryable.Distinct(), however, in the second case you have already run your query against your database by calling ToListAsync() and result of it already translated to DTO's in memory (objects). Therefore Enumerable.Distinct() is run.
You can also use your debugger to see the created query
You can see the created query by your provider by putting a breakpoint and then investigating the contents of the variable DebugView→Query→Text Visualizer will show you the result.

Preventing EF from including related entities

I have a SQL Server database table that has a couple million records in it. I have an MVC site with a page to display data from this table, and I'm running into extensive performance issues.
Running a simple query like this takes about 25-30 seconds to return about two thousand rows:
_dbContext.Contracts
.Where(c => c.VendorID == vendorId)
.ToList();
When I run a query against the database, it only takes a couple seconds.
Turns out, EF is loading all the related entities for my Contract, so it's slowing down my query a ton.
In the debugger, the objects returned are of a strange type, not sure if that's an issue:
System.Data.Entity.DynamicProxies.Contract_3EF6BECBB56F2ADDDA6E0050AC82D03A4E993CEDF4FCA49244D3EE4005572C46
And the same with the related entities on my Contract:
System.Data.Entity.DynamicProxies.Vendor_4FB727808BD6E0BF3B25085B40F3F0B9B10EE4BD17D2A4C600214634F494DB66
The site is a bit old, it's MVC 3 with EF 4. I know on the current version of EF, I have to explicitly use Include() to get related entities, but here it seems to be included automatically.
I have an EDMX file, with a .tt file and entity classes under that, but I don't see anywhere that I can prevent my Courses from getting related objects.
Is there any way for me to do that?
If your MVC controller is returning Entities to the view, the trap you're hitting is that the serializer is iterating over the entities returned and lazy-loading all related data. This is considerably worse than triggering an eager load because in the case of loading collections, this will fetch related entities/sets one parent at a time.
Say I fetch 100 Contracts and contracts contain a Vendor reference.
Eager loading I would use:
context.Contracts.Where(x => /* condition */).Include(x => x.Vendor).ToList();
which would compose 1 query loading all applicable contracts and their vendor details. However, if you let the serializer lazy load Vendors you get effectively the following:
context.Contracts.Where(x => /* condition */).ToList(); // gets applicable contracts...
// This happens behind the scenes for every single related entity touched while serializing...
context.Vendors.Where(x => x.VendorId == 1);
context.Vendors.Where(x => x.VendorId == 1);
// ... continue for each and every contract returned in the above list...
If Contract also has an Employee reference...
context.Employees.Where(x => x.EmployeeId == 16);
context.Employees.Where(x => x.EmployeeId == 12);
context.Employees.Where(x => x.EmployeeId == 11);
... and this continues for every related entity/collection in each contract and each related entity. It adds up, fast. You can see just how crazy it gets by hooking up a profiler to your server and kicking off a read. You expect 1 SQL, but then get hit with hundreds to thousands of calls.
The best way to avoid this is to simply not return entities from controllers, instead compose a view model with just the detail you want to display and use .Select() or Automapper's .ProjectTo<ViewModel>() to populate it from an EF query. This avoids falling into the trap of having a serializer touching lazy load properties, and also minimizes the payload sent to the client.
So if I wanted to display a list of contracts for a vendor and I only needed to display the Contract ID, the contract #, and a dollar figure:
[Serializable]
public class ContractSummaryViewModel
{
public int ContractId { get; set; }
public string ContractNumber { get; set; }
public decimal Amount { get; set; }
}
var contracts = _dbContext.Contracts
.Where(c => c.VendorID == vendorId)
.Select( c => new ContractSummaryViewModel
{
ContractId = c.ContractId,
ContractNumber = c.ContractNumber,
Amount = c.Amount
})
.ToList();
You can include details from related entities into the view model or compose related view models for key details, all without having to worry about using .Include() or tripping lazy loading. This composes a single SQL statement to load just the data you need, and passes just that back to the UI. By streamlining the payload the performance can increase quite dramatically.

Entity Framework Code First - select based on type of child TPH class

I am currently working with Entity Framework 4 on a project that is using Table Per Hierarchy to represent one set of classes using a single table. The way this works is that this table represents states and different states are associated with different other classes.
So you might imagine it to look like this, ignoring the common fields all states share:
InactiveState
has a -> StateStore
ActiveState
has a -> ActivityLocation
CompletedState
has a -> CompletionDate
Each state has a collection of all the InventoryItems that belong to it.
Now each item in my inventory has many states, where the last one in the history is the current state. To save on lists I have a shortcut field that points to the current state of my Inventory:
public class InventoryItem : Entity
{
// whole bunch of other properties
public virtual ICollection<InventoryState> StateHistory { get; set; }
public virtual InventoryState LastState { get; set; }
}
The first problem I am having is when I want to find, for example, all the InventoryItems which are in the Active state.
It turns out Linq-To-Entities doesn't support GetType() so I can't use a statement like InventoryRepository.Get().Where( x => x.LastState.GetType() == someType ). I can use the is operator, but that requires a fixed type so rather than being able to have a method like this:
public ICollection<InventoryItem> GetInventoryItemsByState( Type state )
{
return inventoryRepository.Get().Where( x => x.LastState is state );
}
I have to run some kind of if statement based on the type before I make the Linq query, which feels wrong. The InventoryItem list is likely to get large, so I need to do this at the EF level, I can't pull the whole list into memory and use GetType(), for example.
I have two questions in this situation, connected closely enough that I think they can be combined as they probably reflect a lack of understanding on my part:
Is it possible to find a list of items that share a child table type using Linq To Entities?
Given that I am not using Lazy Loading, is it possible to Include related items for child table types using TPH so that, for example, if I have an InactiveState as the child of my InventoryItem I can preload the StateStore for that InactiveState?
Is it possible to find a list of items that share a child table type
using Linq To Entities?
I don't think it's possible in another way than using an if/switch that checks for the type and builds a filter expression using is T or OfType<T>. You could encapsulate this logic into an extension method for example to have a single place to maintain and a reusable method:
public static class Extensions
{
public static IQueryable<InventoryItem> WhereIsLastState(
this IQueryable<InventoryItem> query, Type state)
{
if (state == typeof(InactiveState))
return query.Where(i => i.LastState is InactiveState);
if (state == typeof(ActiveState))
return query.Where(i => i.LastState is ActiveState);
if (state == typeof(CompletedState))
return query.Where(i => i.LastState is CompletedState);
throw new InvalidOperationException("Unsupported type...");
}
}
To be used like this:
public ICollection<InventoryItem> GetInventoryItemsByState(Type state)
{
return inventoryRepository.Get().WhereIsLastState(state).ToList();
}
I don't know if it would be possible to build the i => i.LastState is XXX expression manually using the .NET Expression API and based on the Type passed into the method. (Would interest me too, to be honest, but I have almost no clue about expression manipulation to answer that myself.)
Given that I am not using Lazy Loading, is it possible to Include
related items for child table types using TPH so that, for example, if
I have an InactiveState as the child of my InventoryItem I can preload
the StateStore for that InactiveState?
I am not sure if I understand that correctly but generally eager loading with Include does not support any filtering or additional operations on specific included children.
One way to circumvent this limitation and still get the result in a single database roundtrip is using a projection which would look like this:
var result = context.InventoryItems
.Select(i => new
{
InventoryItem = i,
LastState = i.LastState,
StateStore = (i.LastState is InactiveState)
? (i.LastState as InactiveState).StateStore
: null
})
.AsEnumerable()
.Select(x => x.InventoryItem)
.ToList();
If the query is a tracked query (which it is in the example above) and the relationships are not many-to-many (they are not in your example) the context will fixup the relationships when the entities are loaded into the context, that is InventoryItem.LastState and InventoryItem.LastState.StateStore (if LastState is of type InactiveState) will be set to the loaded entities (as if they had been loaded with eager loading).

EF Lambda: The Include path expression must refer to a navigation property [duplicate]

This question already has answers here:
EF: Include with where clause [duplicate]
(5 answers)
Closed 5 years ago.
Here is my expression:
Course course = db.Courses
.Include(
i => i.Modules.Where(m => m.IsDeleted == false)
.Select(s => s.Chapters.Where(c => c.IsDeleted == false))
).Include(i => i.Lab).Single(x => x.Id == id);
I know the cause is Where(m => m.IsDeleted == false) in the Modules portion, but why does it cause the error? More importantly, how do I fix it?
If I remove the where clause it works fine but I want to filter out deleted modules.
.Include is used to eagerly load related entities from the db. I.e. in your case make sure the data for modules and labs is loaded with the course.
The lamba expression inside the .Include should be telling Entity Framework which related table to include.
In your case you are also trying to perform a condition inside of the include, which is why you are receiving an error.
It looks like your query is this:
Find the course matching a given id, with the related module and lab. As long as the matching module and chapter are not deleted.
If that is right, then this should work:
Course course = db.Courses.Include(c => c.Modules)
.Include(c => c.Lab)
.Single(c => c.Id == id &&
!c.Module.IsDeleted &&
!c.Chapter.IsDeleted);
but why does it cause the error?
I can imagine that sometimes the EF team regrets the day they introduces this Include syntax. The lambda expressions suggest that any valid linq expression can be used to subtly manipulate the eager loading. But too bad, not so. As I explained here the lambdas only serve as a disguised string argument to the underlying "real" Include method.
how do I fix it?
Best would be to project to another class (say, a DTO)
db.Courses.Select(x => new CourseDto {
Id = x.Id,
Lab = x.Lab,
Modules = x.Modules.Where(m => !m.IsDeleted).Select( m => new ModuleDto {
Moudle = m,
Chapters = x.Chapters.Where(c => c.IsDeleted)
}
}).Single(x => x.Id == id);
but that may be a major modification for you.
Another option is to disable lazy loading and pre-load the non-deleted Modules and Chapters of the course in the context by the Load command. Relationship fixup will fill the right navigation properties. The Include for Lab will work normally.
By the way, there is a change request for this feature.

Entity Framework - Eager loading of subclass related objects

I wonder if there is a possibility to eager load related entities for certain subclass of given class.
Class structure is below
Order has relation to many base suborder classes (SuborderBase). MySubOrder class inherits from SuborderBase. I want to specify path for Include() to load MySubOrder related entities (Customer) when loading Order, but I got an error claiming that there is no relation between SuborderBase and Customer. But relation exists between MySubOrder and Customer.
Below is query that fails
Context.Orders.Include("SubOrderBases").Include("SubOrderBases.Customers")
How can I specify that explicitly?
Update. Entity scheme is below
This is a solution which requires only a single roundtrip:
var orders = Context.Orders
.Select(o => new
{
Order = o,
SubOrderBases = o.SubOrderBases.Where(s => !(s is MyOrder)),
MyOrdersWithCustomers = o.SubOrderBases.OfType<MyOrder>()
.Select(m => new
{
MyOrder = m,
Customers = m.Customers
})
})
.ToList() // <- query is executed here, the rest happens in memory
.Select(a =>
{
a.Order.SubOrderBases = new List<SubOrderBase>(
a.SubOrderBases.Concat(
a.MyOrdersWithCustomers.Select(m =>
{
m.MyOrder.Customers = m.Customers;
return m.MyOrder;
})));
return a.Order;
})
.ToList();
It is basically a projection into an anonymous type collection. Afterwards the query result is transformed into entities and navigation properties in memory. (It also works with disabled tracking.)
If you don't need entities you can omit the whole part after the first ToList() and work directly with the result in the anonymous objects.
If you must modify this object graph and need change tracking, I am not sure if this approach is safe because the navigation properties are not completely set when the data are loaded - for example MyOrder.Customers is null after the projection and then setting relationship properties in memory could be detected as a modification which it isn't and cause trouble when you call SaveChanges.
Projections are made for readonly scenarios, not for modifications. If you need change tracking the probably safer way is to load full entities in multiple roundtrips as there is no way to use Include in a single roundtrip to load the whole object graph in your situation.
Suppose u loaded the orders list as lstOrders, try this:
foreach (Orders order in lstOrders)
order.SubOrderBases.Load();
and the same for the customers..

Categories

Resources