EF Core: Load related tables in query vs LoadAsync - c#

In EF Core, suppose my database model contains a Patient entity with related collections like this:
public class Patient
{
// Patient properties
public int Id { get; set; }
public string Name { get; set; }
// Navigation properties
public List<Allergy> Allergies { get; set; }
public List<Medication> Medications { get; set; }
public List<Symptom> Symptoms { get; set; }
etc.
}
public class Allergy
{
public int Id { get; set; }
<more properties>
// Navigation properties
public int PatientId { get; set; } // FK to the related Patient
public Patient Patient { get; set; }
}
In my actual code, I have 8 of these types of relations. If I fetch a Patient and its related collections via LINQ, EF will create a monster query with 8 joins:
var qry = from patient in db.Patients
.Include(p => p.Allergies)
.Include(p => p.Medications)
.Include(p => p.Symptoms)
.Include(<etc.>)
where patient.Id == <desired Id>
select patient;
var Patient = await qry.FirstOrDefaultAsync();
My other option is to get the Patient without loading the related collections, then make 8 trips to the database to load the related collections:
var qry = from patient in db.Patients
where patient.Id == <desired Id>
select patient;
var Patient = await qry.FirstOrDefaultAsync();
await db.Entry(Patient).Collection(p => p.Allergies).LoadAsync();
await db.Entry(Patient).Collection(p => p.Medications).LoadAsync();
await db.Entry(Patient).Collection(p => p.Symptoms).LoadAsync();
etc.
Which strategy is more efficient (faster to execute)? And is there a "happy middle ground" where I load the basic entity with a simple SQL query, then make only a single trip to the database to load all of the desired related collections?
(I've found SO "answers" like this one, that rely on current EF behaviors such as "This works, because the context will detect that entity you are looking for is loaded and attached already and therefore not create a new one but rather update the old one". My problem with those types of solutions is that they may stop working when the EF implementation details change.)

You can use split queries, available since EF Core 5.0: just call .AsSplitQuery() at the end of the .Include() chain.
EF Core lets you specify that a given LINQ query should be split into multiple SQL queries. Instead of JOINs, a split query generates one additional SQL query for each included collection navigation.
var qry =
from patient in db.Patients
.Include(p => p.Allergies)
.Include(p => p.Medications)
.Include(p => p.Symptoms)
.Include(/* etc. */)
.AsSplitQuery() // <-- add this method call
where patient.Id == <desired Id>
select patient;

Related

EF core load data from child object using FK

I have a class of Agent
public class Agent
{
public string Id { get; set; }
public string Name { get; set; }
public List<AgentUser> Users { get; set; }
}
public class User
{
public string Id { get; set; }
public string Email { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
public class AgentUser
{
public string Id { get; set; }
public string AgentId { get; set; }
public Agent Agent { get; set; }
public string UserId { get; set; }
public User User { get; set; }
}
The relationship is one-to-many. The AgentUsers table only contains the UserId. I want to know how to load the user details (Email, FirstName, LastName) when I query the Agents table.
This is my query, but it only contains UserId.
var result= await _context.Agents
.Include(au => au.Users)
.Where(x => x.Id.Equals(request.Id))
.AsNoTracking()
.FirstOrDefaultAsync();
Yes, I can do a JOIN, but how can I get the details from the User table in the most efficient way using only the FK?
Thanks!
If you want to use the joining entity in the relationships like your example then the other option when reading data is to use projection rather than loading the entire entity graph:
var result= await _context.Agents
.Where(a => a.Id.Equals(request.Id))
.Select(a => new AgentSummaryViewModel
{
AgentId = a.Id,
// other details from agent...
Users = a.Users.Select(u => new UserSummaryViewModel
{
UserId = u.User.Id,
Email = u.User.Email,
// ...
}).ToList()
})
.SingleOrDefaultAsync();
This can be simplified using Automapper and its ProjectTo() method against an EF IQueryable once a mapper is configured with how to build an Agent and User ViewModel. Projection has the added advantage of loading only as much data from the entity graph as your consumer (view, etc.) needs, which produces efficient queries. There is no need to eager load related entities or worry about tracking references since the entities themselves are not being loaded.
If you want to load that Agent and their Users for editing purposes then:
var agent = await _context.Agents
.Include(a => a.Users)
.ThenInclude(u => u.User)
.Where(a => a.Id.Equals(request.Id))
.SingleOrDefaultAsync();
Alternatively, with Many-to-Many relationships, if the joining table can be represented with just the two FKs as a composite PK then EF can map this table by convention without needing to declare the AgentUser as an entity. You only need to define the AgentUser if you want other details recorded and accessible about the relationship. So for example if you want to use a soft-delete system with an IsActive flag or such and removing an association between agent and user resulted in that record being made inactive rather than deleted, you would need an AgentUser entity.
So if you can simplify your AgentUser table to just:
AgentId [PK, FK(Agents)]
UserId [PK, FK(Users)]
... then your Entity model can be simplified to:
public class Agent
{
public string Id { get; set; }
public string Name { get; set; }
public List<User> Users { get; set; }
}
The entity mapping would need to be added explicitly, such as in OnModelCreating or via a registered IEntityTypeConfiguration implementation, so that:
modelBuilder.Entity<Agent>()
.HasMany(x => x.Users)
.WithMany(x => x.Agents);
Unfortunately with EF Core, many-to-many relationships mapped this way require navigation properties on both sides. (Agent has Users, and User has Agents)
EF should be able to work out the linking table by convention, but if you want more control over the table name etc. you can explicitly tell it what to use. This is where it can get a little tricky, but you can either do it by defining a joining entity, or without one, using a "shadow" entity. (Essentially a dictionary)
modelBuilder.Entity<Agent>()
.HasMany(x => x.Users)
.WithMany(x => x.Agents)
.UsingEntity<Dictionary<string, object>>(
"AgentUsers",
l => l.HasOne<Agent>().WithMany().HasForeignKey("AgentId"),
r => r.HasOne<User>().WithMany().HasForeignKey("UserId"),
j =>
{
j.HasKey("AgentId", "UserId");
j.ToTable("AgentUsers"); // Or "UserAgents", or "AgentUserLinks" etc.
});
Edit: A simple example with Automapper...
var config = new MapperConfiguration(cfg => {
cfg.CreateMap<Agent, AgentSummaryViewModel>();
cfg.CreateMap<User, UserSummaryViewModel>();
});
var result= await _context.Agents
.Where(a => a.Id.Equals(request.Id))
.ProjectTo<AgentSummaryViewModel>(config)
.SingleOrDefaultAsync();
"config" above can be extracted from a global Mapper set up with all of your mapper configurations, created ad-hoc as needed, or what I typically do is derive from mapping configuration expressions built as static methods in the view models themselves:
var config = new MapperConfiguration(AgentSummaryViewModel.BuildMapperConfig());
... then in the AgentSummaryViewModel:
public static MapperConfigurationExpression BuildMapperConfig(MapperConfigurationExpression config = null)
{
if (config == null)
config = new MapperConfigurationExpression();
config = UserSummaryViewModel.BuildMapperConfig(config);
config.CreateMap<Agent, AgentSummaryViewModel>()
.ForMember(x => x.Users, opt => opt.MapFrom(src => src.Users.Select(u => u.User)));
// append any other custom mappings such as renames or flattening from related entities, etc.
return config;
}
The UserSummaryViewModel would have a similar BuildMapperConfig() method to set up converting a User to UserSummaryViewModel. The resulting mapper configuration expression can chain as many needed related view models as you need, located under their respective view models.

Filtering with EF Core through multiple nested object relationships

I have a new requirement to filter the reports returned from a query based on what amounts to a blacklist.
So originally I had this query which brings back all the pertinent information required and gets a list of reports associated to that particular user.
var reports = from r in Context.Reports
join ax in Context.AuthorizationXref on r.Id equals ax.ReportId
join g in Context.Groups on ax.GroupId equals g.Id
join ugx in Context.UsersGroupsXref on g.Id equals ugx.GroupId
where ugx.UserId == id
select r;
AuthorizationXref has ReportID and GroupId.
public partial class AuthorizationXref
{
public int Id { get; set; }
public int ReportId { get; set; }
public int GroupId { get; set; }
public virtual Groups Group { get; set; }
public virtual Reports Report { get; set; }
}
public partial class Groups
{
public Groups()
{
AuthorizationXref = new HashSet<AuthorizationXref>();
UsersGroupsXref = new HashSet<UsersGroupsXref>();
}
public int Id { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public virtual ICollection<AuthorizationXref> AuthorizationXref { get; set; }
public virtual ICollection<UsersGroupsXref> UsersGroupsXref { get; set; }
}
User many to many Groups many to many Reports (the many to many is done through the XREFs).
I have benchmarked this at 700ms, which seems really slow, and I realize I am making at least 4 hits to the DB. So I tried the following, which returns more or less the same data but is about 7 times faster:
var repo = context.Reports
.Include(x => x.AuthorizationXref)
.ThenInclude(x => x.Group)
.ThenInclude(x => x.UsersGroupsXref)
.ToList();
This benched at about 100ms but doesn't do any filtering on userId. These benchmarks are in Dev and will only worsen as we get into the higher environments where more and more reports are added to users. I know that using Selects is more efficient but I can't find examples of complex nested many-to-many selects on entities.
I can get about this far but don't know where to put the additional steps to drill deeper into the object.
var asdf = repo.Where(x=>x.AuthorizationXref.Any(y=>y.ReportId==x.Id));
End Goal is I need a list of Reports by userId removing those reports that appear by ID on another table. So there is a table called UserReportFilter and it has ReportId and UserId, any reports in that table should not appear in my end result.
As a side note if anyone can point me in the direction of a tutorial (preferably one that doesn't assume the reader knows absolutely everything) on how to use the Expressions I would appreciate it. I ran across this article and it would seem like a powerful thing to learn for things like this, however I will need a bit more meat in the explanation. I understand the concept and I have used basic Func returns for queries but nothing that extensive.
I assume you have a relation between the Reports and UserReportFilter tables, since you have ReportId in the UserReportFilter table. The expression below should work:
var reports = context.Reports
.Include(x => x.AuthorizationXref)
.ThenInclude(x => x.Group)
.ThenInclude(x => x.UsersGroupsXref)
.Include(x => x.UserReportFilters)
.Where(x => x.AuthorizationXref.Any(y => y.Group.UsersGroupsXref.Any(z => z.UserId == id)))
.ToList();
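The expression above filters by user but does not yet remove the blacklisted reports. Here is a minimal sketch of that exclusion step, written as plain in-memory LINQ so it can run without a database. The Report and UserReportFilter classes are simplified, hypothetical stand-ins for the real entities, and the UserReportFilters navigation is assumed to exist as in the answer. Against EF Core, the same !Any(...) predicate could be combined into the Where clause.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-ins for the real entities.
public class UserReportFilter
{
    public int ReportId { get; set; }
    public int UserId { get; set; }
}

public class Report
{
    public int Id { get; set; }
    // Assumed navigation: blacklist rows that reference this report.
    public List<UserReportFilter> UserReportFilters { get; set; } = new List<UserReportFilter>();
}

public static class ReportFiltering
{
    // Keeps only the reports that have no blacklist row for the given user.
    public static List<Report> ExcludeBlacklisted(IEnumerable<Report> reports, int userId)
    {
        return reports
            .Where(r => !r.UserReportFilters.Any(f => f.UserId == userId))
            .ToList();
    }
}
```

Calling ReportFiltering.ExcludeBlacklisted(reports, id) on a list where one report has a filter row for that user returns the list without that report.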

Pull data from multiple tables in one SQL query using LINQ and Entity Framework (Core)

I want to pull data from multiple tables using LINQ in my .NET Core application. Here's an example:
public class Customer {
public Guid Id { get; set; }
public string Name { get; set; }
public DateTime Created { get; set; }
public HashSet<Transaction> Transactions { get; set; }
}
public class Transaction {
public Guid Id { get; set; }
public decimal Amount { get; set; }
public DateTime Created { get; set; }
public Guid CustomerId { get; set; }
public Customer Customer { get; set; }
}
These have a one-to-many relation in my solution. One customer has many transactions and one transaction has one customer. If I wanted to grab the 10 latest transactions and 10 latest customers in one LINQ query, how would I do that? I've read that .Union() should be able to do it, but it won't work for me. Example:
var ids = _context
.Customers
.OrderByDescending(x => x.Created)
.Take(10)
.Select(x => x.Id)
.Union(_context
.Transactions
.OrderByDescending(x => x.Created)
.Take(10)
.Select(x => x.CustomerId)
)
.ToList();
This gives me two lists of type Guid, but they contain the same elements. Not sure if it's just me who understands this wrong, but it seems a bit weird. I am happy as long as it asks the database once.
You wrote:
I wanted to grab the 10 latest transactions and 10 latest customers in one LINQ query
It is a bit unclear what you want. I doubt that you want one sequence with a mix of Customers and Transactions. I guess that you want the 10 newest Customers, each with their last 10 Transactions?
I wonder why you would deviate from the entity framework code first conventions. If your class Customer represents a row in your database, then surely it doesn't have a HashSet<Transaction>?
A one-to-many of a Customer with his Transactions should be modeled as follows:
class Customer
{
public int Id {get; set;}
... // other properties
// every Customer has zero or more Transactions (one-to-many)
public virtual ICollection<Transaction> Transactions {get; set;}
}
class Transaction
{
public int Id {get; set;}
... // other properties
// every Transaction belongs to exactly one Customer, using foreign key
public int CustomerId {get; set;}
public virtual Customer Customer {get; set;}
}
public class MyDbContext : DbContext
{
public DbSet<Customer> Customers {get; set;}
public DbSet<Transaction> Transactions {get; set;}
}
This is all that entity framework needs to know to detect the tables you want to create, to detect your one-to-many relationship, and to detect the primary keys and foreign keys. Only if you want different names for tables or columns will you need attributes and/or the fluent API.
The major difference between my classes and yours is that the one-to-many relation is represented by virtual properties, and that the collection is declared as an ICollection. A HashSet is an ICollection, but your Transactions table is a collection of rows, not a HashSet.
In entity framework the columns of your tables are represented by non-virtual properties; the virtual properties represent the relations between the tables (one-to-many, many-to-many, ...)
Quite a lot of people tend to (group-)join tables, when they are using entity framework. However, life is much easier if you use the virtual properties
Back to your question
I want (some properties of) the 10 newest Customers, each with (several properties of) their 10 latest Transactions
var query = dbContext.Customers // from the collection of Customer
.OrderByDescending(customer => customer.Created) // order this by descending Creation date
.Select(customer => new // from every Customer select the
{ // following properties
// select only the properties you actually plan to use
Id = customer.Id,
Created = customer.Created,
Name = customer.Name,
...
LatestTransactions = customer.Transactions // Order the customer's collection
.OrderByDescending(transaction => transaction.Created) // of Transactions, newest first
.Select(transaction => new // and select the properties
{
// again: select only the properties you plan to use
Id = transaction.Id,
Created = transaction.Created,
...
// not needed you know it equals Customer.Id
// CustomerId = transaction.CustomerId,
})
.Take(10) // take only the first 10 Transactions
.ToList(),
})
.Take(10); // take only the first 10 Customers
Entity framework knows the one-to-many relationship and recognizes that a group-join is needed for this.
One of the slower parts of your query is the transfer of the selected data from the DBMS to your local process. Hence it is wise to limit the selected data to the data you actually plan to use. If Customer with Id 4 has 1000 Transactions, it would be a waste to transfer the foreign key for every Transaction, because you know it has value 4.
If you really want to do the join yourself:
var query = dbContext.Customers // GroupJoin customers and Transactions
.GroupJoin(dbContext.Transactions,
customer => customer.Id, // from each Customer take the primary key
transaction => transaction.CustomerId, // from each Transaction take the foreign key
(customer, transactions) => new // take the customer with his matching transactions
{ // to make a new:
Id = customer.Id,
Created = customer.Created,
...
LatestTransactions = transactions
.OrderByDescending(transaction => transaction.Created) // newest first
.Select(transaction => new
{
Id = transaction.Id,
Created = transaction.Created,
...
})
.Take(10)
.ToList(),
})
.Take(10);
Try the following. I modeled your database _context as a class just so I could test the syntax. Remember that one customer may map to more than one transaction. You may want to group by Id so you get 10 different customers.
class Program
{
static void Main(string[] args)
{
Context _context = new Context();
var ids = (from c in _context.customers
join t in _context.transactions on c.Id equals t.CustomerId
select new { c = c, t = t})
.OrderByDescending(x => x.c.Created)
.Take(10)
.ToList();
}
}
public class Context
{
public List<Customer> customers { get; set; }
public List<Transaction> transactions { get; set; }
}
public class Customer
{
public Guid Id { get; set; }
public string Name { get; set; }
public DateTime Created { get; set; }
public HashSet<Transaction> Transactions { get; set; }
}
public class Transaction
{
public Guid Id { get; set; }
public decimal Amount { get; set; }
public DateTime Created { get; set; }
public Guid CustomerId { get; set; }
public Customer Customer { get; set; }
}
You may want to try this instead :
var ids = (from c in _context.customers
join t in _context.transactions on c.Id equals t.CustomerId
select new { c = c, t = t})
.OrderByDescending(x => x.c.Created)
.GroupBy(x => x.c.Id)
.SelectMany(x => x.Take(10))
.ToList();
Eliminating the Join will speed up results. You always can get the customer info in another query.
var transactions = _context.transactions
.OrderByDescending(x => x.Created)
.GroupBy(x => x.CustomerId)
.Select(x => x.Take(10))
.ToList();
try this:
var customers = customerService.GetAll().OrderByDescending(c => c.Created).Take(10).ToList().AsQueryable();
var transactions = transactionService.GetAll().OrderByDescending(t => t.Created).Take(10).ToList().AsQueryable();
transactions = transactions.Where(t => customers.Any(c => c.Id == t.CustomerId));

How to select properties of simulated many-to-many in EFCore with single SQL generated query

EFCore does not support many-to-many relationships without creating a linking entity. I need to efficiently select a subset of properties from the 'other end' of the one-to-many-to-one relationship.
I'd swear this would have an answer already but haven't found it.
With these Models:
public class Book
{
public int BookId { get; set; }
public string Title { get; set; }
public Author Author { get; set; }
public ICollection<BookCategory> BookCategories { get; set; }
}
public class Category
{
public int CategoryId { get; set; }
public string CategoryName { get; set; }
public string ExtraProperties {get; set; }
public ICollection<BookCategory> BookCategories { get; set; }
}
public class BookCategory
{
public int BookId { get; set; }
public Book Book { get; set; }
public int CategoryId { get; set; }
public Category Category { get; set; }
}
This question is an extension of a similar, but different question titled, "Select specific properties from include ones in entity framework core"
I am looking for a query that returns a List<string> categoryNames of the Categories of the Book.
This nested select, using "projection" results in multiple SQL Queries:
var result= await _ctx.Books
.Where(x => x.BookId == id)
.Select(x => x.BookCategories
.Select(y => y.Category.CategoryName )
.ToList())
.FirstOrDefaultAsync();
Any solution with .Include(x => x.BookCategories).ThenInclude(x => x.Category) will load all the data from the server before applying the select.
Is there any query that meets the following criteria?:
Only generates 1 SQL query
Does not load the entire linking entity and or the entire navigation property 2 'hops' in.
Returns only List<string> of CategoryNames.
I infer from this Entity Framework Core generates two select queries for one-to-many relationship, it's not possible.
In general, you cannot control the generated SQL and how many SQL queries are executed by ORM. And EF Core at the time of writing (version 2.0.2) is known to produce N + 1 queries when the query contains collection projection. This is fixed in the next 2.1 release, but still will generate and execute at least 2 queries.
But every rule has exceptions. Since you want to return only a single related collection projection, you can simply use SelectMany instead of the original Select + FirstOrDefault construct. The two are equivalent in this scenario, but EF Core is not smart enough to treat the latter the same way as the former, which is understandable given how many other cases need to be considered. The good thing is that rewriting the LINQ query this way produces the desired single SQL query translation:
var result = await _ctx.Books
.Where(x => x.BookId == id)
.SelectMany(x => x.BookCategories
.Select(y => y.Category.CategoryName))
.ToListAsync();
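To illustrate why the two shapes return the same data when exactly one book matches, here is a small in-memory sketch using LINQ to Objects rather than EF Core. The trimmed-down Book, BookCategory and Category classes and the CategoryQueries helper are hypothetical, for illustration only, and the sketch says nothing about the SQL that EF Core generates.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed-down, hypothetical versions of the question's models.
public class Category { public string CategoryName { get; set; } }
public class BookCategory { public Category Category { get; set; } }
public class Book
{
    public int BookId { get; set; }
    public List<BookCategory> BookCategories { get; set; } = new List<BookCategory>();
}

public static class CategoryQueries
{
    // Original shape: one inner list per matching book, then take the first list.
    // FirstOrDefault yields null when no book matches, so normalize to an empty list.
    public static List<string> ViaSelect(IEnumerable<Book> books, int id) =>
        books.Where(b => b.BookId == id)
             .Select(b => b.BookCategories.Select(bc => bc.Category.CategoryName).ToList())
             .FirstOrDefault() ?? new List<string>();

    // Rewritten shape: flatten the inner collections directly.
    public static List<string> ViaSelectMany(IEnumerable<Book> books, int id) =>
        books.Where(b => b.BookId == id)
             .SelectMany(b => b.BookCategories.Select(bc => bc.Category.CategoryName))
             .ToList();
}
```

One behavioral difference to keep in mind: when no book matches, the Select + FirstOrDefault form yields null, while the SelectMany form yields an empty list.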

Linq Projection to DTO for Nested/Hierarchical Collections

I have a User and Product entities which are defined as follows:
public class User {
Guid Id { get; set; }
Guid ParentId { get; set; }
ICollection<Product> PermittedProducts { get; set; }
ICollection<User> Children { get; set; }
}
public class Product {
int Id { get; set; }
string Name { get; set; }
ICollection<User> PermittedUsers { get; set; }
}
Conceptually, a Product has a collection of PermittedUsers - ie users that can purchase the product. Additionally, each User has a collection of PermittedProducts, as well as a collection of child users, who also have their own collection of PermittedProducts.
I need to run a query via a repository to return a list of products. The repository method and DTO are defined as:
public ICollection<ProductListDto> GetProductsForUser(Guid userId) {
// Linq query here
}
public class ProductListDto {
int Id { get; set; }
string Name { get; set; }
ICollection<User> Users { get; set; }
}
The repository method needs to take a Guid userId and retrieve the PermittedProducts for that User AND the PermittedProducts for the user's children.
For example, if a product is available for a user and his two children, then the ProductListDto would have all three users in its Users collection.
As a further example, if a product is not available for a user, but it is available for his children, then this would need to be returned as well.
Both the Product and User are available as aggregate roots, so I can use either a ProductRepository or UserRepository to query through, via EntityFramework's DbSet.
At the moment my repository method is in the UserRepository (but could move to the ProductRepository if the query is simpler) and looks like:
public ICollection<ProductListDto> GetProductsForUser(Guid userId) {
// Linq query here - Set is the EF DbSet<User>
var products = from u in
Set.Where(x => x.Id == userId) //.... NOT SURE ABOUT THE REST!
}
My problem is I cannot work out how to write the Linq query to achieve what I need to do!
EDIT
The answers so far don't address how to achieve the projection to the ProductListDto
I would approach this by building a list of Ids from the parent UserId, so that it contains the parent's UserId and the Ids of all of its children. From this list we can then select the Products whose PermittedUsers contain one of these Ids. That gives you the product list.
var childIds = DbContext.Users.Where(x => x.Id == userId).SelectMany(y => y.Children.Select(z => z.Id)).ToList();
childIds.Add(userId);
var products = DbContext.Products.Where(x => x.PermittedUsers.Any(y => childIds.Contains(y.Id))).ToList();
Try this
userRepository.Where(u => u.Id == userId || u.ParentId == userId)
.SelectMany(u => u.PermittedProducts)
.GroupBy(p => p.Id)
.Select(u => u.First());
Line explanation
- 1) Retrieves target user and his children
- 2) Select all products from user collection we get in first line
- 3- 4) Remove duplicates.
Note that such a LINQ query can be translated into a heavy SQL query that hurts performance. It may be better to call ToList after line 1, 2 or 3. It is also possible to write your own SQL query, store it in SQL Server (for example as a stored procedure) and call it from code by name.
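Neither answer shows the projection to ProductListDto that the question's edit asks for. The following is a hedged in-memory sketch of one way to shape it, assuming only direct children need to be considered and that a child is identified by its ParentId. The trimmed User and Product classes and the ProductQueries helper are hypothetical, and whether an equivalent query translates to SQL depends on the EF version in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down, hypothetical versions of the question's types.
public class User
{
    public Guid Id { get; set; }
    public Guid ParentId { get; set; }
}

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<User> PermittedUsers { get; set; } = new List<User>();
}

public class ProductListDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<User> Users { get; set; }
}

public static class ProductQueries
{
    public static List<ProductListDto> GetProductsForUser(IEnumerable<Product> products, Guid userId)
    {
        return products
            // Keep products permitted for the user or for one of the user's direct children.
            .Where(p => p.PermittedUsers.Any(u => u.Id == userId || u.ParentId == userId))
            .Select(p => new ProductListDto
            {
                Id = p.Id,
                Name = p.Name,
                // Only the user and the user's children end up on the DTO.
                Users = p.PermittedUsers
                    .Where(u => u.Id == userId || u.ParentId == userId)
                    .ToList()
            })
            .ToList();
    }
}
```

A product that is permitted only for a child is still returned, with just that child in its Users collection, which matches the second example in the question.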
