In my Entity Framework 6 application, I have a table of people's email addresses:
public class EmailAddress
{
public int Id { get; set; }
public int PersonId { get; set; }
public string Address { get; set; } // renamed: a member cannot share the name of its enclosing type (CS0542)
public virtual Person Person { get; set; }
}
And the Person object references these email addresses also:
public class Person
{
public int Id { get; set; }
{...}
public virtual ICollection<EmailAddress> EmailAddresses { get; set; }
}
If I want to get all the email addresses for a single person and check whether the person actually exists, would it be more efficient to either:
Run an Any() query on the Persons table and then another query on the EmailAddresses table, using PersonId as a parameter:
public IEnumerable<EmailAddress> GetAddressesByPerson(int personId)
{
if (!Context.Persons.Any(x => x.Id == personId))
{
throw new Exception("Person not found");
}
return Context.EmailAddresses.Where(x => x.PersonId == personId).ToList();
}
Get the Person object and return the EmailAddresses navigation property:
public IEnumerable<EmailAddress> GetAddressesByPerson(int personId)
{
var person = Context.Persons.Find(personId);
if (person == null)
{
throw new Exception("Person not found");
}
return person.EmailAddresses;
}
With the first solution, EF generates a SQL query containing an EXISTS clause. Then, if the person exists, you execute a completely separate second query against the database.
With the second solution, you send just a select ... from Persons where ... statement. Because EmailAddresses is a navigation property, if lazy loading is enabled EF will generate and execute a query against the EmailAddresses table, based on the person's ID, the first time the property is accessed. If lazy loading is not enabled, EmailAddresses will be null or empty.
As a third option you can use eager loading, which lets EF generate a join query and fetch the person and the related EmailAddresses in one go.
So, if you mostly expect a valid personId, you can switch to eager loading. Lazy loading is mostly helpful in scenarios where you need the related entities only in some cases.
By the way, I suggest turning on logging in EF to see the generated queries.
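A minimal sketch of that logging suggestion, assuming EF 6's Database.Log property (an Action<string> that receives each generated command). The EfLogging class and its method name are hypothetical helpers introduced only for illustration; any DbContext can be passed in.

```csharp
using System;
using System.Data.Entity;

// Hypothetical helper, not part of EF itself.
public static class EfLogging
{
    // Routes every command EF 6 generates for this context to the console.
    // Any Action<string> works here, e.g. one that appends to a file.
    public static void EnableSqlLogging(DbContext context)
    {
        if (context == null) throw new ArgumentNullException(nameof(context));
        context.Database.Log = Console.Write;
    }
}
```

Calling EfLogging.EnableSqlLogging(Context) before running either version of GetAddressesByPerson should show how many statements each one actually sends.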
As a result, here is the code sample for loading related entities eagerly:
var person = Context.Persons
.Include(s => s.EmailAddresses)
.FirstOrDefault(x => x.Id == personId);
The key point is to call the Include method and pass it the navigation property; the related entities it names are loaded eagerly. At the end of the query you can use any method that triggers immediate execution, such as First, FirstOrDefault, Single, SingleOrDefault, ToList and so on.
You can't combine Include with Find, because Find is a method of DbSet rather than a query operator. In your case the most relevant one is Single, which automatically throws an exception if no person with the specified ID exists in the table.
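Put together, the eager-loading version of the method might look like the sketch below. It assumes EF 6; PeopleContext and EmailAddressService are hypothetical names, the entity classes are trimmed copies of the ones in the question, and the string property is called Address here only so that the class compiles. SingleOrDefault is used instead of Single so the original "Person not found" message can be kept.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity; // brings in the lambda overload of Include
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public virtual ICollection<EmailAddress> EmailAddresses { get; set; }
}

public class EmailAddress
{
    public int Id { get; set; }
    public int PersonId { get; set; }
    public string Address { get; set; } // assumed property name
    public virtual Person Person { get; set; }
}

// Hypothetical context exposing the two sets used in the question.
public class PeopleContext : DbContext
{
    public DbSet<Person> Persons { get; set; }
    public DbSet<EmailAddress> EmailAddresses { get; set; }
}

public class EmailAddressService
{
    private readonly PeopleContext _context;

    public EmailAddressService(PeopleContext context)
    {
        _context = context;
    }

    public IEnumerable<EmailAddress> GetAddressesByPerson(int personId)
    {
        // One round trip: the person and the addresses come back from a single joined query.
        var person = _context.Persons
            .Include(p => p.EmailAddresses)
            .SingleOrDefault(p => p.Id == personId);

        if (person == null)
        {
            throw new InvalidOperationException("Person not found");
        }

        return person.EmailAddresses;
    }
}
```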
An option:
public IEnumerable<EmailAddress> GetAddressesByPerson(int personId)
{
var queryResults = Context.Persons
.Where(x => x.Id == personId)
.Select(x => new { EmailAddresses = x.EmailAddresses })
.Single();
return queryResults.EmailAddresses;
}
The query above asserts that exactly one Person's e-mail addresses are returned. You could use SingleOrDefault and then check the result for null to customize the error message, though I tend to keep exception messages pure. We then return the selected collection. So if the person exists but has no e-mail addresses, you'll receive an empty list. If the person doesn't exist, Single throws an "expected 1, found 0" style exception (an InvalidOperationException). If more than one person exists for the Id (shouldn't, but...), you'll get an "expected 1, found more than one" exception. Don't use FirstOrDefault unless you expect that more than one match is possible, and in that case provide an OrderBy so the data order is predictable.
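The difference between the zero, one and many cases can be seen without a database. The sketch below uses LINQ to Objects rather than EF, so it only illustrates the contract of Single; SingleDemo and Describe are hypothetical names.

```csharp
using System;
using System.Linq;

public static class SingleDemo
{
    // Returns a description of what Single does for the given id.
    public static string Describe(int[] ids, int wanted)
    {
        try
        {
            return "found " + ids.Single(id => id == wanted);
        }
        catch (InvalidOperationException)
        {
            // Thrown both when nothing matches and when several items match.
            return "not exactly one match";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new[] { 1, 2, 3 }, 2)); // found 2
        Console.WriteLine(Describe(new[] { 1, 2, 3 }, 9)); // not exactly one match
        Console.WriteLine(Describe(new[] { 2, 2 }, 2));    // not exactly one match
    }
}
```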
I have this code:
using (var context = new MyDbContext(connectionString))
{
context.Configuration.LazyLoadingEnabled = true;
context.Configuration.ProxyCreationEnabled = true;
context.Database.Log = logValue => File.AppendAllText(logFilePath, logValue);
var testItem1 = context.ParentTable
.FirstOrDefault(parent => parent.Id == 1)
.ChildEntities
.FirstOrDefault(child => child.ChildId == 2000);
}
When I execute this code and examine the EF 6 log file (logFilePath), I see that all child entities of the ParentTable record with Id == 1 are loaded, even though lazy loading is enabled and a condition on the child table is specified (child.ChildId == 2000).
Shouldn't EF load only the relevant children? Or are all the child entities read first, with FirstOrDefault then executed on the in-memory data?
If a parent has many child entities, this can significantly hurt performance when loading children with a condition.
I guess the workaround would be to load the child entities separately?
This is a complete log file for above code (some lines excluded for easier reading):
SELECT TOP (1)
....
FROM [dbo].[ParentTable] AS [Extent1]
WHERE 1 = [Extent1].[Id]
SELECT
...
FROM [dbo].[ChildTable] AS [Extent1]
WHERE [Extent1].[ParentId] = @EntityKeyValue1
-- EntityKeyValue1: '1' (Type = Int32, IsNullable = false)
NOTE: Added relevant classes:
public class MyDbContext : DbContext
{
public DbSet<ParentTable> ParentTable { get; set; }
public DbSet<ChildTable> ChildTable { get; set; }
static MyDbContext()
{
Database.SetInitializer<MyDbContext>(null);
}
public MyDbContext(string connStr)
: base(connStr)
{
}
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
modelBuilder.Entity<ParentTable>()
.HasMany(t => t.ChildEntities);
}
}
[Table("ParentTable", Schema = "dbo")]
public class ParentTable
{
public int Id { get; set; }
public virtual ICollection<ChildTable> ChildEntities { get; set; }
}
[Table("ChildTable", Schema = "dbo")]
public class ChildTable
{
public int ChildId { get; set; }
public int ParentId { get; set; }
[ForeignKey("ParentId")]
public virtual ParentTable Parent { get; set; }
}
Use this query:
var testItem1 = context.ChildTable
.Include(child => child.Parent)
.Where(child => child.ChildId == 2000)
.FirstOrDefault();
Your problem has nothing to do with lazy loading. It is because you use FirstOrDefault too early in your sequence of LINQ methods.
I'll first write the proper query, then I'll explain why that one is better.
var result = dbContext.ParentTable
.Where(parent => parent.Id == 1)
.SelectMany(parent => parent.ChildEntities.Where(child => child.ChildId == 2000))
.FirstOrDefault();
If you look closely at the LINQ methods, you'll see there are two types: those that return IQueryable<...>, and the others. Methods in the first group use lazy execution, also called deferred execution. This means that these statements don't execute the query; they only change the Expression in the IQueryable. The database is not queried yet.
Methods from the latter group will, deep inside, call GetEnumerator() and usually call MoveNext() / Current repeatedly. This sends the IQueryable.Expression to the IQueryable.Provider, which tries to translate the Expression into SQL and executes the query to fetch data from the database (to be precise: the translation doesn't always have to be SQL; that depends on the Provider). The fetched data is presented as an IEnumerator<...>, on which you can call MoveNext() / Current.
Your first FirstOrDefault will already execute the query. Apart from that it is executed too early, and might fetch more data than you want, you can also have the problem that it returns null.
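Deferred execution can be observed without a database. The sketch below uses LINQ to Objects, so it shows only the general principle and not EF's SQL translation; DeferredExecutionDemo, BuildQuery and the Evaluations counter are hypothetical names. The counter records how often the filter actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the Where predicate has been evaluated.
    public static int Evaluations;

    // Building the query only composes it; nothing is evaluated here.
    public static IEnumerable<int> BuildQuery(IEnumerable<int> source)
    {
        return source.Where(n =>
        {
            Evaluations++;
            return n > 2;
        });
    }

    public static void Main()
    {
        var query = BuildQuery(new[] { 1, 2, 3, 4 });
        Console.WriteLine(Evaluations);   // 0: the query has not run yet

        var first = query.FirstOrDefault();
        Console.WriteLine(first);         // 3
        Console.WriteLine(Evaluations);   // 3: enumeration stopped at the first match
    }
}
```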
The proper method would be to use Select. Only the last statement should contain a non-IQueryable method like FirstOrDefault.
I used SelectMany instead of Select, because you are only interested in the ChildEntities of the Parent, not in any of the Parent properties.
var result = dbContext.ParentTable
.Where(parent => parent.Id == 1)
.SelectMany(parent => parent.ChildEntities.Where(child => child.ChildId == 2000))
.FirstOrDefault();
Although this solves your problem, this will fetch more data than you actually plan to use. For instance, every Child will have a foreign key to the Parent. You know the Parent has a primary key value equal to 1, so the foreign key of the Child will also have a value of 1. Why transfer it?
In this case, I expect only one Child, so the problem is not too big. But in other cases you might be sending the same value often.
When using entity framework, always use Select and select only the properties that you plan to use. Only fetch the complete row or use Include if you plan to update the fetched item.
Another thing that will slow down your process if you don't use Select is that when you fetch complete rows, the original fetched data and a copy of it are put in the DbContext.ChangeTracker. This makes it possible to detect which values must be saved when you call SaveChanges. If you don't plan to update the fetched data, don't waste processing power putting it in the change tracker.
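As a rough sketch of that advice, the query can project into a small read-only type so that only the needed columns are selected and no entity is handed to the change tracker. ChildSummary and ChildQueries are hypothetical names; the code assumes EF 6 and is meant to sit next to the MyDbContext, ParentTable and ChildTable classes from the question.

```csharp
using System.Linq;

// Hypothetical read-only shape holding just the columns the caller needs.
public class ChildSummary
{
    public int ChildId { get; set; }
}

public static class ChildQueries
{
    // Assumes the MyDbContext class shown in the question.
    public static ChildSummary FindChild(MyDbContext context, int parentId, int childId)
    {
        return context.ParentTable
            .Where(parent => parent.Id == parentId)
            .SelectMany(parent => parent.ChildEntities)
            .Where(child => child.ChildId == childId)
            // The projection is not an entity type, so nothing is tracked
            // and the ParentId column is never transferred.
            .Select(child => new ChildSummary { ChildId = child.ChildId })
            .FirstOrDefault();
    }
}
```

If full entities are needed for reading only, EF 6's AsNoTracking() is another way to skip the change tracker; that is a different technique from the projection described above.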
I have a DbSet class:
public class Manufacturer
{
public Guid Id { get; set; }
public string Name { get; set; }
public string City { get; set; }
public virtual Category Category { get; set; }
public virtual ICollection<Product> Products { get; set; }
}
I know I can use Skip() and Take() to get limited manufacturers. But my requirement is to get limited Products of all the manufacturers. I'm using something like this but it's not working
var manufacturers = await _context.Manufacturers.Where(x => x.Products.Take(10))
.ToListAsync();
PS: I'm using Lazy Loading (Not eager loading)
Compile error is:
Cannot implicitly convert type
'System.Collections.Generic.IEnumerable<Domain.Product>' to 'bool'
Cannot convert lambda expression to intended delegate type because
some of the return types in the block are not implicitly convertible
to the delegate return type
How can I achieve to get all the manufacturers but limited products in them?
I believe there is no way to do this directly with a queryable source. You can manage it in memory.
var manufacturers = await _context.Manufacturers.Include(m => m.Products).ToListAsync();
foreach(var m in manufacturers)
{
m.Products = m.Products.Take(10).ToList();
}
This will get all products for each manufacturer from the DB and then keep only the first 10.
You can load the Manufacturer entity without the Product list first (so without an Include() call) and then run a separate query to load only the products you want for a specific Manufacturer entity. EF will automatically update the navigation properties. See the following example (authors can have multiple posts in this example):
using (var context = new MyContext())
{
Author author = context.Author.First();
Console.WriteLine(context.Post.Where(it => it.Author == author).Count());
context.Post.Where(it => it.Author == author).Take(2).ToList();
Console.WriteLine(author.Posts.Count());
}
This will generate the following output:
3
2
Even though there are three entries available in my test database, only two are actually read. See the generated SQL queries:
For the Author author = context.Author.First(); line:
SELECT `a`.`Id`, `a`.`Name`
FROM `Author` AS `a`
LIMIT 1
For the context.Post.Where(it => it.Author == author).Count() line:
SELECT COUNT(*)
FROM `Post` AS `p`
INNER JOIN `Author` AS `a` ON `p`.`AuthorId` = `a`.`Id`
WHERE `a`.`Id` = 1
For the context.Post.Where(it => it.Author == author).Take(2).ToList(); line:
SELECT `p`.`Id`, `p`.`AuthorId`, `p`.`Content`
FROM `Post` AS `p`
INNER JOIN `Author` AS `a` ON `p`.`AuthorId` = `a`.`Id`
WHERE `a`.`Id` = 1
LIMIT 2
However, you have to repeat this trick for each individual Manufacturer entity so that only ten associated Product entities are loaded for each. This can result in 1+N SELECT queries.
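Try the longer way. A statement-bodied lambda cannot be translated into an expression tree, so the manufacturers are materialized first and the trimming runs in memory:
var manufacturers = (await _context.Manufacturers.ToListAsync())
.Select(x =>
{
x.Products = x.Products.Take(10).ToList();
return x;
})
.ToList();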
I am in the process of migrating to EF6 from Linq To Sql, and I have the autogenerated object
public partial class PCU
{
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Usage", "CA2214:DoNotCallOverridableMethodsInConstructors")]
public PCU()
{
this.PUs = new HashSet<PU>();
}
public int ID { get; set; }
public int FileNumberID { get; set; }
public Nullable<int> PartnerID { get; set; }
public virtual Company Company { get; set; }
public virtual File File { get; set; }
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Usage", "CA2227:CollectionPropertiesShouldBeReadOnly")]
public virtual ICollection<PU> PUs { get; set; }
}
where PartnerID is the foreign key for Company.
When I call:
var company = dc.Set<PCU>().FirstOrDefault(c => c.FileNumber == fileNumber).Company;
I get a Null object, however if I call:
var company = dc.Set<PCU>().Where(c => c.FileNumber == fileNumber).Select(x => x.Company).First();
It returns the company object as expected. I have both LazyLoading and ProxyCreation enabled.
I understand I could use:
var company = dc.Set<PCU>().Include(x => x.Company).FirstOrDefault(c => c.FileNumber == fileNumber).Company;
however, as this is existing code, and I have the same problem for hundreds of different objects, this will mean massive amounts of changes. Is there an easier way to achieve this?
At first it indeed looks like:
dc.Set<PCU>().FirstOrDefault(c => c.FileNumber == fileNumber).Company
is similar to:
dc.Set<PCU>().Where(c => c.FileNumber == fileNumber).Select(x => x.Company).First()
but in case the foreign key 'Company' is null while using 'FirstOrDefault', returning 'Company' will obviously return null.
The second case, selects a valid 'Company' FK from the result set which was created by the 'Where' condition, and returns the first one from that set, this is why the 'Where' query returns a 'Company'.
If you don't wish to alter existing code, it seems to me that the best solution for you will be to actually see why you have null foreign keys in your database in the first place.
If that's the way it's supposed to be (e.g., a null 'Company' entry could exist), then you'll have to take it into account in your queries, changing them to return only existing 'Company' entries.
Edit: I take it back, I missed the 'LazyLoading enabled' part 🤔
As a follow-up, I believe the cause of the error is the name of the foreign key (PartnerID); if it were named "CompanyID", it would work fine.
I have had to bite the bullet and implement
var company = dc.Set<PCU>().Include(x => x.Company).FirstOrDefault(c => c.FileNumber == fileNumber).Company;
where necessary. There does not seem to be another workaround, except for renaming the columns in my DB (which I can't do).
I need to load only 5 elements from a list without loading all the list. I have these two entities:
public class Company
{
public int ID { get; set; }
public String Name{ get; set; }
public List<Employee> EmployeeList{ get; set; }
}
and:
public class Employee
{
public int ID { get; set; }
public String Name{ get; set; }
}
I need to load only the last 5 records of the Employee for a company named "CompanyName".
I tried to use :
Company companySearch =systemDB.Companies
.Include("EmployeeList").Take(5)
.Where(d => d.Name.Equals("CompanyName"))
.SingleOrDefault();
But this code loads all the list and after gives me back only the last 5 records. I need a faster query.
PS: It's code first EF
To load only N selected records of EmployeeList, you need some criterion by which the members of the collection navigation property are filtered. I've used the value of the ID property of the Employee entity as that criterion. Here are the steps required, along with code snippets that load the EmployeeList collection for the Company entity.
Enable lazy loading in the constructor of your DbContext-derived class. I assume systemDB is an instance of a class that inherits from DbContext:
public SystemDB()
{
this.Configuration.LazyLoadingEnabled = true;
}
Remove the include clause to avoid eager loading:
Company companySearch =systemDB.Companies
.Where(d => d.Name.Equals("CompanyName"))
.SingleOrDefault();
After this line executes, if you check the EmployeeList property of the companySearch object, it will be shown as null in the Quick Watch window.
Load the EmployeeList property explicitly using the call below, with an explicit criterion for filtering the records. I've set the filter to restrict the employees to those whose ID lies between 1 and 5, both boundaries inclusive.
systemDB.Entry<Company>(companySearch).Collection(s => s.EmployeeList).Query().Where(p => p.ID >= 1 && p.ID <= 5).Load();
Note that it is not currently possible to filter which related entities are loaded. Include will always bring in all related entities.
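The explicit-loading steps above can be adapted to the "last 5" requirement, under the assumption that "last" means the highest ID values. The sketch below assumes EF 6; EmployeeQueries and LoadLastEmployees are hypothetical names, SystemDB is the assumed name of the context class, and EmployeeList is declared as a virtual ICollection here, which differs slightly from the question's List property.

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Company
{
    public int ID { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Employee> EmployeeList { get; set; } // assumed shape
}

// Assumed name for the context class behind the systemDB variable.
public class SystemDB : DbContext
{
    public DbSet<Company> Companies { get; set; }
}

public static class EmployeeQueries
{
    public static List<Employee> LoadLastEmployees(SystemDB context, string companyName, int count)
    {
        // First query: the company row only, no Include.
        var company = context.Companies.SingleOrDefault(c => c.Name == companyName);
        if (company == null)
        {
            return new List<Employee>();
        }

        // Second query: Query() exposes the collection as an IQueryable,
        // so ordering and Take are applied in the database.
        return context.Entry(company)
            .Collection(c => c.EmployeeList)
            .Query()
            .OrderByDescending(e => e.ID)
            .Take(count)
            .ToList();
    }
}
```

This costs two queries, but the second one only brings back the requested number of employees.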
You could still try an anonymous projection without lazy loading:
this.Configuration.LazyLoadingEnabled = false;
Anonymous projection:
var companySearch = systemDB.Companies
.Where(d => d.Name.Equals("CompanyName"))
.Select(x => new
{
company = x,
employees = x.EmployeeList.Take(5)
})
.FirstOrDefault();
I have a very simple EF operation that fails: break the relationship between two entities as shown in the code below:
public async Task RemoveCurrentTeacherOfGroup(int groupId)
{
var group = await _dataContext.Groups.SingleAsync(g => g.Id == groupId);
group.Teacher = null;
await _dataContext.SaveChangesAsync();
}
The database was generated code-first. The entities are defined like this:
public class Teacher
{
public int Id { get; set; }
..
public virtual List<Group> Groups { get; set; }
}
public class Group
{
public int Id { get; set; }
..
public virtual Teacher Teacher { get; set; }
}
However, breaking the relationship doesn't work, Teacher keeps pointing to the same entity. When stepping with the debugger, I see that the Teacher property doesn't become null after .Teacher = null. I tried it with the synchronous alternative, which had the same effect:
public void RemoveCurrentTeacherOfGroup(int groupId)
{
var group = _dataContext.Groups.Single(g => g.Id == groupId);
group.Teacher = null;
_dataContext.SaveChanges();
}
If Teacher is not loaded you can't break the relationship. Either include it (eager-load) on the query:
_dataContext.Groups.Include(g => g.Teacher).Single(g => g.Id == groupId);
Or if lazy loading is enabled, access the property for reading before setting it to null:
var teacher = group.Teacher;
group.Teacher = null;
You see that Teacher is not null after setting it to null because the debugger reads the property (lazy-loading it) after you have set it to null.
The value was already null before you hit the group.Teacher = null line, since you hadn't previously loaded it (you can't debug this, however, since reading the property would cause EF to load it if lazy loading is enabled). If you inspect the property value with the debugger before setting it to null, it will work as expected and break the relationship, since Teacher will then be loaded.
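A sketch of the eager-loading variant applied to the method from the question, assuming EF 6. SchoolContext and GroupService are hypothetical names, and the entity classes are trimmed to the members that matter here.

```csharp
using System.Collections.Generic;
using System.Data.Entity; // lambda overload of Include
using System.Linq;

public class Teacher
{
    public int Id { get; set; }
    public virtual List<Group> Groups { get; set; }
}

public class Group
{
    public int Id { get; set; }
    public virtual Teacher Teacher { get; set; }
}

// Hypothetical context standing in for _dataContext's type.
public class SchoolContext : DbContext
{
    public DbSet<Group> Groups { get; set; }
    public DbSet<Teacher> Teachers { get; set; }
}

public class GroupService
{
    private readonly SchoolContext _dataContext;

    public GroupService(SchoolContext dataContext)
    {
        _dataContext = dataContext;
    }

    public void RemoveCurrentTeacherOfGroup(int groupId)
    {
        // Loading Teacher together with the group lets EF see the change from a teacher to null.
        var group = _dataContext.Groups
            .Include(g => g.Teacher)
            .Single(g => g.Id == groupId);

        group.Teacher = null;
        _dataContext.SaveChanges();
    }
}
```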