How does entity framework access associations?

I have two tables, Kittens and Owners in my Entity Framework model. A Kitten has 1 Owner but an Owner can have many Kittens.
I have a repository method called GetKittens() that returns _context.Kittens.ToList();
Since I set up an association, I can do kitten.Owner.Name.
But since ToList() was already called, and the context disposed of, how does it access the property? When retrieving an Entity, does it do a Join to all tables that have an association?
I have to write a query that pulls data from 4 tables, so I am wondering how to do this efficiently, hence this question trying to understand a bit more about how EF works.

By default, a DbContext uses lazy loading. There are a few options available to you, depending on your use case.
1- If you have control over the lifetime of your DbContext, do not dispose of it. However, the first time you access each related entity, a new query is sent to the database to fetch it.
2- Eagerly include the related entity by using Include on the IQueryable<Kitten>:
// Assume context is the DbContext for your EF model
context.Kittens.Include(k => k.Owner); // Or Include("Owner")
However, if you have no control over your repository, you have no option but to call a related method of your repository (like IEnumerable<Owner> GetOwners(Kitten kitten)) since the repository already returns the list.
If you do, consider either eagerly including the Kitten's owner in the repository before materializing with ToList(), or returning an IQueryable and leaving it to the calling class to include related entities or customize the query. If you do not want a caller to be able to alter the query, you can add an overload that takes the includes, along the lines of:
public List<Kitten> GetKittens(params string[] includes)
{
    // Apply each include path in turn, then materialize the query
    return includes.Aggregate(
        context.Kittens.AsQueryable(),
        (query, include) => query.Include(include)).ToList();
}
All in all, this is an implementation decision that you will have to make.

Is it okay to use .ToList() to bypass DbContext tracking?

I would like to keep the lifespan of my DbContext as short as possible, so I take advantage of the using statement.
However, since Entity Framework tracks the entities, I cannot do something like this :
public IEnumerable<Person> GetPersons()
{
using (var db = new AppContext())
{
Logging.Log(Information, "Requested getPersons from service");
return db.Persons;
}
}
Because as soon as I use the list of Persons in my ViewModel, I will receive an InvalidOperationException (the context has been disposed before I use the object).
To bypass that, I convert the result of the query to a concrete List, by calling ToList()
public IEnumerable<Person> GetPersons()
{
using (var db = new AppContext())
{
Logging.Log(Information, "Requested getPersons from service");
return db.Persons.ToList();
}
}
Is this an okay thing to do? AsNoTracking() doesn't have any effect, because it still returns entities that become unusable once the context is disposed.

When it comes to Entities you have two options. One is simple, and the other looks even simpler but is a whole lot more complex.
The simple option: An entity should never be referenced beyond the scope of its DbContext. This means that in something like an MVC app, entities don't get sent to a view. They don't get serialized; they are consumed solely within the scope of their DbContext.
The key to this approach is using POCO view models and projection (Select or AutoMapper's ProjectTo) and, when dealing with various common layers and such, adopting something like a Unit of Work pattern to manage the lifetime scope of a DbContext.
The deceptively simple option: Detached entities. Simply ensuring the DbContext is alive when returning entities can be enough to solve your problem. A Unit of Work pattern can be leveraged for this approach as well. While it looks simple enough, it is loaded with pitfalls. Serializing an entity for instance to pass back to a view will trigger lazy load calls as the serializer traverses the entity(ies). Passing entities back to be persisted is also problematic as you have to be cautious of trying to re-attach entities where the Context may already be tracking leading to situational runtime errors, overwriting data with stale copies, and exposing your system to unintended tampering if overly trusting the passed in entity. From a performance standpoint, this is also the worst option as you will be serializing and transmitting entire entity graphs back and forth rather than just the data a view needs.
Fixing your problem with option 2 is generally as easy as eager loading anything the calling code might touch. For example, if a Person has a reference to an Address:
public IEnumerable<Person> GetPersons()
{
using (var db = new AppContext())
{
Logging.Log(Information, "Requested getPersons from service");
return db.Persons
.Include(x => x.Address)
.ToList();
}
}
However, here is where the complexity creeps in. This requires the method to know about how the Persons might be consumed. Person might have references to five or more other objects/sets, and those objects may have references to more. (I.e. Address having a reference to a Country entity, or AddressType, etc.) The catch-all fix ends up being to eager load everything, which results in a lot of unnecessary data and is prone to bugs as entities evolve and code grows to rely on different bits. Then when performance becomes a problem you start diving down a rabbit hole in trying to support the caller telling this method what to eager load. As the system grows and you want to support pagination, sorting, etc. the method becomes more and more complex or it becomes slower and slower.
The solution I advocate for is to ensure that entities never cross the boundary of their DbContext, and leveraging Linq/EF's IQueryable implementation in combination with a unit of work so that consumers of these repository or service methods can be responsible for the scope of the DbContext. For example:
private AppDbContext Context
{
get { return AmbientDbContextLocator.Get<AppDbContext>(); }
}
public IQueryable<Person> GetPersons()
{
Logging.Log(Information, "Requested getPersons from service");
return Context.Persons.AsQueryable();
}
Then in calling code:
var mapperConfig = new MapperConfiguration(cfg =>
{
cfg.CreateMap<Person, PersonViewModel>(); // Can include any relevant mapping to flatten needed fields...
});
using( var contextScope = ContextScopeFactory.Create())
{
var viewModels = PersonRepository.GetPersons()
.ProjectTo<PersonViewModel>(mapperConfig)
.ToList();
return View(viewModels);
}
Often you will have low-level rules such as defaulting to only returning Active rows. .AsQueryable() is only needed if you have no rules. Anything that results in a .Where or such inside the repository/service method returns IQueryable so it could return:
return Context.Persons.Where(x => x.IsActive);
Repositories can enforce low level filtering rules like IsActive, authorization checks, etc. leaving more variable filtering, sorting, etc. up to the consumers.
The advantage of this approach is that the consumer (Controller, etc.) has full control over how the data is consumed without introducing any complexity/business logic into the repository/service. If I want to support sorting and pagination:
var viewModels = PersonRepository.GetPersons()
.OrderBy(x => x.Age)
.ProjectTo<PersonViewModel>(mapperConfig)
.Skip(pageNumber * pageSize)
.Take(pageSize)
.ToList();
The key point is that the entity (Person) doesn't leave the scope of the unit of work / DbContext; it is projected into a ViewModel which holds no references to entities, only data. That model can be safely serialized and represents only the data that the view needs. For projection I've used AutoMapper's ProjectTo as an example. You can use LINQ's Select as a more manual option. The unit of work pattern I use and have outlined above is Mehdime's DbContextScope.
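As a rough sketch of the "more manual" Select option, the following projects into a view model and pages the result. The Address.City property, the PersonViewModel shape, and the GetPage helper are assumptions made for illustration; the query runs over any IQueryable<Person>, so it can be tried against an in-memory list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Address
{
    public string City { get; set; }
}

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
    public Address Address { get; set; }
}

// Holds only data, no references back to entities (hypothetical shape).
public class PersonViewModel
{
    public string Name { get; set; }
    public string City { get; set; }
}

public static class PersonQueries
{
    // pageNumber is zero-based, matching Skip(pageNumber * pageSize).
    public static List<PersonViewModel> GetPage(
        IQueryable<Person> persons, int pageNumber, int pageSize)
    {
        return persons
            .OrderBy(p => p.Age)
            // Manual projection: flatten only the fields the view needs.
            .Select(p => new PersonViewModel { Name = p.Name, City = p.Address.City })
            .Skip(pageNumber * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

Because the projection happens before ToList(), the caller only ever receives PersonViewModel instances, never Person entities.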

If you just return db.Persons which is an IEnumerable<Person> to some caller and then dispose the db context before that caller can iterate the result, you will get the exception, due to the lazy nature by default of entity framework. ToList gets around this by forcing iteration immediately before the context is disposed, but still has overhead for object tracking, so combining ToList with AsNoTracking gets you the best performance.
AsNoTracking removes some internal tracking code inside the db context, it does not allow you to pass out the IEnumerable<Person> after disposing the context, you still need ToList for that.
Use the extension method AsNoTracking and then call ToList:
db.Persons.AsNoTracking().ToList();
Or you can disable change tracking on the change tracker property of the db context to make the whole thing read only, which avoids having to call AsNoTracking everywhere. You still need ToList with this approach.
/// <summary>
/// Disable all change tracking - place in db context sub class
/// </summary>
public void DisableChangeTracking()
{
ChangeTracker.AutoDetectChangesEnabled = false;
ChangeTracker.LazyLoadingEnabled = false;
ChangeTracker.QueryTrackingBehavior = QueryTrackingBehavior.NoTracking;
}

It is acceptable to do so. The problem with returning an IEnumerable is that Entity Framework uses deferred execution, so the results won't be populated until you materialize them into a concrete collection (e.g. with .ToList()), which could happen after the context has been disposed of.
Your 2 possibilities are:
Executing the work on the deferred object by fetching it to a concrete collection (.ToList()).
Using dependency injection with a scoped Lifetime to let it live until the request completes.
Both are correct, and which one to prefer depends on your use-case.
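The deferred-execution problem can be reproduced without a database. In this minimal sketch, FakeContext is a hypothetical stand-in for a DbContext (it is not an EF type): its Persons sequence only produces values while the object has not been disposed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a DbContext: enumeration only works while it is alive.
public sealed class FakeContext : IDisposable
{
    private bool _disposed;

    // Deferred: nothing runs until the caller enumerates the sequence.
    public IEnumerable<string> Persons
    {
        get
        {
            foreach (var name in new[] { "Ann", "Bob" })
            {
                if (_disposed) throw new ObjectDisposedException(nameof(FakeContext));
                yield return name;
            }
        }
    }

    public void Dispose()
    {
        _disposed = true;
    }
}

public static class DeferredDemo
{
    // Returns the lazy sequence; the context is disposed before anyone enumerates it.
    public static IEnumerable<string> GetPersonsDeferred()
    {
        using (var db = new FakeContext())
        {
            return db.Persons;
        }
    }

    // Materializes inside the using block, so the caller receives plain data.
    public static List<string> GetPersonsMaterialized()
    {
        using (var db = new FakeContext())
        {
            return db.Persons.ToList();
        }
    }
}
```

Enumerating the result of GetPersonsDeferred() throws, while GetPersonsMaterialized() returns both names, because ToList() forces the iteration to happen before Dispose() runs.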

Is there a way to automatically create CRUD for EF Model (DB First currently)

I am creating a WPF app and I have an existing DB that I would like to use and NOT recreate. I will if I have to, but I would rather not. The DB is SQLite, and when I add it to my data layer and create a DataModel based on the DB, I get the model and the DbContext. However, there are no methods created for CRUD or, for instance, .ToList() so I can return all of the items in a table.
Do I need to create all of these manually or is there a way to do it like the way that MVC can scaffold?
I am using VS 2017, WPF, EF6 and SQLite installed with NuGet.

To answer the question in the title.
No.
There is no click-a-button method of scaffolding out UI like you get with MVC.
If you just deal with a table at a time then you could build a generic repository that returns a List for a given table. That won't save you much coding, but you could do it.
If you made that return an IQueryable rather than just a List then you could "chain" such a query. LINQ queries aren't turned into SQL until you force iteration, and you can base one on another, adding criteria, what to select, and so on, for flexibility.
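A minimal sketch of that chaining, assuming a hypothetical Customer class and a GetAll method that stands in for the generic repository. It uses an in-memory collection via AsQueryable() so it runs without a database; with EF the same composition would be applied to a DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
    public bool IsActive { get; set; }
}

public static class CustomerQueries
{
    // Hypothetical repository method: returns a composable query, not a list.
    public static IQueryable<Customer> GetAll(IEnumerable<Customer> table)
    {
        return table.AsQueryable();
    }

    // The caller adds criteria, ordering and a projection; nothing executes until ToList().
    public static List<string> ActiveNames(IEnumerable<Customer> table)
    {
        IQueryable<Customer> query = GetAll(table);
        query = query.Where(c => c.IsActive);
        return query
            .OrderBy(c => c.Name)
            .Select(c => c.Name)
            .ToList();
    }
}
```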
In the body of your post you ask about methods to read and write data. This seems to be almost totally unrelated from the other question because it's data access rather than UI.
"there are no methods created for CRUD or for instance .ToList() so I can return all of the items on the table."
There are methods available in the form of LINQ extension methods.
ToList() is one of these, except it is usual to use async await and ToListAsync.
Where and Select are other extension methods.
You would be writing any model layer that exposed the results of those though.
I'm not clear whether you are just unaware of linq or what, but here's an example query.
var customers = await (from c in db.Customers
orderby c.CustomerName
select c)
.Include(x => x.Orders) //.Include("Orders") alternate syntax
.ToListAsync();
EF uses "lazy loading" of related entities; the Include makes it read the Orders for each customer up front.

Entity Framework is an Object Relational Mapper, which means it maps your C# objects to tables.
Whenever you create a model from a database, it generates a context class that inherits from DbContext. In this class you will find all the tables exposed as DbSet<TableName> TableName { get; set; }. Basically, each set represents the rows of its table. Operations performed on a set are applied to the database when SaveChanges is called.
Example for CRUD
public DbSet<Student> Students { get; set; }
// Create
using (var context = new YourDataContext())
{
    var std = new Student()
    {
        Name = "Avinash"
    };
    context.Students.Add(std);
    context.SaveChanges();
} // Saving adds a row to the Students table with the Name field set to "Avinash"
// Delete
using (var context = new YourDataContext())
{
    var currentStudent = context.Students.FirstOrDefault(x => x.Name == "Avinash");
    if (currentStudent != null) // FirstOrDefault returns null when no row matches
    {
        context.Students.Remove(currentStudent);
        context.SaveChanges();
    }
}
Note: on SaveChanges the change will reflect on Db

Load Entities AsNoTracking() with navigation properties, without specifying includes

I would like to know if the following scenario is possible with Entity Framework:
I want to load several tables with the option AsNoTracking since they are all like static tables that cannot be changed by user.
Those tables also happen to be navigation property of others. Up till now I relied on the AutoMapping feature of the Entity Framework, and don't use the .Include() or LazyLoading functionality.
So instead of:
var result = from x in context.TestTable
.Include("ChildTestTable")
select x;
I am using it like this:
context.ChildTestTable.Load();
context.TestTable.Load();
var result = context.TestTable.Local;
This is working smoothly because the application is so designed that the tables within the Database are very small, there won't be a table that exceeds 600 rows (and that's already pretty high value in my app).
Now my way of loading data, isn't working with .AsNoTracking().
Is there any way to make it working?
So I can write:
context.ChildTestTable.AsNoTracking().ToList();
var result = context.TestTable.AsNoTracking().ToList();
Instead of:
var result = from x in context.TestTable.AsNoTracking()
.Include("ChildTestTable")
select x;
So basically, I want to have 1 or more tables loaded with AutoMapping feature on but without loading them into the Object State Manager, is that a possibility?

The simple answer is no. For normal tracking queries, the state manager is used for both identity resolution (finding a previously loaded instance of a given entity and using it instead of creating a new instance) and fixup (connecting navigation properties together). When you use a no-tracking query it means that the entities are not tracked in the state manager. This means that fixup between entities from different queries cannot happen because EF has no way of finding those entities.
If you were to use Include with your no-tracking query then EF would attempt to do some fixup between entities within the query, and this will work a lot of the time. However, some queries can result in referencing the same entity multiple times and in some of those cases EF has no way of knowing that it is the same entity being referenced and hence you may get duplicates.
I guess the thing you don't really say is why you want to use no-tracking. If your tables don't have a lot of data then you're unlikely to see significant perf improvements, although many factors can influence this. (As a digression, using the ObservableCollection returned by .Local could also impact perf and should not be necessary if the data never changes.) Generally speaking you should only use no-tracking if you have an explicit need to do so since otherwise it ends up adding complexity without benefit.
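To make the identity-resolution point concrete, here is a much-simplified sketch. IdentityMap and Parent are hypothetical names, and this is not how EF's state manager is implemented; it only illustrates why a tracked lookup hands back one shared instance per key while a no-tracking materialization can produce duplicates.

```csharp
using System;
using System.Collections.Generic;

public class Parent
{
    public int Id { get; set; }
}

// Hypothetical, much-simplified stand-in for the state manager's identity map.
public class IdentityMap
{
    private readonly Dictionary<int, Parent> _tracked = new Dictionary<int, Parent>();

    // Tracking: reuse the instance already created for this key, if any.
    public Parent Materialize(int id)
    {
        Parent existing;
        if (!_tracked.TryGetValue(id, out existing))
        {
            existing = new Parent { Id = id };
            _tracked[id] = existing;
        }
        return existing;
    }

    // No-tracking: always create a fresh instance, so duplicates are possible.
    public static Parent MaterializeNoTracking(int id)
    {
        return new Parent { Id = id };
    }
}
```

With the map, two rows that reference the same key resolve to the same object, which is what makes fixup across separate queries possible. Without it, each query result is an unrelated object.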

Is DbSet<>.Local something to use with special care?

For a few days now, I have been struggling with retrieving my entities from a repository (DbContext).
I am trying to save all the entities in an atomic action. Thus, different entities together represent something of value to me. If all the entities are 'valid', then I can save them all to the database. Entity 'a' is already stored in my repository, and needs to be retrieved to 'validate' entity 'b'.
That's where the problem arises. My repository relies on the DbSet<TEntity> class which works great with Linq2Sql (Include() navigation properties e.g.). But, the DbSet<TEntity> does not contain entities that are in the 'added' state.
So I have (as far as I know) two options:
Use the ChangeTracker to see which entities are available and query them into a set based on their EntityState.
Use the DbSet<TEntity>.Local property.
The ChangeTracker seems to involve some extra hard work to get it working in a way such that I can use Linq2Sql to Include() navigation properties e.g.
The DbSet<TEntity>.Local seems a bit weird to me. It might just be the name. I just read something that it is not performing very well (slower than DbSet<> itself). Not sure if that is a false statement.
Could somebody with significant EntityFramework experience shine some light on this? What's the 'wise' path to follow? Or am I seeing ghosts and should I always use the .Local property?
Update with code examples:
An example of what goes wrong
public void AddAndRetrieveUncommittedTenant()
{
_tenantRepository = new TenantRepository(new TenantApplicationTestContext());
const string tenantName = "testtenant";
// Create the tenant, but not call `SaveChanges` yet until all entities are validated
_tenantRepository.Create(tenantName);
//
// Some other code
//
var tenant = _tenantRepository.GetTenants().FirstOrDefault(entity => entity.Name.Equals(tenantName));
// The tenant will be null, because I did not call save changes yet,
// and the implementation of the Repository uses a DbSet<TEntity>
// instead of the DbSet<TEntity>.Local.
Assert.IsNotNull(tenant);
// Can I safely use DbSet<TEntity>.Local ? Or should I play
// around with DbContext.ChangeTracker instead?
}
An example of how I want to use my Repository
In my Repository I have this method:
public IQueryable<TEntity> GetAll()
{
return Context.Set<TEntity>().AsQueryable();
}
Which I use in business code in this fashion:
public List<Case> GetCasesForUser(User user)
{
return _repository.GetAll().
Where(@case => @case.Owner.EmailAddress.Equals(user.EmailAddress)).
Include(@case => @case.Type).
Include(@case => @case.Owner).
ToList();
}
That is mainly the reason why I prefer to stick to DbSet like variables. I need the flexibility to Include navigation properties. If I use the ChangeTracker I retrieve the entities in a List, which does not allow me to lazy load related entities at a later point in time.
If this is close to incomprehensible bullsh*t, then please let me know so that I can improve the question. I desperately need an answer.
Thx a lot in advance!

If you want to be able to 'easily' issue a query against the DbSet and have it find newly created items, then you will need to call SaveChanges() after each entity is created. If you are using a 'unit of work' style approach to working with persistent entities, this is actually not problematic because you can have the unit of work wrap all actions within the UoW as a DB transaction (i.e. create a new TransactionScope when the UoW is created, and call Commit() on it when the UoW completed). With this structure, the changes are sent to the DB, and will be visible to DbSet, but not visible to other UoWs (modulo whatever isolation level you use).
If you don't want the overhead of this, then you need to modify your code to make use of Local at appropriate times (which may involve looking at Local, and then issuing a query against the DbSet if you didn't find what you were looking for). The Find() method on DbSet can also be quite helpful in these situations. It will find an entity by primary key in either Local or the DB. So if you only need to locate items by primary key, this is pretty convenient (and has performance advantages as well).

As mentioned by Terry Coatta, the best approach if you don't want to save the records first would be checking both sources.
For example:
public Person LookupPerson(string emailAddress, DateTime effectiveDate)
{
    Expression<Func<Person, bool>> criteria =
        p =>
            p.EmailAddress == emailAddress &&
            p.EffectiveDate == effectiveDate;
    return LookupPerson(_context.Set<Person>().Local.AsQueryable(), criteria) ?? // Search local
           LookupPerson(_context.Set<Person>().AsQueryable(), criteria); // Search database
}
private Person LookupPerson(IQueryable<Person> source, Expression<Func<Person, bool>> predicate)
{
    return source.FirstOrDefault(predicate);
}

For those who come after, I ran into some similar issues and decided to give the .Concat method a try. I have not done extensive performance testing so someone with more knowledge than I should feel free to chime in.
Essentially, in order to properly break up functionality into smaller chunks, I ended up with a situation in which I had a method that didn't know about consecutive or previous calls to that same method in the current UoW. So I did this:
var context = new MyDbContextClass();
var emp = context.Employees.Concat(context.Employees.Local).FirstOrDefault(e => e.Name.Contains("some name"));

This may only apply to EF Core, but every time you reference .Local of a DbSet, you're silently triggering change detection on the context, which can be a performance hit, depending on how complex your model is, and how many entries are currently being tracked.
If this is a concern, you'll want to use (for EF Core) dbContext.ChangeTracker.Entries<T>() to get the locally tracked entities, which will not trigger change detection, but does require manual filtering on the entity state, as it will include deleted and detached entities.
There's a similar version of this in EF6, but in EFCore the Entries is a list of EntityEntries which you'll have to select out the entry.Entity to get out the same data the DbSet would give you.

Repository with many methods or Entities with foreign keys

I have a database with a Customer, Supplier, and Services (this is a gross simplification, I really have about 100 tables)
I am developing a new Entity Framework library for accessing these tables.
A Customer has many Suppliers
A Supplier has many Services
I am trying to decide which approach to follow -
A )
Use mapping to connect the Customer to the Supplier and the Supplier to the Services, then every time I load a customer I get all his suppliers and their services (and other tables loaded)
B )
Have no mapping between entities, but provide methods in the relevant repository; e.g. in the supplier repository I'll have IEnumerable<Supplier> GetSupplierByCustomerID(int customerID)
EDIT Changed above to IEnumerable based on suggestions.
Are these the two main approaches when using EF? Which is considered better, from your perspective.
Is there another approach I'm not considering?

I would personally expose many simple methods.
"Use mapping to connect the Customer to the Supplier and the Supplier to the Services, then every time I load a customer I get all his suppliers and their services (and other tables loaded)"
If you only need to get the Name of your customer from its ID, then the above solution would require you to load a useless and heavy object graph unless you use lazy loading. But as you may have some serialization process (3-tier architecture?), that is a problem for you, as you can't use lazy loading in this case.
So you could expose for example:
Supplier GetSupplierByID(int supplierID)
IEnumerable<Supplier> GetSuppliersByCustomerID(int customerID)
...
I would also recommend not exposing IQueryable. If possible, use IEnumerable instead. See this article for more details about the danger of using IQueryable when all implications are not well known.

In general I feel like putting a repository over EF is always a good idea. You get to abstract your database logic from your client-side logic (or even business logic). In the specific case that you're mentioning you would also get one other nice benefit: you would only get the information that you want when you specifically call for it (like the GetSupplierByCustomerID example that you mentioned).
Another approach you might consider is the one that I mentioned in the answer to this question: Bounded Contexts. The more separation of concerns that you have in your application, the better it will be in the long run for you and your fellow programmers (especially when you want to unit test it all).

It is just my opinion, I do not know whether it is proper in your case since it depends on your business requirements, but I generally prefer the third option.
All repositories return IEnumerables, not IQueryables : this enables all database operations to be finished before running any business logic.
All repositories expose methods with optional parameters for declaring which navigation properties to include: this lets callers request the related entities they need.
Create a base generic repository and inherit from it in each of your repositories.
Implement unit of work pattern to share context and enable transaction.
Sample method signature from the base repository (T is the type of entity):
IEnumerable<T> Find(Expression<Func<T, bool>> criteria, params Expression<Func<T, object>>[] navigationList)

When it comes to whether to map or not, I'd opt for A. There are many advantages to navigation properties (like Customer.Supplier) and there are many ways to control lazy/eager loading.
An advantage of navigation properties is that LINQ queries are much easier to write. You'll hardly ever have to write a join:
With join:
from supp in db.Suppliers
join serv in db.Services on supp.SupplierId equals serv.SupplierId
select ...
With navigation property:
from supp in db.Suppliers
from serv in supp.Services
select ...
Or things like this:
from supp in db.Suppliers
select new { supp.Name, ServicesCount = supp.Services.Count() }
and EF will figure out how to do the joins in SQL.
Having navigation properties doesn't mean that they always get loaded. For lazy loading to happen, two conditions must be met:
The property must be defined as virtual to enable EF to override it in a proxy type with wiring to carry out lazy loading.
The context must have lazy loading enabled. It is by default, but you can turn it off per instance by setting context.Configuration.LazyLoadingEnabled = false.
So this also shows two ways to control lazy loading: you can enable/disable it structurally or temporarily.
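The reason the property has to be virtual can be seen in a hand-written sketch of such a proxy. KittenProxy and its loader delegate are hypothetical; EF generates a comparable derived type at runtime rather than requiring you to write one, and the real wiring goes through the context rather than a delegate.

```csharp
using System;

public class Owner
{
    public string Name { get; set; }
}

public class Kitten
{
    // virtual so that a derived proxy type can intercept access
    public virtual Owner Owner { get; set; }
}

// Hypothetical hand-written proxy, for illustration only.
public class KittenProxy : Kitten
{
    private readonly Func<Owner> _loadOwner;
    private bool _loaded;

    public KittenProxy(Func<Owner> loadOwner)
    {
        _loadOwner = loadOwner;
    }

    public override Owner Owner
    {
        get
        {
            if (!_loaded)
            {
                // First access triggers the load (a database query in real EF).
                base.Owner = _loadOwner();
                _loaded = true;
            }
            return base.Owner;
        }
        set
        {
            base.Owner = value;
            _loaded = true;
        }
    }
}
```

The loader only runs on the first read of Owner. If the loader depends on a context that has since been disposed, that first read is the point where the failure would surface.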
Apart from that you can control the opposite, eager loading, in two ways:
Using the Include statement:
db.Suppliers.Include(s => s.Services)
Including navigation properties in projections:
from supp in db.Suppliers
from serv in supp.Services
select new { supp.Name, serv.ServiceName }
(there are more ways, but these are the most important ones)
This applies to writing LINQ queries in your services or repositories. As others have said: don't expose IQueryable to the consumers of your service/repository methods.
One last important note: lazy loading is only possible within the scope of a live context. If the context is disposed and a lazy-loading navigation property is accessed, an exception is thrown. At the same time, it is recommended to use context instances with a short life span. So there's the dilemma: expose entity objects, or only DTOs, view models, and the like. When you expose lazy-loading-enabled entity objects, a consumer may inadvertently access a navigation property that has not been loaded yet, after the context is gone.
