Referential equality of entity attached to multiple DbContexts? - c#

When receiving an entity from (so it is attached to) DbContext dbContext1, and then manually attaching this entity to DbContext dbContext2, will every entity (representing the same database object) received from either of these two contexts (e.g. as a query result) in the future have referential equality with the first-mentioned object?
I know that entities representing the same database object always have referential equality within the scope of a single DbContext, but can this be extended by attaching an entity to multiple DbContexts?
Background: I have an ASP.NET Core Web API application which not only uses the DbContexts created within each per-request scope, but also has a long-living scope for background tasks with its own DbContext. I want to know whether entities originating within the request scope can be attached to the long-living scope so that changes made to these entities within requests are also seen by the background tasks, without having to refresh those entities separately.

No. EF does some tricks in the context to ensure that you always get the same instance back of a particular entity (basically, it creates an object cache and doesn't create a new instance, unless it can't find an existing instance of that entity in the cache). However, this is per context. If you query entity X from context 1 and then query the same entity X from context 2 and attempt to compare them like x1 == x2, it will return false.
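As a minimal sketch of that behaviour (MyDbContext and its Persons set are placeholders for illustration), the identity map only works per context instance:
using (var context1 = new MyDbContext())
using (var context2 = new MyDbContext())
{
    var a1 = context1.Persons.Single(p => p.Id == 1);
    var a2 = context1.Persons.Single(p => p.Id == 1); // second query, but identity resolution returns the tracked instance
    var b1 = context2.Persons.Single(p => p.Id == 1);

    Console.WriteLine(ReferenceEquals(a1, a2)); // True  - same context, same instance
    Console.WriteLine(ReferenceEquals(a1, b1)); // False - different contexts, different instances
}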
Your only real option if you need referential integrity is to override Equals, GetHashCode and the == operator on your entity class. However, doing so is not trivial, and it's very easy to make a mistake that will either make the equality check overly greedy or overly strict, causing errant true/false results. It's doable, for sure; just do your research first and make sure you know what you're doing. Microsoft has documentation to at least get you started.
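One common (but caveat-laden) pattern keys equality on the primary key. A rough sketch, assuming an integer Id and a hypothetical Person entity, with the usual caveats about unsaved entities and mutable hash codes:
public class Person
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(this, obj)) return true;
        var other = obj as Person;              // also matches EF-generated proxy subclasses
        if (ReferenceEquals(other, null)) return false;
        // Entities that haven't been saved yet have no key, so only reference equality applies.
        if (Id == 0 || other.Id == 0) return false;
        return Id == other.Id;
    }

    // Caveat: a hash code must not change while the object sits in a hash-based collection,
    // which is exactly the mistake a database-generated Id makes easy.
    public override int GetHashCode()
    {
        return Id == 0 ? base.GetHashCode() : Id.GetHashCode();
    }

    public static bool operator ==(Person left, Person right)
    {
        if (ReferenceEquals(left, right)) return true;
        if (ReferenceEquals(left, null)) return false;
        return left.Equals(right);
    }

    public static bool operator !=(Person left, Person right)
    {
        return !(left == right);
    }
}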
However, the simpler and more robust solution here is to simply leave the two context scopes disconnected. Do what you need to do in the request scope and then simply ensure that your context in singleton scope always reloads from the database before doing anything. The documentation on handling concurrency conflicts should give you a good idea of what you need to do, as this is technically a form of concurrency.
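As a rough illustration of "reload before doing anything" (the backgroundContext variable and Persons set are assumptions for this sketch):
// Inside the long-lived background scope, before acting on a previously loaded entity:
var person = backgroundContext.Persons.Find(personId); // may come from the context's cache
backgroundContext.Entry(person).Reload();              // force a fresh read from the database

// Or simply query with no tracking when only the current values are needed:
var current = backgroundContext.Persons.AsNoTracking().Single(p => p.Id == personId);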

Related

Context.SaveChanges in Winforms

This is a basic WinForms application, with no service layer or anything in between. I am fetching some records from the database using Entity Framework. The code below is in a class called PersonRepository.
var obj = Context.Persons.FirstOrDefault(u => u.Id == 20);
obj.RegisterDate = obj.RegisterDate.ToMountainStandardTime();
return obj;
ToMountainStandardTime is an extension method for date type.
Now, after I pull this record, I display it in the UI. The user does some action on the screen, and based on the requirements a record is inserted into another table called "Activity". The user doesn't need to save anything back to the Person table.
After doing their work, the code runs something like this:
Context.Activities.Add(newActivityObject);
Context.SaveChanges();
Both methods are in the same class. Along with adding a new object to the Activity table, it also updates the register date of the selected person.
I know the reason: this Context object is initialized in the constructor of the PersonRepository class and is used by all the methods in this class.
Most of my experience is using this via RESTful services, where I don't need to worry much about such things because we create a new context instance for every request.
I can simply handle this by detaching the object from the context before editing it, like this:
Context.Entry(obj).State = EntityState.Detached;
But want to know if there is some better way to handle this?
You have a few choices to consider. Firstly, entities can only be relied upon to be valid within the scope of the DbContext they were read from; otherwise they need to be detached and re-attached to cross DbContext boundaries.
To keep entities scoped within their DbContext, your options are:
Long-lived (i.e. Singleton) DbContext.
Short-lived, project entities to POCO containers and re-load entities on-demand as needed.
The third option is to use short-lived DbContexts, but then manually manage detaching and re-attaching the entities.
I never recommend this third option as it is prone to errors and encourages issues like stale data overwrites. It's neat in concept, but more often than not becomes a repeated source of headaches in practice.
For smaller applications that themselves have relatively short runtime lives, a long-lived DbContext can be a simple-to-implement option. The biggest negatives of a long-lived DbContext are:
Having a context alive for extended periods of time can mean performance degrades over time as more entities are cached. The assumption that cached entities are better for performance can be misplaced, as the time to perform operations against entities (updates/inserts) increases as the cache grows, since EF will look through the cache for entity references that might be associated with new/changed entity values.
Data that the context is loading will become stale if multiple instances are running, or external processes can modify data state. By default EF will return cached copies which must manually be reloaded if suspected of being stale.
For larger or long-running applications I would strongly lean towards using short-lived DbContexts that rely on POCO ViewModels/DTOs for view-duration data state. This means leveraging projection via Select or AutoMapper's ProjectTo to load the relevant data from entities on demand and pass it to views, then reloading entities by ID and transferring across the changed values (or performing the required actions) during updates, after verifying row version numbers / timestamps to detect possible stale data. Reloading an entity and its related data by primary key is extremely quick.
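A rough sketch of that flow (the PersonSummary DTO, the model variable, and the IsActive/RowVersion columns are assumptions for illustration):
// Read side: project only the fields the view needs; no tracked entities leave the context.
var rows = context.Persons
    .Where(p => p.IsActive)
    .Select(p => new PersonSummary
    {
        Id = p.Id,
        Name = p.Name,
        RowVersion = p.RowVersion
    })
    .ToList();

// Update side: reload the entity by its key, verify it hasn't changed, then apply the new values.
var person = context.Persons.Single(p => p.Id == model.Id);
if (!person.RowVersion.SequenceEqual(model.RowVersion))
{
    // The row changed since the view was rendered; treat it as a concurrency conflict.
    throw new InvalidOperationException("This person was modified by someone else.");
}
person.Name = model.Name;
context.SaveChanges();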
Not only does this avoid the complexity/mess of trying to juggle detached entities, (and reloading data state anyways to guard against stale overwrites) but it can lead to more optimal data read operations and index utilization for many scenarios, especially things like search results which only need a few values from specific tables rather than reading entire entity graphs. A cardinal sin of passing Entities to views is attempting to avoid extra data reads by avoiding eager loading and disabling lazy loading to leave "unused" relationships as #null, or even populating Entity class objects with just a few fields to serve as a view model with .Select which leads to errors or bad assumptions/overwrites in later code. An Entity should always represent a complete (or complete-able) state of the data row. Dual-purposing entities to serve as both data domain state and view state is asking for trouble. Methods expecting an entity should never need to be concerned about whether they are getting a complete entity or a partially complete one.

Difference between DetectChanges and ChangeTracking in Entity Framework

I am reading this topic on MSDN.
Can anyone please explain the difference between these two, as mentioned:
DetectChanges is used to detect the changes in the DbContext and related entities
and
ChangeTracking is also used to detect changes in an entity, as mentioned in this link.
Please explain the actual difference between these two.
So EF needs to detect the changes you make to the context, like adding/modifying/removing entities. Entities might be plain POCO classes with no embedded behaviour to track changes to their properties. So EF makes a snapshot of the entities it receives from the database and then compares this snapshot with the actual state of the context. On top of that, EF tracks relationships between objects in the context and keeps them synchronized. All this is done by a method called DetectChanges(). It is called at various moments, most importantly when you call SaveChanges, but also when you add/remove/attach entities to the context and so on.
If you design your entity classes in a special way (all properties virtual, collections represented by ICollection etc.) - you can use automatic change tracking. EF will create special proxy classes inherited from your entity classes and will use them to immediately detect changes to your entity properties. Note that DetectChanges is still used in this case, exactly as described above, but it performs less work, since most of the changes are already detected right when they happen.
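A small illustration of the snapshot-based side (the Blogs set and Title property are placeholders; the EF6 syntax is shown, EF Core exposes the same switch on context.ChangeTracker): with automatic detection turned off, EF only notices property changes once DetectChanges runs.
context.Configuration.AutoDetectChangesEnabled = false;

var blog = context.Blogs.First();
blog.Title = "Changed";

// No snapshot comparison has run yet, so the entry still appears Unchanged.
Console.WriteLine(context.Entry(blog).State);  // Unchanged

context.ChangeTracker.DetectChanges();         // compares current values against the snapshot
Console.WriteLine(context.Entry(blog).State);  // Modified

// With automatic detection disabled, DetectChanges must run before SaveChanges,
// otherwise the modification above would not be persisted.
context.SaveChanges();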
Summary: DetectChanges is a method that performs snapshot-based change detection (and more) and is one of the mechanisms Entity Framework uses to track (detect) changes to the context. Read more about DetectChanges here: http://blog.oneunicorn.com/2012/03/10/secrets-of-detectchanges-part-1-what-does-detectchanges-do/

Using EntityFramework navigation properties when the DbContext is disposed

The current retrieval pattern in my Service classes (in an ASP.NET MVC application) looks something like:
public Client Get(int id)
{
    using (var repo = _repoFactory.Get<Client>())
    {
        return repo.Get(id);
    }
}
Where _repoFactory.Get<T>() returns a repository which, when disposed, also disposes the Entity Framework DbContext.
However, when the consumer of the Get(int id) method needs to use navigation properties on the Client object, an exception is thrown because the context is already disposed.
I can foresee a few ways to negotiate this problem:
Don't use navigation properties outside of the service
Don't use lazy-loading navigation properties
Find some other way to dispose of the context when the request is finished
What is the "correct" (or least incorrect) way and how can it be accomplished?
All the ways that you suggested are "correct," and each has its advantages and disadvantages. You'll need to decide which approach you want to use.
Don't use navigation properties outside of the service
This is difficult to enforce if the service is returning entities. In my current project, we make heavy use of "DTO"s, which are new classes that represent the data that we expect to need in a given context. Because these aren't entities, we know that any property on them will be fully hydrated before it is returned from the repository.
Don't use lazy-loading navigation properties
This is roughly the same as above, except that you're allowing for the possibility of certain navigation properties to be eager-loaded. Again, how does the developer consuming this data know which properties are and are not going to be available? "DTO"s solve this problem, but they also introduce a bunch of extra classes that are almost identical to the existing entities.
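As a sketch of the eager-loading route (assuming the repository can expose an IQueryable through a hypothetical Query() method, and Orders/Address navigation properties on Client):
public Client Get(int id)
{
    using (var repo = _repoFactory.Get<Client>())
    {
        // Eagerly load the navigation properties the caller is known to need,
        // so they are populated before the context is disposed.
        // (The lambda Include overload needs "using System.Data.Entity;" in EF6.)
        return repo.Query()
                   .Include(c => c.Orders)
                   .Include(c => c.Address)
                   .Single(c => c.Id == id);
    }
}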
Find some other way to dispose of the context when the request is finished
Usually people do this by having contexts bound in a per-request scope in their DI framework, and allow the DI framework to take care of instantiation/disposal of their contexts.
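For illustration, ASP.NET Core's built-in container does this with AddDbContext, which registers the context with a scoped (per-web-request) lifetime by default; classic MVC DI containers such as Autofac or Unity offer per-request lifetimes through their own registration syntax. AppDbContext and connectionString here are placeholders:
// Startup/Program: the framework creates one AppDbContext per request and disposes it at the end.
services.AddDbContext<AppDbContext>(options =>
    options.UseSqlServer(connectionString));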
The main danger with this approach is that, while lazy-loading properties won't throw exceptions when accessed, each access requires another database round-trip. This makes it easy for developers to accidentally write code that ends up making thousands of round-trips when only two or three would be required otherwise.
However, if you have a reliable way of identifying performance issues and addressing them, then you could use this approach in the general case and then add a little eager-loading where you find it to be necessary. For example, MiniProfiler can sit on your front-end and give you information about the database round-trips you're making, as well as warnings when it notices that many database queries are practically identical.

One DbContext per web request... why?

I have been reading a lot of articles explaining how to set up Entity Framework's DbContext so that only one is created and used per HTTP web request using various DI frameworks.
Why is this a good idea in the first place? What advantages do you gain by using this approach? Are there certain situations where this would be a good idea? Are there things that you can do using this technique that you can't do when instantiating DbContexts per repository method call?
NOTE: This answer talks about the Entity Framework's DbContext, but it is applicable to any sort of Unit of Work implementation, such as LINQ to SQL's DataContext and NHibernate's ISession.
Let's start by echoing Ian: having a single DbContext for the whole application is a Bad Idea. The only situation where this makes sense is when you have a single-threaded application and a database that is solely used by that single application instance. The DbContext is not thread-safe, and since the DbContext caches data, it gets stale pretty soon. This will get you into all sorts of trouble when multiple users/applications work on that database simultaneously (which is very common, of course). But I expect you already know that and just want to know why not to simply inject a new instance (i.e. with a transient lifestyle) of the DbContext into anyone who needs it. (For more information about why a single DbContext, or even one context per thread, is bad, read this answer.)
Let me start by saying that registering a DbContext as transient could work, but typically you want to have a single instance of such a unit of work within a certain scope. In a web application, it can be practical to define such a scope on the boundaries of a web request; thus a Per Web Request lifestyle. This allows you to let a whole set of objects operate within the same context. In other words, they operate within the same business transaction.
If you have no goal of having a set of operations operate inside the same context, in that case the transient lifestyle is fine, but there are a few things to watch:
Since every object gets its own instance, every class that changes the state of the system, needs to call _context.SaveChanges() (otherwise changes would get lost). This can complicate your code, and adds a second responsibility to the code (the responsibility of controlling the context), and is a violation of the Single Responsibility Principle.
You need to make sure that entities [loaded and saved by a DbContext] never leave the scope of such a class, because they can't be used in the context instance of another class. This can complicate your code enormously, because when you need those entities, you need to load them again by id, which could also cause performance problems.
Since DbContext implements IDisposable, you probably still want to Dispose all created instances. If you want to do this, you basically have two options. You need to dispose them in the same method right after calling context.SaveChanges(), but in that case the business logic takes ownership of an object it gets passed on from the outside. The second option is to Dispose all created instances on the boundary of the Http Request, but in that case you still need some sort of scoping to let the container know when those instances need to be Disposed.
Another option is to not inject a DbContext at all. Instead, you inject a DbContextFactory that is able to create a new instance (I used to use this approach in the past). This way the business logic controls the context explicitly. It might look like this:
public void SomeOperation()
{
    using (var context = this.contextFactory.CreateNew())
    {
        var entities = this.otherDependency.Operate(
            context, "some value");

        context.Entities.AddRange(entities);
        context.SaveChanges();
    }
}
The plus side of this is that you manage the life of the DbContext explicitly and it is easy to set this up. It also allows you to use a single context in a certain scope, which has clear advantages, such as running code in a single business transaction, and being able to pass around entities, since they originate from the same DbContext.
The downside is that you will have to pass around the DbContext from method to method (which is termed Method Injection). Note that in a sense this solution is the same as the 'scoped' approach, but now the scope is controlled in the application code itself (and is possibly repeated many times). It is the application that is responsible for creating and disposing the unit of work. Since the DbContext is created after the dependency graph is constructed, Constructor Injection is out of the picture and you need to defer to Method Injection when you need to pass on the context from one class to the other.
Method Injection isn't that bad, but when the business logic gets more complex, and more classes get involved, you will have to pass it from method to method and class to class, which can complicate the code a lot (I've seen this in the past). For a simple application, this approach will do just fine though.
Because of the downsides this factory approach has for bigger systems, another approach can be useful: the one where you let the container or the infrastructure code / Composition Root manage the unit of work. This is the style that your question is about.
By letting the container and/or the infrastructure handle this, your application code is not polluted by having to create, (optionally) commit and Dispose a UoW instance, which keeps the business logic simple and clean (just a Single Responsibility). There are some difficulties with this approach. For instance, where do you Commit and Dispose the instance?
Disposing a unit of work can be done at the end of the web request. Many people however, incorrectly assume that this is also the place to Commit the unit of work. However, at that point in the application, you simply can't determine for sure that the unit of work should actually be committed. e.g. If the business layer code threw an exception that was caught higher up the callstack, you definitely don't want to Commit.
The real solution is again to explicitly manage some sort of scope, but this time inside the Composition Root. By abstracting all business logic behind the command/handler pattern, you will be able to write a decorator that can be wrapped around each command handler to do this. Example:
class TransactionalCommandHandlerDecorator<TCommand>
    : ICommandHandler<TCommand>
{
    readonly DbContext context;
    readonly ICommandHandler<TCommand> decorated;

    public TransactionalCommandHandlerDecorator(
        DbContext context,
        ICommandHandler<TCommand> decorated)
    {
        this.context = context;
        this.decorated = decorated;
    }

    public void Handle(TCommand command)
    {
        this.decorated.Handle(command);
        context.SaveChanges();
    }
}
This ensures that you only need to write this infrastructure code once. Any solid DI container allows you to configure such a decorator to be wrapped around all ICommandHandler<T> implementations in a consistent manner.
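As one example of such a registration (Simple Injector shown here; most containers have an equivalent decorator feature, and MyDbContext/MyCommandHandler are placeholders, assuming a default scoped lifestyle is configured):
container.Register<DbContext, MyDbContext>(Lifestyle.Scoped);
container.Register(typeof(ICommandHandler<>), new[] { typeof(MyCommandHandler).Assembly });

// Wrap every ICommandHandler<T> registration in the transactional decorator.
container.RegisterDecorator(
    typeof(ICommandHandler<>),
    typeof(TransactionalCommandHandlerDecorator<>));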
There are two contradicting recommendations by Microsoft, and many people use DbContexts in completely divergent ways.
One recommendation is to "dispose DbContexts as soon as possible", because keeping a DbContext alive occupies valuable resources like database connections.
The other states that one DbContext per request is highly recommended.
Those contradict each other, because if your request is doing a lot of work unrelated to the database, then your DbContext is kept alive for no reason.
Thus it is wasteful to keep your DbContext alive while your request is just waiting for unrelated work to get done.
So many people who follow the first rule keep their DbContexts inside their "Repository pattern" and create a new instance per database query, so X DbContexts per request.
They just get their data and dispose the context as soon as possible.
This is considered by MANY people an acceptable practice.
While this has the benefit of occupying your database resources for the minimum time, it clearly sacrifices all the UnitOfWork and caching candy EF has to offer.
Keeping a single multipurpose instance of DbContext alive maximizes the benefits of caching, but since DbContext is not thread-safe and each web request runs on its own thread, a DbContext per request is the longest you can keep it.
So the EF team's recommendation about using one DbContext per request is clearly based on the fact that in a web application a UnitOfWork is most likely going to be within one request, and that request has one thread. So one DbContext per request gives the ideal benefit of both UnitOfWork and caching.
But in many cases this is not true.
I consider logging a separate UnitOfWork, so having a new DbContext for post-request logging in async threads is completely acceptable.
So finally, it turns out that a DbContext's lifetime is restricted to these two parameters: UnitOfWork and thread.
Not a single answer here actually answers the question. The OP did not ask about a singleton/per-application DbContext design, he asked about a per-(web)request design and what potential benefits could exist.
I'll reference http://mehdi.me/ambient-dbcontext-in-ef6/ as Mehdi is a fantastic resource:
Possible performance gains.
Each DbContext instance maintains a first-level cache of all the entities it loads from the database. Whenever you query an entity by its primary key, the DbContext will first attempt to retrieve it from its first-level cache before defaulting to querying it from the database. Depending on your data query pattern, re-using the same DbContext across multiple sequential business transactions may result in fewer database queries being made thanks to the DbContext first-level cache.
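For instance (a hypothetical Clients set), a key lookup with Find checks the cache before hitting the database:
var first = context.Clients.Find(42);   // hits the database
var second = context.Clients.Find(42);  // served from the first-level cache
Console.WriteLine(ReferenceEquals(first, second)); // True within the same context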
It enables lazy-loading.
If your services return persistent entities (as opposed to returning view models or other sorts of DTOs) and you'd like to take advantage of lazy-loading on those entities, the lifetime of the DbContext instance from which those entities were retrieved must extend beyond the scope of the business transaction. If the service method disposed the DbContext instance it used before returning, any attempt to lazy-load properties on the returned entities would fail (whether or not using lazy-loading is a good idea is a different debate altogether which we won't get into here). In our web application example, lazy-loading would typically be used in controller action methods on entities returned by a separate service layer. In that case, the DbContext instance that was used by the service method to load these entities would need to remain alive for the duration of the web request (or at the very least until the action method has completed).
Keep in mind there are cons as well. That link contains many other resources to read on the subject.
Just posting this in case someone else stumbles upon this question and doesn't get absorbed in answers that don't actually address the question.
I'm pretty certain it is because the DbContext is not at all thread safe. So sharing the thing is never a good idea.
One thing that's not really addressed in the question or the discussion is the fact that DbContext can't cancel changes. You can submit changes, but you can't clear out the change tree, so if you use a per request context you're out of luck if you need to throw changes away for whatever reason.
Personally I create instances of DbContext when needed - usually attached to business components that have the ability to recreate the context if required. That way I have control over the process, rather than having a single instance forced onto me. I also don't have to create the DbContext at each controller startup regardless of whether it actually gets used. Then if I still want to have per-request instances I can create them in the constructor (via DI or manually) or create them as needed in each controller method. Personally I usually take the latter approach, to avoid creating DbContext instances when they are not actually needed.
It also depends on which angle you look at it from. To me the per-request instance has never made sense. Does the DbContext really belong in the HTTP request? In terms of behavior that's the wrong place. Your business components should be creating your context, not the HTTP request. Then you can create or throw away your business components as needed and never worry about the lifetime of the context.
I agree with the previous opinions. It is worth saying that if you are going to share a DbContext in a single-threaded app, you'll need more memory. For example, my web application on Azure (one extra-small instance) needs another 150 MB of memory, and I have about 30 users per hour.
Here is a real example (the original answer included a memory-usage graph): the application was deployed at 12 PM.
What I like about it is that it aligns the unit-of-work (as the user sees it - i.e. a page submit) with the unit-of-work in the ORM sense.
Therefore, you can make the entire page submission transactional, which you could not do if you were exposing CRUD methods with each creating a new context.
Another understated reason for not using a singleton DbContext, even in a single threaded single user application, is because of the identity map pattern it uses. It means that every time you retrieve data using query or by id, it will keep the retrieved entity instances in cache. The next time you retrieve the same entity, it will give you the cached instance of the entity, if available, with any modifications you have done in the same session. This is necessary so the SaveChanges method does not end up with multiple different entity instances of the same database record(s); otherwise, the context would have to somehow merge the data from all those entity instances.
The reason that is a problem is a singleton DbContext can become a time bomb that could eventually cache the whole database + the overhead of .NET objects in memory.
There are ways around this behavior, such as using LINQ queries with the .AsNoTracking() extension method. Also, these days PCs have a lot of RAM. But usually that is not the desired behavior.
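A quick sketch of a no-tracking, read-only query (the Orders set, CreatedOn column, and cutoff variable are placeholders):
// Results are materialized but never added to the context's cache or change tracker.
var report = context.Orders
    .AsNoTracking()
    .Where(o => o.CreatedOn >= cutoff)
    .ToList();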
Another issue to watch out for with Entity Framework specifically is when using a combination of creating new entities, lazy loading, and then using those new entities (from the same context). If you don't use IDbSet.Create (vs. just new), lazy loading on that entity doesn't work when it's retrieved from the context it was created in. Example:
public class Foo {
    public string Id { get; set; }
    public string BarId { get; set; }
    // lazy-loaded relationship to Bar
    public virtual Bar Bar { get; set; }
}

var foo = new Foo {
    Id = "foo id",
    BarId = "some existing bar id"
};
dbContext.Set<Foo>().Add(foo);
dbContext.SaveChanges();

// some other code, using the same context
var sameFoo = dbContext.Set<Foo>().Find("foo id");
var barProp = sameFoo.Bar.SomeBarProp; // fails with a null reference even though BarId is set
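A sketch of the usual workaround, per the paragraph above: letting EF6's DbSet<T>.Create() build a lazy-loading proxy for the new object (same hypothetical Foo/Bar model):
// Create() returns a proxy instance, so lazy loading works later even when the
// entity is fetched back out of the same context it was created in.
var proxyFoo = dbContext.Set<Foo>().Create();
proxyFoo.Id = "foo id";
proxyFoo.BarId = "some existing bar id";

dbContext.Set<Foo>().Add(proxyFoo);
dbContext.SaveChanges();

var reloaded = dbContext.Set<Foo>().Find("foo id");
var barProp = reloaded.Bar.SomeBarProp; // Bar can now be lazy loaded via the proxy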

how can I save/keep-in-sync an in-memory graph of objects with the database?

Question: what is a good best-practice approach for saving/keeping in sync an in-memory graph of objects with the database?
Background:
That is to say, I have the classes Node and Relationship, and the application builds up a graph of related objects using these classes. There might be 1000 nodes with various relationships between them. The application needs to query the structure, hence an in-memory approach is no doubt good for performance (e.g. traversing the graph from Node X to find the root parents).
The graph does need to be persisted however into a database with tables NODES and RELATIONSHIPS.
Therefore, what is a good best-practice approach for saving/keeping in sync an in-memory graph of objects with the database?
Ideal requirements would include:
build up changes in-memory and then 'save' afterwards (mandatory)
when saving, apply updates to database in correct order to avoid hitting any database constraints (mandatory)
keep persistence mechanism separate from model, for ease in changing persistence layer if needed, e.g. don't just wrap an ADO.net DataRow in the Node and Relationship classes (desirable)
mechanism for doing optimistic locking (desirable)
Or is the overhead of all this for a smallish application just not worth it and I should just hit the database each time for everything? (assuming the response times were acceptable) [would still like to avoid if not too much extra overhead to remain somewhat scalable re performance]
I'm using the self-tracking entities in Entity Framework 4. After the entities are loaded into memory, StartTracking() MUST be called on every entity. Then you can modify your entity graph in memory without any DB operations. When you're done with the modifications, you call the context extension method ApplyChanges(rootOfEntityGraph) and SaveChanges(), so your modifications are persisted. Then you have to start the tracking again on every entity in the graph. Two hints/ideas I'm using at the moment:
1.) call StartTracking() at the beginning on every entity
I'm using an interface IWorkspace to abstract the ObjectContext (this simplifies testing -> see the open-source implementation bbv.DomainDrivenDesign on SourceForge). They also use a QueryableContext. So I created a further concrete Workspace and QueryableContext implementation and intercept the loading process with my own IEnumerable implementation. When the workspace's consumer executes the query obtained via CreateQuery(), my intercepting IEnumerable object registers an event handler on the context's ChangeTracker. In this event handler I call StartTracking() for every entity loaded and added into the context (this doesn't work if you load the objects with NoTracking, because in that case the objects aren't added to the context and the event handler will not be fired). After the enumeration in the self-made iterator, the event handler on the ObjectStateManager is deregistered.
2.) call StartTracking() after ApplyChanges()/SaveChanges()
In the workspace implementation, I ask the context's ObjectStateManager for the added/modified entities, e.g.:
var addedEntities = this.context.ObjectStateManager.GetObjectStateEntries(EntityState.Added);
--> analogous for modified entities
cast them to IObjectWithChangeTracker and call the AcceptChanges() method on the entity itself. This starts the object's changetracker again.
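Pulling that together as a rough sketch (member names follow the EF 4 self-tracking-entities T4 template as described above):
// Re-arm the self-tracking entities' change trackers after ApplyChanges()/SaveChanges().
var touched = this.context.ObjectStateManager
    .GetObjectStateEntries(EntityState.Added | EntityState.Modified);

foreach (var entry in touched)
{
    if (entry.IsRelationship) continue;           // skip relationship entries

    var trackable = entry.Entity as IObjectWithChangeTracker;
    if (trackable != null)
    {
        trackable.AcceptChanges();                // restarts the entity's change tracker
    }
}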
For my project I have the same mandatory points as you. I played around with EF 3.5 and didn't find a satisfactory solution. But the new self-tracking entities capability in EF 4 seems to fit my requirements (as far as I have explored the functionality).
If you're interested, I'll send you my "spike"-project.
Does anyone have an alternative solution? My project is a server application which holds objects in memory for fast operations, while modifications should also be persisted (no round trip to the DB). At some points in the code the object graphs are marked as deleted/terminated and are removed from the in-memory container. With the solution explained above I can reuse the generated model from EF and don't have to code and wrap all the objects myself again. The generated code for the self-tracking entities comes from T4 templates which can be adapted very easily.
Thanks a lot for other ideas/criticism.
The short answer is that you can still keep a graph (a collection of linked objects) in memory and write the changes to the database as they occur. If this is taking too long, you could put the changes onto a message queue (but that is probably overkill) or execute the updates and inserts on a separate thread.

Categories

Resources