I have been reading a lot of articles explaining how to set up Entity Framework's DbContext so that only one is created and used per HTTP web request using various DI frameworks.
Why is this a good idea in the first place? What advantages do you gain by using this approach? Are there certain situations where this would be a good idea? Are there things that you can do using this technique that you can't do when instantiating DbContexts per repository method call?
NOTE: This answer talks about the Entity Framework's DbContext, but
it is applicable to any sort of Unit of Work implementation, such as
LINQ to SQL's DataContext, and NHibernate's ISession.
Let's start by echoing Ian: having a single DbContext for the whole application is a Bad Idea. The only situation where this makes sense is when you have a single-threaded application and a database that is solely used by that single application instance. The DbContext is not thread-safe, and since it caches data, it gets stale pretty soon. This will get you into all sorts of trouble when multiple users/applications work on that database simultaneously (which is very common, of course). But I expect you already know that and just want to know why not to simply inject a new instance (i.e. with a transient lifestyle) of the DbContext into anyone who needs it. (For more information about why a single DbContext - or even one context per thread - is bad, read this answer.)
Registering a DbContext as transient could work, but typically you want a single instance of such a unit of work within a certain scope. In a web application, it can be practical to define such a scope on the boundaries of a web request; hence a Per Web Request lifestyle. This allows you to let a whole set of objects operate within the same context. In other words, they operate within the same business transaction.
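Just to illustrate, with ASP.NET Core's built-in container and EF Core such a per-web-request (scoped) registration could look roughly like this (MyDbContext and the connection string name are placeholders; other containers offer an equivalent "per web request" or scoped lifestyle):

// Minimal sketch; MyDbContext and "Default" are assumptions.
var builder = WebApplication.CreateBuilder(args);

// AddDbContext registers the context with the Scoped lifetime by default:
// one DbContext instance per web request, shared by everything resolved
// within that request.
builder.Services.AddDbContext<MyDbContext>(options =>
    options.UseSqlServer(builder.Configuration.GetConnectionString("Default")));

var app = builder.Build();
app.Run();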
If you have no goal of having a set of operations operate inside the same context, the transient lifestyle is fine, but there are a few things to watch out for:
Since every object gets its own instance, every class that changes the state of the system needs to call _context.SaveChanges() (otherwise changes get lost). This can complicate your code, adds a second responsibility to the class (the responsibility of controlling the context), and is a violation of the Single Responsibility Principle.
You need to make sure that entities [loaded and saved by a DbContext] never leave the scope of such a class, because they can't be used in the context instance of another class. This can complicate your code enormously, because when you need those entities, you need to load them again by id, which could also cause performance problems.
Since DbContext implements IDisposable, you probably still want to dispose all created instances. If you want to do this, you basically have two options. The first is to dispose them in the same method right after calling context.SaveChanges(), but then the business logic takes ownership of an object it got passed from the outside. The second option is to dispose all created instances on the boundary of the HTTP request, but in that case you still need some sort of scoping to let the container know when those instances need to be disposed.
Another option is to not inject a DbContext at all. Instead, you inject a DbContextFactory that is able to create a new instance (I used this approach in the past). This way the business logic controls the context explicitly. It might look like this:
public void SomeOperation()
{
    using (var context = this.contextFactory.CreateNew())
    {
        var entities = this.otherDependency.Operate(
            context, "some value");

        context.Entities.AddRange(entities);
        context.SaveChanges();
    }
}
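A minimal sketch of what such a factory could look like (the CreateNew name matches the snippet above; the concrete context type is an assumption):

public interface IDbContextFactory
{
    MyDbContext CreateNew();
}

public class DbContextFactory : IDbContextFactory
{
    // Every call hands out a fresh context; the calling code owns it
    // and is responsible for disposing it (the using block above).
    public MyDbContext CreateNew() => new MyDbContext();
}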
The plus side of this is that you manage the life of the DbContext explicitly and it is easy to set this up. It also allows you to use a single context in a certain scope, which has clear advantages, such as running code in a single business transaction, and being able to pass around entities, since they originate from the same DbContext.
The downside is that you have to pass the DbContext around from method to method (which is termed Method Injection). Note that in a sense this solution is the same as the 'scoped' approach, but now the scope is controlled in the application code itself (and is possibly repeated many times). It is the application that is responsible for creating and disposing the unit of work. Since the DbContext is created after the dependency graph is constructed, Constructor Injection is out of the picture and you need to fall back to Method Injection when you need to pass the context from one class to the other.
Method Injection isn't that bad, but when the business logic gets more complex, and more classes get involved, you will have to pass it from method to method and class to class, which can complicate the code a lot (I've seen this in the past). For a simple application, this approach will do just fine though.
Because of the downsides this factory approach has for bigger systems, another approach can be useful: the one where you let the container or the infrastructure code / Composition Root manage the unit of work. This is the style your question is about.
By letting the container and/or the infrastructure handle this, your application code is not polluted by having to create, (optionally) commit and Dispose a UoW instance, which keeps the business logic simple and clean (just a Single Responsibility). There are some difficulties with this approach. For instance, where do you Commit and Dispose the instance?
Disposing a unit of work can be done at the end of the web request. Many people, however, incorrectly assume that this is also the place to commit the unit of work. However, at that point in the application, you simply can't determine for sure that the unit of work should actually be committed. For example, if the business layer code threw an exception that was caught higher up the call stack, you definitely don't want to commit.
The real solution is again to explicitly manage some sort of scope, but this time do it inside the Composition Root. By abstracting all business logic behind the command/handler pattern, you will be able to write a decorator that can be wrapped around each command handler and takes care of this. Example:
class TransactionalCommandHandlerDecorator<TCommand>
    : ICommandHandler<TCommand>
{
    private readonly DbContext context;
    private readonly ICommandHandler<TCommand> decorated;

    public TransactionalCommandHandlerDecorator(
        DbContext context,
        ICommandHandler<TCommand> decorated)
    {
        this.context = context;
        this.decorated = decorated;
    }

    public void Handle(TCommand command)
    {
        this.decorated.Handle(command);
        this.context.SaveChanges();
    }
}
This ensures that you only need to write this infrastructure code once. Any solid DI container allows you to configure such a decorator to be wrapped around all ICommandHandler<T> implementations in a consistent manner.
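For example, with Simple Injector the wiring could look roughly like this (the assembly scan and MyDbContext are assumptions; other containers have similar decorator support):

var container = new Container();
container.Options.DefaultScopedLifestyle = new AsyncScopedLifestyle();

// One DbContext per scope (e.g. per web request); MyDbContext is a placeholder.
container.Register<DbContext, MyDbContext>(Lifestyle.Scoped);

// Register all command handler implementations in the given assembly...
container.Register(typeof(ICommandHandler<>), typeof(ICommandHandler<>).Assembly);

// ...and wrap every one of them in the transactional decorator.
container.RegisterDecorator(
    typeof(ICommandHandler<>),
    typeof(TransactionalCommandHandlerDecorator<>));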
There are two contradictory recommendations from Microsoft, and many people use DbContexts in completely divergent ways.
One recommendation is to "dispose DbContexts as soon as possible" because keeping a DbContext alive occupies valuable resources like database connections, etc.
The other states that one DbContext per request is highly recommended.
Those contradict each other, because if your request is doing a lot of work unrelated to the database, then your DbContext is kept around for no reason.
Thus it is a waste to keep your DbContext alive while your request is just waiting for random stuff to get done.
So many people who follow rule 1 keep their DbContexts inside their "repository pattern" and create a new instance per database query, so X DbContexts per request.
They just get their data and dispose the context ASAP.
This is considered by MANY people an acceptable practice.
While this has the benefit of occupying your database resources for the minimum time, it clearly sacrifices all the unit-of-work and caching candy EF has to offer.
Keeping a single multipurpose instance of DbContext alive maximizes the benefits of caching, but since DbContext is not thread-safe and each web request runs on its own thread, a DbContext per request is the longest you can keep it.
So the EF team's recommendation of one DbContext per request is clearly based on the fact that in a web application a unit of work will most likely live within one request, and that request has one thread. So one DbContext per request gives you the ideal combination of unit of work and caching.
But in many cases this is not true.
I consider logging a separate unit of work, so having a new DbContext for post-request logging in async threads is completely acceptable.
So, finally, it turns out that a DbContext's lifetime is restricted by these two parameters: unit of work and thread.
Not a single answer here actually answers the question. The OP did not ask about a singleton/per-application DbContext design, he asked about a per-(web)request design and what potential benefits could exist.
I'll reference http://mehdi.me/ambient-dbcontext-in-ef6/ as Mehdi is a fantastic resource:
Possible performance gains.
Each DbContext instance maintains a first-level cache of all the entities it loads from the database. Whenever you query an entity by its primary key, the DbContext will first attempt to retrieve it from its first-level cache before defaulting to querying it from the database. Depending on your data query pattern, re-using the same DbContext across multiple sequential business transactions may result in fewer database queries being made thanks to the DbContext first-level cache.
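As a rough illustration of that first-level cache (entity and context names here are made up, not from the article):

using (var context = new MyDbContext())
{
    // First call: loaded from the database and added to the context's
    // first-level cache (change tracker).
    var first = context.Customers.Find(42);

    // Second call with the same key: Find checks the cache first and
    // returns the already-tracked instance without another query.
    var second = context.Customers.Find(42);

    Console.WriteLine(ReferenceEquals(first, second)); // True
}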
It enables lazy-loading.
If your services return persistent entities (as opposed to returning view models or other sorts of DTOs) and you'd like to take advantage of lazy-loading on those entities, the lifetime of the DbContext instance from which those entities were retrieved must extend beyond the scope of the business transaction. If the service method disposed the DbContext instance it used before returning, any attempt to lazy-load properties on the returned entities would fail (whether or not using lazy-loading is a good idea is a different debate altogether which we won't get into here). In our web application example, lazy-loading would typically be used in controller action methods on entities returned by a separate service layer. In that case, the DbContext instance that was used by the service method to load these entities would need to remain alive for the duration of the web request (or at the very least until the action method has completed).
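To make the lazy-loading point concrete, here is a small sketch of what goes wrong when the context is disposed too early (again with made-up entity names, assuming lazy-loading proxies are enabled):

Order order;
using (var context = new MyDbContext())
{
    order = context.Orders.Find(1); // order.Customer is not loaded yet
}

// The context that produced 'order' has been disposed, so touching the
// lazy-loaded navigation property now fails (in EF6 typically with an
// ObjectDisposedException).
var customerName = order.Customer.Name;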
Keep in mind there are cons as well. That link contains many other resources to read on the subject.
Just posting this in case someone else stumbles upon this question and doesn't get absorbed in answers that don't actually address the question.
I'm pretty certain it is because the DbContext is not at all thread safe. So sharing the thing is never a good idea.
One thing that's not really addressed in the question or the discussion is the fact that DbContext can't cancel changes. You can submit changes, but you can't clear out the change tree, so if you use a per request context you're out of luck if you need to throw changes away for whatever reason.
Personally I create instances of DbContext when needed - usually attached to business components that have the ability to recreate the context if required. That way I have control over the process, rather than having a single instance forced onto me. I also don't have to create the DbContext at each controller startup regardless of whether it actually gets used. Then if I still want to have per request instances I can create them in the CTOR (via DI or manually) or create them as needed in each controller method. Personally I usually take the latter approach as to avoid creating DbContext instances when they are not actually needed.
It depends from which angle you look at it too. To me the per request instance has never made sense. Does the DbContext really belong into the Http Request? In terms of behavior that's the wrong place. Your business components should be creating your context, not the Http request. Then you can create or throw away your business components as needed and never worry about the lifetime of the context.
I agree with the previous opinions. It is worth saying that if you share a DbContext in a single-threaded app, you'll need more memory. For example, my web application on Azure (one extra small instance) needs another 150 MB of memory, and I have about 30 users per hour.
Here is a real example (memory usage graph; the application was deployed at 12 PM).
What I like about it is that it aligns the unit-of-work (as the user sees it - i.e. a page submit) with the unit-of-work in the ORM sense.
Therefore, you can make the entire page submission transactional, which you could not do if you were exposing CRUD methods with each creating a new context.
Another understated reason for not using a singleton DbContext, even in a single-threaded, single-user application, is the identity map pattern it uses. It means that every time you retrieve data, whether via a query or by id, the context keeps the retrieved entity instances in its cache. The next time you retrieve the same entity, it will give you the cached instance of the entity, if available, with any modifications you have made in the same session. This is necessary so that the SaveChanges method does not end up with multiple different entity instances of the same database record(s); otherwise, the context would have to somehow merge the data from all those entity instances.
The reason this is a problem is that a singleton DbContext can become a time bomb, eventually caching the whole database plus the overhead of the .NET objects in memory.
There are ways around this behavior, such as issuing LINQ queries with the .AsNoTracking() extension method. Also, these days PCs have a lot of RAM. But usually that is not the desired behavior.
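For example (illustrative names; the point is only the AsNoTracking call):

// Entities returned here are not added to the context's identity map /
// change tracker, so a long-lived context does not keep accumulating them.
var recentOrders = context.Orders
    .AsNoTracking()
    .Where(o => o.CreatedOn >= cutoff)
    .ToList();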
Another issue to watch out for with Entity Framework specifically is when using a combination of creating new entities, lazy loading, and then using those new entities (from the same context). If you don't use IDbSet.Create (as opposed to just new), lazy loading on that entity doesn't work when it is retrieved from the context it was created in. Example:
public class Foo
{
    public string Id { get; set; }
    public string BarId { get; set; }

    // lazy-loaded relationship to Bar
    public virtual Bar Bar { get; set; }
}

var foo = new Foo
{
    Id = "foo id",
    BarId = "some existing bar id"
};

dbContext.Set<Foo>().Add(foo);
dbContext.SaveChanges();

// some other code, using the same context

var fooFromContext = dbContext.Set<Foo>().Find("foo id");
var barProp = fooFromContext.Bar.SomeBarProp; // fails with a null reference even though BarId is set
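A sketch of the IDbSet.Create workaround mentioned above (same hypothetical Foo/Bar model; assumes proxy creation and lazy loading are enabled):

// Create() returns a change-tracking/lazy-loading proxy instead of a
// plain Foo, so the Bar navigation property can be lazy-loaded later.
var proxyFoo = dbContext.Set<Foo>().Create();
proxyFoo.Id = "foo id";
proxyFoo.BarId = "some existing bar id";

dbContext.Set<Foo>().Add(proxyFoo);
dbContext.SaveChanges();

var loaded = dbContext.Set<Foo>().Find("foo id"); // returns the tracked proxy
var barProp = loaded.Bar.SomeBarProp;             // Bar is lazy-loaded via BarId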
From the Documentation
Entity Framework contexts should be added to the services container
using the Scoped lifetime. This is taken care of automatically if you
use the helper methods as shown above. Repositories that will make use
of Entity Framework should use the same lifetime.
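(For reference, the helper method that quote refers to is presumably the AddDbContext registration, something along these lines; the context type and connection string name are placeholders.)

// AddDbContext registers ApplicationDbContext with the Scoped lifetime,
// i.e. one instance per request.
services.AddDbContext<ApplicationDbContext>(options =>
    options.UseSqlServer(Configuration.GetConnectionString("DefaultConnection")));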
I always thought that I should create a new context for every single unit of work I have to process. This led me to think that if I have a ServiceA and a ServiceB which apply different actions to the DbContext, they should get different instances of DbContext.
The documentation reads as follows:
Transient objects are always different; a new instance is provided to every controller and every service.
Scoped objects are the same within a request, but different across different requests.
Going back to ServiceA and ServiceB, it sounds to me like Transient is more suitable.
From my research, the context should only be saved once per HTTP request, but I really do not understand how this works.
Especially if we take a look at one Service:
using (var transaction = dbContext.Database.BeginTransaction())
{
    // Create some entity
    var someEntity = new SomeEntity();
    dbContext.SomeEntity.Add(someEntity);

    // Save in order to get the id of the entity
    dbContext.SaveChanges();

    // Create a related entity
    var relatedEntity = new RelatedEntity
    {
        SomeEntityId = someEntity.Id
    };
    dbContext.RelatedEntity.Add(relatedEntity);
    dbContext.SaveChanges();

    transaction.Commit();
}
Here we need to save the context in order to get the ID of an entity which is related to another one we have just created.
At the same time another service could update the same context. From what I have read, DbContext is not thread safe.
Should I use Transient in this case? Why does the documentation suggest, I should use Scoped?
Do I miss some important part of the framework?
As others already explained, you should use a scoped dependency for database contexts to make sure it will be properly reused. For concurrency, remember that you can query the database asynchronously too, so you might not need actual threads.
If you do need threads, i.e. background workers, then it’s likely that those will have a different lifetime than the request. As such, those threads should not use dependencies retrieved from the request scope. When the request ends and its dependency scope is being closed, disposable dependencies will be properly disposed. For other threads, this would mean that their dependencies might end up getting disposed although they still need them: Bad idea.
Instead, you should explicitly open a new dependency scope for every thread you create. You can do that by injecting the IServiceScopeFactory and creating a scope using CreateScope. The resulting object will then contain a service provider which you can retrieve your dependencies from. Since this is a separate scope, scoped dependencies like database contexts will be recreated for the lifetime of this scope.
In order to avoid getting into the service locator pattern, you should consider having one central service your thread executes that brings together all the necessary dependencies. The thread could then do this:
using (var scope = _scopeFactory.CreateScope())
{
    var service = scope.ServiceProvider.GetService<BackgroundThreadService>();
    service.Run();
}
The BackgroundThreadService and all its dependencies can then follow the common dependency injection way of receiving their dependencies.
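A minimal sketch of what that service could look like (the name comes from the snippet above; the concrete context type is an assumption, and the service itself also needs to be registered in the container):

public class BackgroundThreadService
{
    private readonly MyDbContext _context;

    // Resolved from the scope created for this background thread, so the
    // context's lifetime is tied to that scope rather than to a request.
    public BackgroundThreadService(MyDbContext context)
    {
        _context = context;
    }

    public void Run()
    {
        // ... do the background work using _context ...
        _context.SaveChanges();
    }
}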
I believe you won't face concurrency issues in most cases when you use the scoped lifetime. Even in the example you posted there are no concurrency problems, as the services in the current request will be called sequentially. I can't even imagine a case where you would run two or more services in parallel (it's possible, but not usual) in the context of one HTTP request (scope).
Lifetimes are just a way to store your objects (to keep it simple here). Just look at the lifetime managers in popular DI frameworks; they all work pretty much the same - they are dictionary-like objects that implement the disposable pattern. With Transient, the lookup always comes back empty, so the DI container creates a new instance each time one is requested. SingleInstance stores objects in something like a static concurrent dictionary, so the container creates the instance only once and then hands back the existing one.
Scoped usually means a scope object is used to store the created objects. In the ASP.NET pipeline it usually means the same as per request (as the scope can be passed through the pipeline).
To be short: don't worry, just use scoped; it is safe and you can think of it as per request.
I've tried to keep my explanation very simple; you can always look at the source code to find as much detail as you need here: https://github.com/aspnet/DependencyInjection
We are working on a WPF application using MVVM and powered by Entity Framework. We were very keen to allow users to multi-window this app for usability purposes. However, that has the potential to cause problems with EF. If we stick to the usual advice of creating one copy of the repository per ViewModel and someone opens multiple windows of the same ViewModel, it could cause "multiple instances of IEntityChangeTracker" errors.
Rather than go with a Singleton, which has its own problems, we solved this by putting a Refresh method on the repository that gets a fresh data context. Then we do things like this all over the shop:
using (IRepository r = Rep.Refresh())
{
    r.Update(CurrentCampaign);
    r.SaveChanges();
}
Which is mostly fine. However, it causes problems with maintaining state. If the context is refreshed while a user is working with an object, their changes will be lost.
I can see two ways round this, both of which have their own drawbacks.
We call SaveChanges constantly. This has the potential to slow down the application with constant database calls. Plus there are occasions when we don't want to store incomplete objects.
We copy EF objects into memory when loaded, have the user work with those, and then add a "Save" button which copies all the objects back to the EF objects and saves. We could do this with an automapper, but it still seems unnecessary faff.
Is there another way?
I believe having the repository for accessing entity framework as a singleton may not always be wrong.
If you have a scenario where you have a client-side repository, i.e. a repository which is part of the executable of the client application and is used by one client, then a singleton might be OK. Of course, I would not use a singleton on the server side.
I asked Brian Noyes (Microsoft MVP) a similar question on a "MVVM" course on the pluralsight website.
I asked: "What is the correct way to dispose of client services which are used in the view model?"
And in his response he wrote: "...most of my client services are Singletons anyway and live for the life of the app."
In addition having a singleton does not prevent you from writing unit tests, as long as you have an interface for the singleton.
And if you use a dependency injection framework (see Mark's comment), which is a good idea in itself, changing to singleton instantiation is just a matter of making a small change in the setup of the injection container for the respective class.
In the past I have used AutoFac to inject an Entity Framework DB context into various services with an InstancePerRequest lifetime.
builder.RegisterType<MyDataContext>()
    .As<IDataContext>()
    .As<IUnitOfWork>()
    .InstancePerRequest();
This has allowed me to share the context across services when injecting multiple services into a controller.
// Note each of these services takes an IDataContext via constructor injection
public FilesController(
    IAnalysisService analysisService,
    IUserService userService)
{
}
I have an action filter that commits my context at the end of each action request (I am probably looking at changing this, but for now it suits the purpose of this question).
public class UnitOfWorkActionAttribute : ActionFilterAttribute
{
    public IUnitOfWork UnitOfWork { get; set; }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        UnitOfWork.SaveChangesAsync().WaitAndUnwrapException();
    }
}
My reading of async in MVC is that the thread that started the request is not guaranteed to finish it due to it being freed up when using await.
With AutoFac injection, does this mean that the UnitOfWork instance injected into the ActionFilterAttribute might differ from the one injected into the controller, or is InstancePerRequest not affected by changes to the thread handling the request?
InstancePerRequest won't be affected by changes to the Thread handling the request, as it does not store anything inside of the Thread. If you read the source of the InstancePerRequest() method you'll see that it's just calling InstancePerMatchingLifetimeScope with a well-known scope tag.
On the other side of things, in the AutoFac.Mvc project, you'll see in the RequestLifetimeScopeProvider class that it stores the scope in the HttpContext, which is immune to changes in the executing thread.
The only remaining question is: Does your filter execute inside of the scope of the AutoFac filter that starts the request scope? Well, the surest way to find out is to make sure that your UnitOfWork gets injected into your filter. Assuming it does, you can be assured that it is the SAME UnitOfWork that existed throughout the rest of the request.
I would agree with others' suggestions to NOT commit your UnitOfWork (DbContext) in a filter like this. One set of reasons you've already read in this answer. Another reason is that it implies (to me) that your chosen pattern is to NEVER call SaveChanges() from inside of your individual service methods and instead wait until the end to atomically commit everything. This will work fine until the moment where it doesn't, for example when EF updates or inserts your entities in the wrong order. At this point, you'll have painted yourself into a corner.
I might also presume that you're doing this as a way of avoiding the headaches of explicitly using Transactions, since if you only call SaveChanges() once, all updates will happen in a single, isolated transaction internal to SaveChanges() and you can therefore avoid this pain. This idea of never using explicit transactions in EF, too, is a trap once your app grows to any substantial complexity, though "popularly" championed in various articles around the net. I suggest you plan your infrastructure to accommodate it. Thankfully if you're using something like AutoFac, you can fairly transparently either attach a transaction to your request scope (the all-or-nothing approach) or create a new scope at runtime that contains your transaction, for when you selectively want a transaction.
One possible solution is to call your read-only service calls without a surrounding transaction (to avoid any DB locking) and only use transactions when you know you'll be writing to the database.
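As a rough sketch of that split (all names are illustrative): reads go straight through the context without an explicit transaction, and related writes get wrapped in one.

// Read path: no explicit transaction, no unnecessary locking.
var customer = context.Customers
    .AsNoTracking()
    .Single(c => c.Id == customerId);

// Write path: related changes are wrapped in one explicit transaction.
using (var transaction = context.Database.BeginTransaction())
{
    context.Orders.Add(newOrder);
    context.SaveChanges();

    context.AuditEntries.Add(new AuditEntry { Message = "Order created" });
    context.SaveChanges();

    transaction.Commit();
}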
The current retrieval pattern in my Service classes (in an ASP.NET MVC application) looks something like:
public Client Get(int id)
{
    using (var repo = _repoFactory.Get<Client>())
    {
        return repo.Get(id);
    }
}
Where _repoFactory.Get<T>() returns a repository which, when disposed, also disposes the Entity Framework DbContext.
However, when the consumer of the Get(int id) method needs to use navigation properties on the Client object, an exception is thrown because the context is already disposed.
I can foresee a few ways to negotiate this problem:
Don't use navigation properties outside of the service
Don't use lazy-loading navigation properties
Find some other way to dispose of the context when the request is finished
What is the "correct" (or least incorrect) way and how can it be accomplished?
All the ways that you suggested are "correct," and each has its advantages and disadvantages. You'll need to decide which approach you want to use.
Don't use navigation properties outside of the service
This is difficult to enforce if the service is returning entities. In my current project, we make heavy use of "DTO"s, which are new classes that represent the data that we expect to need in a given context. Because these aren't entities, we know that any property on them will be fully hydrated before it is returned from the repository.
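For example, a sketch of the DTO approach (ClientDto, its properties, and the repository's Query() method are made up for illustration): the query projects straight into the DTO, so everything the caller needs is hydrated before the context goes away.

public class ClientDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int OrderCount { get; set; }
}

public ClientDto Get(int id)
{
    using (var repo = _repoFactory.Get<Client>())
    {
        // The projection is executed while the context is still alive,
        // so nothing on the returned object depends on lazy loading.
        return repo.Query()
            .Where(c => c.Id == id)
            .Select(c => new ClientDto
            {
                Id = c.Id,
                Name = c.Name,
                OrderCount = c.Orders.Count()
            })
            .Single();
    }
}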
Don't use lazy-loading navigation properties
This is roughly the same as above, except that you're allowing for the possibility of certain navigation properties to be eager-loaded. Again, how does the developer consuming this data know which properties are and are not going to be available? "DTO"s solve this problem, but they also introduce a bunch of extra classes that are almost identical to the existing entities.
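If you do allow some eager loading, the service might look roughly like this (the Orders navigation property and the repository's Query() method are assumptions; Include is the standard EF extension method):

public Client Get(int id)
{
    using (var repo = _repoFactory.Get<Client>())
    {
        // Orders is loaded up front, so the caller can use it after the
        // context has been disposed; navigations that are not Included
        // remain off-limits.
        return repo.Query()
            .Include(c => c.Orders)
            .Single(c => c.Id == id);
    }
}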
Find some other way to dispose of the context when the request is finished
Usually people do this by having contexts bound in a per-request scope in their DI framework, and allow the DI framework to take care of instantiation/disposal of their contexts.
The main danger with this approach is that, while lazy-loading properties won't throw exceptions when accessed, each access requires another database round-trip. This makes it easy for developers to accidentally write code that ends up making thousands of round-trips when only two or three would be required otherwise.
However, if you have a reliable way of identifying performance issues and addressing them, then you could use this approach in the general case and then add a little eager-loading where you find it to be necessary. For example, MiniProfiler can sit on your front-end and give you information about the database round-trips you're making, as well as warnings when it notices that many database queries are practically identical.
I want to know how most people are dealing with the repository pattern when it involves hitting the same database multiple times (sometimes transactionally) and trying to do so efficiently while maintaining database agnosticism and using multiple repositories together.
Let's say we have repositories for three different entities; Widget, Thing and Whatsit. Each repository is abstracted via a base interface as per normal decoupling design processes. The base interfaces would then be IWidgetRepository, IThingRepository and IWhatsitRepository.
Now we have our business layer or equivalent (whatever you want to call it). In this layer we have classes that access the various repositories. Often the methods in these classes need to do batch/combined operations where multiple repositories are involved. Sometimes one method may make use of another method internally, while that method can still be called independently. What about, in this scenario, when the operation needs to be transactional?
Example:
class Bob
{
    private IWidgetRepository _widgetRepo;
    private IThingRepository _thingRepo;
    private IWhatsitRepository _whatsitRepo;

    public Bob(IWidgetRepository widgetRepo, IThingRepository thingRepo, IWhatsitRepository whatsitRepo)
    {
        _widgetRepo = widgetRepo;
        _thingRepo = thingRepo;
        _whatsitRepo = whatsitRepo;
    }

    public void DoStuff()
    {
        _widgetRepo.StoreSomeStuff();
        _thingRepo.ReadSomeStuff();
        _whatsitRepo.SaveSomething();
    }

    public void DoOtherThing()
    {
        _widgetRepo.UpdateSomething();
        DoStuff();
    }
}
How do I keep my access to that database efficient and not have a constant stream of open-close-open-close on connections and inadvertent invocation of MSDTC and whatnot? If my database is something like SQLite, standard mechanisms like creating nested transactions are going to inherently fail, yet the business layer should not have to concern itself with such things.
How do you handle such issues? Does ADO.Net provide simple mechanisms to handle this or do most people end up wrapping their own custom bits of code around ADO.Net to solve these types of problems?
Basically you want to utilize a UnitOfWork for this. I personally use NHibernate because their ISession interface is basically a Unit of Work, so it will batch together all commands and send them to the database together (there are additional smarts and object state tracking in there as well, which help with this). So the number of discrete commands sent to the database will depend on the life cycle of your ISession. In a web context I generally go with a Conversation Per Request, which means the ISession is created at the beginning of the request, and flushed (sent to the DB) at the end of the request.
There is a lot of potential here to change the life cycle of the session based on whether or not you need shorter or longer session, and you can also utilize transactions which can have a separate life cycle if you need to.
Consider that your abstracted repositories could be implemented in any number of databases: SQLite, Microsoft SQL Server, direct file access, etc -- even though you know that they are from the same database, it's not reasonable for "Bob" to attempt to make sure each repository can be transactionalized with respect to the other repositories.
Bob should perhaps talk to a DoStuffService that contains concrete implementations of your repositories. Since this service is creating concrete implementations of your repositories, it is capable of creating an appropriate transaction. This service would then be responsible for safely executing a UnitOfWork (thanks, ckramer) on behalf of Bob.
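A rough sketch of that idea (all type names are hypothetical, and the transaction API assumes an EF-style context underneath): the service creates one context and one transaction, builds the concrete repositories on top of it, and hands them to Bob.

public class DoStuffService
{
    public void Execute(Action<Bob> work)
    {
        using (var context = new MyDataContext())
        using (var transaction = context.Database.BeginTransaction())
        {
            // All three repositories share the same context, so everything
            // Bob does runs inside one unit of work / one transaction.
            var bob = new Bob(
                new WidgetRepository(context),
                new ThingRepository(context),
                new WhatsitRepository(context));

            work(bob);

            context.SaveChanges();
            transaction.Commit();
        }
    }
}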