I have been reading up on the lifespan of the DataContext, trying to work out the best way of doing things.
Since I want to re-use my DAL in a web application, I decided to go with the DataContext-per-business-object-request approach.
My idea was to extend my L2S entities from the dbml file to retrieve information from the database, creating a separate context per request, e.g.
public partial class AnEntity
{
    public IEnumerable<RelatedEntity> GetRelatedEntities()
    {
        using (var dc = new MyDataContext())
        {
            // Materialize with ToList(); returning the raw query would defer
            // execution until after the context has been disposed.
            return dc.RelatedEntities.Where(r => r.EntityID == this.ID).ToList();
        }
    }
}
In terms of returning the entities... do I need to return POCOs at this point, or is it OK to simply return the business object produced by the query? I understand that if I were to try to access properties of the returned entity (after the DataContext has been disposed) it would fail. However, this is the reason I have decided to implement this type of method, e.g.
Instead of:
AnEntity entity = null;
using (var repo = new EntityRepo())
{
    entity = repo.GetEntity(12345);
}
var related = entity.RelatedEntities; // this would cause an exception
In theory I should be able to do:
AnEntity entity = null;
using (var repo = new EntityRepo())
{
    entity = repo.GetEntity(12345);
}
var related = entity.GetRelatedEntities();
Given the circumstances of my particular app (it needs to work in a Windows service and a web application), I would like to know whether this seems a plausible approach, whether there are obvious flaws, and whether there are better approaches for what I am trying to do.
Thanks.
Generally speaking, as long as you are not calling a single DataContext object from more than one thread, you should be OK. In other words, use one DataContext object per thread, and do not share data or state between different DataContext objects.
The remaining multi-threaded issues have to do with concurrency in the database, not threading operations.
Other than these caveats, your approach seems sound. You can either use partial classes to implement your business methods, or you can add a business layer between the Linq to SQL classes and your repository.
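For illustration, a minimal sketch of the one-context-per-thread idea (MyDataContext stands in for the generated context; the helper class is hypothetical, and disposal is left to whoever owns the thread):
using System;
using System.Threading;

public static class PerThreadContext
{
    // Each thread lazily creates its own instance; nothing is shared across threads.
    private static readonly ThreadLocal<MyDataContext> _context =
        new ThreadLocal<MyDataContext>(() => new MyDataContext());

    public static MyDataContext Current
    {
        get { return _context.Value; }
    }
}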
You can get away with this:
var repo = new EntityRepo();
entity = repo.GetEntity(12345);
var related = entity.RelatedEntities;
See this StackOverflow post for an explanation of why not disposing your context doesn't cause connection leaks or the like.
The repository and context will get cleaned up by the garbage collector when the entities they fetched fall out of scope (for a website, at the end of the request).
Edit: MSDN documents that connections are opened as late as possible and closed as soon as possible, so skipping the using doesn't cause connection pool problems.
We have an application that uses an SDK provided by our provider to integrate easily with them. This SDK connects to an AMQP endpoint and simply distributes, caches and transforms messages for our consumers. Previously this integration ran over HTTP with XML as the data source, and the old integration had two ways of caching the DataContext: per web request and per managed thread id.
Now, however, we integrate over AMQP rather than HTTP, which is transparent to us since the SDK does all the connection logic and we are only left with defining our consumers. So there is no option to cache the DataContext "per web request"; only per managed thread id is left.
I implemented the chain of responsibility pattern, so when an update comes in it is put into a pipeline of handlers which use the DataContext to update the database according to the new updates. This is what the invocation method of the pipeline looks like:
public Task Invoke(TInput entity)
{
    object currentInputArgument = entity;
    for (var i = 0; i < _pipeline.Count; ++i)
    {
        var action = _pipeline[i];
        // IsAssignableFrom also covers the non-generic Task return type,
        // which IsSubclassOf(typeof(Task)) would miss.
        if (typeof(Task).IsAssignableFrom(action.Method.ReturnType))
        {
            if (action.Method.ReturnType.IsConstructedGenericType)
            {
                dynamic tmp = action.DynamicInvoke(currentInputArgument);
                currentInputArgument = tmp.GetAwaiter().GetResult();
            }
            else
            {
                (action.DynamicInvoke(currentInputArgument) as Task).GetAwaiter().GetResult();
            }
        }
        else
        {
            currentInputArgument = action.DynamicInvoke(currentInputArgument);
        }
    }
    return Task.CompletedTask;
}
The problem (at least as I understand it) is that this chain of responsibility is a chain of methods returning/starting new tasks, so when an update for entity A comes in it is handled by, say, managed thread id = 1, and when the same entity A arrives again some time later it is handled by managed thread id = 2. This leads to:
System.InvalidOperationException: 'An entity object cannot be referenced by multiple instances of IEntityChangeTracker.'
because the DataContext from managed thread id = 1 already tracks entity A (at least, that's what I think is happening).
My question is: how can I cache the DataContext in my case? Did you have the same problem? I read this and this answer, and from what I understood, using one static DataContext is not an option either.
Disclaimer: I should have said that we inherited the application and I cannot answer why it was implemented like that.
Disclaimer 2: I have little to no experience with EF.
Community-asked questions:
What version of EF are we using? 5.0.
Why do entities live longer than the context? They don't, but maybe you are asking why entities need to live longer than the context. I use repositories that use the cached DataContext to get entities from the database and store them in an in-memory collection, which I use as a cache.
This is how entities are "extracted", where DatabaseDataContext is the cached DataContext I am talking about (a blob object holding all the database sets):
protected IQueryable<T> Get<TProperty>(params Expression<Func<T, TProperty>>[] includes)
{
    var query = DatabaseDataContext.Set<T>().AsQueryable();
    if (includes != null && includes.Length > 0)
    {
        foreach (var item in includes)
        {
            query = query.Include(item);
        }
    }
    return query;
}
Then, whenever my consumer application receives an AMQP message, my chain of responsibility begins by checking whether I have already processed this message and its data. So I have a method that looks like this:
public async Task<TEntity> Handle<TEntity>(TEntity sportEvent)
    where TEntity : ISportEvent
{
    // ... some unimportant business logic

    // save the sport
    if (sport.SportID > 0) // <-- this basically checks whether the so-called
                           // sport is found in the cache or not;
                           // if it is found, we update the entity in the db
                           // and update the cache after that
    {
        _sportRepository.Update(sport); /*
                                         * because a message update for the same sport can come,
                                         * and since the DataContext is cached by threadId like I said,
                                         * and Update can be executed from different threads,
                                         * this is where the aforementioned exception is thrown
                                         */
    }
    else // if not, simply insert the entity in the db and the caches
    {
        _sportRepository.Insert(sport);
    }
    _sportRepository.SaveDbChanges();

    // ... updating caches logic
}
I thought that getting entities from the database with the AsNoTracking() method, or detaching entities every time I "update" or "insert" an entity, would solve this, but it did not.
Whilst there is a certain overhead to newing up a DbContext, and using DI to share a single instance of a DbContext within a web request can save some of this overhead, simple CRUD operations can just new up a DbContext for each action.
Looking at the code you have posted so far, I would probably new up a private instance of the DbContext in the repository constructor, and then new up a repository for each method.
Then your method would look something like this:
public async Task<TEntity> Handle<TEntity>(TEntity sportEvent)
    where TEntity : ISportEvent
{
    var sportsRepository = new SportsRepository();

    // ... some unimportant business logic

    // save the sport
    if (sport.SportID > 0)
    {
        sportsRepository.Update(sport);
    }
    else
    {
        sportsRepository.Insert(sport);
    }
    sportsRepository.SaveDbChanges();
}
public class SportsRepository
{
    private readonly DbContext _dbContext;

    public SportsRepository()
    {
        _dbContext = new DbContext(); // substitute your concrete context type here
    }
}
You might also want to consider the use of Stub Entities as a way around sharing a DbContext with other repository classes.
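For illustration, a rough sketch of the stub-entity idea (the League type and key property are made-up examples, not your model): rather than loading a related row through another repository's context, you attach a throwaway instance carrying only the primary key and wire up the relationship from it.
// Hypothetical example: relate a Sport to a League without querying for the League.
var leagueStub = new League { LeagueID = sport.LeagueID };
_dbContext.Leagues.Attach(leagueStub); // tracked as Unchanged; no SELECT issued
sport.League = leagueStub;             // relationship set via the stub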
Since this is about some existing business application I will focus on ideas that can help solve the issue rather than lecture about best practices or propose architectural changes.
I know this is kind of obvious, but sometimes rewording error messages helps us better understand what's going on, so bear with me.
The error message says an entity is being used by multiple data contexts, which tells us that there are multiple DbContext instances and that the entity is referenced by more than one of them.
The question then states that there is a data context per thread (which used to be per web request) and that entities are cached.
So it seems safe to assume entities are read from a db context on a cache miss and returned from the cache on a hit. Attempting to update entities loaded from one db context instance using a second db context instance causes the failure. We can conclude that the exact same entity instance was used in both operations, and that no serialization/deserialization is in place for accessing the cache.
DbContext instances are themselves entity caches through their internal change-tracker mechanism, and this error is a safeguard protecting its integrity. Since the idea is to have a long-running process handling simultaneous requests through multiple db contexts (one per thread) plus a shared entity cache, it would be very beneficial, performance-wise and memory-wise (change tracking will likely increase memory consumption over time), to either change the db contexts' lifecycle to be per message or to empty their change tracker after each message is processed.
Of course, in order to process entity updates, entities need to be attached to the current db context right after being retrieved from the cache and before any changes are applied to them.
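A sketch of both suggestions against the EF 5 DbContext API (the Sport type is borrowed from the question; the helper class itself is hypothetical):
using System.Data;        // EntityState lives here in EF 5
using System.Data.Entity;
using System.Linq;

public static class ContextHygiene
{
    // Empty the change tracker once a message has been processed so the
    // long-lived, per-thread context stops referencing cached entities.
    public static void DetachAll(DbContext db)
    {
        foreach (var entry in db.ChangeTracker.Entries().ToList())
        {
            entry.State = EntityState.Detached;
        }
    }

    // Re-associate a cached entity with the current context before updating.
    public static void UpdateFromCache(DbContext db, Sport cached)
    {
        db.Set<Sport>().Attach(cached);
        db.Entry(cached).State = EntityState.Modified;
        db.SaveChanges();
    }
}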
I am trying to write some business layer logic for an asp.net app to insert or update an object. I am getting an object out of the business layer and then passing it back in to be saved to the db. The data context is contained in the business layer which is what I think is causing the exception.
The exception is "An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported."
I'm sure I am missing some small setting but I'm just not sure what.
This is the code that does the insertion and updating:
public static void Save(Order order)
{
    using (TicketInformationDataContext db = new TicketInformationDataContext())
    {
        if (order.OrderID <= 0)
        {
            db.Orders.InsertOnSubmit(order);
        }
        else
        {
            db.ObjectTrackingEnabled = true;
            // Attach once; db.GetTable(typeof(Order)) is the same table as
            // db.Orders, so attaching through both would throw.
            db.Orders.Attach(order, true);
        }
        db.SubmitChanges();
    }
}
I believe your problem is that you are creating separate contexts for loading the Order and later for saving it.
You should design your repositories to work against a context you can pass during construction (or preferably inject through IoC).
That way, both operations would work against the same context.
Remember that entities are bound to their context, and attempting to mix them will cause all sorts of problems, especially with lazy initialization or associated entities.
Please see the accepted response for this similar question, which discusses repository design and the unit of work design pattern.
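To make that concrete, a sketch of a repository that works against a caller-owned context, reusing the names from the question (the Status property is invented for the example):
using System.Linq;

public class OrderRepository
{
    private readonly TicketInformationDataContext _db;

    public OrderRepository(TicketInformationDataContext db)
    {
        _db = db; // shared context; its lifetime belongs to the caller
    }

    public Order GetOrder(int id)
    {
        return _db.Orders.SingleOrDefault(o => o.OrderID == id);
    }

    public void Save()
    {
        _db.SubmitChanges();
    }
}

// Usage: one context spans both the load and the save.
using (var db = new TicketInformationDataContext())
{
    var repo = new OrderRepository(db);
    var order = repo.GetOrder(12345);
    order.Status = "Shipped"; // hypothetical property
    repo.Save();              // same context that loaded the order
}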
In my ASP.NET MVC application I need to implement persistence of data. I've chosen Entity Framework for its ability to create classes, database tables and queries from the entity model, so that I don't have to write SQL table creation or LINQ to SQL queries by hand. So simplicity is my goal.
My approach was to create the model and then a custom HttpModule that gets called at the end of each request and just calls SaveChanges() on the context. That made my life very hard: Entity Framework kept throwing very strange exceptions. Sometimes it worked (no exception), but sometimes it did not. At first I was trying to fix the problems one by one, but when I got yet another one I realized that my general approach was probably wrong.
So what is the general practice for implementing persistence in an ASP.NET MVC application? Do I just call SaveChanges after each change? Isn't that a little inefficient? And I don't know how to do that with the services pattern anyway (services work with entities, so I'd have to pass the context instance to them so that they could save changes if they make some).
Some links to study materials or tutorials are also appreciated.
Note: this question asks about programming practice. I ask those who would consider it vague to bear in mind that it is still solving my very particular problem, and that the right technique will save me a lot of technical problems, before voting to close.
You just need to make sure SaveChanges gets called before your request finishes. At the bottom of a controller action is an ideal place. My controller actions typically look like this:
public ActionResult SomeAction(...)
{
    _repository.DoSomething();
    ...
    _repository.DoSomethingElse();
    ...
    _repository.SaveChanges();
    return View(...);
}
This has the added benefit that if an exception gets thrown, then SaveChanges will not get called. And you can either handle the exception in the action or in Controller.OnException.
It's going to be no more or less efficient than calling a stored procedure that many times (with respect to the number of connections that need to be made).
Nominally, you would make all your changes to the object set, then SaveChanges to commit all those changes.
So instead of doing this:
mySet.Objects.Add(someObject);
mySet.SaveChanges();
mySet.OtherObjects.Add(someOtherObject);
mySet.SaveChanges();
You just need to do:
mySet.Objects.Add(someObject);
mySet.OtherObjects.Add(someOtherObject);
mySet.SaveChanges();
// Commits Both Changes
Usually your data access is wrapped by an object implementing the repository pattern. You then invoke a Save() method on the repository.
Something like:
var customer = customerRepository.Get(id);
customer.FirstName = firstName;
customer.LastName = lastName;
customerRepository.SaveChanges();
The repository can then be wrapped by a service layer to provide view model objects or DTOs.
Isn't that a little inefficient?
Don't prematurely optimise. When you have a performance issue, analyse the performance, identify a cause and then optimise. Repeat.
Update
A repository wraps data access, usually for a single entity. A service layer wraps business logic and can access multiple entities through multiple repositories. It usually deals with 'slim' models or DTOs.
An example could be something like getting a list of invoices for a customer:
public CustomerViewModel GetCustomerWithInvoices(int id) {
    var customer = customerRepository.Get(id);
    var invoiceList = invoiceRepository.GetAllInvoicesFor(id);
    return new CustomerViewModel {
        Customer = customer,
        Invoices = invoiceList
    };
}
I have a Linq object, and I want to make changes to it and save it, like so:
public void DoSomething(MyClass obj) {
    obj.MyProperty = "Changed!";
    MyDataContext dc = new MyDataContext();
    dc.GetTable<MyClass>().Attach(obj, true); // throws exception
    dc.SubmitChanges();
}
The exception is:
System.InvalidOperationException: An entity can only be attached as modified without original state if it declares a version member or does not have an update check policy.
It looks like I have a few choices:
1. Put a version member on every one of my Linq classes & tables (100+) that I need to use in this way.
2. Find the data context that originally created the object and use that to submit changes.
3. Implement OnLoaded in every class and save a copy of this object that I can pass to Attach() as the baseline object.
4. To hell with concurrency checking; load the DB version just before attaching and use that as the baseline object (NOT!!!)
Option (2) seems the most elegant method, particularly if I can find a way of storing a reference to the data context when the object is created. But - how?
Any other ideas?
EDIT
I tried to follow Jason Punyon's advice and created a concurrency field on one table as a test case. I set all the right properties (Time Stamp = true, etc.) on the field in the dbml file, and I now have a concurrency field... and a different error:
System.NotSupportedException: An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported.
So what the heck am I supposed to attach, then, if not an existing entity? If I wanted a new record, I would do an InsertOnSubmit()! So how are you supposed to use Attach()?
Edit - FULL DISCLOSURE
OK, I can see it's time for full disclosure of why all the standard patterns aren't working for me.
I have been trying to be clever and make my interfaces much cleaner by hiding the DataContext from the "consumer" developers. This I have done by creating a base class
public class LinqedTable<T> where T : LinqedTable<T> {
...
}
... and every single one of my tables has the "other half" of its generated version declared like so:
public partial class MyClass : LinqedTable<MyClass> {
}
Now LinqedTable has a bunch of utility methods, most particularly things like:
public static T Get(long ID) {
    // code to load the record with the given ID
    // so you can write things like:
    //   MyClass obj = MyClass.Get(myID);
    // instead of:
    //   MyClass obj = myDataContext.GetTable<MyClass>().Where(o => o.ID == myID).SingleOrDefault();
}
public static Table<T> GetTable() {
    // so you can write queries like:
    //   var q = MyClass.GetTable();
    // instead of:
    //   var q = myDataContext.GetTable<MyClass>();
}
Of course, as you can imagine, this means that LinqedTable must somehow have access to a DataContext. Up until recently I was achieving this by caching the DataContext in a static context. Yes, "up until recently", because that "recently" is when I discovered that you're not really supposed to hang on to a DataContext for longer than a unit of work, otherwise all sorts of gremlins start coming out of the woodwork. Lesson learned.
So now I know that I can't hang on to that data context for too long... which is why I started experimenting with creating a DataContext on demand, cached only on the current LinqedTable instance. This then led to the problem where the newly created DataContext wants nothing to do with my object, because it "knows" that it's being unfaithful to the DataContext that created it.
Is there any way of pushing the DataContext info onto the LinqedTable at the time of creation or loading?
This really is a poser. I definitely do not want to compromise on all these convenience functions I've put into the LinqedTable base class, and I need to be able to let go of the DataContext when necessary and hang on to it while it's still needed.
Any other ideas?
Updating with LINQ to SQL is, um, interesting.
If the data context is gone (which in most situations, it should be), then you will need to get a new data context, and run a query to retrieve the object you want to update. It's an absolute rule in LINQ to SQL that you must retrieve an object to delete it, and it's just about as iron-clad that you should retrieve an object to update it as well. There are workarounds, but they are ugly and generally have lots more ways to get you in trouble. So just go get the record again and be done with it.
Once you have the re-fetched object, then update it with the content of your existing object that has the changes. Then do a SubmitChanges() on the new data context. That's it! LINQ to SQL will generate a fairly heavy-handed version of optimistic concurrency by comparing every value in the record to the original (in the re-fetched) record. If any value changed while you had the data, LINQ to SQL will throw a concurrency exception. (So you don't need to go altering all your tables for versioning or timestamps.)
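A sketch of that re-fetch pattern, using the names from the question (the ID property and the copied values are assumptions):
// requires: using System.Linq;
public void UpdateDetached(MyClass detached)
{
    using (var dc = new MyDataContext())
    {
        // Retrieve the current record with a fresh context...
        var current = dc.GetTable<MyClass>().Single(o => o.ID == detached.ID);

        // ...copy the changed values onto it...
        current.MyProperty = detached.MyProperty;

        // ...and submit. LINQ to SQL compares each column against the
        // re-fetched originals, so you get optimistic concurrency for free.
        dc.SubmitChanges();
    }
}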
If you have any questions about the generated update statements, you'll have to break out SQL Profiler and watch the updates go to the database. Which is actually a good idea, until you get confidence in the generated SQL.
One last note on transactions - the data context will generate a transaction for each SubmitChanges() call, if there is no ambient transaction. If you have several items to update and want to run them as one transaction, make sure you use the same data context for all of them, and wait to call SubmitChanges() until you've updated all the object contents.
If that approach to transactions isn't feasible, then look up the TransactionScope object. It will be your friend.
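For example, a sketch of wrapping two contexts in one ambient transaction (MyDataContext is a stand-in for your generated context):
using System.Transactions; // reference System.Transactions.dll

using (var scope = new TransactionScope())
{
    using (var dc1 = new MyDataContext())
    {
        // ...modify objects on the first context...
        dc1.SubmitChanges(); // enlists in the ambient transaction
    }
    using (var dc2 = new MyDataContext())
    {
        // ...modify objects on the second context...
        dc2.SubmitChanges(); // same transaction
    }
    scope.Complete(); // nothing commits until this line
}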
I think option (2) is not the best. It sounds like you're going to create a single DataContext and keep it alive for the entire lifetime of your program, which is a bad idea. DataContexts are lightweight objects meant to be spun up when you need them. Trying to keep the references around is also probably going to tightly couple areas of your program you'd rather keep separate.
Running a hundred ALTER TABLE statements one time, regenerating the context and keeping the architecture simple and decoupled is the elegant answer...
find the data context that originally created the object and use that to submit changes
Where did your DataContext go? Why is it so hard to find? You're only using one at any given time, right?
So what the heck am I supposed to attach, then, if not an existing entity? If I wanted a new record, I would do an InsertOnSubmit()! So how are you supposed to use Attach()?
You're supposed to attach an instance that represents an existing record... but one that was not loaded by another DataContext; you can't have two contexts tracking record state on the same instance. If you produce a new instance (i.e. a clone) you'll be good to go.
You might want to check out this article and its concurrency patterns for update and delete section.
The "An entity can only be attached as modified without original state if it declares a version member" error when attaching an entitity that has a timestamp member will (should) only occur if the entity has not travelled 'over the wire' (read: been serialized and deserialized again). If you're testing with a local test app that is not using WCF or something else that will result in the entities being serialized and deserialized then they will still keep references to the original datacontext through entitysets/entityrefs (associations/nav. properties).
If this is the case, you can work around it by serializing and deserializing it locally before calling the datacontext's .Attach method. E.g.:
// Requires: using System.IO; and using System.Runtime.Serialization;
internal static T CloneEntity<T>(T originalEntity)
{
    Type entityType = typeof(T);
    DataContractSerializer ser = new DataContractSerializer(entityType);
    using (MemoryStream ms = new MemoryStream())
    {
        ser.WriteObject(ms, originalEntity);
        ms.Position = 0;
        return (T)ser.ReadObject(ms);
    }
}
Alternatively, you can detach the entity by setting all entity sets/entity refs to null, but that is more error-prone, so although it is a bit more expensive, I just use the DataContractSerializer method above whenever I want to simulate n-tier behavior locally...
(related thread: http://social.msdn.microsoft.com/Forums/en-US/linqtosql/thread/eeeee9ae-fafb-4627-aa2e-e30570f637ba )
You can reattach to a new DataContext. The only thing that prevents you from doing so under normal circumstances is the property changed event registrations that occur within the EntitySet<T> and EntityRef<T> classes. To allow the entity to be transferred between contexts, you first have to detach the entity from the DataContext, by removing these event registrations, and then later on reattach to the new context by using the DataContext.Attach() method.
Here's a good example.
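In the same spirit, a hedged sketch of what such a detach can look like in a partial class; the field names mimic the usual generated code and are assumptions about your model:
public partial class MyClass
{
    // Replacing the EntitySet/EntityRef instances also drops the change
    // notification registrations that tie this entity to its old context.
    public void Detach()
    {
        _RelatedEntities = new EntitySet<RelatedEntity>(); // hypothetical child set
        _Parent = default(EntityRef<Parent>);              // hypothetical parent ref
    }
}

// Later, on a fresh context:
using (var dc = new MyDataContext())
{
    obj.Detach();
    dc.GetTable<MyClass>().Attach(obj, true); // as modified; needs a version member
    dc.SubmitChanges();
}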
When you retrieve the data in the first place, turn off object tracking on the context that does the retrieval. This will prevent the object state from being tracked on the original context. Then, when it's time to save the values, attach to the new context, refresh to set the original values on the object from the database, and then submit changes. The following worked for me when I tested it.
MyClass obj = null;
using (DataContext context = new DataContext())
{
    context.ObjectTrackingEnabled = false;
    obj = (from p in context.MyClasses
           where p.ID == someId
           select p).FirstOrDefault();
}

obj.Name += "test";

using (DataContext context2 = new DataContext())
{
    context2.MyClasses.Attach(obj);
    context2.Refresh(System.Data.Linq.RefreshMode.KeepCurrentValues, obj);
    context2.SubmitChanges();
}
I'm new to the Entity Framework, and am just starting to play around with it in my free time. One of the major questions I have is regarding how to handle ObjectContexts.
Which is generally preferred/recommended of these:
This
public class DataAccess{
    MyDbContext m_Context;

    public DataAccess(){
        m_Context = new MyDbContext();
    }

    public IEnumerable<SomeItem> GetSomeItems(){
        return m_Context.SomeItems;
    }

    public void DeleteSomeItem(SomeItem item){
        m_Context.DeleteObject(item);
        m_Context.SaveChanges();
    }
}
Or this?
public class DataAccess{
    public DataAccess(){ }

    public IEnumerable<SomeItem> GetSomeItems(){
        MyDbContext context = new MyDbContext();
        return context.SomeItems;
    }

    public void DeleteSomeItem(SomeItem item){
        MyDbContext context = new MyDbContext();
        context.DeleteObject(item);
        context.SaveChanges();
    }
}
The ObjectContext is meant to be the "Unit of Work".
Essentially what this means is that for each "Operation" (eg: each web-page request) there should be a new ObjectContext instance. Within that operation, the same ObjectContext should be re-used.
This makes sense when you think about it, as transactions and change submission are all tied to the ObjectContext instance.
If you're not writing a web-app, and are instead writing a WPF or windows forms application, it gets a bit more complex, as you don't have the tight "request" scope that a web-page-load gives you, but you get the idea.
PS: In either of your examples, the lifetime of the ObjectContext will be either global or transient. In both situations, it should NOT live inside the DataAccess class; it should be passed in as a dependency.
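A sketch of what that dependency-passing looks like with the classes from the question (ToList() added so the query runs while the context is alive):
using System.Collections.Generic;
using System.Linq;

public class DataAccess
{
    private readonly MyDbContext m_Context;

    public DataAccess(MyDbContext context)
    {
        m_Context = context; // injected; the caller owns its lifetime
    }

    public IEnumerable<SomeItem> GetSomeItems()
    {
        return m_Context.SomeItems.ToList(); // materialize before the context dies
    }

    public void DeleteSomeItem(SomeItem item)
    {
        m_Context.DeleteObject(item);
        m_Context.SaveChanges();
    }
}

// Per web request (or per operation):
using (var context = new MyDbContext())
{
    var dataAccess = new DataAccess(context);
    // ...use dataAccess for the whole operation...
}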
If you keep the same context for a long-running process that runs lots of queries against it, LINQ to SQL gets VERY slow (about one query a second after some 1000 simple queries). I didn't test against LINQ to Entities, but I guess it has the same problem. Renewing the context on a regular basis fixes this issue and doesn't cost much.
What happens is that the context keeps track of every entity you fetch through it, so if it's never reset, it gets really fat... The other issue is the memory it takes.
So it mainly depends on the way your application works, and whether you new up a DataAccess instance regularly or keep the same one all along.
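For what it's worth, a rough sketch of that renewal idea for a long-running job (the table name and batch size are invented for illustration):
// requires: using System.Linq;
var context = new MyDataContext();
int processed = 0;
foreach (var id in idsToProcess)
{
    var item = context.Items.Single(i => i.ID == id); // hypothetical table
    // ...work with item...

    if (++processed % 1000 == 0)
    {
        context.Dispose();
        context = new MyDataContext(); // fresh context, empty change tracker
    }
}
context.Dispose();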
Hope this helps.
Stéphane
Just a quick note: the two code pieces share roughly the same underlying problem. This is something I have been looking at, because you don't want to keep opening and closing the context (see the second example), while at the same time you are not sure you can trust Microsoft to properly dispose of the context for you.
One of the things I did was create a common base class that lazy-loads the context and implements a finalizer in the base class to dispose of things. This works well for something like the MVC framework, but unfortunately leads to the problem of having to pass the context around to the various layers so the business objects can share the call.
In the end I went with Ninject to inject this dependency into each layer and had it track usage.
While I'm not in favour of always creating what can be complicated objects each time I need them, I too have found that the DataContexts in LINQ to SQL and the ObjectContexts in EF are best created when required.
Both perform a lot of static initialisation based on the model you run them against, which is cached for subsequent calls, so you'll find that the initial startup of a context takes longer than all subsequent instantiations.
The biggest hurdle you face with this is the fact that once you obtain an entity from the context, you can't simply pass it back into another to perform update operations, or add related entities back in. In EF you can reattach an entity back to a new context. In L2S this process is nigh-on impossible.
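For completeness, reattaching in EF with the DbContext API looks roughly like this (the names are borrowed from the question above and are assumptions about your model):
using (var context = new MyDbContext())
{
    context.SomeItems.Attach(item);                   // attached as Unchanged
    context.Entry(item).State = EntityState.Modified; // mark the row for UPDATE
    context.SaveChanges();
}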