I'm using Entity Framework 4.0. I do a lot of read operations on the database (data analysis), and no data will be saved. Currently, despite lazy loading, the number of I/O operations against the database server slows the application down considerably. I decided to load most of the small tables into memory (.ToList()) and then run the calculations on them. Is there a way to have the context read a table's data automatically only on the first reference to it, and then keep it unchanged for the lifetime of the context? The idea is that further references to the table would not query the database, only the application's memory.
Now, I use this code:
public class cDBReader
{
private List<RISK_T_MEMBERS> fMembers;
public List<RISK_T_MEMBERS> Members
{
get
{
if (fMembers == null)
using (RiskEntities context = new RiskEntities(TConfiguration.connectionString))
{
context.RISK_T_MEMBERS.MergeOption = System.Data.Objects.MergeOption.NoTracking;
fMembers = context.RISK_T_MEMBERS.ToList();
}
return fMembers;
}
set { fMembers = value; }
}
}
Could you please add some more information? What is wrong with the current implementation?
It seems like what you need is some kind of caching. Here is an example implementation on caching.
An even simpler (less overkill, in my opinion) way to implement caching would be this.
Last but not least is an implementation of your own (like the one you already made). Maybe the singleton pattern could help you here? That is, a singleton object that holds all the data you need globally and exposes getters that take the currently active context; if the data is null, the application uses the active context to load it. Dropping and recreating the context could also speed up the application after a while (which is why I think you should pass the active context to the getData functions).
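To make the "load once, then serve from memory" idea concrete, here is a minimal sketch built on Lazy<T>. The names TableCache and Member are hypothetical and do not come from the question, and the loader delegate stands in for whatever query you would run against the context; treat it as an illustration under those assumptions, not a drop-in implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an entity such as RISK_T_MEMBERS.
public sealed class Member
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Loads a table at most once, on first access, and serves later reads from memory.
public sealed class TableCache<T>
{
    private readonly Func<IList<T>> _load;
    private volatile Lazy<IList<T>> _items;

    public TableCache(Func<IList<T>> load)
    {
        if (load == null) throw new ArgumentNullException("load");
        _load = load;
        _items = NewLazy();
    }

    // The first access runs the loader; Lazy<T> ensures it runs only once
    // even if several threads ask at the same time.
    public IList<T> Items
    {
        get { return _items.Value; }
    }

    // Drops the cached copy so that the next access reloads it.
    public void Invalidate()
    {
        _items = NewLazy();
    }

    private Lazy<IList<T>> NewLazy()
    {
        return new Lazy<IList<T>>(_load, true); // true = thread-safe initialization
    }
}
```

In the question's code, the loader would presumably be the body of the existing getter: open a RiskEntities context, set MergeOption.NoTracking, and return context.RISK_T_MEMBERS.ToList().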
You could try eager loading instead of lazy loading first, to get full control over what EF loads, and just manually preload what you need once. Loaded data stays alive for the lifetime of one context. If you need it for longer and are sure that the database is read-only, you could store the preloaded data somewhere and attach it when a new context is created.
Related
I've got an Application which consists of 2 parts at the moment
A Viewer that receives data from a database using EF
A Service that manipulates data from the database at runtime.
The logic behind the scenes includes some projects such as repositories - data access is realized with a unit of work. The Viewer itself is a WPF-Form with an underlying ViewModel.
The ViewModel contains an ObservableCollection which is the datasource of my Viewer.
Now the question is - How am I able to retrieve the database-data every few minutes? I'm aware of the following two problems:
It's not the latest data my Repository is "loading" - does EF "smart" stuff and retrieves data from the local cache? If so, how can I force EF to load the data from the database?
Resetting the whole ObservableCollection, or adding/removing entities from another thread or BackgroundWorker (with invocation), is not possible. How am I supposed to solve this?
I will add some of my code if needed but at the moment I don't think that this would help at all.
Edit:
public IEnumerable<Request> GetAllUnResolvedRequests() {
return AccessContext.Requests.Where(o => !o.IsResolved);
}
This piece of code won't get the latest data: I edited some rows manually (set IsResolved to true), but this method still returns them.
Edit2:
Edit3:
var requests = AccessContext.Requests.Where(o => o.Date >= fromDate && o.Date <= toDate).ToList();
foreach (var request in requests) {
AccessContext.Entry(request).Reload();
}
return requests;
Final Question:
The code above "solves" the problem - but in my opinion it's not clean. Is there another way?
When you access an entity in a database, the entity is cached (and tracked, so that EF can detect the changes your application makes, unless you specify AsNoTracking).
This has some drawbacks (for example, performance problems as the cache grows, or seeing a stale version of an entity, which is your case).
For these reasons, when using EF you should follow the Unit of Work pattern (i.e. create a new context for every unit of work).
You can have a look at this Microsoft article to understand how to implement the Unit of Work pattern.
http://www.asp.net/mvc/overview/older-versions/getting-started-with-ef-5-using-mvc-4/implementing-the-repository-and-unit-of-work-patterns-in-an-asp-net-mvc-application
In your case, using Reload is not a good choice because it does not scale: every Reload call issues a separate query to the database. If you just need to return the desired entities, the best way is to create a new context.
public IEnumerable<Request> GetAllUnResolvedRequests()
{
return GetNewContext().Requests.Where(o => !o.IsResolved).ToList();
}
Here is what you can do.
You can define a Task (running on the ThreadPool) that periodically checks the database (bear in mind that periodically making EF reload data has its own cost).
You can also define a SQL dependency on your query, so that the main thread is notified when the data changes.
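As a rough sketch of the periodic-task option, the helper below runs a refresh delegate on the thread pool until it is cancelled. PeriodicRefresher and its parameters are hypothetical names; what the delegate does (for example, querying with a fresh context and handing the result to the UI thread) is an assumption left to the caller.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Runs a refresh delegate repeatedly on the thread pool until cancelled.
public static class PeriodicRefresher
{
    public static Task RunAsync(Func<Task> refresh, TimeSpan interval, CancellationToken token)
    {
        return Task.Run(async () =>
        {
            while (!token.IsCancellationRequested)
            {
                // An exception thrown here faults the returned task and stops the loop.
                await refresh();
                try
                {
                    await Task.Delay(interval, token);
                }
                catch (OperationCanceledException)
                {
                    break; // cancellation during the wait ends the loop quietly
                }
            }
        });
    }
}
```

In a WPF application the delegate would still have to marshal its result back to the UI thread (for example through the Dispatcher) before touching an ObservableCollection.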
We have an ASP.NET project with Entity Framework and SQL Azure.
A big part of our data only needs to be updated a few times a day, other data is very volatile.
The data that barely changes we cache in memory at startup, detach from the context, and then use mainly for reading, drastically lowering the number of database requests we have to make.
The volatile data is requested every time through one DbContext per HTTP request.
When we update the cached data, we send a message to all instances to fetch a fresh version of all the data from the SQL server.
So far, so good.
Until we introduced a bug that linked one of these 'cached' objects to the 'volatile' data, and did a SaveChanges.
Well, that was quite a mess.
The whole data tree was added again and again by every update, corrupting the whole database with a whole lot of duplicated data.
As a complete hack I added a completely arbitrary column with a UniqueConstraint and some gibberish data on one of the root tables; hopefully failing the SaveChanges() next time we introduce such a bug because it will violate the Unique Constraint.
But it is of course hacky, and I'm still pretty scared ;P
Are there any better ways to prevent whole trees of cached objects from ending up in the database?
More information
Project is ASP.NET MVC
I cache this data because it is mainly read-only, and this saves tons of extra database calls per HTTP request.
This is a high-traffic website with a lot of personally customized views. Having the POCO data in memory works really well for what I want, except for the problem I mentioned.
It is a bit more complicated, but a simplified version is that I cache the objects in a singleton, e.g.:
EntityCache.Instance.LolCats = new DbContext().LolCats.AsNoTracking().ToList();
This cache I dependency-inject into my controllers.
You can solve it like this:
1) Create an interface like this:
public interface IIsReadOnly
{
bool IsReadOnly { get; set; }
}
2) Implement this interface in all of the entities that can be cached. When you read and cache them, set the IsReadOnly property to true. This flag will be used when SaveChanges is invoked. Remember to decorate this property with the [NotMapped] attribute, or use any other means to make EF ignore it.
public class ACacheableEntitySample
: IIsReadOnly
{
[NotMapped]
public bool IsReadOnly { get; set; }
// define the "regular" entity properties
}
NOTE: you can include the property directly in the class definition (if using Code First), or use partial classes (for Db First, Model First, or Code First).
NOTE: alternatively you can make EF ignore the IsReadOnly property using the Fluent API, or even better a custom convention (EF 6+)
3) Override your inherited DbContext.SaveChanges method. In the overridden method, review all the entries with pending changes, and if they are read-only, change their state to Unchanged:
// inside the SaveChanges override, before calling base.SaveChanges()
foreach (var entry in ChangeTracker.Entries<IIsReadOnly>().ToList()) // only cacheable entities
{
if (entry.Entity.IsReadOnly) // and it was marked as read-only when caching
{
// reset the state so that the entity is neither inserted nor updated
entry.State = EntityState.Unchanged;
}
}
NOTE: This is sample code to explain what you need to do. In your final implementation you can do it with a simple LINQ statement that gets all the IIsReadOnly entities which have IsReadOnly set to true, and sets their state to Unchanged.
You can use the IIsReadOnly entities in another DbContext and manipulate them in the usual way. For example, if you get one of these entities, update it, and call SaveChanges, the changes will be saved because IsReadOnly will have its default value of false. But you'll easily avoid saving changes to cached data accidentally, simply by setting the IsReadOnly property to true when caching.
Original answer deleted because it was a waste of time.
Your post and subsequent comments are a perfect example of the XY Problem.
You say:
I really need a solution for the problem, not for the architecture
What if the architecture is the problem?
The problem you presented
A caching solution you implemented that violates at least a half dozen best practices has (surprise!) blown up in your face. You've managed to stop it from blowing up again via a spectacular (not in a good way) hack but you want to know how to do it in a way that won't require such a spectacular hack.
The problem you had
You needed to cache some data because it was getting too expensive to hit the database for every request.
The answers that were offered
Use foreign keys instead of navigation properties
This is a perfectly valid answer and, surprise, a best practice. Navigation properties can change any time you regenerate the code in your Entity Data Model and are often ambiguous. With a bit of effort you could have used this and never had to worry about EF's handling of object relationships again.
Cache models instead of Entity objects
Another valid answer, and one that requires the least amount of actual work. MVC applications usually require some redundancy between viewmodels and entity objects and if you ever write a proper multi-tier application you'll practically drown in redundant objects. And nobody will accidentally add these objects to a DbContext ever again - because they can't.
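A minimal sketch of what caching models instead of entities could look like is shown below. LolCat is borrowed from the question's example; the LolCatModel and LolCatMapper names and the Id/Name properties are assumptions made for illustration. The point is only that the cached type is a plain immutable class that is not part of the EF model.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, as EF would materialize it.
public class LolCat
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Immutable cache model: it is not mapped by EF, so it cannot end up
// in a SaveChanges call by being linked to a tracked entity.
public sealed class LolCatModel
{
    public LolCatModel(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public int Id { get; private set; }
    public string Name { get; private set; }
}

public static class LolCatMapper
{
    // Copies the values out of the entities; the entities themselves are not kept.
    public static IReadOnlyList<LolCatModel> ToModels(IEnumerable<LolCat> entities)
    {
        return entities.Select(e => new LolCatModel(e.Id, e.Name)).ToList();
    }
}
```

The cache would then hold the result of ToModels rather than the entity list, and volatile entities would reference cached data by Id (a foreign key value) instead of by object reference.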
Criticism
You have offered up very little useful information. From what I can tell your approach from the get-go was wrong.
Firstly, dumping whole tables into memory at App_Start is at best a temporary solution. If the table was too big to hit on every request, it's too big to hit on App_Start. What happens if something important breaks while people are using your application and you need to deploy a bug fix ASAP? What happens when your tables get really big and you start getting timeouts from EF while trying to dump them into memory? What happens if 95% of your users only really ever need 10% of that big table you've dumped into memory? Is the memory on your web/cache server going to be enough to accommodate the increasing size of your tables? For how long?
Secondly, no Entity object should remain anywhere after its originating DbContext is disposed. Entity objects behave in a convenient way while their DbContext is in scope and become troublesome POCOs when it's out of scope. I say troublesome because the 'magic' DbContext does with change tracking tends to fool people unfamiliar with the inner workings of EF into thinking that an Entity object is directly connected to a table row in the database. The problem you had illustrates this point perfectly.
Thirdly, it looks like you need to delete and re-dump a whole table to memory, even if you only update a single column in a single row. That's immensely wasteful to both the memory and CPU on your web server, and to your Azure SQL instance(s). What happens when a small bit of data comes in wrong and needs to be updated in a hurry? What if one of your nightly update jobs fails but you need fresh data in the morning?
You may not worry about any of this stuff now but your solution blowing up in your face should have at the very least raised some red flags. I've had to deal with a lot of caching in projects I've worked on in the past few years and everything I say here comes from experience.
Proposed solution - On-demand caching
If you've put a little effort into organizing your code, all of your CRUD operations on the database should be in specialized helper classes which I call repositories. Your controller calls its specialized repository (StuffController - StuffRepository), receives a model and binds that model to a view, kinda like this:
public class StuffController : Controller
{
private MyDbContext _db;
private StuffRepository _repo;
public StuffController()
{
_db = new MyDbContext();
_repo = new StuffRepository(_db);
}
// ...
public ActionResult Details(int id)
{
var model = _repo.ReadDetails(id);
// ...
return View(model);
}
protected override void Dispose(bool disposing)
{
_db.Dispose();
base.Dispose(disposing);
}
}
What on-demand caching would do is wrap that call to the repository in such a way that if the result of that method was already in the cache and it was not stale, it would return it from the cache. Otherwise it would hit the database.
Here's a simplified (and probably nonfunctional) example of a CacheWrapper class so you can understand what it does, using HttpRuntime.Cache:
public static class CacheWrapper
{
private static List<string> _keys = new List<string>();
public static List<string> Keys
{
get { lock(_keys) { return _keys.ToList(); } }
}
public static T Fetch<T>(string key, Func<T> dlgt, bool refresh = false) where T : class
{
var result = HttpRuntime.Cache.Get(key) as T;
if(result != null && !refresh) return result;
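lock(HttpRuntime.Cache)
{
// another thread may have filled the cache while we waited for the lock
result = HttpRuntime.Cache.Get(key) as T;
if(result != null && !refresh) return result;
lock(_keys)
{
if(!_keys.Contains(key)) _keys.Add(key); // avoid duplicate keys on refresh
}
result = dlgt();
// Insert overwrites an existing entry; Add would keep the stale one on refresh
HttpRuntime.Cache.Insert(key, result, /* some other params */);
}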
return result;
}
}
And the new way to call things from the controller:
public ActionResult Details(int id)
{
var model = CacheWrapper.Fetch("StuffDetails_" + id, () => _repo.ReadDetails(id));
// ...
return View(model);
}
A slightly more complex version of this is in production on a public web application as we speak and working quite well.
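For readers who want to try the on-demand idea outside of ASP.NET, here is a self-contained sketch that uses a ConcurrentDictionary with a simple time-to-live instead of HttpRuntime.Cache. OnDemandCache, its ttl parameter, and the injectable clock are all assumptions made for this example and are not the production code mentioned above.

```csharp
using System;
using System.Collections.Concurrent;

// Caches the result of a loader per key and reloads it once it is stale.
public sealed class OnDemandCache
{
    private sealed class Entry
    {
        public object Value;
        public DateTime ExpiresUtc;
    }

    private readonly ConcurrentDictionary<string, Entry> _entries =
        new ConcurrentDictionary<string, Entry>();
    private readonly TimeSpan _ttl;
    private readonly Func<DateTime> _utcNow;

    // utcNow is injectable so that expiry can be tested without waiting.
    public OnDemandCache(TimeSpan ttl, Func<DateTime> utcNow = null)
    {
        _ttl = ttl;
        _utcNow = utcNow ?? (() => DateTime.UtcNow);
    }

    public T Fetch<T>(string key, Func<T> load, bool refresh = false) where T : class
    {
        Entry entry;
        if (!refresh && _entries.TryGetValue(key, out entry) && entry.ExpiresUtc > _utcNow())
        {
            var cached = entry.Value as T;
            if (cached != null) return cached; // fresh hit: no call to the loader
        }

        // Miss, stale entry, or forced refresh: hit the underlying source.
        // Two threads may both load on a simultaneous miss; the last write wins.
        var value = load();
        _entries[key] = new Entry { Value = value, ExpiresUtc = _utcNow() + _ttl };
        return value;
    }
}
```

A controller would call it much like the CacheWrapper above, for example cache.Fetch("StuffDetails_" + id, () => _repo.ReadDetails(id)). Unlike HttpRuntime.Cache, this sketch never evicts entries under memory pressure, so it only suits small, bounded key sets.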
What is the best way to refresh data in Entity Framework 5? I've got a WPF application showing statistics from a database where data is changing all the time. Every 10 seconds the application updates the result, but the default behaviour of EF seems to be to cache the previous results. I would thus like a way to invalidate the previous results so that a new set of data can be loaded.
The context of interest is defined in the following way:
public partial class MyEntities: DbContext
{
...
public DbSet<Stat> Stats { get; set; }
...
}
After some reading I was able to find a few approaches, but I have no idea of how efficient these ways are and if they come with downsides.
Create a new instance of the entities object
using (var db = new MyEntities())
{
var stats = from s in db.Stats ...
}
This works but feels inefficient because there are many other places where data is retrieved, and I don't want to reopen a new connection every time I need some data. Wouldn't it be more efficient to keep the connection open and do it another way?
Call refresh on the ObjectContext
var stats = from s in db.Stats ...
ObjectContext.Refresh(RefreshMode.StoreWins, stats );
This also assumes I'm extracting ObjectContext from the dbContext in this way:
private MyEntities db = null;
private ObjectContext ObjectContext
{
get
{
return ((IObjectContextAdapter)db).ObjectContext;
}
}
This is the solution I'm using as it is now. It seems simple. But I read somewhere that ObjectContext nowadays isn't directly accessible in DbContext because the EF team doesn't think that anyone would need it, and that you can do everything you need directly in DbContext. This makes me think that maybe this is not the best way to do it. Is that right?
I know there is a Reload method on dbContext.Entry, but since I'm not reloading a single entity but rather retrieving a list of entities, I don't really know if this way will work. If I get 5 stat objects in the first query, save them in a list and do a reload on them when it's time to update, I might miss others that have been added in the database. Or have I completely misunderstood the Reload method? Can I do a reload on a DbSet specified in MyEntities?
There are a number of questions above but what I mainly want to know is what is the best practice in EF5 for asking the same query to the database over and over again? It might very well be something that I haven't discovered yet...
Actually, even if it seems counterintuitive, the first option is the correct one, see this
DbContexts are designed to have short lifespans, so their instantiation cost is quite low compared to the cost of reloading everything. This is mostly due to things like caching and their data-loading design in general.
That's also why EF works so "naturally" well with ASP.NET MVC, since the DbContext is instantiated at each request.
That doesn't mean you have to create a DbContext all over the place, of course. In your case, using one DbContext per update operation (the one happening every 10 seconds) seems good enough. If during that operation you needed to delete a particular row, for example, you would pass that DbContext around, not create a new one.
I am trying to write some business layer logic for an ASP.NET app to insert or update an object. I get an object out of the business layer and then pass it back in to be saved to the db. The data context is contained in the business layer, which is what I think is causing the exception.
The exception is "An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported."
I'm sure I am missing some small setting but I'm just not sure what.
This is the code that does the insertion and updating:
public static void Save(Order order)
{
using (TicketInformationDataContext db = new TicketInformationDataContext())
{
if (order.OrderID <= 0)
db.Orders.InsertOnSubmit(order);
else
{
db.ObjectTrackingEnabled = true;
ITable table = db.GetTable(typeof(Order));
table.Attach(order, true);
db.Orders.Attach(order, true);
}
db.SubmitChanges();
}
}
I believe your problem is that you are creating separate contexts for loading the Order and later for saving it.
You should design your repositories to work against a context you can pass during construction (or preferably inject through IoC).
That way, both operations would work against the same context.
Remember that entities are bound to their context, and attempting to mix them will cause all sorts of problems, especially with lazy initialization or associated entities.
Please see the accepted response for this similar question, which discusses repository design and the unit of work design pattern.
I have a Linq object, and I want to make changes to it and save it, like so:
public void DoSomething(MyClass obj) {
obj.MyProperty = "Changed!";
MyDataContext dc = new MyDataContext();
dc.GetTable<MyClass>().Attach(obj, true); // throws exception
dc.SubmitChanges();
}
The exception is:
System.InvalidOperationException: An entity can only be attached as modified without original state if it declares a version member or does not have an update check policy.
It looks like I have a few choices:
put a version member on every one of my Linq classes & tables (100+) that I need to use in this way.
find the data context that originally created the object and use that to submit changes.
implement OnLoaded in every class and save a copy of this object that I can pass to Attach() as the baseline object.
To hell with concurrency checking; load the DB version just before attaching and use that as the baseline object (NOT!!!)
Option (2) seems the most elegant method, particularly if I can find a way of storing a reference to the data context when the object is created. But - how?
Any other ideas?
EDIT
I tried to follow Jason Punyon's advice and create a concurrency field on one table as a test case. I set all the right properties (Time Stamp = true etc.) on the field in the dbml file, and I now have a concurrency field... and a different error:
System.NotSupportedException: An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported.
So what the heck am I supposed to attach, then, if not an existing entity? If I wanted a new record, I would do an InsertOnSubmit()! So how are you supposed to use Attach()?
Edit - FULL DISCLOSURE
OK, I can see it's time for full disclosure of why all the standard patterns aren't working for me.
I have been trying to be clever and make my interfaces much cleaner by hiding the DataContext from the "consumer" developers. This I have done by creating a base class
public class LinqedTable<T> where T : LinqedTable<T> {
...
}
... and every single one of my tables has the "other half" of its generated version declared like so:
public partial class MyClass : LinqedTable<MyClass> {
}
Now LinqedTable has a bunch of utility methods, most particularly things like:
public static T Get(long ID) {
// code to load the record with the given ID
// so you can write things like:
// MyClass obj = MyClass.Get(myID);
// instead of:
// MyClass obj = myDataContext.GetTable<MyClass>().Where(o => o.ID == myID).SingleOrDefault();
}
public static Table<T> GetTable() {
// so you can write queries like:
// var q = MyClass.GetTable();
// instead of:
// var q = myDataContext.GetTable<MyClass>();
}
Of course, as you can imagine, this means that LinqedTable must somehow be able to have access to a DataContext. Up until recently I was achieving this by caching the DataContext in a static context. Yes, "up until recently", because that "recently" is when I discovered that you're not really supposed to hang on to a DataContext for longer than a unit of work, otherwise all sorts of gremlins start coming out of the woodwork. Lesson learned.
So now I know that I can't hang on to that data context for too long... which is why I started experimenting with creating a DataContext on demand, cached only on the current LinqedTable instance. This then led to the problem where the newly created DataContext wants nothing to do with my object, because it "knows" that it's being unfaithful to the DataContext that created it.
Is there any way of pushing the DataContext info onto the LinqedTable at the time of creation or loading?
This really is a poser. I definitely do not want to compromise on all these convenience functions I've put into the LinqedTable base class, and I need to be able to let go of the DataContext when necessary and hang on to it while it's still needed.
Any other ideas?
Updating with LINQ to SQL is, um, interesting.
If the data context is gone (which in most situations, it should be), then you will need to get a new data context, and run a query to retrieve the object you want to update. It's an absolute rule in LINQ to SQL that you must retrieve an object to delete it, and it's just about as iron-clad that you should retrieve an object to update it as well. There are workarounds, but they are ugly and generally have lots more ways to get you in trouble. So just go get the record again and be done with it.
Once you have the re-fetched object, then update it with the content of your existing object that has the changes. Then do a SubmitChanges() on the new data context. That's it! LINQ to SQL will generate a fairly heavy-handed version of optimistic concurrency by comparing every value in the record to the original (in the re-fetched) record. If any value changed while you had the data, LINQ to SQL will throw a concurrency exception. (So you don't need to go altering all your tables for versioning or timestamps.)
If you have any questions about the generated update statements, you'll have to break out SQL Profiler and watch the updates go to the database. Which is actually a good idea, until you get confidence in the generated SQL.
One last note on transactions - the data context will generate a transaction for each SubmitChanges() call, if there is no ambient transaction. If you have several items to update and want to run them as one transaction, make sure you use the same data context for all of them, and wait to call SubmitChanges() until you've updated all the object contents.
If that approach to transactions isn't feasible, then look up the TransactionScope object. It will be your friend.
I think 2 is not the best option. It's sounding like you're going to create a single DataContext and keep it alive for the entire lifetime of your program which is a bad idea. DataContexts are lightweight objects meant to be spun up when you need them. Trying to keep the references around is also probably going to tightly couple areas of your program you'd rather keep separate.
Running a hundred ALTER TABLE statements one time, regenerating the context and keeping the architecture simple and decoupled is the elegant answer...
find the data context that originally created the object and use that to submit changes
Where did your datacontext go? Why is it so hard to find? You're only using one at any given time right?
So what the heck am I supposed to attach, then, if not an existing entity? If I wanted a new record, I would do an InsertOnSubmit()! So how are you supposed to use Attach()?
You're supposed to attach an instance that represents an existing record... but was not loaded by another datacontext - can't have two contexts tracking record state on the same instance. If you produce a new instance (ie. clone) you'll be good to go.
You might want to check out this article and its concurrency patterns for update and delete section.
The "An entity can only be attached as modified without original state if it declares a version member" error when attaching an entity that has a timestamp member will (should) only occur if the entity has not travelled 'over the wire' (read: been serialized and deserialized again). If you're testing with a local test app that is not using WCF or something else that will result in the entities being serialized and deserialized, then they will still keep references to the original datacontext through entitysets/entityrefs (associations/nav. properties).
If this is the case, you can work around it by serializing and deserializing it locally before calling the datacontext's .Attach method. E.g.:
internal static T CloneEntity<T>(T originalEntity)
{
Type entityType = typeof(T);
DataContractSerializer ser =
new DataContractSerializer(entityType);
using (MemoryStream ms = new MemoryStream())
{
ser.WriteObject(ms, originalEntity);
ms.Position = 0;
return (T)ser.ReadObject(ms);
}
}
Alternatively you can detach it by setting all entitysets/entityrefs to null, but that is more error prone so although a bit more expensive I just use the DataContractSerializer method above whenever I want to simulate n-tier behavior locally...
(related thread: http://social.msdn.microsoft.com/Forums/en-US/linqtosql/thread/eeeee9ae-fafb-4627-aa2e-e30570f637ba )
You can reattach to a new DataContext. The only thing that prevents you from doing so under normal circumstances is the property changed event registrations that occur within the EntitySet<T> and EntityRef<T> classes. To allow the entity to be transferred between contexts, you first have to detach the entity from the DataContext, by removing these event registrations, and then later on reattach to the new context by using the DataContext.Attach() method.
Here's a good example.
When you retrieve the data in the first place, turn off object tracking on the context that does the retrieval. This will prevent the object state from being tracked on the original context. Then, when it's time to save the values, attach to the new context, refresh to set the original values on the object from the database, and then submit changes. The following worked for me when I tested it.
MyClass obj = null;
using (DataContext context = new DataContext())
{
context.ObjectTrackingEnabled = false;
obj = (from p in context.MyClasses
where p.ID == someId
select p).FirstOrDefault();
}
obj.Name += "test";
using (DataContext context2 = new DataContext())
{
context2.MyClasses.Attach(obj);
context2.Refresh(System.Data.Linq.RefreshMode.KeepCurrentValues, obj);
context2.SubmitChanges();
}