In my ASP.NET MVC application I need to implement persistence of data. I've choose Entity Framework for its ability to create classes, database tables and queries from entity model so that I don't have to write SQL table creation or Linq to SQL queries by hand. So simplicity is my goal.
My approach was to create model and than a custom HttpModule that gets called at the and of each request and that just called SaveChanges() on the context. That made my life very hard - entity framework kept throwing very strange exception. Sometimes it worked - no exception but sometimes it did not. First I was trying to fix the problems one by one but when I got another one I realized that my general approach is probably wrong.
So that is the general practice to implement for implementing persistence in ASP.NET MVC application ? Do I just call saveChanges after each change ? Isn't that little inefficient ? And I don't know how to do that with Services patter anyway (services work with entities so I'd have to pass context instance to them so that they could save changes if they make some).
Some links to study materials or tutorials are also appreciated.
Note: this question asks for programing practice. I ask those who will consider it vague to bear in mind that it is still solving my very particular problem and right technique will save me a lot of technical problems before voting to close.
You just need to make sure SaveChanges gets called before your request finishes. At the bottom of a controller action is an ideal place. My controller actions typically look like this:
public ActionResult SomeAction(...)
{
_repository.DoSomething();
...
_repository.DoSomethingElse();
...
_repository.SaveChanges();
return View(...);
}
This has the added benefit that if an exception gets thrown, then SaveChanges will not get called. And you can either handle the exception in the action or in Controller.OnException.
It's going to be no more or less efficient than calling a stored procedure that many number of times (with respect to number of connections that need to be made).
Nominally, you would make all your changes to the object set, then SaveChanges to commit all those changes.
So instead of doing this:
mySet.Objects.Add(someObject);
mySet.SaveChanges();
mySet.OtherObjects.Add(someOtherObject);
mySet.SaveChanges();
You just need to do:
mySet.Objects.Add(someObject);
mySet.OtherObjects.Add(someOtherObject);
mySet.SaveChanges();
// Commits Both Changes
Usually your data access is wrapped by an object implementing the repsitory pattern. You then invoke a Save() method on the repository.
Something like
var customer = customerRepository.Get(id);
customer.FirstName = firstName;
customer.LastName = lastName;
customerRepository.SaveChanges();
The repository can then be wrapped by a service layer to provide view model objects or DTO's
Isn't that little inefficient ?
Don't prematurely optimise. When you have a performance issue, analyse the performance, identify a cause and then optimise. Repeat.
Update
A repository wraps data access, usually a single entity. A service layer wraps business logic and can access multiply entities through multiple repositories. It usually deals with 'slim' models or DTO's.
An example could be something like getting a list of invoices for a customer
public Customer GetCustomerWithInvoices(int id) {
var customer = customerRepository.Get(id);
var invoiceList = invoiceRepository.GetAllInvoicesFor(id);
return new Customer {
Customer = customer,
Invoices = invoiceList
};
}
Related
Please see the code below:
public Enquiry GetByID(Guid personID)
{
using (IUnitOfWork<ISession> unitOfWork = UnitOfWorkFactory.Create())
{
IRepository repository = RepostioryFactory.Create(unitOfWork);
var Person = repository.GetById(personID);
return Person;
}
}
It is contained in an application service layer. Person is passed back to the Controller and loaded into a view. The view then errors because it cannot load Person.Collection (a list).
I believe this is because the collection is loaded using lazy loading and the unit of work and NHibernate session is closed once the view is reached. Must I use eager loading in this situation or am I misunderstanding something?
IMHO lazy loading is evil!
The idea behind a repository is to return an aggregate. That aggregate should contain all the relevant data that constitute the aggregate. It is never loaded in bits. An aggregate should, therefore, always be eagerly fetched.
If you remove UoW/ORM from the equation lazy loading just isn't an option.
You should try not to query your domain. If you have a situation where a single aggregate contains all the data you need and that data has been exposed then that would be OK.
However, I would recommend you use a read model. A simple query layer. Give that a try and you may just be surprised :)
I realized the session is ending before the method finishes (its wrapped in a using block), which is before your view code runs. So yes, you do need to eager load the items in the collection property in your Enquiry type you get back from the NHibernate session.
A better way is to setup the unit of work pattern so it wraps around the entire request in the pipeline. For example, if you have a Global.asax file, it has two methods called Application_BeginRequest and Application_EndRequest.
The Application_BeginRequest method would create a new NHibernate session which can be retrieved by your controllers.
The Application_EndRequest method would simply flush your session, saving any data changes to the underlying database.
I refer you to the following StackOverflow question for incorporating NHibernate sessions with the Global.asax component: NHibernate Session in global.asax Application_BeginRequest
Introducing a View Model layer instead of passing the raw entity over to the Controller will solve your problem since mapping to the Person View Model (inside the using clause) will access Person.Collection and trigger the loading.
Alternatively, you could have a whole Read side that doesn't go through the domain, as #EbenRoux suggests.
Must I use eager loading in this situation or am I misunderstanding something?
Well, you want to, don't you? You are in a use case where you know that you want Person.Collection to be available, so why wouldn't you load it right away.
The trick is not to use the same repository implementation that you use when you want the collection to be loaded lazily (or not at all).
Udi Dahan wrote about this a number of times
http://udidahan.com/2007/03/06/better-domain-driven-design-implementation/
http://udidahan.com/2007/04/23/fetching-strategy-design/
Greg Young would caution you that the use of a fetching strategy is an implementation detail, and not part of the contract
http://codebetter.com/gregyoung/2009/01/16/ddd-the-generic-repository/
We have an ASP.NET project with Entity Framework and SQL Azure.
A big part of our data only needs to be updated a few times a day, other data is very volatile.
The data that barely changes we cache in memory at startup, detach from the context and than use it mainly for reading, drastically lowering the amount of database requests we have to do.
The volatile data is requested everytime by a DbContext per Http request.
When we do an update to the cached data, we send a message to all instances to catch a fresh version of all the data from the SQL server.
So far, so good.
Until we introduced a bug that linked one of these 'cached' objects to the 'volatile' data, and did a SaveChanges.
Well, that was quite a mess.
The whole data tree was added again and again by every update, corrupting the whole database with a whole lot of duplicated data.
As a complete hack I added a completely arbitrary column with a UniqueConstraint and some gibberish data on one of the root tables; hopefully failing the SaveChanges() next time we introduce such a bug because it will violate the Unique Constraint.
But it is of course hacky, and I'm still pretty scared ;P
Are there any better ways to prevent whole tree's of cached objects ending up in the database?
More information
Project is ASP.NET MVC
I cache this data, because it is mainly read only, and this saves a tons of extra database calls per http request
This is in a high traffic website, with a lot of personal customized views. Having the POCO data in memory works really good for what I want. Except the problem I mentioned.
It is a bit more complicated, but a simplified version is that I cache the objects by a singleton: so i.e:
EntityCache.Instance.LolCats = new DbContext().LolCats.AsNoTracking().ToList();
This cache I dependency-inject into my controllers.
You can solve it like this:
1) Create an interface like this:
public interface IIsReadOnly
{
bool IsReadOnly { get; set; }
}
2) Implement this interface in all of the entities that can be cached. When you read and cache them, set the IsReadOnly property to true. This flag will be used when SaveChanges is invoked. Remember to decorate this property with the [NotMapped] attribute, or use any other mean to make EF ignore it.
public class ACacheableEntitySample
: IIsReadOnly
{
[NotMapped]
public bool IsReadOnly { get; set; }
// define the "regular" entity properties
}
NOTE: you can include the property directly in the class definition (if using Code First), or use partial classes (for Db First, Model First, or Code First).
NOTE: alternatively you can make EF ignore the IsReadOnly property using the Fluent API, or even better a custom convention (EF 6+)
3) Override your inherited DbContext.SaveChanges method. In the overridden method, review all the entries with pending changes, and if they are read only, change there state to Unchanged:
if (entry is IIsReadOnly) // if it's a cacheable entity
{
if (entry.IsReadOnly) // and it was marked as readonly when caching
{
// change the entry state to unchanged here, so that it's not updated
}
}
NOTE: This is sample code to explain what you need to do. In your final implementation you can do it with a simple LINQ sentence that get all the IIsReadOnly entities, which have the IsReadOnly set to true, and set their state to Unchanged.
You can use the IIsReadOnly entites in another DbContext and manipulate them in the usual way. For example if you get one of these entites, update it, and call SaveChanges, the changes will be saved because IsReadOnly will have the default false value. But you'll easily avoid saving changes of cached data accidentally, simply by setting the IsReadOnly property to true when caching.
Original answer deleted because it was a waste of time.
Your post and proceeding comments are a perfect example of the XY Problem.
You say:
I really need a solution for the problem, not for the architecture
What if the architecture is the problem?
The problem you presented
A caching solution you implemented that violates at least a half dozen best practices has (surprise!) blown up in your face. You've managed to stop it from blowing again up via a spectacular (not in a good way) hack but you want to know how to do it in a way that won't require such a spectacular hack.
The problem you had
You needed to cache some data because it was getting too expensive to hit the database for every request.
The answers that were offered
Use foreign keys instead of navigation properties
This is a perfectly valid answer and, surprise, a best practice. Navigation properties can change any time you regenerate the code in your Entity Data Model and are often ambiguous. With a bit of effort you could have used this and never had to worry about EF's handling of object relationships again.
Cache models instead of Entity objects
Another valid answer, and one that requires the least amount of actual work. MVC applications usually require some redundancy between viewmodels and entity objects and if you ever write a proper multi-tier application you'll practically drown in redundant objects. And nobody will accidentally add these objects to a DbContext ever again - because they can't.
Criticism
You have offered up very little useful information. From what I can tell your approach from the get-go was wrong.
Firstly, dumping whole tables into memory at App_Start is at best a temporary solution. If the table was too big to hit on every request, it's too big to hit on App_Start. What happens if something important breaks while people are using your application and you need to deploy a bug fix ASAP? What happens when your tables get really big and you start getting timeouts from EF while trying to dump them into memory? What happens if 95% of your users only really ever need 10% of that big table you've dumped into memory? Is the memory on your web/cache server going to be enough to accommodate the increasing size of your tables? For how long?
Secondly, no Entity object should remain anywhere after its originating DbContext is disposed. Entity objects behave in a convenient way while their DbContext is in scope and become troublesome POCOs when it's out of scope. I say troublesome because the 'magic' DbContext does with change tracking tends to fool people unfamiliar with the inner workings of EF into thinking that an Entity object is directly connected to a table row in the database. The problem you had illustrates this point perfectly.
Thirdly, it looks like you need to delete and re-dump a whole table to memory, even if you only update a single column in a single row. That's immensely wasteful to both the memory and CPU on your web server, and to your Azure SQL instance(s). What happens when a small bit of data comes in wrong and needs to be updated in a hurry? What if one of your nightly update jobs fails but you need fresh data in the morning?
You may not worry about any of this stuff now but your solution blowing up in your face should have at the very least raised some red flags. I've had to deal with as lot of caching in projects I've worked on in the past few years and everything I say here comes from experience.
Proposed solution - On-demand caching
If you've put a little effort into organizing your code, all of your CRUD operations on the database should be in specialized helper classes which I call repositories. Your controller calls its specialized repository (StuffController - StuffRepository), receives a model and binds that model to a view, kinda like this:
public class StuffController : Controller
{
private MyDbContext _db;
private StuffRepository _repo;
public StuffController()
{
_db = new MyDbContext();
_repo = new StuffRepository(_db);
}
// ...
public ActionResult Details(int id)
{
var model = _repo.ReadDetails(id);
// ...
return View(model);
}
protected override void Dispose(bool disposing)
{
_db.Dispose();
base.Dispose(disposing);
}
}
What on-demand caching would do is wrap that call to the repository in such a way that if the result of that method was already in the cache and it was not stale, it would return it from the cache. Otherwise it would hit the database.
Here's a simplified (and probably nonfunctional) example of a CacheWrapper class so you can understand what it does, using HttpRuntime.Cache:
public static class CacheWrapper
{
private static List<string> _keys = new List<string>();
public static List<string> Keys
{
get { lock(_keys) { return _keys.ToList(); } }
}
public static T Fetch<T>(string key, Func<T> dlgt, bool refresh = false) where T : class
{
var result = HttpRuntime.Cache.Get(key) as T;
if(result != null && !refresh) return result;
lock(HttpRuntime.Cache)
{
lock(_keys)
{
_keys.Add(key);
}
result = dlgt();
HttpRuntime.Cache.Add(key, result, /* some other params */);
}
return result;
}
}
And the new way to call things from the controller:
public ActionResult Details(int id)
{
var model = CacheWrapper.Fetch("StuffDetails_" + id, () => _repo.ReadDetails(id));
// ...
return View(model);
}
A slightly more complex version of this is in production on a public web application as we speak and working quite well.
I am learning DDD development for few days, and i start to like it.
I (think i) understand the principle of DDD, where your main focus is on business objects, where you have aggregates, aggregates roots, repositories just for aggregates roots and so on.
I am trying to create a simple project where i combine DDD development with Code First approach.
My questions are: (I am using asp.net MVC)
DDD Business Objects will be different than Code First objects?
Even if they will probably be the same, for example i can have a Product business object which has all the rules and methods, and i can have a Product code first (POCO) object which will just contain the properties i need to save in database.
If answer to question 1 is "true", then how do i notify the Product POCO object that a property from business object Product has been changed and i have to update it? I am using an "AutoMapper" or something like this?
If the answer is "no", i am completely lost.
Can you show me the most simple (CRUD) example of how can i put those two together?
Thank you
Update I no longer advocate for the use of "domain objects" and instead advocate a use of a messaging-based domain model. See here for an example.
The answer to #1 is it depends. In any enterprise application, you're going to find 2 major categories of stuff in the domain:
Straight CRUD
There's no need for a domain object here because the next state of the object doesn't depend on the previous state of the object. It's all data and no behavior. In this case, it's ok to use the same class (i.e. an EF POCO) everywhere: editing, persisting, displaying.
An example of this is saving a billing address on an order:
public class BillingAddress {
public Guid OrderId;
public string StreetLine1;
// etc.
}
On the other hand, we have...
State Machines
You need to have separate objects for domain behavior and state persistence (and a repository to do the work). The public interface on the domain object should almost always be all void methods and no public getters. An example of this would be order status:
public class Order { // this is the domain object
private Guid _id;
private Status _status;
// note the behavior here - we throw an exception if it's not a valid state transition
public void Cancel() {
if (_status == Status.Shipped)
throw new InvalidOperationException("Can't cancel order after shipping.")
_status = Status.Cancelled;
}
// etc...
}
public class Data.Order { // this is the persistence (EF) class
public Guid Id;
public Status Status;
}
public interface IOrderRepository {
// The implementation of this will:
// 1. Load the EF class if it exists or new it up with the ID if it doesn't
// 2. Map the domain class to the EF class
// 3. Save the EF class to the DbContext.
void Save(Order order);
}
The answer to #2 is that the DbContext will automatically track changes to EF classes.
The answer is No. One of the best things about EF code-first is that it fits nicely with DDD since you have to create your business objects by hand so do use your EF models to be equivalent to DDD entities and value objects. No need to add an extra layer of complexity, I don't think DDD recommends that anywhere.
You could even have your entities to implement an IEntity and you value objects to implement IValue, additionally follow the rest of DDD patterns namely Repositories to do the actual communication to the database. More of these ideas you can find this very good sample application in .NET, even though it doesn't use EF code first, it's still very valuable: http://code.google.com/p/ndddsample/
Recently I've done similar project. I was following this tutorial: link
And I've done it this way: I've created Blank solution, added projects: Domain, Service and WebUI.
Simply said in domain I've put model (for example classes for EF code first, methods etc.)
Service was used for domain to communicate with world (WebUI, MobileUI, other sites etc.) using asp.net webapi
WebUi was actually MVC application (but model was in domain so it was mostly VC)
Hope I've helped
The Pluralsight course: Entity Framework in the Enterprise goes into this exact scenario of Domain Driven Design incorporated with EF Code First.
For number 1, I believe you can do it either way. It's just a matter of style.
For number 2, the instructor in the video goes through a couple ways to account for this. One way is to have a "State" property on every class that is set on the client-side when modifying a value. The DbContext then knows what changes to persist.
Late question on this topic.
Reading Josh Kodroff's answer confirms my thoughts about the implementation of a Repository to, for instance, Entity Framework DAL.
You map the domain object to an EF persistance object and let EF handle it when saving.
When retrieving, you let EF fetch from database and map it to your domain object(aggregate root) and adds it to your collection.
Is this the correct strategy for repository implementation?
Scenario:
Retrieve some entities
Update some properties on those entities
You perform some sort of business logic which dictates that you should no longer have those properties updated; instead you should insert some new entities documenting the results of your business logic.
Insert said new entities
SaveChanges();
Obviously in the above example calling SaveChanges() will not only insert the new entities, but update the properties on the original entities. Before I have managed to rearrange my code in a way where changes to the context (and its entities) would only be made when I knew for sure that I would want all my changes saved, however that’s not always possible. So the question is what is the best way to handle this scenario? I don’t work with the context directly, rather through repositories, if that matters. Is there a simple way to revert the entities to their original values? What is the best practice in this sort of scenario?
Update
Although I disagree with Ladislav that the business logic should be rearranged in such way that the validation always come before any modification to the entities, I agree that the solution should really be persisting wanted changes on a different context. The reason I disagree is because my business transaction is fairly long, and validation or error checking that might happen at the end of the transaction are not always obvious upfront. Imagine a Christmas tree you're decorating with lights from the top down, you've already modified the tree by the time you're working on the lower branches. What happens if one of the lights breaks? You want to roll back all of your changes, but you want to create some ERROR entities. As Ladislav suggested the most straight forward way would be to save the ERROR entities on a different context, allowing the original one (with the modified metaphorical tree) to expire without SaveChanges being ever called.
Now, in my situation I utilize Ninject for dependance injection, injecting one EF context into all of my repositories that are in the scope of the top level service. What this means is that my business layer classes don't really have control of creating new EF contexts. Not only do they not have access to the EF context (remember they work through repositories), but the injection has already occurred higher in the object hierarchy. The only solution I found is to create another class that will utilize Ninject to create a new UOW within it.
//business logic executing against repositories with already injected and shared (unit of work) context
Tree = treeRepository.Get();
Lights = lightsRepsitory.Get();
//update the tree as you're decorating it with lights
if(errors.Count == 0)
{
//no errors, calling SaveChanges() on any one repository will commit the entire UOW as they all share the same injected EF context
repository1.SaveChanges();
}
else
{
//oops one of the lights broke, we need to insert some Error entities
//however if we just add id to the errorRepository and call SaveChanges() the modifications that happened
//to the tree will also be committed.
TreeDecoratorErroHandler.Handle(errors);
}
internal class TreeDecoratorErroHandler
{
//declare repositories
//constructor that takes repository instances
public static void Handle(IList<Error> errors)
{
//create a new Ninject kernel
using(Ninject... = new Ninject...)
{
//this will create an instance that will get injected with repositories sharing a new EF instance
//completely separate from the one outside of this class
TreeDecoratorErroHandler errorHandler = ninjectKernel.Get<TreeDecoratorErroHandler>();
//this will insert the errors and call SaveChanges(), the only changes in this new context are the errors
errorHandler.InsertErrors(errors);
}
}
//other methods
}
You should definitely use a new context for this. Context is unit of work and once your business logic says: "Hey I don't want to update this entity" then the entity is not part of unit of work. You can either detach the entity or create new context.
There is possibility to use Refresh method but that method is supposed to be used in scenarios where you have to deal with optimistic concurrency. Because of that this method refreshes only scalar and complex properties and foreign keys if part of the entity - if you made changes to navigation properties these can be still present after you refresh the entity.
Take a look at ObjectContext.Refresh with RefreshMode.StoreWins I think that will do what you want. Starting a new context would achieve the same thing I guess, but not be as neat.
Just need to clarify this one, If I have the below interface
public interface IRepository<T>
{
T Add(T entity);
}
when implementing it, does checking for duplication if entity is already existing before persist it is still a job of the Repository, or it should handle some where else?
Yes - I recommend doing these checks in the repository.
Long answer: The term "repository" is a bit vague, but it is used more and more as the name of the persistence abstraction layer. The name is nice, but does not say too much: If you take Asp.Net MVC as an example, the sample apps, like Neirds dinner and alike, or codeplex projects encapsulate data access by the repository class. If such layer is implemented with a relational database for instancce, the primary keys of the tables will not allow duplicate entries, which means that in this case the repository implementation will throw an exception if 2 entries with the same key are inserted. So in other words, a RDBMS-implementation of a repository will quite always due this check, you wont be able to avoid it. So to make the behavior of repostories out there in the world most similar and to avoid surprises, lets all of them do this check.
It is a second question whether you should maintain in the business logic already that your Add() method is not alled with an entry that already exists. Sometimes it makes good sense to resolve this only at a single point, the database for instance, due to concurrency issues or savings of roundtrips. On the other hand it is for instance nice to tell the user as soon as possible that a username is already taken. So this depends.
have a nice day
If the entity already exists, you can either throw an exception, or update the existing entity's fields.
If you choose the latter, the method should probably be called something like AddOrUpdate()
Linq to SQL example
If I am retrieving a single record, I will use
public Entity GetEntity(int entityID)
{
return dataContext.Entities.SingleOrDefault(e => e.EntityID = entityID);
}
...And in the calling method, I will check to see if what is returned is null before attempting to use the returned entity.
If I am updating a record, I will retrieve the entity as shown, edit the entity, and then call an UpdateEntity(entityID) repository method to update the fields in the database.
If I am adding a record, it's even easier. Since this is a database, and my tables always contain an Identity field of type int (an auto-assignable number, essentially), adding a record is the simplest operation of all (it's always a new record):
Public void InsertEntity(Entity entity)
{
dataContext.Entities.InsertOnSubmit(entity);
dataContext.SubmitChanges();
}
Business rules (email addresses are unique, for example) can be handled in the repository, or in a separate business layer. If you are looking for the most "correct" way, I think most people would agree that business rules belong in their own Business Logic Layer.
Essentially the decision on where to handle that case depends on your exact requirements.
If you have business rules that define clear cut actions for when this happens, eg if a duplicate exists the new item should be renamed, then it can be built into the repository class.
On the other hand, if more complex rules are in place whereby, for example, more information is required to change the item before adding, then it should be handled further up the food chain.
The concept of a repository states that it exists to perform the persistence activities.
So if you can do it all within the repository, that's fine. If you find you start to reference outside the repository, or your repository has dependencies, eg calling another repository, or a service, or a manager (or whatever processor nomenclature you prefer), then it's a good sign to take it back a step.