Not sure if there's a "offical" name, but by DataContext I mean an object which transparently maintains objects' state, providing change tracking, statement-of-work functionality, concurrency and probably many other useful features. (In Entity Framework it's ObjectContext, in NHibernate - ISession).
Eventually I've come to an idea that something like that should be implemented in my application (it uses mongodb as back-end, and mongodb's partial updates are fine when we're able to track a certain property change).
So actually, I've got several questions on this subject
Could anyone formulate requirements to DataContext? - what's your understanding of it's tasks and responsibilities? (The most relevant I've managed to find is Esposito's book, but unfortunately that's at about msdn samples level).
What would you suggest for changes tracking implementation? (In simplest way it's possible to track changes "manually" in entities, but requires coding and mixes dal with business logic, so I mostly interested in "automatic" way, keeping entities more poco).
Is there way to utilize some existing solution? (I hoped nhibernate infrastructure would allow plugging-in custom module to work with mongo behind the scene, but not sure if it allows working with non-sql dbs at all).
The DataContext (ObjectContext or DbContext in EF) is nothing else than an implementation of the Unit of Work (UoW)/Repository pattern.
I suggest you Fowler's book on Patterns of Enterprise Application Architecture in which he outlines the implementation of several persistency patterns. That might be a help in implementing your own solution.
A DataContext basically needs to fullfil the job of a UoW. It needs to handle the reading and management of objects that are involved in a given lifecycle (i.e. HTTP request), s.t. there are no two objects in memory that represent the same record on the DB. Moreover it needs to provide some change tracking for performing partial updates to the DB (as you already mentioned).
To what regards change tracking, I fully agree that polluting properties with change events etc is bad. One of the recent templates introduced in EF4.1 uses Proxies to handle that and to give the possibility to have plain POCOs.
Answer to quetion 2: To make POCO classes work, you will need to generate code at run-time, possibly using System.Reflection.
If you analyse EntityFramework, you will see that it requires virtual properties to do change-tracking... that is because it needs to create a generated class at run-time, that overrides each property, and then adds code to tell the DataContext when someone changes that property.
EntityFramework also generates code to initialize collections, so that when someone try to do operations like Add and Remove, the collection object itself knows what to do.
Related
Are there any frameworks out there for MongoDB in C# that can automatically map Document Relations? I'm now talking about a model or "schema" that is purely defined by documents themselves and not by objects in .Net or any other external schema for that matter.
Think dynamic objects / bsondocuments that can automatically lazy-load relations between other documents.
I have several ideas how to solve this myself however if there already exist any frameworks or perhaps BsonDocument extensions (how I intended to solve this myself) this would lessen the need to add complexity to the project I'm working at itself.
The question is largely off-topic ('are there frameworks'), but I'd like to challenge the idea in itself:
this would lessen the need to add complexity to the project I'm working at itself.
I think it would merely hide complexity by moving it to a part of the code that knows nothing about your functional or non-functional requirements. Combined with a database that has no constraints except unique that doesn't sound like a good idea.
I'd recommend to stay away from lazy loading as an almost general rule, because it makes it impossible to tell whether
an operation is super costly (database call) or a mere memory lookup
the properties' state will be fetched on access, or is cached, thus hiding the key aspect of serialization from the user.
In other words: I'd stay away from the idea, or use something like EF with whatever database for it. If you don't care about your serialization, use a well-tested commonplace solution.
So at my job I was pointed to http://www.codeproject.com/Articles/990492/RESTful-Day-sharp-Enterprise-Level-Application#_Toc418969121 and was told to learn these patterns and implement them in my solution.
What confused me was that these things were before entity framework 6 and from what I understood, Unity of Work is used to optimize database performance by grouping queries together. Since EF6 has already these optimizations, should I still implement these layers? I get that the layerness helps with modularization and switching of data source. Does that mean that EF6 is too complex to implement with these patterns and should ADO.Net be used directly or something like that?
EDIT: I've noticed that this added layer allows usage of mock data sources. Not sure how useful this is because of the need to add another layer of apstraction
"Unit of Work is used to optimize database performance by grouping queries together." - This is not correct. Unit of Work is there to collect related operations together into a single transaction which is then committed or rolled back as a whole. It tracks changes made to objects so that required database operations can be deduced automatically and performed on your behalf.
When you work with Entity Framework, you use it to create DbContext from model. That class is both the Repository and Unit of Work, so you don't have to do anything special. Things only become more complicated than that when your project becomes so large that DbContext becomes more of a burden.
Repository is used to abstract your application from datasource, but since EntityFramework implements this pattern by itself and gives you a possibility to change data source seamlessly, there is no neccesity to add one more layer of abstraction.
You will just limit EF possibilities, while creating something like GenericRepository<T>. And nevertheless you won't be able to replace EF by another library with no changes to your code, even if you implement such a layer. (Some queries written for EF will fail for NHibernate, for example).
Just don't use DbContext everywhere in your application (inside UI code at least), use it by your data access layer (services with business dependent methods or something in that way).
Even for scenarios, where some cloud data storage is used (which EF won't be able to handle seamlessly), there is no neccessity for that layer, it's better to introduce separate classes and use them explicitly, because you cannot fit db and cloud interaction into one abstraction, it will start leaking at some point.
Entity Framework is a UnitOfWork/Repository pattern itself. If you need to abstract yourself from EF, then you could implement a layer on top of EF with your own UoW/Rep pattern.
This is good if you want to replace EF at some point in your proyect.
The cons? I think that building a UoW on top of EF gives you a redundant architecture and you will end up writing more code for something that maybe will never change.
In my current proyect, the main structure is very common, with a Data layer (with a sublayer for the Entities) for EF, Logic layer (where I put all the Business Logic) and the View layer (It can be web, or whatever). With that structure I directly invoke EF in the Logic layer.
Iam doing on some business staff for renting offices etc. One req from customer was that the product should be able to track changes in properties for specific entities (f.e. BusinessName property in Tenant class). I come up with a solution that there will be an decorator classes for every property. There will be a date when the "change was noticed" so based on that date i would like to wrap up original Tenant:ITenant class with f.e BusinessNameDecorator:TenantDecorator:ITenant class.
Raw solution is something like this
The problem for that is that the code starts to smell and plumbing get to business classes. Is there any proved and verified solution or pattern for tracking changes and persisting them in business entities? I dont want to reinvent the wheel but unfortunately i did not found anything.
Thank you for your help
P.S.: Sorry for my bad english :)
You may try setting up a mechanism like EF's POCO support. When you use POCOs with Entity Framework, EF still manages to track which properties have changed on your entity objects.
Basically EF uses Reflection.Emit to create at runtime classes that derive from your POCOs and add the change tracking behavior (provided that the tracked properties of your POCOs are declared as virtual).
It can be really complicated to achieve this, but if you have to apply the mechanism to a large number of classes it's probably well worth it.
I seem to be missing something and extensive use of google didn't help to improve my understanding...
Here is my problem:
I like to create my domain model in a persistence ignorant manner, for example:
I don't want to add virtual if I don't need it otherwise.
I don't like to add a default constructor, because I like my objects to always be fully constructed. Furthermore, the need for a default constructor is problematic in the context of dependency injection.
I don't want to use overly complicated mappings, because my domain model uses interfaces or other constructs not readily supported by the ORM.
One solution to this would be to have separate domain objects and data entities. Retrieval of the constructed domain objects could easily be solved using the repository pattern and building the domain object from the data entity returned by the ORM. Using AutoMapper, this would be trivial and not too much code overhead.
But I have one big problem with this approach: It seems that I can't really support lazy loading without writing code for it myself. Additionally, I would have quite a lot of classes for the same "thing", especially in the extended context of WCF and UI:
Data entity (mapped to the ORM)
Domain model
WCF DTO
View model
So, my question is: What am I missing? How is this problem generally solved?
UPDATE:
The answers so far suggest what I already feared: It looks like I have two options:
Make compromises on the domain model to match the prerequisites of the ORM and thus have a domain model the ORM leaks into
Create a lot of additional code
UPDATE:
In addition to the accepted answer, please see my answer for concrete information on how I solved those problems for me.
I would question that matching the prereqs of an ORM is necessarily "making compromises". However, some of these are fair points from the standpoint of a highly SOLID, loosely-coupled architecture.
An ORM framework exists for one sole reason; to take a domain model implemented by you, and persist it into a similar DB structure, without you having to implement a large number of bug-prone, near-impossible-to-unit-test SQL strings or stored procedures. They also easily implement concepts like lazy-loading; hydrating an object at the last minute before that object is needed, instead of building a large object graph yourself.
If you want stored procs, or have them and need to use them (whether you want to or not), most ORMs are not the right tool for the job. If you have a very complex domain structure such that the ORM cannot map the relationship between a field and its data source, I would seriously question why you are using that domain and that data source. And if you want 100% POCO objects, with no knowledge of the persistence mechanism behind, then you will likely end up doing an end run around most of the power of an ORM, because if the domain doesn't have virtual members or child collections that can be replaced with proxies, then you are forced to eager-load the entire object graph (which may well be impossible if you have a massive interlinked object graph).
While ORMs do require some knowledge in the domain of the persistence mechanism in terms of domain design, an ORM still results in much more SOLID designs, IMO. Without an ORM, these are your options:
Roll your own Repository that contains a method to produce and persist every type of "top-level" object in your domain (a "God Object" anti-pattern)
Create DAOs that each work on a different object type. These types require you to hard-code the get and set between ADO DataReaders and your objects; in the average case a mapping greatly simplifies the process. The DAOs also have to know about each other; to persist an Invoice you need the DAO for the Invoice, which needs a DAO for the InvoiceLine, Customer and GeneralLedger objects as well. And, there must be a common, abstracted transaction control mechanism built into all of this.
Set up an ActiveRecord pattern where objects persist themselves (and put even more knowledge about the persistence mechanism into your domain)
Overall, the second option is the most SOLID, but more often than not it turns into a beast-and-two-thirds to maintain, especially when dealing with a domain containing backreferences and circular references. For instance, for fast retrieval and/or traversal, an InvoiceLineDetail record (perhaps containing shipping notes or tax information) might refer directly to the Invoice as well as the InvoiceLine to which it belongs. That creates a 3-node circular reference that requires either an O(n^2) algorithm to detect that the object has been handled already, or hard-coded logic concerning a "cascade" behavior for the backreference. I've had to implement "graph walkers" before; trust me, you DO NOT WANT to do this if there is ANY other way of doing the job.
So, in conclusion, my opinion is that ORMs are the least of all evils given a sufficiently complex domain. They encapsulate much of what is not SOLID about persistence mechanisms, and reduce knowledge of the domain about its persistence to very high-level implementation details that break down to simple rules ("all domain objects must have all their public members marked virtual").
In short - it is not solved
(here goes additional useless characters to post my awesome answer)
All good points.
I don't have an answer (but the comment got too long when I decided to add something about stored procs) except to say my philosophy seems to be identical to yours and I code or code generate.
Things like partial classes make this a lot easier than it used to be in the early .NET days. But ORMs (as a distinct "thing" as opposed to something that just gets done in getting to and from the database) still require a LOT of compromises and they are, frankly, too leaky of an abstraction for me. And I'm not big on having a lot of dupe classes because my designs tend to have a very long life and change a lot over the years (decades, even).
As far as the database side, stored procs are a necessity in my view. I know that ORMs support them, but the tendency is not to do so by most ORM users and that is a huge negative for me - because they talk about a best practice and then they couple to a table-based design even if it is created from a code-first model. Seems to me they should look at an object datastore if they don't want to use a relational database in a way which utilizes its strengths. I believe in Code AND Database first - i.e. model the database and the object model simultaneously back and forth and then work inwards from both ends. I'm going to lay it out right here:
If you let your developers code ORM against your tables, your app is going to have problems being able to live for years. Tables need to change. More and more people are going to want to knock up against those entities, and now they all are using an ORM generated from tables. And you are going to want to refactor your tables over time. In addition, only stored procedures are going to give you any kind of usable role-based manageability without dealing with every tabl on a per-column GRANT basis - which is super-painful. If you program well in OO, you have to understand the benefits of controlled coupling. That's all stored procedures are - USE THEM so your database has a well-defined interface. Or don't use a relational database if you just want a "dumb" datastore.
Have you looked at the Entity Framework 4.1 Code First? IIRC, the domain objects are pure POCOs.
this what we did on our latest project, and it worked out pretty well
use EF 4.1 with virtual keywords for our business objects and have our own custom implementation of T4 template. Wrapping the ObjectContext behind an interface for repository style dataaccess.
using automapper to convert between Bo To DTO
using autoMapper to convert between ViewModel and DTO.
you would think that viewmodel and Dto and Business objects are same thing, and they might look same, but they have a very clear seperation in terms of concerns.
View Models are more about UI screen, DTO is more about the task you are accomplishing, and Business objects primarily concerned about the domain
There are some comprimises along the way, but if you want EF, then the benfits outweigh things that you give up
Over a year later, I have solved these problems for me now.
Using NHibernate, I am able to map fairly complex Domain Models to reasonable database designs that wouldn't make a DBA cringe.
Sometimes it is needed to create a new implementation of the IUserType interface so that NHibernate can correctly persist a custom type. Thanks to NHibernates extensible nature, that is no big deal.
I found no way to avoid adding virtual to my properties without loosing lazy loading. I still don't particularly like it, especially because of all the warnings from Code Analysis about virtual properties without derived classes overriding them, but out of pragmatism, I can now live with it.
For the default constructor I also found a solution I can live with. I add the constructors I need as public constructors and I add an obsolete protected constructor for NHibernate to use:
[Obsolete("This constructor exists because of NHibernate. Do not use.")]
protected DataExportForeignKey()
{
}
What class in my project should responsible for keeping track of which Aggregate Roots have already been created so as not to create two instances for the same Entity. Should my repository keep a list of all aggregates it has created? If yes, how do i do that? Do I use a singleton repository (doesn't sound right to me)? Would another options be to encapsulate this "caching" somewhere else is some other class? What would this class look like and in what pattern would it fit?
I'm not using an O/R mapper, so if there is a technology out there that handles it, i'd need to know how it does it (as in what pattern it uses) to be able to use it
Thanks!
I believe you are thinking about the Identity Map pattern, as described by Martin Fowler.
In his full description of the pattern (in his book), Fowler discusses implementation concerns for read/write entities (those that participate in transactions) and read-only (reference data, which ideally should be read only once and subsequently cached in memory).
I suggest obtaining his excellent book, but the excerpt describing this pattern is readable on Google Books (look for "fowler identity map").
Basically, an identity map is an object that stores, for example in a hashtable, the entity objects loaded from the database. The map itself is stored in the context of the current session (request), preferably in a Unit of Work (for read/write entities). For read-only entities, the map need not be tied to the session and may be stored in the context of the process (global state).
I consider caching to be something that happens at the Service level, rather than in a repository. Repositories should be "dumb" and just do basic CRUD operations. Services can be smart enough to work with caching as necessary (which is probably more of a business rule than a low-level data access rule).
To put it simply, I don't let any code use Repositories directly - only Services can do that. Then everything else consumes the relevant services as interfaces. That gives you a nice wrapper for putting in biz logic, caching, etc.
I would say, if this "caching" managed anywhere other than the Repository then you are letting concerns leak out.
The repository is your collection of items. No code consuming the repository should have to decide whether to retrieve the object from the repository or from somewhere else.
Singleton sounds like the wrong lifetime; it likely should be per-request. This is easy to manage if you are using an IoC/DI container.
It seems like the fact that you even have consider multiple calls for the same aggregate evidences a architecture/design problem. I would be interested to hear an example of what those first and second calls might be, and why they require the same exact instance of your AR.