Why EF Entities are making a dependency and the need of POCO

Why EF Entities are making a dependency and the need of POCO - c#

I'm pretty new to IoC, Dependency Injection and Unit Testing. I'm starting a new pet project and I'm trying to do it right.
I was planning to use the Repository pattern to mediate with the data. The objects that I was going to return from the repositories were going to be objects collected from a Linq to entities data context (EF4).
I'm reading in "Dependency Injection" from Mark Seeman that doing it, makes an important dependency and will definitely complicate the testing (that's what he's using POCO objects in a Library Project).
I'm not understanding why. Although the objects are created by a linq to entities context, I can create them simply calling the constructor as they were normal objects. So I assume that is possible to create fake repositories that deviler these objects to the caller.
I'm also concerned about the automatic generation of POCO classes, which is not very easy.
Can somebody bring some light? Are POCO objects trully necessary for a decoupled and testable project?
**EDIT: Thanks to Yuck I understand that it's better to avoid autogeneration with templates, which brings me to a design question. If I come from a big legacy database wich his tables are assuming a variety of responsabilities (doesn't fit well with the concept of a class with a single responsability), what's the best way to deal with that?
Delete the database is not an option ;-)

No, they're not necessary it just makes things easier, cleaner.
The POCO library won't have any knowledge that it's being used by Entity Framework. This allows it to be used in other ways - in place of a view model, for instance. It also allows you to use the same project on both sides of a WCF service which eliminates the need to create data transfer objects (DTO).
Just two examples from personal experience but there are surely more. In general the less a particular object or piece of code knows about who is using it or how it's being used will make it more adaptable and generic for other situations.
You also mention automatic generation of POCO classes. I don't recommend doing this. Were you planning to generate the class definitions from your database structure?

I was planning to use the Repository pattern to mediate with the data.
The objects that I was going to return from the repositories were
going to be objects collected from a Linq to entities data context
(EF4).
The default classes (not the POCOs) EF generates contain proxies for lazy loading and are tied at the hip to Entity Framework. That means any other class that wants to use those classes will have to reference the required EF assemblies.
This is the dependency Mark Seeman is talking about. Since you are now dependent on these non-abstract types, which in turn are dependent on EF, you cannot simply change the implementation of your repository to something different (i.e. just using your own persistence store) without addressing this change in the class that depend on these types.
If you are truly only interested in the public properties of the EF generated types then you can have the partial classes generated by EF implement a base interface. Put all the properties you need in that base interface and pass the dependency in as the base interface - now you only depend on the base interface and not EF anymore.

Related

Mapping between database entities and domain objects or DTO

I have a complex mapping logic between my database objects and domain objects. Hence, I am not using any third-party library like AutoMapper. I am wondering what would be a better code design:
Option #1. Having a mapper abstracted as IServiceNameMapper and inject as a dependency into IServiceName
Option #2. Write extension methods to database objects and domain objects to convert between them.
I am leaning towards option #1 as I get a nice separation of concern, wherein the mapping logic is abstracted away from the core service logic and injected as a dependency into service. This also gives me ability to extend or change the mapping logic in the future without having to change the service implementation.
Any thoughts?

From purely architectural point of view option 1 is better. However sometimes this can be overdesign. For example I have no interest in separating my DTOs from the Entities so what I do is just accept the entity in the constructor of the DTO or create ToEntity() method on the DTO and keep the mapping in the DTO. Theoretically it is bad because the DTOs cannot be used in another project without pulling in the entities but in practice I have estimated that the risk that I will need that is pretty low since Silverlight is now dead :) Unless you are into extreme unit testing I think the extension methods are fine. Basically the only problem that can occur is that your unit tests will test both the method and the mapping logic and you won't be able to write a test that only tests the method. Big deal!

Shall, I implement Repository, Unit of Work for EF6?

Since MSDN says about DbContext:
A DbContext instance represents a combination of the Unit Of Work and
Repository patterns such that it can be used to query from a database
and group together changes that will then be written back to the store
as a unit. DbContext is conceptually similar to ObjectContext.
Is it not redundant to implement these two (Unit of Work & Repository) when using EF5+?
Can somebody shed more light on this subject?
I intend to build an MVC based application using SQL server and after reading a lot about data access techniques with unit testability, I am kind of lost with the above info!

That depends on the complexity of your project and its requirements. For instance, these two questions might help your decision making:
Will you use any other data sources besides EF that must work along it?
How likely it is that you swap EF for a different ORM or data source in the future?
If you can't foresee changes or you don't need to work with more than just EF then it's probably not worth the trouble.

I would create a Generic repository so you can mock it in your tests more easily than mocking Entity Framework's context directly. But, yes, EF 5+ does implement these patterns as MSDN states.
It's a layer of abstraction. The repository pattern is a collection of objects and a thing to get a collection of objects. Entity Framework knows HOW to get that collection of objects, the repository does not know HOW.
Entity Framework has a lot features that you potentially loose by wrapping it in a repository or a thinner service. If you practise TDD coding against your own classes is often more comfortable than mocking third-party code.
Ayende has a blog post about this.

Rich domain model with ORM

I seem to be missing something and extensive use of google didn't help to improve my understanding...
Here is my problem:
I like to create my domain model in a persistence ignorant manner, for example:
I don't want to add virtual if I don't need it otherwise.
I don't like to add a default constructor, because I like my objects to always be fully constructed. Furthermore, the need for a default constructor is problematic in the context of dependency injection.
I don't want to use overly complicated mappings, because my domain model uses interfaces or other constructs not readily supported by the ORM.
One solution to this would be to have separate domain objects and data entities. Retrieval of the constructed domain objects could easily be solved using the repository pattern and building the domain object from the data entity returned by the ORM. Using AutoMapper, this would be trivial and not too much code overhead.
But I have one big problem with this approach: It seems that I can't really support lazy loading without writing code for it myself. Additionally, I would have quite a lot of classes for the same "thing", especially in the extended context of WCF and UI:
Data entity (mapped to the ORM)
Domain model
WCF DTO
View model
So, my question is: What am I missing? How is this problem generally solved?
UPDATE:
The answers so far suggest what I already feared: It looks like I have two options:
Make compromises on the domain model to match the prerequisites of the ORM and thus have a domain model the ORM leaks into
Create a lot of additional code
UPDATE:
In addition to the accepted answer, please see my answer for concrete information on how I solved those problems for me.

I would question that matching the prereqs of an ORM is necessarily "making compromises". However, some of these are fair points from the standpoint of a highly SOLID, loosely-coupled architecture.
An ORM framework exists for one sole reason; to take a domain model implemented by you, and persist it into a similar DB structure, without you having to implement a large number of bug-prone, near-impossible-to-unit-test SQL strings or stored procedures. They also easily implement concepts like lazy-loading; hydrating an object at the last minute before that object is needed, instead of building a large object graph yourself.
If you want stored procs, or have them and need to use them (whether you want to or not), most ORMs are not the right tool for the job. If you have a very complex domain structure such that the ORM cannot map the relationship between a field and its data source, I would seriously question why you are using that domain and that data source. And if you want 100% POCO objects, with no knowledge of the persistence mechanism behind, then you will likely end up doing an end run around most of the power of an ORM, because if the domain doesn't have virtual members or child collections that can be replaced with proxies, then you are forced to eager-load the entire object graph (which may well be impossible if you have a massive interlinked object graph).
While ORMs do require some knowledge in the domain of the persistence mechanism in terms of domain design, an ORM still results in much more SOLID designs, IMO. Without an ORM, these are your options:
Roll your own Repository that contains a method to produce and persist every type of "top-level" object in your domain (a "God Object" anti-pattern)
Create DAOs that each work on a different object type. These types require you to hard-code the get and set between ADO DataReaders and your objects; in the average case a mapping greatly simplifies the process. The DAOs also have to know about each other; to persist an Invoice you need the DAO for the Invoice, which needs a DAO for the InvoiceLine, Customer and GeneralLedger objects as well. And, there must be a common, abstracted transaction control mechanism built into all of this.
Set up an ActiveRecord pattern where objects persist themselves (and put even more knowledge about the persistence mechanism into your domain)
Overall, the second option is the most SOLID, but more often than not it turns into a beast-and-two-thirds to maintain, especially when dealing with a domain containing backreferences and circular references. For instance, for fast retrieval and/or traversal, an InvoiceLineDetail record (perhaps containing shipping notes or tax information) might refer directly to the Invoice as well as the InvoiceLine to which it belongs. That creates a 3-node circular reference that requires either an O(n^2) algorithm to detect that the object has been handled already, or hard-coded logic concerning a "cascade" behavior for the backreference. I've had to implement "graph walkers" before; trust me, you DO NOT WANT to do this if there is ANY other way of doing the job.
So, in conclusion, my opinion is that ORMs are the least of all evils given a sufficiently complex domain. They encapsulate much of what is not SOLID about persistence mechanisms, and reduce knowledge of the domain about its persistence to very high-level implementation details that break down to simple rules ("all domain objects must have all their public members marked virtual").

In short - it is not solved
(here goes additional useless characters to post my awesome answer)

All good points.
I don't have an answer (but the comment got too long when I decided to add something about stored procs) except to say my philosophy seems to be identical to yours and I code or code generate.
Things like partial classes make this a lot easier than it used to be in the early .NET days. But ORMs (as a distinct "thing" as opposed to something that just gets done in getting to and from the database) still require a LOT of compromises and they are, frankly, too leaky of an abstraction for me. And I'm not big on having a lot of dupe classes because my designs tend to have a very long life and change a lot over the years (decades, even).
As far as the database side, stored procs are a necessity in my view. I know that ORMs support them, but the tendency is not to do so by most ORM users and that is a huge negative for me - because they talk about a best practice and then they couple to a table-based design even if it is created from a code-first model. Seems to me they should look at an object datastore if they don't want to use a relational database in a way which utilizes its strengths. I believe in Code AND Database first - i.e. model the database and the object model simultaneously back and forth and then work inwards from both ends. I'm going to lay it out right here:
If you let your developers code ORM against your tables, your app is going to have problems being able to live for years. Tables need to change. More and more people are going to want to knock up against those entities, and now they all are using an ORM generated from tables. And you are going to want to refactor your tables over time. In addition, only stored procedures are going to give you any kind of usable role-based manageability without dealing with every tabl on a per-column GRANT basis - which is super-painful. If you program well in OO, you have to understand the benefits of controlled coupling. That's all stored procedures are - USE THEM so your database has a well-defined interface. Or don't use a relational database if you just want a "dumb" datastore.

Have you looked at the Entity Framework 4.1 Code First? IIRC, the domain objects are pure POCOs.

this what we did on our latest project, and it worked out pretty well
use EF 4.1 with virtual keywords for our business objects and have our own custom implementation of T4 template. Wrapping the ObjectContext behind an interface for repository style dataaccess.
using automapper to convert between Bo To DTO
using autoMapper to convert between ViewModel and DTO.
you would think that viewmodel and Dto and Business objects are same thing, and they might look same, but they have a very clear seperation in terms of concerns.
View Models are more about UI screen, DTO is more about the task you are accomplishing, and Business objects primarily concerned about the domain
There are some comprimises along the way, but if you want EF, then the benfits outweigh things that you give up

Over a year later, I have solved these problems for me now.
Using NHibernate, I am able to map fairly complex Domain Models to reasonable database designs that wouldn't make a DBA cringe.
Sometimes it is needed to create a new implementation of the IUserType interface so that NHibernate can correctly persist a custom type. Thanks to NHibernates extensible nature, that is no big deal.
I found no way to avoid adding virtual to my properties without loosing lazy loading. I still don't particularly like it, especially because of all the warnings from Code Analysis about virtual properties without derived classes overriding them, but out of pragmatism, I can now live with it.
For the default constructor I also found a solution I can live with. I add the constructors I need as public constructors and I add an obsolete protected constructor for NHibernate to use:
[Obsolete("This constructor exists because of NHibernate. Do not use.")]
protected DataExportForeignKey()
{
}

Entity Framework POCO Object Creation

we are currently using EF 4.1 and are thinking about a new alternative for our object model which is terrible. We do have POCOs in a sort of BL-Layer and above a GUI-Model with Objects wrapping the POCOs to offer BindingLists to the UI instead of the BL's IEnumerables.
We thought about deriving the UI-Model from the POCOs but I have no Idea how this would work with the EF instantiating the objects as it shouldn't know anything about the UI objects. Is there some way to move the instantiating process to factories or does anyone have an idea how to promote the object afterwards from base to derived type (which isn't really a good idea at all, is it?)
Any help, suggestions or comments would be highly appreciated.
Best regards
Gope

No it is not possible to move object instantiation to factories. Because of that deriving custom classes from your entities will not work because EF will not create those instances for you and you will also not be able to persist those instances because EF will not know how to map them.

The introduction of POCO support in EF4 was to allow the business objects to exist without extending objects in the EF (specifically EntityObject). I've worked on a number of systems now where the domain model is shared across all layers in the system and a repository patterns is used to handle object persistence. A number of teams use the concept of data transfer objects (DTO) to provide the UI with a lightweight object that the UI can work with (which also provides a certain degree of abstraction to the BLL from the UI).
Microsoft Spain recently published a document and sample application that talks about how to structure enterprise applications (http://microsoftnlayerapp.codeplex.com/) which might give you some ideas about how to structure things better.

How to avoid declaring database fields twice, once in database, once in a repository/model?

I recently began reading Pro ASP.NET MVC Framework.
The author talks about creating repositories, and using interfaces to set up quick automated tests, which sounds awesome.
But it carries the problem of having to declare yourself all the fields for each table in the database twice: once in the actual database, and once in the C# code, instead of auto-generating the C# data access classes with an ORM.
I do understand that this is a great practice, and enables TDD which also looks awesome. But my question is:
Isn't there any workaround having to declare fields twice: both in the database and the C# code? Can't I use something that auto-generates the C# code but still allows me to do TDD without having to manually create all the business logic in C# and creating a repository (and a fake one too) for each table?

I understand what you mean: most of the POCO classes that you're declaring to have retrieved by repositories look a whole lot like the classes that get auto-generated by your ORM framework. It is tempting, therefore, to reuse those data-access classes as if they were business objects.
But in my experience, it is rare for the data I need in a piece of business logic to be exactly like the data-access classes. Usually I either need some specific subset of data from a data object, or some combination of data produced by joining a few data objects together. If I'm willing to spend another two minutes actually constructing the POCO that I have in mind, and creating an interface to represent the repository methods I plan to use, I find that the code ends up being much easier to refactor when I need to change business logic.

If you use Entity Framework 4, you can generate POCO object automatically from the database. ( Link)
Then you can implement a generic IRepository and its generic SqlRepository, this will allow you to have a repository for all your objects. This is explained here : http://msdn.microsoft.com/en-us/ff714955.aspx
This is a clean way to achieve what you want: you only declare your object once in your database, generate them automatically, and can easily access them with your repository (in addition you can do IoC and unit test :) )
I recommend you to read the second edition of this book which is pure gold and updated with the new features introduced in MVC 2
http://www.amazon.com/ASP-NET-Framework-Second-Experts-Voice/dp/1430228865/ref=sr_1_1?s=books&ie=UTF8&qid=1289851862&sr=1-1
And you should also read about the new features introduced in MVC3 which is now in RC (there is a new view engine really useful) http://weblogs.asp.net/scottgu/archive/2010/11/09/announcing-the-asp-net-mvc-3-release-candidate.aspx

You are not declaring the business logic twice. It's just that this business logic is abstracted behind an interface and in the implementation of this interface you can do whatever you want : hit the database, read from the filesystem, aggregate information from web addresses, ... This interface allows weaker coupling between the controller and the implementation of the repository and among other simplifies TDD. Think of it as a contract between the controller and the business.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.