Repository; Mapping between POCOs / Linq-to-Sql entity classes - c#

I'm making my first database program, with SQL Express. Currently I'm using Linq-to-Sql for data access, and my repository classes return "entity" type objects, meaning I extend the dbml entity classes and use them as my business object classes. Now I want to make this more separated, and have POCO business objects.
This is where I wonder what different solutions may exist. It looks to me like I need to manually map each entity class into a domain class, property by property, in the repositories. So far I have about 20 tables with a few hundred columns in total. I just want to verify: is this a common/typical approach that is still in use? And if there are alternatives that don't introduce excessive complexity, what would they be?

Before creating your mappings manually, have a look at AutoMapper.
AutoMapper is an object-object mapper. Object-object mapping works by transforming an input object of one type into an output object of a different type. What makes AutoMapper interesting is that it provides some interesting conventions to take the dirty work out of figuring out how to map type A to type B. As long as type B follows AutoMapper's established convention, almost zero configuration is needed to map two types.
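For illustration, a minimal convention-based mapping might look like the sketch below. The CustomerEntity and CustomerDomain types are hypothetical stand-ins, and the MapperConfiguration API shown is from recent AutoMapper versions:

using AutoMapper;

public class CustomerEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class CustomerDomain
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class MappingDemo
{
    public static CustomerDomain ToDomain(CustomerEntity entity)
    {
        // Matching property names mean the convention needs no
        // per-property configuration. In real code the configuration
        // would be created once, not per call.
        var config = new MapperConfiguration(cfg =>
            cfg.CreateMap<CustomerEntity, CustomerDomain>());
        return config.CreateMapper().Map<CustomerDomain>(entity);
    }
}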

AutoMapper is a good tool for class-to-class conversions. However, when I think of a DAL that combines Linq2Sql and AutoMapper, I wonder: why not just go with Fluent NHibernate? It's very easy to set up, works on just about any database including SQL Express, and there is a Linq provider that integrates pretty seamlessly. All of this is free open-source code and very commonly used, so there's ample documentation and support.
If you want to stay with Linq2Sql but have a more full-featured domain model, you could consider deriving your domain model from the DTOs. That would let you keep the business logic in the domain, with the properties inherited from the DTO. Understand, however, that the Linq2Sql objects cannot be cast directly to domain objects; you'll need a constructor in the domain class that takes a DTO and copies its data across (at least a one-way mapping of DTO to domain). The reverse conversion isn't necessary, though: because a derived class can always be treated as its base class, you can hand the domain class to the repository wherever it expects the DTO.
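A minimal sketch of that idea, with hypothetical type names standing in for the generated Linq2Sql classes:

// Hypothetical Linq2Sql-generated DTO (simplified).
public class CustomerDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal CreditLimit { get; set; }
}

// The domain class derives from the DTO and adds behavior.
public class Customer : CustomerDto
{
    // One-way mapping: copy the DTO's data into the domain object.
    public Customer(CustomerDto dto)
    {
        Id = dto.Id;
        Name = dto.Name;
        CreditLimit = dto.CreditLimit;
    }

    public bool CanPlaceOrder(decimal orderTotal) => orderTotal <= CreditLimit;
}

// No reverse mapping needed: a Customer *is a* CustomerDto, so it can be
// handed straight to repository code that expects the DTO.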

Related

DDD principles and repositories with Dapper

I'm setting up a solution in which I apply onion architecture and DDD patterns.
One of the DDD principles encourages domain entities to have only private setters and a private default constructor, to make sure you cannot create domain entities in an invalid state.
Repositories contain the data operations on domain entities, which are mapped from/to the database. I have been trying the following two approaches:
Domain entities in a purist way: no default constructor, no public setters; validation is done in the constructor(s), which makes sure you cannot create a domain entity in an invalid state. The side effect is that it's harder to dematerialize them in the repositories' read operations: you need reflection to create instances and map properties, and you need dynamics in the Dapper requests, which then have to be mapped to the actual domain entities. If I mapped the results directly to the domain entities without dynamics, Dapper would throw an exception that there is no public constructor.
Domain entities in a non-purist way: you allow a default constructor and all setters are public, so you can create entities that are not valid at a given point in time. In that case you need to call the Validate() method manually to make sure they are valid before you continue. This makes dematerializing in the repositories much easier, as you need neither reflection nor dynamics to map the database to the model.
Both methods work; however, with option 2 the repositories become a lot simpler because they contain a lot less custom mapping code, and without reflection they will obviously be more performant as well. Of course, DDD is then not applied in a purist way.
Before I decide what to use in my project, a question: are there any other micro-ORM frameworks out there that can handle private constructors and setters, so that mapping the database to this kind of 'pure' domain entity is supported without additional custom mapping logic? (No EF nor NHibernate, I want something lightweight.)
Or are there other technical solutions that keep the 'pure' model entity approach while still allowing easy repository mapping?
EDIT: the solution I implemented was the following.
First, constructors and setters in the domain entities are all 'internal', which means they cannot be called by consumers of the domain model. However, I use 'InternalsVisibleTo' to give the data access layer direct access to them, which means that dematerializing from the database is very easy with Dapper (no need for intermediate models). From the application layer, I can only use domain methods to change a domain entity, not its properties directly.
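A minimal sketch of that arrangement ("MyApp.DataAccess", the User type, and the Rename method are illustrative, not from the original post):

using System;
using System.Runtime.CompilerServices;

// Grant the data access layer access to the internal members.
[assembly: InternalsVisibleTo("MyApp.DataAccess")]

public class User
{
    // Internal: invisible to application code, usable by the DAL
    // when dematerializing from the database.
    internal User() { }

    public string DisplayName { get; internal set; }

    // The application layer changes the entity via domain methods only.
    public void Rename(string newDisplayName)
    {
        if (string.IsNullOrWhiteSpace(newDisplayName))
            throw new ArgumentException("A display name is required.", nameof(newDisplayName));
        DisplayName = newDisplayName;
    }
}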
Second, to construct new domain entities from my application layer, I added fluent builders to help build domain entities, so I can now construct them like:
User user = new UserBuilder()
    .WithSubjectId("045454857451245")
    .WithDisplayName("Bobby Vinton")
    .WithLinkedAccount("Facebook", la => la.WithProviderSubjectId("1548787788877").WithEmailAddress("bobby1@gmail.com"))
    .WithLinkedAccount("Microsoft", la => la.WithProviderSubjectId("54546545646").WithEmailAddress("bobby2@gmail.com"))
    .Build();
When the builder 'builds' the entity, validation is done as well, so you can never create an entity in an invalid state.
One of the DDD principles encourages domain entities to have only private setters and a private default constructor, to make sure you cannot create domain entities in an invalid state.
That's not quite right. Yes, rich domain models don't usually expose setters, but that is because they don't need setters. You tell the model what to do at a higher level of abstraction, and allow it to determine how its own data structures should be modified.
Similarly, there are often cases where it makes sense to expose the default constructor: if you think of an aggregate as a finite state machine, then the default constructor is a way to initialize the aggregate in its "start" state.
So normally you reconstitute an aggregate in one of two ways: either you initialize it in its default state, and then send it a bunch of messages, or you use the Factory pattern, as described in the blue book.
this means an extra mapping in between, which makes code more complex
Perhaps, but it also ensures that your domain code is less dependent on ORM magic. In particular, it means that your domain logic can be operating on a different data structure than what is used to make persistence "easy".
But it isn't free -- you have to describe in code how to get values out of the aggregate root and back into the database (or into the ORM entity, acting as a proxy for the database).
The key is that you don't use Dapper to work with your domain entities; instead you use it inside your repository layer with POCO entities. Your repository methods then return domain entities by converting the POCO entities that Dapper uses for data access into domain entities.
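A sketch of such a repository method, assuming hypothetical UserRow/User types, a Users table, and a FromRow conversion method; QuerySingle is Dapper's standard extension method on IDbConnection:

using System.Data;
using Dapper;

// Persistence-side POCO: public setters, no invariants.
public class UserRow
{
    public int Id { get; set; }
    public string DisplayName { get; set; }
}

// Domain entity: no public constructor or setters.
public class User
{
    public int Id { get; private set; }
    public string DisplayName { get; private set; }

    private User() { }

    // Hypothetical conversion point between the two worlds.
    internal static User FromRow(UserRow row) =>
        new User { Id = row.Id, DisplayName = row.DisplayName };
}

public class UserRepository
{
    private readonly IDbConnection _connection;

    public UserRepository(IDbConnection connection) => _connection = connection;

    public User GetById(int id)
    {
        // Dapper hydrates the POCO, not the domain entity...
        UserRow row = _connection.QuerySingle<UserRow>(
            "SELECT Id, DisplayName FROM Users WHERE Id = @Id", new { Id = id });

        // ...and the repository converts it into a domain entity.
        return User.FromRow(row);
    }
}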

How do I return a domain object from the repository when there is a separate data model?

Update
My research is telling me that I should be using a Data Mapper: https://martinfowler.com/eaaCatalog/dataMapper.html. Are Data Mappers injected into repositories like this: http://www.rantdriven.com/post/2009/09/01/Using-the-Repository-Pattern-with-the-Command-and-DataMapper-Patterns.aspx and this: Repository and Data Mapper pattern, or are they used as an alternative to a repository? All the examples I find seem to use Data Mappers to map DataReader objects to lists of domain objects. I want to map persistent objects to domain objects.
Original Question
I am trying to build a Domain Model that is completely isolated from the Data Model after reading articles like this:
http://blog.sapiensworks.com/post/2012/04/07/Just-Stop-It!-The-Domain-Model-Is-Not-The-Persistence-Model.aspx. There are only six of us working on this system at the moment, though this could increase to 9+ in the future. However, I am beginning to think that this is not the right approach: a lot of the questions I read on here appear to say to map the ORM directly to the Domain Model.
I recently asked this question: https://softwareengineering.stackexchange.com/questions/365643/should-the-data-model-be-identical-to-the-domain-model-for-mapping-purposes. One of the answerers says: "I believe the mapping in between both should be within a (persistence-oriented) repository". How do you do this mapping? I don't believe I should be using AutoMapper in the repository because of the reasons stated here: Repository pattern and mapping between domain models and Entity Framework and here: http://enterprisecraftsmanship.com/2016/02/08/specification-pattern-c-implementation/ i.e. I cannot simply do this in the repository:
public Customer GetById(int id)
{
    // customerDataStore stands in for the underlying data access
    CustomerData customerData = customerDataStore.GetById(id);
    return Mapper.Map<Customer>(customerData);
}
I cannot do this because the invariants of the domain object would not be considered. How can I return a domain object from the repository? Do I inject a factory into the repository, which takes parameters from the data model and returns a domain model? Is this even the right approach, or is there another pattern for mapping data objects to domain objects?
Quick review of repositories, as defined by Evans
REPOSITORIES (provide) the means of finding and retrieving persistent objects while encapsulating the immense infrastructure involved.
(REPOSITORIES) provide the illusion of an in memory collection...
So the repository's API should normally be expressed in a domain-specific vocabulary; neither the application layer nor the domain model should need to know any specifics beyond what is expressed in the API.
Commonly, the immense infrastructure means a domain-agnostic persistence appliance; we don't persist domain objects, we persist bytes. With some appliances, we work very closely with the byte representation -- think streaming data to and from files. In other cases, the appliance provides an abstraction of those bytes -- an RDBMS gives us an API that understands rows in tables and encapsulates the details of how the bytes are arranged.
What this means is that somewhere we need a transformation from a domain agnostic representation to a domain specific representation, and vice versa.
Normally, these take the form of functions.
toDomainModel: JsonDocument -> DomainModel
toJson: DomainModel -> JsonDocument
These functions are normally defined by, and invoked by, the repository implementation -- they are part of the immense infrastructure that Evans describes.
I cannot do this because the invariants of the domain object will not be considered.
There are a couple possibilities here.
One, of course, would be to get a smarter mapper.
A second possibility is to model the unvalidated representation of the model as a distinct thing from a validated representation.
Example: consider a model for Money, that requires an Amount and a CurrencyCode; and the semantics of your model requires that Amount be a positive number and CurrencyCode be an entry in a fixed collection of tokens. There's nothing wrong with having an UnvalidatedMoney type, that has no semantic restrictions, and a function that converts UnvalidatedMoney to Money (enforcing the invariant).
This is analogous to what Scott Wlaschin describes for modeling a verified email address.
Note that this isn't necessarily an undue burden; if you have an invariant on some domain concept like money, you probably already have validation to do when passing new input from the world into your model. So the work is mostly about making that validation logic reusable.
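A sketch of that separation, with an illustrative currency list and error-handling style:

using System;
using System.Collections.Generic;

public record UnvalidatedMoney(decimal Amount, string CurrencyCode);

public class Money
{
    // Illustrative fixed collection of currency tokens.
    private static readonly HashSet<string> KnownCurrencies = new() { "USD", "EUR", "GBP" };

    public decimal Amount { get; }
    public string CurrencyCode { get; }

    private Money(decimal amount, string currencyCode)
    {
        Amount = amount;
        CurrencyCode = currencyCode;
    }

    // The only way to obtain a Money; the invariant is enforced here.
    public static Money Validate(UnvalidatedMoney raw)
    {
        if (raw.Amount <= 0)
            throw new ArgumentException("Amount must be a positive number.");
        if (!KnownCurrencies.Contains(raw.CurrencyCode))
            throw new ArgumentException($"Unknown currency code: {raw.CurrencyCode}");
        return new Money(raw.Amount, raw.CurrencyCode);
    }
}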
A third possibility is to look into the Builder pattern. Persistent objects are, from one point of view, messages that a model wrote in the past so that it could read them in the future. So it is often the case that looking at common messaging patterns is useful.
In this approach, we would load an instance of the builder (created for us by a factory, implemented by the domain model), which has an API that allows the repository to pass the domain agnostic data to the domain model. Again, the message builder, being provided by the domain model, knows the domain invariant and can validate it.
All the validation for the domain object is done in the constructor. Should I just inject a factory into the repository?
That should be fine. You'll want to keep the coupling as small as you can manage (think interface), and you'll want to think about the fact that the factory is now part of the public API for the model (think about how to change the factory so that it remains backwards compatible).
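A minimal sketch of injecting such a factory behind an interface (all names are hypothetical; Customer stands in for the domain entity, and LoadRow stubs out the real persistence access):

// Defined by the domain model; the repository depends only on this interface.
public interface ICustomerFactory
{
    Customer Reconstitute(int id, string name, decimal creditLimit);
}

public class CustomerRepository
{
    private readonly ICustomerFactory _factory;

    public CustomerRepository(ICustomerFactory factory) => _factory = factory;

    public Customer GetById(int id)
    {
        // Load domain-agnostic data, then let the factory (which knows
        // the invariants) build the aggregate.
        var row = LoadRow(id);
        return _factory.Reconstitute(row.Id, row.Name, row.CreditLimit);
    }

    // Stub standing in for the real data access.
    private (int Id, string Name, decimal CreditLimit) LoadRow(int id) =>
        (id, "Ada Lovelace", 100m);
}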
ORM and Data Mapper
Most .NET ORM frameworks use a data mapper internally, or can be considered a kind of data mapper themselves. Some ORMs in other tech stacks may use the alternative approach to mapping: Active Record. The difference between the two is that in Active Record, the business class knows how to persist itself while a Data Mapper makes it agnostic from any persistence mechanism.
This is a different distinction than the one between having a Data Model + a Domain Model and mapping straight to the Domain Model. Regardless if you map directly to the Domain or if you have an intermediate Data Model, you will always have a Data Mapper. Both approaches imply it. The Data Mapper can be an ORM tool or custom code.
It's not clear from your question if you intend to use both an ORM and an additional Data Mapper on top, but I wouldn't recommend trying that or calling your Data Model <=> Domain Model mapping logic a "Data Mapper" in the PoEAA sense of the term.
Object hydration and invariants
Oftentimes, domain object hydration by a Data Mapper goes through a different path than normal use-case-driven code, in order to bypass all the validation that would otherwise happen. This can be done via parameterless constructors with restricted scope (protected, or internal with InternalsVisibleTo) that the ORM or custom code will have access to. Some newer reflection-based frameworks can also access private fields.
With a Data Model in addition to your Domain Model, this is all simpler, since you don't need to restrict access to a Data Model object the way you would a Domain entity. Data Models don't have invariants, so you can safely leave their parameterless constructor and properties publicly accessible and settable for rehydration. The only thing you need to take care of is to have some FromData() method or constructor on your domain entity that takes a data model as input and instantiates the entity. Again, the technique is made safe thanks to accessibility restrictions - the hydration code can typically live in an assembly granted InternalsVisibleTo by the domain.
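For instance, a sketch with hypothetical Order/OrderData types, the FromData method kept internal and guarded by InternalsVisibleTo:

// Data model: public, settable, no invariants - safe to rehydrate freely.
public class OrderData
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public class Order
{
    public int Id { get; private set; }
    public decimal Total { get; private set; }

    private Order() { }

    // Internal: only callable from an assembly granted InternalsVisibleTo.
    internal static Order FromData(OrderData data) =>
        new Order { Id = data.Id, Total = data.Total };
}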

Should entity objects be exposed by the repository?

I have a repository which implements the interface IRepository. The repository performs queries against Entity Framework on behalf of the application and directly returns the entity objects produced.
The whole point of implementing IRepository is that it can be switched out for different repositories in the future. However, returning the exact entity objects produced by Entity Framework will break this. Is this acceptable?
Should the repository therefore convert all Entity Framework objects into business objects before exposing them to the application? Should such objects implement an interface or have a common base type?
The repository interface should deal only with business/domain entities; that is, the repository sends and receives only objects known by the app, objects that aren't related to the underlying persistence implementation.
EF or NHibernate entities model the persistence data, NOT the domain. So IRepository should not return an object that is an implementation detail of the ORM, but an object that can be used directly by the app (either a domain entity or a simplified view model, depending on the operation).
In the repository implementation, you deal with ORM entities, which get mapped to the corresponding app entities (usually with a mapper such as AutoMapper). Long story short: when designing IRepository, forget all about its implementation. That's why it's better to design the interface before deciding if/what ORM will be used.
Basically, the repository is the gateway between the app's domain context and the persistence context, and the app SHOULD NOT be coupled to the implementation details of the repository.
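Sketching the idea with illustrative names (EF Core types and AutoMapper's IMapper are used here as an assumption; the interface is designed first, and the ORM stays an implementation detail):

using AutoMapper;
using Microsoft.EntityFrameworkCore;

public class CustomerEntity { public int Id { get; set; } public string Name { get; set; } }
public class Customer { public int Id { get; set; } public string Name { get; set; } }

public class MyDbContext : DbContext
{
    public DbSet<CustomerEntity> Customers => Set<CustomerEntity>();
}

// Domain vocabulary only - no ORM types anywhere.
public interface ICustomerRepository
{
    Customer GetById(int id);
    void Save(Customer customer);
}

// The EF-specific implementation hides the ORM entity and the mapping.
public class EfCustomerRepository : ICustomerRepository
{
    private readonly MyDbContext _db;
    private readonly IMapper _mapper; // e.g. AutoMapper

    public EfCustomerRepository(MyDbContext db, IMapper mapper)
    {
        _db = db;
        _mapper = mapper;
    }

    public Customer GetById(int id) =>
        _mapper.Map<Customer>(_db.Customers.Find(id));

    public void Save(Customer customer)
    {
        _db.Customers.Update(_mapper.Map<CustomerEntity>(customer)); // EF Core's Update
        _db.SaveChanges();
    }
}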
You should look at using one of the POCO templates for generating your entities. That way your entities have no special dependencies on Entity Framework and can be passed freely between layers. It saves a whole lot of effort compared to maintaining a completely separate domain model and mapping between the two (unless your domain model would be significantly different from your entity model, in which case the separate model would make more sense).
If you are using POCO entities, you can assume that any provider will do a similar job. Also, remember that you are returning entities whose properties are mapped to the database. So unless entities have different property names per provider (I cannot find a logical explanation for having different names), you can return them from the repository directly to the business layer.

Why EF Entities are making a dependency and the need of POCO

I'm pretty new to IoC, Dependency Injection and Unit Testing. I'm starting a new pet project and I'm trying to do it right.
I was planning to use the Repository pattern to mediate with the data. The objects that I was going to return from the repositories were going to be objects collected from a Linq to entities data context (EF4).
I'm reading in "Dependency Injection" by Mark Seemann that doing so creates an important dependency and will definitely complicate testing (which is why he uses POCO objects in a library project).
I'm not understanding why. Although the objects are created by a Linq to Entities context, I can create them simply by calling the constructor, as if they were normal objects. So I assume it is possible to create fake repositories that deliver these objects to the caller.
I'm also concerned about the automatic generation of POCO classes, which is not very easy.
Can somebody shed some light? Are POCO objects truly necessary for a decoupled and testable project?
EDIT: Thanks to Yuck I understand that it's better to avoid autogeneration with templates, which brings me to a design question. If I come from a big legacy database whose tables take on a variety of responsibilities (which doesn't fit well with the concept of a class with a single responsibility), what's the best way to deal with that?
Deleting the database is not an option ;-)
No, they're not necessary; they just make things easier and cleaner.
The POCO library won't have any knowledge that it's being used by Entity Framework. This allows it to be used in other ways - in place of a view model, for instance. It also allows you to use the same project on both sides of a WCF service which eliminates the need to create data transfer objects (DTO).
Just two examples from personal experience, but there are surely more. In general, the less a particular object or piece of code knows about who is using it or how it's being used, the more adaptable and generic it will be for other situations.
You also mention automatic generation of POCO classes. I don't recommend doing this. Were you planning to generate the class definitions from your database structure?
I was planning to use the Repository pattern to mediate with the data. The objects that I was going to return from the repositories were going to be objects collected from a Linq to entities data context (EF4).
The default classes (not the POCOs) EF generates contain proxies for lazy loading and are tied at the hip to Entity Framework. That means any other class that wants to use those classes will have to reference the required EF assemblies.
This is the dependency Mark Seemann is talking about. Since you are now dependent on these non-abstract types, which in turn are dependent on EF, you cannot simply change the implementation of your repository to something different (i.e. just using your own persistence store) without addressing this change in the classes that depend on these types.
If you are truly only interested in the public properties of the EF-generated types, then you can have the partial classes generated by EF implement a base interface. Put all the properties you need in that base interface and pass the dependency around as the base interface - now you only depend on the base interface and not on EF anymore.
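A sketch of that technique (ICustomer and the property set are illustrative; the first partial stands in for the EF-generated half):

// Stand-in for the EF-generated half of the partial class.
public partial class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hand-written interface containing only the properties you need.
public interface ICustomer
{
    int Id { get; set; }
    string Name { get; set; }
}

// The hand-written half merely declares the interface; the generated
// half already supplies the matching property signatures.
public partial class Customer : ICustomer { }

// Consumers depend on the interface, not on the EF type.
public class InvoicePrinter
{
    public string Header(ICustomer customer) => $"Invoice for {customer.Name}";
}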

Rich domain model with ORM

I seem to be missing something and extensive use of google didn't help to improve my understanding...
Here is my problem:
I like to create my domain model in a persistence-ignorant manner. For example:
I don't want to add virtual if I don't need it otherwise.
I don't like to add a default constructor, because I like my objects to always be fully constructed. Furthermore, the need for a default constructor is problematic in the context of dependency injection.
I don't want to use overly complicated mappings, because my domain model uses interfaces or other constructs not readily supported by the ORM.
One solution to this would be to have separate domain objects and data entities. Retrieval of the constructed domain objects could easily be solved using the repository pattern and building the domain object from the data entity returned by the ORM. Using AutoMapper, this would be trivial and not too much code overhead.
But I have one big problem with this approach: It seems that I can't really support lazy loading without writing code for it myself. Additionally, I would have quite a lot of classes for the same "thing", especially in the extended context of WCF and UI:
Data entity (mapped to the ORM)
Domain model
WCF DTO
View model
So, my question is: What am I missing? How is this problem generally solved?
UPDATE:
The answers so far suggest what I already feared: It looks like I have two options:
Make compromises on the domain model to match the prerequisites of the ORM and thus have a domain model the ORM leaks into
Create a lot of additional code
UPDATE:
In addition to the accepted answer, please see my answer for concrete information on how I solved those problems for me.
I would question that matching the prereqs of an ORM is necessarily "making compromises". However, some of these are fair points from the standpoint of a highly SOLID, loosely-coupled architecture.
An ORM framework exists for one sole reason: to take a domain model implemented by you and persist it into a similar DB structure, without you having to implement a large number of bug-prone, near-impossible-to-unit-test SQL strings or stored procedures. ORMs also easily implement concepts like lazy loading: hydrating an object at the last minute before it is needed, instead of you building a large object graph yourself.
If you want stored procs, or have them and need to use them (whether you want to or not), most ORMs are not the right tool for the job. If you have a very complex domain structure such that the ORM cannot map the relationship between a field and its data source, I would seriously question why you are using that domain and that data source. And if you want 100% POCO objects with no knowledge of the persistence mechanism behind them, then you will likely end up doing an end run around most of the power of an ORM, because if the domain doesn't have virtual members or child collections that can be replaced with proxies, then you are forced to eager-load the entire object graph (which may well be impossible if you have a massive interlinked object graph).
While ORMs do require the domain design to know something about the persistence mechanism, an ORM still results in much more SOLID designs, IMO. Without an ORM, these are your options:
Roll your own Repository that contains a method to produce and persist every type of "top-level" object in your domain (a "God Object" anti-pattern)
Create DAOs that each work on a different object type. These types require you to hard-code the get and set between ADO DataReaders and your objects; in the average case a mapping greatly simplifies the process. The DAOs also have to know about each other; to persist an Invoice you need the DAO for the Invoice, which needs a DAO for the InvoiceLine, Customer and GeneralLedger objects as well. And, there must be a common, abstracted transaction control mechanism built into all of this.
Set up an ActiveRecord pattern where objects persist themselves (and put even more knowledge about the persistence mechanism into your domain)
Overall, the second option is the most SOLID, but more often than not it turns into a beast-and-two-thirds to maintain, especially when dealing with a domain containing backreferences and circular references. For instance, for fast retrieval and/or traversal, an InvoiceLineDetail record (perhaps containing shipping notes or tax information) might refer directly to the Invoice as well as the InvoiceLine to which it belongs. That creates a 3-node circular reference that requires either an O(n^2) algorithm to detect that the object has been handled already, or hard-coded logic concerning a "cascade" behavior for the backreference. I've had to implement "graph walkers" before; trust me, you DO NOT WANT to do this if there is ANY other way of doing the job.
So, in conclusion, my opinion is that ORMs are the least of all evils given a sufficiently complex domain. They encapsulate much of what is not SOLID about persistence mechanisms, and reduce knowledge of the domain about its persistence to very high-level implementation details that break down to simple rules ("all domain objects must have all their public members marked virtual").
In short - it is not solved
(here goes additional useless characters to post my awesome answer)
All good points.
I don't have an answer (the comment just got too long when I decided to add something about stored procs), except to say that my philosophy seems to be identical to yours: I hand-code or code-generate.
Things like partial classes make this a lot easier than it used to be in the early .NET days. But ORMs (as a distinct "thing" as opposed to something that just gets done in getting to and from the database) still require a LOT of compromises and they are, frankly, too leaky of an abstraction for me. And I'm not big on having a lot of dupe classes because my designs tend to have a very long life and change a lot over the years (decades, even).
As far as the database side, stored procs are a necessity in my view. I know that ORMs support them, but the tendency is not to do so by most ORM users and that is a huge negative for me - because they talk about a best practice and then they couple to a table-based design even if it is created from a code-first model. Seems to me they should look at an object datastore if they don't want to use a relational database in a way which utilizes its strengths. I believe in Code AND Database first - i.e. model the database and the object model simultaneously back and forth and then work inwards from both ends. I'm going to lay it out right here:
If you let your developers code ORM against your tables, your app is going to have problems living for years. Tables need to change. More and more people are going to want to knock up against those entities, and now they are all using an ORM generated from tables. And you are going to want to refactor your tables over time. In addition, only stored procedures are going to give you any kind of usable role-based manageability without dealing with every table on a per-column GRANT basis - which is super-painful. If you program well in OO, you understand the benefits of controlled coupling. That's all stored procedures are - USE THEM so your database has a well-defined interface. Or don't use a relational database if you just want a "dumb" datastore.
Have you looked at the Entity Framework 4.1 Code First? IIRC, the domain objects are pure POCOs.
This is what we did on our latest project, and it worked out pretty well:
We used EF 4.1 with the virtual keyword on our business objects, and our own custom implementation of the T4 template, wrapping the ObjectContext behind an interface for repository-style data access.
We used AutoMapper to convert between business objects and DTOs, and AutoMapper again to convert between view models and DTOs.
You would think that view models, DTOs, and business objects are the same thing, and they might look the same, but there is a very clear separation of concerns:
view models are about the UI screen, DTOs are about the task you are accomplishing, and business objects are primarily concerned with the domain.
There are some compromises along the way, but if you want EF, the benefits outweigh the things you give up.
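As a sketch of that two-step mapping, with hypothetical Order* types and AutoMapper's Profile API:

using AutoMapper;

// Illustrative types for the three layers.
public class OrderBo { public int Id { get; set; } public decimal Total { get; set; } }
public class OrderDto { public int Id { get; set; } public decimal Total { get; set; } }
public class OrderViewModel { public int Id { get; set; } public decimal Total { get; set; } }

// One profile keeps both mapping boundaries in one place.
public class MappingProfiles : Profile
{
    public MappingProfiles()
    {
        CreateMap<OrderBo, OrderDto>();        // business object -> task-level DTO
        CreateMap<OrderDto, OrderViewModel>(); // DTO -> screen-level view model
    }
}

// Registered once, e.g.:
// var config = new MapperConfiguration(cfg => cfg.AddProfile<MappingProfiles>());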
Over a year later, I have solved these problems for me now.
Using NHibernate, I am able to map fairly complex Domain Models to reasonable database designs that wouldn't make a DBA cringe.
Sometimes it is necessary to create a new implementation of the IUserType interface so that NHibernate can correctly persist a custom type. Thanks to NHibernate's extensible nature, that is no big deal.
I found no way to avoid adding virtual to my properties without losing lazy loading. I still don't particularly like it, especially because of all the warnings from Code Analysis about virtual properties without derived classes overriding them, but out of pragmatism, I can now live with it.
For the default constructor I also found a solution I can live with. I add the constructors I need as public constructors and I add an obsolete protected constructor for NHibernate to use:
[Obsolete("This constructor exists because of NHibernate. Do not use.")]
protected DataExportForeignKey()
{
}
