When to separate certain entities into different repositories?


I generally try to keep all related entities in the same repository. The following entities are related to each other (the relationship is marked with indentation):
User
    UserPreference
So it makes sense for them to go into a user repository. However, users are often linked to many different entities. What would you do in the following example?
User
    UserPreference
    Order
Order
    Product
Order has a relationship with both product and user but you wouldn't put functionality for all 4 entities in the same repository. What do you do when you are dealing with the user entities and gathering order information? You may need extra information about the product and often ORMs will offer the ability of lazy loading. However if your product entity is in a separate repository to the user entity then surely this would cause a conflict between repositories?

In the Eric Evans Domain-Driven Design ( http://domaindrivendesign.org/index.htm ) sense of things, you should first think about your Aggregates. You then build your repositories around those.
There are many techniques for handling Aggregates that relate to each other. The one that I use most often is to only allow Aggregates to relate to each other through a read-only interface. One of the key thoughts behind Aggregates is that you can't change the state of underlying objects without going through the root. So if Product and User are root Aggregates in your model, then I can't update a Product if I got to it by going through User->Order->Product. I have to get the Product from the Product repository to edit it. (From a UI point of view you can make it look like you go User->Order->Product, but when you hit the Product edit screen you grab the entity from the Product Repository.)
When you are looking at a Product (in code) by going from User->Order->Product, you should be looking at a Product interface that does not have any way to change the underlying state of the Product (only gets, no sets, etc.).
Organize your Aggregates, and therefore your Repositories, by how you use them. I can see User and Product being their own Aggregates and having their own Repositories. I'm not sure from your description whether Order should belong to User or also stand alone.
Either way, use a read-only interface when Aggregates relate. When you have to cross over from one Aggregate to the other, go fetch it from its own Repository.
If your Repositories are caching, then when you load an Order (through a User) only load the Product Ids from the database. Then load the details from the Product Repository using the Product Id. You can optimize a bit by loading any other invariants on the Product as you load the Order.
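The read-only interface idea described above could be sketched in C# roughly as follows. The names IReadOnlyProduct, ChangePrice, and the in-memory ProductRepository are assumptions made for illustration; they do not come from the question.

```csharp
using System;
using System.Collections.Generic;

// Read-only view of a Product: getters only, so code that reaches a product
// through User -> Order -> Product has no members for changing its state.
public interface IReadOnlyProduct
{
    int Id { get; }
    string Name { get; }
    decimal Price { get; }
}

// The full aggregate root. Only ProductRepository hands out this type.
public class Product : IReadOnlyProduct
{
    public Product(int id, string name, decimal price)
    {
        Id = id;
        Name = name;
        Price = price;
    }

    public int Id { get; }
    public string Name { get; }
    public decimal Price { get; private set; }

    public void ChangePrice(decimal newPrice)
    {
        if (newPrice < 0) throw new ArgumentOutOfRangeException(nameof(newPrice));
        Price = newPrice;
    }
}

// Order exposes its products only through the read-only interface.
public class Order
{
    private readonly List<IReadOnlyProduct> _products = new List<IReadOnlyProduct>();

    public IReadOnlyList<IReadOnlyProduct> Products => _products;

    public void Add(IReadOnlyProduct product) => _products.Add(product);
}

// Hypothetical in-memory repository standing in for real persistence.
public class ProductRepository
{
    private readonly Dictionary<int, Product> _store = new Dictionary<int, Product>();

    public void Save(Product product) => _store[product.Id] = product;

    public Product GetById(int id) => _store[id];
}
```

To edit a product found through an order, the caller takes the Id from the read-only view and asks ProductRepository for the editable root. A caller could still downcast the interface to Product, so this is a convention enforced by the type signatures rather than a hard guarantee.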

By repository you mean class?
Depending on the use of the objects (repositories) you could make a view that combines the data on the database and create a class (repository) with your ORM to represent that view. This design would work when you want to display lighter weight objects with only a couple columns from each of the tables.

If SQL Server is your database, and by repository you mean a database, then I would just stick the information in whatever database makes sense and have a view in dependent databases that selects out of the other database via three-dot notation.

I'm still confused by what you mean by "repository." I would make all of the things you talked about separate classes (and therefore separate files) and they'd all reside in the same project.

Related

Exposing entity CRUD for DI without breaking DDD Aggregate principles

I have a repository for an aggregate: Order and OrderRepository. Order has Products, Customer, etc. I'm using a micro ORM called Dapper and .NET Core.
Here is my issue: when I need to save, I don't see how I can avoid breaking some DDD principle. I would like to have repositories for child entities of the aggregate, i.e. ProductRepository and CustomerRepository, and when I save the order it would use those repositories to save the child entities, but I understand that you can only have one repo per aggregate. I decided to just make a class called ProductPersistor that would be internal to my infrastructure class library and called by the OrderRepository; however, then I can't use DI, as it's configured with .NET Core's DI framework in a different project. Furthermore, it's still accessible by the classes in that class library. I can add all the insert and update logic for all child entities into OrderRepository, but that would be a gross SRP issue, and it still can't be injected using a DI container.
With regards to queries, the same issue stands, although with Dapper I can write a massive SQL JOIN and split it into different entities. That's not very efficient or flexible, though.
I feel like I'm missing something. Can someone help?
Edit: As the comments below pointed out, Product and Customer can be their own Aggregate root. So let's replace those with Order and OrderLineItem.

Product and Customer seem like aggregates on their own, so they will need to have their own repositories ProductRepository and CustomerRepository. They should not be child entities of Order.
Your Order aggregate would be linked to Product and Customer, and the best way to do it between aggregates is to link on their unique identifiers.
A Repository is not the same thing as the underlying table - at least that is what DDD recommends. The Repository pattern sits between the aggregate data structure and the table/document structure, and represents the domain side of things. Its methods usually represent valid domain concepts: GetCompletedOrders(), GetTotalTaxAmount(), and so on.
An Application Service is supposed to handle the task of loading/persisting aggregates with the help of repositories, and that's the place where you would handle multiple repositories required for a process. This is where you can query other aggregates and get their identifiers, if necessary.
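As a rough sketch of the two points above (aggregates linked by identifier, and an Application Service coordinating repositories), something like the following could work. All type names here, including PlaceOrderService and the in-memory repositories, are hypothetical, and Guid identifiers are an assumption.

```csharp
using System;
using System.Collections.Generic;

public class OrderLineItem
{
    public OrderLineItem(Guid productId, int quantity)
    {
        ProductId = productId;
        Quantity = quantity;
    }

    // Reference to the Product aggregate by identifier only.
    public Guid ProductId { get; }
    public int Quantity { get; }
}

public class Order
{
    private readonly List<OrderLineItem> _lines = new List<OrderLineItem>();

    public Order(Guid id, Guid customerId)
    {
        Id = id;
        CustomerId = customerId;
    }

    public Guid Id { get; }
    // Reference to the Customer aggregate by identifier only.
    public Guid CustomerId { get; }
    public IReadOnlyList<OrderLineItem> Lines => _lines;

    public void AddLine(Guid productId, int quantity)
    {
        if (quantity <= 0) throw new ArgumentOutOfRangeException(nameof(quantity));
        _lines.Add(new OrderLineItem(productId, quantity));
    }
}

public interface IOrderRepository
{
    void Save(Order order);
    Order GetById(Guid id);
}

public interface IProductRepository
{
    bool Exists(Guid productId);
}

// In-memory stand-ins so the sketch runs without a database.
public class InMemoryOrderRepository : IOrderRepository
{
    private readonly Dictionary<Guid, Order> _store = new Dictionary<Guid, Order>();
    public void Save(Order order) => _store[order.Id] = order;
    public Order GetById(Guid id) => _store[id];
}

public class InMemoryProductRepository : IProductRepository
{
    private readonly HashSet<Guid> _ids = new HashSet<Guid>();
    public void Add(Guid productId) => _ids.Add(productId);
    public bool Exists(Guid productId) => _ids.Contains(productId);
}

// Application service: reads from the product repository, but only the
// Order aggregate is modified and saved.
public class PlaceOrderService
{
    private readonly IOrderRepository _orders;
    private readonly IProductRepository _products;

    public PlaceOrderService(IOrderRepository orders, IProductRepository products)
    {
        _orders = orders;
        _products = products;
    }

    public Order PlaceOrder(Guid customerId, Guid productId, int quantity)
    {
        if (!_products.Exists(productId))
            throw new InvalidOperationException("Unknown product.");

        var order = new Order(Guid.NewGuid(), customerId);
        order.AddLine(productId, quantity);
        _orders.Save(order);
        return order;
    }
}
```

Both repositories are plain interfaces, so they can be registered with a DI container and injected into the service in the usual way.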
But I wouldn't recommend using multiple repositories as part of a single transaction in an Application Service. It violates the guideline that each business transaction should touch at most one aggregate. All other aggregates should be updated via Domain Events - they should become eventually consistent. Good for scaling and performance, this way. In your case, you may update Order aggregate, and bubble domain events to update Product and Customer, if necessary.
If you have complex queries and expect to run into performance problems (which I think you will as your system scales between Order, Product, and Customer), I would suggest you explore CQRS as an option. You don't have to implement it fully, but have readily available "Read" models in the background, with data already constructed in the format you want to consume. Background workers populate these "Read" models on a near real-time basis.
Please ask if this explanation does not cover all your questions or doubts.

Using a POCO class as a domain class, but with custom link getters and setters?

I recently learned that it's a good idea (when working with Entity Framework) to design it code-first, with POCO classes that are also mixed with original domain class logic.
So I decided to use this new idea. Before (as an example) I had a POCO class called DatabasePerson and a domain class called Person. I am now trying to merge these into one, so that I can let Entity Framework manage my repository better, and manage changes to the domain layer easier.
Now, in my DatabasePerson POCO class, I have a link to a DatabaseAccount POCO class. Similarly, my Person domain class has a link to a Account domain class.
In Entity Framework, to allow for lazy loading of these types of links, I declare the link properties virtual (as in the DatabasePerson class), like so:
public virtual DatabaseAccount Account { get; set; }
However, what if I want to change how the account is set or get, and how exceptions are handled when attempting to set it to null? How can I make sure that this won't conflict with anything that Entity Framework adds to the table?
Here's my domain class' link:
public Account Account {
    get {
        //maybe do some other stuff here.
        return account;
    }
    set {
        if (value == null) throw new ArgumentNullException("value");
        account = value;
        //maybe do some other stuff here.
    }
}
I want to somehow keep this form of customizability, but also have lazy loading. Is that possible?

Lazy loading conflicts with the concepts of Domain Driven Design, especially that of Aggregates. Retrieving and updating an aggregate from a repository should be a single operation. The aggregate needs to be complete as per the specification in the ubiquitous language. Introducing lazy loading breaks this rule as the aggregate isn't complete (or perhaps this is an indication that you haven't defined your aggregates properly).
In addition to the above, you are also violating one of the core principles in DDD; you are designing your domain with a strong influence of technical concerns (Entity Framework, databases, lazy loading, etc). Introducing these infrastructural 'leaks' will constrain the way in which you make design decisions. Entities and Value Objects form the absolute core of your domain. They are real-world objects that interact with each other. Persistence ignorance is key to designing a good domain model.
I will give you a short example of an aggregate, but you need to do more reading if you wish to better grasp this concept.
Let's say that you have decided that the Order entity is the root of an aggregate which includes Order and OrderLine (an Order can have 1 or many OrderLine entities). This decision can be based off many reasons, some being:
An OrderLine has no need to be retrieved or referenced independently from its Order
An Order should be responsible for changes to its OrderLine 'collection'
An Order and its OrderLines should be transactionally consistent
When fetching Order from a repository, this aggregate will be formed in a single unit of work. All OrderLines will be fetched with their Order and returned. When saving or updating, the aggregate is also persisted in a single unit of work. This ensures that all of the entities (and their relationships) remain consistent and that no 'business rules' are violated.
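A minimal sketch of such an Order aggregate, assuming a hypothetical Sku, Quantity, and UnitPrice on each OrderLine (none of these member names come from the question). The root is the only place where lines are created or changed, so it can keep the whole aggregate consistent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class OrderLine
{
    // Internal constructor: lines are meant to be created through Order.
    internal OrderLine(string sku, int quantity, decimal unitPrice)
    {
        Sku = sku;
        Quantity = quantity;
        UnitPrice = unitPrice;
    }

    public string Sku { get; }
    public int Quantity { get; internal set; }
    public decimal UnitPrice { get; }
    public decimal LineTotal => Quantity * UnitPrice;
}

public class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    // Callers can read the lines but cannot add to or remove from the list.
    public IReadOnlyCollection<OrderLine> Lines => _lines.AsReadOnly();

    public decimal Total => _lines.Sum(l => l.LineTotal);

    // All changes to the line collection go through the aggregate root.
    public void AddLine(string sku, int quantity, decimal unitPrice)
    {
        if (quantity <= 0) throw new ArgumentOutOfRangeException(nameof(quantity));

        var existing = _lines.Find(l => l.Sku == sku);
        if (existing != null)
            existing.Quantity += quantity;   // merge into the existing line
        else
            _lines.Add(new OrderLine(sku, quantity, unitPrice));
    }
}
```

A repository for this aggregate would load and save an Order together with all of its lines in one unit of work, which is why no separate OrderLine repository appears here.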
In your case, a Person and their Accounts should most likely not belong in a single aggregate. I assume that you would need to access a person's accounts without needing to retrieve the person itself (perhaps using identity). I assume that you will want to refer to a specific account from outside of the aggregate (only aggregate roots can be referenced from outside of an aggregate). I also assume that an Account can change independently of Person. Perhaps another reason why you wouldn't want it to belong in the Person aggregate is performance (yes, sometimes we have to be pragmatic and not purist!). All of the above depends entirely on your requirements.
Personally I believe in separating your data entities (generally a direct mapping from your database using Entity Framework or some other persistence tool) and your domain entities/value objects. This lets you design your domain in complete isolation of database-related structures, frameworks and constraints.

EntityFramework: To Slice or Not To Slice?

So I'm just getting started with Entity Framework. I'm working with a very large, existing database. I find myself wanting to use EF to create models that are "slices" of the whole database. These slices correspond to 1 aspect of the application. Is that the right way to look at it, or should I try to model the whole database in 1 EDMX?
Let me give you a fictional example:
Suppose that 1 of the many things that this database contains is customer billing information. I feel like I want to create an EF model that just focuses on the tables that the Customer Billing module needs to interact with. (so then that model would NOT be used for other modules in the app, rather, those same tables might appear in other small EF models). This would allow me to leverage EF's conceptual model features (inheritance, etc) to build a view that is correct for Customer Billing, without worrying about that model's effects, on say Customer Support (even though the 2 modules share some tables)
Does that sound right?

It sounds right to me. The point of an Entity Model, after all, is to provide a set of persistence-capable business objects at a level of abstraction that's appropriate to the required business logic.
You should absolutely create entity models that support modules of the application, not models that copy the underlying database schema. As the link above describes, separating logic from persistence is one of the primary purposes of EF.

I would prefer to use a slice approach, for the following reasons:
If you have a massive database with loads of tables, then it would be difficult to manage a massive Entity Model.
It is easier to maintain application / domain specific entities. Entity Framework is not a table-to-entity mapping, so you can create custom entities and also combine and split tables across entities.

Application Design - Database Tables and Interfaces

I have a database with tables for each entity in the system. e.g. PersonTable has columns PersonId, Name, HomeStateId. There is also a table for 'reference data' (i.e. states, countries, all currencies, etc.) data that will be used to fill drop down list boxes. This reference table will also be used so that PersonTable's HomeStateId will be a foreign key to the reference table.
In the C# application we have interfaces and classes defined for the entity.
e.g. PersonImplementationClass : IPersonInterface. The reason for having the interfaces for each entity is because the actual entity class will store data differently depending on a 3rd party product that may change.
The question is, should the interface have properties for Name, HomeStateId, and HomeStateName (which will be retrieved from the reference table). OR should the interface not expose the structure of the database, i.e. NOT have HomeStateId, and just have Name, HomeStateName?

I'd say you're on the right track when thinking about property names!
Model your classes as you would in the real world.
Forget the database patterns and naming conventions of StateID and foreign keys in general. A person has a city, not a cityID.
It'll be up to your data layer to map and populate the properties of those objects at run time. You should have the freedom to express your intent and the representation of 'real world' objects in your code, and not be stuck to your DB implementation.
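As a sketch of that mapping step, the interface below exposes HomeStateName only, while a hypothetical PersonRow keeps the HomeStateId foreign key on the data side. PersonRow, PersonMapper, and the dictionary used as the reference-table lookup are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// What the rest of the application sees: no database identifiers.
public interface IPerson
{
    string Name { get; }
    string HomeStateName { get; }
}

public class Person : IPerson
{
    public Person(string name, string homeStateName)
    {
        Name = name;
        HomeStateName = homeStateName;
    }

    public string Name { get; }
    public string HomeStateName { get; }
}

// Shape of a PersonTable row as the data layer reads it.
public class PersonRow
{
    public int PersonId { get; set; }
    public string Name { get; set; }
    public int HomeStateId { get; set; }
}

public static class PersonMapper
{
    // Resolves the foreign key against the reference data and builds the
    // domain-facing object.
    public static IPerson ToDomain(PersonRow row, IReadOnlyDictionary<int, string> statesById)
    {
        if (!statesById.TryGetValue(row.HomeStateId, out var stateName))
            throw new KeyNotFoundException("Unknown state id " + row.HomeStateId);

        return new Person(row.Name, stateName);
    }
}
```

If the storage changes with the 3rd party product, only PersonRow and PersonMapper would need to change; IPerson stays the same.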

Either way is acceptable, but they both have their pros and cons.
The first way (entities have IDs) is analogous to the ActiveRecord pattern, where your entities are thin wrappers over the database structure. This is often a flexible and fast way of structuring your data layer, because your entities have freedom to work directly with the database to accomplish domain operations. The drawback is that when the data model changes, your app is likely to need maintenance.
The second way (entities reflect more of a real-world structure) is more analogous to a heavier ORM like Entity Framework or Hibernate. In this type of data access layer, your entity management framework would take care of automatically mapping the entities back and forth into the database. This more cleanly separates the application from the data, but can be a lot more plumbing to deal with.
This is a big choice, and shouldn't be taken lightly. It really depends on your project's requirements and size, and on who will be consuming it.

It may help to separate the design a little bit.
For each entity, use two classes:
One that deals with database operations on the entity (where you would put IDs)
One that is a simple data object (where you would have standard fields that actually mean something)
As @womp mentioned, if your entity persistence is only going to be to databases, strongly consider the use of an ORM so you don't end up rolling your own.

Caching calculated values (Sums/Totals) in the Database

Consider the following object model (->> indicates collection):
Customer->Orders
Orders->>OrderLineItems->Product{Price}
The app is focused on processing orders, so most of the time tables showing all the orders that match certain criteria are used in the UI. 99% of the time I am only interested in displaying the Sum of LineTotals, not the individual LineTotals.
Thinking about it further, there also might be multiple payments (wire transfers, cheque, credit card etc.) associated with each order. Again, I'm only interested in the sum of money that I received.
When querying the database for an order, I don't want to select all orders and then, for each order, its payments and LineItems.
My idea was to associate each order with a "status" object, caching all the sums and the status of an order, improving query performance by orders of magnitude and also supporting query scenarios for unpaid orders, paid orders, orders due etc.
This prevents domain logic (e.g. when an order is considered to be paid) from leaking into database queries. However, it puts the responsibility for keeping the sums up to date on the application. The system usually has well defined points where that needs to happen, e.g. entering or integrating payments, creating/modifying an order.
So far I have used Observable Collections that trigger recalculations of Status when items are added or removed, or certain properties on the items are updated. I ask myself where the logic for all that should be put from a DDD perspective. It seems strange to me to force all the event wiring and calculation logic into the aggregate root.

You need to express the intent of a request in an intention-revealing interface, so that your repositories can understand what exactly you want to do and react accordingly. In this case the interface reveals intent, not to other developers, but to other code. So if you want a status or total, create an interface that reveals this intent and request an object of that type from your repository. The repository can then create and return a domain object which encapsulates doing exactly the work required to calculate the total and no more than that.
In addition, your DAL can intelligently choose which fetching strategy to apply from the interface you request, i.e. lazy loading for situations where you don't need to access child objects and eager loading where you do.
Udi Dahan has some great blog posts about this. He has written and talked on applying intention-revealing interfaces to this problem, which he calls making roles explicit.

I highly recommend looking into OR (object relational) mappers that support LINQ. The two primary ones are LINQ to SQL and Entity Framework, both from Microsoft. I believe LLBLGen also supports LINQ now, and NHibernate has a few half-baked LINQ solutions you could try. My prime recommendation is Entity Framework v4.0, which is available through .NET 4.0 betas or the Visual Studio 2010 Beta.
With a LINQ enabled OR mapper, you can easily query for the aggregate information you need dynamically, real-time, using only your domain model. There is no need for business logic to leak into your data layer, because you generally will not use stored procedures. OR mappers generate parameterized SQL for you on the fly. LINQ combined with OR mappers is an extremely powerful tool that allows you to not only query for and retrieve entities and entity graphs, but also query for data projections on your domain model...allowing the retrieval of custom data sets, aggregations, etc. via a single conceptual model.
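The kind of projection described above could look roughly like this. The sketch runs against in-memory collections (LINQ to Objects) so that it is self-contained; with a LINQ-enabled OR mapper the intent is that a query of the same shape is written against the mapper's queryable set and translated to SQL. The OrderSummary type, the UnitPrice property, and the "paid when payments cover the total" rule are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class OrderLineItem
{
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
}

public class Payment
{
    public decimal Amount { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public List<OrderLineItem> LineItems { get; } = new List<OrderLineItem>();
    public List<Payment> Payments { get; } = new List<Payment>();
}

// Projection holding only the sums the UI needs.
public class OrderSummary
{
    public int OrderId { get; set; }
    public decimal OrderTotal { get; set; }
    public decimal AmountPaid { get; set; }

    // Assumed rule: an order counts as paid once payments cover the total.
    public bool IsPaid => AmountPaid >= OrderTotal;
}

public static class OrderQueries
{
    // Projects each order to its totals without exposing lines or payments.
    public static List<OrderSummary> Summarize(IEnumerable<Order> orders)
    {
        return orders
            .Select(o => new OrderSummary
            {
                OrderId = o.Id,
                OrderTotal = o.LineItems.Sum(l => l.Quantity * l.UnitPrice),
                AmountPaid = o.Payments.Sum(p => p.Amount)
            })
            .ToList();
    }
}
```

Filtering for unpaid orders is then a matter of adding a Where clause on the summaries, for example `summaries.Where(s => !s.IsPaid)`.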

"It seems strange to me to force all the event wiring and calculation logic in the aggregate root."
That is usually a call for a «Service».
