Application Design - Database Tables and Interfaces - C#

I have a database with tables for each entity in the system. e.g. PersonTable has columns PersonId, Name, HomeStateId. There is also a table for 'reference data' (i.e. states, countries, currencies, etc.) that will be used to fill drop-down list boxes. This reference table will also be used so that PersonTable's HomeStateId is a foreign key to the reference table.
In the C# application we have interfaces and classes defined for the entity.
e.g. PersonImplementationClass : IPersonInterface. The reason for having interfaces for each entity is that the actual entity class will store data differently depending on a 3rd-party product that may change.
The question is: should the interface have properties for Name, HomeStateId, and HomeStateName (which will be retrieved from the reference table)? Or should the interface not expose the structure of the database, i.e. NOT have HomeStateId, and just have Name and HomeStateName?

I'd say you're on the right track when thinking about property names!
Model your classes as you would in the real world.
Forget the database patterns and naming conventions of StateID and foreign keys in general. A person has a city, not a cityID.
It'll be up to your data layer to map and populate the properties of those objects at run time. You should have the freedom to express your intent and the representation of 'real world' objects in your code, and not be stuck to your DB implementation.
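A minimal sketch of that idea, with hypothetical names (IPerson, PersonRepository): the interface speaks in real-world terms, and the data layer resolves HomeStateId against the reference table when it builds the object. Here the reference table is simulated with an in-memory dictionary just to keep the sketch self-contained:

using System.Collections.Generic;

// Hypothetical interface: real-world terms only, no foreign keys.
public interface IPerson
{
    string Name { get; }
    string HomeState { get; }
}

public class Person : IPerson
{
    public Person(string name, string homeState)
    {
        Name = name;
        HomeState = homeState;
    }

    public string Name { get; }
    public string HomeState { get; }
}

public class PersonRepository
{
    // Stand-in for the reference table: HomeStateId -> state name.
    private readonly IDictionary<int, string> _states;

    public PersonRepository(IDictionary<int, string> states)
    {
        _states = states;
    }

    // In real code this would query PersonTable joined to the
    // reference table; here the lookup is simulated in memory.
    public IPerson Load(string name, int homeStateId)
    {
        return new Person(name, _states[homeStateId]);
    }
}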

Either way is acceptable, but they both have their pros and cons.
The first way (entities have IDs) is analogous to the ActiveRecord pattern, where your entities are thin wrappers over the database structure. This is often a flexible and fast way of structuring your data layer, because your entities have the freedom to work directly with the database to accomplish domain operations. The drawback is that when the data model changes, your app is likely to need maintenance.
The second way (entities reflect more of a real-world structure) is more analogous to a heavier ORM like Entity Framework or Hibernate. In this type of data access layer, your entity management framework would take care of automatically mapping the entities back and forth into the database. This more cleanly separates the application from the data, but can be a lot more plumbing to deal with.
This is a big choice, and shouldn't be taken lightly. It really depends on your project's requirements, its size, and who will be consuming it.
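To make the trade-off concrete, here is a sketch of the two interface shapes being compared (the names are illustrative):

// Option 1: the interface mirrors the table (ActiveRecord-leaning).
// Flexible and fast, but schema changes ripple into the app.
public interface IPersonRecord
{
    string Name { get; set; }
    int HomeStateId { get; set; }       // raw foreign key exposed
    string HomeStateName { get; set; }
}

// Option 2: the interface hides the schema (mapped/ORM-leaning).
// The Id stays in the data layer; the mapper fills in the name.
public interface IPersonEntity
{
    string Name { get; set; }
    string HomeStateName { get; set; }
}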

It may help to separate the design a little bit.
For each entity, use two classes (a sketch follows the list):
One that deals with database operations on the entity (where you would put IDs)
One that is a simple data object (where you would have standard fields that actually mean something)
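A minimal sketch of that split, with hypothetical names:

// The simple data object: fields that actually mean something.
public class PersonData
{
    public string Name { get; set; }
    public string HomeStateName { get; set; }
}

// The persistence-side class: knows about IDs and tables, and owns
// the translation so the rest of the app never sees HomeStateId.
public class PersonRecord
{
    public int PersonId { get; set; }
    public string Name { get; set; }
    public int HomeStateId { get; set; }

    public PersonData ToData(string homeStateName) =>
        new PersonData { Name = Name, HomeStateName = homeStateName };
}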
As @womp mentioned, if your entity persistence is only ever going to be to databases, strongly consider the use of an ORM so you don't end up rolling your own.

Related

DDD and references between aggregates in EFCore and C#

I have an issue that I am not sure how to solve when DDD is assumed and using C#/EF Core.
Simplified situation: we have two aggregates, Item and Warehouse. Each has an ExternalId (Guid) that identifies it to the outside (the front end, etc.) and is also treated as its domain identity. Each also has a database Id that identifies it inside the database model. The entity model and the DB model are the same class, since EF Core allows the use of private fields; only the ExternalId and required fields are exposed. The entities (in both the DDD and EF Core sense) contain quite a lot of business logic and methods strictly coupled to the object. In general I follow the pattern from the eShop/eShopOnContainers example.
Item is assigned to Warehouse and when creating an item we need to pass Warehouse to its constructor.
Is it proper to pass the full Warehouse object to Item's constructor (and also to other methods that Item defines):
public Item(Warehouse warehouse,..)
or should I rely on the database Id only:
public Item(long warehouseId,..)
I have an issue with this because, on the one hand, I read that aggregates should not reference other aggregates, but on the other hand using the database Id leaks an implementation detail (object persistence in a relational DB) into the domain model, which in my opinion should not happen.
Using ExternalId:
public Item(Guid warehouseId,..)
does not solve the problem, as the actual relations in the DB are not based on it.
What is your opinion? I am a bit puzzled.
Usually you would create a Value Object for the Id of the Aggregate Root. Relying on an Id generated by the database is one possibility; if you decide to let the DB generate the Id, then you will need to work with that.
But why would you need to pass the Warehouse reference or Id anyways? It looks like Item is an Entity and Warehouse is the Aggregate Root that should contain that Entity. In general you should not create an Entity outside of the Aggregate Root.
Edit: There are several identity creation strategies, as Vaughn Vernon describes in the red book. One of them is to let the persistence mechanism, such as a SQL DB, generate the unique identifier of an entity or aggregate.
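For illustration, a minimal sketch of such an identity Value Object, here wrapping the Guid ExternalId from the question (the names are hypothetical, and the record struct syntax needs C# 10):

using System;

// Identity as a Value Object wrapping the external Guid.
public readonly record struct WarehouseId(Guid Value)
{
    public static WarehouseId New() => new WarehouseId(Guid.NewGuid());
}

// Item can then depend on the identity, not the Warehouse aggregate:
public class Item
{
    public Item(WarehouseId warehouseId /*, ... */)
    {
        WarehouseId = warehouseId;
    }

    public WarehouseId WarehouseId { get; }
}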
Your domain model created during analysis is often different from the one created during design. Semantically they are both the same, you are passing references, but the design model recognises that you have to persist the data so you might not want to pre-load all referenced objects for performance reasons, whether that is simply loading it from disk within the same domain, or from a remote service in another domain.

Should I use inheritance in Entity Framework or is there a better approach?

I have various objects that I would like to track in an application. The objects are computers, cameras, switches, routers etc. I want the various objects to inherit from an object called Device since they will all have some properties in common (e.g. IP address, MAC address). I like to create the objects using the designer (Model First) but I do not like the difficulty in updating the database from the model. Basically, I do not like having to drop the database and recreate it, especially once I start populating the database. The other approach I experimented with was creating the database using SSMS in SQL Server, but then when I create the POCOs from the database the entities do not inherit from each other. What is a good approach for my situation?
I want the various objects to inherit from an object called Device since they will all have some properties in common (e.g. IP address, MAC address)
You are essentially talking about which inheritance pattern you are going to use in EF; or how the model maps to your database tables. There are 3 main types of inheritance patterns in EF (see Inheritance Mapping: A Walkthrough Guide for Beginners):
Table-per-Hierarchy
Table-per-Type
Table-per-Concrete Type
Each has pros and cons (such as performance). But you should also consider that this model relates to the database, and in larger projects you might then create a second layer to work with for business logic; DDD talks about persistence models and domain models. Again, your choice here is a trade-off between initial speed of development and scalability and performance later on.
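As a rough illustration of the first pattern, here is what the Device hierarchy looks like in Code First; by convention EF maps this as Table-per-Hierarchy (one Devices table plus a discriminator column) unless configured otherwise. The property names are invented:

using System.Data.Entity;

public abstract class Device
{
    public int Id { get; set; }
    public string IpAddress { get; set; }
    public string MacAddress { get; set; }
}

public class Camera : Device
{
    public string Resolution { get; set; }
}

public class Router : Device
{
    public int PortCount { get; set; }
}

public class DeviceContext : DbContext
{
    // One set for the whole hierarchy; EF adds the discriminator.
    public DbSet<Device> Devices { get; set; }
}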
I like to create the objects using the designer (Model First) but I do not like the difficulty in updating the database from the model.
There are 4, and only 4, development strategies for EF (see Entity Framework Development Workflows):
Model First
Code First (new database)
Database First
Code First (existing database)
I do not like to have to drop the database and recreate it, especially once I start populating the database
Code First is really very, very good at this:
Seeding in Code First allows you to populate databases with test or live data depending on where you are deploying to.
Migrations allow you to do non-destructive updates to the database, and migrate data in a fully testable, utterly reliable fashion for live deployment.
Doing this with Model First is, unfortunately, just harder. The only real solution I know is to generate a new database, and use a SQL compare (with data compare) tool to generate the data in the new database.
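For example, a minimal sketch of Code First seeding with an EF6-style migrations configuration (DeviceContext and Camera reuse the hypothetical model sketched earlier):

using System.Data.Entity.Migrations;

internal sealed class Configuration : DbMigrationsConfiguration<DeviceContext>
{
    public Configuration()
    {
        AutomaticMigrationsEnabled = false;
    }

    protected override void Seed(DeviceContext context)
    {
        // AddOrUpdate keys on a natural identifier so reseeding is
        // idempotent instead of inserting duplicates on every run.
        context.Devices.AddOrUpdate(
            d => d.MacAddress,
            new Camera { MacAddress = "00:11:22:33:44:55", IpAddress = "10.0.0.5" });
    }
}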
Choosing a pattern and a strategy
Each strategy has pros and cons, and each inheritance pattern is better used with particular development strategies. The trade-offs are really your own to judge; for example, you might have to use Database First if you have an existing database you inherited, or you may be happier using the EF designer and so would use Model First.
Model First (by that I mean using the EF designer to define your model) uses the TPT strategy by default. See EF Designer TPT Inheritance. If you want TPH, then you can use Model First (see EF Designer TPH Inheritance), but you have extra work to do; Code First is a better fit for TPH. TPC is even harder using Model First, and Code First is really the best (only viable) option for that in EF 5.
when I create the POCOs from the database the entities do not inherit from each other
It is good to remember that the model deals with classes; the database deals with storage in tables. When generating a model from your database it is hard for EF to work out what the TPH or TPC inheritance should be. All it can do is create a "best guess" at your model based on the table associations. You have to help it out after the model is generated by renaming properties, changing associations or applying inheritance. There really is no other way to do this. Updates to the database may also therefore require more work on the model.
Your best approach
That is, unfortunately, down to opinion. However if your primary requirements are:
You want TPH or TPC (or mixed strategies)
You don't want to drop your database when you issue updates to the model
then the best match for these technical requirements is Code First development, with migrations and seeding.
The downside of Code First is having to write your own POCOs and learn the data annotation attributes (a small example follows this list). However, keep in mind:
writing POCOs is not so different from writing a database table (and once you are used to it, it is just as quick)
Code First is a lot more usable with automated testing (e.g. with DI and/or IoC to test without the database involved), so can have benefits later on
If you are going to do a lot of EDMX manipulation with database first, or a lot of work whenever you drop and update your database using model first, then you are just putting in time and effort in other places instead of in writing POCOs
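For instance, a small example of the kind of POCO plus data annotations Code First expects (the attribute choices here are illustrative):

using System.ComponentModel.DataAnnotations;

public class Switch
{
    [Key]
    public int Id { get; set; }

    [Required, MaxLength(45)]   // room for an IPv6 address
    public string IpAddress { get; set; }

    [Required, MaxLength(17)]   // "AA:BB:CC:DD:EE:FF"
    public string MacAddress { get; set; }
}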

Domain model entities vs data entities, one or both in software architecture

Update 2
I have a project with a typical 3-layer structure (UI/domain/data layer). What are the pros and cons of having both domain model entities in the domain layer and data entities in the data layer?
The possibility of changing to a different database is slim. What are the pros and cons of having only data entities in the data layer, serving as the domain model entities? What is the difference if an ORM is used (is it good practice to have both kinds of entities when an ORM (NHibernate) is used)?
Please fire away with your ideas, links, articles, or books.
Update 3
In what circumstances should we use both domain entities and data entities?
Assuming your question is about DDD: in a typical DDD scenario, domain entities are 'hydrated' by the data layer (which is often thin because it uses an ORM). In order to hydrate domain entities, the data layer has to have intimate knowledge of the domain. If you use an ORM then you most likely do not need separate 'data entities'; the ORM knows how to reconstitute your domain objects. Hope this helps.
Using data entities alongside domain entities is a tricky thing to do and adds another unnecessary layer of abstraction without adding any value.
You should either use full-featured domain model mapped via ORM or 'anaemic' data model (also mapped via ORM). Which one depends on your background, requirements and personal preferences.
In the case of a data model, you probably map tables directly to entities (one-to-one) without any complex stuff like inheritance hierarchy mapping. That's fine. The tricky thing is mapping 1:n relationships. With a data model they tend to work well if you don't represent the 'many' side in the object model. Why? Because both relation ends will easily get out of sync if you don't add custom code to handle these cases.
In the case of a domain model, you probably use repositories to fetch your aggregate roots.
There is an exception to what I've written: it is legitimate to use both data entities and domain entities in a CQRS architecture.
You use data entities if your data schema does not map exactly onto your domain entities. For example, consider a telephone number. In your domain entity, it may be one single property whereas in the database it may consist of an area code field and a telephone number field.
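A minimal sketch of that telephone-number example (the type and property names are invented):

// Data entity: mirrors the two columns in the table.
public class CustomerRow
{
    public string AreaCode { get; set; }
    public string PhoneNumber { get; set; }
}

// Domain entity: one logical value.
public class Customer
{
    public string TelephoneNumber { get; set; }
}

public static class CustomerMapper
{
    public static Customer ToDomain(CustomerRow row) =>
        new Customer { TelephoneNumber = row.AreaCode + "-" + row.PhoneNumber };

    public static CustomerRow ToRow(Customer customer)
    {
        // Split "555-1234567" back into its two columns.
        var parts = customer.TelephoneNumber.Split(new[] { '-' }, 2);
        return new CustomerRow { AreaCode = parts[0], PhoneNumber = parts[1] };
    }
}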
Contrary to what some answers suggest, the data access layer DOES NOT hydrate your domain entities and DOES NOT have intimate knowledge of them. Instead, your domain layer asks your data access layer for data needed to reconstruct instances.
The domain model should exist in as many places as possible to maximize code reuse and understanding. The exception to this would be when using the domain model is prohibitively expensive in time, memory, or transportation costs.
Let's say you have a supplier of parts. This supplier supplies thousands of parts to you, so the one-to-many relationship in this case might be huge, especially considering the web of classes each part might bring along. However, you need a list of parts from a specific supplier for a specific widget. In this case, you create a value object with just the data you need.
This value object can be a copy of the supplier object, with only the parts you need, or it can be a completely new class representing only the data you need.
The typical use case for this might be displaying the data on a web page, where you're transferring the data via JSON.
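A sketch of that trimmed-down value object; all the types (Supplier, Part) and the projection are invented for illustration:

using System.Collections.Generic;
using System.Linq;

public class Part
{
    public int WidgetId { get; set; }
    public string PartNumber { get; set; }
}

public class Supplier
{
    public string Name { get; set; }
    public List<Part> Parts { get; set; }
}

// The value object carries only what the page needs.
public class SupplierPartsView
{
    public string SupplierName { get; set; }
    public List<string> PartNumbers { get; set; }
}

public static class SupplierProjections
{
    // Project the huge one-to-many down to a flat, serializable view.
    public static SupplierPartsView ForWidget(Supplier supplier, int widgetId) =>
        new SupplierPartsView
        {
            SupplierName = supplier.Name,
            PartNumbers = supplier.Parts
                .Where(p => p.WidgetId == widgetId)
                .Select(p => p.PartNumber)
                .ToList()
        };
}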

Rich domain model with ORM

I seem to be missing something, and extensive use of Google didn't help to improve my understanding...
Here is my problem:
I like to create my domain model in a persistence-ignorant manner, for example:
I don't want to add virtual if I don't need it otherwise.
I don't like to add a default constructor, because I like my objects to always be fully constructed. Furthermore, the need for a default constructor is problematic in the context of dependency injection.
I don't want to use overly complicated mappings, because my domain model uses interfaces or other constructs not readily supported by the ORM.
One solution to this would be to have separate domain objects and data entities. Retrieval of the constructed domain objects could easily be solved using the repository pattern and building the domain object from the data entity returned by the ORM. Using AutoMapper, this would be trivial and not too much code overhead.
But I have one big problem with this approach: It seems that I can't really support lazy loading without writing code for it myself. Additionally, I would have quite a lot of classes for the same "thing", especially in the extended context of WCF and UI:
Data entity (mapped to the ORM)
Domain model
WCF DTO
View model
So, my question is: What am I missing? How is this problem generally solved?
UPDATE:
The answers so far suggest what I already feared: It looks like I have two options:
Make compromises on the domain model to match the prerequisites of the ORM and thus have a domain model the ORM leaks into
Create a lot of additional code
UPDATE:
In addition to the accepted answer, please see my answer for concrete information on how I solved those problems for me.
I would question that matching the prereqs of an ORM is necessarily "making compromises". However, some of these are fair points from the standpoint of a highly SOLID, loosely-coupled architecture.
An ORM framework exists for one sole reason: to take a domain model implemented by you and persist it into a similar DB structure, without you having to implement a large number of bug-prone, near-impossible-to-unit-test SQL strings or stored procedures. They also easily implement concepts like lazy loading: hydrating an object at the last minute before that object is needed, instead of building a large object graph yourself.
If you want stored procs, or have them and need to use them (whether you want to or not), most ORMs are not the right tool for the job. If you have a very complex domain structure such that the ORM cannot map the relationship between a field and its data source, I would seriously question why you are using that domain and that data source. And if you want 100% POCO objects, with no knowledge of the persistence mechanism behind, then you will likely end up doing an end run around most of the power of an ORM, because if the domain doesn't have virtual members or child collections that can be replaced with proxies, then you are forced to eager-load the entire object graph (which may well be impossible if you have a massive interlinked object graph).
While ORMs do require the domain design to carry some knowledge of the persistence mechanism, an ORM still results in much more SOLID designs, IMO. Without an ORM, these are your options:
Roll your own Repository that contains a method to produce and persist every type of "top-level" object in your domain (a "God Object" anti-pattern)
Create DAOs that each work on a different object type. These types require you to hard-code the get and set between ADO DataReaders and your objects; in the average case a mapping greatly simplifies the process. The DAOs also have to know about each other; to persist an Invoice you need the DAO for the Invoice, which needs a DAO for the InvoiceLine, Customer and GeneralLedger objects as well. And, there must be a common, abstracted transaction control mechanism built into all of this.
Set up an ActiveRecord pattern where objects persist themselves (and put even more knowledge about the persistence mechanism into your domain)
Overall, the second option is the most SOLID, but more often than not it turns into a beast-and-two-thirds to maintain, especially when dealing with a domain containing backreferences and circular references. For instance, for fast retrieval and/or traversal, an InvoiceLineDetail record (perhaps containing shipping notes or tax information) might refer directly to the Invoice as well as the InvoiceLine to which it belongs. That creates a 3-node circular reference that requires either an O(n^2) algorithm to detect that the object has been handled already, or hard-coded logic concerning a "cascade" behavior for the backreference. I've had to implement "graph walkers" before; trust me, you DO NOT WANT to do this if there is ANY other way of doing the job.
So, in conclusion, my opinion is that ORMs are the least of all evils given a sufficiently complex domain. They encapsulate much of what is not SOLID about persistence mechanisms, and reduce knowledge of the domain about its persistence to very high-level implementation details that break down to simple rules ("all domain objects must have all their public members marked virtual").
In short - it is not solved
All good points.
I don't have an answer (the comment got too long when I decided to add something about stored procs) except to say my philosophy seems to be identical to yours: I either hand-code or code-generate.
Things like partial classes make this a lot easier than it used to be in the early .NET days. But ORMs (as a distinct "thing" as opposed to something that just gets done in getting to and from the database) still require a LOT of compromises and they are, frankly, too leaky of an abstraction for me. And I'm not big on having a lot of dupe classes because my designs tend to have a very long life and change a lot over the years (decades, even).
As far as the database side, stored procs are a necessity in my view. I know that ORMs support them, but most ORM users tend not to use them, and that is a huge negative for me, because they talk about best practice and then couple to a table-based design even if it is created from a code-first model. It seems to me they should look at an object datastore if they don't want to use a relational database in a way that utilizes its strengths. I believe in Code AND Database first - i.e. model the database and the object model simultaneously, back and forth, and then work inwards from both ends. I'm going to lay it out right here:
If you let your developers code ORM against your tables, your app is going to have problems living for years. Tables need to change. More and more people are going to want to knock up against those entities, and now they are all using an ORM generated from tables. And you are going to want to refactor your tables over time. In addition, only stored procedures are going to give you any kind of usable role-based manageability without dealing with every table on a per-column GRANT basis - which is super-painful. If you program well in OO, you have to understand the benefits of controlled coupling. That's all stored procedures are - USE THEM so your database has a well-defined interface. Or don't use a relational database if you just want a "dumb" datastore.
Have you looked at the Entity Framework 4.1 Code First? IIRC, the domain objects are pure POCOs.
This is what we did on our latest project, and it worked out pretty well:
use EF 4.1 with virtual keywords for our business objects and have our own custom implementation of the T4 template, wrapping the ObjectContext behind an interface for repository-style data access
use AutoMapper to convert between BO and DTO
use AutoMapper to convert between DTO and ViewModel
You would think that ViewModels, DTOs, and business objects are the same thing, and they might look the same, but they have a very clear separation in terms of concerns.
ViewModels are more about the UI screen, DTOs are more about the task you are accomplishing, and business objects are primarily concerned with the domain.
There are some compromises along the way, but if you want EF, then the benefits outweigh the things that you give up.
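A hedged sketch of those two mapping hops, using the instance-based AutoMapper API (the Order types are invented, and this API arrived in AutoMapper versions later than the one this answer would have used):

using AutoMapper;

public class Order { public int Id { get; set; } public decimal Total { get; set; } }
public class OrderDto { public int Id { get; set; } public decimal Total { get; set; } }
public class OrderViewModel { public string Total { get; set; } }

public static class Mappings
{
    // BO -> DTO -> ViewModel, one CreateMap per hop.
    public static readonly IMapper Mapper = new MapperConfiguration(cfg =>
    {
        cfg.CreateMap<Order, OrderDto>();
        cfg.CreateMap<OrderDto, OrderViewModel>()
           .ForMember(vm => vm.Total,
                      opt => opt.MapFrom(dto => dto.Total.ToString("C")));
    }).CreateMapper();
}

A call like Mappings.Mapper.Map<OrderDto>(order) then performs the first hop, and Map<OrderViewModel>(dto) the second.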
Over a year later, I have solved these problems for me now.
Using NHibernate, I am able to map fairly complex Domain Models to reasonable database designs that wouldn't make a DBA cringe.
Sometimes it is necessary to create a new implementation of the IUserType interface so that NHibernate can correctly persist a custom type. Thanks to NHibernate's extensible nature, that is no big deal.
I found no way to avoid adding virtual to my properties without losing lazy loading. I still don't particularly like it, especially because of all the warnings from Code Analysis about virtual properties without derived classes overriding them, but out of pragmatism, I can now live with it.
For the default constructor I also found a solution I can live with. I add the constructors I need as public constructors and I add an obsolete protected constructor for NHibernate to use:
[Obsolete("This constructor exists because of NHibernate. Do not use.")]
protected DataExportForeignKey()
{
}

EntityFramework: To Slice or Not To Slice?

So I'm just getting started with Entity Framework. I'm working with a very large, existing database. I find myself wanting to use EF to create models that are "slices" of the whole database. Each slice corresponds to one aspect of the application. Is that the right way to look at it, or should I try to model the whole database in one EDMX?
Let me give you a fictional example:
Suppose that one of the many things this database contains is customer billing information. I feel like I want to create an EF model that just focuses on the tables that the Customer Billing module needs to interact with (so that model would NOT be used for other modules in the app; rather, those same tables might appear in other small EF models). This would allow me to leverage EF's conceptual model features (inheritance, etc.) to build a view that is correct for Customer Billing, without worrying about that model's effects on, say, Customer Support (even though the two modules share some tables).
Does that sound right?
It sounds right to me. The point of an Entity Model, after all, is to provide a set of persistence-capable business objects at a level of abstraction that's appropriate to the required business logic.
You should absolutely create entity models that support modules of the application, not models that copy the underlying database schema. As the link above describes, separating logic from persistence is one of the primary purposes of EF.
I would prefer the slice approach, for the following reasons:
If you have a massive database with loads of tables, it would be difficult to manage one massive Entity Model.
It is easier to maintain application- or domain-specific entities; since Entity Framework is not a strict table-to-entity mapping, you can create custom entities and also combine and split tables across entities.
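To make the slice idea concrete, a sketch using Code First-style contexts for brevity (the same slicing applies to separate EDMX models; all class names are illustrative):

using System.Data.Entity;

public class Customer { public int Id { get; set; } }
public class Invoice { public int Id { get; set; } }
public class Ticket { public int Id { get; set; } }

// Each module gets a model exposing only the tables it touches,
// even though some tables (Customer here) appear in both slices.
public class CustomerBillingContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    public DbSet<Invoice> Invoices { get; set; }
}

public class CustomerSupportContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    public DbSet<Ticket> Tickets { get; set; }
}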
