I'm wrestling with a design and trying to figure out the best way of approaching it.
We have many tables, and in our current LinqToSql implementation the DBML is many megabytes in size and very unwieldy. I want to avoid recreating that situation if I can. We decide our connection string on a per-user basis, so it became very difficult to make separate DBMLs for different groups of tables.
I'm set on using Entity Framework. Although we don't need the Code First elements, I like the lightweight code without all the generation, and we don't need the visual mapping, so I was thinking of writing code files for all the tables and then adding them to a DbContext as DbSets.
This got me thinking about best practice here, and I wanted to ask the question:
Is it wise to create a DbContext for every group of tables you want to use? For example, say I have a module responsible for gathering data from 5 tables; it doesn't need every single table in the database, just those 5. Do I create a DbContext that includes only these 5 tables? If I need more in the future I can add them, but it stays lightweight.
While you may have a separate context for each grouping of tables, if your model is that large, or your domains that disparate, you may want to look into adding a layer of abstraction. By this, I mean having a single context that encompasses your whole model, then adding something along the lines of the repository pattern. This is a decent write-up on accomplishing this with EF.
By doing this, you would essentially be accomplishing two goals: abstracting away your data tier, thus freeing up implementation concerns; and allowing your developers to work with just the entities they need, possibly grouped by aggregate root.
One thing I would like to make clear, though: I am not necessarily suggesting that you go with a specific end-to-end architecture (e.g. DDD). What I am trying to do here is suggest a few patterns that give you the flexibility to make mistakes (fail gracefully) while still making progress with your project.
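To sketch what that abstraction might look like, here is a minimal, illustrative repository over one full-model context; the type names are placeholders of mine, not from the linked write-up:

using System.Data.Entity;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// One context spans the whole model...
public class StoreContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    // ...plus a DbSet for every other entity in the model.
}

// ...but each consumer programs against a narrow abstraction.
public interface ICustomerRepository
{
    Customer Find(int id);
    void Add(Customer customer);
}

public class CustomerRepository : ICustomerRepository
{
    private readonly StoreContext _context;

    public CustomerRepository(StoreContext context)
    {
        _context = context;
    }

    public Customer Find(int id)
    {
        return _context.Customers.Find(id);
    }

    public void Add(Customer customer)
    {
        _context.Customers.Add(customer);
    }
}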
You can certainly do this. You add tables to the EDMX model just as in Linq2SQL, so by adding only the 5 tables you need, you avoid any entity-tracking overhead for the tables you leave out. Entity Framework also adds two-way navigation properties, which Linq2SQL doesn't have. I'd recommend using EF instead of Linq2SQL.
There is nothing inherently bad about a large DBML model, the performance impact should be negligible in EF.
On the other hand, in my opinion reducing complexity also applies to Entity Framework: if your code only needs 5 tables from the database, by all means create a separate context that only has the entities for those 5 tables. By factoring completely independent tables out into separate contexts, you express this separation clearly: there are no dependencies from these tables to other tables in your database, and no dependencies from the code to unrelated entities. If that is the case, I think (and there might be other opinions) this is the way to go.
However, keep in mind that if you need some of those tables in another context, you have to put the corresponding entities into that context as well. It can get hard to keep track of the same tables being present in multiple contexts, or even of cross-dependencies between contexts; that should be avoided, since it adds complexity.
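For illustration, a module-scoped context along the lines the question describes might look like this (entity and context names are made up):

using System.Data.Entity;

public class Shipment
{
    public int Id { get; set; }
    public string TrackingNumber { get; set; }
}

public class Carrier
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// A context scoped to one module: only the entities this module
// actually uses are mapped, so nothing else is tracked.
public class ShippingContext : DbContext
{
    // The connection string can still be chosen per user at runtime.
    public ShippingContext(string connectionString) : base(connectionString) { }

    public DbSet<Shipment> Shipments { get; set; }
    public DbSet<Carrier> Carriers { get; set; }
    // Add further DbSets only as this module grows to need them.
}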
I have a Grid control (WinForms) that is used for viewing/updating data, but executing SaveChangesAsync() is very slow (3 seconds) when updating one object. I'm finding it difficult to solve this performance issue while also sticking with DDD.
One might suggest using a CRUD approach instead of DDD, but this is only a part of the application. Would I want to use CRUD for this view if I use DDD in other parts of the app? It seems like that would break encapsulation and confuse other developers on the project. If I did this, would I put them in separate bounded contexts?
Another option is to enable AsNoTracking(). The problem is that my aggregate is Product and the entity is ProductRecipe (the parts needed to make the product). I only need to track the ProductRecipe entities, but since ProductRecipes are accessed via the aggregate Product, they are tracked (or not) along with it.
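To illustrate the trade-off (the entity names are from my model, but the query shape is just a sketch, and AppDbContext is a stand-in for my actual context):

using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class ProductQueries
{
    // Assumes a context exposing DbSet<Product> Products, where Product
    // has a Recipes navigation collection of ProductRecipe entities.
    public async Task<Product> LoadForEditingAsync(AppDbContext context, int productId)
    {
        // Loading through the aggregate root tracks the whole graph,
        // including every ProductRecipe.
        return await context.Products
            .Include(p => p.Recipes)
            .SingleAsync(p => p.Id == productId);
    }

    public async Task<Product> LoadForViewingAsync(AppDbContext context, int productId)
    {
        // AsNoTracking() makes the read cheap, but SaveChangesAsync()
        // will no longer see edits made to this graph in the grid.
        return await context.Products
            .AsNoTracking()
            .Include(p => p.Recipes)
            .SingleAsync(p => p.Id == productId);
    }
}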
My final option is to just make ProductRecipe an aggregate which would allow me to query it separately from Product.
What's the best way to fix this performance issue?
Details:
EF Core 5
WinForms
I have various objects that I would like to track in an application. The objects are computers, cameras, switches, routers, etc. I want the various objects to inherit from an object called Device since they will all have some properties in common (i.e. IP Address, MAC Address, etc.).
I like to create the objects using the designer (Model First) but I do not like the difficulty in updating the database from the model. Basically, I do not like to have to drop the database and recreate it, especially once I start populating the database. The other approach I experimented with was creating the database using SSMS in SQL Server, but then when I create the POCOs from the database, the entities do not inherit from each other. What is a good approach for my situation?
I want the various objects to inherit from an object called Device since they will all have some properties in common (i.e. IP Address, MAC Address, etc.)
You are essentially talking about which inheritance pattern you are going to use in EF; or how the model maps to your database tables. There are 3 main types of inheritance patterns in EF (see Inheritance Mapping: A Walkthrough Guide for Beginners):
Table-per-Hierarchy
Table-per-Type
Table-per-Concrete Type
Each has pros and cons (such as performance). But you should also consider that this model relates to the database, and in larger projects you might then create a second layer to work with for business logic; DDD talks about persistence models and domain models. Again, your choice here weighs initial speed of development against scalability and performance later on.
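For a flavor of what one of these patterns looks like, here is a minimal TPH sketch in Code First using the Device example from the question (class names are illustrative; TPH is Code First's default mapping for inheritance):

using System.Data.Entity;

public abstract class Device
{
    public int Id { get; set; }
    public string IpAddress { get; set; }
    public string MacAddress { get; set; }
}

public class Camera : Device
{
    public string Resolution { get; set; }
}

public class Router : Device
{
    public int PortCount { get; set; }
}

public class NetworkContext : DbContext
{
    // With TPH, this maps to a single Devices table with a
    // Discriminator column distinguishing Camera rows from Router rows.
    public DbSet<Device> Devices { get; set; }
}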
I like to create the objects using the designer (Model First) but I do not like the difficulty in updating the database from the model.
There are 4, and only 4 development strategies for EF (see Entity Framework Development Workflows):
Model First
Code First (new database)
Database First
Code First (existing database)
I do not like to have to drop the database and recreate it, especially once I start populating the database
Code First is really very, very good at this:
Seeding in Code First allows you to populate databases with test or live data depending on where you are deploying to.
Migrations allow you to do non-destructive updates to the database, and migrate data in a fully testable, utterly reliable fashion for live deployment.
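A minimal sketch of both features, assuming the standard Code First Migrations setup and reusing the illustrative Device/NetworkContext classes from the sketch above:

using System.Data.Entity.Migrations;

// Created by Enable-Migrations; applied with Add-Migration / Update-Database.
internal sealed class Configuration : DbMigrationsConfiguration<NetworkContext>
{
    public Configuration()
    {
        AutomaticMigrationsEnabled = false;
    }

    // Seed runs after every Update-Database. AddOrUpdate keys on
    // MacAddress here, keeping the seed idempotent: existing rows are
    // updated rather than duplicated or destroyed.
    protected override void Seed(NetworkContext context)
    {
        context.Devices.AddOrUpdate(
            d => d.MacAddress,
            new Router
            {
                MacAddress = "00-11-22-33-44-55",
                IpAddress = "10.0.0.1",
                PortCount = 24
            });
    }
}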
Doing this with Model First is, unfortunately, just harder. The only real solution I know of is to generate a new database, then use a SQL compare tool (with data compare) to bring the data across to the new database.
Choosing a pattern and a strategy
Each strategy has pros and cons, and each inheritance pattern works better with particular development strategies. The trade-offs are really your own to judge; for example, you might have to use Database First if you have an existing database you inherited, or you may be happier using the EF designer and so choose Model First.
Model First (by that I mean using the EF designer to define your model) uses the TPT strategy by default. See EF Designer TPT Inheritance. If you want TPH, then you can use Model-first (see EF Designer TPH Inheritance), but you have extra work to do; Code First is a better fit for TPH. TPC is even harder using Model First, and Code First is really the best (only viable) option for that in EF 5.
when I create the POCOs from the database the entities do not inherit from each other
It is good to remember that the model deals with classes; the database deals with storage in tables. When generating a model from your database, it is hard for EF to work out what the TPH or TPC inheritance should be. All it can do is create a "best guess" at your model based on the table associations. You have to help it out after the model is generated by renaming properties, changing associations, or applying inheritance. There really is no other way to do this. Updates to the database may therefore also require more work on the model.
Your best approach
That is, unfortunately, down to opinion. However if your primary requirements are:
You want TPH or TPC (or mixed strategies)
You don't want to drop your database when you issue updates to the model
then the best match for these technical requirements is Code First development, with migrations and seeding.
The downside of Code First is having to write your own POCOs, and learning the data annotation attributes. However, keep in mind:
writing POCOs is not so different from writing a database table (and once you are used to it, it is just as quick)
Code First is a lot more usable with automated testing (e.g. with DI and/or IoC to test without the database involved), so it can have benefits later on
If you are going to do a lot of EDMX manipulation with Database First, or a lot of work whenever you drop and update your database using Model First, then you are just putting time and effort in elsewhere instead of into writing POCOs
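To put that first point in perspective, an annotated POCO is roughly as much typing as the CREATE TABLE statement it replaces. Here is the illustrative Device class again, this time with the data annotations spelled out (the lengths are my own example choices, not requirements):

using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;

[Table("Devices")]
public abstract class Device
{
    [Key]
    public int Id { get; set; }

    [Required, StringLength(45)]   // long enough for an IPv6 text form
    public string IpAddress { get; set; }

    [StringLength(17)]             // e.g. "00-11-22-33-44-55"
    public string MacAddress { get; set; }
}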
I seem to be missing something, and extensive use of Google didn't help to improve my understanding...
Here is my problem:
I like to create my domain model in a persistence-ignorant manner. For example:
I don't want to add virtual if I don't need it otherwise.
I don't like to add a default constructor, because I like my objects to always be fully constructed. Furthermore, the need for a default constructor is problematic in the context of dependency injection.
I don't want to use overly complicated mappings, because my domain model uses interfaces or other constructs not readily supported by the ORM.
One solution to this would be to have separate domain objects and data entities. Retrieval of the constructed domain objects could easily be solved using the repository pattern and building the domain object from the data entity returned by the ORM. Using AutoMapper, this would be trivial and not too much code overhead.
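To make this concrete, here is roughly what I have in mind (illustrative names; this assumes AutoMapper's classic static API, which newer versions have replaced):

using AutoMapper;

// Persistence-side entity, shaped to keep the ORM happy.
public class CustomerEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Persistence-ignorant domain object: no default constructor, no virtuals.
public class Customer
{
    public Customer(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public int Id { get; private set; }
    public string Name { get; private set; }
}

public class CustomerRepository
{
    static CustomerRepository()
    {
        // One-time mapping; ConstructUsing lets the domain object keep
        // its constructor-only initialization.
        Mapper.CreateMap<CustomerEntity, Customer>()
              .ConstructUsing(e => new Customer(e.Id, e.Name));
    }

    public Customer GetById(int id)
    {
        CustomerEntity entity = LoadFromOrm(id);
        return Mapper.Map<Customer>(entity);
    }

    // Placeholder for the actual ORM query.
    private CustomerEntity LoadFromOrm(int id)
    {
        return new CustomerEntity { Id = id, Name = "..." };
    }
}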
But I have one big problem with this approach: It seems that I can't really support lazy loading without writing code for it myself. Additionally, I would have quite a lot of classes for the same "thing", especially in the extended context of WCF and UI:
Data entity (mapped to the ORM)
Domain model
WCF DTO
View model
So, my question is: What am I missing? How is this problem generally solved?
UPDATE:
The answers so far suggest what I already feared: It looks like I have two options:
Make compromises on the domain model to match the prerequisites of the ORM and thus have a domain model the ORM leaks into
Create a lot of additional code
UPDATE:
In addition to the accepted answer, please see my answer for concrete information on how I solved those problems for me.
I would question that matching the prereqs of an ORM is necessarily "making compromises". However, some of these are fair points from the standpoint of a highly SOLID, loosely-coupled architecture.
An ORM framework exists for one sole reason: to take a domain model implemented by you and persist it into a similar DB structure, without you having to implement a large number of bug-prone, near-impossible-to-unit-test SQL strings or stored procedures. ORMs also make it easy to implement concepts like lazy loading: hydrating an object at the last minute before it is needed, instead of building a large object graph yourself.
If you want stored procs, or have them and need to use them (whether you want to or not), most ORMs are not the right tool for the job. If you have a very complex domain structure such that the ORM cannot map the relationship between a field and its data source, I would seriously question why you are using that domain and that data source. And if you want 100% POCO objects with no knowledge of the persistence mechanism behind them, then you will likely end up doing an end run around most of the power of an ORM: if the domain doesn't have virtual members or child collections that can be replaced with proxies, then you are forced to eager-load the entire object graph (which may well be impossible if you have a massive interlinked object graph).
While ORMs do require the domain to carry some knowledge of the persistence mechanism in its design, an ORM still results in much more SOLID designs, IMO. Without an ORM, these are your options:
Roll your own Repository that contains a method to produce and persist every type of "top-level" object in your domain (a "God Object" anti-pattern)
Create DAOs that each work on a different object type. These types require you to hard-code the get and set between ADO DataReaders and your objects; in the average case a mapping greatly simplifies the process. The DAOs also have to know about each other; to persist an Invoice you need the DAO for the Invoice, which needs a DAO for the InvoiceLine, Customer and GeneralLedger objects as well. And, there must be a common, abstracted transaction control mechanism built into all of this.
Set up an ActiveRecord pattern where objects persist themselves (and put even more knowledge about the persistence mechanism into your domain)
Overall, the second option is the most SOLID, but more often than not it turns into a beast-and-two-thirds to maintain, especially when dealing with a domain containing backreferences and circular references. For instance, for fast retrieval and/or traversal, an InvoiceLineDetail record (perhaps containing shipping notes or tax information) might refer directly to the Invoice as well as the InvoiceLine to which it belongs. That creates a 3-node circular reference that requires either an O(n^2) algorithm to detect that the object has been handled already, or hard-coded logic concerning a "cascade" behavior for the backreference. I've had to implement "graph walkers" before; trust me, you DO NOT WANT to do this if there is ANY other way of doing the job.
So, in conclusion, my opinion is that ORMs are the least of all evils given a sufficiently complex domain. They encapsulate much of what is not SOLID about persistence mechanisms, and reduce knowledge of the domain about its persistence to very high-level implementation details that break down to simple rules ("all domain objects must have all their public members marked virtual").
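Concretely, that rule amounts to very little code (an illustrative EF-style example; note that EF only strictly needs the navigation properties virtual for lazy loading, while NHibernate wants every public member virtual):

using System.Collections.Generic;

// With members virtual, the ORM can subclass these at runtime with
// proxies that load Lines/Invoice from the database on first access.
public class Invoice
{
    public virtual int Id { get; set; }
    public virtual ICollection<InvoiceLine> Lines { get; set; }
}

public class InvoiceLine
{
    public virtual int Id { get; set; }
    public virtual Invoice Invoice { get; set; }
}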
In short: it is not solved.
All good points.
I don't have an answer (the comment just got too long when I decided to add something about stored procs), except to say that my philosophy seems to be identical to yours, and I either write the code or generate it.
Things like partial classes make this a lot easier than it used to be in the early .NET days. But ORMs (as a distinct "thing", as opposed to something that just gets done in getting to and from the database) still require a LOT of compromises, and they are, frankly, too leaky an abstraction for me. And I'm not big on having a lot of duplicate classes, because my designs tend to have a very long life and change a lot over the years (decades, even).
As far as the database side goes, stored procs are a necessity in my view. I know that ORMs support them, but the tendency among most ORM users is not to use them, and that is a huge negative for me: they talk about best practice and then couple themselves to a table-based design, even if it is created from a code-first model. It seems to me they should look at an object datastore if they don't want to use a relational database in a way that plays to its strengths. I believe in Code AND Database first, i.e. model the database and the object model simultaneously, back and forth, and then work inwards from both ends. I'm going to lay it out right here:
If you let your developers code their ORM against your tables, your app is going to have problems living for years. Tables need to change. More and more people are going to want to knock up against those entities, and now they are all using an ORM generated from tables. And you are going to want to refactor your tables over time. In addition, only stored procedures will give you any kind of usable role-based manageability without dealing with every table on a per-column GRANT basis, which is super-painful. If you program well in OO, you understand the benefits of controlled coupling. That's all stored procedures are: USE THEM so your database has a well-defined interface. Or don't use a relational database if you just want a "dumb" datastore.
Have you looked at the Entity Framework 4.1 Code First? IIRC, the domain objects are pure POCOs.
This is what we did on our latest project, and it worked out pretty well:
We use EF 4.1 with virtual keywords on our business objects and have our own custom implementation of the T4 template, wrapping the ObjectContext behind an interface for repository-style data access.
We use AutoMapper to convert between business objects and DTOs.
We use AutoMapper to convert between ViewModels and DTOs.
You would think that ViewModels, DTOs, and business objects are the same thing, and they might look the same, but there is a very clear separation of concerns between them:
ViewModels are about the UI screen, DTOs are about the task you are accomplishing, and business objects are primarily concerned with the domain.
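A condensed illustration of the three shapes and the two mapping hops (names are made up; this assumes AutoMapper's classic static API):

using AutoMapper;

// Business object: behavior and domain rules live here.
public class Order
{
    public virtual int Id { get; set; }
    public virtual decimal Total { get; set; }
}

// DTO: flat and serializable, shaped for the service task.
public class OrderDto
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

// ViewModel: shaped for one screen, display-ready.
public class OrderViewModel
{
    public string DisplayTotal { get; set; }
}

public static class MappingConfig
{
    public static void Register()
    {
        Mapper.CreateMap<Order, OrderDto>();
        Mapper.CreateMap<OrderDto, OrderViewModel>()
              .ForMember(vm => vm.DisplayTotal,
                         opt => opt.MapFrom(dto => dto.Total.ToString("C")));
    }
}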
There are some compromises along the way, but if you want EF, the benefits outweigh the things you give up.
Over a year later, I have solved these problems for me now.
Using NHibernate, I am able to map fairly complex Domain Models to reasonable database designs that wouldn't make a DBA cringe.
Sometimes it is necessary to create a new implementation of the IUserType interface so that NHibernate can correctly persist a custom type. Thanks to NHibernate's extensible nature, that is no big deal.
I found no way to avoid adding virtual to my properties without losing lazy loading. I still don't particularly like it, especially because of all the warnings from Code Analysis about virtual properties without derived classes overriding them, but out of pragmatism, I can now live with it.
For the default constructor I also found a solution I can live with: I add the constructors I need as public constructors, plus an obsolete protected constructor for NHibernate to use:
[Obsolete("This constructor exists because of NHibernate. Do not use.")]
protected DataExportForeignKey()
{
}
We've had quite a bit of discussion among our development group concerning whether the composition of entities should drive the database design, or should the database design drive the composition of the entities.
For those who have dealt with this, what has been your philosophy? Of course, not every entity maps 1:1 to a database table. But for those that do, how have you handled it? In other words, which comes first: the database table and then a corresponding entity, or an entity and then a database table to persist it?
Thanks.
"entity and then a database table to persist it"
The Entity is what your program manipulates. That's the essence of what's being processed.
The database representation of that entity, like a flat-file or GUI representation, is just one handy representation of the entity.
You may have to think a bit about the DB representation when it comes to certain things that relational databases are particularly bad at. Many-to-many relationships, for example, require introducing an extra table, because the database has limitations that your object model doesn't have. You may have some entity design considerations to cope with this, but those are few and well understood.
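For example, in EF Code First the join table can live entirely in the mapping while the object model keeps two plain collections (a sketch with illustrative names):

using System.Collections.Generic;
using System.Data.Entity;

public class Student
{
    public int Id { get; set; }
    public virtual ICollection<Course> Courses { get; set; }
}

public class Course
{
    public int Id { get; set; }
    public virtual ICollection<Student> Students { get; set; }
}

public class SchoolContext : DbContext
{
    public DbSet<Student> Students { get; set; }
    public DbSet<Course> Courses { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // The StudentCourses join table exists only here, in the mapping;
        // neither entity class knows about it.
        modelBuilder.Entity<Student>()
            .HasMany(s => s.Courses)
            .WithMany(c => c.Students)
            .Map(m => m.ToTable("StudentCourses"));
    }
}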
The database is less important.
The Entity definitions are central and essential.
Your database will likely outlive whatever application you build today. All the performance and the scalability are going to be driven by your database schema. A sound database model is the foundation on which any application is built, and I'd say is where you should invest most effort in design and testing, for it will give the biggest benefits.
That being said, of course your application will prefer to manipulate domain entities, and manipulating unnatural entities driven by relational theory rather than business concepts will just complicate things. My view is that it is the role of the ORM to match the two as well as possible. But whenever inevitable conflicts appear, the right of way should be given to the driving factor of your performance and scalability: the database schema.
I would say that you build your logical data model, and build the database and objects corresponding to that.
In fact, I would question the assumption that the database tables and corresponding entities can't correspond. I've rarely if ever seen a case where they really couldn't (if you are building an application from the ground up). Also, I would say that every time the object model and database schema diverged, it introduced a lot of problems.
I've come back around to the idea that everything is simpler if you make them always match, however heretical that may be.
Hi,
Currently I have one big DataContext with 35 tables (I dragged all my DB tables onto the designer). I must admit it is very comfortable, because I have an ORM over my full DB, and querying with LINQ is easy and simple.
My questions are:
1. Would you consider it bad design to have one DataContext with 35 tables, or should I split it into logical units?
2. Are there any performance penalties for using such a big DataContext?
Thanks, Pini.
Split it into logical units.
There are no real performance penalties, though it will be somewhat hard to keep an overview.
There is a performance penalty, since it's rebuilding the meta model for all the mappings every time you create a new context, but if you are monitoring this and it's not causing you problems, then I wouldn't worry about it.
I tend to only use DataContexts for very small projects that only need a handful of tables modeled. Anything more than that, and in my opinion it's easier in the long run to go with a more traditional and mature ORM like NHibernate.
I can understand your pain. The LINQ to SQL designer isn't great when it comes to big models. However, 35 tables is not really big.
When you can split the tables into two or more groups, where each group is completely independent of the others (no relations), splitting is justified IMO, especially when the groups are logically separate parts. In that case you can give each context a proper name.
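In code, such a split might look like this (a hand-written LINQ to SQL sketch with illustrative names; the designer generates equivalent classes):

using System.Data.Linq;
using System.Data.Linq.Mapping;

[Table(Name = "Invoices")]
public class Invoice
{
    [Column(IsPrimaryKey = true)]
    public int Id { get; set; }

    [Column]
    public decimal Amount { get; set; }
}

// Each independent group of tables gets its own, properly named context.
public class BillingDataContext : DataContext
{
    public BillingDataContext(string connection) : base(connection) { }

    public Table<Invoice> Invoices
    {
        get { return GetTable<Invoice>(); }
    }
}

// A second context for a completely unrelated group of tables would be
// declared the same way (e.g. an InventoryDataContext with Products).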
However, when you have relationships between the groups, it is often an indication that they are part of one domain. Splitting such a domain means you will have to duplicate tables, which can be annoying and impractical, though when one model/context only reads such a table, it can be okay.
Also be aware that splitting the model can have some annoying side effects in your architecture; of course, it depends on the architecture. An application I worked on used 'service commands' that executed business logic on behalf of the presentation layer. An automatic construct supplied the commands with a DataContext instance, and having multiple DataContexts made that design quite frustrating.