Continuous delivery and database schema changes with entity framework - c#

We want to progress towards being able to do continuous delivery of of our application into production. We currently deploy to azure and use table/blob storage and have a azure sql database, which we access with the entity.
As the database schema changes we want to be able to automatically apply the schema changes to the production database, but as this will happen whilst the application is live and the code changes are being deployed to many nodes at the same time we are not sure what the correct approach is.
After some reading it seems (and this makes sense) that the application needs to be tolerant of the 2 different database schema versions, so that it doesn't matter if its an old version of the code or a new version of the code which sees the database, however I'm not sure what the best way to approach handling this in the application is, using the entity framework.
Should we have versioned instances of the EF generated classes in the code which know how to access a specific version of the schema? What happens when the schema is updated and an old version of the code is running against the database?
Our entity framework classes are mapped to views in specific schemas in the db and nothing is mapped to the underlying tables, so potentially this could allow us to create v1 views which the old code uses and v2 views which the new code uses, but maintaining this feels like it would be a bit of a nightmare (its already enough of a pain simply maintaining the EF mappings to views rather than tables)
So what are best practices in this area? What do others do to solve this problem?

Whether you use EF or not, maintaining the code's ability to work with 2 consecutive versions of the database is a good (and perhaps the only viable) approach here.
Here are some ways we handle specific types of migrations:
When adding a column, we can typically just add the column (with a default constraint if non-nullable) and not worry about the code. EF will never issue a "SELECT *", so it will be able to continue to function properly while ignoring the new column. Similarly, adding a table is easy.
When removing a column or table, simply keep that column around 1 version longer than you would have otherwise.
For more complex migrations (e. g. completely changing the structure for a table or segment of the data model), deploy the new model alongside backwards-compatibility views (or tables with triggers to keep them in-sync), which will live as long as does the code that references them. As you say, this can a lot of work depending on the complexity of the migration, but it sounds like you are already well-positioned to do this because your EF entities point to views anyway. On the other hand, the benefit of this work is that you have more time to do the code migration. If you have a large codebase, this could be really beneficial in allowing you to migrate the data model to fit the needs of new features while still supporting old features without major code changes.
As a side-note, the difficulty of data migration often makes us push developing a finalized data model as far back as possible in the development schedule. With EF, you can write and test a lot of code before the data model is finalized (we use code-first to generate a sample SQLExpress database in a unit tests, even though our production database is not maintained by code-first). That way, we make fewer incremental changes to the production data model once a new feature is released.

Related

DDD: Is it ok to generate/update my entities classes from changes in database schema?

Some time ago, at work, we had to change our main system to be "cross-rdbms". I'm not sure if this is the correct term, but basically the system worked only with MSSQLServer but in order to acomodate a new client we had to make it possible for the system to work with both MSSQLServer and Oracle.
We don't use a ORM because of reasons. Instead, we use a custom ADO-based data access layer.
Before of this change, we rellied heavily on stored procedures, database functions, triggers, etc. A substantial amount of business logic was located on the database itself.
We decided to get rid of all stored procedures, triggers and stuff, and basically reduce to database to a mere storage layer.
To handle migrations, we created a .json file which contains a representation of our database schema: tables, columns, indexes, constraints, etc. A simple application was created to edit this file. By using it, we're able to edit existent tables and columns and add new ones.
This json file is versioned in our repository. When the application is executed, a routine reads the file, constructing a representation of the database in memory. It then reads the metadata from the actual database, compare it to the in-memory representation and generates scripts based on the differences found.
Finally, the scripts are executed, updating the database schema.
So, now comes my real problem. When a new column is added, the developer needs to:
- add a new property to the POCO class that represents a row in that table;
- edit the method which maps the table columns to the class properties, adding the new column/property mapping;
- edit the class which handles database commands, creating a new parameter referent to the new column.
When this approach was initially implemented, I thought about auto-generating and updating the POCO classes based on changes in the json file. These would keep the classes in sync with the database schema, and we wouldn't have to worry about developers forgetting to update the classes after creating new columns.
This feature wasn't implemented tough, and now I'm having serious doubts about it, mostly because I've been studying Clean Architecture/Onion Architecture and Domain Driven Design.
From a DDD perspective, everything should be about the Domain, which in turn should be tottally ignorant about its persistence.
So, my question is basically: how can I maintain my domain model and my database schema in sync, without violating DRY and without using a "database-centric" approach?
DDD puts the focus on the domain language and its representation in domain classes. DB issues are not the primary concern of DDD.
Therefore, generating domain classes from the database schema is the wrong direction if the intention is to apply DDD.
This question is more about finding a decent way to manage DB upgrades, which has little to do with DDD. Unit/integration tests for basic read/write DB operations may go a long way in assisting developers to remember editing the required files when DB columns are altered.

Multiple databases with slightly changing models. How do I allow `EF` to work with different database structures at run-time?

I am working with EF6, MSSQL, Oracle, .NET4.5 on a system that is used globally across company (many departments) to query different databases that belong to our department, that have mostly same EF model, some databases are Oracle and some are Microsoft SQL, some are development or uat, some are logs.
I am using different EF models for Oracle and for MSSQL databases.
One requirement is to switch between databases at run time, and this is easy,
public AggregatorEntities(string connectionString)
: base(connectionString)
{
}
however it does have side effects - many databases (dev, uat, dr, logs,...) are out of sync from what Live is (model is generated from Live), which results in errors when querying those databases.
Management knows about situation and they are ok for devs that work on some specific database to do changes to global querying system that would allow testers and uat to query the data. However they want changes they have to do to take minimum time to do this - as it is additional cost to each project that involves database changes. I will basically need to build a 'can handle all' resilient system, that when one changes database in EF will do something to accommodate to specific database.
There are different failure scenarios:
1. Name of column on table is the same but Type is different in entity
2. No column on table but there is one on entity in EF
3. Additional columns on table that are not on EF
4. Additional tables in database that are not in EF model
5. No table in database but there is entity in EF model.
I have done some thinking and this question is broad and might get closed for same reason. However I am not sure if it is worth splitting the question into each scenario, as it depends on the answer. The way I understand if single answer can answer all points then no need to split, however if each situation has different 'cure' then question should be split for that part only, but without answer no way to know.... (catch 22).
Only option I see ATM is to generate it's own model for each mirroring database, but then I end up with 50+ models.
How do I allow EF to work with different database structures at run-time?
This now officially cannot be done in a proper manner.
However end result of being able to switch between different databases with similar structures still can be achieved (for those without morals). Part with removing columns can used.
Solution is to have all inclusive EF model that is generated from database that has all the tables and all the columns (that are in any database think like logical OR of everything). Then model with all entities that have all properties from all db environments can be removed specific to environment that is queried at runtime in mechanism described here. This does not cover cases where type of column changes.
Hope this saves you some time as it took 2 weeks from mine...

Can I use an EF Database three ways?

I have a MVC web application that I've done code first to build my database.
I also need to have a console app manage data based on timeframes so, it will also need to access this database which I understand I can use as a Database First model.
However, I also need to build another website as a management dashboard, which I understand will also work as Database First.
Can I do this without having EF in one of the two circumstances nuke the database if I need to make a change to the model?
The short answer: no. You cannot implement both code-first and data-first EF on the same dataset without encountering a bona fide logistical nightmare.
Converting from one to the other is not quite as difficult as you might think, however, if your application is not overly complex. Based on the tables you've already created, data-first EF should produce objects that are reasonably compatible with your existing code.
Your next steps should look like this:
Pick one approach for EF
If necessary, convert existing projects to that paradigm
Move EF code into a shared class library (as suggested by snow)
Implement new projects using that class library to ensure consistency and reduce redundancy

Entity Framework and ADO.NET with Unit of Work pattern

We have a system built using Entity Framework 5 for creating, editing and deleting data but the problem we have is that sometimes EF is too slow or it simply isn't possible to use entity framework (Views which build data for tables based on users participating in certain groups in database, etc) and we are having to use a stored procedure to update the data.
However we have gotten ourselves into a problem where we are having to save the changes to EF in order to have the data in the database and then call the stored procedures, we can't use ITransactionScope as it always elevates to a distributed transaction and/or locks the table(s) for selects during the transaction.
We are also trying to introduce a DomainEvents pattern which will queue events and raise them after the save changes so we have the data we need in the DB but then we may end up with the first part succeeding and the second part failing.
Are there any good ways to handle this or do we need to move away from EF entirely for this scenario?
I had similar scenario . Later I break the process into small ones and use EF only, and make each small process short. Even overall time is longer, but system is easier to maintain and scale. Also I minimized joins, only update entity itself, disable EF'S AutoDetectChangesEnabled and ValidateOnSaveEnabled.
Sometimes if you look your problem in different ways, you may have better solution.
Good luck!

Updating database for desktop application (patching)

I wonder what you are using for updating a client database when your program is patched?
Let's take a look at this scenario:
You have a desktop application (.net, entity framework) which is using sql server compact database.
You release a new version of your application which is using extended database.
The user downloads a patch with modified files
How do you update the database?
I wonder how you are doing this process. I have some conception but I think more experienced people can give me better and tried solutions or advice.
You need a migration framework.
There are existing OSS libraries like FluentMigrator
project page
wiki
long "Getting started" blogpost
Entity Framework Code First will also get its own migration framework, but it's still in beta:
Code First Migrations: Beta 1 Released
Code First Migrations: Beta 1 ‘No-Magic’ Walkthrough
Code First Migrations: Beta 1 ‘With-Magic’ Walkthrough (Automatic Migrations)
You need to provide explicitly or hidden in your code DB upgrade mechanism, and - thus implement something like DB versioning chain
There are a couple of aspects to it.
First is versioning. You need some way of tying teh version of teeh db to the version of the program, could be something as simple as table with a version number in it. You need to check it on executing the application as well.
One fun scenario is you 'update' application and db successfully, and then for some operational reason the customer restores a previous version of the db, or if you are on a frequent patch cycle, do you have to do each patch in order or can thay catch up. Do you want to deal with application only or database only upgrades differently?
There's no one right way for this, you have to look at what sort of changes you make, and what level of complexity you are prepared to maintain in order to cope with everything that could go wrong.
A couple a of things worth looking at.
Two databases, one for static 'read-only' data, and one for more dynamic stuff. Upgrading the static data, can then simply be a restore from a resource within the upgrade package.
The other is how much can you do with meta-data, stored in db tables. For instance a version based xsd to describe your objects instead of a concrete class. That's goes in your read only db, now you've updated code and application with a restore and possibly some transforms.
Lots of ways to go, just remember
'users' will always find some way of making you look like an eejit, by doing something you never thought they would.
The more complex you make the system, the more chance of the above.
And last but not least, don't take short cuts on data version conversions, if you lose data integrity, everything else you do will be wasted.

Categories

Resources