How do you use unit/integration testing to ensure that a code-first migration works correctly on a populated previous-version database, including any additional code to map data from one column or table to another?
I found some previous answers that used classes in the System.Data.Entity namespace, but those appear to have become obsolete with Entity Framework Core; is manual control of migrations no longer possible?
I found a solution for myself which I will post, but I welcome other, better solutions.
The database is defined by a Context class, a ContextModelSnapshot class, a series of migration files, and the C# classes for the objects that the data tables store.
If you haven't created the new migration yet, copy all of these files to your test project and rename each class with a version suffix, e.g. "MyDataEntity" => "MyDataEntityVersion1". Edit the bundled .Designer.cs files accordingly. Afterwards, create the new migration on the originals.
If you have created the new migration but can't step back, you can manually edit the ContextModelSnapshot file to revert the changes.
The key to making this work is that both contexts point to the same database file: the Version 1 context expects the original schema, and the current context expects the upgraded schema.
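A minimal sketch of that wiring (assuming SQLite; the mydata.db file name is illustrative):
using Microsoft.EntityFrameworkCore;
public class MyDataContextVersion1 : DbContext
{
    // Must point at the same file as MyDataContext, so the Version 1
    // snapshot and the current snapshot describe one physical database.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlite("Data Source=mydata.db");
}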
In your test case, you can then do:
[TestInitialize]
public void TestInit()
{
using (var db = new MyDataContext())
db.Database.EnsureDeleted(); // reset database before each test
}
[TestMethod]
public void Migrate_Version1_To_Version2_On_Populated_Database()
{
using (var db = new MyDataContextVersion1())
db.Database.Migrate(); // create database and apply migrations up through Version 1
// populate the Version 1 database
App.InitializeDatabase(); // whatever method you would normally call to read/update the database
// assert statements to test that the Version 2 database looks like you expect.
}
where InitializeDatabase() looks something like:
public void InitializeDatabase()
{
using (var db = new MyDataContext())
{
db.Database.Migrate();
// detect if upgrade needed and set new columns
db.SaveChanges();
}
}
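The "detect if upgrade needed" step is where any data mapping belongs; a hedged sketch (People, NickName, and Name are hypothetical names standing in for your own mapping):
using (var db = new MyDataContext())
{
    db.Database.Migrate();
    // Hypothetical backfill: populate a column introduced by the new
    // migration from data that already exists in another column.
    foreach (var person in db.People.Where(p => p.NickName == null))
        person.NickName = person.Name;
    db.SaveChanges();
}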
Note that this solution is in part motivated by using SQLite, which does not support dropping columns in a migration; that discouraged me from trying anything fancier inside the migrations themselves.
Related
I need to map to a view when using EF6 with migrations.
The view pivots 2 other tables to enable a simple summary view of the underlying data, the idea being it allows us to use this in a summary index view.
The issue I have is that I am unable to create a migration that either deploys the view (the ideal goal) or deploys the DB without the view for later manual deployment.
In most attempts, following other SO questions, I end up either deadlocking the Add-Migration and Update-Database commands or generally causing an error that breaks one or the other.
What is the current best way to use EF6 to access views without breaking migrations, even if I lose the ability to deploy the views automatically with those migrations?
Further detail
The Db contains 2 tables Reports and ReportAnswers. The view ReportView combines these two and pivots ReportAnswers to allow some of the rows to become columns in this summary view.
Reports and ReportAnswers were deployed via EF Migrations. The view is currently a script that needs to be added to the deployment somehow.
Reports, ReportAnswers & ReportView are accessible from the db Context:
public virtual DbSet<ReportAnswer> ReportAnswers { get; set; }
public virtual DbSet<Report> Reports { get; set; }
public virtual DbSet<ReportView> ReportView { get; set; }
I have tried using Add-Migration Name -IgnoreChanges to create a blank migration and then manually adding the view to the Up() and Down() methods, but this just deadlocks the migration and update commands, each wanting the other to run first.
I have also tried using modelBuilder.Ignore<ReportView>(); to ignore the type when running the migrations but this proved incredibly error prone, even though it did seem to work at least once.
I came across an interesting article about using views with EF Core a few days ago, and I found the very same approach for EF 6.
You may want to use the Seed method instead of the migration Up and Down methods:
protected override void Seed({DbContextType} context)
{
    // Locate the running assembly on disk so the script can be found at runtime
    string codeBase = Assembly.GetExecutingAssembly().CodeBase;
    var uri = new UriBuilder(codeBase);
    string path = Uri.UnescapeDataString(uri.Path);
    // Build the path to the SQL script that creates the view
    string scriptPath = Path.Combine(Path.GetDirectoryName(path), "Migrations", "{CreateViewSQLScriptFilename}.sql");
    context.Database.ExecuteSqlCommand(File.ReadAllText(scriptPath));
}
Your SQL command should look like the sample below:
IF NOT EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[dbo].[{ViewName}]'))
EXEC dbo.sp_executesql @statement = N'CREATE VIEW [dbo].[{ViewName}]
AS
SELECT {SelectCommand}'
It is not perfect, but I hope it is at least helpful.
I found another blog post about this topic, and the writer says to use Sql(@"CREATE VIEW dbo.{ViewName} AS ...") in the Up method and Sql(@"DROP VIEW dbo.{ViewName};") in the Down method. I added this since you didn't supply the code from your Up and Down migration methods. It may also be a good idea to use SqlFile instead of the Sql method.
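A hedged sketch of that Up/Down approach (the migration class name and the {ViewName}/{SelectCommand} placeholders are illustrative):
public partial class AddReportView : DbMigration
{
    public override void Up()
    {
        // Substitute your real view definition here, or use SqlFile(...)
        Sql(@"CREATE VIEW dbo.{ViewName} AS SELECT {SelectCommand}");
    }

    public override void Down()
    {
        Sql(@"DROP VIEW dbo.{ViewName};");
    }
}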
There is also the option to create a customized code or SQL generator and plug it into the migrations, but I guess that is not what you are looking for.
Let me know in a comment in case you need additional help.
Related links:
Using Views with Entity Framework Code First
EF CODE FIRST - VIEWS AND STORED PROCEDURES
Leveraging Views in Entity Framework
DbMigration.Sql Method (String, Boolean, Object)
DbMigration.SqlFile Method (String, Boolean, Object)
Just a bit of an outline of what I am trying to accomplish.
We keep a local copy of a remote (3rd-party) database within our application. To download the information we use an API.
We currently download the information on a schedule, which then either inserts new records into the local database or updates the existing records.
Here is how it currently works:
public void ProcessApiData(List<Account> apiData)
{
// get the existing accounts from the local database
List<Account> existingAccounts = _accountRepository.GetAllList();
foreach (var account in apiData)
{
// check if it already exists in the local database
var existingAccount = existingAccounts.SingleOrDefault(a => a.AccountId == account.AccountId);
// if it's null then it's a new record
if(existingAccount == null)
{
_accountRepository.Insert(account);
continue;
}
// otherwise it's an existing record, so it needs updating
existingAccount.AccountName = account.AccountName;
// ... continue updating the rest of the properties
}
CurrentUnitOfWork.SaveChanges();
}
This works fine; however, it feels like it could be improved.
There is one of these methods per entity, and they all do the same thing (just updating different properties, or inserting a different entity). Is there any way to make this more generic?
It also seems like a lot of database calls; is there any way to do this in "bulk"? I've had a look at this package, which I have seen mentioned on a few other posts: https://github.com/loresoft/EntityFramework.Extended
But it seems to focus on bulk-updating a single property with the same value, as far as I can tell.
Any suggestions on how I can improve this would be brilliant. I'm still fairly new to C#, so I'm still searching for the best way to do things.
I'm using .NET 4.5.2 and Entity Framework 6.1.3 with MSSQL 2014 as the backend database.
For EF Core you can use this library:
https://github.com/borisdj/EFCore.BulkExtensions
Note: I'm the author of this one.
And for EF 6 this one:
https://github.com/TomaszMierzejowski/EntityFramework.BulkExtensions
Both extend DbContext with bulk operations and have the same calling syntax:
context.BulkInsert(entitiesList);
context.BulkUpdate(entitiesList);
context.BulkDelete(entitiesList);
The EF Core version additionally has a BulkInsertOrUpdate method.
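With that method, the sync from the question could collapse to a single upsert; a sketch (assuming EF Core, EFCore.BulkExtensions, and a hypothetical YourDbContext type):
public void ProcessApiData(List<Account> apiData)
{
    using (var context = new YourDbContext())
    {
        // Inserts new rows and updates existing ones, matching on the primary key
        context.BulkInsertOrUpdate(apiData);
    }
}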
Assuming that the classes in apiData are the same as your entities, you should be able to attach each incoming object and mark it as modified to update the existing entity.
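In EF 6 that is usually written by attaching the object and setting its state (a sketch; the Accounts DbSet name is assumed, and the incoming account must carry the correct primary key):
db.Accounts.Attach(account);                     // start tracking the detached object
db.Entry(account).State = EntityState.Modified;  // send all properties in the UPDATE
db.SaveChanges();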
For bulk inserts I use AddRange(listOfNewEntities). If you have a lot of entities to insert, it is advisable to batch them. You may also want to dispose and recreate the DbContext for each batch so that it doesn't use too much memory:
var accounts = new List<Account>();
var context = new YourDbContext();
context.Configuration.AutoDetectChangesEnabled = false;
foreach (var account in apiData)
{
    accounts.Add(account);
    // Play with the batch size to see what works best
    if (accounts.Count % 1000 == 0)
    {
        context.Set<Account>().AddRange(accounts);
        accounts = new List<Account>();
        context.ChangeTracker.DetectChanges();
        context.SaveChanges();
        context.Dispose();
        // Recreate the context so the change tracker starts empty,
        // and disable automatic change detection again
        context = new YourDbContext();
        context.Configuration.AutoDetectChangesEnabled = false;
    }
}
// Flush the final partial batch
context.Set<Account>().AddRange(accounts);
context.ChangeTracker.DetectChanges();
context.SaveChanges();
context.Dispose();
For bulk updates, there's nothing built into LINQ to SQL. There are, however, libraries and solutions to address this; see e.g. here for a solution using expression trees.
List vs. Dictionary
You search the list every time to check whether the entity exists, which is slow (a linear scan per record). Create a dictionary instead to improve performance:
var existingAccounts = _accountRepository.GetAllList().ToDictionary(x => x.AccountId);
Account existingAccount;
if(existingAccounts.TryGetValue(account.AccountId, out existingAccount))
{
// ...code....
}
Add vs. AddRange
You should be aware of the Add vs. AddRange performance difference when you add multiple records.
Add: calls DetectChanges after every record is added
AddRange: calls DetectChanges once, after all records are added
So at 10,000 entities, the Add method can take 875x more time simply to add the entities to the context.
To fix it:
CREATE a list
ADD entity to the list
USE AddRange with the list
SaveChanges
Done!
In your case, you will need to add an InsertRange method to your repository, as sketched below.
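A hedged sketch of such a method (assuming the repository wraps an EF 6 DbContext in a _context field):
public void InsertRange(IEnumerable<Account> accounts)
{
    // One DetectChanges pass for the whole batch instead of one per entity
    _context.Set<Account>().AddRange(accounts);
}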
EF Extended
You are right. This library updates all data with the same value. That is not what you are looking for.
Disclaimer: I'm the owner of the project Entity Framework Extensions
This library may fit your needs perfectly if you want to improve performance dramatically.
You can easily perform:
BulkSaveChanges
BulkInsert
BulkUpdate
BulkDelete
BulkMerge
Example:
public void ProcessApiData(List<Account> apiData)
{
// Insert or Update using the primary key (AccountID)
CurrentUnitOfWork.BulkMerge(apiData);
}
I came across this related question: How can I generate DDL scripts from Entity Framework 4.3 Code-First Model?
But this doesn't appear to answer the question of when a Code First application actually checks the existence/correctness of the DB and modifies it if necessary. Is it at run-time or at build time? Assuming it's at run-time, is it at start-up, when you create the DbContext, or at the last possible moment, e.g. checking that the table(s) exist on a case-by-case basis when you try to read/write them?
It is created at runtime, the first time you access an entity, i.e.:
using (var db = new MyDBContext())
{
var items = db.MyObj.Count(); // <- Here it is created!
}
There are some variations on how this happens, depending on the initialization strategy you set: CreateDatabaseIfNotExists, DropCreateDatabaseAlways, etc. Please give this a look:
http://www.entityframeworktutorial.net/code-first/database-initialization-strategy-in-code-first.aspx
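For example, a strategy is plugged in like this (a minimal sketch; MyDBContext stands in for your own context type):
// Run once at start-up; creates the database on first use if it is missing
Database.SetInitializer(new CreateDatabaseIfNotExists<MyDBContext>());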
The Model column in the __MigrationHistory table is a serialized and gzipped version of your EDMX. In code first, that model is generated by Add-Migration and stored both in the second part of the migration partial class (base64-encoded) and in the database as a binary stream (varbinary(max)) when the database is created.
When the database initializer (Database.SetInitializer) is called, EF generates the current Entity Data Model (EDMX) from your classes on the fly at runtime. The generated model is serialized, zipped, and compared with the Model stored in the migration history table.
The comparison happens before the DbContext is first used and, if the two models (binary streams) are not identical, you get a compatibility exception.
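If you want to inspect the stored model yourself, the bytes are plain GZip around the EDMX XML; a hedged sketch (reading modelBytes from the __MigrationHistory.Model column is left out):
// Requires System.IO and System.IO.Compression.
// 'modelBytes' is assumed to hold the varbinary(max) value of __MigrationHistory.Model.
using (var input = new MemoryStream(modelBytes))
using (var gzip = new GZipStream(input, CompressionMode.Decompress))
using (var reader = new StreamReader(gzip))
{
    string edmx = reader.ReadToEnd(); // the EDMX XML that EF compares against
}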
I've always been a database oriented programmer, so up to this day, I've always used a database-driven approach to programming and I feel pretty confident in T-SQL and SQL Server.
I'm trying to wrap my head around the Entity Framework 6 code-first approach - and frankly - I'm struggling.
I have an existing database - so I did a Add New Item > ADO.NET Entity Data Model > Code-First from Database and I get a bunch of C# classes representing my existing database. So far so good.
What I'm trying to do now is explore how to handle ongoing database upgrades, both in schema as well as "static" (pre-populated) lookup data. My first gripe is that the entities that were reverse-engineered from the database are configured with the Fluent API, while it seems more natural to me to define the new tables I want created as C# classes with data annotations. Are there any problems / issues with "mixing" those two approaches? Or could I tell the reverse-engineering step to just use data annotation attributes instead of the Fluent API altogether?
My second and even bigger gripe: I'm trying to create nice, small migrations, one for each set of features I'm adding (e.g. a new table, a new index, a few new columns, etc.), but it seems I cannot have more than a single "pending" migration. When I have one, modify my model classes further, and try to create a second migration using add-migration (name of migration), I'm greeted with:
Unable to generate an explicit migration because the following explicit migrations are pending: [201510061539107_CreateTableMdsForecast]. Apply the pending explicit migrations before attempting to generate a new explicit migration.
Seriously?! I cannot have more than one single pending migration? I need to run update-database after every single tiny migration I add?
Seems like a rather BIG drawback! I'd much rather create my 10 or 20 small, compact, easy-to-understand migrations and then apply them all in one swoop. Is there really no way to do this? This is hard to believe..... any way around this?
It is true that you can only have one pending migration open at a time during development. To understand why, you have to understand how migrations are generated. The generator works by comparing the current state of your database (the schema) with the current state of your model code. It then effectively creates a "script" (a C# class) that changes the schema of the database to match the model. You would not want more than one of these pending at the same time, or the scripts would conflict with each other. Let's take a simple example:
Let's say I have a class Widget:
class Widget
{
public int Id { get; set; }
public string Name { get; set; }
}
and a matching table Widgets in the database:
Widgets
-------
Id (int, PK, not null)
Name (nvarchar(100), not null)
Now I decide to add a new property Size to my class.
class Widget
{
public int Id { get; set; }
public string Name { get; set; }
public int Size { get; set; } // added
}
When I create my migration, the generator looks at my model, compares it with the database and sees that my Widget model now has a Size property while the corresponding table does not have a Size column. So the resulting migration ends up looking like this:
public partial class AddSizeToWidget : DbMigration
{
public override void Up()
{
AddColumn("dbo.Widgets", "Size", c => c.Int());
}
public override void Down()
{
DropColumn("dbo.Widgets", "Size");
}
}
Now, imagine that it is allowed to create a second migration while the first is still pending. I haven't yet run the Update-Database command, so my baseline database schema is still the same. Now I decide to add another property Color to Widget.
When I create a migration for this change, the generator compares my model to the current state of the database and sees that I have added two columns. So it creates the corresponding script:
public partial class AddColorToWidget : DbMigration
{
public override void Up()
{
AddColumn("dbo.Widgets", "Size", c => c.Int());
AddColumn("dbo.Widgets", "Color", c => c.Int());
}
...
}
So now I have two pending migrations, and both of them are going to try to add a Size column to the database when they are ultimately run. Clearly, that is not going to work. So that is why there is only one pending migration allowed to be open at a time.
So, the general workflow during development is:
Change your model
Generate a migration
Update the database to establish a new baseline
Repeat
If you make a mistake, you can roll back the database to a previous migration using the -TargetMigration parameter of the Update-Database command, then delete the errant migration(s) from your project and generate a new one. (You can use this as a way to combine several small migrations into a larger chunk if you really want to, although I find in practice it is not worth the effort.)
Update-Database -TargetMigration PreviousMigrationName
Now, when it comes time to update a production database, you do not have to manually apply each migration one at a time. That is the beauty of migrations -- they are applied automatically whenever you run your updated code against the database. During initialization, EF looks at the target database and checks the migration level (this is stored in the special __MigrationHistory table which was created when you enabled migrations on the database). For any migration in your code which has not yet been applied, it runs them all in order for you, to bring the database up to date.
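In EF6 this automatic catch-up is typically enabled with the MigrateDatabaseToLatestVersion initializer; a minimal sketch (MyContext stands in for your context, and Configuration is the class generated by Enable-Migrations):
// Run once at application start-up, before the context is first used
Database.SetInitializer(
    new MigrateDatabaseToLatestVersion<MyContext, Configuration>());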
Hope this helps clear things up.
Are there any problems / issues with "mixing" those two approaches?
No, there is no problem mixing them.
You can do more with fluent config than with data annotations.
Fluent config overrides data annotations when constructing the migration script.
You can use data annotations to generate DTOs and front-end/UI constraints dynamically, which saves a lot of code.
The Fluent API has the EntityTypeConfiguration class, which lets you bundle the mapping for each domain object (in the DDD sense) into its own class and register it dynamically; this speeds up work with the DbContext a lot (see the sketch below).
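A minimal sketch of that pattern (Widget echoes the earlier example; names are illustrative):
public class WidgetConfiguration : EntityTypeConfiguration<Widget>
{
    public WidgetConfiguration()
    {
        // Mapping for the Widget entity lives in one place
        ToTable("Widgets");
        Property(w => w.Name).IsRequired().HasMaxLength(100);
    }
}

// Registered once in your DbContext:
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Configurations.Add(new WidgetConfiguration());
}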
I cannot have more than a single "pending" migration
Not 100% true. (Maybe 50%, but it is not a showstopper.)
Yes, the DbMigrator compares your model "hash" to the database model "hash" when it generates the migration, so it blocks you before you can add a new small migration on top of a pending one. But that is no reason to think you cannot work in small migration steps; I work in small migration steps all the time.
When you develop an app against your local db, you apply the small migrations one by one, gradually, as you develop functionality. At the end you deploy all your small migrations to staging/production in one dll together with the new functionality, and they are applied one by one.
This may be a bit silly, but all the applications I've built have always utilized the EF Code-First approach to generate the database. When using this method, I've always accessed the database through the Context:
public class RandomController : Controller
{
public CombosContext db = new CombosContext();
//
// GET: /Home/
public ActionResult Index()
{
var rows = db.Combos.OrderBy(a => a.Id).ToList();
return View(rows);
}
}
However, what if the database is already created for me, OR I create one by adding entities to the schema/design surface and then generate the database from that? How would I access the db without the
public CombosContext db = new CombosContext();
If the DB is already created, you can use the Database First approach: http://blogs.msdn.com/b/adonet/archive/2011/09/28/ef-4-2-model-amp-database-first-walkthrough.aspx
A basic setup would be to right-click the project in Solution Explorer and click Add > New Item. In the dialog, select Data in the left pane, choose ADO.NET Entity Data Model, and follow the wizard to create your model based on the database. This way, you will have a context object exactly the way you have with code first (with some minor changes, but it works almost the same).
You can still do this with Code First, and it is the better approach IMHO. Use the Entity Framework Power Tools to reverse engineer your existing database into a code-first model.
http://visualstudiogallery.msdn.microsoft.com/72a60b14-1581-4b9b-89f2-846072eff19d/
See my demo on using it at:
http://channel9.msdn.com/Events/TechEd/NorthAmerica/2012/DEV215