Entity Framework Data Migration

Entity Framework Data Migration - c#

We've changed some logic in our application, so a migration now adds some fields, and the existing entries need values for them as well. That means loading each entry, calculating a value, and writing it to the new field. How can I access the data requested through the Sql method? The defaultValue argument of AddColumn does not help, because the value is calculated differently for every entry.
Here is the kind of workflow I have in mind:
protected override void Up(MigrationBuilder migrationBuilder)
{
migrationBuilder.AddColumn<uint>(
name: "Field",
table: "Table",
type: "INTEGER",
nullable: false,
defaultValue: 0u);
var builder = migrationBuilder.Sql("SELECT * FROM Table");
var data = builder.AccessData; // here I'm unsure how to access the data
data.Field = Calculator(data);
migrationBuilder.UpdateData(data); // this one is also not clear
}

A MigrationBuilder is a fluent, C#-friendly way to express the SQL code that will be executed during the migration itself. Even though the code you write and the "intermediate files" generated by the migration tool look like C#, in the end the migration becomes pure SQL that neither knows about nor can communicate with your application. In particular, migrationBuilder.Sql(...) only queues a statement to be executed later; it does not return any rows to your C# code.
Therefore, if you need to process existing data, all your logic must happen DB-side and must be expressed as operations on the MigrationBuilder. Of course, in the Up and Down methods of the migration you can leverage all the power of C# to build the raw SQL statements that you pass to migrationBuilder.Sql(...). Also, consider that the whole migration is executed in a transaction, and that a temporary ("volatile") stored procedure can be useful for extracting complex migration logic.
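As a rough sketch of that idea, the calculation can be written as an UPDATE statement that runs right after the column is added. The columns Width and Height and the formula are purely hypothetical stand-ins for whatever Calculator does; the table and field names are taken from the question, and the identifiers are quoted because Table is a reserved word in most SQL dialects.

```csharp
using Microsoft.EntityFrameworkCore.Migrations;

public partial class AddField : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // Add the column with a placeholder default so existing rows stay valid.
        migrationBuilder.AddColumn<uint>(
            name: "Field",
            table: "Table",
            type: "INTEGER",
            nullable: false,
            defaultValue: 0u);

        // Compute the real value per row inside the database.
        // "Width" and "Height" are hypothetical columns used for illustration only.
        migrationBuilder.Sql(
            "UPDATE \"Table\" SET \"Field\" = \"Width\" * \"Height\";");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropColumn(name: "Field", table: "Table");
    }
}
```

If the calculation cannot reasonably be expressed in SQL, it probably has to run outside the migration, for example in application code after the schema has been updated.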

Related

Entity Framework: Add column requires altering of Seed data?

First off, I'm pretty certain that this must have been asked before, but I have been unable to find an exact answer via googling, so please bear with me.
I have inherited a code first entity framework project which uses migrations. I've added a (non-nullable) column to a table and I need to insert values into this column for all existing entries - which are not the default value:
public override void Up()
{
AddColumn("dbo.QuestionType", "Duplicated", c => c.Boolean(nullable: false, defaultValue: false));
Sql("UPDATE dbo.QuestionType SET Duplicated = 1");
}
However there was originally some seed data added to this table:
context.QuestionTypes.AddOrUpdate(
e => e.Name,
new QuestionType() { Name = QuestionTypeNames.INTERVIEWER});
which means that the update statement is immediately overwritten by the data in the call to the Seed method (called after each migration).
My questions are:
Is it 'safe' to simply add the extra value to the Seed data, or will this cause everything to break for earlier migrations where the new column doesn't exist in the database?
Alternatively, is there any way to prevent the Seed method from running after this migration (and all subsequent migrations)?
Thanks

Entity Framework is great, but to be able to use it effectively, I'm afraid we really have no choice but to be diligent in making sure that the migrations and the seed method are in sync and work well with each other.
Yes, it will break: if you put the new value in the seed (i.e. new QuestionType() { Name = QuestionTypeNames.INTERVIEWER, Duplicated = true }), systems that do not have the latest migration applied will fail. This is because the seed method will be looking for the Duplicated column but will not find it.
No, I don't think there is a way to do this, at least not automatically. One solution is this: since you have access to the context object in the seed method, you can just query the database to check if the particular migration already exists. You can then wrap the specific seeder for QuestionTypes in an if statement, utilizing the result of the earlier query.
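A minimal sketch of that check, assuming EF6 on SQL Server with the default dbo.__MigrationHistory table. SurveyContext and the migration name AddDuplicatedToQuestionType are hypothetical; QuestionType and QuestionTypeNames come from the question's project and are not defined here.

```csharp
using System.Data.Entity.Migrations;
using System.Linq;

internal sealed class Configuration : DbMigrationsConfiguration<SurveyContext>
{
    protected override void Seed(SurveyContext context)
    {
        // Migration IDs have the form "<timestamp>_<name>", so match on the name suffix.
        // The migration name below is an assumption for illustration.
        bool migrationApplied = context.Database
            .SqlQuery<int>(
                "SELECT COUNT(*) FROM dbo.__MigrationHistory WHERE MigrationId LIKE @p0",
                "%_AddDuplicatedToQuestionType")
            .Single() > 0;

        if (migrationApplied)
        {
            // Only touch the Duplicated column once it is known to exist.
            context.QuestionTypes.AddOrUpdate(
                e => e.Name,
                new QuestionType { Name = QuestionTypeNames.INTERVIEWER, Duplicated = true });
        }
    }
}
```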

Trying out EF code-first and migrations - stumbling blocks

I've always been a database oriented programmer, so up to this day, I've always used a database-driven approach to programming and I feel pretty confident in T-SQL and SQL Server.
I'm trying to wrap my head around the Entity Framework 6 code-first approach - and frankly - I'm struggling.
I have an existing database - so I did an Add New Item > ADO.NET Entity Data Model > Code-First from Database and I get a bunch of C# classes representing my existing database. So far so good.
What I'm trying to do now is explore how to handle ongoing database upgrades - both in schema as well as "static" (pre-populated) lookup data. My first gripe is that the entities that were reverse-engineered from the database are being configured with the Fluent API, while it seems more natural to me to create the new tables I want to have created as a C# class with data annotations. Are there any problems / issues with "mixing" those two approaches? Or could I tell the reverse-engineering step to just use data annotation attributes instead of the Fluent API altogether?
My second and even bigger gripe: I'm trying to create nice and small migrations - one for each set of features I'm trying to add (e.g. a new table, a new index, a few new columns etc.) - but it seems I cannot have more than a single "pending" migration. When I have one, and I modify my model classes further, and I try to get a second migration using add-migration (name of migration), I'm greeted with:
Unable to generate an explicit migration because the following explicit migrations are pending: [201510061539107_CreateTableMdsForecast]. Apply the pending explicit migrations before attempting to generate a new explicit migration.
Seriously ?!?!? I cannot have more than one, single pending migration?? I need to run update-database after every single tiny migration I'm adding?
Seems like a rather BIG drawback! I'd much rather create my 10, 20 small, compact, easy-to-understand migrations, and then apply them all in one swoop - no way to do this!?!? This is really hard to believe..... any way around this??

It is true that you can only have one pending migration open at a time during development. To understand why, you have to understand how the migrations are generated. The generator works by comparing the current state of your database schema (strictly speaking, the model snapshot that EF records with the last applied migration) with the current state of your model code. It then effectively creates a "script" (a C# class) which changes the schema of the database to match the model. You would not want to have more than one of these pending at the same time, or else the scripts would conflict with each other. Let's take a simple example:
Let's say I have a class Widget:
class Widget
{
public int Id { get; set; }
public string Name { get; set; }
}
and a matching table Widgets in the database:
Widgets
-------
Id (int, PK, not null)
Name (nvarchar(100), not null)
Now I decide to add a new property Size to my class.
class Widget
{
public int Id { get; set; }
public string Name { get; set; }
public int Size { get; set; } // added
}
When I create my migration, the generator looks at my model, compares it with the database and sees that my Widget model now has a Size property while the corresponding table does not have a Size column. So the resulting migration ends up looking like this:
public partial class AddSizeToWidget : DbMigration
{
    public override void Up()
    {
        // Size is a non-nullable int property, so the column is NOT NULL.
        AddColumn("dbo.Widgets", "Size", c => c.Int(nullable: false));
    }
    public override void Down()
    {
        DropColumn("dbo.Widgets", "Size");
    }
}
Now, imagine that it were allowed to create a second migration while the first is still pending. I haven't yet run the Update-Database command, so my baseline database schema is still the same. Now I decide to add another property, Color, to Widget.
When I create a migration for this change, the generator compares my model to the current state of the database and sees that I have added two columns. So it creates the corresponding script:
public partial class AddColorToWidget : DbMigration
{
    public override void Up()
    {
        // Size shows up again because the baseline has not moved.
        AddColumn("dbo.Widgets", "Size", c => c.Int(nullable: false));
        AddColumn("dbo.Widgets", "Color", c => c.Int(nullable: false));
    }
    ...
}
So now I have two pending migrations, and both of them are going to try to add a Size column to the database when they are ultimately run. Clearly, that is not going to work. So that is why there is only one pending migration allowed to be open at a time.
So, the general workflow during development is:
1. Change your model
2. Generate a migration
3. Update the database to establish a new baseline
4. Repeat
If you make a mistake, you can roll back the database to a previous migration using the -TargetMigration parameter of the Update-Database command, then delete the errant migration(s) from your project and generate a new one. (You can use this as a way to combine several small migrations into a larger chunk if you really want to, although I find in practice it is not worth the effort).
Update-Database -TargetMigration PreviousMigrationName
Now, when it comes time to update a production database, you do not have to manually apply each migration one at a time. That is the beauty of migrations: if the context is configured with the MigrateDatabaseToLatestVersion initializer, they are applied automatically when your updated code first runs against the database. During initialization, EF looks at the target database and checks the migration level (this is stored in the special __MigrationHistory table which was created when you enabled migrations on the database). Any migrations in your code which have not yet been applied are then run in order, to bring the database up to date.
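A short sketch of the two usual ways to trigger this in EF6. WidgetContext is a hypothetical context name, and Configuration is assumed to be the migrations configuration class that Enable-Migrations generated for it.

```csharp
using System.Data.Entity;
using System.Data.Entity.Migrations;

public static class DatabaseStartup
{
    // Option 1: let EF apply pending migrations the first time the context is used.
    public static void UseInitializer()
    {
        Database.SetInitializer(
            new MigrateDatabaseToLatestVersion<WidgetContext, Configuration>());
    }

    // Option 2: apply all pending migrations explicitly, e.g. from a deployment step.
    public static void MigrateExplicitly()
    {
        var migrator = new DbMigrator(new Configuration());
        migrator.Update();
    }
}
```

Either method would typically be called once at application start-up, before the first query. Without one of them (or a manual Update-Database run), pending migrations are not applied on their own.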
Hope this helps clear things up.

Are there any problems / issues with "mixing" those two approaches?
No, there is no problem with mixing them.
You can do more with fluent config than with data annotations.
Fluent config overrides data annotation when constructing the migration script.
You can use data annotations to generate DTOs and front-end/UI constraints dynamically - saves a lot of code.
Fluent API has class EntityTypeConfiguration which allows you to make domains (in DDD sense) of objects dynamically and store them - speeds up work with DbContext a lot.
I cannot have more than a single "pending" migration
Not 100% true. (Maybe 50%, but this is not a showstopper.)
Yes, the DbMigrator compares your model "hash" to the database model "hash" when it generates the Db - so it blocks you before you make your new small migration. But this is no reason to think you cannot make small migration steps. I only ever make small migration steps.
When you develop an app against your local db, you apply the small migrations one by one as you develop functionality - gradually. At the end you deploy all your small migrations to staging/production in one dll with all the new functionality - and they are applied one by one.

C# - Entity Framework - Large seed data code-first

I'm going to create an initial table in my app to store all the cities/states of my country.
This is a relatively large data set: 5k+ records.
Reading this post showed me a good way to do this, although I think that leaving a SQL file out in the open, to be imported by EF, is a security flaw.
The file format is irrelevant: I can make it an XLS or a TXT if I want; instead of executing it as a SQL command as shown in the post, I can simply read it as a stream and generate the objects as shown in the tutorial behind the next hyperlink.
Reading this tutorial about data seeding in code-first, I saw that the seed method is executed during the database initialization process and that the seed objects are generated in the seed method.
My questions:
Regarding the seed methods, what is the best approach: the SQL-file approach or the object approach?
I personally think that the object approach is more secure but may be slower, which leads to my second question:
Is the seed method that runs during the database initialization process executed ONLY when the DB is created? This is a little unclear to me.
Thanks.

Reference System.Data in your database project and add the NuGet package EntityFramework.BulkInsert.
Insert your data in the seed method if you detect that it's not there yet:
protected override void Seed(BulkEntities context)
{
if (!context.BulkItems.Any())
{
var items = Enumerable.Range(0, 100000)
.Select(s => new BulkItem
{
Name = s.ToString(),
Status = "asdf"
});
context.BulkInsert(items, 1000);
}
}
Inserting 100,000 items takes about 3 seconds over here.

EF Code Migration - Migrating string to decimal field (v. 4.3.1.0)

I'm supporting an application in which we need to migrate the code first generated database. The change is to modify three properties (all on the same table/entity) from a string to a nullable decimal.
Part of the requirement is that I need to output the changes to a SQL file, since we are deploying the patch to our client, who is also hosting the product in production.
I was told that this is possible but I am unsure how to do it.
Question: How can I, using EF code first, migrate the database table to have nullable decimals instead of strings and have the changes output to a SQL file? I am making the assumption that all the values currently in the column are convertible to decimals, but if not, how would that change the complexity?

If I'm correct in the requirements, then you'll need to create a migration, then run
Update-Database -Script
This will create a sql script that will be run against the DB to update the structure.
Also Update-Database -Verbose will update the DB structure, and output the SQL run.
If you need to preserve the data, you can also run SQL in your migration script directly:
public partial class RenameColumn : DbMigration
{
    public override void Up()
    {
        Sql("update blah...");
    }
    public override void Down()
    {
        Sql("drop table bobby");
    }
}
And there you can do whatever you need to do to preserve the data: add a temp column/table, copy the original data there, change the column type, convert the data to the required format and copy it to the new column, then clean up.
If you have Sql command in the migration, when you run Update-Database -Script, it will give you combined script - scaffolded migration script and your manually written script.
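The preserve-the-data steps described above might look like the following sketch for one string column. The table dbo.Products, the column Price and the decimal(18, 2) precision are hypothetical, the SQL is SQL Server flavoured, and the CAST relies on the question's assumption that every non-empty value is convertible.

```csharp
using System.Data.Entity.Migrations;

public partial class ConvertPriceToDecimal : DbMigration
{
    public override void Up()
    {
        // 1. Add a temporary nullable decimal column.
        AddColumn("dbo.Products", "PriceTemp",
            c => c.Decimal(nullable: true, precision: 18, scale: 2));

        // 2. Copy and convert the existing values; blank strings stay NULL.
        Sql("UPDATE dbo.Products SET PriceTemp = CAST(Price AS decimal(18, 2)) " +
            "WHERE Price IS NOT NULL AND LTRIM(RTRIM(Price)) <> ''");

        // 3. Drop the old string column and give the new one its name.
        DropColumn("dbo.Products", "Price");
        RenameColumn("dbo.Products", "PriceTemp", "Price");
    }

    public override void Down()
    {
        // Reverse the steps: back to a string column holding the same values.
        AddColumn("dbo.Products", "PriceText", c => c.String());
        Sql("UPDATE dbo.Products SET PriceText = CAST(Price AS nvarchar(max))");
        DropColumn("dbo.Products", "Price");
        RenameColumn("dbo.Products", "PriceText", "Price");
    }
}
```

If some values might not be convertible, the UPDATE would need a filter or a fallback for those rows before the old column is dropped; otherwise the CAST fails and the migration stops.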

Auto generate class properties by binding to database lookup table

I'm not sure if this is feasible or not, but is there a way in C# that will allow me to generate static members or an enumerator of all the lookup values in a database table?
For example, if I have a table for countries with 2 columns: code, countryname. I want a way to convert all the rows in this table into a class with a property for each row so I can do the following:
string countryCode = Country.Egypt.Code
Where Egypt is a generated property from the database table.

When you say "to convert all the rows", do you actually mean "to convert all the columns"?
If so, and if your ADO.NET provider supports it, you can use LINQ to SQL to auto-generate a class that has properties that match the columns in your table. You can follow this procedure:
1. Right-click on your project and Add / New Item / LINQ to SQL Classes. By default, this will generate a DataClasses1.dbml file with DataClasses1DataContext class.
2. Expand the database connection of interest in the Server Explorer, under Data Connections (you may need to add it there first through right-click on Data Connections).
3. Pick the table of interest and drag'n'drop it onto the surface of DataClasses1.dbml.
Assuming your table name was COUNTRY with fields NAME and CODE, you can then use it from your code like this:
using (var db = new DataClasses1DataContext()) {
    // Look up the single row whose NAME is "Egypt", or null if there is none.
    COUNTRY egypt = db.COUNTRies.Where(row => row.NAME == "Egypt").SingleOrDefault();
    if (egypt == null) {
        // "Egypt" is not in the database.
    }
    else {
        var egypt_code = egypt.CODE;
        // Use egypt_code...
    }
}
If you actually meant "rows", I'm not aware of an automated way to do that (which doesn't mean it doesn't exist!). Writing a small program that goes through all rows, extracts the actual values and generates some C# text should be a fairly simple exercise though.
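A minimal sketch of such a generator, assuming the rows have already been read from the table into (Code, Name) pairs. CountryClassGenerator and the shape of the generated Country class are illustrative choices, not an existing tool.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class CountryClassGenerator
{
    // Turns a display name such as "United States" into an identifier like "UnitedStates".
    // Only spaces, hyphens, dots and apostrophes are handled; other characters are not sanitized.
    private static string ToIdentifier(string name)
    {
        var words = name.Split(new[] { ' ', '-', '.', '\'' },
                               StringSplitOptions.RemoveEmptyEntries);
        return string.Concat(words.Select(w => char.ToUpperInvariant(w[0]) + w.Substring(1)));
    }

    // Produces C# source text with one static member per row.
    // Values are emitted verbatim, so names containing quotes would need escaping.
    public static string Generate(IEnumerable<(string Code, string Name)> rows)
    {
        var sb = new StringBuilder();
        sb.AppendLine("public sealed class Country");
        sb.AppendLine("{");
        sb.AppendLine("    public string Code { get; }");
        sb.AppendLine("    public string Name { get; }");
        sb.AppendLine("    private Country(string code, string name) { Code = code; Name = name; }");
        foreach (var row in rows)
        {
            sb.AppendLine(
                $"    public static readonly Country {ToIdentifier(row.Name)} = new Country(\"{row.Code}\", \"{row.Name}\");");
        }
        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

Calling CountryClassGenerator.Generate(new[] { ("EG", "Egypt") }) returns source text that can be written to a .cs file as a pre-build step, after which Country.Egypt.Code compiles.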
But even if you do that, how would you handle database changes? Say, a value is deleted from the database yet it still exists in your program because it existed at the time of compilation? Or is added to the database but is missing from your program?

This cannot be done because Country.Egypt has to be available at compile time. I think your options are:
1. Generate the code for the Country class from the database. Of course, the question then is how clients will use it.
2. Keep the properties statically declared and read their Code from the database during application start-up.
3. Keep the properties as well as the codes statically declared and check them against the database during application start-up.
Further to #1 above, if the client code does not depend on individual property names, then these are not types but data, and you could just as well use a Country.AllCountries property that is initialized at start-up.
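The Country.AllCountries idea could be sketched as follows. The Initialize method and the choice of a name-keyed dictionary are assumptions; how the rows are read from the database is left out.

```csharp
using System;
using System.Collections.Generic;

public sealed class Country
{
    public string Code { get; }
    public string Name { get; }

    public Country(string code, string name)
    {
        Code = code;
        Name = name;
    }

    // Lookup by country name, filled once at application start-up.
    public static IReadOnlyDictionary<string, Country> AllCountries { get; private set; }
        = new Dictionary<string, Country>();

    // Call once at start-up with the rows read from the lookup table.
    public static void Initialize(IEnumerable<Country> rows)
    {
        var map = new Dictionary<string, Country>(StringComparer.OrdinalIgnoreCase);
        foreach (var country in rows)
        {
            map[country.Name] = country;
        }
        AllCountries = map;
    }
}
```

Client code would then write Country.AllCountries["Egypt"].Code instead of Country.Egypt.Code. The compile-time check is lost, but the data always reflects the database as of start-up.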
