I am looking for a way to do daily deployments and keep the database scripts in line with releases.
Currently, we have a fairly decent way of deploying our source: we have unit test code coverage, continuous integration and rollback procedures.
The problem is keeping the database scripts in line with a release. Everyone seems to try their scripts out on the test database and then run them on live; when the ORM mappings are updated (that is, when the changes go live), the application picks up the new column.
The first problem is that none of the scripts HAVE to be written anywhere, generally everyone "attempts" to put them into a Subversion folder but some of the lazier people just run the script on live and most of the time no one knows who has done what to the database.
The second issue is that we have 4 test databases and they are ALWAYS out of line and the only way to truly line them back up is to do a restore from the live database.
I am a big believer that a process like this needs to be simple, straightforward and easy to use in order to help a developer, not hinder them.
What I am looking for are techniques/ideas that make it EASY for the developer to want to record their database scripts so they can be run as part of a release procedure: a process that the developer would want to follow.
Any stories, use cases or even a link would be helpful.
For this very problem I chose to use a migration tool: Migratordotnet.
With migrations (in any tool) you have a simple class used to perform your changes and undo them. Here's an example:
    using System.Data;
    using Migrator.Framework;

    [Migration(62)]
    public class _62_add_date_created_column : Migration
    {
        public override void Up()
        {
            // add it nullable
            Database.AddColumn("Customers", new Column("DateCreated", DbType.DateTime));
            // seed it with data
            Database.Execute("update Customers set DateCreated = getdate()");
            // add not-null constraint
            Database.AddNotNullConstraint("Customers", "DateCreated");
        }

        public override void Down()
        {
            // undo the change by dropping the column again
            Database.RemoveColumn("Customers", "DateCreated");
        }
    }
This example shows how you can handle volatile updates, like adding a new not-null column to a table that has existing data. This can be automated easily, and you can easily go up and down between versions.
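The same up/down idea can be sketched outside of any particular tool. The following is a minimal illustration in Python against SQLite, not Migratordotnet itself: the `MIGRATIONS` list, `current_version` and `migrate_to` are hypothetical names, versions are assumed to be consecutive integers, and the version number is kept in SQLite's `PRAGMA user_version`.

```python
import sqlite3

# Each entry: (version, statements to apply it, statements to undo it).
# Assumption: versions are consecutive integers starting at 1.
MIGRATIONS = [
    (1,
     ["CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT)"],
     ["DROP TABLE Customers"]),
    (2,
     # Add the column as nullable, then seed existing rows. SQLite cannot add
     # a NOT NULL constraint afterwards, so that step is left out here.
     ["ALTER TABLE Customers ADD COLUMN DateCreated TEXT",
      "UPDATE Customers SET DateCreated = datetime('now')"],
     # Undo by rebuilding the table without the column.
     ["CREATE TABLE Customers_old (Id INTEGER PRIMARY KEY, Name TEXT)",
      "INSERT INTO Customers_old SELECT Id, Name FROM Customers",
      "DROP TABLE Customers",
      "ALTER TABLE Customers_old RENAME TO Customers"]),
]


def current_version(conn):
    """Read the schema version stored in the database file header."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate_to(conn, target):
    """Move the schema up or down until it is at the target version."""
    version = current_version(conn)
    if target > version:
        steps = [(v, up) for v, up, _ in MIGRATIONS if version < v <= target]
    else:
        steps = [(v - 1, down) for v, _, down in reversed(MIGRATIONS)
                 if target < v <= version]
    for new_version, statements in steps:
        for sql in statements:
            conn.execute(sql)
        # PRAGMA values cannot be bound as parameters, hence the int() guard.
        conn.execute(f"PRAGMA user_version = {int(new_version)}")
        conn.commit()
```

Calling `migrate_to(conn, 2)` on an empty database applies both steps in order, and `migrate_to(conn, 1)` afterwards runs only the undo statements of step 2.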
This has been a really valuable addition to our build, and has streamlined the process immensely.
I posted a comparison of the various migration frameworks in .NET here: http://benscheirman.com/2008/06/net-database-migration-tool-roundup
Read K.Scott Allen's series of posts on database versioning.
We built a tool for applying database scripts in a controlled manner based on the techniques he describes and it works well.
This could then be used as part of the continuous integration process, with each test database having changes deployed to it when a commit is made to the URL where you keep the database upgrade scripts. I'd suggest having a baseline script and upgrade scripts so that you can always run a sequence of scripts to get a database from its current version to the new state that is needed.
This does still require some process and discipline from the developers though (all changes need to be rolled into a new version of the base install script and a patch script).
We've been using SQL Compare from RedGate for a few years now:
http://www.red-gate.com/products/index.htm
The pro version has a command line interface that you could probably use to set up your deployment procedures.
We use a modified version of the database versioning described by K. Scott Allen. We use the Database Publishing Wizard to create the original baseline script, then a custom C# tool based on SQL Server SMO to dump the stored procedures, views and user functions. Change scripts, which contain schema and data changes, are generated by Red Gate tools. So we end up with a structure like:
Database\
    ObjectScripts\ - contains stored procs, views and user funcs, 1 per file
    \baseline.sql - database snapshot which includes tables and data
    \sc.01.00.0001.sql - incremental change scripts
    \sc.01.00.0002.sql
    \sc.01.00.0003.sql
The custom tool creates the database if necessary, applies the baseline.sql if necessary, adds a SchemaChanges table if necessary and applies the change scripts as necessary based on what's in the SchemaChanges table. That process occurs as part of a NAnt build script each time we do a deployment build via CruiseControl.NET.
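For readers who want a feel for what such a tool does, here is a rough sketch in Python against SQLite. It is not the schemachanger app described above: the function names, the `SchemaChanges` column layout, and the rule "apply the baseline only when no SchemaChanges table exists yet" are all assumptions made for illustration.

```python
import sqlite3
from pathlib import Path


def table_exists(conn, name):
    """Return True if a table with this name is already in the database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def apply_change_scripts(conn, script_dir):
    """Apply baseline.sql once, then every sc.*.sql not yet recorded."""
    script_dir = Path(script_dir)
    if not table_exists(conn, "SchemaChanges"):
        # Assumption: no tracking table means a brand-new database.
        baseline = script_dir / "baseline.sql"
        if baseline.exists():
            conn.executescript(baseline.read_text())
        conn.execute(
            "CREATE TABLE SchemaChanges ("
            "ScriptName TEXT PRIMARY KEY, AppliedAt TEXT NOT NULL)"
        )
    already_applied = {
        row[0] for row in conn.execute("SELECT ScriptName FROM SchemaChanges")
    }
    newly_applied = []
    # Zero-padded names such as sc.01.00.0002.sql sort in release order.
    for script in sorted(script_dir.glob("sc.*.sql")):
        if script.name in already_applied:
            continue
        conn.executescript(script.read_text())
        conn.execute(
            "INSERT INTO SchemaChanges VALUES (?, datetime('now'))",
            (script.name,),
        )
        conn.commit()
        newly_applied.append(script.name)
    return newly_applied
```

Because the function only looks at what is recorded in the tracking table, running it twice in a row is harmless, which is what makes it safe to call from every build.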
If anyone wants the source code to the schemachanger app I can throw it up on codeplex/google or wherever.
If you are talking about trying to keep database schemas in sync, try using Red Gate SQL Comparison SDK. Build a temp database based on a create script (newDb) - this is what you want your database to look like. Compare newDb against your old database (oldDb). Get a change set from that comparison and apply it using Red Gate. You could build this upgrade process into your tests, and you can try to get all the devs to agree that there is one place where the create script for the database is kept. This same practice works well for upgrading your database across several versions and running data migration scripts and processes between each step (using an XML doc to map the create and data migration scripts).
Edit: With Red Gate technique, you only are concerned with create scripts, not upgrade scripts since Red Gate comes up with the upgrade script. It will also let you drop and create indexes, stored procedures, functions, etc.
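To make the newDb/oldDb idea concrete, here is a toy version of the comparison in Python against SQLite. It is only an illustration of the principle and does far less than the Red Gate tools: it detects missing tables, missing columns and surplus tables, and nothing else. `read_schema` and `diff_schemas` are hypothetical names.

```python
import sqlite3


def read_schema(conn):
    """Return {table: {column: declared_type}} for all user tables."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {col[1]: col[2] for col in columns}
    return schema


def diff_schemas(old_conn, create_script):
    """Build a throwaway newDb from the create script and list the
    statements needed to bring the old database in line with it."""
    new_conn = sqlite3.connect(":memory:")
    new_conn.executescript(create_script)
    old, new = read_schema(old_conn), read_schema(new_conn)
    changes = []
    for table, columns in sorted(new.items()):
        if table not in old:
            # Reuse the CREATE statement SQLite stored for the new table.
            create_sql = new_conn.execute(
                "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
                (table,),
            ).fetchone()[0]
            changes.append(create_sql)
            continue
        for column, declared_type in columns.items():
            if column not in old[table]:
                changes.append(
                    f'ALTER TABLE "{table}" ADD COLUMN "{column}" {declared_type}'
                )
    for table in sorted(set(old) - set(new)):
        changes.append(f'DROP TABLE "{table}"')
    new_conn.close()
    return changes
```

The returned list is the "change set"; executing each statement against the old database should leave `read_schema` reporting the same structure for both.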
Go here:
https://blog.codinghorror.com/get-your-database-under-version-control/
Scroll down a bit to the list of 5 links to the odetocode.com website. Fantastic five-part series. I would use that as a starting point to get ideas and figure out a process that will work for your team.
You should consider using a build tool like MSBuild or NAnt. We use a combination of CruiseControl.NET, NAnt, and SourceGear Fortress to handle our deployments, including SQL objects. The NAnt db build task calls sqlcmd.exe to update scripts in our dev and staging environments after they're checked into Fortress.
We use Visual Studio for Database Professionals and TFS to version and manage our database deployments. This allows us to treat our databases just like code (check out, check in, lock, view version history, branch, build, deploy, test, etc.) and even include them in the same solution files if we wish.
Our developers can work on local databases to avoid stepping on each other's changes in a shared environment. When they check database changes into TFS, we have continuous integration to build, test and deploy to our integrated dev environment. We have separate builds on release branches to create differential deployment scripts for each subsequent environment.
Later, if a bug is discovered in a release, we can go to a release branch and hotfix the code and database at the same time.
This is a great product, but its adoption was hindered early on due to a Microsoft marketing blunder. It was originally a separate product under Team System. This meant in order to use features of the developer edition and database edition at the same time, you were required to step up to the much more expensive Team Suite edition. We (and many other customers) gave Microsoft grief about this, and we were very happy they announced this year that DB Pro has been folded into the developer edition, and that immediately anyone licensed with developer edition can install the database edition.
Gus off-handedly mentioned DB Ghost (above) – I second it as a potential solution.
A brief overview of how my company is using DB Ghost:
After the schema for a new DB has been reasonably settled during initial development, we use the DB Ghost 'Data and Schema Scripter' to create script (.sql) files for all the DB objects (and any static data), and we check these script files in to source control (the tool separates the objects into folders such as 'Stored Procedures', 'Tables', etc.). At this point, we can use either the DB Ghost 'Packager' or 'Packager Plus' tool to create a stand-alone executable that creates a new DB from these scripts.
All changes to the DB schema are checked-in to source by check-ins to the specific script files.
At any time we can use the packager to create an executable to either (a) create a new DB or (b) update an existing DB. Some customization is required for certain path-dependent changes (e.g. changes that require data to be updated), but we have pre-update and post-update scripts that are run.
The 'update' process involves the creation of a clean 'source' DB and then (after pre-update custom scripts) a comparison between the schemas of the source DB and the target DB. DB Ghost updates the target DB to match the source DB.
We routinely make changes to production DBs (we have 14 customers in 7 different production environments) but inevitably deploy a large-enough set of changes with a DB Ghost update executable (created during our build process). Any production changes that were not checked-in to source (or that were not checked-in to the appropriate branch being released) are LOST. This has forced everyone to check-in changes consistently.
To summarize:
If you enforce a policy that all DB updates be deployed using a DB Ghost update executable, you can 'force' developers to consistently check-in their changes, regardless of whether they are deployed manually in the interim.
Adding a step (or steps) to your build process to create a DB Ghost update executable will in-effect perform a test to verify that a DB can be created from scripts (i.e. because DB Ghost creates a 'source' DB, even when creating the update executable package) and if you add a step (or steps) to execute the update package [on any of the four test DBs you mentioned], you can keep your test DBs in line with source.
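The "can a DB be created from the scripts in source control?" check is easy to approximate even without DB Ghost. The sketch below, in Python against SQLite, is an assumption-laden stand-in: the folder names `Tables` and `Views`, the fixed folder order, and the function name `build_from_scripts` are all made up for illustration.

```python
import sqlite3
from pathlib import Path

# Assumed folder layout and order: tables are created before views.
FOLDER_ORDER = ["Tables", "Views"]


def build_from_scripts(scripts_root):
    """Create a throwaway database from the per-object script files.

    Raises RuntimeError naming the first script that fails, so a build
    step can fail fast when a checked-in script is broken.
    """
    conn = sqlite3.connect(":memory:")
    for folder in FOLDER_ORDER:
        for script in sorted((Path(scripts_root) / folder).glob("*.sql")):
            try:
                conn.executescript(script.read_text())
            except sqlite3.Error as exc:
                conn.close()
                raise RuntimeError(f"{script} failed: {exc}") from exc
    return conn
```

Run as a build step, a check like this turns a script that was never checked in (or was checked in broken) into a failed build rather than a surprise at deployment time.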
There are some caveats and some limitations in what changes are 'easily' deployed with this tool (really a suite of related tools), but they are all fairly minor (at least for my company):
Renaming objects must be done in one of the custom scripts
The entire DB is always updated (e.g. objects in a single schema can't be updated alone) making it difficult to support customer-specific objects in the main application DB
The book Refactoring Databases addresses many of these issues at a conceptual level.
As far as tools go, I know that DB Ghost works well for SQL Server. I have heard that the Data Dude edition of Visual Studio has really been improved upon in the latest release but I don't have any experience with it.
As far as really pulling off continuous integration style database development, it gets really resource intensive really fast because of the number of database copies you need. It is very doable when the database can fit on a developer workstation but impractical when the database is so large that it needs to be deployed across a grid. To do it you basically need 1 copy of the database per developer [developers who make DDL changes, not just changes to procs] + 6 common copies. The common copies are as follows:
INT DEV --> Developers "check in" their refactoring to INT DEV for integration testing. When integration testing passes, this database is copied over to DEV.
DEV --> This is the "official" development copy of the database. INT DEV is refreshed regularly with a copy of DEV. Developers working on new refactorings get a fresh copy of the database from DEV.
INT QA --> Same idea as INT DEV except for the QA team. When integration tests pass here, this database is copied over to QA and to DEV*.
QA
INT PROD --> Same idea as INT QA except for production. When integration tests pass here, this database is copied over to PROD, QA*, and DEV*
PROD
*When copying databases across DEV/QA/PROD lines, you will also need to run scripts to update test data relevant to the particular environment (e.g. setting up users in QA that the QA team uses to test but that don't exist in production).
One possible solution is to look into implementing DML auditing on your test databases, then just rolling those audit logs into a script for final testing and live deployment. SQL Server 2008 significantly improves on DML auditing, but even SQL Server 2005 supports it via triggers.
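As a rough illustration of trigger-based auditing that can be rolled into a script, here is a small Python/SQLite sketch. SQLite stands in for SQL Server here, and the `Settings` and `AuditLog` tables, the trigger names and `export_replay_script` are hypothetical. The update trigger also assumes that only the `Value` column changes.

```python
import sqlite3

# Triggers record each change as a ready-to-run SQL statement.
# quote() escapes string values so the logged statement stays valid.
AUDITED_SCHEMA = """
CREATE TABLE Settings (Name TEXT PRIMARY KEY, Value TEXT);
CREATE TABLE AuditLog (
    Id INTEGER PRIMARY KEY AUTOINCREMENT,
    Replay TEXT NOT NULL
);
CREATE TRIGGER Settings_insert AFTER INSERT ON Settings
BEGIN
    INSERT INTO AuditLog (Replay)
    VALUES ('INSERT INTO Settings (Name, Value) VALUES ('
            || quote(NEW.Name) || ', ' || quote(NEW.Value) || ');');
END;
CREATE TRIGGER Settings_update AFTER UPDATE ON Settings
BEGIN
    INSERT INTO AuditLog (Replay)
    VALUES ('UPDATE Settings SET Value = ' || quote(NEW.Value)
            || ' WHERE Name = ' || quote(OLD.Name) || ';');
END;
"""


def export_replay_script(conn):
    """Concatenate the logged statements in the order they happened."""
    rows = conn.execute("SELECT Replay FROM AuditLog ORDER BY Id")
    return "\n".join(row[0] for row in rows)
```

The exported text can then be reviewed and run against another copy of the database to reproduce the same data changes.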
There are a bunch of links in these posts that I'll want to follow up on (I "rolled my own" system years ago, have to see if there are similarities). One thing you will need, and that I hope is mentioned in these links, is discipline. I don't quite see how any automated system can work if anyone can change anything at any time. (Your question implies that this can happen on your production systems, but obviously that can't be true.)
Having one person (the fabled "database administrator") dedicated to the task of managing changes to databases, particularly production databases, is a very common solution. As for maintaining consistency across X development and testing databases: if it/they are used by many users, once again you are best served by having an individual act as a "clearing house" for changes; if everyone has their own database instance, then they're responsible for keeping it in order, and having a central consistent database "source" will be critical when they need a refreshed baseline database.
Here's a recent Stack Overflow post that may be of interest: how-to-refresh-a-test-instance-of-sql-server-with-production-data-without-using
Red Gate has a paper describing how to achieve build automation: http://downloads.red-gate.com/HelpPDF/ContinuousIntegrationForDatabasesUsingRedGateSQLTools.pdf
This is built around SQL Source Control, which integrates with SSMS and your existing source control system.
I've written a .NET based tool to handle database versioning in an automated fashion. We have been using this tool in production to handle rolling out database updates (including patches) to multiple environments, keep a log in each database of which scripts have been run, and do it all in an automated fashion. It has a command-line console so you can create batch scripts which use this tool. Check it out: https://github.com/bmontgomery/DatabaseVersioning
For what it's worth, this is a real example of a simple, low cost approach used by my former employer (and which I am trying to impress on my current employer as a basic first step).
Add a table called 'DB_VERSION' or similar. In EVERY upgrade script, add a row to that table, which can include as few or as many columns as you see fit to describe the upgrade, but at a minimum I would suggest { VERSION, EXECUTION_DATE, DESCRIPTION, EXECUTION_USER }. Now you have a concrete record of what has been going on. If someone runs their own unauthorised script you'd still need to follow the advice of the answers above, but this is just a simple way of dramatically improving on your existing versioning control (i.e. none).
Now let's say you have an upgrade script from v2.1 to v2.2 of the database and you want to verify whether the lone maverick guy has actually run it on his database. You can just search for rows where VERSION = 'v2.2', and if you get a result, don't run this upgrade script. This can be built into a console utility app if necessary.
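A console utility along those lines could be as small as the following Python/SQLite sketch. The column names come from the suggestion above; the function name `run_upgrade` and its parameters are hypothetical, and the caller is assumed to pass in the user name.

```python
import sqlite3


def run_upgrade(conn, version, description, statements, user="unknown"):
    """Run an upgrade once and record it in DB_VERSION.

    Returns False without touching the schema when the version has
    already been recorded, True after a successful run.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS DB_VERSION ("
        "VERSION TEXT PRIMARY KEY, "
        "EXECUTION_DATE TEXT NOT NULL, "
        "DESCRIPTION TEXT, "
        "EXECUTION_USER TEXT)"
    )
    seen = conn.execute(
        "SELECT 1 FROM DB_VERSION WHERE VERSION = ?", (version,)
    ).fetchone()
    if seen:
        return False
    for sql in statements:
        conn.execute(sql)
    conn.execute(
        "INSERT INTO DB_VERSION VALUES (?, datetime('now'), ?, ?)",
        (version, description, user),
    )
    conn.commit()
    return True
```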
Related
I have to design a system with an SQL database whose job is to pull data from different databases, which may be on other platforms such as MySQL or Oracle. The system will then map the attributes of each source database to my database schema and store them.
Example reference link: https://msdn.microsoft.com/en-us/library/aa728893(v=vs.71).aspx
Since I am new, I can't attach images which is why I am providing links.
All my searches end up with getting the mapping tools but what actually I want is how to create that tool myself.
I am not a professional but a little push will be enough for me and highly appreciated. Thanks in advance.
This, as I said in the comment, sounds to me a job for an Integration Services package.
If you would like to use Microsoft SQL Server Integration Services you should first have Microsoft SQL Server Data Tools installed on your development machine.
Afterwards you start by creating a new Integration Services project inside Visual Studio. Then you can add an ODBC Connection Manager to manage the data input from your different databases. After that you can add different transformation container objects to your package to transform the data as you need it. At the end you need to direct the output of all those elements into the database where you want to store the information that you collect from the other sources.
You can also create a separate package for every source database you have so that the tasks can be separated. Unfortunately a complete tutorial is too long for me to post here, but you can check out the tutorial on the Microsoft web site. Another example here.
As a warning, you should be really careful with data types, because if you don't match/convert them correctly the package will fail with not-so-obvious errors.
If you choose the .tt (T4 template) solution, in which you create the application yourself, then you should start by connecting to the source databases and looping through the table definitions to get the columns, then store them as an XML file. You will have to do the matching inside the text template file so that the matching is already done when the table is read from the data source.
Here is an example that should get you started. Note that in the example the output file will be a .cs file, not .xml, but you can configure that very easily with this T4 directive: <#@ output extension=".xml" #>.
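Independently of T4, the "loop through the table definitions and store the columns as XML" step might look like the sketch below. It is written in Python against SQLite purely so that it is runnable as-is; a real source such as MySQL or Oracle would need its own catalog query. The element and attribute names and the function `schema_to_xml` are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET


def schema_to_xml(conn):
    """Describe every user table and its columns as an XML string."""
    root = ET.Element("database")
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        table_el = ET.SubElement(root, "table", name=table)
        # table_info rows: (cid, name, type, notnull, default, pk)
        for _cid, column, declared_type, notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            ET.SubElement(
                table_el,
                "column",
                name=column,
                type=declared_type,
                nullable=str(not notnull).lower(),
            )
    return ET.tostring(root, encoding="unicode")
```

The resulting XML is the raw material for the mapping step: each `column` element of a source table can be matched against the corresponding column in the target schema.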
Please ignore any form of programming for this task. Do not do it. Especially ignore Entity Framework.
I would recommend that you take one of the many ETL tools out there and design the work with it. I can recommend Talend. It is very easy to learn, and in time even a developer starts to work with it.
The best part about it is that it can connect to web services, enterprise solutions and probably any imaginable database out there.
What you do is design jobs that run on their own in parallel, and then you export the jobs as standalone JARs. A cron job or a scheduled service completes the process by executing them periodically.
I'm currently working on a database that comes with a legacy project which uses EntityFramework (updates code based on existing database using Data Model Designer)
Currently I work on the master copy and our developers work locally using SQL Server merge-replications on their local PC.
The issue here is that we recently started doing some change work that modifies the database schema, so when we use schema comparison (the Visual Studio SQL compare feature), there is a huge number of replication stored procedure and schema changes, and if I apply the update it will basically corrupt the live database. So my current solution is to remove the dev server replication (so that the schema goes back to what it should look like without the replication changes), then do the schema compare and update, and then create a new merge replication again so our developers can continue working on the dev db.
I thought it was just a one-off db schema change, but I just realized there will be continuous changes for at least the next 3-6 months, which basically makes each release a big headache (if it can even be called 'release' prep...).
My SQL & EntityFramework knowledge is limited, can anyone shed some light on this for me please?
Thanks in advance!
What's the observed need behind merge replication in the dev environment? I understand the need for devs to have a local copy they can mess with, run tests against etc, but I'm lost on why a full Publisher-Subscriber model is needed to synchronize DB state in a dev/test environment, and it seems to be causing you more problems than it may solve given the schema is going to be malleable for a few months.
If merge replication is not a hard requirement for the dev environment, I would suggest you replace it with an alternate method of distributing changes to the local copies. If the devs are working with a full copy of the DB anyway, I see no reason not to write a script that backs up the master copy on the dev server, then pulls that file down and restores it locally. Then, changes to that schema would be accomplished with change scripts, which can be run and tested locally before being applied to the master DB, then distributed on-demand with another run of the backup/restore script.
It's a slightly more manual process and an older way to work with DBs, but it seems far more palatable to me than breaking and re-establishing replication regularly. It'll require some collaboration to make sure devs aren't trying to make a backup at the same time or making conflicting changes to local copies that will blow up on the master copy; your devs ideally should be talking to each other anyway about this kind of thing, and you might make the script smart enough to look for a recent backup before generating another.
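The "look for a recent backup before generating another" part of such a script could be sketched like this in Python. The `.bak` extension, the function names and the `create_backup` callback (which would wrap whatever actually performs the SQL Server backup) are assumptions for illustration.

```python
import time
from pathlib import Path


def latest_backup(backup_dir, max_age_seconds, now=None):
    """Return the newest .bak file if it is recent enough, else None."""
    now = time.time() if now is None else now
    backups = sorted(
        Path(backup_dir).glob("*.bak"), key=lambda p: p.stat().st_mtime
    )
    if backups and now - backups[-1].stat().st_mtime <= max_age_seconds:
        return backups[-1]
    return None


def get_backup(backup_dir, max_age_seconds, create_backup):
    """Reuse a recent backup, or call create_backup(directory) for a new one.

    create_backup is a hypothetical callback that performs the real backup
    and returns the path of the file it wrote.
    """
    recent = latest_backup(backup_dir, max_age_seconds)
    if recent is not None:
        return recent
    return create_backup(Path(backup_dir))
```

With a threshold of, say, one hour, two developers refreshing their local copies a few minutes apart would share a single backup of the master copy instead of triggering two.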
One more thought, don't know how feasible it is given your progress to date; it's not impossible to switch from DB-First to Code-First. The conversion is basically a hybrid process of Database First and Code First; the DB is reverse-engineered as a one-time operation to generate a model similar to DB First, but instead of EDMX files, the model is written out to source code files, and changes to those model files or to mapping conventions on the context can then be aggregated and applied to the schema as migrations in typical Code First style. Assuming you prepare the live DB for migrations as well (and have the live DB in the same state as the master Dev DB prior to the model generation), this even removes the requirement of a SQL compare and update; you just apply the migrations to the live DB, same as you would to any Dev instance. The only potential gotcha is that some migrations can be written destructively, so you have to make sure what you're about to apply isn't going to clear out all the fields in a renamed column.
In my Application I have used Entity Framework Database First approach.
Currently my application is in the Dev environment; now it needs to be moved into the Test environment and later into the Production environment.
So is there any way that I can use a .NET feature or an Entity Framework feature to migrate/create the database in the Test environment, other than using the SQL feature of restoring the database?
Also note that if any enhancement comes along, then the database structure can change and table schemas can change.
So can you suggest the best way to easily migrate the database schema across different environments without losing existing data?
If you want to take advantage of EF-Migrations feature, you must convert your application to Code First with Existing Database http://msdn.microsoft.com/en-us/data/jj200620.aspx
If you are unable to convert to code first then you must create the update script by hand.
Use a schema compare tool, compare the development and production server.
For each difference found, create an update query.
Once the entire script is finished, test it on the staging server.
Automating the migration is very risky; it depends on the type and size of the changes you made to the schema. You can't trust any single feature or tool, especially if the changes require data motion (moving data around).
The following links might help you:
How to do Migrations in DB first approach
EF Migrations for Database-first approach?
With Database First, the easiest way to copy a schema is to extract a data tier application in management studio, create an empty database on the target, register it as a data tier application with the same name, and upgrade the empty database using the upgraded file. You can repeat this step to manage schema changes.
Having said that, going forward you're really better off switching your Database First to Code First as it will make change management across your deployments much easier.
Migrations are the best way to deal with it.
The preferred way to update a production db is to first generate a SQL file and then run that file in the production environment.
MS has a very good article on this:
http://msdn.microsoft.com/en-in/data/jj591621.aspx#script
I wonder what you are using for updating a client database when your program is patched?
Let's take a look at this scenario:
You have a desktop application (.net, entity framework) which is using sql server compact database.
You release a new version of your application which is using extended database.
The user downloads a patch with modified files
How do you update the database?
I wonder how you handle this process. I have some ideas, but I think more experienced people can give me better, tried-and-tested solutions or advice.
You need a migration framework.
There are existing OSS libraries like FluentMigrator
project page
wiki
long "Getting started" blogpost
Entity Framework Code First will also get its own migration framework, but it's still in beta:
Code First Migrations: Beta 1 Released
Code First Migrations: Beta 1 ‘No-Magic’ Walkthrough
Code First Migrations: Beta 1 ‘With-Magic’ Walkthrough (Automatic Migrations)
You need to provide a DB upgrade mechanism in your code, either explicit or hidden, and thus implement something like a DB versioning chain.
There are a couple of aspects to it.
First is versioning. You need some way of tying the version of the db to the version of the program; it could be something as simple as a table with a version number in it. You need to check it when the application starts as well.
One fun scenario is that you 'update' the application and db successfully, and then for some operational reason the customer restores a previous version of the db. Or, if you are on a frequent patch cycle, do they have to apply each patch in order, or can they catch up in one go? Do you want to deal with application-only or database-only upgrades differently?
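One way to frame the startup check, as a sketch only: compare the version stored in the db with the schema version the application was built for, and decide whether to run, upgrade or refuse. The names below are hypothetical, versions are assumed to be plain integers, and refusing to run against a newer db is just one possible policy.

```python
from enum import Enum


class StartupAction(Enum):
    RUN = "run"          # db and application agree
    UPGRADE = "upgrade"  # db is older, e.g. a restored backup
    REFUSE = "refuse"    # db is newer than this application understands


def startup_action(db_version, app_schema_version):
    """Decide what the application should do when it starts."""
    if db_version == app_schema_version:
        return StartupAction.RUN
    if db_version < app_schema_version:
        return StartupAction.UPGRADE
    return StartupAction.REFUSE


def pending_patches(db_version, app_schema_version):
    """List the patch versions to apply, in order, to catch up."""
    return list(range(db_version + 1, app_schema_version + 1))
```

A customer who restores a version 3 db under a version 5 application would get `UPGRADE` with patches 4 and 5 pending, so the restore scenario is handled by the same path as a normal update.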
There's no one right way for this, you have to look at what sort of changes you make, and what level of complexity you are prepared to maintain in order to cope with everything that could go wrong.
A couple of things worth looking at.
Two databases, one for static 'read-only' data, and one for more dynamic stuff. Upgrading the static data can then simply be a restore from a resource within the upgrade package.
The other is how much you can do with meta-data stored in db tables. For instance, a version-based XSD to describe your objects instead of a concrete class. That goes in your read-only db, and now you've updated code and application with a restore and possibly some transforms.
Lots of ways to go, just remember
'users' will always find some way of making you look like an eejit, by doing something you never thought they would.
The more complex you make the system, the more chance of the above.
And last but not least, don't take short cuts on data version conversions, if you lose data integrity, everything else you do will be wasted.
My company provides a large .NET service-oriented solution. The services layer interact with a T-SQL back-end consisting of hundreds of tables and stored procedures. Our C# code is in version-control (SVN) but our stored procedures and schema are not.
After much lobbying of expedient upper-management, I was allowed to review our (non-existent) build/deployment process to accomplish the following goals:
Place schema and stored procedures under source-control.
Automate the build/deployment process.
I would like to proceed per the accepted answer's strategy in this post but have additional questions:
I would like to use Hudson as my build server. Is this a reasonable choice for a C#/SQL solution? What better alternatives should I explore?
Assuming I have all triggers, stored-procedures, schema, etc... under source control, and that they are scripted to individual files, how do I generate a build script which will take into account dependencies/references between these items? (SQL Server does this automatically, but it generates one giant script)
What does the workflow of performing an update at the client look like? i.e. I have to keep existing table data. How do I roll-back schema changes?
I am the only programmer. Several other pseudo-technical staff like to make changes directly inside SQL Management Studio. Is it realistic to expect others to adhere to this solution -- how can I enforce this?
Thank you in advance for your help.
Edit:
Unfortunately we will not be able to use TFS. We do have Visual Studio 2008/2010 with the Database Project components available, though, so it looks like I'll have to hack together a script-based solution. Any suggestions/updates are appreciated.
The canonical example on the Microsoft stack for T-SQL deployment is the Visual Studio Database Project deployment process. In this process, your database schema, procedures, rights assignments and pretty much everything else are stored as pieces of a VSDB project, which means that they are stored as SQL definition files and checked into source control (SVN is fine). The 'build' process delivers a .dbschema file, which contains a synthesis of the entire VSDB project (it is a glorified XML file). The .dbschema file is then shipped to the deployment server (development server, QA validation server, even production server) and 'deployed'. Deployment is done via the vsdbcmd tool, which runs a sophisticated diff between the deployment server and the .dbschema file and 'aligns' the server to the content of the .dbschema file, using CREATE/ALTER/DROP statements as appropriate, based on what exists in the target database/server.
A continuously integrated process would start a nightly build, drop the .dbschema along with the other deliverables on the test SQL server, deploy the .dbschema, then run build validation tests, and if all is good in the end it will drop a fully built and QA-validated deliverable, the daily 'drop'. Full integration all the way to deployment into production is possible, but usually avoided due to the risk of unexpected downtime on the central production server. However, full integration and deployment into production is usually the norm for multi-server environments, where 'production' means hundreds/thousands of deployed servers.
Now you say that you want to deploy using Hudson, which is all good, except that you have to recreate everything I describe in the steps above as Ant build steps, and you'll spend the next 10 years reinventing the VS DB project concepts, like the .dbschema file and tools like vsdbcmd. I'm not the one who can make the call to invest in buying a VSDB and TFS based build server license, but I'm saying that I'm not aware of an end-to-end solution available in OSS. With VS 2010, the Database Projects are in Standard Edition, I believe. With VS 2008 you'd need the high-end license.
As for users making changes riding shotgun from SSMS: you can prevent them using DDL triggers, you can track them using Event Notifications, or you can fully audit them using C2-compliant audit.