Stored Procedures In Source Control - Automate Build/Deployment Process

Stored Procedures In Source Control - Automate Build/Deployment Process - c#

My company provides a large .NET service-oriented solution. The services layer interact with a T-SQL back-end consisting of hundreds of tables and stored procedures. Our C# code is in version-control (SVN) but our stored procedures and schema are not.
After much lobbying of expedient upper-management, I was allowed to review our (non-existent) build/deployment process to accomplish the following goals:
Place schema and stored procedures under source-control.
Automate the build/deployment process.
I would like to proceed per the accepted answer's strategy in this post but have additional questions:
I would like to use Hudson as my build server. Is this a reasonable choice for a C#/SQL solution? What better alternatives should I explore?
Assuming I have all triggers, stored-procedures, schema, etc... under source control, and that they are scripted to individual files, how do I generate a build script which will take into account dependencies/references between these items? (SQL Server does this automatically, but it generates one giant script)
What does the workflow of performing an update at the client look like? i.e. I have to keep existing table data. How do I roll-back schema changes?
I am the only programmer. Several other pseudo-technical staff like to make changes directly inside SQL Management Studio. Is it realistic to expect others to adhere to this solution -- how can I enforce this?
Thank you in advance for your help.
Edit:
Unfortunately we will not be able to use TFS. We do have Visual Studio 2008/2010 with the Database Project components available, though, so it looks like I'll have to hack together a script-based solution. Any suggestions/updates are appreciated..

The canonical example on the Microsoft stack for T-SQL deployment is the Visual Studio Database Project deployment process. In this process, your database schema, procedures, right assignment and pretty much all else are stored as pieces of a VSDB project, which means that they are stored as SQL definition files, and checked under source control (SVN is fine). The 'build' process delivers a .dbschema file, which is a file that contains a synthesis of the entire VSDB project (is a glorified XML file). The .dbschema file is then shipped into the deployment server (development server, QA validation server, even produciton server) and 'deployed'. Deployment is done via the vsdbcmd tool which will run a sophisticated diff between the deployment server and the .dbschema file and 'align' the server to the content of the .dbschema file, using CREATE/ALTER/DROP statements as appropriate, based on what exists in the target database/server.
A contiguously integrated process would start a nightly build, drop the .dbschema along with other deliverable on the test SQL server, deploy the .dbschema, then run build validation tests, and if all good in the end will drop a fully build and QA validated deliverable, the daily 'drop'. Fully integration all the way to deployment into production is possible, but usually avoided due to risk of unexpected downtime on the central, production, server. However, fully integration and deployment into production is usually the norm for multi-server environments, where 'production' means hundreds/thousands of deployed servers.
Now you say that you want to deploy using Hudson, which is all good, except that you have to recreate everything I describe in the steps above as Ants build steps and you'll spend the next 10 years reinventing the VS DB project concepts, like a .dbschema file and a tools like vsdbcmd. I'm not the one that can make the call to invest into buying a VSDB and TFS based build server license, but I'm saying that I'm not aware of an end-to-end solution available in OSS. With VS 2010, the Database Projects are in Standard Edition, I believe. With VS 2008 you'd need the high end license.
As of users doing changes riding shot-gun from SSMS: you can prevent them using DDL triggers, you can track them using Event Notifications, or you can fully audit them using C2 compliant audit.

Related

Data Mapping between two different Database Schemas

I have to design a system with an sql database whose work is to get data from different databases which may be in another database such as mysql or oracle etc. Then the system will map the attributes of that database with my database schema and store them..
Example reference link: https://msdn.microsoft.com/en-us/library/aa728893(v=vs.71).aspx
Since I am new, I can't attach images which is why I am providing links.
All my searches end up with getting the mapping tools but what actually I want is how to create that tool myself.
I am not a professional but a little push will be enough for me and highly appreciated. Thanks in advance.

This, as I said in the comment, sounds to me a job for an Integration Services package.
If you would like to use Microsoft SQL Server Integration Services you should first have Microsoft SQL Server Data Tools installed on your development machine.
Afterwards you start by creating a new Integration Service project inside Visual Studio. Then you can add an ODBC Connection Manager to manage your different databases input of data. After that you can add a different transformation container objects in your package to transform the data as you need it. At the end you need to specify the output of all those elements into your database where you want to store the information that you collect from other sources.
You can also create a different package for every source database you have so that the tasks can be separated. Unfortunately a complete tutorial is very long for me to post here but you can check out the tutorial on Microsoft web site. An other example here.
As as warning you should be really careful with data types because if you don't match/convert them correctly the package will fail with not so obvious errors.
If you choose the .tt(T4 Template) solution in which you create the application then you should start by connecting to the sources database and loop through tables definitions to get the columns and then store them as a xml file. The matching you will have to do it inside the text template file so that you have the matching already done when the table is read from the data source.
Here is an example that should get you started. Note that in the example the output file will be a .cs file not .xml but you can configure that very easily with this T4 directive <## output extension=".xml" #>.

Please, ignore for this task, any form of programming, Do not do it. Especially ignore Entity framework.
I would recommend you to take one of the many ETL tools out there, and design the work with it. I can recommend you Talend . It is very easy to learn and in time, even a developer starts to work with it.
The best part about it that it can connect to webservices, enterprise solutions and probably any imaginable Database out there.
What you do is to design Jobs that run on its own in parallel, and then you export the jobs as standalone JARs. A chroneJob, or a scheduled service completes the process to have it executed periodically

MS-SQL - Is it possible to manage the code of sql agent jobs using the Visual Studio SSDT plugin?

I have jobs in my DB held at server_agent/jobs.
Is there a way to manage them using ssdt plugin in Visual Studio 2012?
(Being able to Compare/Update)
Thanks.

At the moment, there's no good way to handle this. Jobs are made up of data stored in the tables in msdb. The best way to handle it would be either to script out each job into its own file, then call those from one larger file or to make a single large file of all scripts for the jobs.
I'd probably look into scripting out each job into its own file and calling them from a larger script. It would be more manageable in the long term and you could easily comment out a single job by commenting it out of the main script.
There's no way to compare/update the jobs directly, though you could possibly use a Data Compare tool to check the various job tables in msdb against some master copy of the data. SQL Data Compare from Red-Gate is probably your best option for that as SSDT does not include data compare functionality at this time.
That being said, there are rumors that MS is working on some data-specific components for SSDT, but nothing official has been said about this capability.

While there is no built-in support for managing SQL Agent Jobs directly in SSDT you can get deployment functionality using the Post-Deployment scripts
Granted this is not your question (sounds like you want to import, as well as compare, change)
However, you can put the "sql" of the job under source control, and deploy it (upsert) using either publish or indirectly with a built DACPAC
The gist is:
create a sql script which calls the msdb sp_xxx_job Stored Procedures
sp_add_job
sp_add_jobstep
sp_add_jobschedule
when using a script marked as a PostDeployment script invoke the job generation script
:r .\Jobs\MyPHATjob.sql

Database Deployment Practices

I have deployed plenty of software to my clients. Mostly are Window Forms applications.
Here is my current practice.
Manually install SQLExpress and SQL Management Studio to each client PC.
Then use ClickOne to install the code from the server.
When there is a changes in code, I will use ClickOne to deploy -(NO PROBLEM with this step)
But when there is a change in a database column, what do I do?
I have even tried writing a database update script. Each time the program starts, it will read through the .sql update file and run them if the database exists. This solves the problem of updating the database columns, but it does not help in my DEBUGGING work when my customer complain there is a wrong data. At that point, I have to personally go to their site to check it out.
I find it difficult to have the database installed on the client PC as it make my debugging work very very difficult. I am thinking about moving my client database to a host on an Online server. But that then comes with these constraints:
What if the internet is down?
What if my customer has no internet?
Could you help to advise me? Is this a common problem faced by developer? What is the common practice out there? Does Window Azure or SQL CE help?

Depending on the data I would recommend using SQL CE.
If the data isn't too much, speed is not the primary goal (CE is slower than Express) and you don't need DB-Features not supported by CE (e.g. stored procedures) it is the better choice IMHO, because:
The client does not need to install a full SQL server (easier installation/deployment)
You do not have problems with multiple SQLExpress instances
Your SW doesn't need to worry if there even is a SQL instance
Less resources used on the client side
Additionally the clients could send you their SQL CE DB-File for inspection and you do not need to go to their site.
It is also relativly easy to implement an off site sync with SQL CE and MS Sync FW.

Installing one database per client PC can be tricky. I think you have a decent handle on how to deal with the issue currently. It seems like the real issue you are currently facing is debugging. To deal with this, there are a couple ways you could go:
Have the customer upload their copy of the database back to you. This would provide you with the data they have and you could use it with a debug copy of your code to identify the issues. The downside is that if the database is large it might be an issue transferring it.
Remote onto the customer's machine. Observe the system remotely using something like CoPilot. That way you could see what is happening in its natural environment.
There are probably other ways, but these are a couple of good ones. As for using an online database, this is an option but it brings its own set of issues with it. You mentioned a couple. As for Azure, that is cloud-based (online) so the same issues will apply. SQL CE won't help you any more than your current installation does.
Bottom line is that I would recommend you look into the ways to fix your one issue (as listed above) instead of creating a whole new set of issues by moving to an Internet-based solution. I would only recommend moving to the Internet if it was addressing a larger business need (for example, mobility). Doing the same thing you have been doing only online will probably just make life harder.
To recap the comments below since they are so pertinent to the issue, if you are choosing between file-based databases that don't need to be physically installed on the machine, your best choices are probably between SQLite and SQL CE. Microsoft supports SQL CE better but it is a larger package and has less features than the trim SQLite. Here is a good discussion on the differences:
https://stackoverflow.com/questions/2278104/sql-ce-sqlite-what-are-the-differences-between-them
However, the issue gets more complicated when you start looking at linq2sql since that is designed for SQL server. Microsoft does not support SQL CE with linq2sql out of the box, although there is a work-around that will get it to work:
http://pietschsoft.com/post/2009/01/Using-LINQ-to-SQL-with-SQL-Server-Compact-Edition.aspx
SQLite is not supported at all with linq2sql but there is a way to use linq to talk with SQLite:
LINQ with SQLite (linqtosql)
This library also supports other common databases including MySQL and Firebird.

You could use the SQLCMD utility to execute the change script, as mentioned in this related question

Best means to store data locally when offline

I am in the midst of writing a small program (more to experiment with vs 2010 than anything else)
Despite being an experiment it has some practical use for our local athletics club.
My thought was to access the DB (currently online) to download the current members and store locally on a laptop (this is a MS sql table, used to power the club's website).
Take the laptop to the event (yes there ARE places that don't have internet coverage), add members to that days race (also a row from a sql table (though no changes would be made to this), record results (new records in 3rd table)
Once home, showered and within internet access again, upload/edit the tables as per the race results/member changes etc.
So I was thinking I'd do something like write xml files locally with the data, including a field to indicate changes etc?
If anyone can point me in a direction I would appreciate it...hell if anyone could tell me if this has a name, I'd appreciate it.

Essentially what you need is, in addition to your remote data store, a local data store on your desktop. You could then write your code by hand to sync the data stores when you go offline / online, or you could use the Microsoft Sync framework to handle it for you.
I've personally used the Sync framework on a number of projects and once you get used to the conventions, it's pretty easy to use.

If a local storage format is what your after. SQLite is one option. You can copy your tables from the server to your local SQLite db.
You could also save your data to files, but XML is a horrible format for doing this. You'll probably want to use YAML or JSON instead.

You may want to take a look at SQL Server Compact -- it provides some decent capabilities with synchronizing back with the mothership SQL server.

If you're using MS SQL Server for production, and you only need to work offline on your personal computer, you could install MS SQL Server Express locally. The advantage here over using a different local datastore is that you can reuse your schema, stored procedures, etc. essentially only needing to change the connection string to your application (which you could run locally too through Visual Studio). You would have to write code to manually sync your online and offline db instances, but since it's a small application, it may be reasonable to just copy the entire database from production to local and then from local to production when you get home (assuming you're the only one updating the db, and wouldn't be potentially wiping out any new records entered in production while you were at the event).

Google Gears http://gears.google.com/ is intended if your app is a web app (which I didn't quite get what it is from your description)

Database Deployment Strategies (SQL Server)

I am looking for a way to do daily deployments and keep the database scripts in line with releases.
Currently, we have a fairly decent way of deploying our source, we have unit code coverage, continuous integration and rollback procedures.
The problem is keeping the database scripts in line with a release. Everyone seems to try the script out on the test database then run them on live, when the ORM mappings are updated (that is, the changes goes live) then it picks up the new column.
The first problem is that none of the scripts HAVE to be written anywhere, generally everyone "attempts" to put them into a Subversion folder but some of the lazier people just run the script on live and most of the time no one knows who has done what to the database.
The second issue is that we have 4 test databases and they are ALWAYS out of line and the only way to truly line them back up is to do a restore from the live database.
I am a big believer that a process like this needs to be simple, straightforward and easy to use in order to help a developer, not hinder them.
What I am looking for are techniques/ideas that make it EASY for the developer to want to record their database scripts so they can be ran as part of a release procedure. A process that the developer would want to follow.
Any stories, use cases or even a link would helpful.

For this very problem I chose to use a migration tool: Migratordotnet.
With migrations (in any tool) you have a simple class used to perform your changes and undo them. Here's an example:
[Migration(62)]
public class _62_add_date_created_column : Migration
{
public void Up()
{
//add it nullable
Database.AddColumn("Customers", new Column("DateCreated", DateTime) );
//seed it with data
Database.Execute("update Customers set DateCreated = getdate()");
//add not-null constraint
Database.AddNotNullConstraint("Customers", "DateCreated");
}
public void Down()
{
Database.RemoveColumn("Customers", "DateCreated");
}
}
This example shows how you can handle volatile updates, like adding a new not-null column to a table that has existing data. This can be automated easily, and you can easily go up and down between versions.
This has been a really valuable addition to our build, and has streamlined the process immensely.
I posted a comparison of the various migration frameworks in .NET here: http://benscheirman.com/2008/06/net-database-migration-tool-roundup

Read K.Scott Allen's series of posts on database versioning.
We built a tool for applying database scripts in a controlled manner based on the techniques he describes and it works well.
This could then be used as part of the continuous integration process with each test database having changes deployed to it when a commit is made to the URL you keep the database upgrade scripts in. I'd suggest having a baseline script and upgrade scripts so that you can always run a sequence of scripts to get a database from it's current version to the new state that is needed.
This does still require some process and discipline from the developers though (all changes need to be rolled into a new version of the base install script and a patch script).

We've been using SQL Compare from RedGate for a few years now:
http://www.red-gate.com/products/index.htm
The pro version has a command line interface that you could probably use to setup your deployment procedures.

We use a modified version of the database versioning described by K. Scott Allen. We use the Database Publishing Wizard to create the original baseline script. Then a custom C# tool based on SQL SMO to dump the stored procedures, views and user functions. Change scripts which contain schema and data changes are generated by Red Gate tools. So we end up with a structure like
Database\
ObjectScripts\ - contains stored procs, views and user funcs 1-per file
\baseline.sql - database snapshot which includes tables and data
\sc.01.00.0001.sql - incremental change scripts
\sc.01.00.0002.sql
\sc.01.00.0003.sql
The custom tool creates the database if necessary, applies the baseline.sql if necessary, adds a SchemaChanges table if necessary and applies the change scripts as necessary based on what's in the SchemaChanges table. That process occurs as part of a nant build script each time we do a deployment build via cc.net.
If anyone wants the source code to the schemachanger app I can throw it up on codeplex/google or wherever.

If you are talking about trying to keep database schemas in sync, try using Red Gate SQL Comparison SDK. Build a temp database based on a create script (newDb) - this is what you want your database to look like. Compare newDb against your old database (oldDb). Get a change set from that comparison and apply it using Red Gate. You could build this upgrade process into you tests, and you can try and get all the devs to agree that there is one place where the create script for the database is kept. This same practice works well for upgrading your database across several versions and running data migration scripts and processes between each step (using an XML doc to map the create and data migration scripts)
Edit: With Red Gate technique, you only are concerned with create scripts, not upgrade scripts since Red Gate comes up with the upgrade script. It will also let you drop and create indexes, stored procedures, functions, etc.

Go here:
https://blog.codinghorror.com/get-your-database-under-version-control/
Scroll down a bit to the list of 5 links to the odetocode.com website. Fantastic five-part series. I would use that as a starting point to get ideas and figure out a process that will work for your team.

You should consider using a build tool like MSBuild or NAnt. We use a combination of CruiseControl.NET, NAnt, and SourceGear Fortress to handle our deployments, including SQL objects. The NAnt db build task calls sqlcmd.exe to update scripts in our dev and staging environments after they're checked into Fortress.

We use Visual Studio for Database Professionals and TFS to version and manage our database deployments. This allows us to treat our databases just like code (check out, check in, lock, view version history, branch, build, deploy, test, etc.) and even include them in the same solution files if we wish.
Our developers can work on local databases to avoid stepping on each other's changes in a shared environment. When they check database changes into TFS, we have continuous integration to build, test and deploy to our integrated dev environment. We have separate builds on release branches to create differential deployment scripts for each subsequent environment.
Later, if a bug is discovered in a release, we can go to a release branch and hotfix the code and database at the same time.
This is a great product, but its adoption was hindered early on due to a Microsoft marketing blunder. It was originally a separate product under Team System. This meant in order to use features of the developer edition and database edition at the same time, you were required to step up to the much more expensive Team Suite edition. We (and many other customers) gave Microsoft grief about this, and we were very happy they announced this year that DB Pro has been folded into the developer edition, and that immediately anyone licensed with developer edition can install the database edition.

Gus off-handedly mentioned DB Ghost (above) – I second it as a potential solution.
A brief overview of how my company is using DB Ghost:
After the schema for a new DB has been reasonably settled during initial development, we use the DB Ghost 'Data and Schema Scripter' to create script (.sql) files for all the DB objects (and any static data) and we check-in these script files into source control (the tool separates the objects into folders such as 'Stored Procedures', 'Tables', etc.). At this point, we can use either of the DB GHost 'Packager' or 'Packager Plus' tools to create a stand-alone executable to create a new DB from these scripts.
All changes to the DB schema are checked-in to source by check-ins to the specific script files.
At anytime we can use the packager to create an executable to either (a) create a new DB or (b) update an existing DB. Some customization is required for certain path-dependent changes (e.g. changes that require data to be updated), but we have pre-update and post-update scripts that are run.
The 'update' process involves the creation of a clean 'source' DB and then (after pre-update custom scripts), a comparison between the schemas of the source DB and the target DB. DB Ghost updates the target DB to match
We routinely make changes to production DBs (we have 14 customers in 7 different production environments) but inevitably deploy a large-enough set of changes with a DB Ghost update executable (created during our build process). Any production changes that were not checked-in to source (or that were not checked-in to the appropriate branch being released) are LOST. This has forced everyone to check-in changes consistently.
To summarize:
If you enforce a policy that all DB updates be deployed using a DB Ghost update executable, you can 'force' developers to consistently check-in their changes, regardless of whether they are deployed manually in the interim.
Adding a step (or steps) to your build process to create a DB Ghost update executable will in-effect perform a test to verify that a DB can be created from scripts (i.e. because DB Ghost creates a 'source' DB, even when creating the update executable package) and if you add a step (or steps) to execute the update package [on any of the four test DBs you mentioned], you can keep your test DBs in line with source.
There are some caveats and some limitations in what changes are 'easily' deployed with this tool (really a suite of related tools), but they are all fairly minor (at least for my company):
Renaming objects must be done in one of the custom scripts
The entire DB is always updated (e.g. objects in a single schema can't be updated alone) making it difficult to support customer-specific objects in the main application DB

The book Refactoring Databases addresses many of these issues at a conceptual level.
As far as tools go, I know that DB Ghost works well for SQL Server. I have heard that the Data Dude edition of Visual Studio has really been imporved upon in the latest release but I don't have any experience with it.
As far as really pulling off continuous integration style database development, it gets really resource instensive really fast because of the number of database copies you need. It is very doable when the database can fit on a developer workstation but impractical when the database is so large that it needs to be deployed across a grid. To do it you bacically need 1 copy of the database per developer [developers who make DDL changes, not just changes to procs] + 6 common copies. The common copies are as follows:
INT DEV --> Developers "check in" their refactoring to INT DEV for integration testing. When integration testing passes, this database is copied over to DEV.
DEV --> This is the "official" development copy of the database. INT DEV is refreshed regularly with a copy of DEV. Developers working on new refactorings get a fresh copy of the database from DEV.
INT QA --> Same idea as INT DEV except for the QA team. When integration tests pass here, this database is copied over to QA and to DEV*.
QA
INT PROD --> Same idea as INT QA except for production. When integration tests pass here, this database is copied over to PROD, QA*, and DEV*
PROD
*When copying databases across DEV/QA/PROD lines, you will also need to run scripts to update test data relevant to the particular environment (e.g. setting up users in QA that the QA team uses to test but that don't exist in production).

One possible solution is to look into implementing DML auditing on your test databases, then just rolling those audit logs into a script for final testing and live deployment. SQL Server 2008 significantly improves on DML auditing, but even SQL Server 2005 supports it via triggers.

There are a bunch of links in these posts that I'll want to follow up on (I "rolled my own" system years ago, have to see if there are similarities). One thing you will need, and that I hope is mentioned in these links, is discipline. I don't quite see how any automated system can work if anyone can change anything at any time. (Your question implies that this can happen on your production systems, but obviously that can't be true.)
Having one person (the fabled "database administrator") dedicated to the task of managing changes to databases, particularly production databases, is a very common solution. As for maintaining consistency across X development and testing databases: if it/they are used by many users, once again you are best served by having an individual act as a "clearing house" for changes; if everyone has their own database instance, then they're responsible for keeping it in order, and having a central consistent database "source" will be critical when they need a refreshed baseline database.
Here's a recent Stack Overflow post that may be of interest: how-to-refresh-a-test-instance-of-sql-server-with-production-data-without-using

Red Gate has a paper describing how to achieve build automation: http://downloads.red-gate.com/HelpPDF/ContinuousIntegrationForDatabasesUsingRedGateSQLTools.pdf
This is built around SQL Source Control, which integrates with SSMS and your existing source control system.

I've written a .NET based tool to handle database versioning in an automated fashion. We have been using this tool in production to handle rolling out database updates (including patches) to multiple environments, keep a log in each database of which scripts have been run, and do it all in an automated fashion. It has a command-line console so you can create batch scripts which use this tool. Check it out: https://github.com/bmontgomery/DatabaseVersioning

For what it's worth, this is a real example of a simple, low cost approach used by my former employer (and which I am trying to impress on my current employer as a basic first step).
Add a table called 'DB_VERSION' or similar. In EVERY upgrade script, add a row to that table which can include as little or as many columns as you see fit to describe the upgrade but at a minimum I would suggest { VERSION, EXECUTION_DATE, DESCRIPTION, EXECUTION_USER }. Now you have a concrete record of what has been going on. If someone runs their own unauthorised script you'd still need to follow the advice of the answers above, but this is just a simple way of dramatically improving on your existing versioning control (i.e. none).
Now let's you have an upgrade script from v2.1 to v2.2 of the database and you want to verify the lone maverick guy has actually run it on his database, you can just search for rows where VERSION = 'v2.2' and if you get a result, don't run this upgrade script. Can be built into a console utility app if necessary.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.