Best way to transfer data between SQL Servers

Best way to transfer data between SQL Servers - c#

I have a general in-office database and a bunch of laptop-installed SQL Server Express databases of the same structure. In these databases I log information, which is partitioned by the id of a specific event. The laptops are taken by employees mostly to service one event (2-3 at the most sometimes), and obviously I have an eventId column in all tables except the dictionaries in the databases (though it rarely goes above 3 on the laptop databases).
When the employees come back to the office, they need to backup the last event data to the office SQL Server, where eventId's can be a hundred or a thousand, etc., this is the history of all past events.
Another reason for my question is I would like to be able to make an eventId-based selective copy of an existing database to a different server for development purposes (from local to SQL Azure actually).
What is the efficient way of moving such data between servers? I have approximately 50-100 tables with keys and references.
What we do at the moment - selective copying the data to a new clean database on source server (only whatever is required for copying); backing up this new DB; restoring it on a target server; data copy between databases using dynamic T-SQL with a dbname as a parameter. What I don't like is dynamic T-Sql, no syntax check at all until runtime.
Loading data somewhat in RAM with help of developing a C# program (but it may be too resource consuming, as long as data may be large) and transferring it to a new server with a C# program as a proxy and RAM as intermediate storage. Challenging and in certain situations impossible (too much data).
I started looking into SSIS, but it seems that the general advice for use of SSIS is when copying is done at fixed environment ("in-house"). We supply software not to a vast amount of customers, but there surely are more than one, and we would need to pass connection information dynamically from somewhere, as long as we can't visit all our customers to prepare a set-up for them on location.
Any other suggestions?
Any help appreciated!

Here is what I would recommend.
Create an SSIS package. Install it on the Laptops SQL Server Express.
You can use a configuration file that will allow you to select which Event Ids you want to Export.
You can also keep all of your connection information in the configuration file. For those non in-house customers you can remove the configuration file between exports.
That would be my recommendation.

Related

Is Access limited to a number of user that connect and run query on Database?

Is Access limited to a number of user that connect and run query on Database ?
i have shared Access file that 50 users connect and run query on hem
(select.....update....insert....delete)
my program that connect to this access in in C# (WinForm)
thanks

The Access database has a limit of 256 connections, but there is a limit of 64 connections per process in the database drivers.
Hope this below screen shot will help you. You can make the necessary modifications according to your needs.
You should prefer other Database like SQL Server for good performance.
FYI - Choosing edited record lets users use the database together but not work on the same record e.g a student at same time...
Need a persistent connection to the back-end from each of the front-end workstations. This can be done using a bound form which is always open or by keeping a recordset open at all times..

Technically, the limit is higher, but as a practical matter, your limit is one.
You may have used more in the past, but if so, you were lucky. Access is an in-process database engine. That means it works best when the database is loaded in with the process that is accessing it, and that in turn means that if you have more than one query writing into the database at a time, especially into the same table, you are leaving yourself open for corruption. Maybe not today, maybe not tomorrow, but soon, and when you least expect it.
If you're sharing a database among that many users, it's time to look for an host-process database engine... a server. Examples include Sql Server (Express Edition is free, even for commercial use), PostGreSQL, Oracle, and others.

SaaS application needs to export/backup data to individual customer sites

We have a cloud based SaaS application and many of our customers (school systems) require that a backup of their data be stored on-site for them.
All of our application data is stored in a single MS SQL database. At the very top of the "hierarchy" we have an "Organization". This organization represents a single customer in our system. Each organization has many child tables/objects/data. Each having FK relationships that ultimately end at "Organization".
We need a way to extract a SINGLE customer's data from the database and bundle it in some way so that it can be downloaded to the customers site. Preferably in a SQL Express, SQLite or an access database.
For example: Organization -> Skill Area -> Program -> Target -> Target Data are all tables in the system. Each one linking back to the parent by a FK. I need to get all the target data, targets, programs and skill areas per organization and export that data.
Does anyone have any suggestions about how to do this within SQL Server, a C# service, or a 3-rd party tool?
I need this solution to be easy to replicate for each customer who wants this feature "turned on"
Ideas?

I'm a big fan of using messaging to propagate data at the moment, so here's a message based solution that will allow external customers to keep a local, in sync copy of the data which you provide on the web.
The basic architecture would be an online, password secured and user specific list of changes which have occurred in the system.
At the server side this list would be appended to any time there was a change to an entity which is relevant to the specific customer.
At the client would run an application which checks the list of changes for any it hasn't yet received and then applies them to its local database (in the order they occurred).
There a a bunch of different ways of doing the list based component of the system but my gut feeling is that you would be best to use something like RSS to do this.
Below is a practical scenario of how this could work:
A new skill area is created for organisation "my org"
The skill is added to the central database and associated with the "my org" reccord
A SkillAreaExists event is also added at the same time to the "my org" RSS with JSON or XML data specifying the properties of the new skill area
A new program is added to the skill area that was just created
The program is added to the central database and associated with the skill area
A ProgramExists event is also added at the same time to the "my org" RSS with JSON or XML data specifying the properties of the new program
A SkillAreaHasProgram event is also added at the same time to the "my org" RSS with JSON or XML data specifying an identifier for the skill area and program
The client agent checks the RSS feed and sees the new messages and processes them in order
When the SkillAreaExists event is processed a new Skill area is added to the local DB
When the ProgramExists event is processed a new Program is added to the local DB
When the SkillAreaHasProgram event is processed the program is linked to the skill area
This approach has a whole bunch of benefits over traditional point in time replication.
Its online, a consumer of this can get realtime updates if required
Consistancy is maintained by order, at any point in time in the event stream if you stop receiving events you have a local DB which accuratly reflects the central DB as at some point in time.
Its diff based, you only need to recieve changes
Its auditable, you can see whats actually happened not just the current state.
Its easily recoverable, if there's a data consistency issue you can revert the entire DB by replaying the event stream.
It allows for multiple consumers, lots of individual copies of the clients info can exist and function autonomously.
We have had a great deal of success with these techniques for replicating data between sites especially when they are only sometimes online.

While there are some very interesting enterprise solutions that have been suggested, I think my approach would be to develop a plane old scheduled backup solution that simply exports the data for each organisation with a stored procedure or just a number of select statements.
Admittedly you'll have to keep this up to date as your database schema changes but if this is a production application I cant imagine that happens very drastically.
There are any number of technologies available to do this, be it SSIS, a custom windows service, or even something as rudimentary as a scheduled task that kicks off a stored procedure from the command line.
The format you choose to export to is entirely up to you and should probably be driven by how the backup is intended to be used. I might consider writing data to a number of CSV files and zipping the result such that it could be imported into other platforms should the need arise.
Other options might be to copy data across to a scratch database and then simply create a SQL backup of that database.
However you choose to go about it, I would encourage you to ensure that the process is well documented and has as much automated installation and setup as possible. Systems with loosely coupled dependencies such as common file locations or scheduled tasks are prone to getting tweaked and changed over time. Without those tweaks and changes being recorded you can create a system that works but can't be replicated. Soon no one wants to touch it and no one remembers exactly how it works. When it eventual needs changing, or worse it breaks, you have to start reverse engineering before you can fix it.
In a cloud based environment this is especially important because you want to be able to deploy as quickly as possible. If there is a lot of configuration that needs to be done you're likely to make mistakes or just be inconsistent. By creating a nuke-and-repave deployment you have a single point that you can change installation and configuration, safe in the knowledge that the change will be consistent across any deployment.

From what i understand, you have one large database for all the clients, you use relations which lead to the table organization to know which data for which client, and you want to backup the data based on client => organization.
To backup the data you can use one of the following methods:
As the comments from #Phil, and #Kris you can use SSIS for automated backup, check this link for structure backup, and check this link for how to Export a Query Result to a File using SSIS and instead of file do it to access or SQL Server database.
Build an application\service using C# to select the data and export it manually, need time but customization has no limits.

Have you looked at StreamInsight?
http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/complex-event-processing.aspx

When I've had to deal with backups of relational data in the past (in MySQL which isn't super different in terms of capability from MSSQL that you're running) is to create a backup "package" file which is essentially a zip file with a different file extension so that windows won't let users open it.
If you really want to get fancy, encrypt the file after zipping it and change the extension. I presume you're using ASP for your SaaS and since I'm a PHP-geek, I can't help too much with the code side of things, but the way I've handled this before was for a script that would package an entire Joomla site and Database for migration to a new server.
//open the MySQL connection
$dbc = mysql_connect($cfg->host,$cfg->user,$cfg->password);
//select the database
mysql_select_db($cfg->db,$dbc);
output( 'Getting database tables
');
//get all the tables in the database
$tables = array();
$result = mysql_query('SHOW TABLES',$dbc);
while($row = mysql_fetch_row($result)) {
$tables[] = $row[0];
}
output( 'Found '.count($tables).' tables to be migrated.
Exporting tables:
');
$return = "";
//cycle through the tables and get their create statements and data
foreach($tables as $table) {
$result = mysql_query('SELECT * FROM '.$table);
$num_fields = mysql_num_fields($result);
$return.= 'DROP TABLE IF EXISTS '.$table.";\n";
$row2 = mysql_fetch_row(mysql_query('SHOW CREATE TABLE '.$table));
$return.= $row2[1].";\n";
while($row = mysql_fetch_row($result)) {
$return.= 'INSERT INTO '.$table.' VALUES(';
for($j=0; $j<$num_fields; $j++) {
$row[$j] = mysql_escape_string($row[$j]);
$row[$j] = ereg_replace("\n","\\n",$row[$j]);
if (!empty($row[$j])) {
$return.= "'".$row[$j]."'" ;
} else {
$return.= "NULL";
}
if ($j<($num_fields-1)) {
$return.= ',';
}
}
$return.= ");\n";
}
}
That's the relevant portion of the code in PHP that loops the database structure and stores the recreation script in $result which can then be output to a file.
In your case, you don't want to recreate the databases, but rather the data itself. You've compounded the issue slightly since you have a SaaS that is prone to possible data structure changes which you'll need to be able to account for. My suggestion would be this then:
Use a similar system to the above to dump the relevant data from the individual tables. I'm simply pulling all the data, but you could pull only the parts that pertain to the individual user by using JOIN statements and whatnot. Dump the contents of each table's insert/replace statements into a file named after the table. Create a file called manifest.xml or something of that sort and populate it with the current version of your SaaS application, name/information, unique ID, etc of the client exporting the data.
Package all those files into a ZIP file, change the extension to whatever you want, encrypt it if you desire, etc. Let them download that backup file and you're set.
In your import script, you will need to read the version number of the exported data and compare it to some algorithm that can handle remapping the data based on revisions you make later on. This way if you need to re-import one of their backups later, you can correctly handle transitioning the data from when they pulled the backup to the current structure of the data in that table now.
Hopefully that helps ;)

Because you keep all the data in just one database, it will always be difficult to export/backup data on customer basis.
Even if you implement such scenario now, you will end up with two different places you need to maintain/change/test every time you change the database schema (fixing bugs, adding new features, optimization, etc).
I would recommend you to partition the data, say, by using a database per organization. Then you change your application just once (mainly around building a connection string for the specified organization), and then you can safely export/backup each database separately in a way you want it.
It also gives you a lot of extra benefits "for free" such as scalability and the ability to dedicate resources on per-organization base (whether it is needed in the future).
Say, you have a set of small and low priority (from a business point of view) organizations, and a big and high priority one. So you will be able to keep a set of small low priority databases on one server, but dedicate another one for that specific important big one.
Or if your current DB server is overloaded (perhaps you have A LOT of data and A LOT of requests to the database), you can simply get another cheap server and move half of the load without any changes in your system...
You still need to write something in order to split the existing big database into several small ones, but you do it just once, and after it is done this "migration tool" can be thrown away so you don't need to support it anymore.

Have you tried SyncFramework?
Have a look at this article!
It explains how to sync filtered data between databases using Sync Framework.
You can sync to the customer's database or sync to your own empty db and then export it as a file.

Did you thought about using an ORM? (Object Relational Mapper)
I know, and use, LLBLGen Pro (so I can talk only about the feature of this specific ORM)
Anyway, with LLBLGen you can reverse-engineer the DB and create a hierarchy of class that map the tables and relations of your DB.
Now If all the data of a customer is reachable via relations, I can tell to my ORM framework to load a single costumers (1 row of a specific table) and then load all the related data in the related table.
If the data is not too complex, it should be possible.
If you have hundreds of self referenced tables or strange relations, it may be undoable, it depend upon your data.
If all the data of a single customer is, say, 10'000 rows in 100 tables, it will probably work.
If all the data of is 100'000 rows in 1000 tables it "may" work if you have some times, and a lot of memory.
If all the data is 10'000'000 you probably cant load it all at once, and you'll need a more efficient way.
Anyway, if you can load all the data at once, then you'll have a nice "in memory" graph with all the data of a single customer, and then you can serialize this data, or project it on a dataset (obtaining a set of datatable/relations) and then serialize the dataset.
Using an ORM to load and export all the data of a single customer as explained, probably, is not the most efficient way of doing things, but when doable it's a simple and cheap way.
Naturally, with or without ORM, you can find hundreds of different way to export this data :-)

For you design, you should have sharded your database for customers.
However, as you have already developed the database design, I suggest you to create a temp database and create the new tables in this temp database using the FK relation.
For this, you need to sort the tables based on the FK relationship and create them in the temp database.
Then, select the table data from the source database and insert them in the temp database.
You can also use this technique to shard your database and revamp your database design.
Aravind

How to transfer data (Entities) from one database to another

I have a system (using Entity Framework) that is deployed in various production systems and also on a quality control system. My problem is that data entry is often done only on one of those occurrences of my system (different databases).
I want to find the best way to transfer my data from one database to another database. Ids can change, as long as the relations between my objects are maintained. 98% of my data in in DB, some of it is external files, I can manage those separately, manually.
Currently we use a xml structure as a transition file. The file is then imported in the destination system, and code manually imports the entities and re-creates the data.
I'm looking for a more generic way to do this, with less code. Since all my data in stored in Entities couldn't I simply create a big List and throw all my objects in there, then serialize that in some matter into an external file and finally generically import all the entities in there in my destination system? (I'll probably have to be careful in maintaining relation ids, but should be ok...)
Anyways I'm wondering if anyone would have smart approaches, I'm pretty sure I,m not the first with a similar problem.
Thanks!

You need to get some process around this. If all environments contain the same data (unlikely) you can replicate. It is the most automatic. But a QA environ should not update production, so you have to really think this through.
If semi-automated is okay, there are tools out there you can use from a variety of vendors. I use Red Gate tools, personally, but others are also fine.
Can you set up a more automated push with EF? Sure, but the amount of time you spend is really not worth it.

In my opinion you can check some of the following approaches:
1) Use Sql Compare or Sql Data Compare. Those tools are from Red Gate and can be found here
2) Regular backups and restores of the databases. You could, if it is an option regularly backup your most up-to-date database and restore it on the destination systems. I have no experience in automatizing this but here is a link to do that through .net.
3) You could always give it a go creating a version control system of your own. I would picture one such system selecting all records from a certain table (or all of them), deleting all records in the target database and inserting them. This seems pretty complex though, as you have to worry about relationships, data dependencies, etc.
Hope this helps in some way.
Regards

If you for some reason will not be satisfied with existing tools may be you'll want take a look at the Sync Framework and implement this functionality yourself for your very particular data bases.

Given what you described, pushing data from One SQL Server to another for demo purposes, you should consider SQL Server Integration Services.
If you're got a simple scenario where you just move the data and objects from DB to the next you can use their built-in Wizards. If you need to do custom stuff you can build complex workflows using C# and SQL (tools you already know). Note: most of what you're going to want comes with the standard edition so if you're using express this is less interesting.
The story for Red Gate products is more compelling when you don't have SQL Server (So you have to go out and buy something) and if you are interested in finding out what the changes are between DB's (like viewing code changes in a .cs file in a source control product)

.NET Data Storage - Database vs single file

I have a C# application that allows one user to enter information about customers and job sites. The information is very basic.
Customer: Name, number, address, email, associated job site.
Job Site: Name, location.
Here are my specs I need for this program.
No limit on amount of data entered.
Single user per application. No concurrent activity or multiple users.
Allow user entries/data to be exported to an external file that can be easily shared between applications/users.
Allows for user queries to display customers based on different combinations of customer information/job site information.
The data will never be viewed or manipulated outside of the application.
The program will be running almost always, minimized to the task bar.
Startup time is not very important, however I would like the queries to be considerably fast.
This all seems to point me towards a database, but a very lightweight one. However I also need it to have no limitations as far as data storage. If you agree I should use a database, please let me know what would be best suited for my needs. If you don't think I should use a database, please make some other suggestions on what you think would be best.

My suggestion would be to use SQLite. You can find it here: http://sqlite.org/. And you can find the C# wrapper version here: http://sqlite.phxsoftware.com/
SQLite is very lightweight and has some pretty powerful stuff for such a lightweight engine. Another option you can look into is Microsoft Access.

You're asking the wrong question again :)
The better question is "how do I build an application that lets me change the data storage implementation?"
If you apply the repository pattern and properly interface it you can build interchangable persistence layers. So you could start with one implementation and change it as-needed wihtout needing to re-engineer the business or application layers.
Once you have a repository interface you could try implementations in a lot of differnt approaches:
Flat File - You could persist the data as XML, and provided that it's not a lot of data you could store the full contents in-memory (just read the file at startup, write the file at shutdown). With in-memory XML you can get very high throughput without concern for database indexes, etc.
Distributable DB - SQLite or SQL Compact work great; they offer many DB benefits, and require no installation
Local DB - SQL Express is a good middle-ground between a lightweight and full-featured DB. Access, when used carefully, can suffice. The main benefit is that it's included with MS Office (although not installed by default), and some IT groups are more comfortable having Access installed on machines than SQL Express.
Full DB - MySql, SQL Server, PostGreSQL, et al.
Given your specific requirements I would advise you towards an XML-based flat file--with the only condition being that you are OK with the memory-usage of the application directly correlating to the size of the file (since your data is text, even with the weight of XML, this would take a lot of entries to become very large).
Here's the pros/cons--listed by your requirements:
Cons
No limit on amount of data entered.
using in-memory XML would mean your application would not scale. It could easily handle a 10MB data-file, 100MB shouldn't be an issue (unless your system is low on RAM), above that you have to seriously question "can I afford this much memory?".
Pros
Single user per application. No concurrent activity or multiple users.
XML can be read into memory and held by the process (AppDomain, really). It's perfectly suited for single-user scenarios where concurrency is a very narrow concern.
Allow user entries/data to be exported to an external file that can be easily shared between applications/users.
XML is perfect for exporting, and also easy to import to Excel, databases, etc...
Allows for user queries to display customers based on different combinations of customer information/job site information.
Linq-to-XML is your friend :D
The data will never be viewed or manipulated outside of the application.
....then holding it entirely in-memory doesn't cause any issues
The program will be running almost always, minimized to the task bar.
so loading the XML at startup, and writing at shutdown will be acceptible (if the file is very large it could take a while)
Startup time is not very important, however I would like the queries to be considerably fast
Reading the XML would be relatively slow at startup; but when it's loaded in-memory it will be hard to beat. Any given DB will require that the DB engine be started, that interop/cross-process/cross-network calls be made, that the results be loaded from disk (if not cached by the engine), etc...

It sounds to me like a database is 100% what you need. It offers both the data storage, data retrieval (including queries) and the ability to export data to a standard format (either direct from the database, or through your application.)
For a light database, I suggest SQLite (pronounced 'SQL Lite' ;) ). You can google for tutorials on how to set it up, and then how to interface with it via your C# code. I also found a reference to this C# wrapper for SQLite, which may be able to do much of the work for you!

How about SQLite? It sounds like it is a good fit for your application.
You can use System.Data.SQLite as the .NET wrapper.

You can get SQL Server Express for free. I would say the question is not so much why should you use a database, more why shouldn't you? This type of problem is exactly what databases are for, and SQL Server is a very powerful and widely used database, so if you are going to go for some other solution you need to provide a good reason why you wouldn't go with a database.

A database would be a good fit. SQLite is good as others have mentioned.
You could also use a local instance of SQL Server Express to take advantage of improved integration with other pieces of the Microsoft development stack (since you mention C#).
A third option is a document database like Raven which may fit from the sounds of your data.
edit
A fourth option would be to try Lightswitch when the beta comes out in a few days. (8-23-2010)
/edit
There is always going to be a limitation on data storage (the empty space of the hard disk). According to wikipedia, SQL Express is limited to 10 GB for SQL Server Express 2008 R2

Best means to store data locally when offline

I am in the midst of writing a small program (more to experiment with vs 2010 than anything else)
Despite being an experiment it has some practical use for our local athletics club.
My thought was to access the DB (currently online) to download the current members and store locally on a laptop (this is a MS sql table, used to power the club's website).
Take the laptop to the event (yes there ARE places that don't have internet coverage), add members to that days race (also a row from a sql table (though no changes would be made to this), record results (new records in 3rd table)
Once home, showered and within internet access again, upload/edit the tables as per the race results/member changes etc.
So I was thinking I'd do something like write xml files locally with the data, including a field to indicate changes etc?
If anyone can point me in a direction I would appreciate it...hell if anyone could tell me if this has a name, I'd appreciate it.

Essentially what you need is, in addition to your remote data store, a local data store on your desktop. You could then write your code by hand to sync the data stores when you go offline / online, or you could use the Microsoft Sync framework to handle it for you.
I've personally used the Sync framework on a number of projects and once you get used to the conventions, it's pretty easy to use.

If a local storage format is what your after. SQLite is one option. You can copy your tables from the server to your local SQLite db.
You could also save your data to files, but XML is a horrible format for doing this. You'll probably want to use YAML or JSON instead.

You may want to take a look at SQL Server Compact -- it provides some decent capabilities with synchronizing back with the mothership SQL server.

If you're using MS SQL Server for production, and you only need to work offline on your personal computer, you could install MS SQL Server Express locally. The advantage here over using a different local datastore is that you can reuse your schema, stored procedures, etc. essentially only needing to change the connection string to your application (which you could run locally too through Visual Studio). You would have to write code to manually sync your online and offline db instances, but since it's a small application, it may be reasonable to just copy the entire database from production to local and then from local to production when you get home (assuming you're the only one updating the db, and wouldn't be potentially wiping out any new records entered in production while you were at the event).

Google Gears http://gears.google.com/ is intended if your app is a web app (which I didn't quite get what it is from your description)

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.