Import table data into a database that was exported before - C#

I am working on a feature that exports some tables (~50) to a disk file and imports the file back into the database. Export is quite easy: serialize the dataset to a file stream. But when importing, the table structure needs to be determined dynamically. What I am doing now:
foreach table in dataset
    compare the table schema in the db with the schema in the imported dataset
    define a batch command
    foreach row in table
        construct a single INSERT SqlCommand and add it to the batch command
    execute the batch insert command
This is very inefficient, and I also run into problems converting the data types in the dataset's DataTables to the types of the database tables. So I want to know: is there a better way to do this?
Edit:
In fact, import and export are two functions (buttons) in the program. On the UI there is a grid that lists lots of tables; what I need to implement is exporting the selected tables' data to a disk file and importing that data back into the database later.

Why not use SQL Server's native Backup and Restore functionality? You can do incremental Restores on the data, and it's by far the fastest way to export and then import data again.
There are a lot of very advanced options to take into account some fringe cases, but at its heart, it's two commands: Backup Database and Restore Database.
BACKUP DATABASE mydb TO DISK = 'c:\my\path\to\backup.bak'
RESTORE DATABASE mydb FROM DISK = 'c:\my\path\to\backup.bak'
When doing this against TB-sized databases, it takes about 45 minutes to an hour in my experience. Much faster than trying to go through every row!
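If you need to kick off the backup from C# rather than from Management Studio, a minimal sketch with plain ADO.NET could look like this (the connection string and backup path are assumptions, and the SQL Server service account must be able to write to that path):
// Minimal sketch: running the T-SQL backup from C# (needs System.Data.SqlClient).
// Connection string and backup path are placeholders.
using (var conn = new SqlConnection("Server=.;Database=master;Integrated Security=true"))
using (var cmd = conn.CreateCommand())
{
    cmd.CommandTimeout = 0; // large backups can easily exceed the default timeout
    cmd.CommandText = @"BACKUP DATABASE mydb TO DISK = 'c:\my\path\to\backup.bak'";
    conn.Open();
    cmd.ExecuteNonQuery();
}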

I'm guessing you are using SQL Server? If so, I would:
a) make sure the table names are showing up in the export
b) look into the SqlBulkCopy class. That will allow you to push an entire table in, so you can loop through the DataTables and bulk copy each one in (see the sketch after the snippet below).
using (SqlBulkCopy copy = new SqlBulkCopy(MySQLExpConn))
{
copy.ColumnMappings.Add(0, 0);
copy.ColumnMappings.Add(1, 1);
copy.ColumnMappings.Add(2, 2);
copy.ColumnMappings.Add(3, 3);
copy.ColumnMappings.Add(4, 4);
copy.ColumnMappings.Add(5, 5);
copy.ColumnMappings.Add(6, 6);
copy.DestinationTableName = ds.Tables[i].TableName;
copy.WriteToServer(ds.Tables[i]);
}
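As a rough sketch of that loop over all exported tables (reusing ds and MySQLExpConn from the snippet above, and assuming the destination tables share the source column names), you could map the columns by name instead of by position:
// Sketch: bulk copy every DataTable in the exported DataSet.
// Assumes matching column names on the destination tables.
foreach (DataTable table in ds.Tables)
{
    using (SqlBulkCopy copy = new SqlBulkCopy(MySQLExpConn))
    {
        copy.DestinationTableName = table.TableName;
        foreach (DataColumn column in table.Columns)
            copy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
        copy.WriteToServer(table);
    }
}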

You can use XML serialization, but you will need a good ORM tool like NHibernate to help you with it. XML serialization will maintain the data types and will work flawlessly.
You can read an entire table and serialize all values into an XML file, then read the entire XML file back into a list of objects and store them in the database. Using a good ORM tool you will not need to write any SQL, and I think it can work against different database servers as well.
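If you stay with plain ADO.NET rather than an ORM, note that DataSet can already round-trip to XML together with its schema, which preserves the column types; a minimal sketch (the file path is a placeholder):
// Sketch: export and re-import a DataSet as XML with its schema so that
// column types survive the round trip. File path is a placeholder.
dataSet.WriteXml(@"C:\export\tables.xml", XmlWriteMode.WriteSchema);

// Later, on import:
var imported = new DataSet();
imported.ReadXml(@"C:\export\tables.xml");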

I finally chose SqlCommandBuilder to build the insert commands automatically.
See
SqlCommandBuilder Class
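For reference, a rough sketch of that approach (connection string and dataset names are placeholders): the SELECT gives SqlCommandBuilder the schema it needs to generate the INSERT command, and Update() pushes the rows.
// Sketch: let SqlCommandBuilder generate the INSERT commands, then push
// each imported table with a SqlDataAdapter. Names are placeholders.
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    foreach (DataTable table in importedDataSet.Tables)
    {
        var adapter = new SqlDataAdapter("SELECT * FROM [" + table.TableName + "]", conn);
        var builder = new SqlCommandBuilder(adapter);   // supplies adapter.InsertCommand

        // Make sure every imported row is treated as new so Update() issues INSERTs.
        foreach (DataRow row in table.Rows)
        {
            if (row.RowState == DataRowState.Unchanged)
                row.SetAdded();
        }

        adapter.Update(table);
    }
}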

Related

Load data from flat file to SQL Server table and also export to Excel using SSIS

Problem Statement: The requirement is straightforward: we have a flat file (CSV, basically) which we need to load into one of the tables in a SQL Server database. The problem arises when we have to derive a new column (not present in the flat file) and populate it too, along with the rest of the columns from the file.
The derivation logic for the new column is to find the max date of "TransactionDate".
The entire exercise is to be performed in SSIS, and we were hoping to get it done using a Data Flow Task, but we are stuck on how to derive the new column and then add it to the destination flow.
Ideas:
1. Use a Data Flow Task to read the file and store it in a recordset, so that in the Control Flow we could use a Script Task to read it as a DataTable, use something like LINQ to determine the max value, and push it to another Data Flow to be consumed by the SQL table (but I guess this would require creating a table type in the database, which I would rather avoid).
2. Perform the entire operation in the Data Flow Task itself, which would require an asynchronous transformation (to get all the data and find the max value).
We are kind of out of ideas here; any lead would be much appreciated, and do let us know if any further information is required in this regard.
Run a Data Flow Task to insert the data into your destination table. Follow that up with an Execute SQL Task that calculates the MAX(TransactionDate) based on the values in the table rows with a NULL (or other new-record indicator) MaxTransactionDate.

Fast and simple way to import csv to SQL Server

We are importing a CSV file with CsvReader and then using SqlBulkCopy to insert that data into SQL Server. This code works for us and is very simple, but we are wondering if there is a faster method (some of our files have 100,000 rows) that would also not get too complex.
SqlConnection conn = new SqlConnection(connectionString);
conn.Open();
SqlTransaction transaction = conn.BeginTransaction();
try
{
    using (TextReader reader = File.OpenText(sourceFileLocation))
    {
        CsvReader csv = new CsvReader(reader, true);
        SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction);
        copy.DestinationTableName = reportType.ToString();
        copy.WriteToServer(csv);
        transaction.Commit();
    }
}
catch (Exception ex)
{
    transaction.Rollback();
    success = false;
    SendFileImportErrorEmail(Path.GetFileName(sourceFileLocation), ex.Message);
}
finally
{
    conn.Close();
}
Instead of building your own tool to do this, have a look at SQL Server Import and Export / SSIS. You can target flat files and SQL Server databases directly. The output dtsx package can also be run from the command line or as a job through the SQL Server Agent.
The reason I am suggesting it is because the wizard is optimized for parallelism and works really well on large flat files.
You should consider using a Table-Valued Parameter (TVP), which is based on a User-Defined Table Type (UDTT). This ability was introduced in SQL Server 2008 and allows you to define a strongly-typed structure that can be used to stream data into SQL Server (if done properly). An advantage of this approach over using SqlBulkCopy is that you can do more than a simple INSERT into a table; you can do any logic that you want (validate / upsert / etc) since the data arrives in the form of a Table Variable. You can deal with all of the import logic in a single Stored Procedure that can easily use local temporary tables if any of the data needs to be staged first. This makes it rather easy to isolate the process such that you can run multiple instances at the same time as long as you have a way to logically separate the rows being imported.
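For illustration, a rough sketch of the C# side only, assuming you have already created a user-defined table type and a stored procedure that accepts it (dbo.ReportRowType and dbo.ImportReportRows are hypothetical names):
// Sketch: passing a DataTable to a stored procedure as a TVP.
// dbo.ImportReportRows and dbo.ReportRowType are hypothetical; the real
// UDTT and procedure must already exist, with columns matching the DataTable.
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.ImportReportRows", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;

    SqlParameter rows = cmd.Parameters.AddWithValue("@Rows", csvDataTable);
    rows.SqlDbType = SqlDbType.Structured;
    rows.TypeName = "dbo.ReportRowType";   // must match the user-defined table type

    conn.Open();
    cmd.ExecuteNonQuery();
}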
I posted a detailed answer on this topic here on S.O. a while ago, including example code and links to other info:
How can I insert 10 million records in the shortest time possible?
There is even a link to a related answer of mine that shows another variation on that theme. I have a third answer somewhere that shows a batched approach if you have millions of rows (which you don't), but as soon as I find it I will add the link here.

Fastest way to store data in a database using C#

I have a CSV file with 3 million lines and want to store it in a database using C#. The CSV file looks like "device;date;value".
Shall I write it into an array or directly into a System.Data.DataTable? And what is the fastest way to store this DataTable in a database (SQL Server, for example)?
I tried to store the lines using 3 million INSERT INTO statements, but it was too slow :)
thanks
You can load the data into a DataTable and then use SqlBulkCopy to copy the data to the table in SQL Server.
The SqlBulkCopy class can be used to write data only to SQL Server tables. However, the data source is not limited to SQL Server; any data source can be used, as long as the data can be loaded to a DataTable instance or read with an IDataReader instance.
I'd guess BCP would be pretty fast. Once you have the data in a DataTable you can try:
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(yourConnectionString))
{
    bulkCopy.DestinationTableName = "TargetTable";
    bulkCopy.WriteToServer(dataTable);
}
I think the best way is to open a StreamReader and build the rows line by line. Use ReadLine in a while loop and Split to find the different parts.
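A rough sketch of that idea, combined with SqlBulkCopy for the actual insert (the column names, types, ';' separator and target table are assumptions based on the "device;date;value" format in the question):
// Sketch: read the csv line by line, split on ';', fill a DataTable and
// bulk copy it. Column names/types and the target table are assumptions.
var table = new DataTable();
table.Columns.Add("device", typeof(string));
table.Columns.Add("date", typeof(DateTime));
table.Columns.Add("value", typeof(string));

using (var reader = new StreamReader(@"C:\data\input.csv"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        string[] parts = line.Split(';');
        table.Rows.Add(parts[0], DateTime.Parse(parts[1]), parts[2]);
    }
}

using (var bulkCopy = new SqlBulkCopy(connectionString))
{
    bulkCopy.DestinationTableName = "TargetTable";
    bulkCopy.WriteToServer(table);
}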
Sending 3 million insert statements is bordering on crazy slow!
Buffer it by using transactions and reading in, for example, 200-1000 lines at a time (the smaller your data, the more you can read in at a time); then, after reading in these lines, commit your inserts to the database.
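A rough sketch of that batching pattern (table name, column names and batch size are placeholders):
// Sketch: parameterized inserts committed in batches instead of one
// transaction per row. Table/column names and batch size are placeholders.
const int batchSize = 1000;
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    SqlTransaction tx = conn.BeginTransaction();
    int count = 0;

    foreach (string line in File.ReadLines(@"C:\data\input.csv"))
    {
        string[] parts = line.Split(';');
        using (var cmd = new SqlCommand(
            "INSERT INTO TargetTable (device, date, value) VALUES (@device, @date, @value)",
            conn, tx))
        {
            cmd.Parameters.AddWithValue("@device", parts[0]);
            cmd.Parameters.AddWithValue("@date", DateTime.Parse(parts[1]));
            cmd.Parameters.AddWithValue("@value", parts[2]);
            cmd.ExecuteNonQuery();
        }

        if (++count % batchSize == 0)
        {
            tx.Commit();                    // flush this batch
            tx = conn.BeginTransaction();   // start the next one
        }
    }

    tx.Commit();   // commit the final partial batch
}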

Out of memory exception when pulling huge data from DB

We are pulling a huge amount of data from a SQL Server DB. It has around 25,000 rows with 2,500 columns. The requirement is to read the data and export it to a spreadsheet, so pagination is not an option. When there are fewer records it is able to pull the data, but when it grows to the size I mentioned above it throws an exception.
public DataSet Exportexcel(string Username)
{
    Database db = DatabaseFactory.CreateDatabase(Config);
    DbCommand dbCommand = db.GetStoredProcCommand("Sp_ExportADExcel");
    db.AddInParameter(dbCommand, "@Username", DbType.String, Username);
    return db.ExecuteDataSet(dbCommand);
}
Please help me in resolving this issue.
The requirement is to read the data and export it to a spreadsheet, so pagination is not an option.
Why not read the data in steps? Instead of getting all records at once, get a limited number of records each time and write them to Excel. Continue until you have processed all the records.
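A rough sketch of that chunked approach, assuming SQL Server 2012+ for OFFSET/FETCH and a stable ORDER BY key (the query, page size and the export helper are placeholders):
// Sketch: pull the rows page by page so only one page is in memory at a
// time. dbo.BigTable, Id, the page size and AppendPageToSpreadsheet are
// all placeholders / hypothetical.
const int pageSize = 1000;
int offset = 0;

using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    while (true)
    {
        var page = new DataTable();
        using (var cmd = new SqlCommand(
            "SELECT * FROM dbo.BigTable ORDER BY Id " +
            "OFFSET @offset ROWS FETCH NEXT @pageSize ROWS ONLY", conn))
        {
            cmd.Parameters.AddWithValue("@offset", offset);
            cmd.Parameters.AddWithValue("@pageSize", pageSize);
            new SqlDataAdapter(cmd).Fill(page);
        }

        if (page.Rows.Count == 0)
            break;                         // no more rows

        AppendPageToSpreadsheet(page);     // hypothetical export helper
        offset += pageSize;
    }
}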
Your problem is purely down to the fact that you are trying to extract so much data in one go.
You may get around the problem by installing more memory in the machine doing the query, but this is just a bodge.
You're best off retrieving such amounts of data in steps.
You could quite easily read the data back row by row and export/append that in CSV format to a file and this could all be done in a stored procedure.
You don't say what database you are using, but handling such large amounts of data is what database engines are designed to cope with.
Other than that, when handling large quantities of data objects in C# code, it's best to look into using generics, as this doesn't impose object instantiation in the same way that classes do and so reduces the memory footprint.
You can use batch processing logic to fetch records in batches, say 5000 records per execution, and store the results in a temporary dataset. Once all processing is done, dump the data from the temporary dataset to Excel.
You can use the C# SqlBulkCopy class for this purpose.
If it is enough to have the data available in Excel as CSV, you can use bcp:
bcp "select col1, col2, col3 from database.schema.SomeTable" queryout "c:\MyData.txt" -c -t"," -r"\n" -S ServerName -T
This is orders of magnitude faster and has a small footprint.

Bulk insert into Access database from C#?

How can I do this? I have about 10,000 records in an Excel file and I want to insert all records as fast as possible into an Access database.
Any suggestions?
What you can do is something like this:
Dim AccessConn As New System.Data.OleDb.OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source=C:\Test Files\db1 XP.mdb")
AccessConn.Open()
Dim AccessCommand As New System.Data.OleDb.OleDbCommand("SELECT * INTO [ReportFile] FROM [Text;DATABASE=C:\Documents and Settings\...\My Documents\My Database\Text].[ReportFile.txt]", AccessConn)
AccessCommand.ExecuteNonQuery()
AccessConn.Close()
Switch off the indexing on the affected tables before starting the load, and then rebuild the indexes from scratch after the bulk load has finished. Rebuilding the indexes from scratch is faster than trying to keep them up to date while loading a large amount of data into a table.
If you choose to insert row by row, then maybe you want to consider using transactions: open a transaction, insert 1000 records, commit the transaction. This should work fine.
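A rough sketch of that pattern with OleDb (the connection string, table and column names are placeholders, and excelTable stands for the rows already read from the Excel file):
// Sketch: row-by-row inserts into Access, committed in batches inside an
// OleDb transaction. Connection string, table/column names and excelTable
// are placeholders.
using (var conn = new OleDbConnection(
    @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Test Files\db1 XP.mdb"))
{
    conn.Open();
    OleDbTransaction tx = conn.BeginTransaction();
    int count = 0;

    foreach (DataRow row in excelTable.Rows)
    {
        using (var cmd = new OleDbCommand(
            "INSERT INTO ReportFile (Col1, Col2) VALUES (?, ?)", conn, tx))
        {
            cmd.Parameters.AddWithValue("p1", row[0]);   // OleDb parameters are positional
            cmd.Parameters.AddWithValue("p2", row[1]);
            cmd.ExecuteNonQuery();
        }

        if (++count % 1000 == 0)
        {
            tx.Commit();
            tx = conn.BeginTransaction();
        }
    }

    tx.Commit();
}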
Use the default data import features in Access. If that does not suit your needs and you want to use C#, use standard ADO.NET and simply write record-for-record. 10K records should not take too long.
