How to copy a MySQL database schema using C#?

I'm trying to use C# & MySql in order to copy an empty table (recreate the Schema really). The structure looks like this:
> TableTemplate (schema)
    + Tables
        > FirstTable (table)
        > second table (table)
        > ...
> SomeOtherTable
    + Tables
        > ...
What I would like is to copy the TableTemplate into a new Schema with the user name.
The first obvious path to oblivion was trying CREATE TABLE @UserName LIKE TableTemplate, swiftly learning that SQL parameters are supposed to be used for values and not table names (all hail Jon Skeet, again: How to pass a table as parameter to MySqlCommand?).
So that leaves us with manual validation of the user names in order to build the table names (Robert's a prime example).
Next, it seems that even CREATE TABLE UserID LIKE TableTemplate; won't work (even from MySQL Workbench), since TableTemplate is a schema, not a table.
So it's down to writing a loop that will create a table LIKE each table in TableTemplate, after creating a UserID schema (after manual validation of that string), or trying other options like dumping the database and creating a new one, as seen in these questions:
C# and mysqldump
Slow performance using mysqldump from C#
But I would prefer to avoid running a process, dumping the database, and creating it from there every time I add a user.
Any suggestions would be highly appreciated.
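For what it's worth, the "manual validation" step mentioned above could be sketched roughly like this. This is only an illustration under an assumed naming policy (an ASCII letter followed by letters, digits or underscores, up to MySQL's 64-character identifier limit); SchemaNames, IsValid and Quote are hypothetical names, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SchemaNames
{
    // Assumed policy: first an ASCII letter, then up to 63 letters, digits or underscores
    // (64 characters in total, which is MySQL's maximum identifier length).
    private static readonly Regex Allowed = new Regex(@"\A[A-Za-z][A-Za-z0-9_]{0,63}\z");

    // True only when the whole string matches the whitelist pattern.
    public static bool IsValid(string name) => name != null && Allowed.IsMatch(name);

    // Returns the name wrapped in backticks, or throws if it fails validation.
    public static string Quote(string name)
    {
        if (!IsValid(name))
            throw new ArgumentException("Not an acceptable schema name.", nameof(name));
        return "`" + name + "`";
    }
}
```

The quoted result could then be spliced into the statement text, e.g. string.Format("CREATE DATABASE {0};", SchemaNames.Quote(aName)), since the name cannot be sent as a parameter.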

I think mysqldump would be better, but if you want to do it in one process, try this:
SELECT
CONCAT ("CREATE TABLE SomeOtherTable.",
TABLE_NAME ," AS SELECT * FROM TableTemplate.", TABLE_NAME
) as creation_sql
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'TableTemplate';
The output will look like this:
CREATE TABLE SomeOtherTable.tbl_name AS SELECT * FROM TableTemplate.tbl_name;
Then iterate over the result and execute each CREATE TABLE statement.

I ended up using something like this, in a method where aName is passed as the name of the new database:
// createInnerTablesList is a List<string> declared elsewhere in the method.
using (MySqlCommand cmd = new MySqlCommand(string.Format("CREATE DATABASE {0};", aName), connection))
{
    cmd.ExecuteNonQuery(); // Create the database with the given user name

    // Build the SQL query that returns one "create table" statement per table in the some_db template DB.
    cmd.CommandText = string.Format(
        "SELECT CONCAT(\"CREATE TABLE {0}.\", TABLE_NAME, \" "
        + "LIKE some_db.\", TABLE_NAME) AS creation_sql "
        + "FROM information_schema.TABLES WHERE TABLE_SCHEMA = 'some_db';",
        aName);

    try // Collect the "create table" statements for the inner tables
    {
        using (MySqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                createInnerTablesList.Add(reader.GetString(0));
        }
    }
    catch (MySqlException mysql_ex) { ... } // handle errors

    foreach (var createTableSql in createInnerTablesList)
    {
        try // Create each table in the user database
        {
            cmd.CommandText = createTableSql;
            cmd.ExecuteNonQuery();
        }
        catch (Exception e) { ... } // handle errors
    }
}
The reason for using LIKE rather than AS (as Jungsu suggested) is that although AS will create the tables, it will not keep any of the keys and constraints defined on them (primary key, etc.).
Using LIKE replicates the column definitions and indexes, including the primary key. Note that MySQL's CREATE TABLE ... LIKE does not copy foreign key definitions.
I'm still not too happy about this, since I feel I'm missing something...

Related

How can I parameterize an SQL table without vulnerability to SQL injection

I'm writing a C# class library in which one of the features is the ability to create an empty data table that matches the schema of any existing table.
For example, this:
private DataTable RetrieveEmptyDataTable(string tableName)
{
var table = new DataTable() { TableName = tableName };
using var command = new SqlCommand($"SELECT TOP 0 * FROM {tableName}", _connection);
using SqlDataAdapter dataAdapter = new SqlDataAdapter(command);
dataAdapter.Fill(table);
return table;
}
The above code works, but it has a glaring security vulnerability: SQL injection.
My first instinct is to parameterize the query like so:
using var command = new SqlCommand("SELECT TOP 0 * FROM @tableName", _connection);
command.Parameters.AddWithValue("@tableName", tableName);
But this leads to the following exception:
Must declare the table variable "@tableName"
After a quick search on Stack Overflow I found this question, which recommends using my first approach (the one with sqli vulnerability). That doesn't help at all, so I kept searching and found this question, which says that the only secure solution would be to hard-code the possible tables. Again, this doesn't work for my class library which needs to work for arbitrary table names.
My question is this: how can I parameterize the table name without vulnerability to SQL injection?
An arbitrary table name still has to exist, so you can check first that it does:
IF EXISTS (SELECT 1 FROM sys.objects WHERE name = @TableName)
BEGIN
    ... do your thing ...
END
And further, if the list of tables you want to allow the user to select from is known and finite, or matches a specific naming convention (like dbo.Sales%), or belongs to a specific schema (like Reporting), you could add additional predicates to check for those.
This requires you to pass the table name in as a proper parameter, not concatenate or token-replace. (And please don't use AddWithValue() for anything, ever.)
Once your check that the object is real and valid has passed, then you will still have to build your SQL query dynamically, because you still won't be able to parameterize the table name. You still should apply QUOTENAME(), though, as I explain in these posts:
Protecting Yourself from SQL Injection in SQL Server - Part 1
Protecting Yourself from SQL Injection in SQL Server - Part 2
So the final code would be something like:
CREATE PROCEDURE dbo.SelectFromAnywhere
    @TableName sysname
AS
BEGIN
    IF EXISTS (SELECT 1 FROM sys.objects
               WHERE name = @TableName)
    BEGIN
        DECLARE @sql nvarchar(max) = N'SELECT *
            FROM ' + QUOTENAME(@TableName) + N';';
        EXEC sys.sp_executesql @sql;
    END
    ELSE
    BEGIN
        PRINT 'Nice try, robot.';
    END
END
GO
If you also want it to be in some defined list, you can add
AND @TableName IN (N't1', N't2', …)
Or LIKE <some pattern>, or a join to sys.schemas, or what have you.
Provided nobody has the rights to then modify the procedure to change the checks, there is no value you can pass to @TableName that will allow you to do anything malicious, other than maybe selecting from another table you didn't expect because someone with too much access was able to create it before calling the code. Replacing characters like -- or ; does not make this any safer.
You could pass the table name to the SQL Server to apply quotename() on it to properly quote it and subsequently only use the quoted name.
Something along the lines of:
...
string quotedTableName = null;
using (SqlCommand command = new SqlCommand("SELECT quotename(@tablename);", connection))
{
    // nvarchar(128) is (currently) equivalent to sysname, which doesn't seem to exist in SqlDbType
    SqlParameter parameter = command.Parameters.Add("@tablename", System.Data.SqlDbType.NVarChar, 128);
    parameter.Value = tableName;
    object buff = command.ExecuteScalar();
    if (buff != DBNull.Value
        && buff != null /* theoretically not possible since a FROM-less SELECT always returns a row */)
    {
        quotedTableName = buff.ToString();
    }
}
if (quotedTableName != null)
{
    using (SqlCommand command = new SqlCommand($"SELECT TOP 0 * FROM { quotedTableName };", connection))
    {
        ...
    }
}
...
(Or do the dynamic part on SQL Server directly, also using quotename(). But that seems overly and unnecessarily tedious, especially if you will do more than one operation on the table in different places.)
Aaron Bertrand's answer solved the problem, but a stored procedure is not useful for a class library that might interact with any database. Here is the way to write RetrieveEmptyDataTable (the method from my question) using his answer:
private DataTable RetrieveEmptyDataTable(string tableName)
{
    const string tableNameParameter = "@TableName";
    var query =
        " IF EXISTS (SELECT 1 FROM sys.objects\n" +
        $" WHERE name = {tableNameParameter})\n" +
        " BEGIN\n" +
        " DECLARE @sql nvarchar(max) = N'SELECT TOP 0 * \n" +
        $" FROM ' + QUOTENAME({tableNameParameter}) + N';';\n" +
        " EXEC sys.sp_executesql @sql;\n" +
        "END";
    using var command = new SqlCommand(query, _connection);
    command.Parameters.Add(tableNameParameter, SqlDbType.NVarChar).Value = tableName;
    using SqlDataAdapter dataAdapter = new SqlDataAdapter(command);
    var table = new DataTable() { TableName = tableName };
    Connect();
    dataAdapter.Fill(table);
    Disconnect();
    return table;
}

Delete with params in SqlCommand

I use ADO.NET to delete some data from DB like this:
using (SqlConnection conn = new SqlConnection(_connectionString))
{
try
{
conn.Open();
using (SqlCommand cmd = new SqlCommand("Delete from Table where ID in (@idList);", conn))
{
cmd.Parameters.Add("@idList", System.Data.SqlDbType.VarChar, 100);
cmd.Parameters["@idList"].Value = stratIds;
cmd.CommandTimeout = 0;
cmd.ExecuteNonQuery();
}
}
catch (Exception e)
{
//_logger.LogMessage(eLogLevel.ERROR, DateTime.Now, e.ToString());
}
finally
{
conn.Close();
}
}
That code executes without Exception but data wasn't deleted from DB.
When I use the same algorithm to insert or update DB everything is OK.
Does anybody know what is the problem?
You can't do that in regular TSQL, as the server treats @idList as a single value that happens to contain commas. However, if you use a List<int>, you can use dapper-dot-net, with
connection.Execute("delete from Table where ID in @ids", new { ids=listOfIds });
Dapper figures out what you mean, and generates an appropriate parameterisation.
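If you stay on plain ADO.NET, the same idea can be done by hand: generate one parameter per ID. The sketch below is not Dapper's code, only an illustration of that kind of expansion; InClause and Build are made-up names, and the MyTable statement is just an example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InClause
{
    // Builds e.g. "DELETE FROM MyTable WHERE ID IN (@id0, @id1);" together with
    // the (parameter name, value) pairs that must be added to the command.
    public static (string Sql, List<KeyValuePair<string, int>> Parameters) Build(
        string sqlPrefix, IReadOnlyList<int> ids)
    {
        if (ids == null || ids.Count == 0)
            throw new ArgumentException("At least one id is required.", nameof(ids));

        // One generated parameter name per value: @id0, @id1, ...
        var parameters = ids
            .Select((id, i) => new KeyValuePair<string, int>("@id" + i, id))
            .ToList();

        var sql = sqlPrefix + " (" + string.Join(", ", parameters.Select(p => p.Key)) + ");";
        return (sql, parameters);
    }
}
```

Each pair would then be attached with something like cmd.Parameters.Add(p.Key, SqlDbType.Int).Value = p.Value; before executing. Keep in mind that SQL Server caps a single request at 2100 parameters, so very long lists need one of the other strategies.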
Another option is to send in a string and write a UDF to perform a "split" operation, then use that UDF in your query:
delete from Table where ID in (select Item from dbo.Split(@ids))
According to Marc's Split-UDF, this is one working implementation:
CREATE FUNCTION [dbo].[Split]
(
    @ItemList NVARCHAR(MAX),
    @delimiter CHAR(1)
)
RETURNS @IDTable TABLE (Item VARCHAR(50))
AS
BEGIN
    DECLARE @tempItemList NVARCHAR(MAX)
    SET @tempItemList = @ItemList
    DECLARE @i INT
    DECLARE @Item NVARCHAR(4000)
    SET @tempItemList = REPLACE (@tempItemList, ' ', '')
    SET @i = CHARINDEX(@delimiter, @tempItemList)
    WHILE (LEN(@tempItemList) > 0)
    BEGIN
        IF @i = 0
            SET @Item = @tempItemList
        ELSE
            SET @Item = LEFT(@tempItemList, @i - 1)
        INSERT INTO @IDTable(Item) VALUES(@Item)
        IF @i = 0
            SET @tempItemList = ''
        ELSE
            SET @tempItemList = RIGHT(@tempItemList, LEN(@tempItemList) - @i)
        SET @i = CHARINDEX(@delimiter, @tempItemList)
    END
    RETURN
END
And this is how you could call it:
DELETE FROM Table WHERE (ID IN (SELECT Item FROM dbo.Split(@idList, ',')));
I want to give this discussion a little more context. This seems to fall under the topic of "how do I get multiple rows of data to SQL". In @Kate's case she is trying to DELETE-WHERE-IN, but useful strategies for this use case are very similar to strategies for UPDATE-FROM-WHERE-IN or INSERT INTO-SELECT FROM. The way I see it there are a few basic strategies.
1. String Concatenation
   This is the oldest and most basic way. You do a simple "SELECT * FROM MyTable WHERE ID IN (" + someCSVString + ");"
   - Super simple.
   - Easiest way to open yourself to a SQL injection attack.
   - Effort you put into cleansing the string would be better spent on one of the other solutions.
2. Object Mapper
   As @MarcGravell suggested you can use something like dapper-dot-net, just as Linq-to-sql or Entity Framework would work. Dapper lets you do connection.Execute("delete from MyTable where ID in @ids", new { ids=listOfIds }); Similarly Linq would let you do something like from t in MyTable where myIntArray.Contains( t.ID )
   - Object mappers are great.
   - However, if your project is straight ADO this is a pretty serious change to accomplish a simple task.
3. CSV Split
   In this strategy you pass a CSV string to SQL, whether ad-hoc or as a stored procedure parameter. The string is processed by a table-valued UDF that returns the values as a single-column table.
   - This has been a winning strategy since SQL 2000.
   - @TimSchmelter gave a great example of a CSV split function.
   - If you google this there are hundreds of articles examining every aspect from the basics to performance analysis across various string lengths.
4. Table Valued Parameters
   In SQL 2008 custom "table types" can be defined. Once the table type is defined it can be constructed in ADO and passed down as a parameter.
   - The benefit here is that it works for more scenarios than just an integer list: it can support multiple columns.
   - Strongly typed.
   - Pulls string processing back up to a layer/language that is quite good at it.
   - This is a fairly large topic, but Table-Valued Parameters in SQL Server 2008 (ADO.NET) is a good starting point.
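To make the table-valued parameter strategy a bit more concrete, here is a rough sketch of the client-side half: turning a list of IDs into the DataTable that would be passed down. IdListTable and the dbo.IdList table type are assumptions for illustration; the type would have to be created on the server first, e.g. CREATE TYPE dbo.IdList AS TABLE (Id int NOT NULL).

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class IdListTable
{
    // Builds a single-column DataTable whose shape matches the assumed
    // table type dbo.IdList (one int column named Id).
    public static DataTable FromIds(IEnumerable<int> ids)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        foreach (var id in ids)
            table.Rows.Add(id); // one row per id
        return table;
    }

    // Attaching it to a SqlCommand would look roughly like this (not run here):
    //   SqlParameter p = cmd.Parameters.Add("@ids", SqlDbType.Structured);
    //   p.TypeName = "dbo.IdList";
    //   p.Value = IdListTable.FromIds(listOfIds);
    // with a statement such as
    //   DELETE FROM MyTable WHERE ID IN (SELECT Id FROM @ids);
}
```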

Getting the Last Insert ID with SQLite.NET in C#

I have a simple problem with a not so simple solution... I am currently inserting some data into a database like this:
kompenzacijeDataSet.KompenzacijeRow kompenzacija = kompenzacijeDataSet.Kompenzacije.NewKompenzacijeRow();
kompenzacija.Datum = DateTime.Now;
kompenzacija.PodjetjeID = stranka.id;
kompenzacija.Znesek = Decimal.Parse(tbZnesek.Text);
kompenzacijeDataSet.Kompenzacije.Rows.Add(kompenzacija);
kompenzacijeDataSetTableAdapters.KompenzacijeTableAdapter kompTA = new kompenzacijeDataSetTableAdapters.KompenzacijeTableAdapter();
kompTA.Update(this.kompenzacijeDataSet.Kompenzacije);
this.currentKompenzacijaID = LastInsertID(kompTA.Connection);
The last line is important. Why do I supply a connection? Well there is a SQLite function called last_insert_rowid() that you can call and get the last insert ID. Problem is it is bound to a connection and .NET seems to be reopening and closing connections for every dataset operation. I thought getting the connection from a table adapter would change things. But it doesn't.
Would anyone know how to solve this? Maybe where to get a constant connection from? Or maybe something more elegant?
Thank you.
EDIT:
This is also a problem with transactions, I would need the same connection if I would want to use transactions, so that is also a problem...
Using C# (.NET 4.0) with SQLite, the SQLiteConnection class has a property LastInsertRowId that equals the integer primary key of the most recently inserted row.
If the table doesn't have an integer primary key, the rowid is returned instead (in this case the rowid column is created automatically).
See https://www.sqlite.org/c3ref/last_insert_rowid.html for more.
As for wrapping multiple commands in a single transaction, any commands entered after the transaction begins and before it is committed are part of one transaction.
long rowID;
using (SQLiteConnection con = new SQLiteConnection([datasource]))
{
    con.Open(); // the connection must be open before a transaction can begin
    SQLiteTransaction transaction = con.BeginTransaction();
    ... [execute insert statement]
    rowID = con.LastInsertRowId;
    transaction.Commit();
}
select last_insert_rowid();
And you will need to execute it as a scalar query.
string sql = @"select last_insert_rowid()";
command.CommandText = sql;
long lastId = (long)command.ExecuteScalar(); // Need to type-cast since `ExecuteScalar` returns an object.
last_insert_rowid() is part of the solution. It returns a row number, not the actual ID.
cmd = CNN.CreateCommand();
cmd.CommandText = "SELECT last_insert_rowid()";
object i = cmd.ExecuteScalar();
cmd.CommandText = "SELECT " + ID_Name + " FROM " + TableName + " WHERE rowid=" + i.ToString();
i = cmd.ExecuteScalar();
I'm using the Microsoft.Data.Sqlite package and I do not see a LastInsertRowId property. But you don't have to make a second trip to the database to get the last ID. Instead, combine both SQL statements into a single string.
string sql = @"
    insert into MyTable values (null, @name);
    select last_insert_rowid();";
using (var cmd = conn.CreateCommand()) {
    cmd.CommandText = sql;
    cmd.Parameters.Add("@name", SqliteType.Text).Value = "John";
    int lastId = Convert.ToInt32(cmd.ExecuteScalar());
}
There seem to be answers here for both Microsoft's library and SQLite's own, and that is the reason some people are getting the LastInsertRowId property to work and others aren't.
Personally I don't use a PK, as it's just an alias for the rowid column. Using the rowid is around twice as fast as one that you create. If I have a TEXT column for a PK I still use rowid and just make the text column unique. (For SQLite 3 only. You need your own for v1 & v2, as vacuum will alter rowid numbers.)
That said, the way to get the information from a record in the last insert is the code below. Since the function does a left join to itself I LIMIT it to 1 just for speed, even if you don't there will only be 1 record from the main SELECT statement.
SELECT my_primary_key_column FROM my_table
WHERE rowid in (SELECT last_insert_rowid() LIMIT 1);
The SQLiteConnection object has a property for that, so there is no need for an additional query.
After the INSERT you can just use the LastInsertRowId property of the SQLiteConnection object that was used for the INSERT command.
The type of the LastInsertRowId property is Int64.
Of course, as you already know, for auto increment to work the primary key on the table must be set to be an AUTOINCREMENT field, which is another topic.
database = new SQLiteConnection(databasePath);
public int GetLastInsertId()
{
return (int)SQLite3.LastInsertRowid(database.Handle);
}
How about just running two SQL statements together using ExecuteScalar?
Person is an object that has an Id and a Name property.
var connString = LoadConnectionString(); // get connection string
using (var conn = new SQLiteConnection(connString)) // connect to sqlite
{
    // insert new record and get Id of inserted record
    var sql = @"INSERT INTO People (Name) VALUES (@Name);
                SELECT Id FROM People
                ORDER BY Id DESC";
    var lastId = conn.ExecuteScalar(sql, person);
}
In EF Core 5 you can get ID in the object itself without using any "last inserted".
For example:
var r = new SomeData() { Name = "New Row", ...};
dbContext.Add(r);
dbContext.SaveChanges();
Console.WriteLine(r.ID);
you would get new ID without thinking of using correct connection or thread-safety etc.
If you're using the Microsoft.Data.Sqlite package, it doesn't include a LastInsertRowId property in the SqliteConnection class, but you can still call the last_insert_rowid function by using the underlying SQLitePCL library. Here's an extension method:
using Microsoft.Data.Sqlite;
using SQLitePCL;
public static long GetLastInsertRowId(this SqliteConnection connection)
{
var handle = connection.Handle ?? throw new NullReferenceException("The connection is not open.");
return raw.sqlite3_last_insert_rowid(handle);
}

Possible to get PrimaryKey IDs back after a SQL BulkCopy?

I am using C# and using SqlBulkCopy. I have a problem though. I need to do a mass insert into one table then another mass insert into another table.
These 2 have a PK/FK relationship.
Table A
Field1 -PK auto incrementing (easy to do SqlBulkCopy as straight forward)
Table B
Field1 -PK/FK - This field makes the relationship and is also the PK of this table. It is not auto incrementing and needs to have the same id as to the row in Table A.
So these tables have a one to one relationship but I am unsure how to get back all those PK Id that the mass insert made since I need them for Table B.
Edit
Could I do something like this?
SELECT *
FROM Product
WHERE NOT EXISTS (SELECT * FROM ProductReview WHERE Product.ProductId = ProductReview.ProductId AND Product.Qty IS NULL AND Product.ProductName != 'Ipad')
This should find all the rows that were just inserted with the SQL bulk copy. I am not sure how to take the results from this and then do a mass insert with them from an SP.
The only problem I can see with this is that if a user is doing the records one at a time and this statement runs at the same time, it could try to insert a row twice into the "Product Review Table".
So say I got like one user using the manual way and another user doing the mass way at about the same time.
manual way.
1. User submits data
2. A LINQ to SQL Product object is made, filled with the data, and submitted.
3. this object now contains the ProductId
4. Another linq to sql object is made for the Product review table and is inserted(Product Id from step 3 is sent along).
Mass way.
1. User grabs data from a user sharing the data.
2. All Product rows from the sharing user are grabbed.
3. SQL Bulk copy insert on Product rows happens.
4. My SP selects all rows that only exist in the Product table and meets some other conditions
5. Mass insert happens with those rows.
So what happens if step 3 (manual way) is happening at the same time as step 4 (mass way)? I think it would try to insert the same row twice, causing a primary key constraint exception.
In that scenario, I would use SqlBulkCopy to insert into a staging table (i.e. one that looks like the data I want to import, but isn't part of the main transactional tables), and then, at the DB, do an INSERT/SELECT to move the data into the first real table.
Now I have two choices depending on the server version: I could do a second INSERT/SELECT into the second real table, or I could use the INSERT/OUTPUT clause to do the second insert, using the identity values generated by the first table.
For example:
-- dummy schema
CREATE TABLE TMP (data varchar(max))
CREATE TABLE [Table1] (id int not null identity(1,1), data varchar(max))
CREATE TABLE [Table2] (id int not null identity(1,1), id1 int not null, data varchar(max))
-- imagine this is the SqlBulkCopy
INSERT TMP VALUES('abc')
INSERT TMP VALUES('def')
INSERT TMP VALUES('ghi')
-- now push into the real tables
INSERT [Table1]
OUTPUT INSERTED.id, INSERTED.data INTO [Table2](id1,data)
SELECT data FROM TMP
If your app allows it, you could add another column in which you store an identifier of the bulk insert (a guid for example). You would set this id explicitly.
Then after the bulk insert, you just select the rows that have that identifier.
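A minimal sketch of that batch-identifier idea on the client side, assuming the destination table has an extra uniqueidentifier column (called BatchId here, a made-up name) that is mapped in the bulk copy; BatchTagging is hypothetical too:

```csharp
using System;
using System.Data;

public static class BatchTagging
{
    // Stamps every row of the DataTable with one freshly generated GUID and
    // returns it, so the rows can be selected back after the bulk insert.
    public static Guid TagWithBatchId(DataTable table, string columnName = "BatchId")
    {
        var batchId = Guid.NewGuid();

        // Add the column only if the DataTable does not have it yet.
        if (!table.Columns.Contains(columnName))
            table.Columns.Add(columnName, typeof(Guid));

        foreach (DataRow row in table.Rows)
            row[columnName] = batchId;

        return batchId;
    }
}
```

After SqlBulkCopy.WriteToServer, the generated keys could then be read with a parameterized query along the lines of SELECT Id FROM MyTable WHERE BatchId = @batchId.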
I had the same issue where I had to get back ids of the rows inserted with SqlBulkCopy.
My ID column was an identity column.
Solution:
I have inserted 500+ rows with bulk copy, and then selected them back with the following query:
SELECT TOP InsertedRowCount *
FROM MyTable
ORDER BY ID DESC
This query returns the rows I have just inserted with their IDs. In my case I had another unique column, so I selected that column and the ID, then mapped them with an IDictionary like so:
IDictionary<string, int> mymap = new Dictionary<string, int>();
mymap[Name] = ID;
Hope this helps.
My approach is similar to what RiceRiceBaby described, except one important thing to add is that the call to retrieve Max(Id) needs to be a part of a transaction, along with the call to SqlBulkCopy.WriteToServer. Otherwise, someone else may insert during your transaction and this would make your Id's incorrect. Here is my code:
public static void BulkInsert<T>(List<ColumnInfo> columnInfo, List<T> data, string
destinationTableName, SqlConnection conn = null, string idColumn = "Id")
{
NLogger logger = new NLogger();
var closeConn = false;
if (conn == null)
{
closeConn = true;
conn = new SqlConnection(_connectionString);
conn.Open();
}
SqlTransaction tran =
conn.BeginTransaction(System.Data.IsolationLevel.Serializable);
try
{
var options = SqlBulkCopyOptions.KeepIdentity;
var sbc = new SqlBulkCopy(conn, options, tran);
var command = new SqlCommand(
$"SELECT Max({idColumn}) from {destinationTableName};", conn,
tran);
var id = command.ExecuteScalar();
int maxId = 0;
if (id != null && id != DBNull.Value)
{
maxId = Convert.ToInt32(id);
}
data.ForEach(d =>
{
maxId++;
d.GetType().GetProperty(idColumn).SetValue(d, maxId);
});
var dt = ConvertToDataTable(columnInfo, data);
sbc.DestinationTableName = destinationTableName;
foreach (System.Data.DataColumn dc in dt.Columns)
{
sbc.ColumnMappings.Add(dc.ColumnName, dc.ColumnName);
}
sbc.WriteToServer(dt);
tran.Commit();
if(closeConn)
{
conn.Close();
conn = null;
}
}
catch (Exception ex)
{
tran.Rollback();
logger.Write(LogLevel.Error, $@"An error occurred while performing a bulk
insert into table {destinationTableName}. The entire
transaction has been rolled back.
{ex.ToString()}");
throw; // rethrow without resetting the stack trace
}
}
Depending on your needs and how much control you have of the tables, you may want to consider using UNIQUEIDENTIFIERs (Guids) instead of your IDENTITY primary keys. This moves key management outside of the database and into your application. There are some serious tradeoffs to this approach, so it may not meet your needs. But it may be worth considering. If you know for sure that you'll be pumping a lot of data into your tables via bulk-insert, it is often really handy to have those keys managed in your object model rather than your application relying on the database to give you back the data.
You could also take a hybrid approach with staging tables as suggested before. Get the data into those tables using GUIDs for the relationships, and then via SQL statements you could get the integer foreign keys in order and pump data into your production tables.
I would:
1. Turn on identity insert on the table.
2. Grab the Id of the last row of the table.
3. Loop over the rows of your DataTable: for (int i = 0; i < dataTable.Rows.Count; i++).
4. In the loop, assign Id + i + 1 to the Id property of row i.
5. Run your SQL bulk insert with keep identity turned on.
6. Turn identity insert back off.
I think that's the safest way to get your IDs on a SQL bulk insert, because it will prevent mismatched IDs that could be caused by the application being executed on another thread.
Disclaimer: I'm the owner of the project C# Bulk Operations
The library overcomes SqlBulkCopy limitations and adds flexible features such as outputting inserted identity values.
Behind the scenes, it does exactly what the accepted answer does, but is way easier to use.
var bulk = new BulkOperation(connection);
// Output Identity
bulk.ColumnMappings.Add("ProductID", ColumnMappingDirectionType.Output);
// ... Column Mappings...
bulk.BulkInsert(dt);

SQL Insert one row or multiple rows data?

I am working on a console application to insert data to a MS SQL Server 2005 database. I have a list of objects to be inserted. Here I use Employee class as example:
List<Employee> employees;
What I can do is to insert one object at time like this:
foreach (Employee item in employees)
{
string sql = @"INSERT INTO Mytable (id, name, salary)
values ('@id', '@name', '@salary')";
// replace @par with values
cmd.CommandText = sql; // cmd is IDbCommand
cmd.ExecuteNonQuery();
}
Or I can build a bulk insert query like this:
string sql = @"INSERT INTO MyTable (id, name, salary) ";
int count = employees.Count;
int index = 0;
foreach (Employee item in employees)
{
    sql = sql + string.Format(
        "SELECT {0}, '{1}', {2} ",
        item.ID, item.Name, item.Salary);
    if (index != (count - 1))
        sql = sql + " UNION ALL ";
    index++;
}
cmd.CommandText = sql;
cmd.ExecuteNonQuery();
I guess the latter case is going to insert all the rows of data at once. However, if I have several ks of data, is there any limit on the length of the SQL query string?
I am not sure whether one insert with multiple rows is better than one insert per row of data, in terms of performance.
Any suggestions to do it in a better way?
Actually, the way you have it written, your first option will be faster.
Your second example has a problem in it. You are doing sql = sql + etc. This creates a new string object on each iteration of the loop (check out the StringBuilder class). Technically, you create a new string object in the first example too, but the difference is that it doesn't have to copy all the contents of the previous string over.
The way you have it set up, SQL Server is going to have to potentially evaluate a massive query when you finally send it which is definitely going to take some time to figure out what it is supposed to do. I should state, this is dependent on how large the number of inserts you need to do. If n is small, you are probably going to be ok, but as it grows your problem will only get worse.
Bulk inserts are faster than individual ones due to how SQL Server handles batch transactions. If you are going to insert data from C#, you should take the first approach and wrap, say, every 500 inserts into a transaction and commit it, then do the next 500 and so on. This also has the advantage that if a batch fails, you can trap it, figure out what went wrong, and re-insert just those rows. There are other ways to do it, but that would definitely be an improvement over the two examples provided.
var iCounter = 0;
IDbTransaction tran = null;
foreach (Employee item in employees)
{
    if (iCounter == 0)
    {
        // Transactions are started on the connection, not on the command
        tran = cmd.Connection.BeginTransaction();
        cmd.Transaction = tran;
    }
    string sql = @"INSERT INTO Mytable (id, name, salary)
        values ('@id', '@name', '@salary')";
    // replace @par with values
    cmd.CommandText = sql; // cmd is IDbCommand
    cmd.ExecuteNonQuery();
    iCounter++;
    if (iCounter >= 500)
    {
        tran.Commit();
        iCounter = 0;
    }
}
if (iCounter > 0)
    tran.Commit();
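The counter bookkeeping above can also be pulled out into a small helper that splits the list into groups of 500 first, so each group maps onto one transaction. This is just a sketch of that idea with made-up names (Batching, InBatches); newer .NET versions also ship Enumerable.Chunk, which does much the same.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Lazily yields consecutive groups of at most 'size' items.
    // Being an iterator, the argument check runs when enumeration starts.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException(nameof(size));

        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;          // hand out a full batch
                batch = new List<T>(size);   // and start a new one
            }
        }

        if (batch.Count > 0)
            yield return batch;              // trailing partial batch
    }
}
```

The insert loop then becomes foreach (var batch in Batching.InBatches(employees, 500)) { begin a transaction, insert each item of the batch, commit }, which also makes it easy to catch a failure for one batch and retry only those rows.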
In MS SQL Server 2008 you can create a user-defined table type that describes your table:
CREATE TYPE MyUdt AS TABLE (Id int, Name nvarchar(50), salary int)
Then you can use this type in your stored procedures and in your C# code to do batch inserts.
SP:
CREATE PROCEDURE uspInsert
    (@MyTvp AS MyUdt READONLY)
AS
    INSERT INTO [MyTable]
    SELECT * FROM @MyTvp
C# (imagine that the records you need to insert are already contained in the table "MyTable" of DataSet ds):
using(conn)
{
    SqlCommand cmd = new SqlCommand("uspInsert", conn);
    cmd.CommandType = CommandType.StoredProcedure;
    SqlParameter myParam = cmd.Parameters.AddWithValue
        ("@MyTvp", ds.Tables["MyTable"]);
    myParam.SqlDbType = SqlDbType.Structured;
    myParam.TypeName = "dbo.MyUdt";
    // Execute the stored procedure
    cmd.ExecuteNonQuery();
}
So, this is the solution.
Finally, I want to discourage you from using code like yours (building a string and then executing it), because that way of executing queries is open to SQL injection.
Look at this thread,
I've answered there about table-valued parameters.
Bulk-copy is usually faster than doing inserts on your own.
If you still want to do it in one of your suggested ways, you should make it so that you can easily change the size of the queries you send to the server. That way you can optimize for speed in your production environment later on. Query times may vary a lot depending on the query size.
The maximum batch size for a SQL Server query is listed as 65,536 * the network packet size. The network packet size is 4 KB by default (which gives 256 MB) but can be changed. Check out the Maximum capacity article for SQL 2008 to get the scope. SQL 2005 also appears to have the same limit.
