SQL Transaction with ADO.NET - C#

I am new to database interaction with C#. I am trying to write 10,000 records to the database in a loop using SqlCommand and SqlConnection objects, with a SqlTransaction that commits every 5,000 records. It is taking 10 seconds to process.
SqlConnection myConnection = new SqlConnection("..Connection String..");
myConnection.Open();
SqlCommand myCommand = new SqlCommand();
myCommand.CommandText = "exec StoredProcedureInsertOneRowInTable Param1, Param2........";
myCommand.Connection = myConnection;
SqlTransaction myTrans = myConnection.BeginTransaction();
myCommand.Transaction = myTrans;
for (int i = 0; i < 10000; i++)
{
    myCommand.ExecuteNonQuery();
    if (i % 5000 == 0)
    {
        myTrans.Commit();
        myTrans = myConnection.BeginTransaction();
        myCommand.Transaction = myTrans;
    }
}
The above code gives me only about 1,000 row writes/sec in the database.
But when I implemented the same logic in T-SQL and executed it on the database from SQL Server Management Studio, it gave me about 10,000 writes/sec.
When I compare the behaviour of the two approaches, the ADO.NET execution shows a large number of logical reads.
My questions are:
1. Why are there logical reads in the ADO.NET execution?
2. Does the transaction involve some handshaking?
3. Why do they not appear when executing from Management Studio?
4. If I want very fast insert transactions on the DB, what would be the approach?
Updated information about database objects
Table: tbl_FastInsertTest
No primary key; only 5 fields, the first three of type int (F1, F2, F3) and the last two (F4, F5) of type varchar(30)
Stored procedure:
create proc stp_FastInsertTest
(
    @nF1 int,
    @nF2 int,
    @nF3 int,
    @sF4 varchar(30),
    @sF5 varchar(30)
)
as
begin
    set nocount on
    insert into tbl_FastInsertTest
    (
        [F1],
        [F2],
        [F3],
        [F4],
        [F5]
    )
    values
    (
        @nF1,
        @nF2,
        @nF3,
        @sF4,
        @sF5
    )
end
--------------------------------------------------------------------------------------
SQL block executed in SSMS
--When I execute the following code in SSMS it gives me more than 10,000 writes per second, but when I execute the same stored procedure through ADO.NET it gives me 1,000 to 1,200 writes per second
--no locks while reading
begin tran
declare @i int
set @i=0
while(1<>0)
begin
    exec stp_FastInsertTest 1,2,3,'vikram','varma'
    set @i=@i+1
    if(@i=5000)
    begin
        commit tran
        set @i=0
        begin tran
    end
end

If you are running something like:
exec StoredProcedureInsertOneRowInTable 'blah', ...
exec StoredProcedureInsertOneRowInTable 'bloop', ...
exec StoredProcedureInsertOneRowInTable 'more', ...
in SSMS, that is an entirely different scenario, where all of that is a single batch. With ADO.NET you are paying a round-trip per ExecuteNonQuery - I'm actually impressed it managed 1000/s.
Re the logical reads, that could just be looking at the query-plan cache, but without knowing more about StoredProcedureInsertOneRowInTable it is impossible to comment on whether something query-specific is afoot. But I suspect you have some different SET conditions between SSMS and ADO.NET that are forcing it to use a different plan - this is in particular a problem with things like persisted computed indexed columns, and columns "promoted" out of a sql-xml field.
Re making it faster - in this case it sounds like table-valued parameters are exactly the thing, but you should also review the other options here.
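For illustration, here is a rough sketch of what the table-valued parameter route could look like for tbl_FastInsertTest; the table type and the set-based procedure below are hypothetical names you would have to create, and the whole batch then costs a single round-trip:
// Hypothetical one-time T-SQL setup (names are assumptions, not from the question):
//   CREATE TYPE dbo.FastInsertRowType AS TABLE
//       (F1 int, F2 int, F3 int, F4 varchar(30), F5 varchar(30));
//   CREATE PROC stp_FastInsertMany (@rows dbo.FastInsertRowType READONLY) AS
//       INSERT INTO tbl_FastInsertTest (F1, F2, F3, F4, F5)
//       SELECT F1, F2, F3, F4, F5 FROM @rows;

// C# side (requires System.Data and System.Data.SqlClient):
var rows = new DataTable();
rows.Columns.Add("F1", typeof(int));
rows.Columns.Add("F2", typeof(int));
rows.Columns.Add("F3", typeof(int));
rows.Columns.Add("F4", typeof(string));
rows.Columns.Add("F5", typeof(string));
for (int i = 0; i < 10000; i++)
    rows.Rows.Add(1, 2, 3, "vikram", "varma");   // sample values from the question

using (var conn = new SqlConnection("..Connection String.."))
using (var cmd = new SqlCommand("stp_FastInsertMany", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    SqlParameter p = cmd.Parameters.AddWithValue("@rows", rows);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.FastInsertRowType";

    conn.Open();
    cmd.ExecuteNonQuery();   // all 10,000 rows in one round-trip
}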

For performant inserts, take a look at the SqlBulkCopy class; if it works for you, it should be fast.
As Sean said, using parameterized queries is always a good idea.
Using a StringBuilder to batch a thousand INSERT statements into a single command and committing the transaction is a proven way of inserting data:
var sb = new StringBuilder();
for (int i = 0; i < 1000; i++)
{
    sb.AppendFormat("INSERT INTO Table(col1,col2) VALUES({0},{1});", values1[i], values2[i]);
}
sqlCommand.CommandText = sb.ToString();
sqlCommand.ExecuteNonQuery();
Your code doesn't look right to me; you are not committing the transaction at each batch, and your code keeps opening new transactions.
It is always good practice to drop indexes while inserting a lot of data, and to add them back later. Indexes will slow down your writes.
SQL Server Management Studio does not add transactions itself, but SQL Server supports them; try this:
BEGIN TRANSACTION MyTransaction
INSERT INTO Table(Col1,Col2) VALUES(Val10,Val20);
INSERT INTO Table(Col1,Col2) VALUES(Val11,Val21);
INSERT INTO Table(Col1,Col2) VALUES(Val12,Val22);
COMMIT TRANSACTION

You need to use a parameterized query so that the execution plan can get processed and cached. Since you're using string concatenation (shudder, this is bad, google SQL injection) to build the query, SQL Server treats those 10,000 queries as separate, individual queries and builds an execution plan for each one.
MSDN: http://msdn.microsoft.com/en-us/library/yy6y35y8.aspx although you're going to want to simplify their code a bit and you'll have to reset the parameters on the command.
If you really, really want to get the data into the DB fast, think about using bcp... but you had better make sure the data is clean first (as there's no real error checking/handling on it).
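As a rough illustration of the parameterized approach applied to the original loop (parameter names follow the stp_FastInsertTest example above; the values are placeholders), the parameters are created once and only their values change per row:
using (var conn = new SqlConnection("..Connection String.."))
{
    conn.Open();
    SqlTransaction tran = conn.BeginTransaction();
    using (var cmd = new SqlCommand("stp_FastInsertTest", conn, tran))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@nF1", SqlDbType.Int);
        cmd.Parameters.Add("@nF2", SqlDbType.Int);
        cmd.Parameters.Add("@nF3", SqlDbType.Int);
        cmd.Parameters.Add("@sF4", SqlDbType.VarChar, 30);
        cmd.Parameters.Add("@sF5", SqlDbType.VarChar, 30);

        for (int i = 0; i < 10000; i++)
        {
            // placeholder values; in real code these come from your data source
            cmd.Parameters["@nF1"].Value = 1;
            cmd.Parameters["@nF2"].Value = 2;
            cmd.Parameters["@nF3"].Value = 3;
            cmd.Parameters["@sF4"].Value = "vikram";
            cmd.Parameters["@sF5"].Value = "varma";
            cmd.ExecuteNonQuery();

            if ((i + 1) % 5000 == 0)          // commit in batches of 5,000
            {
                tran.Commit();
                tran = conn.BeginTransaction();
                cmd.Transaction = tran;
            }
        }
        tran.Commit();                        // commits the final (possibly empty) batch
    }
}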

Related

Improve SQLite writing speed C#

I need to dramatically improve the writing speed for SQLite (or maybe you can suggest another solution outside of SQLite).
Scenario :
I have 71 Columns with 365 * 24 * 60 values each. (365 = days)
I do "insert intos" for testing the db_performance
To shorten the testing-time I did the tests for 90 days instead of 365 (so the result-timespans will be x4)
Settings :
I've tried various PRAGMAS like
synchronous off
locking_mode exclusive
cache & pagesize with different values (though I read low values may improve performance, for me higher values did a good job)
journal_mode off
changing timeout values
Approaches :
#A1 Gathering all "INSERT INTO"s, ExecuteNonQuery each, and at the end do one giant transaction
#A2 The same as above but with Parallel.ForEach and ExecuteNonQueryAsync
#A3 Gathering all "INSERT INTO"s for one day and doing one transaction each
Tablestructure :
#T1 One table with all the columns
#T2 One table for each column
Results :
I did runs for 90 days ( so it doesn't take too long ) and the main problem is writing speed.
I measured 5 phases, which are :
#P1 setup the tables & headers ( ~ 8-9ms)
#P2 prepare the data (for every "INSERT INTO" command do ExecuteNonQuery) (~ 15000-18000ms !)
#P3 do the transaction (~ 200-500 ms)
#P4 read one complete column ( ~80 - 200 ms)
#P5 delete one complete column ( ~ 1 - 9 ms)
I tried all the different methods and approaches I mentioned before, but couldn't manage to improve #P2. Any ideas how to fix that? Or maybe any hint for a better serverless DB solution (Realm?)?
Here's the code for #A1 #P2 #T2, which had the best results so far...
using (var transaction = sqLiteConnection.BeginTransaction())
{
    using (var command = sqLiteConnection.CreateCommand())
    {
        foreach (var vcommand in values_list)
        {
            command.CommandText = vcommand;
            command.ExecuteNonQuery();
        }
    }
    transaction.Commit();
}
(values_list is a string[] with 71*90 INSERT INTOs, or in Mark's version one giant command.)
Edit/Update :
I tried Mark Benningfield's approach, making one giant "INSERT INTO" for all values in one table with all columns, and could improve the overall speed to ~8500ms (#P2 ~7500ms).
Final Update :
OK, I did a bunch of tests and will summarize the results:
For comparison reasons all databases had the same values, a two-dimensional double array with [129600,71] values. None of them used a prepared insert statement, so the generation time for transforming the values into the needed format is included (phase 2).
SQLite needs ~14 seconds with one giant transaction (the previous ~8s were without generating the insert-into command live). SQL CE is currently the best for this scenario, mainly because it does not operate on strings ("INSERT INTO") but on DataTables and rows plus a bulk insert. Realm is interesting, especially for mobile users - very intuitive. But you cannot add dynamic objects at the moment (so you need a static object). Influx is another nice database for time series, but it's very specific, not embedded, and has IMHO a poor C# implementation (it may perform much better via the console).
Have you tried writing the data to a text file and then using the import command (see Importing CSV files)? Unlike INSERT commands, these routines usually ignore triggers and work with direct table access.
Make your insert command look like this (by constructing it however you need to):
INSERT INTO table (col1, col2, col3) VALUES (val1, 'val2', val3),
(val1, 'val2', val3),
(val2, 'val2', val3),
...
(val1, 'val2', val3);
Then execute the single insert command to do a bulk update of known data.
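A rough C# sketch of building such a multi-row statement for one chunk (table and column names are placeholders, "values" is assumed to be a double[,], and older SQLite versions limit how many rows one statement may carry, so chunking may be needed):
// Requires System.Text and System.Globalization; names below are assumptions.
var sb = new StringBuilder("INSERT INTO T1 (c0, c1, c2) VALUES ");
for (int row = 0; row < values.GetLength(0); row++)
{
    if (row > 0) sb.Append(',');
    sb.AppendFormat(CultureInfo.InvariantCulture, "({0},{1},{2})",
        values[row, 0], values[row, 1], values[row, 2]);
}

using (var command = sqLiteConnection.CreateCommand())
{
    command.CommandText = sb.ToString();
    command.ExecuteNonQuery();   // one statement for the whole chunk instead of one per row
}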
The fastest and recommended way to do bulk inserts is to use prepared statements with parameters. That way, a statement (command) is only parsed and prepared once, instead of having to parse it again for every row. SQLite also does not have to parse the parameter values from the command text, but they are supplied and used directly. For each row, you only switch parameters.
So instead of going this way:
using (var transaction = sqLiteConnection.BeginTransaction())
{
    using (var command = sqLiteConnection.CreateCommand())
    {
        foreach (var vcommand in values_list)
        {
            command.CommandText = vcommand;
            command.ExecuteNonQuery();
        }
    }
    transaction.Commit();
}
You should do it like this:
using (var transaction = sqLiteConnection.BeginTransaction())
{
    using (var command = sqLiteConnection.CreateCommand())
    {
        // Create command and parameters
        command.CommandText = "INSERT INTO MyTable VALUES (?, ?)";
        var param1 = command.Parameters.Add(null, SqliteType.Integer);
        var param2 = command.Parameters.Add(null, SqliteType.Text);
        foreach (var item in values_list)
        {
            // For each row, only update parameter values
            param1.Value = item.IntProperty;
            param2.Value = item.TextProperty;
            command.ExecuteNonQuery();
        }
    }
    transaction.Commit();
}
This will perform much better. The statement is only parsed on the first execute; all following executes use the already prepared statement. It also safeguards you against SQL injection attacks: text parameters are not inserted into the actual SQL statement string, which would allow manipulation of your statement. Instead, they are passed directly as values to SQLite. So you not only gain performance, you have also prevented one of the most common database attack scenarios.
General rule: Never put values directly into SQL statements. Always use parameters instead.
Note: There are also other ways to create and use parameters; this is just one example of how to do it. For example, you can also use named parameters:
// Create command and parameters
command.CommandText = "INSERT INTO MyTable VALUES (@one, @two)";
var param1 = command.Parameters.Add("@one", SqliteType.Integer);
var param2 = command.Parameters.Add("@two", SqliteType.Text);

Why is the insert statement slow with SqlCommand?

I am using SqlCommand to insert multiple records into the database, but it takes a long time to insert 2000 records. I wrote the following code:
using (SqlConnection sql = new SqlConnection(connectionString))
{
    using (SqlCommand cmd = new SqlCommand(query, sql))
    {
        sql.Open();
        int ff = 0;
        while (ff < 2000)
        {
            cmd.ExecuteNonQuery(); // It takes 139 milliseconds approximately
            ff++;
            Console.WriteLine(ff);
        }
    }
}
But when I execute the following script in SSMS (SQL Server Management Studio), the 2000 records are stored in 15 seconds:
declare @i int;
set @i=1
while (@i <= 2000)
begin
    set @i=@i+1
    INSERT INTO Fulls (ts,MOTOR,HMI,SET_WEIGHT) VALUES ('2018-07-04 02:56:57','0','0','0');
end
What's going on?
Why is it so slow executing the statement?
Additional:
- The database is a SQL database hosted in Microsoft Azure.
- My internet connection speed is 20 Mbit/s.
- The above query is not the real query; the real query contains 240 columns and 240 values.
- I tried to do a transaction following this example: https://msdn.microsoft.com/en-us/library/86773566(v=vs.110).aspx
- The sql variable is of type SqlConnection.
Thanks for your help.
It sounds like there is a high latency between your SQL server and your application server. When I do this locally, the pure TSQL version runs in 363ms, and the C# version with 2000 round trips takes 1061ms (so: about 0.53ms per round-trip). Note: I took the Console.WriteLine away, because I didn't want to measure how fast Console isn't!
For 2000 inserts, this is a pretty fair comparison. If you're seeing something massively different, then I suspect:
- your SQL server is horribly under-powered - it should not take 15s (from the question) to insert 2000 rows, under any circumstances (my 363ms timing is on my desktop PC, not a fast server)
- as already suggested, you have high latency
Note there are also things like "DTC" which might impact the performance based on the connection string and ambient transactions (TransactionScope), but I'm assuming those aren't factors here.
If you need to improve the performance here, the first thing to do would be to find out why it is so horribly bad - i.e. the raw server performance is terrible, and the latency is huge. Neither of those is a coding question: those are infrastructure questions.
If you can't fix those, then you can code around them. Table-valued parameters or bulk insert (SqlBulkCopy) both provide ways to transfer multiple rows without having to pay a round-trip per execute. You can also use "MARS" (multiple active result sets) and pipelined inserts, but that is quite an advanced topic (and most people tend to recommend not enabling MARS).
Make sure to minimize the number of indexes on your table and use SqlBulkCopy as below:
DataTable sourceData = new DataTable();
using (var sqlBulkCopy = new SqlBulkCopy(_connString))
{
    sqlBulkCopy.DestinationTableName = "DestinationTableName";
    sqlBulkCopy.WriteToServer(sourceData);
}
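Applied to this question, a hedged sketch of filling that DataTable might look like the following; the column types are assumptions based on the sample INSERT and must match the real schema of the Fulls table:
var sourceData = new DataTable();
sourceData.Columns.Add("ts", typeof(DateTime));       // assumed column types
sourceData.Columns.Add("MOTOR", typeof(string));
sourceData.Columns.Add("HMI", typeof(string));
sourceData.Columns.Add("SET_WEIGHT", typeof(string));

for (int i = 0; i < 2000; i++)
    sourceData.Rows.Add(new DateTime(2018, 7, 4, 2, 56, 57), "0", "0", "0");

using (var sqlBulkCopy = new SqlBulkCopy(connectionString))
{
    sqlBulkCopy.DestinationTableName = "Fulls";
    sqlBulkCopy.BatchSize = 2000;          // optional: rows sent per batch
    sqlBulkCopy.WriteToServer(sourceData);
}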

C# optimum way to run multiple queries

I am using a C# application in order to manage a MySQL database.
What I want to do is:
Read some records.
Run some functions to calculate "stuff".
Insert "stuff" to database.
In order to calculate n-th "stuff", I must have already calculated (n-1)-th "stuff".
This is what I do:
Declare:
static MySqlCommand cmd;
static MySqlDataReader dr;
My main loop is like following:
for (...)
{
dr.Close();
cmd.CommandText = "insert into....";
dr = cmd.ExecuteReader();
}
This is taking way too long. Total number of rows to be inserted is about 2.5M.
When I use a MySQL database on a regular server, it takes about 100-150 hours. When I use a localhost database, it takes about 50 hours.
I think there should be a quicker way. My thoughts:
I think that right now I connect to and disconnect from the DB on every loop iteration. Is that true?
Could I create a CommandText that contains, for example, 100 queries (separated by semicolons)? Is this possible?
Instead of executing the queries, output them to a text file (the file will be about 300MB), then insert them into the DB using phpMyAdmin. (Bonus question: I'm using phpMyAdmin - is this OK? Is there a better (maybe non-web) interface?)
Try using a bulk insert. I found this syntax here. And then use ExecuteNonQuery() as SLaks suggested in the comments. Those combined may speed it up a good bit.
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
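A hedged C# sketch of generating that syntax in chunks with MySqlCommand; the connection, table, column names and the values1/values2 lists are placeholders:
// Requires MySql.Data.MySqlClient and System.Text; "connection" is an open MySqlConnection.
const int chunkSize = 100;                         // rows per INSERT statement
for (int offset = 0; offset < values1.Count; offset += chunkSize)
{
    int rows = Math.Min(chunkSize, values1.Count - offset);
    var sb = new StringBuilder("INSERT INTO tbl_name (a, b) VALUES ");
    using (MySqlCommand cmd = connection.CreateCommand())
    {
        for (int i = 0; i < rows; i++)
        {
            if (i > 0) sb.Append(',');
            sb.AppendFormat("(@a{0}, @b{0})", i);
            cmd.Parameters.AddWithValue("@a" + i, values1[offset + i]);
            cmd.Parameters.AddWithValue("@b" + i, values2[offset + i]);
        }
        cmd.CommandText = sb.ToString();
        cmd.ExecuteNonQuery();                     // one round-trip per chunk, not per row
    }
}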
It's possible you are using InnoDB as an access method. In this case you should try wrapping every hundred or so rows of INSERT operations in a transaction. When I have had to handle this kind of application, it's made a huge difference. To do this, structure your code like so:
/* connection is an open MySqlConnection */
MySqlCommand start = new MySqlCommand("START TRANSACTION", connection);
MySqlCommand commit = new MySqlCommand("COMMIT", connection);
int bunchSize = 100;
int remaining = bunchSize;
start.ExecuteNonQuery(); /* start the first bunch transaction */
for (/* whatever loop conditions you need */)
{
    /* whatever you need to do */
    /* your insert statement */
    if (--remaining <= 0)
    {
        commit.ExecuteNonQuery(); /* end one bunch transaction */
        start.ExecuteNonQuery();  /* and begin the next */
        remaining = bunchSize;
    }
}
commit.ExecuteNonQuery(); /* end the last bunch transaction */
It is also possible that the table to which you're inserting megarows has lots of indexes. In this case you can speed things up by beginning your series of INSERTs with
SET unique_checks=0;
SET foreign_key_checks=0;
ALTER TABLE tbl_name DISABLE KEYS;
and ending it with this sequence.
ALTER TABLE tbl_name ENABLE KEYS;
SET unique_checks=1;
SET foreign_key_checks=1;
You must take great care in your software to avoid inserting rows that would be rejected as duplicates when you use this technique, because the ENABLE KEYS operation will not work in that case.
Read this for more information: http://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-bulk-data-loading.html

C#/SQL: Execute Multiple Insert/Update in ONE Transaction

I have a customer entity with 10 Properties.
7 of those properties are saved in the customer table.
3 of those properties are saved in the test table.
The 3 properties in test table are CustomerId, Label, Text.
When I query these 3 properties I get 3 records like this:
CustomerId | Label | Text
1005 | blubb | What a day
1006 | hello | Sun is shining
0007 | |
When I save them I have to call my stored procedure 3 times on the test table.
In my SP I check whether a record with the specific CustomerId AND Label already exists;
if it does, I do an UPDATE, else an INSERT.
How would you call the stored procedure 3 times with all the CommandText, CommandType, ExecuteNonQuery etc. stuff?
The easiest way: use the TransactionScope class.
Simply put the call into a block like:
using (TransactionScope ts = new TransactionScope())
{
    using (SqlConnection conn = new SqlConnection(myconnstring))
    {
        conn.Open();
        // ... do the calls to the sproc
        ts.Complete();
        conn.Close();
    }
}
[Edit] I also added the SqlConnection, because I'm a big fan of this pattern. The using keyword ensures the connection is closed and the transaction rolled back if something goes wrong.
Well, a SqlTransaction spanning three ExecuteNonQuery calls is the simplest (a sketch follows below), but some alternatives:
- use the XML datatype to pass all three in as XML; parse the XML (SQL Server has functions for this) in the sproc into 3 records
- use a "table valued parameter" to pass them in a single call - note this needs additional definition at the DB to represent the structured data
- if the data volume is huge (3000 rather than 3), SqlBulkCopy into a staging table, then run a sproc to move the data into the real table in one set-based operation
Finally, watch out for the "inner platform effect" - it sounds a bit like a DB inside a DB.
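A minimal sketch of the first option above, a single SqlTransaction spanning the three calls; the procedure and parameter names are hypothetical:
using (var conn = new SqlConnection(myconnstring))
{
    conn.Open();
    using (SqlTransaction tran = conn.BeginTransaction())
    using (var cmd = new SqlCommand("dbo.SaveTestRow", conn, tran))   // hypothetical proc name
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@CustomerId", SqlDbType.Int);
        cmd.Parameters.Add("@Label", SqlDbType.NVarChar, 50);
        cmd.Parameters.Add("@Text", SqlDbType.NVarChar, 200);

        foreach (var row in rows)                                     // the three records
        {
            cmd.Parameters["@CustomerId"].Value = row.CustomerId;
            cmd.Parameters["@Label"].Value = (object)row.Label ?? DBNull.Value;
            cmd.Parameters["@Text"].Value = (object)row.Text ?? DBNull.Value;
            cmd.ExecuteNonQuery();
        }
        tran.Commit();   // all three upserts succeed or none do
    }
}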
There are several classes that inherit from DbTransaction. The documentation for SqlTransaction has sample code.
You should encapsulate your INSERTs in a transaction. The bad way to do it is to use a TransactionScope in ADO.NET; the good way is to write a stored procedure and BEGIN and COMMIT/ROLLBACK your transaction inside your proc. You don't want to go back and forth from client to server while maintaining a transaction, because you will hurt concurrency and performance (exclusive locks are held on the inserted resources until the transaction ends).
BEGIN TRAN
BEGIN TRY
    INSERT
    INSERT
    COMMIT TRAN
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE() -- you can use THROW in SQL Server 2012 to rethrow the error
    ROLLBACK
END CATCH

C# - Inserting multiple rows using a stored procedure

I have a list of objects; this list contains about 4 million objects. There is a stored proc that takes the object attributes as params, makes some lookups and inserts them into tables.
What's the most efficient way to insert these 4 million objects into the DB?
This is how I do it:
// connect to SQL - SqlConnection ...
foreach (var item in listofobjects)
{
    SqlCommand sc = ...
    // assign params
    sc.ExecuteNonQuery();
}
This has been really slow.
Is there a better way to do this?
This process will be a scheduled task. I will run it every hour, so I do expect high-volume data like this.
Take a look at the SqlBulkCopy Class
Based on your comment, dump the data into a staging table, then do the lookup and insert into the real table set-based from a proc... it will be much faster than row by row.
It's never going to be ideal to insert four million records from C#, but a better way to do it is to build the command text up in code so you can do it in chunks.
This is hardly bulletproof, and it doesn't illustrate how to incorporate lookups (as you've mentioned you need), but the basic idea is:
// You'd modify this to chunk it out - only testing can tell you the right
// number - perhaps 100 at a time.
for (int i = 0; i < items.Count; i++)
{
    // e.g., 'insert dbo.Customer values(@firstName1, @lastName1)'
    string newStatement = string.Format(
        "insert dbo.Customer values(@firstName{0}, @lastName{0}); ", i);
    command.CommandText += newStatement;
    command.Parameters.AddWithValue("@firstName" + i, items[i].FirstName);
    command.Parameters.AddWithValue("@lastName" + i, items[i].LastName);
}
// ...
command.ExecuteNonQuery();
I have had excellent results using XML to get large amounts of data into SQL Server. Like you, I was initially inserting rows one at a time, which took forever due to the round-trip time between the application and the server; then I switched the logic to pass in an XML string containing all the rows to insert. Time to insert went from 30 minutes to less than 5 seconds. This was for a couple of thousand rows. I have tested with XML strings up to 20 megabytes in size and there were no issues. Depending on your row size this might be an option.
The data was passed in as an XML String using the nText type.
Something like this formed the basic details of the stored procedure that did the work:
CREATE PROCEDURE XMLInsertPr (@XmlString ntext)
AS
BEGIN
    DECLARE @ReturnStatus int, @hdoc int
    EXEC @ReturnStatus = sp_xml_preparedocument @hdoc OUTPUT, @XmlString
    IF (@ReturnStatus <> 0)
    BEGIN
        RAISERROR ('Unable to open XML document', 16, 1, 50003)
        RETURN @ReturnStatus
    END
    INSERT INTO TableName
    SELECT * FROM OPENXML(@hdoc, '/XMLData/Data') WITH TableName
END
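For completeness, a hedged sketch of the client side that feeds such a procedure; the attribute names are hypothetical and must match the target table's columns, and real code should XML-escape the values:
// Build one XML string holding every row, then make a single call.
var xml = new StringBuilder("<XMLData>");
foreach (var item in listofobjects)
    xml.AppendFormat("<Data FirstName=\"{0}\" LastName=\"{1}\" />",
        item.FirstName, item.LastName);            // hypothetical attributes/columns
xml.Append("</XMLData>");

using (var cmd = new SqlCommand("XMLInsertPr", connection))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@XmlString", SqlDbType.NText).Value = xml.ToString();
    cmd.ExecuteNonQuery();                         // all rows in one round-trip
}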
You might consider dropping any indexes you have on the table(s) you are inserting into and then recreating them after you have inserted everything. I'm not sure how the bulk copy class works but if you are updating your indexes on every insert it can slow things down quite a bit.
Like Abe mentioned: drop indexes (and recreate them later).
If you trust your data: generate a SQL statement for each call to the stored proc, combine some, and then execute.
This saves you communication overhead.
The combined calls (to the stored proc) could be wrapped in a BEGIN TRANSACTION so you have only one commit per x inserts.
If this is a one-time operation: do not optimize, and run it during the night / weekend.
