C# optimum way to run multiple queries

I am using a C# application to manage a MySQL database.
What I want to do is:
Read some records.
Run some functions to calculate "stuff".
Insert "stuff" into the database.
In order to calculate the n-th "stuff", I must have already calculated the (n-1)-th "stuff".
This is what I do:
Declare:
static MySqlCommand cmd;
static MySqlDataReader dr;
My main loop looks like the following:
for (...)
{
dr.Close();
cmd.CommandText = "insert into....";
dr = cmd.ExecuteReader();
}
This is taking way too long. The total number of rows to be inserted is about 2.5M.
When I use a MySQL database on a regular server, it takes about 100-150 hours. When I use a localhost database, it takes about 50 hours.
I think there should be a quicker way. My thoughts:
I think that I currently connect to and disconnect from the database every time I loop. Is that true?
Could I create a CommandText that contains, for example, 100 queries separated by semicolons? Is this possible?
Instead of executing the queries, output them to a text file (the file will be about 300MB), then insert them into the DB using phpMyAdmin. (Bonus question: I'm using phpMyAdmin. Is this OK? Is there a better (maybe not web) interface?)

Try using a bulk insert. I found this syntax here. Then use ExecuteNonQuery(), as SLaks suggested in the comments. Combined, they may speed things up a good bit.
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
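For illustration only, a rough C# sketch of building such batched multi-row INSERTs and executing them with ExecuteNonQuery(). The table tbl_name(a, b, c), the MySql.Data connector, and the tuple row type are assumptions for this example, not code from the answer:
// Minimal sketch: batch rows into multi-row INSERTs and run each batch with one ExecuteNonQuery().
using System.Collections.Generic;
using System.Linq;
using System.Text;
using MySql.Data.MySqlClient;

static void InsertInBatches(MySqlConnection conn, IList<(int A, int B, int C)> rows, int batchSize = 100)
{
    using (var cmd = conn.CreateCommand())
    {
        for (int offset = 0; offset < rows.Count; offset += batchSize)
        {
            var batch = rows.Skip(offset).Take(batchSize).ToList();
            var sql = new StringBuilder("INSERT INTO tbl_name (a, b, c) VALUES ");
            cmd.Parameters.Clear();
            for (int i = 0; i < batch.Count; i++)
            {
                if (i > 0) sql.Append(',');
                sql.AppendFormat("(@a{0}, @b{0}, @c{0})", i);
                cmd.Parameters.AddWithValue("@a" + i, batch[i].A);
                cmd.Parameters.AddWithValue("@b" + i, batch[i].B);
                cmd.Parameters.AddWithValue("@c" + i, batch[i].C);
            }
            cmd.CommandText = sql.ToString();
            cmd.ExecuteNonQuery();   // one round-trip per batch instead of one per row
        }
    }
}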

It's possible you are using InnoDB as an access method. In this case you should try wrapping every hundred or so rows of INSERT operations in a transaction. When I have had to handle this kind of application, it's made a huge difference. To do this, structure your code like so:
MySqlCommand start = connection.CreateCommand();   /* connection is your open MySqlConnection */
start.CommandText = "START TRANSACTION";
MySqlCommand commit = connection.CreateCommand();
commit.CommandText = "COMMIT";
int bunchSize = 100;
int bunch = 0;
start.ExecuteNonQuery(); /* start the first bunch transaction */
bunch = bunchSize;
for(/*whatever loop conditions you need*/)
{
    /* whatever you need to do */
    /* your insert statement */
    if (--bunch <= 0)
    {
        commit.ExecuteNonQuery(); /* end one bunch transaction */
        start.ExecuteNonQuery();  /* and begin the next */
        bunch = bunchSize;
    }
}
commit.ExecuteNonQuery(); /* end the last bunch transaction */
It is also possible that the table into which you're inserting megarows has lots of indexes. In this case you can speed things up by beginning your series of INSERTs with
SET unique_checks=0;
SET foreign_key_checks=0;
ALTER TABLE tbl_name DISABLE KEYS;
and ending it with this sequence.
ALTER TABLE tbl_name ENABLE KEYS;
SET unique_checks=1;
SET foreign_key_checks=1;
You must take great care in your software to avoid inserting rows that would be rejected as duplicates when you use this technique, because the ENABLE KEYS operation will not work in that case.
Read this for more information: http://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-bulk-data-loading.html
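If you drive those statements from the same C# code, a minimal sketch might look like the following. It assumes one MySqlConnection stays open for the whole load, tbl_name is a placeholder, and RunBatchedInserts is a hypothetical stand-in for the bunch-transaction insert loop shown above:
// Sketch only: relax the checks, run the batched inserts, then restore the checks.
static void Exec(MySqlConnection conn, string sql)
{
    using (var cmd = new MySqlCommand(sql, conn))
        cmd.ExecuteNonQuery();
}

static void LoadData(MySqlConnection conn)
{
    Exec(conn, "SET unique_checks=0");
    Exec(conn, "SET foreign_key_checks=0");
    Exec(conn, "ALTER TABLE tbl_name DISABLE KEYS");

    RunBatchedInserts(conn);   // hypothetical helper: the transaction-batched loop from above

    Exec(conn, "ALTER TABLE tbl_name ENABLE KEYS");
    Exec(conn, "SET unique_checks=1");
    Exec(conn, "SET foreign_key_checks=1");
}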

Related

Improve SQLite writing speed C#

I need to dramatically improve writing speed for SQLite (or maybe suggest another solution for this outside of SQLite).
Scenario :
I have 71 columns with 365 * 24 * 60 values each (365 = days).
I do "insert intos" to test the DB performance.
To shorten the testing time I ran the tests for 90 days instead of 365 (so the resulting timespans will be x4).
Settings :
I've tried various PRAGMAS like
synchronous off
locking_mode exclusive
cache & pagesize with different values (though I read low values may improve performance, for me higher values did a good job)
journal_mode off
changing timeout values
Approaches :
#A1 Gathering all "insert intos", ExecuteNonQuery each, at the end do one giant transaction
#A2 the same as above but with Parallel.ForEach and ExecuteNonQueryAsync
#A3 Gathering all "insert intos" for one day and do one transaction each
Tablestructure :
#T1 One table with all the columns
#T2 One table for each column
Results :
I did runs for 90 days ( so it doesn't take too long ) and the main problem is writing speed.
I measured 5 phases, which are :
#P1 setup the tables & headers ( ~ 8-9ms)
#P2 prepare the data (for every "insert into" command do ExecuteNonQuery) ( ~ 15000-18000ms ! )
#P3 do the transaction (~ 200-500 ms)
#P4 read one complete column ( ~80 - 200 ms)
#P5 delete one complete column ( ~ 1 - 9 ms)
I tried all the different methods and approaches I mentioned before, but couldn't manage to improve #P2. Any ideas how to fix that? Or any hints for a better serverless DB solution (Realm?)?
Here's the code for the #A1 #P2 #T2, which had the best results so far...
using (var transaction = sqLiteConnection.BeginTransaction())
{
using (var command = sqLiteConnection.CreateCommand())
{
foreach (var vcommand in values_list)
{
command.CommandText = vcommand;
command.ExecuteNonQuery();
}
}
transaction.Commit();
}
(values_list is a string[] with 71*90 insert intos, or in Mark's version one giant command.)
Edit/Update :
I tried the approach by Mark Benningfield of making one giant "insert into" for all values in one table with all columns, and could improve the overall speed to ~8500ms (#P2 ~7500ms).
Final Update :
Ok I did a bunch of tests and will summarize the results:
For comparison reasons all databases had the same values, a two-dimensional double array with [129600, 71] values. None of them had a prepared insert statement, so the generation time for transforming the values into the needed format is included (phase 2).
SQLite needs ~14 seconds with one giant transaction (the previous ~8s were without generating the insert-into command live). SQL_CE is at the moment the best for this scenario, mainly because it does not operate with strings ("INSERT INTO") but with DataTables and rows plus a bulk insert. Realm is interesting, especially for mobile users - very intuitive - but you cannot add dynamic objects at the moment (so you need a static object). Influx is another nice database for time series, but it's very specific, not embedded, and has IMHO a poor C# implementation (it may perform much better via the console).
Have you tried writing the data to a text file and then using the import command (see Importing CSV files)? Unlike INSERT commands, these routines usually ignore triggers and work with direct table access.
Make your insert command look like this (by constructing it however you need to):
INSERT INTO table (col1, col2, col3) VALUES (val1, 'val2', val3),
(val1, 'val2', val3),
(val2, 'val2', val3),
...
(val1, 'val2', val3);
Then execute the single insert command to do a bulk update of known data.
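For what it's worth, a hedged sketch of constructing that single command from C#. It assumes Microsoft.Data.Sqlite, a placeholder table MyTable, and purely numeric values (which is why they are written as literals rather than parameters here):
// Sketch: build one multi-row INSERT from a double[rows, cols] array and execute it once.
using System.Globalization;
using System.Text;
using Microsoft.Data.Sqlite;   // assumption: the same shape works with System.Data.SQLite

static void BulkInsert(SqliteConnection conn, double[,] values)
{
    var sql = new StringBuilder("INSERT INTO MyTable VALUES ");
    int rows = values.GetLength(0), cols = values.GetLength(1);
    for (int r = 0; r < rows; r++)
    {
        sql.Append(r == 0 ? "(" : ",(");
        for (int c = 0; c < cols; c++)
        {
            if (c > 0) sql.Append(',');
            sql.Append(values[r, c].ToString(CultureInfo.InvariantCulture));
        }
        sql.Append(')');
    }

    using (var transaction = conn.BeginTransaction())
    using (var command = conn.CreateCommand())
    {
        command.Transaction = transaction;
        command.CommandText = sql.ToString();
        command.ExecuteNonQuery();   // a single command for the whole batch
        transaction.Commit();
    }
}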
The fastest and recommended way to do bulk inserts is to use prepared statements with parameters. That way, a statement (command) is only parsed and prepared once, instead of having to parse it again for every row. SQLite also does not have to parse the parameter values from the command text, but they are supplied and used directly. For each row, you only switch parameters.
So instead of going this way:
using (var transaction = sqLiteConnection.BeginTransaction())
{
using (var command = sqLiteConnection.CreateCommand())
{
foreach (var vcommand in values_list)
{
command.CommandText = vcommand;
command.ExecuteNonQuery();
}
}
transaction.Commit();
}
You should do it like this:
using (var transaction = sqLiteConnection.BeginTransaction())
{
using (var command = sqLiteConnection.CreateCommand())
{
// Create command and parameters
command.CommandText = "INSERT INTO MyTable VALUES (?, ?)";
var param1 = command.Parameters.Add(null, SqliteType.Integer);
var param2 = command.Parameters.Add(null, SqliteType.Text);
foreach (var item in values_list)
{
// For each row, only update parameter values
param1.Value = item.IntProperty;
param2.Value = item.TextProperty;
command.ExecuteNonQuery();
}
}
transaction.Commit();
}
This will perform much better. The statement is only parsed on first execute. All following executes will use the already prepared statement. It also safeguards you against SQL Injection attacks: Text parameters are not inserted into the actual SQL statement string, which would allow manipulation of your statement. Instead, they are passed directly as values to SQLite. So you do not only gain performance, you have also prevented one of the most common database attack scenarios.
General rule: Never put values directly into SQL statements. Always use parameters instead.
Note: There are also other ways to create and bind parameters. This is just one example of how to do it. For example, you can also use named parameters:
// Create command and parameters
command.CommandText = "INSERT INTO MyTable VALUES (#one, #two)";
var param1 = command.Parameters.Add("#one", SqliteType.Integer);
var param2 = command.Parameters.Add("#two", SqliteType.Text);

How to insert a large amount of data into SQL Server 2008 with C# code?

I have developed a program which calculates and inserts around 4800 rows within a loop into SQL Server 2008. But after inserting 200+ rows it gets stuck every time and does not insert the rest of the rows.
For now I am writing the insert commands to a text file inside the loop instead of inserting into the DB. If I copy all 4800 insert commands from the text log and paste them into the query editor of SQL Server, it inserts them all within a minute. I would like suggestions on how I might solve this issue. I would appreciate any suggestion or help.
Here is a sample of the code I am trying now:
strSQL = "Insert into performance Values (#Rptdate,#CP_Name, #Shortcode, #Keyword, #MO_Count, #MO_Revenue,";
strSQL = strSQL + "#PMT_Sent_Count, #MT_Revenue, #ZMT_Sent_Count, #Infra_Revenue, #Total_MT, #UM_rev_Share, #CP_Rev_Share, ";
strSQL = strSQL + "#MCP_Rev_Share, #UM_Total_Revenue, #CP_Revenue)";
try
{
db.openconn("MOMT_Report", "Report");
cmd = new SqlCommand(strSQL, db.cn);
cmd.Parameters.AddWithValue("#Rptdate", Rptdate);
cmd.Parameters.AddWithValue("#Name", Name);
cmd.Parameters.AddWithValue("#Shortcode", Shortcode);
cmd.Parameters.AddWithValue("#Keyword", Keyword);
cmd.Parameters.AddWithValue("#MO_Count", MO_Count);
cmd.Parameters.AddWithValue("#MO_Revenue", MO_Revenue);
cmd.Parameters.AddWithValue("#PMT_Sent_Count", PMT_Sent_Count);
cmd.Parameters.AddWithValue("#MT_Revenue", MT_Revenue);
cmd.Parameters.AddWithValue("#ZMT_Sent_Count", ZMT_Sent_Count);
cmd.Parameters.AddWithValue("#Infra_Revenue", Infra_Revenue);
cmd.Parameters.AddWithValue("#Total_MT", Total_MT);
cmd.Parameters.AddWithValue("#rev_Share", rev_Share);
cmd.Parameters.AddWithValue("#Rev_Share", Rev_Share);
cmd.Parameters.AddWithValue("#MCP_Rev_Share", MCP_Rev_Share);
cmd.Parameters.AddWithValue("#Total_Revenue", Total_Revenue);
cmd.Parameters.AddWithValue("#Revenue", Revenue);
cmd.CommandTimeout = 0;
cmd.ExecuteNonQuery();
}
Is there a db.closeconn(); somewhere after the try block that was pasted into the question? If not then that is a huge issue (i.e. to keep opening connections and not closing them, and that could explain why it freezes after opening 200+ of them). If there is a close connection method being called then great, but still, opening and closing the connection per each INSERT is unnecessary, let alone horribly inefficient.
At the very least you can:
define the query string, SqlParameters, and SqlCommand once
in the loop, set the parameter values and call ExecuteNonQuery();
(it is also preferred to not use AddWithValue() anyway)
Example:
// this should be in a try block
strSQL = "INSERT...";
db.openconn("MOMT_Report", "Report");
cmd = new SqlCommand(strSQL, db.cn);
SqlParameter _Rptdate = new SqlParameter("@Rptdate", SqlDbType.Int);
cmd.Parameters.Add(_Rptdate);
...{repeat for remaining params}...
// optional begin transaction
for / while loop
{
_Rptdate.Value = Rptdate;
// set other param values
cmd.ExecuteNonQuery();
}
// if optional transaction was started, do commit
db.closeconn(); // this should be in a finally block
However, the fastest and cleanest way to get this data inserted is to use Table-Valued Parameters (TVPs), which were introduced in SQL Server 2008. You need to create a User-Defined Table Type (one time) to define the structure, and then you can use it either in an ad hoc insert like you currently have, or pass it to a stored procedure. This way you don't need to export to a file just to import; there is no need for those additional steps.
Rather than copy/paste a large code block, I have noted three links below where I have posted the code to do this. The first two links are the full code (SQL and C#) to accomplish this. Each is a slight variation on the theme (which shows the flexibility of using TVPs). The third is another variation, but not the full code, as it just shows the differences from one of the first two in order to fit that particular situation. But in all 3 cases, the data is streamed from the app into SQL Server. There is no creating of any additional collection or external file; you use what you currently have and only need to duplicate the values of a single row at a time to be sent over. And on the SQL Server side, it all comes through as a populated table variable. This is far more efficient than taking data you already have in memory and converting it to a file (takes time and disk space), or to XML (takes CPU and memory), or to a DataTable (for SqlBulkCopy; takes CPU and memory), or something else, only to rely on an external factor such as the filesystem (the files will need to be cleaned up, right?) or to need to parse it back out of XML.
How can I insert 10 million records in the shortest time possible?
Pass Dictionary<string,int> to Stored Procedure T-SQL
Storing a Dictionary<int,string> or KeyValuePair in a database
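To illustrate the general shape of the TVP approach only: everything named below (the table type dbo.PerformanceRowType, the procedure dbo.InsertPerformanceRows, and the three sample columns) is invented for this sketch and is not taken from the linked answers.
// One-time T-SQL setup (run once on the server) for this hypothetical example:
//   CREATE TYPE dbo.PerformanceRowType AS TABLE (Rptdate DATE, Name VARCHAR(50), MO_Count INT);
//   CREATE PROCEDURE dbo.InsertPerformanceRows @Rows dbo.PerformanceRowType READONLY
//   AS INSERT INTO performance (Rptdate, Name, MO_Count)
//      SELECT Rptdate, Name, MO_Count FROM @Rows;
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;   // SqlDataRecord, SqlMetaData

// Stream rows straight from whatever is already in memory; no intermediate file or DataTable.
static IEnumerable<SqlDataRecord> ToRecords(IEnumerable<(DateTime Rptdate, string Name, int MoCount)> rows)
{
    var meta = new[]
    {
        new SqlMetaData("Rptdate", SqlDbType.Date),
        new SqlMetaData("Name", SqlDbType.VarChar, 50),
        new SqlMetaData("MO_Count", SqlDbType.Int)
    };
    foreach (var r in rows)
    {
        var rec = new SqlDataRecord(meta);
        rec.SetDateTime(0, r.Rptdate);
        rec.SetString(1, r.Name);
        rec.SetInt32(2, r.MoCount);
        yield return rec;   // one row at a time is materialized and sent
    }
}

static void InsertViaTvp(string connectionString, IEnumerable<(DateTime Rptdate, string Name, int MoCount)> rows)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.InsertPerformanceRows", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        var p = cmd.Parameters.Add("@Rows", SqlDbType.Structured);
        p.TypeName = "dbo.PerformanceRowType";
        p.Value = ToRecords(rows);
        conn.Open();
        cmd.ExecuteNonQuery();   // all rows arrive in SQL Server as one table-valued parameter
    }
}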
Use Bulk Insert.
It is nicely described here:
http://blogs.msdn.com/b/nikhilsi/archive/2008/06/11/bulk-insert-into-sql-from-c-app.aspx
Executing one SqlCommand is much better than executing 4800 SqlCommands.
There are several ways.
For BULK INSERT you need your .txt file to be accessible from your database server (transfer the file to the database server, or let the server access it over the network),
and after that use:
BULK INSERT Your_table
FROM 'full_file_name'
WITH
(
FIELDTERMINATOR = 'terminator character',
ROWTERMINATOR = '\n'
);
Another way: you can build a new text in your C# code which has an insert command for each row, and execute the whole text at once (it's better to put it in a transaction).
I think if you need to bulk insert using XML then you can use this type of approach as well.
First of all, create a stored procedure like this:
CREATE PROCEDURE SP_INSERT_BULK
@DataXML XML
AS
BEGIN
INSERT INTO performance
SELECT
d.value('@Rptdate','varchar') AS Rptdate
,d.value('@Name','varchar') AS Name
,d.value('@Shortcode','varchar') AS Shortcode
,d.value('@Keyword','varchar') AS Keyword
,d.value('@MO_Count','varchar') AS MO_Count
,d.value('@MO_Revenue','varchar') AS MO_Revenue
,d.value('@PMT_Sent_Count','varchar') AS PMT_Sent_Count
,d.value('@MT_Revenue','varchar') AS MT_Revenue
,d.value('@ZMT_Sent_Count','varchar') AS ZMT_Sent_Count
,d.value('@Infra_Revenue','varchar') AS Infra_Revenue
,d.value('@Total_MT','varchar') AS Total_MT
,d.value('@rev_Share','varchar') AS rev_Share
,d.value('@Rev_Share','varchar') AS Rev_Share
,d.value('@MCP_Rev_Share','varchar') AS MCP_Rev_Share
,d.value('@Total_Revenue','varchar') AS Total_Revenue
,d.value('@Revenue','varchar') AS Revenue
FROM @DataXML.nodes('Reports/Report') n(d)
END
The above stored procedure is just for demonstration; you can modify it with your own logic.
The next step is to create the data XML to pass into your stored procedure as a parameter:
//prepare your data xml here
//you can use any of your logic to prepare dataxml
string xmlstring = @"<?xml version='1.0' encoding='utf-8'?><Reports>";
for (int i = 0; i < recordcount; i++)
{
    // Rptdate, Name, etc. are the values for row i
    xmlstring += string.Format(@"<Report Rptdate='{0}'
                                         Name='{1}'
                                         Shortcode='{2}'
                                         Keyword='{3}'
                                         MO_Count='{4}'
                                         MO_Revenue='{5}'
                                         PMT_Sent_Count='{6}'
                                         MT_Revenue='{7}'
                                         ZMT_Sent_Count='{8}'
                                         Infra_Revenue='{9}'
                                         Total_MT='{10}'
                                         rev_Share='{11}'
                                         Rev_Share='{12}'
                                         MCP_Rev_Share='{13}'
                                         Total_Revenue='{14}'
                                         Revenue='{15}' />",
        Rptdate, Name, Shortcode, Keyword, MO_Count, MO_Revenue, PMT_Sent_Count, MT_Revenue,
        ZMT_Sent_Count, Infra_Revenue, Total_MT, rev_Share, Rev_Share, MCP_Rev_Share, Total_Revenue, Revenue);
}
xmlstring += "</Reports>";
Now the next step is to pass this XML string to your stored procedure:
using (SqlConnection dbConnection = new SqlConnection("CONNECTIONSTRING"))
//Create database connection
{
// Database command with stored - procedure
using (SqlCommand dbCommand =
new SqlCommand("SP_INSERT_BULK", dbConnection))
{
// we are going to use store procedure
dbCommand.CommandType = CommandType.StoredProcedure;
// Add input parameter and set its properties.
SqlParameter parameter = new SqlParameter();
// Store procedure parameter name
parameter.ParameterName = "#DataXML";
// Parameter type as XML
parameter.DbType = DbType.Xml;
parameter.Direction = ParameterDirection.Input; // Input Parameter
parameter.Value = xmlstring; // XML string as parameter value
// Add the parameter in Parameters collection.
dbCommand.Parameters.Add(parameter);
dbConnection.Open();
int intRetValue = dbCommand.ExecuteNonQuery();
}
}
I think you need to set the timeout option first; see the link below:
http://msdn.microsoft.com/en-us/library/ms189470.aspx
Then try to change the max allowed packet size:
http://msdn.microsoft.com/en-us/library/ms177437.aspx
Hope it will work.
You should do all this in one transaction (a sketch follows the list of steps below).
Open DB connection.
Create command.
Begin transaction.
Start loop.
Clear parameters if added
Set parameters and execute it.
End loop.
Commit transaction.
Close DB connection.
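A rough sketch of those steps, reusing the INSERT statement and parameter names from the question. The rows collection, its properties, and the parameter types are placeholders/guesses for this example:
// Sketch: one open connection, one prepared command, one transaction around the whole loop.
using (var conn = new SqlConnection("...connection string..."))
using (var cmd = new SqlCommand(strSQL, conn))
{
    conn.Open();                                          // open DB connection
    cmd.Parameters.Add("@Rptdate", SqlDbType.Date);       // create parameters once
    cmd.Parameters.Add("@CP_Name", SqlDbType.VarChar, 50);
    // ... add the remaining parameters here ...

    using (SqlTransaction tran = conn.BeginTransaction()) // begin transaction
    {
        cmd.Transaction = tran;
        foreach (var row in rows)                         // start loop
        {
            cmd.Parameters["@Rptdate"].Value = row.Rptdate;   // set parameter values
            cmd.Parameters["@CP_Name"].Value = row.CP_Name;
            // ... set the remaining parameter values ...
            cmd.ExecuteNonQuery();                        // execute it
        }
        tran.Commit();                                    // commit transaction
    }
}                                                         // close DB connection (via using/Dispose)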

Calling a SQL select statement from C# thousands of times is very time consuming. Is there a better way?

I get a list of IDs and amounts from an Excel file (thousands of IDs and corresponding amounts). I then need to check the database to see if each ID exists and, if it does, check to make sure the amount in the DB is greater than or equal to the amount from the Excel file.
The problem is that running this select statement upwards of 6000 times and returning the values I need takes a long time. Even at half a second apiece it will take about an hour to do all the selects. (I normally don't get more than 5 results back.)
Is there a faster way to do this?
Is it possible to somehow pass all the IDs at once, make just one call, and get the massive collection back?
I have tried using SqlDataReaders and SqlDataAdapters, but they seem to be about the same (too long either way).
The general idea of how this works is below:
for (int i = 0; i < ID.Count; i++)
{
SqlCommand cmd = new SqlCommand("select Amount, Client, Pallet from table where ID = #ID and Amount > 0;", sqlCon);
cmd.Parameters.Add("#ID", SqlDbType.VarChar).Value = ID[i];
SqlDataAdapter da = new SqlDataAdapter(cmd);
da.Fill(dataTable);
da.Dispose();
}
Instead of a long IN list (difficult to parameterise, and with a number of other inefficiencies regarding execution plans: compilation time, plan reuse, and the plans themselves) you can pass all the values in at once via a table-valued parameter.
See arrays and lists in SQL Server for more details.
Generally I make sure to give the table type a primary key and use OPTION (RECOMPILE) to get the most appropriate execution plans.
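As a hedged sketch only: the table type dbo.IdList and its Id column are invented for this example, while sqlCon, ID, and dataTable are the variables from the question's snippet. The C# side could look like:
// One-time T-SQL setup for this hypothetical type:
//   CREATE TYPE dbo.IdList AS TABLE (Id VARCHAR(20) PRIMARY KEY);
var idTable = new DataTable();
idTable.Columns.Add("Id", typeof(string));
foreach (string id in ID)
    idTable.Rows.Add(id);

const string sql = @"
    SELECT t.Amount, t.Client, t.Pallet
    FROM   [table] t
    JOIN   @Ids i ON i.Id = t.ID
    WHERE  t.Amount > 0
    OPTION (RECOMPILE);";

using (var cmd = new SqlCommand(sql, sqlCon))
{
    var p = cmd.Parameters.AddWithValue("@Ids", idTable);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.IdList";
    using (var da = new SqlDataAdapter(cmd))
        da.Fill(dataTable);   // one round-trip for all 6000 IDs
}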
Combine all the IDs together into a single large IN clause, so it reads like:
select Amount, Client, Pallet from table where ID in (1,3,5,7,9,11) and Amount > 0;
"I have tried using SqlDataReaders and SqlDataAdapters"
It sounds like you might be open to other APIs. Using Linq2SQL or Linq2Entities:
var someListIds = new List<int> { 1,5,6,7 }; //imagine you load this from where ever
db.MyTable.Where( mt => someListIds.Contains(mt.ID) );
This is safe in terms of avoiding potential SQL injection vulnerabilities and will generate an "in" clause. Note, however, that someListIds can be so large that the generated SQL query exceeds query-length limits, but the same is true of any other technique involving the IN clause. You can easily work around that by partitioning the list into large chunks (see the sketch below), and still be tremendously better off than with a query per ID.
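A small sketch of that chunking workaround (the chunk size is arbitrary, and the MyTable entity type is a placeholder):
// Sketch: query in chunks so no single generated IN clause grows too large.
const int chunkSize = 2000;   // arbitrary; tune to stay well under query-length limits
var results = new List<MyTable>();
for (int i = 0; i < someListIds.Count; i += chunkSize)
{
    var chunk = someListIds.Skip(i).Take(chunkSize).ToList();
    results.AddRange(db.MyTable.Where(mt => chunk.Contains(mt.ID)));
}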
Use Table-Valued Parameters
With them you can pass a C# DataTable with your values into a stored procedure as a result set/table which you can join to and do a simple:
SELECT *
FROM YourTable
WHERE NOT EXISTS (SELECT * FROM InputResultSet WHERE YourConditions)
Use the IN operator. Your problem is very common and it has a name: the N+1 performance problem.
Where are you getting the IDs from? If it is from another query, then consider grouping them into one.
Rather than performing a separate query for every single ID that you have, execute one query to get the amount of every single ID that you want to check (or if you have too many IDs to put in one query, then batch them into batches of a few thousand).
Import the data directly to SQL Server. Use stored procedure to output the data you need.
If you must consume it in the app tier... use xml datatype to pass into a stored procedure.
You can import the data from the excel file into SQL server as a table (using the import data wizard). Then you can perform a single query in SQL server where you join this table to your lookup table, joining on the ID field. There's a few more steps to this process, but it's a lot neater than trying to concatenate all the IDs into a much longer query.
I'm assuming a certain amount of access privileges to the server here, but this is what I'd do given the access I normally have. I'm also assuming this is a one off task. If not, the import of the data to SQL server can be done programmatically as well
The IN clause has limits, so if you go with that approach, make sure a batch size is used to process X number of IDs at a time; otherwise you will hit another issue.
As @Robertharvey has noted, if there are not a lot of IDs and there are no transactions occurring, then just pull all the IDs at once into memory into a dictionary-like object and process them there. Six thousand values is not a lot, and a single select could return all of them within a few seconds.
Just remember that if another process is updating the data, your local cached version may be stale.
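For instance, something along these lines. This is a sketch: it reuses sqlCon and the table/column names from the question, assumes Amount is a decimal, and excelRows stands in for the (id, amount) pairs read from the Excel file:
// Sketch: pull the relevant rows once, then answer all ~6000 checks from memory.
var amountsById = new Dictionary<string, decimal>();
using (var cmd = new SqlCommand("select ID, Amount from [table] where Amount > 0;", sqlCon))
using (var reader = cmd.ExecuteReader())
{
    while (reader.Read())
        amountsById[reader.GetString(0)] = reader.GetDecimal(1);
}

foreach (var row in excelRows)
{
    // ID exists and the stored amount covers the requested amount?
    bool ok = amountsById.TryGetValue(row.Id, out decimal dbAmount) && dbAmount >= row.Amount;
}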
There is another way to handle this: make XML of the IDs and pass it to a procedure. Here is the code for the procedure.
IF OBJECT_ID('GetDataFromDatabase') IS NOT NULL
BEGIN
DROP PROCEDURE GetDataFromDatabase
END
GO
--Definition
CREATE PROCEDURE GetDataFromDatabase
@xmlData XML
AS
BEGIN
DECLARE @DocHandle INT
DECLARE @idList TABLE (id INT)
EXEC SP_XML_PREPAREDOCUMENT @DocHandle OUTPUT, @xmlData;
INSERT INTO @idList (id) SELECT x.id FROM OPENXML(@DocHandle, '//data', 2) WITH ([id] INT) x
EXEC SP_XML_REMOVEDOCUMENT @DocHandle;
--SELECT * FROM @idList
SELECT t.Amount, t.Client, t.Pallet FROM yourTable t INNER JOIN @idList x ON t.id = x.id and t.Amount > 0;
END
GO
--Uses
EXEC GetDataFromDatabase @xmlData = '<root><data><id>1</id></data><data><id>2</id></data></root>'
You can put any logic in the procedure. You can also pass the id and amount via XML. You can pass a huge list of ids via XML.
SqlDataAdapter objects are too heavy for that.
Firstly, using stored procedures will be faster.
Secondly, use a set-based operation: pass the list of identifiers to the database as a parameter, run a query against those parameters, and return the processed result.
It will be quick and efficient, as all the data-processing logic stays on the database server.
You can select the whole result set (or join multiple 'limited' result sets) and save it all to a DataTable. Then you can do selects and updates (if needed) directly on the DataTable, and plug the new data back in afterwards. It's not super efficient memory-wise, but it is often a very good (and sometimes the only) solution when you work in bulk and need it to be very fast.
So if you have thousands of records, it might take a couple of minutes to populate all the records into the DataTable,
then you can search your table like this:
string findMatch = "id = value";
DataRow[] rowsFound = dataTable.Select(findMatch);
Then just loop: foreach (DataRow dr in rowsFound) { ... }

SQL Transaction with ADO.Net

I am new to database interaction with C#. I am trying to write 10000 records to the database in a loop with the help of SqlCommand and SqlConnection objects, using a SqlTransaction and committing after every 5000 records. It is taking 10 seconds to process.
SqlConnection myConnection = new SqlConnection("..Connection String..");
myConnection.Open();
SqlCommand myCommand = new SqlCommand();
myCommand.CommandText = "exec StoredProcedureInsertOneRowInTable Param1, Param2........";
myCommand.Connection = myConnection;
SqlTransaction myTrans = myConnection.BeginTransaction();
myCommand.Transaction = myTrans;
for (int i = 0; i < 10000; i++)
{
    myCommand.ExecuteNonQuery();
    if (i % 5000 == 0)
    {
        myTrans.Commit();
        myTrans = myConnection.BeginTransaction();
        myCommand.Transaction = myTrans;
    }
}
The above code gives me only 1000 row writes/sec in the database.
But when I tried to implement the same logic in SQL and execute it on the database with SQL Server Management Studio, it gave me 10000 writes/sec.
When I compare the behaviour of the above two approaches, it shows that while executing with ADO.NET there is a large number of logical reads.
My questions are:
1. Why are there logical reads in the ADO.NET execution?
2. Does the transaction involve some handshaking?
3. Why are they not present in the case of Management Studio?
4. If I want very fast insert transactions on the DB, what should the approach be?
Updated information about the database objects
Table: tbl_FastInsertTest
No primary key; only 5 fields, the first three of type int (F1, F2, F3) and the last 2 (F4, F5) of type varchar(30).
Stored procedure:
create proc stp_FastInsertTest
(
    @nF1 int,
    @nF2 int,
    @nF3 int,
    @sF4 varchar(30),
    @sF5 varchar(30)
)
as
begin
    set nocount on
    insert into tbl_FastInsertTest
    (
        [F1],
        [F2],
        [F3],
        [F4],
        [F5]
    )
    values
    (
        @nF1,
        @nF2,
        @nF3,
        @sF4,
        @sF5
    )
end
--------------------------------------------------------------------------------------
SQL block executed in SSMS
--When I execute the following code in SSMS it gives me more than 10000 writes per second, but when I tried to execute the same stored procedure through ADO it gave me 1000 to 1200 writes per second
--while reading no locks
begin tran
declare @i int
set @i = 0
while (1 <> 0)
begin
    exec stp_FastInsertTest 1, 2, 3, 'vikram', 'varma'
    set @i = @i + 1
    if (@i = 5000)
    begin
        commit tran
        set @i = 0
        begin tran
    end
end
If you are running something like:
exec StoredProcedureInsertOneRowInTable 'blah', ...
exec StoredProcedureInsertOneRowInTable 'bloop', ...
exec StoredProcedureInsertOneRowInTable 'more', ...
in SSMS, that is an entirely different scenario, where all of that is a single batch. With ADO.NET you are paying a round-trip per ExecuteNonQuery - I'm actually impressed it managed 1000/s.
Re the logical reads, that could just be looking at the query-plan cache, but without knowing more about StoredProcedureInsertOneRowInTable it is impossible to comment on whether something query-specific is afoot. But I suspect you have some different SET conditions between SSMS and ADO.NET that is forcing it to use a different plan - this is in particular a problem with things like persisted calculated indexed columns, and columns "promoted" out of a sql-xml field.
Re making it faster - in this case it sounds like table-valued parameters are exactly the thing, but you should also review the other options here
For performant inserts, take a look at the SqlBulkCopy class; if it works for you it should be fast.
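A minimal sketch of what that could look like here: myConnection and myTrans come from the question's code, dataTable is assumed to be a DataTable already filled with the 10000 rows, and the column list matches tbl_FastInsertTest from the question:
// Sketch: stream the rows to the server in one bulk operation instead of 10000 round-trips.
using (var bulk = new SqlBulkCopy(myConnection, SqlBulkCopyOptions.Default, myTrans))
{
    bulk.DestinationTableName = "dbo.tbl_FastInsertTest";
    bulk.BatchSize = 5000;
    bulk.ColumnMappings.Add("F1", "F1");
    bulk.ColumnMappings.Add("F2", "F2");
    bulk.ColumnMappings.Add("F3", "F3");
    bulk.ColumnMappings.Add("F4", "F4");
    bulk.ColumnMappings.Add("F5", "F5");
    bulk.WriteToServer(dataTable);
}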
As Sean said, using parameterized queries is always a good idea.
Using a StringBuilder class, batching a thousand INSERT statements into a single query and committing the transaction is a proven way of inserting data:
var sb = new StringBuilder();
for (int i = 0; i < 1000; i++)
{
    sb.AppendFormat("INSERT INTO Table(col1,col2) VALUES({0},{1});", values1[i], values2[i]);
}
sqlCommand.CommandText = sb.ToString();
Your code doesn't look right to me, you are not committing transactions at each batch. Your code keeps opening new transactions.
It is always a good practice to drop indexes while inserting a lot of data, and add them back later. Indexes will slow down your writes.
SQL Management Studio does not have transactions, but SQL does; try this:
BEGIN TRANSACTION MyTransaction
INSERT INTO Table(Col1,Col2) VALUES(Val10,Val20);
INSERT INTO Table(Col1,Col2) VALUES(Val11,Val21);
INSERT INTO Table(Col1,Col2) VALUES(Val12,Val23);
COMMIT TRANSACTION
You need to use a parameterized query so that the execution path can get processed and cached. Since you're using string concatenation (shudder, this is bad; google SQL injection) to build the query, SQL Server treats those 10,000 queries as separate, individual queries and builds an execution plan for each one.
MSDN: http://msdn.microsoft.com/en-us/library/yy6y35y8.aspx although you're going to want to simplify their code a bit and you'll have to reset the parameters on the command.
If you really, really want to get the data into the db fast, think about using bcp... but you had better make sure the data is clean first (as there's no real error checking/handling on it).

C# - Inserting multiple rows using a stored procedure

I have a list of objects; this list contains about 4 million objects. There is a stored proc that takes the objects' attributes as params, makes some lookups and inserts them into tables.
What's the most efficient way to insert these 4 million objects into the db?
How I do it:
// connect to sql - SqlConnection ...
foreach (var item in listofobjects)
{
    SqlCommand sc = ...
    // assign params
    sc.ExecuteNonQuery();
}
This has been really slow.
Is there a better way to do this?
This process will be a scheduled task. I will run it every hour, so I do expect high-volume data like this.
Take a look at the SqlBulkCopy class.
Based on your comment: dump the data into a staging table, then do the lookup and insert into the real table set-based from a proc (see the sketch below)... it will be much faster than row by row.
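A rough sketch of that staging pattern. The staging table dbo.StagingObjects, the proc dbo.usp_MergeStagedObjects, connectionString, and objectsTable are all placeholders invented for this example:
// Sketch: bulk-copy the raw rows into a staging table, then let one proc do the lookups set-based.
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();

    using (var bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "dbo.StagingObjects";
        bulk.WriteToServer(objectsTable);   // objectsTable: a DataTable built from listofobjects
    }

    using (var cmd = new SqlCommand("dbo.usp_MergeStagedObjects", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.CommandTimeout = 0;             // the set-based merge may still take a while
        cmd.ExecuteNonQuery();              // proc does the lookups and inserts into the real tables
    }
}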
It's never going to be ideal to insert four million records from C#, but a better way to do it is to build the command text up in code so you can do it in chunks.
This is hardly bulletproof, and it doesn't illustrate how to incorporate lookups (as you've mentioned you need), but the basic idea is:
// You'd modify this to chunk it out - only testing can tell you the right
// number - perhaps 100 at a time.
for (int i = 0; i < items.Length; i++) {
    // e.g., 'insert dbo.Customer values(@firstName1, @lastName1)'
    string newStatement = string.Format(
        "insert dbo.Customer values(@firstName{0}, @lastName{0});", i);
    command.CommandText += newStatement;
    command.Parameters.AddWithValue("@firstName" + i, items[i].FirstName);
    command.Parameters.AddWithValue("@lastName" + i, items[i].LastName);
}
// ...
command.ExecuteNonQuery();
I have had excellent results using XML to get large amounts of data into SQL Server. Like you, I initially was inserting rows one at a time, which took forever due to the round trip time between the application and the server; then I switched the logic to pass in an XML string containing all the rows to insert. Time to insert went from 30 minutes to less than 5 seconds. This was for a couple of thousand rows. I have tested with XML strings up to 20 megabytes in size and there were no issues. Depending on your row size this might be an option.
The data was passed in as an XML String using the nText type.
Something like this formed the basic details of the stored procedure that did the work:
CREATE PROCEDURE XMLInsertPr( @XmlString ntext )
AS
BEGIN
    DECLARE @ReturnStatus int, @hdoc int
    EXEC @ReturnStatus = sp_xml_preparedocument @hdoc OUTPUT, @XmlString
    IF (@ReturnStatus <> 0)
    BEGIN
        RAISERROR ('Unable to open XML document', 16, 1, 50003)
        RETURN @ReturnStatus
    END
    INSERT INTO TableName
    SELECT * FROM OPENXML(@hdoc, '/XMLData/Data') WITH TableName
END
You might consider dropping any indexes you have on the table(s) you are inserting into and then recreating them after you have inserted everything. I'm not sure how the bulk copy class works but if you are updating your indexes on every insert it can slow things down quite a bit.
Like Abe mentioned: drop indexes (and recreate them later).
If you trust your data: generate a SQL statement for each call to the stored proc, combine some, and then execute.
This saves you communication overhead.
The combined calls (to the stored proc) could be wrapped in a BEGIN TRANSACTION so you have only one commit per x inserts (see the sketch after this list).
If this is a one-time operation: do not optimize, and run it during the night / weekend.
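A sketch of the combined-calls idea from the points above. The proc name usp_InsertObject, its two parameters, and the batch and connection variables are placeholders, and this assumes the values are trusted, as the answer notes:
// Sketch: combine a chunk of EXEC calls into one command, wrapped in a single transaction.
var sb = new StringBuilder("BEGIN TRANSACTION;");
foreach (var item in batch)   // batch: one chunk of listofobjects
    sb.AppendFormat("EXEC usp_InsertObject '{0}', {1};",
                    item.Name.Replace("'", "''"), item.Value);
sb.Append("COMMIT TRANSACTION;");

using (var cmd = new SqlCommand(sb.ToString(), connection))
    cmd.ExecuteNonQuery();    // one round-trip and one commit for the whole chunk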
