I am working on a console application that inserts data into an MS SQL Server 2005 database. I have a list of objects to insert. Here I use an Employee class as an example:
List<Employee> employees;
What I can do is insert one object at a time, like this:
foreach (Employee item in employees)
{
string sql = @"INSERT INTO Mytable (id, name, salary)
values ('@id', '@name', '@salary')";
// replace @par with values
cmd.CommandText = sql; // cmd is IDbCommand
cmd.ExecuteNonQuery();
}
Or I can build a bulk insert query like this:
string sql = @"INSERT INTO MyTable (id, name, salary) ";
int count = employees.Count;
int index = 0;
foreach (Employee item in employees)
{
sql = sql + string.Format(
"SELECT {0}, '{1}', {2} ",
item.ID, item.Name, item.Salary);
if ( index != (count-1) )
sql = sql + " UNION ALL ";
index++;
}
cmd.CommandText = sql;
cmd.ExecuteNonQuery();
I guess the latter case is going to insert all the rows at once. However, if I have
several ks of data, is there any limit on the length of the SQL query string?
I am not sure if one insert with multiple rows is better than one insert with one row of data, in terms of performance?
Any suggestions to do it in a better way?
Actually, the way you have it written, your first option will be faster.
Your second example has a problem in it. You are doing sql = sql + ... on every pass, which creates a new string object on each iteration of the loop (check out the StringBuilder class). Technically, you create a new string object in the first example too, but the difference is that it doesn't have to copy all the contents of the previous string over.
The way you have it set up, SQL Server potentially has to evaluate a massive query when you finally send it, and it will take some time to work out what it is supposed to do. I should state that this depends on how many inserts you need to do. If n is small, you are probably going to be OK, but as it grows your problem will only get worse.
Bulk inserts are faster than individual ones because of how SQL Server handles batch transactions. If you are going to insert data from C#, you should take the first approach and wrap, say, every 500 inserts in a transaction and commit it, then do the next 500, and so on. This also has the advantage that if a batch fails, you can trap it, figure out what went wrong, and re-insert just those rows. There are other ways to do it, but that would definitely be an improvement over the two examples provided.
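// conn is assumed to be the open IDbConnection that cmd was created from
IDbTransaction tran = null;
int iCounter = 0;
foreach (Employee item in employees)
{
    if (iCounter == 0)
    {
        // Start a new transaction for the next batch of 500
        tran = conn.BeginTransaction();
        cmd.Transaction = tran;
    }
    string sql = @"INSERT INTO Mytable (id, name, salary)
                   values ('@id', '@name', '@salary')";
    // replace @par with values
    cmd.CommandText = sql; // cmd is IDbCommand
    cmd.ExecuteNonQuery();
    iCounter++;
    if (iCounter >= 500)
    {
        tran.Commit();
        iCounter = 0;
    }
}
// Commit the final, partially filled batch
if (iCounter > 0)
    tran.Commit();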
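The counter bookkeeping in that loop can also be pulled out into a small helper that hands back the rows in groups of 500. This is only a sketch: `BatchHelper` and `InBatches` are made-up names, not part of ADO.NET, and newer .NET versions ship a similar built-in (`Enumerable.Chunk`).

```csharp
using System;
using System.Collections.Generic;

public static class BatchHelper
{
    // Splits a sequence into consecutive batches of at most batchSize items.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        if (batch.Count > 0)
            yield return batch; // final, partially filled batch
    }
}
```

With this, the outer loop becomes "for each batch: begin a transaction, run the inserts, commit", and a failed batch can be rolled back and retried on its own.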
In MS SQL Server 2008 you can create a user-defined table type (UDT) that describes the rows of your table:
CREATE TYPE MyUdt AS TABLE (Id int, Name nvarchar(50), salary int)
Then you can use this UDT in your stored procedures and your C# code to do batch inserts.
SP:
CREATE PROCEDURE uspInsert
(@MyTvp AS MyUdt READONLY)
AS
INSERT INTO [MyTable]
SELECT * FROM @MyTvp
C# (assuming the records you need to insert are already in the table "MyTable" of DataSet ds):
using(conn)
{
SqlCommand cmd = new SqlCommand("uspInsert", conn);
cmd.CommandType = CommandType.StoredProcedure;
SqlParameter myParam = cmd.Parameters.AddWithValue
("@MyTvp", ds.Tables["MyTable"]);
myParam.SqlDbType = SqlDbType.Structured;
myParam.TypeName = "dbo.MyUdt";
// Execute the stored procedure
cmd.ExecuteNonQuery();
}
So, this is the solution.
Finally, I want to warn you against code like yours (building a string and then executing it), because queries built that way are open to SQL injection.
look at this thread,
I've answered there about table valued parameter.
Bulk-copy is usually faster than doing inserts on your own.
If you still want to do it in one of your suggested ways, you should make it easy to change the size of the queries you send to the server. That way you can tune for speed in your production environment later on. Query times may vary a lot depending on the query size.
The maximum batch size for a SQL Server query is listed as 65,536 * the network packet size. The network packet size is 4 KB by default but can be changed. Check out the Maximum capacity article for SQL 2008 to get the scope. SQL 2005 also appears to have the same limit.
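To put a number on that limit, the arithmetic can be written out as below. This is a minimal sketch using only the two figures quoted above; `BatchLimit` and its members are illustrative names.

```csharp
using System;

public static class BatchLimit
{
    // Figures quoted above: 65,536 times the network packet size.
    public const long PacketMultiplier = 65536;
    public const long DefaultPacketSizeBytes = 4096; // 4 KB default packet size

    // Maximum batch size in bytes for a given network packet size.
    public static long MaxBatchBytes(long packetSizeBytes)
    {
        return PacketMultiplier * packetSizeBytes;
    }
}
```

With the default packet size, `BatchLimit.MaxBatchBytes(BatchLimit.DefaultPacketSizeBytes)` gives 268,435,456 bytes, i.e. 256 MB, so in practice the query text is far more likely to be limited by parse time than by this cap.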
I want to create a simple database at runtime, fill it with data from an internal resource, and then read each record in a loop. Previously I used LiteDb for that, but I couldn't bring the run time down any further, so I chose SQLite.
I think there are a few things to improve that I am not aware of.
Database creation process:
The first step is to create the table
using var create = transaction.Connection.CreateCommand();
create.CommandText = "CREATE TABLE tableName (Id TEXT PRIMARY KEY, Value TEXT) WITHOUT ROWID";
create.ExecuteNonQuery();
Next, the insert command is defined
var insert = transaction.Connection.CreateCommand();
insert.CommandText = "INSERT OR IGNORE INTO tableName VALUES (@Id, @Record)";
var idParam = insert.CreateParameter();
var valueParam = insert.CreateParameter();
idParam.ParameterName = "@" + IdColumn;
valueParam.ParameterName = "@" + ValueColumn;
insert.Parameters.Add(idParam);
insert.Parameters.Add(valueParam);
Each value is inserted in a loop
idParameter.Value = key;
valueParameter.Value = value.ValueAsText;
insert.Parameters["@Id"] = idParameter;
insert.Parameters["@Value"] = valueParameter;
insert.ExecuteNonQuery();
Transaction commit
transaction.Commit();
Create index
using var index = transaction.Connection.CreateCommand();
index.CommandText = "CREATE UNIQUE INDEX idx_tableName ON tableName(Id);";
index.ExecuteNonQuery();
And after that I perform a million selects (each retrieving a single value):
using var command = _connection.CreateCommand();
command.CommandText = "SELECT Value FROM tableName WHERE Id = @id;";
var param = command.CreateParameter();
param.ParameterName = "@id";
param.Value = id;
command.Parameters.Add(param);
return command.ExecuteReader(CommandBehavior.SingleResult).ToString();
One connection is shared for all selects and is never closed. The inserts are quite fast (less than a minute), but the selects are very troublesome here. Is there a way to improve them?
The table is quite big (around 2 million records) and Value contains quite heavy serialized objects.
The System.Data.SQLite provider is used and the connection string contains these additional options: Version=3;Journal Mode=Off;Synchronous=off;
If you go for performance, you need to consider this: each independent SELECT command is a round trip to the DB with some extra cost. It's similar to an N+1 select problem in the case of parent-child relations.
The best thing you can do is to get a LIST of items (values):
SELECT Value FROM tableName WHERE Id IN (1, 2, 3, 4, ...);
Here's a link on how to code that: https://www.mikesdotnetting.com/article/116/parameterized-in-clauses-with-ado-net-and-linq
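As a rough illustration of the parameterized IN approach for this table, the helper below builds the SQL text and the matching parameter list for a set of ids. `InClauseBuilder` and the `@p0`, `@p1`, ... naming are assumptions made for this sketch, and it selects Id alongside Value so each result can be matched back to its key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InClauseBuilder
{
    // Returns SQL text with one named placeholder per id, plus the
    // name/value pairs to bind as command parameters.
    public static (string Sql, List<KeyValuePair<string, object>> Parameters)
        Build(IReadOnlyList<string> ids)
    {
        if (ids.Count == 0)
            throw new ArgumentException("At least one id is required.", nameof(ids));

        var parameters = ids
            .Select((id, i) => new KeyValuePair<string, object>("@p" + i, id))
            .ToList();
        string placeholders = string.Join(", ", parameters.Select(p => p.Key));
        string sql = "SELECT Id, Value FROM tableName WHERE Id IN (" + placeholders + ");";
        return (sql, parameters);
    }
}
```

Each pair would then be turned into a parameter with command.CreateParameter() before calling ExecuteReader. SQLite limits how many parameters a single statement may bind, so a large id set still has to be split into chunks.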
You could have the select command not recreated for every Id but created once and only executed for every Id. From your code it seems every select is CreateCommand/CreateParameters and so on. See this for example: https://learn.microsoft.com/en-us/dotnet/api/system.data.idbcommand.prepare?view=net-5.0 - you run .Prepare() once and then only execute (they don't need to be NonQuery)
You could then check whether ExecuteScalar is faster, since it avoids creating a reader for a single-value result, like so: https://learn.microsoft.com/en-us/dotnet/api/system.data.idbcommand.executescalar?view=net-5.0
If scalar will not prove to be faster then you could try to use .SingleRow instead of .SingleResult in your ExecuteReader for possible performance optimisations. According to this: https://learn.microsoft.com/en-us/dotnet/api/system.data.commandbehavior?view=net-5.0 it might work. I doubt that but if first two don't help, why not try it too.
I have been asked to look at finding the most efficient way to take a DataTable input and write it to a SQL Server table using C#. The snag is that the solution must use ODBC connections throughout, which rules out SqlBulkCopy. The solution must also work on all SQL Server versions back to SQL Server 2008 R2.
I am thinking that the best approach would be to use batch inserts of 1000 rows at a time using the following SQL syntax:
INSERT INTO dbo.Table1(Field1, Field2)
SELECT Value1, Value2
UNION
SELECT Value1, Value2
I have already written the code to check if a table corresponding to the DataTable input already exists on the SQL Server and to create one if it doesn't.
I have also written the code to create the INSERT statement itself. What I am struggling with is how to dynamically build the SELECT statements from the rows in the data table. How can I access the values in the rows to build my SELECT statement? I think I will also need to check the data type of each column in order to determine whether the values need to be enclosed in single quotes (') or not.
Here is my current code:
public bool CopyDataTable(DataTable sourceTable, OdbcConnection targetConn, string targetTable)
{
OdbcTransaction tran = null;
string[] selectStatement = new string[sourceTable.Rows.Count];
// Check if targetTable exists, create it if it doesn't
if (!TableExists(targetConn, targetTable))
{
bool created = CreateTableFromDataTable(targetConn, sourceTable);
if (!created)
return false;
}
try
{
// Prepare insert statement based on sourceTable
string insertStatement = string.Format("INSERT INTO [dbo].[{0}] (", targetTable);
foreach (DataColumn dataColumn in sourceTable.Columns)
{
insertStatement += dataColumn + ",";
}
insertStatement = insertStatement.TrimEnd(',') + ") ";
// Open connection to target db
using (targetConn)
{
if (targetConn.State != ConnectionState.Open)
targetConn.Open();
tran = targetConn.BeginTransaction();
for (int i = 0; i < sourceTable.Rows.Count; i++)
{
DataRow row = sourceTable.Rows[i];
// Need to iterate through columns in row, getting values and data types and building a SELECT statement
selectStatement[i] = "SELECT ";
}
insertStatement += string.Join(" UNION ", selectStatement);
using (OdbcCommand cmd = new OdbcCommand(insertStatement, targetConn, tran))
{
cmd.ExecuteNonQuery();
}
tran.Commit();
return true;
}
}
catch
{
tran.Rollback();
return false;
}
}
Any advice would be much appreciated. Also if there is a simpler approach than the one I am suggesting then any details of that would be great.
OK, since we cannot use stored procedures or bulk copy: when I modelled the various approaches a couple of years ago, the key determinant of performance was the number of calls to the server. Batching a set of MERGE or INSERT statements into a single call, separated by semicolons, was found to be the fastest method, so I ended up batching my SQL statements. I think the max size of a SQL statement was 32k, so I chopped up my batch into units of that size.
(Note - use StringBuilder instead of concatenating strings manually - it has a beneficial effect on performance)
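Pseudo-code
// Values are formatted straight into the SQL text here, so each one must be
// quoted and escaped according to its data type first.
string sqlStatement = "INSERT INTO Tab1 VALUES ({0},{1},{2})";
StringBuilder sqlBatch = new StringBuilder();
foreach (DataRow row in myDataTable.Rows)
{
    sqlBatch.AppendFormat(sqlStatement, row["Field1"], row["Field2"], row["Field3"]);
    sqlBatch.AppendLine(";");
}
// myOdbcConnection is an open OdbcConnection
using (OdbcCommand cmd = new OdbcCommand(sqlBatch.ToString(), myOdbcConnection))
{
    cmd.ExecuteNonQuery();
}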
You need to deal with batch size complications, and formatting of the correct field data types in the string-replace step, but otherwise this will be the best performance.
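The "batch size complications" mentioned above could be handled along these lines. This is only a sketch: `SqlBatcher` is a made-up helper, and the size limit is whatever you settle on (the 32k figure above is the answerer's recollection, not a verified constant).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SqlBatcher
{
    // Groups statements so that each batch, including its ';' terminators,
    // does not exceed maxChars. A statement that is too long on its own
    // still gets emitted, as a batch by itself.
    public static List<string> Split(IEnumerable<string> statements, int maxChars)
    {
        var batches = new List<string>();
        var current = new StringBuilder();
        foreach (string statement in statements)
        {
            // +1 accounts for the ';' terminator appended below.
            if (current.Length > 0 && current.Length + statement.Length + 1 > maxChars)
            {
                batches.Add(current.ToString());
                current.Clear();
            }
            current.Append(statement).Append(';');
        }
        if (current.Length > 0)
            batches.Add(current.ToString());
        return batches;
    }
}
```

Each returned string is then sent to the server as one command, so the number of round trips is the number of batches rather than the number of rows.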
The marked solution by PhillipH is prone to several mistakes and open to SQL injection.
Normally you should build a DbCommand with parameters and execute that instead of executing a self-built SQL statement.
The CommandText must be "INSERT INTO Tab1 VALUES (?,?,?)" for ODBC and OLEDB; SqlClient needs named parameters ("@<Name>").
Parameters should be added with the dimensions of the underlying column.
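For a DataTable whose columns are only known at runtime, the positional command text can be generated from the column names. The helper below is a hypothetical sketch (`OdbcInsertBuilder` is not a library type); it only builds the text and leaves creating the OdbcCommand and its parameters to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OdbcInsertBuilder
{
    // Builds "INSERT INTO [dbo].[table] ([c1],[c2]) VALUES (?,?)" with one
    // positional '?' marker per column.
    public static string Build(string table, IReadOnlyList<string> columns)
    {
        if (columns.Count == 0)
            throw new ArgumentException("At least one column is required.", nameof(columns));

        string columnList = string.Join(",", columns.Select(Quote));
        string markers = string.Join(",", Enumerable.Repeat("?", columns.Count));
        return "INSERT INTO [dbo]." + Quote(table) + " (" + columnList + ") VALUES (" + markers + ")";
    }

    // Wraps an identifier in brackets, doubling any ']' inside it.
    private static string Quote(string identifier)
    {
        return "[" + identifier.Replace("]", "]]") + "]";
    }
}
```

Because ODBC binds parameters by position, the values have to be added to the command's parameter collection in the same order as the column list.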
I'm working on an ASP.NET project (C#) with SQL Server 2008.
When I insert a row into a table in the database, I would like to get the last inserted ID, which is the table's IDENTITY (Auto Incremented).
I do not wish to use another query, and do something like...
SELECT MAX(ID) FROM USERS;
Because - even though it's only one query - it feels lame...
When I insert something I usually use ExecuteNonQuery(), which returns the number of affected rows.
int y = Command.ExecuteNonQuery();
Isn't there a way to return the last inserted ID without using another query?
Most folks do this in the following way:
INSERT dbo.Users(Username)
VALUES('my new name');
SELECT NewID = SCOPE_IDENTITY();
(Or instead of a query, assigning that to a variable.)
So it's not really two queries against the table...
However there is also the following way:
INSERT dbo.Users(Username)
OUTPUT inserted.ID
VALUES('my new name');
You won't really be able to retrieve this with ExecuteNonQuery, though.
You can return the id as an output parameter from the stored procedure, e.g. @userId int output
Then, after the insert, SET @userId = scope_identity()
even though it's only one query - it feels lame...
It actually is also wrong, as you can have multiple overlapping inserts.
That is one thing that I always find funny - people not reading the documentation.
SELECT SCOPE_IDENTITY()
returns the last identity value generated in a specific scope and is syntactically correct. It also is properly documented.
Isn't there a way to return the last inserted ID without using another query?
Yes. Ask for the number in the same SQL batch.
INSERT (blablabla); SELECT SCOPE_IDENTITY();
as ONE string. ExecuteScalar.
You can have more than one SQL statement in one batch.
If you want to execute the query from C# code and get the last inserted id, you can use the following code.
using (SqlConnection connection = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["ConnectionString"].ConnectionString))
using (SqlCommand cmd = new SqlCommand("INSERT INTO [Order] (customer_id) VALUES (@customer_id); SELECT SCOPE_IDENTITY()", connection))
{
    // Pass the value as a parameter instead of concatenating it into the SQL text
    cmd.Parameters.AddWithValue("@customer_id", Session["Customer_id"]);
    connection.Open();
    var order_id = cmd.ExecuteScalar();
    Console.Write(order_id);
}
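One detail worth knowing when consuming that result: SCOPE_IDENTITY() is typed numeric(38,0) in SQL Server, so ExecuteScalar() hands it back as a boxed decimal rather than an int, and a direct (int) cast or "as int?" on it will not work. A small conversion helper, sketched here under the assumption that the identity column fits in an int (`IdentityResult` is an illustrative name):

```csharp
using System;

public static class IdentityResult
{
    // Converts the object returned by ExecuteScalar() into an int id,
    // or null when no identity value was produced.
    public static int? ToInt32OrNull(object scalar)
    {
        if (scalar == null || scalar is DBNull)
            return null;
        return Convert.ToInt32(scalar); // handles boxed decimal, int, long
    }
}
```

Usage would be `int? orderId = IdentityResult.ToInt32OrNull(cmd.ExecuteScalar());`. Casting inside the SQL text (SELECT CAST(SCOPE_IDENTITY() AS int)) is the other common way to get an int back.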
I have this code and it always returns -1. I have three tables (a picture is more suggestive):
I want to see if the row is already in the ReservationDetails table, and if it's not, insert it.
try
{
SqlConnection conn = new SqlConnection...
SqlCommand slct = new SqlCommand("SELECT * FROM ReservationDetails WHERE rID=@rID AND RNumber=@RNumber", conn);
slct.Parameters.AddWithValue("@rID", (int)comboBox1.SelectedValue);
slct.Parameters.AddWithValue("@RNumber", dataGridView1.SelectedRows[0].Cells[0].Value);
int noRows;//counts if we already have the entry in the table
conn.Open();
noRows = slct.ExecuteNonQuery();
conn.Close();
MessageBox.Show("The result of select="+noRows);
if (noRows ==0) //we can insert the new row
Have you read the documentation of SqlCommand.ExecuteNonQuery?
For UPDATE, INSERT, and DELETE statements, the return value is the number of rows affected by the command. When a trigger exists on a table being inserted or updated, the return value includes the number of rows affected by both the insert or update operation and the number of rows affected by the trigger or triggers. For all other types of statements, the return value is -1. If a rollback occurs, the return value is also -1.
And your query is SELECT.
You should
1) Change your TSQL to
SELECT COUNT(*) FROM ReservationDetails WHERE ...
(better still, use IF EXISTS ...)
2) and use ExecuteScalar():
noRows = (int) slct.ExecuteScalar();
Also: you will need to use a transaction (or some other atomic technique), or else someone could insert a row in-between you testing and trying to insert it...
All that said, it would be better to create a stored procedure that given your parameters, atomically tests and inserts into the table, returning 1 if successful, or 0 if the row already existed.
It is better to do it in a single query so that you do not need to request server two times.
Create a procedure like this and call it from the code.
IF NOT EXISTS (SELECT 1 from ReservationDetails WHERE rID=@rID AND RNumber=@RNumber)
BEGIN
insert into ReservationDetails values(@rID,@RNumber)
END
As per Microsoft:
You can use the ExecuteNonQuery to perform catalog operations (for example, querying the structure of a database or creating database objects such as tables), or to change the data in a database without using a DataSet by executing UPDATE, INSERT, or DELETE statements.
What you may need, instead of ExecuteNonQuery is ExecuteScalar and put the COUNT in your select query.
i.e.
SqlCommand slct = new SqlCommand("SELECT COUNT(*) FROM ReservationDetails WHERE rID=@rID AND RNumber=@RNumber", conn);
Also, try to make use of the using statement in C#, so you don't need to worry about closing the connection manually, even if things fail.
i.e.
using (SqlConnection conn = new SqlConnection(connString))
{
SqlCommand cmd = new SqlCommand(sql, conn);
try
{
conn.Open();
newProdID = (Int32)cmd.ExecuteScalar();
}
catch (Exception ex)
{
//Do stuff
}
}
see:
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.executescalar.aspx
@nickNatra
Whenever you use a SELECT command, it returns values, which can be consumed through a DataSet or a SqlDataReader. command.ExecuteNonQuery(), on the other hand, is meant only for INSERT, UPDATE, and DELETE statements, where rows in your table are affected.
If you want to know how many records your query matches, you can:
a) Modify your query to "select count(*) from table", so that you get back a single value, i.e. the number of rows.
b) Run that query with command.ExecuteScalar(), which returns only the first column of the first row, which here is the row count.
This satisfies your requirement.
Cheers!!
My C# button adds some data to one table. How can I get the ID of this row? Does anybody have an idea?
private static void InsertIntoTransportation(string ZednadebNumber, DateTime DateTime, int SellerID, int BuyerID, string TransStart, string TransEnd, decimal Quantity, string LoadType, string LoadName, int DriverID, decimal Cost, decimal FuelUsed, decimal salary)
{
try
{
using (SqlCommand cmd = new SqlCommand("INSERT INTO Transportation (DriverID, ClientIDAsSeller, ClientIDAsBuyer, ZednadebNumber, LoadName, TransStart, TransEnd, LoadType, Quantity, Cost, FuelUsed, DateTime) VALUES (@DriverID, @ClientIDAsSeller, @ClientIDAsBuyer, @ZednadebNumber, @LoadName, @TransStart, @TransEnd, @LoadType, @Quantity, @Cost, @FuelUsed, @DateTime)", new SqlConnection(Program.ConnectionString)))
{
cmd.Parameters.AddWithValue("@DriverID", DriverID);
cmd.Parameters.AddWithValue("@ClientIDAsSeller", SellerID);
cmd.Parameters.AddWithValue("@ClientIDAsBuyer", BuyerID);
cmd.Parameters.AddWithValue("@ZednadebNumber", ZednadebNumber);
cmd.Parameters.AddWithValue("@LoadName", LoadName);
cmd.Parameters.AddWithValue("@TransStart", TransStart);
cmd.Parameters.AddWithValue("@TransEnd", TransEnd);
cmd.Parameters.AddWithValue("@LoadType", LoadType);
cmd.Parameters.AddWithValue("@Quantity", Quantity);
cmd.Parameters.AddWithValue("@Cost", Cost);
cmd.Parameters.AddWithValue("@FuelUsed", FuelUsed);
cmd.Parameters.AddWithValue("@DateTime", DateTime);
cmd.Connection.Open();
cmd.ExecuteNonQuery();
SqlCommand cmd1 = new SqlCommand("INSERT INTO Salary (DriverID, TransportationID, Salary) VALUES (@DID, @@IDENTITY, @Salary)", new SqlConnection(Program.ConnectionString));
cmd1.Parameters.AddWithValue("@DID", DriverID);
cmd1.Parameters.AddWithValue("@Salary", salary);
cmd1.Connection.Open();
cmd1.ExecuteNonQuery();
cmd1.Connection.Close();
cmd.Connection.Close();
}
}
catch (Exception)
{
MessageBox.Show("I can't connect to the database. Make sure your computer's internet connection is working.", "Error !!!", MessageBoxButtons.OK, MessageBoxIcon.Error);
}
}
My c# button adding some data in one
table. how can i get id of this row?
does anybody have idea?
Your code example provided inserts two rows.
Either way, simply alter the statement to include "; SELECT CAST(SCOPE_IDENTITY() AS int)" at the end, and get a scalar result from the command.
Assuming your row ID is a standard integer field (rather than a bigint):
// Edit your command text to have "; SELECT CAST(SCOPE_IDENTITY() AS int)" at the end.
// The cast matters: SCOPE_IDENTITY() is numeric, so without it the value arrives as a decimal and "as int?" yields null.
int? insertedId = cmd.ExecuteScalar() as int?;
if (insertedId.HasValue)
{
// success!
}
Note that if your row isn't using auto incrementing integer/big-int, then this won't work.
In addition, I would strongly suggest that you consider moving your statements to stored procedures, if this is for anything other than test/sample code.
Separating your SQL code out of your C# means that you can re-use common statements (think of them as Methods). Alternatively, consider using an ORM solution.
On an MS SQL Server, my preference is for the OUTPUT clause. You wouldn't be executing a non-query; it would be a reader (even though it's technically an insert), and the result set from your insert will then contain one row with the field(s) you choose. This is useful if you have a ROWVERSION (TIMESTAMP) column that you also need for updates that might follow (it saves you from doing another read).
Edit - Just use OUTPUT, not OUTPUT INTO (which is more useful for SQL Server side code than C#).
This works well with sprocs and triggers too, as opposed to scope_identity (which, when using instead-of triggers, is in a different scope and so returns null), or @@identity and all of its issues.
Some of my fellow developers will perform a query after the insert to return the identity value. As a DBA, I can say there are other ways to get it, as indicated by Alexander. While the query will produce extra IO on the network, at least the data should still be in cache and not cause any more disk IO.