SqlBulkCopy succeeds but inserts no records - c#

I am trying to insert a large number of records into a variety of tables. I cannot fit all of the records into memory at once, so instead, I am using an IDataReader implementation to fit some of the data into memory and then dump it into the database with SqlBulkCopy.
The problem I am experiencing is that the second time I try to write to a table, SqlBulkCopy will succeed but fail to actually insert the records. I thought it was a transaction issue at first, but I disabled all transactions on my connection and am still seeing the same problem. I can also independently confirm the size of the tables before and after, both inside the code and outside of it.
Here is a code snippet:
long before = GetCount(tableName);
DataServerConnection conn = GetConnection();
using (var batch = conn.HasTransaction
    ? new SqlBulkCopy((SqlConnection)conn.IDbConnection, SqlBulkCopyOptions.Default, (SqlTransaction)conn.Transaction)
    : new SqlBulkCopy((SqlConnection)conn.IDbConnection))
{
    batch.DestinationTableName = tableName;
    batch.WriteToServer(reader);
}
long after = GetCount(tableName);
if ((reader.Count + before) != after)
{
    throw new Exception($"Not all records inserted: Before = {before}, After = {after}, Reader Count = {reader.Count}, Expected = {reader.Count + before}");
}
Any ideas what I am missing? GetCount(tableName) is doing a simple
SELECT COUNT(*) FROM [{tableName}]
reader is a basic IDataReader implementation that I have verified works elsewhere on millions of records. GetConnection() returns a wrapper for the connection, which saves me from having to manage my connections constantly.

I'm not sure how your reader variable is declared, so I'll tell you how I do it sometimes (and this doesn't mean it is the best way to do it).
First I declare a typed data table based on the dataset I have (the typed DataTable rather than the TableAdapter, since SqlBulkCopy.WriteToServer takes a DataTable):
DataSet1.MyTableDataTable dataTable = new DataSet1.MyTableDataTable();
Then, I create the connection.
SqlConnection sql = new SqlConnection(...);
And then the BulkCopy variable:
SqlBulkCopy insertData = new SqlBulkCopy(sql);
Once I have this, I start adding rows to the typed data table like this:
dataTable.AddMyTableRow(...);
When I am finished, I do the bulk insertion:
sql.Open();
insertData.DestinationTableName = "MyTable";
insertData.WriteToServer(dataTable);
sql.Close();
Let me know if this helps you.
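If you don't have a typed dataset handy, the same idea works with a plain DataTable; here is a minimal sketch (the table name, columns, and connectionString are placeholders, not from the question):
// Requires: using System.Data; using System.Data.SqlClient;
// Build a DataTable whose column names match the destination table.
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
table.Rows.Add(1, "First");
table.Rows.Add(2, "Second");

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "MyTable";
        bulkCopy.WriteToServer(table); // WriteToServer has a DataTable overload
    }
}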


What is the best way to get rows of data and its datatypes?

I am working on translating a whole table and putting it into another table with the same schema.
Given:
Since the table has more than a thousand rows, it is quite hard to translate all of them in one transaction.
I also need to know the column datatypes, since not all of the columns are translatable.
Plan:
My initial plan is to get the rows in batches (e.g. TOP 10 first) and put them into a DataTable, because a DataTable has a column list which holds each column's datatype. This plan, I think, is just OK.
Drawback:
Putting the data into a DataTable, I know, would be slow. Batching wouldn't hide that, only mitigate it a little.
On the other hand, if I put the data into a list instead of a DataTable, the transaction would be faster, but that would require another SqlCommand call to get the table's data type schema.
Question:
Is there a way I could get the best of both worlds: faster, and one call that returns data values and datatypes together? Note that in this case, aside from the row data, I just need each column's data type.
One technique might be to use BulkCopy. Simply read the schema off the first table. Create the target table, define column mappings and do the bulk copy. I have seen this rip through hundreds of thousands of records in seconds.
string connectionString = GetConnectionString();

// Open a sourceConnection to the AdventureWorks database.
using (SqlConnection sourceConnection = new SqlConnection(connectionString))
{
    sourceConnection.Open();

    // Perform initial schema read and create target table,
    // then get data from the source table as a SqlDataReader.
    SqlCommand commandSourceData = new SqlCommand(
        "SELECT ProductID, Name, ProductNumber FROM Production.Product;",
        sourceConnection);
    SqlDataReader reader = commandSourceData.ExecuteReader();

    // Open the destination connection.
    using (SqlConnection destinationConnection = new SqlConnection(connectionString))
    {
        destinationConnection.Open();

        // Set up the bulk copy object.
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnection))
        {
            bulkCopy.DestinationTableName = "dbo.BulkCopyDemoMatchingColumns";

            // Map the source columns to the target columns by name.
            bulkCopy.ColumnMappings.Add("ProductID", "ProductID");
            bulkCopy.ColumnMappings.Add("Name", "Name");
            bulkCopy.ColumnMappings.Add("ProductNumber", "ProductNumber");

            try
            {
                // Write from the source to the destination.
                bulkCopy.WriteToServer(reader);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            finally
            {
                // Close the SqlDataReader. The SqlBulkCopy
                // object is automatically closed at the end
                // of the using block.
                reader.Close();
            }
        }
    }
}
https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/multiple-bulk-copy-operations
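The "initial schema read and create target table" step above is only a comment. One hedged way to fill it in is to read the source schema with GetSchemaTable and build a CREATE TABLE statement from it, assuming both connections are already open. The sketch below deliberately simplifies the type mapping (every column becomes NVARCHAR); a real implementation would map DataType/ColumnSize to proper SQL types:
// Sketch: derive a CREATE TABLE for the target from the source reader's schema.
// Requires: using System.Data; using System.Data.SqlClient; using System.Linq;
using (SqlDataReader schemaReader = new SqlCommand(
    "SELECT ProductID, Name, ProductNumber FROM Production.Product;",
    sourceConnection).ExecuteReader(CommandBehavior.SchemaOnly))
{
    DataTable schema = schemaReader.GetSchemaTable();
    var columns = schema.Rows.Cast<DataRow>()
        .Select(r => "[" + r["ColumnName"] + "] NVARCHAR(255)"); // simplified type mapping
    string createSql = "CREATE TABLE dbo.BulkCopyDemoMatchingColumns ("
        + string.Join(", ", columns) + ");";
    using (var createCommand = new SqlCommand(createSql, destinationConnection))
    {
        createCommand.ExecuteNonQuery();
    }
}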

Is there a better way to fill this autocomplete? C#

I have a textbox that autocompletes from values in a SQL Server database. I also created a stored procedure, which is very simple:
Stored procedure code
My code is this:
public AutoCompleteStringCollection AutoCompleteFlight(TextBox flight)
{
    using (SqlConnection connection = new SqlConnection(ConnectionLoader.ConnectionString("Threshold")))
    {
        AutoCompleteStringCollection flightCollection = new AutoCompleteStringCollection();
        connection.Open();
        SqlCommand flights = new SqlCommand("AutoComplete_Flight", connection);
        flights.CommandType = CommandType.StoredProcedure;
        SqlDataReader readFlights = flights.ExecuteReader();
        while (readFlights.Read())
        {
            flightCollection.Add(readFlights["Flight_Number"].ToString());
        }
        return flight.AutoCompleteCustomSource = flightCollection;
    }
}
Is there a point to having this stored procedure, since it's such a simple query? Or am I doing this wrong, since it still has to use the data reader and insert the values into the collection?
My previous code before the stored procedure was:
using (SqlConnection connection = new SqlConnection(ConnectionLoader.ConnectionString("Threshold")))
{
    AutoCompleteStringCollection flightCollection = new AutoCompleteStringCollection();
    connection.Open();
    SqlCommand flights = new SqlCommand("SELECT DISTINCT Flight_Number FROM Ramp_Board", connection);
    SqlDataReader readFlights = flights.ExecuteReader();
    while (readFlights.Read())
    {
        flightCollection.Add(readFlights["Flight_Number"].ToString());
    }
    return flight.AutoCompleteCustomSource = flightCollection;
}
Is the second piece of code better, or are they both wrong and there is a much better way of doing this?
"Better way" is a little undefined.
If you are asking about the performance of a stored procedure versus an inline query, I'm not sure it matters much with that small a data set and such a simple query. Stored procedures shine when there are complex operations to perform that can limit the back and forth with the server or limit the amount of data returned. In your case, the server-side effort is the same either way, and the amount of data returned is also the same. @Niel points out that procedures can be updated server-side without changing your deployed code. This is another useful feature of stored procedures, though one that you probably will not need for this scenario.
If you are looking for an alternate code answer, then you could use a DataAdapter instead of a DataReader. There are many articles on this site that talk about the performance of the two, and most of them agree that they are more or less the same. The only exception is if you don't plan on reading all of the rows. In your case, you are reading the whole table, so they are effectively the same.
SqlCommand sqlCmd = new SqlCommand("SELECT * FROM SomeTable", connection);
SqlDataAdapter sqlDA = new SqlDataAdapter();
sqlDA.SelectCommand = sqlCmd;
DataTable table = new DataTable();

// Fill table from SQL using the command and connection
sqlDA.Fill(table);

// Fill autoComplete from table
autoComplete.AddRange(table.AsEnumerable().Select(dr => dr["ColumnName"].ToString()).ToArray());
If you decide to use this kind of LINQ statement, it is best to set the column to not allow nulls, or to add a WHERE clause that filters out nulls, as in the sketch below. I'm not sure how, or whether, AutoCompleteStringCollection handles nulls.
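For example, the null filter could look like this (same assumed column name as above):
// Skip null values so the autocomplete collection never sees them.
autoComplete.AddRange(table.AsEnumerable()
    .Where(dr => !dr.IsNull("ColumnName"))
    .Select(dr => dr["ColumnName"].ToString())
    .ToArray());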

Insert data from one SQL Database to a second one using C#

I don't know how to do this query in C#.
There are two databases, and each one has a table required for this query. I need to take the data from one database table and update the other database table with the corresponding PayrollID.
I have two tables in separate databases: Employee, which is in the techData database, and strStaff in the QLS database. In the Employee table I have StaffID, but I need to pull the PayrollID from strStaff.
Insert payrollID into Employee where staffID from strStaff = staffID from Employee
However, I need to get the StaffID and PayrollID from strStaff before I can do the insert query.
This is what I have got so far, but it won't work.
cn.ConnectionString = ConfigurationManager.ConnectionStrings["PayrollPlusConnectionString"].ConnectionString;
cmd.Connection = cn;
cmd.CommandText = "SELECT StaffId, PayrollID FROM [strStaff] WHERE (StaffID = @StaffID)";
cmd.Parameters.AddWithValue("@StaffID", staffID);

// Open the connection to the database
cn.Open();

// Execute the sql.
dr = cmd.ExecuteReader();

// Read all of the rows generated by the command (in this case only one row).
while (dr.Read())
{
    cmd.CommandText = "Insert into Employee, where StaffID = @StaffID";
}

// Close your connection to the DB.
dr.Close();
cn.Close();
Assuming you want to add data to an existing table, you have to use an UPDATE + SELECT statement (as I mentioned in a comment to the question). It might look like:
UPDATE emp SET emp.PayrollID = sta.PayrollID
FROM Employee AS emp INNER JOIN strStaff AS sta ON emp.StaffID = sta.StaffID
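Since Employee and strStaff live in different databases, the UPDATE needs three-part names if both databases sit on the same server. A hedged sketch of running it from C# (database and schema names are taken from the question or assumed, and the login must have rights in both databases):
// Requires: using System.Configuration; using System.Data.SqlClient;
const string updateSql =
    "UPDATE emp SET emp.PayrollID = sta.PayrollID " +
    "FROM techData.dbo.Employee AS emp " +
    "INNER JOIN QLS.dbo.strStaff AS sta ON emp.StaffID = sta.StaffID;";

using (var cn = new SqlConnection(
    ConfigurationManager.ConnectionStrings["PayrollPlusConnectionString"].ConnectionString))
using (var cmd = new SqlCommand(updateSql, cn))
{
    cn.Open();
    int rowsAffected = cmd.ExecuteNonQuery(); // one set-based statement, no reader loop
}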
I have added some clarity to your question: the essential part is that you want to create a C# procedure to accomplish your task (not using SQL Server Management Studio, SSIS, bulk insert, etc). Pertinent to this, there will be 2 different connection objects, and 2 different SQL statements to execute on those connections.
The first task would be retrieving data from the first DB (let's call it the source DB/Table) using a SELECT statement, and storing it in some temporary data structure: either per row (as in your code), or the entire table using a .NET DataTable object, which will give a substantial performance boost. For this purpose, you should use the first connection object, for the source DB/Table (by the way, you can close that connection as soon as you get the data).
The second task would be inserting the data into the second DB (the target DB/Table), though from your business logic it's a bit unclear how to handle possible data conflicts if records with identical IDs already exist in the target DB/Table (some clarity is needed). To complete this operation you should use the second connection object and the second SQL query.
The sample code snippet to perform the first task, which allows retrieving entire data into .NET/C# DataTable object in a single pass is shown below:
private static DataTable SqlReadDB(string ConnString, string SQL)
{
    DataTable _dt;
    try
    {
        using (SqlConnection _connSql = new SqlConnection(ConnString))
        {
            using (SqlCommand _commandSql = new SqlCommand(SQL, _connSql))
            {
                _commandSql.CommandType = CommandType.Text;
                _connSql.Open();
                // CommandBehavior.CloseConnection closes the connection when the reader closes.
                using (SqlDataReader _dataReaderSql = _commandSql.ExecuteReader(CommandBehavior.CloseConnection))
                {
                    _dt = new DataTable();
                    _dt.Load(_dataReaderSql);
                }
            }
            return _dt;
        }
    }
    catch { return null; }
}
You should code the second part (adding data to the target DB/Table) based on the clarified business logic (i.e. data conflict resolution: do you want to update existing records or skip them, etc.). Just iterate through the data rows in the DataTable object and perform either INSERT or UPDATE SQL operations, as in the sketch below.
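A minimal sketch of that loop, assuming an UPDATE-only policy and placeholder table/column names:
// Requires: using System.Data; using System.Data.SqlClient;
using (var targetConn = new SqlConnection(targetConnString))
using (var cmd = new SqlCommand(
    "UPDATE dbo.TargetTable SET SomeColumn = @SomeColumn WHERE Id = @Id", targetConn))
{
    cmd.Parameters.Add("@SomeColumn", SqlDbType.NVarChar, 100);
    cmd.Parameters.Add("@Id", SqlDbType.Int);
    targetConn.Open();
    foreach (DataRow row in sourceTable.Rows)
    {
        // Re-use the prepared command; swap in an INSERT (or upsert)
        // here according to your conflict-resolution rules.
        cmd.Parameters["@SomeColumn"].Value = row["SomeColumn"];
        cmd.Parameters["@Id"].Value = row["Id"];
        cmd.ExecuteNonQuery();
    }
}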
Hope this may help. Kind regards,

Making SqlDataAdapter/Datareader "Really Read-Only"

Updated question: Is there a way to force a DataAdapter to accept only commands which do not include any UPDATE/DROP/CREATE/DELETE/INSERT statements, other than verifying the CommandText before sending it to the DataAdapter (and otherwise throwing an exception)? Is there any such built-in functionality provided by .NET in DataReader, DataAdapter, or anywhere else?
Note: a DataReader returns results, but it also accepts an UPDATE query and returns a result. (I might be missing some mistake, but I am issuing my UPDATE command just before executing the reader and then showing a message after it succeeds, and all of that works fine.)
Could you search the string for some keywords, like CREATE, UPDATE, INSERT, DROP, or check whether the query does not start with SELECT? Or is that too flimsy?
You might also want to create a login for the application that only has read capability. I don't know if the object has such a property, but you can make the server refuse the transaction.
All you need to do is ensure there are no INSERT, UPDATE, or DELETE statements prepared for the DataAdapter. Your code could look something like this:
var dataAdapter = new SqlDataAdapter("SELECT * FROM table", "connection string");
OR
var dataAdapter = new SqlDataAdapter("SELECT * FROM table", sqlConnectionObject);
And bam, you have a read-only data adapter.
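Usage is then just a Fill call; with no InsertCommand, UpdateCommand, or DeleteCommand configured, the adapter can only read:
var table = new DataTable();
dataAdapter.Fill(table); // runs the SelectCommand and loads the results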
If you just wanted a DataTable then the following method is short and reduces complexity:
public DataTable GetDataForSql(string sql, string connectionString)
{
    using (SqlConnection connection = new SqlConnection(connectionString))
    {
        using (SqlCommand command = new SqlCommand())
        {
            command.CommandType = CommandType.Text;
            command.Connection = connection;
            command.CommandText = sql;
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                DataTable data = new DataTable();
                data.Load(reader);
                return data;
            }
        }
    }
}
usage:
try
{
    DataTable results = GetDataForSql("SELECT * FROM Table;", ApplicationSettings["ConnectionString"]);
}
catch (Exception e)
{
    // Logging
    // Alert the user that the command failed.
}
There isn't really a need to use the DataAdapter here - it's not really for what you want. Why even go to the bother of catching exceptions etc if the Update, Delete or Insert commands are used? It's not a great fit for what you want to do.
It's important to note that the SelectCommand property doesn't do anything special - when the SelectCommand is executed, it will still run whatever command is passed to it - it just expects a resultset to be returned and if no results are returned then it returns an empty dataset.
This means that (and you should do this anyway) you should explicitly grant only SELECT permissions to the tables you want people to be able to query.
EDIT
To answer your other question, SqlDataReaders are read-only because they work via a read-only, firehose-style cursor. What this effectively means is:
while (reader.Read()) // reads a row at a time, moving forward through the resultset (the cursor)
{
    // Allowed
    string name = reader.GetString(reader.GetOrdinal("name"));

    // Not allowed - the read-only part means you can't update the results as you move through them
    reader.GetString(reader.GetOrdinal("name")) = name;
}
It's read only because it doesn't allow you to update the records as you move through them. There is no reason why the sql they execute to get the resultset can't update data though.
If you have a read-only requirement, have your TextBox use a connection string that uses an account with only db_datareader permissions on the SQL database.
Otherwise, what's stopping the developer who is consuming your control from just connecting to the database and wreaking havoc using SqlCommand all on their own?

How to do a batch update?

I am wondering, is there a way to do batch updating? I am using MS SQL Server 2005.
I saw a way with the SqlDataAdapter, but it seems like you first have to run the select statement with it, then fill some dataset and make changes to the dataset.
Now I am using LINQ to SQL to do the select, so I want to try to keep it that way. However, it is too slow for massive updates. So is there a way that I can keep my LINQ to SQL (for the select part) but use something different to do the mass update?
Thanks
Edit
I am interested in this staging table approach, but I am not sure how to do it, and it is still not clear to me how it will be faster, since I don't understand how the update part works.
So can anyone show me how this would work and how to deal with concurrent connections?
Edit 2
This was my latest attempt at trying to do a mass update using XML; however, it uses too many resources and my shared hosting does not allow it to go through. So I need a different way, and that's why I am now looking into a staging table.
using (TestDataContext db = new TestDataContext())
{
    UserTable[] testRecords = new UserTable[2];
    for (int count = 0; count < 2; count++)
    {
        UserTable testRecord = new UserTable();
        if (count == 1)
        {
            testRecord.CreateDate = new DateTime(2050, 5, 10);
            testRecord.AnotherField = true;
        }
        else
        {
            testRecord.CreateDate = new DateTime(2015, 5, 10);
            testRecord.AnotherField = false;
        }
        testRecords[count] = testRecord;
    }

    StringBuilder sBuilder = new StringBuilder();
    System.IO.StringWriter sWriter = new System.IO.StringWriter(sBuilder);
    XmlSerializer serializer = new XmlSerializer(typeof(UserTable[]));
    serializer.Serialize(sWriter, testRecords);

    using (SqlConnection con = new SqlConnection(connectionString))
    {
        string sprocName = "spTEST_UpdateTEST_TEST";
        using (SqlCommand cmd = new SqlCommand(sprocName, con))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            SqlParameter param1 = new SqlParameter("@UpdatedProdData", SqlDbType.VarChar, int.MaxValue);
            param1.Value = sBuilder.Remove(0, 41).ToString();
            cmd.Parameters.Add(param1);
            con.Open();
            int result = cmd.ExecuteNonQuery();
            con.Close();
        }
    }
}
@Fredrik Johansson I am not sure what you're saying will work. It seems to me you want me to make an update statement for each record. I can't do that, since I may need to update 1 to 50,000+ records, and I will not know how many until that point.
Edit 3
So this is my SP now. I think it should be able to handle concurrent connections, but I wanted to make sure.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[sp_MassUpdate]
    @BatchNumber uniqueidentifier
AS
BEGIN
    UPDATE prod
    SET ProductQty = 50
    FROM Product prod
    JOIN StagingTbl stage ON prod.ProductId = stage.ProductId
    WHERE stage.BatchNumber = @BatchNumber

    DELETE FROM StagingTbl
    WHERE BatchNumber = @BatchNumber
END
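For what it's worth, the C# side of that staging flow might look like the sketch below; the StagingTbl columns beyond ProductId and BatchNumber are assumptions. The per-call GUID is what keeps concurrent connections from touching each other's staging rows:
// Requires: using System.Data; using System.Data.SqlClient;
Guid batchNumber = Guid.NewGuid(); // unique per caller, so batches don't collide

var staging = new DataTable();
staging.Columns.Add("ProductId", typeof(int));
staging.Columns.Add("BatchNumber", typeof(Guid));
staging.Rows.Add(42, batchNumber); // ... one row per product to update ...

using (var con = new SqlConnection(connectionString))
{
    con.Open();
    using (var bulk = new SqlBulkCopy(con))
    {
        bulk.DestinationTableName = "StagingTbl";
        bulk.ColumnMappings.Add("ProductId", "ProductId");
        bulk.ColumnMappings.Add("BatchNumber", "BatchNumber");
        bulk.WriteToServer(staging);
    }
    using (var cmd = new SqlCommand("sp_MassUpdate", con))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@BatchNumber", SqlDbType.UniqueIdentifier).Value = batchNumber;
        cmd.ExecuteNonQuery();
    }
}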
You can use the SqlDataAdapter to do a batch update. It doesn't matter how you fill your dataset, L2SQL or whatever; you can use different methods to do the update. Just define the query to run using the data in your datatable.
The key here is UpdateBatchSize. The data adapter will send the updates in batches of whatever size you define. You need to experiment with this value to see what number works best, but typically numbers of 500-1000 do best. SQL Server can then optimize the update and execute a little faster. Note that when doing batch updates, you cannot update the row source of the datatable (hence UpdatedRowSource.None below).
I use this method to do updates of 10-100K rows and it usually runs in under 2 minutes. It will depend on what you are updating, though.
Sorry, this is in VB….
Using da As New SqlDataAdapter
    da.UpdateCommand = conn.CreateCommand
    da.UpdateCommand.CommandTimeout = 300
    da.AcceptChangesDuringUpdate = False
    da.ContinueUpdateOnError = False
    da.UpdateBatchSize = 1000 'Experiment for best performance
    da.UpdateCommand.UpdatedRowSource = UpdateRowSource.None 'Needed if UpdateBatchSize > 1

    sql = "UPDATE YourTable"
    sql += " SET YourField = @YourField"
    sql += " WHERE ID = @ID"
    da.UpdateCommand.CommandText = sql

    da.UpdateCommand.Parameters.Clear()
    da.UpdateCommand.Parameters.Add("@YourField", SqlDbType.SmallDateTime).SourceColumn = "YourField"
    da.UpdateCommand.Parameters.Add("@ID", SqlDbType.Int).SourceColumn = "ID"

    da.Update(ds.Tables("YourTable"))
End Using
Another option is to bulk copy to a temp table, and then run a query to update the main table from it. This may be faster.
As allonym said, use SqlBulkCopy, which is very fast (I found speed improvements of over 200x, from 1500 seconds to 6 seconds). However, you can use the DataTable and DataRow classes to provide data to SqlBulkCopy (which seems easier). Using SqlBulkCopy this way has the added advantage of being .NET 3.0 compliant as well (LINQ was added only in 3.5).
Check out http://msdn.microsoft.com/en-us/library/ex21zs8x%28v=VS.100%29.aspx for some sample code.
Use SqlBulkCopy, which is lightning-fast. You'll need a custom IDataReader implementation which enumerates over your LINQ query results. Look at http://code.msdn.microsoft.com/LinqEntityDataReader for more info and some potentially suitable IDataReader code.
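The linked code is the real thing; as a rough illustration of the shape of such a reader, here is a stripped-down sketch that exposes an IEnumerable<T> through IDataReader, implementing only the members a bulk copy typically touches and throwing on the rest. Property order must match the destination columns unless you add ColumnMappings; this is a simplification of the LinqEntityDataReader idea, not a drop-in replacement:
// Requires: using System; using System.Collections.Generic; using System.Data; using System.Reflection;
public sealed class EnumerableDataReader<T> : IDataReader
{
    private readonly IEnumerator<T> _rows;
    private readonly PropertyInfo[] _props = typeof(T).GetProperties();

    public EnumerableDataReader(IEnumerable<T> rows) { _rows = rows.GetEnumerator(); }

    // The members SqlBulkCopy actually uses:
    public int FieldCount { get { return _props.Length; } }
    public bool Read() { return _rows.MoveNext(); }
    public object GetValue(int i) { return _props[i].GetValue(_rows.Current, null) ?? DBNull.Value; }
    public int GetOrdinal(string name) { return Array.FindIndex(_props, p => p.Name == name); }
    public string GetName(int i) { return _props[i].Name; }
    public Type GetFieldType(int i) { return _props[i].PropertyType; }
    public bool IsDBNull(int i) { return GetValue(i) == DBNull.Value; }
    public object this[int i] { get { return GetValue(i); } }
    public object this[string name] { get { return GetValue(GetOrdinal(name)); } }

    public void Dispose() { _rows.Dispose(); }
    public void Close() { Dispose(); }
    public bool IsClosed { get { return false; } }
    public int Depth { get { return 0; } }
    public int RecordsAffected { get { return -1; } }
    public bool NextResult() { return false; }
    public DataTable GetSchemaTable() { throw new NotSupportedException(); }

    // Remaining IDataRecord members are not needed for this scenario.
    public bool GetBoolean(int i) { throw new NotSupportedException(); }
    public byte GetByte(int i) { throw new NotSupportedException(); }
    public long GetBytes(int i, long o, byte[] b, int bo, int l) { throw new NotSupportedException(); }
    public char GetChar(int i) { throw new NotSupportedException(); }
    public long GetChars(int i, long o, char[] b, int bo, int l) { throw new NotSupportedException(); }
    public IDataReader GetData(int i) { throw new NotSupportedException(); }
    public string GetDataTypeName(int i) { throw new NotSupportedException(); }
    public DateTime GetDateTime(int i) { throw new NotSupportedException(); }
    public decimal GetDecimal(int i) { throw new NotSupportedException(); }
    public double GetDouble(int i) { throw new NotSupportedException(); }
    public float GetFloat(int i) { throw new NotSupportedException(); }
    public Guid GetGuid(int i) { throw new NotSupportedException(); }
    public short GetInt16(int i) { throw new NotSupportedException(); }
    public int GetInt32(int i) { throw new NotSupportedException(); }
    public long GetInt64(int i) { throw new NotSupportedException(); }
    public string GetString(int i) { throw new NotSupportedException(); }
    public int GetValues(object[] values) { throw new NotSupportedException(); }
}
Usage would then be something like bulkCopy.WriteToServer(new EnumerableDataReader<MyRow>(myLinqQuery));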
You have to work with the expression trees directly, but it's doable. In fact, it's already been done for you, you just have to download the source:
Batch Updates and Deletes with LINQ to SQL
The alternative is to just use stored procedures or ad-hoc SQL queries using the ExecuteMethodCall and ExecuteCommand methods of the DataContext.
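For the ad-hoc route, DataContext.ExecuteCommand issues one set-based statement instead of per-row changes; a sketch using the question's context (the column names are assumed):
using (var db = new TestDataContext())
{
    // {0}-style placeholders are sent as parameters, not concatenated into the SQL.
    db.ExecuteCommand(
        "UPDATE UserTable SET AnotherField = {0} WHERE CreateDate < {1}",
        true, new DateTime(2020, 1, 1));
}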
You can use SqlDataAdapter to do a batch update even if the datatable is filled manually/programmatically (from LINQ or any other source).
Just remember to manually set the RowState for the rows in the datatable. Use dataRow.SetModified() for this.
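Note that SetModified only works on an Unchanged row, so accept the changes first; a short sketch:
DataTable table = new DataTable();
table.Columns.Add("ID", typeof(int));
table.Columns.Add("YourField", typeof(DateTime));
table.Rows.Add(1, DateTime.Today);

table.AcceptChanges();        // newly added rows start as Added; reset them to Unchanged
foreach (DataRow row in table.Rows)
    row.SetModified();        // now flag them as Modified so the UpdateCommand fires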
