Well, I have a file.sql that contains 20,000 INSERT commands.
Sample from the .sql file:
INSERT INTO table VALUES
(1,-400,400,3,154850,'Text',590628,'TEXT',1610,'TEXT',79);
INSERT INTO table VALUES
(39,-362,400,3,111659,'Text',74896,'TEXT',0,'TEXT',14);
I am using the following code to create an in-memory SQLite database, pull the values into it, and then calculate the time elapsed:
using (var conn = new SQLiteConnection(@"Data Source=:memory:"))
{
conn.Open();
var stopwatch = new Stopwatch();
stopwatch.Start();
using (var cmd = new SQLiteCommand(conn))
{
using (var transaction = conn.BeginTransaction())
{
cmd.CommandText = File.ReadAllText(@"file.sql");
cmd.ExecuteNonQuery();
transaction.Commit();
}
}
var timeelapsed = stopwatch.Elapsed.TotalSeconds <= 60
? stopwatch.Elapsed.TotalSeconds + " seconds"
: Math.Round(stopwatch.Elapsed.TotalSeconds/60) + " minutes";
MessageBox.Show(string.Format("Time elapsed {0}", timeelapsed));
conn.Close();
}
Things I have tried:
Using a file database instead of the in-memory one.
Using begin transaction and commit transaction [AS SHOWN IN MY CODE].
Using the Firefox extension SQLite Manager to test whether the slowdown comes from the script itself; to my surprise, the same 20,000 lines that I am trying to process with my code were pulled into the database in JUST 4 ms!
Using PRAGMA synchronous = OFF, as well as PRAGMA journal_mode = MEMORY.
Appending begin transaction; and commit transaction; to the beginning and end of the .sql file, respectively.
As the SQLite documentation says, SQLite is capable of processing 50,000 commands per second. That is real, and I confirmed it using SQLite Manager [AS DESCRIBED IN THE THIRD THING THAT I'VE TRIED]; however, my 20,000 commands take about 4 minutes, which tells me that something is wrong.
QUESTION: What is the problem I am facing? Why is the execution so slow?
The SQLite.Net documentation recommends the following construct for transactions:
using (SQLiteConnection conn = new SQLiteConnection(@"Data Source=:memory:"))
{
conn.Open();
using (SQLiteTransaction trans = conn.BeginTransaction())
{
using (SQLiteCommand cmd = new SQLiteCommand(conn))
{
cmd.CommandText = File.ReadAllText(@"file.sql");
cmd.ExecuteNonQuery();
}
trans.Commit();
}
conn.Close();
}
Are you able to manipulate the text file contents to something like:
INSERT INTO table (col01, col02, col03, col04, col05, col06, col07, col08, col09, col10, col11)
SELECT 1,-400,400,3,154850,'Text',590628,'TEXT',1610,'TEXT',79
UNION ALL
SELECT 39,-362,400,3,111659,'Text',74896,'TEXT',0,'TEXT',14
;
Maybe try "batching them" into groups of 100 as a initial test.
http://sqlite.org/lang_select.html
SQLite supports the UNION ALL statement.
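If rewriting every row into the UNION ALL form by hand is impractical, a simpler variant of the batching idea is to execute the existing INSERT statements in groups per command. A rough C# sketch (not from the original post; it assumes ';' only appears as a statement terminator in file.sql):
// Sketch only: split file.sql into individual statements and execute them
// in batches of 100, all inside a single transaction.
// Requires System.IO, System.Linq and System.Data.SQLite.
var statements = File.ReadAllText(@"file.sql")
    .Split(';')
    .Where(s => !string.IsNullOrWhiteSpace(s))
    .Select(s => s.Trim() + ";")
    .ToList();
using (var conn = new SQLiteConnection(@"Data Source=:memory:"))
{
    conn.Open();
    using (var transaction = conn.BeginTransaction())
    using (var cmd = new SQLiteCommand(conn))
    {
        for (int i = 0; i < statements.Count; i += 100)
        {
            // each batch of 100 statements becomes one multi-statement command
            cmd.CommandText = string.Join("\n", statements.Skip(i).Take(100));
            cmd.ExecuteNonQuery();
        }
        transaction.Commit();
    }
}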
Related
I'm working with a hosted service in C# ASP.NET Core, LINQ and T-SQL.
I need to make an insert one by one of records in my database.
Of course this is not a fast operation, but I'm not that experienced in this field so maybe I'm doing it wrong.
This is my code in my manager:
public void StrategyMassive(string foldpathsave)
{
using (IServiceScope scope = _services.CreateScope())
{
List<string> filesreading = new List<string>();
VUContext _context = scope.ServiceProvider.GetRequiredService<VUContext>();
List<string> filesnumber = File.ReadAllLines(foldpathsave).ToList();
filesreading = filesnumber.ToList();
filesreading.RemoveRange(0, 2);
foreach (string singlefile in filesreading)
{
//INTERNAL DATA NORMALIZATION
_repository.ImportAdd(_context, newVUL, newC2, newC3, newDATE);
_repository.Save(_context);
}
}
}
And this is my repository interface:
public void ImportAdd(VUContext _context, AVuTable newVUL, ACs2Table newC2, ACs3Table newC3, ADateTable newDATE)
{
_context.AVuTable.Add(newVUL);
_context.ADateTable.Add(newDATE);
if (newC2 != null)
{
_context.ACs2Table.Add(newC2);
}
if (newC3 != null)
{
_context.ACs3Table.Add(newC3);
}
}
public void Save(VUContext _context)
{
_context.SaveChanges();
}
It is all quite simple, I know, so how can I easily speed up this insert while keeping it one record at a time?
Start by NOT using the slowest way to do it.
It starts with the way you actually load the files.
It goes on with not using SqlBulkCopy - possibly in multiple threads - to write the data to the database.
What you do is the slowest possible way, because Entity Framework is NOT an ETL tool.
Btw., one transaction per item (SaveChanges) does not help either. It makes a super slow solution really, really, really super slow.
I manage to load around 64k rows per second per thread, with 4-6 threads running in parallel.
In my experience, SqlBulkCopy is the fastest way to do it. filesnumber sounds like a misnomer, and I suspect you are reading a list of delimited files to be loaded into SQL Server after some normalization process. It would probably be even faster if you did the normalization on the server side, after loading the data initially into a temp table. Here is a sample SqlBulkCopy load from a delimited file:
void Main()
{
Stopwatch sw = new Stopwatch();
sw.Start();
string sqlConnectionString = @"server=.\SQLExpress2012;Trusted_Connection=yes;Database=SampleDb";
string path = @"d:\temp\SampleTextFiles";
string fileName = @"combDoubledX.csv";
using (OleDbConnection cn = new OleDbConnection(
"Provider=Microsoft.ACE.OLEDB.12.0;Data Source="+path+
";Extended Properties=\"text;HDR=No;FMT=Delimited\";"))
using (SqlConnection scn = new SqlConnection( sqlConnectionString ))
{
OleDbCommand cmd = new OleDbCommand("select * from "+fileName, cn);
SqlBulkCopy sbc = new SqlBulkCopy(scn, SqlBulkCopyOptions.TableLock,null);
sbc.ColumnMappings.Add(0,"[Category]");
sbc.ColumnMappings.Add(1,"[Activity]");
sbc.ColumnMappings.Add(5,"[PersonId]");
sbc.ColumnMappings.Add(6,"[FirstName]");
sbc.ColumnMappings.Add(7,"[MidName]");
sbc.ColumnMappings.Add(8,"[LastName]");
sbc.ColumnMappings.Add(12,"[Email]");
cn.Open();
scn.Open();
SqlCommand createTemp = new SqlCommand();
createTemp.CommandText = @"if exists
(SELECT * FROM tempdb.sys.objects
WHERE object_id = OBJECT_ID(N'[tempdb]..[##PersonData]','U'))
BEGIN
drop table [##PersonData];
END
create table ##PersonData
(
[Id] int identity primary key,
[Category] varchar(50),
[Activity] varchar(50) default 'NullOlmasin',
[PersonId] varchar(50),
[FirstName] varchar(50),
[MidName] varchar(50),
[LastName] varchar(50),
[Email] varchar(50)
)
";
createTemp.Connection = scn;
createTemp.ExecuteNonQuery();
OleDbDataReader rdr = cmd.ExecuteReader();
sbc.NotifyAfter = 200000;
//sbc.BatchSize = 1000;
sbc.BulkCopyTimeout = 10000;
sbc.DestinationTableName = "##PersonData";
//sbc.EnableStreaming = true;
sbc.SqlRowsCopied += (sender,e) =>
{
Console.WriteLine("-- Copied {0} rows to {1}.[{2} milliseconds]",
e.RowsCopied,
((SqlBulkCopy)sender).DestinationTableName,
sw.ElapsedMilliseconds);
};
sbc.WriteToServer(rdr);
if (!rdr.IsClosed) { rdr.Close(); }
cn.Close();
scn.Close();
}
sw.Stop();
sw.Dump(); // LINQPad extension; use Console.WriteLine(sw.Elapsed) in a plain console app
}
And a few sample lines from that file:
"Computer Labs","","LRC 302 Open Lab","","","10057380","Test","","Cetin","","5550123456","","cb#nowhere.com"
"Computer Labs","","LRC 302 Open Lab","","","123456789","John","","Doe","","5551234567","","jdoe#somewhere.com"
"Computer Labs","","LRC 302 Open Lab","","","012345678","Mary","","Doe","","5556666444","","mdoe#here.com"
You could also create and run a list of Task<>s, each doing a SqlBulkCopy read from a source (SqlBulkCopy supports a variety of readers).
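A rough sketch of that idea (GetChunks and ChunkToDataTable are hypothetical helpers that split the source data and turn each chunk into a DataTable; the destination table name is a placeholder):
// Sketch only: several SqlBulkCopy operations running in parallel tasks,
// each on its own connection.
var tasks = GetChunks().Select(chunk => Task.Run(() =>
{
    using (var scn = new SqlConnection(sqlConnectionString))
    {
        scn.Open();
        using (var sbc = new SqlBulkCopy(scn))
        {
            sbc.DestinationTableName = "dbo.PersonData"; // placeholder name
            sbc.BulkCopyTimeout = 10000;
            sbc.WriteToServer(ChunkToDataTable(chunk)); // hypothetical helper
        }
    }
})).ToArray();
Task.WaitAll(tasks);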
For faster operation you need to reduce the number of database round trips.
Using the statement batching feature in EF Core
You can see this feature is available only in EF Core, so you need to migrate to using EF Core if you are still using EF 6.
Compare EF Core & EF6
For this feature to work you need to move the Save operation outside of the loop.
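A minimal sketch of that change, using the names from the question (only the placement of Save changes):
foreach (string singlefile in filesreading)
{
    //INTERNAL DATA NORMALIZATION
    _repository.ImportAdd(_context, newVUL, newC2, newC3, newDATE);
}
// a single SaveChanges call lets EF Core batch the pending inserts
// into far fewer database round trips
_repository.Save(_context);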
Bulk insert
The bulk insert feature is designed to be the fastest way to insert a large number of database records.
Bulk Copy Operations in SQL Server
To use it you need to use the SqlBulkCopy class for SQL Server and your code needs considerable rework.
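A rough sketch of that rework (table and column names are placeholders, connectionString is a placeholder, and the mapping of your normalized objects into the DataTable is left out):
// Sketch only: collect the normalized records into a DataTable and push them
// to SQL Server in one bulk operation instead of row-by-row SaveChanges.
var table = new DataTable();
table.Columns.Add("C2", typeof(string));     // placeholder columns
table.Columns.Add("C3", typeof(string));
table.Columns.Add("Date", typeof(DateTime));
foreach (string singlefile in filesreading)
{
    //INTERNAL DATA NORMALIZATION, then:
    table.Rows.Add(newC2Value, newC3Value, newDateValue); // hypothetical values
}
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulk = new SqlBulkCopy(connection))
    {
        bulk.DestinationTableName = "dbo.AVuTable"; // placeholder name
        bulk.WriteToServer(table);
    }
}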
I recently read about SQLite and thought I would give it a try. When I insert one record it performs okay. But when I insert one hundred it takes five seconds, and as the record count increases so does the time. What could be wrong? I am using the SQLite Wrapper (system.data.SQlite):
dbcon = new SQLiteConnection(connectionString);
dbcon.Open();
//---INSIDE LOOP
SQLiteCommand sqlComm = new SQLiteCommand(sqlQuery, dbcon);
nRowUpdatedCount = sqlComm.ExecuteNonQuery();
//---END LOOP
dbcon.Close();
Wrap BEGIN / END statements around your bulk inserts. SQLite is optimized for transactions.
dbcon = new SQLiteConnection(connectionString);
dbcon.Open();
SQLiteCommand sqlComm;
sqlComm = new SQLiteCommand("begin", dbcon);
sqlComm.ExecuteNonQuery();
//---INSIDE LOOP
sqlComm = new SQLiteCommand(sqlQuery, dbcon);
nRowUpdatedCount = sqlComm.ExecuteNonQuery();
//---END LOOP
sqlComm = new SQLiteCommand("end", dbcon);
sqlComm.ExecuteNonQuery();
dbcon.Close();
I read everywhere that creating transactions is the solution to slow SQLite writes, but it can be long and painful to rewrite your code and wrap all your SQLite writes in transactions.
I found a much simpler, safe and very efficient method: I enable a (disabled by default) SQLite 3.7.0 optimisation: the Write-Ahead Log (WAL).
The documentation says it works on all Unix (i.e. Linux and OS X) and Windows systems.
How? Just run the following commands after initializing your SQLite connection:
PRAGMA journal_mode = WAL
PRAGMA synchronous = NORMAL
My code now runs ~600% faster: my test suite now runs in 38 seconds instead of 4 minutes :)
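In System.Data.SQLite the same thing can be done right after opening the connection; a minimal sketch (the connection string is just an example):
using (var conn = new SQLiteConnection("Data Source=mydb.sqlite;Version=3;"))
{
    conn.Open();
    // switch to WAL and relax the sync mode for this database
    using (var cmd = new SQLiteCommand(
        "PRAGMA journal_mode = WAL; PRAGMA synchronous = NORMAL;", conn))
    {
        cmd.ExecuteNonQuery();
    }
    // ... perform the writes on this connection ...
}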
Try wrapping all of your inserts (aka, a bulk insert) into a single transaction:
string insertString = "INSERT INTO [TableName] ([ColumnName]) Values (@value)";
SQLiteCommand command = new SQLiteCommand();
command.Parameters.AddWithValue("#value", value);
command.CommandText = insertString;
command.Connection = dbConnection;
SQLiteTransaction transaction = dbConnection.BeginTransaction();
try
{
//---INSIDE LOOP
SQLiteCommand sqlComm = new SQLiteCommand(sqlQuery, dbConnection);
nRowUpdatedCount = sqlComm.ExecuteNonQuery();
//---END LOOP
transaction.Commit();
return true;
}
catch (SQLiteException ex)
{
transaction.Rollback();
}
By default, SQLite wraps every INSERT in its own transaction, which slows down the process:
INSERT is really slow - I can only do few dozen INSERTs per second
Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second.
Transaction speed is limited by disk drive speed because (by default) SQLite actually waits until the data really is safely stored on the disk surface before the transaction is complete. That way, if you suddenly lose power or if your OS crashes, your data is still safe. For details, read about atomic commit in SQLite.
By default, each INSERT statement is its own transaction. But if you surround multiple INSERT statements with BEGIN...COMMIT then all the inserts are grouped into a single transaction. The time needed to commit the transaction is amortized over all the enclosed insert statements and so the time per insert statement is greatly reduced.
See "Optimizing SQL Queries" in the ADO.NET help file SQLite.NET.chm. Code from that page:
using (SQLiteTransaction mytransaction = myconnection.BeginTransaction())
{
using (SQLiteCommand mycommand = new SQLiteCommand(myconnection))
{
SQLiteParameter myparam = new SQLiteParameter();
int n;
mycommand.CommandText = "INSERT INTO [MyTable] ([MyId]) VALUES(?)";
mycommand.Parameters.Add(myparam);
for (n = 0; n < 100000; n ++)
{
myparam.Value = n + 1;
mycommand.ExecuteNonQuery();
}
}
mytransaction.Commit();
}
Within C# application code, I would like to create and then interact with one or more SQLite databases.
How do I initialize a new SQLite database file and open it for reading and writing?
Following the database's creation, how do I execute a DDL statement to create a table?
The following link will bring you to a great tutorial that helped me a lot!
How to SQLITE in C#: I used nearly everything in that article to create the SQLite database for my own C# application.
Preconditions
Download the SQLite.dll
either by adding the SQLite DLLs manually
or by using NuGet
Add it as a reference to your project
Refer to the dll from your code using the following line on top of your class: using System.Data.SQLite;
Code sample
The code below creates a database file and inserts a record into it:
// this creates a zero-byte file
SQLiteConnection.CreateFile("MyDatabase.sqlite");
string connectionString = "Data Source=MyDatabase.sqlite;Version=3;";
SQLiteConnection m_dbConnection = new SQLiteConnection(connectionString);
m_dbConnection.Open();
// varchar will likely be handled internally as TEXT
// the (20) will be ignored
// see https://www.sqlite.org/datatype3.html#affinity_name_examples
string sql = "Create Table highscores (name varchar(20), score int)";
// you could also write sql = "CREATE TABLE IF NOT EXISTS highscores ..."
SQLiteCommand command = new SQLiteCommand(sql, m_dbConnection);
command.ExecuteNonQuery();
sql = "Insert into highscores (name, score) values ('Me', 9001)";
command = new SQLiteCommand(sql, m_dbConnection);
command.ExecuteNonQuery();
m_dbConnection.Close();
After you have created a create script in C#, you might want to add rollback transactions. This ensures that the data is committed at the end in one big piece, as an atomic operation, and not in little pieces where it could fail at, say, the 5th of 10 queries.
Example on how to use transactions:
using (TransactionScope transaction = new TransactionScope())
{
//Insert create script here.
// Indicates that creating the SQLite database went successfully,
// so the database can be committed.
transaction.Complete();
}
3rd party edit
To read records you can use ExecuteReader()
sql = "SELECT score, name, Length(name) as Name_Length
FROM highscores WHERE score > 799";
command = new SQLiteCommand(sql, m_dbConnection);
SQLiteDataReader reader = command.ExecuteReader();
while(reader.Read())
{
Console.WriteLine(reader[0].ToString() + " "
+ reader[1].ToString() + " "
+ reader[2].ToString());
}
m_dbConnection.Close();
See also this transactionscope example
I'm building a .NET application that talks to an Oracle 11g database. I am trying to take data from Excel files provided by a third party and upsert (UPDATE record if exists, INSERT if not), but am having some trouble with performance.
These Excel files are to replace tariff codes and descriptions, so there are a couple thousand records in each file.
| Tariff     | Description               |
|------------|---------------------------|
| 1234567890 | 'Sample description here' |
I did some research on bulk inserting, and even wrote a function that opens a transaction in the application, executes a bunch of UPDATE or INSERT statements, then commits. Unfortunately, that takes a long time and prolongs the session between the application and the database.
public void UpsertMultipleRecords(string[] updates, string[] inserts) {
OleDbConnection conn = new OleDbConnection("connection string here");
conn.Open();
OleDbTransaction trans = conn.BeginTransaction();
try {
for (int i = 0; i < updates.Length; i++) {
OleDbCommand cmd = new OleDbCommand(updates[i], conn);
cmd.Transaction = trans;
int count = cmd.ExecuteNonQuery();
if (count < 1) {
cmd = new OleDbCommand(inserts[i], conn);
cmd.Transaction = trans;
cmd.ExecuteNonQuery();
}
}
trans.Commit();
} catch (OleDbException ex) {
trans.Rollback();
} finally {
conn.Close();
}
}
I found via Ask Tom that an efficient way of doing something like this is using an Oracle MERGE statement, implemented in 9i. From what I understand, this is only possible using two existing tables in Oracle. I've tried but don't understand temporary tables or if that's possible. If I create a new table that just holds my data when I MERGE, I still need a solid way of bulk inserting.
The way I usually upload my files to merge is by first inserting into a load table with SQL*Loader and then executing a MERGE statement from the load table into the target table.
A temporary table will only retain its contents for the duration of the session. I expect SQL*Loader to end the session upon completion, so it is better to use a normal table that you truncate after the merge.
merge into target_table t
using load_table l on (t.key = l.key) -- brackets are mandatory
when matched then update
set t.col = l.col
, t.col2 = l.col2
, t.col3 = l.col3
when not matched then insert
(t.key, t.col, t.col2, t.col3)
values
(l.key, l.col, l.col2, l.col3)
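If the merge is driven from the application, a rough sketch using the OleDb objects already in the question (load_table and target_table are the placeholder names from the MERGE above, and the load table is assumed to have been filled already, e.g. by SQL*Loader):
using (OleDbConnection conn = new OleDbConnection("connection string here"))
{
    conn.Open();
    // run the merge from the load table into the target table
    string mergeSql = @"merge into target_table t
        using load_table l on (t.key = l.key)
        when matched then update set t.col = l.col, t.col2 = l.col2, t.col3 = l.col3
        when not matched then insert (t.key, t.col, t.col2, t.col3)
        values (l.key, l.col, l.col2, l.col3)";
    using (OleDbCommand cmd = new OleDbCommand(mergeSql, conn))
    {
        cmd.ExecuteNonQuery();
    }
    // empty the load table for the next run
    using (OleDbCommand truncate = new OleDbCommand("truncate table load_table", conn))
    {
        truncate.ExecuteNonQuery();
    }
}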