SqlBulkCopy.WriteToServer hangs; Thread.Abort is called, but I'm not sure why


Given:
A BenchMark class that lets me know when something has completed.
A very large XML file (~120MB) that has been parsed into multiple Lists
Some code:
SqlConnection con = null;
SqlTransaction transaction = null;
try
{
    con = getCon(); // gets a new connection object
    con.Open();
    transaction = con.BeginTransaction();
    var bulkCopy = new SqlBulkCopy(con, SqlBulkCopyOptions.Default, transaction)
    {
        BatchSize = 1000,
        DestinationTableName = "Table1"
    };
    // assume that the BenchMark class is working
    b = new BenchMark("Table1");
    bulkCopy.WriteToServer(_insertTable1s.AsDataReader()); // _insertTable1s is a List<Table1>
    b.Complete();
    LogHelper.WriteLogItem(b);
    b = new BenchMark("Table2");
    bulkCopy.DestinationTableName = "Table2";
    bulkCopy.WriteToServer(_insertTable2s.AsDataReader()); // _insertTable2s is a List<Table2>
    b.Complete();
    LogHelper.WriteLogItem(b);
    // etc... this code does a batch insert into about 7 tables all having about 40,000 records being inserted.
    b = new BenchMark("Transaction Commit");
    transaction.Commit();
    b.Complete();
}
catch (Exception e)
{
    // transaction is still null if getCon(), Open() or BeginTransaction() threw
    if (transaction != null) transaction.Rollback();
    LogHelper.WriteLogItem(
        LogLevel.Critical,
        LogType.DataProcessing,
        e.ToString());
}
finally
{
    // con is still null if getCon() threw
    if (con != null) con.Close();
}
The Problem:
On my local development environment, everything is fine. It's only when I run this operation in the cloud that it hangs. Using the LogHelper.WriteLogItem method, I can watch the progress of this process, and I observe it hanging randomly on a particular table. No exception is thrown, so the transaction isn't rolled back. Say it hangs on the Table2 bulk insert: using MS SQL Management Studio, I can run queries on Table3, Table2 and Table1 with no issue (does this mean that the transaction was aborted?)
Since it hangs, I'll go rerun the process. This time it hangs sooner so I might get logs like this:
7755 Benchmark LoadXML took 00:00:04.2432816
7756 Benchmark Table1 took 00:00:06.3961230
7757 Benchmark Table2 took 00:00:05.2566890
7758 Benchmark Table3 took 00:00:08.4900921
7759 Benchmark Table4 took 00:00:02.0000123
... it hangs on Table5 (because the BenchMark never completed). I go to run it again and the rest of the log looks like:
7780 Benchmark LoadXML took 00:00:04.1203923
... and it hangs here now.
I'm using Rackspace cloud hosting, if that helps. I have been able to fix this in the past by deleting all the tables from my dbml file and re-adding them, but this time it's not working. I'm wondering if the amount of data being processed is causing the problem?
EDIT: The code in this example is run in an Asynchronous thread. I've found out that the Thread is Aborting for an unknown reason and I need to find out why to solve this problem.

If you have admin access to your server or database, you can run
SELECT * FROM sys.dm_tran_session_transactions
to see which transactions are currently active (from Pinal).
Additionally, you can run sp_lock to make sure there isn't something blocking your transaction.

Because this process is done asynchronously (i.e. a thread is kicked off to handle it), something goes wrong in the thread and aborts it, and that is why I get strange behavior where the code stalls at different places. I've solved this by running the task synchronously (it works, but it's not ideal).
I guess the real issue is why my thread is aborting, since I'm not aborting it in any of my code. I believe that it's due to the amount of data being processed, but I could be wrong.
Either way, I've solved my problem.
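One way to find out why a worker thread dies is to wrap its body so that every exit path is logged. The following is a minimal sketch, not the poster's code: BackgroundRunner and its log callback are hypothetical names. On .NET Framework, a ThreadAbortException is raised in a thread when Thread.Abort is called on it or when its application domain is unloaded. An application recycle on the host would be one such cause, but that is an assumption, since the post does not establish what aborted the thread.

```csharp
using System;
using System.Threading;

public static class BackgroundRunner
{
    // Hypothetical helper: starts `work` on its own thread and logs how it ended.
    public static Thread Start(string name, Action work, Action<string> log)
    {
        var thread = new Thread(() =>
        {
            try
            {
                work();
                log(name + ": completed");
            }
            catch (ThreadAbortException ex)
            {
                // .NET Framework only: record the abort. The runtime re-raises the
                // exception after this block unless Thread.ResetAbort() is called.
                log(name + ": aborted, state = " + ex.ExceptionState);
            }
            catch (Exception ex)
            {
                // Any other failure would otherwise end the thread without a log entry.
                log(name + ": failed with " + ex.GetType().Name + ": " + ex.Message);
            }
        });
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }
}
```

With something like this around the bulk-copy routine, the log should show whether the thread was aborted or failed with an ordinary exception, instead of simply going quiet partway through a table.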

Related

why does MySQL Connector claim there is already an open DataReader when there isn't?

I'm using the .NET Connector to access a MySQL database from my C# program. All my queries are done with MySqlCommand.BeginExecuteReader, with the IAsyncResults held in a list so I can check them periodically and invoke appropriate callbacks whenever they finish, fetching the data via MySqlCommand.EndExecuteReader. I am careful never to hold one of these readers open while attempting to read results from something else.
This mostly works fine. But I find that if I start two queries at the same time, then I get the dreaded MySqlException: There is already an open DataReader associated with this Connection which must be closed first exception in EndExecuteReader. And this is happening the first time I invoke EndExecuteReader. So the error message is full of baloney; there is no other open DataReader at that point, unless the connector has somehow opened one behind the scenes without me calling EndExecuteReader. So what's going on?
Here's my update loop, including copious logging:
for (int i=queries.Count-1; i>=0; i--) {
    Debug.Log("Checking query: " + queries[i].command.CommandText);
    if (!queries[i].operation.IsCompleted) continue;
    var q = queries[i];
    queries.RemoveAt(i);
    Debug.Log("Finished, opening Reader for " + q.command.CommandText);
    using (var reader = q.command.EndExecuteReader(q.operation)) {
        try {
            q.callback(reader, null);
        } catch (System.Exception ex) {
            Logging.LogError("Exception while processing: " + q.command.CommandText);
            Logging.LogError(ex.ToString());
            q.callback(null, ex.ToString());
        }
    }
    Debug.Log("And done with callback for: " + q.command.CommandText);
}
And here's the log:
As you can see, I start both queries in rapid succession. (This is the first thing my program does after opening the DB connection, just to pin down what's happening.) Then the first one I check says it's done, so I call EndExecuteReader on it, and boom -- already it claims there's another open one. This happens immediately, before it even gets to my callback method. How can that be?
Is it not valid to have two open queries at once, even if I only call EndExecuteReader on one at a time?

When you run two queries concurrently, you must have two Connection objects. Why? Each Connection can only handle one query at a time. It looks like your code got into some kind of race condition where some of your concurrent queries worked and then a pair of them collided and failed.
At any rate your system will be more resilient in production if you can keep your startup sequences simple. If I were you I'd run one query after another rather than trying to run them all at once. (Obvs if that causes real performance problems you'll have to run them concurrently. But keep it simple until you need it to be complex.)

entity framework save taking a long time

There are many articles here on EF taking a long time to save, but I've looked through them and used their answers and still seem to get very slow results.
My code looks like so:
using (MarketingEntities1 db = new MarketingEntities1())
{
    //using (var trans = db.Database.BeginTransaction(IsolationLevel.ReadUncommitted))
    //{
    int count = 0;
    db.Configuration.AutoDetectChangesEnabled = false;
    db.Configuration.ValidateOnSaveEnabled = false;
    while (count < ranges.Count)
    {
        if (bgw != null)
        {
            bgw.ReportProgress(0, "Saving count: " + count.ToString());
        }
        db.Set<xGeoIPRanx>().AddRange(ranges.Skip(count).Take(BATCHCOUNT));
        db.SaveChanges();
        count+=BATCHCOUNT;
    }
    //trans.Commit();
    //}
}
Each batch takes 30+ seconds to complete. BATCHCOUNT is 1000. I know EF isn't that slow. You can see that I've stopped using the transaction and turned change detection off; none of it seemed to help.
Some more info:
xGeoIpRanx is an empty table with no PK (I'm not sure how much one would help). I'm trying to insert about 10 million ranges.
Edit:
I feel stupid, but I'm trying to use BulkInsert and I keep getting "entity doesn't exist" errors. I'm looking at this code:
using (var ctx = GetContext())
{
    using (var transactionScope = new TransactionScope())
    {
        // some stuff in dbcontext
        ctx.BulkInsert(entities);
        ctx.SaveChanges();
        transactionScope.Complete();
    }
}
What is "entities"? I tried a list of my entities and that doesn't work; what data type is it?
Never mind, it works as expected. It was a strange error due to how I generated the edmx file.

Pause the debugger 10 times under load and look at the stack, including external code. Where does it stop most often?
It's taking a long time on the .SaveChanges(). just from some quick tests, ADO.net code
That means network latency and server execution time are causing this. For inserts, server execution time is usually not that high. You cannot do anything about network latency with EF because it sends one batch per insert. (Yes, this is a deficiency of the framework.)
Don't use EF for bulk work. Consider using table-valued parameters or SqlBulkCopy or any other means of bulk inserting, such as Aducci's proposal from the comments.
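As an illustration of the SqlBulkCopy route, here is a minimal sketch. The IpRange type, the RangeBulkLoader class and the column names are hypothetical, since the post does not show the real columns of xGeoIPRanx; only the SqlBulkCopy members used (DestinationTableName, BatchSize, BulkCopyTimeout, ColumnMappings, WriteToServer) are the actual System.Data.SqlClient API.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Hypothetical row type; the real entity's columns are not shown in the post.
public sealed class IpRange
{
    public long Start { get; set; }
    public long End { get; set; }
    public string Country { get; set; }
}

public static class RangeBulkLoader
{
    // Copies the objects into a DataTable whose column names match the target table.
    public static DataTable ToDataTable(IEnumerable<IpRange> ranges)
    {
        var table = new DataTable();
        table.Columns.Add("RangeStart", typeof(long));
        table.Columns.Add("RangeEnd", typeof(long));
        table.Columns.Add("Country", typeof(string));
        foreach (var r in ranges)
        {
            table.Rows.Add(r.Start, r.End, r.Country);
        }
        return table;
    }

    // Streams the rows to SQL Server with SqlBulkCopy instead of one INSERT per row.
    public static void BulkInsert(string connectionString, string tableName, DataTable rows)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = tableName;
            bulk.BatchSize = 10000;   // rows sent per batch; tune for your data
            bulk.BulkCopyTimeout = 0; // 0 means no time limit
            foreach (DataColumn col in rows.Columns)
            {
                // Map by name so the column order in the target table does not matter.
                bulk.ColumnMappings.Add(col.ColumnName, col.ColumnName);
            }
            bulk.WriteToServer(rows);
        }
    }
}
```

For something on the order of 10 million rows, it is probably better to build and send the DataTable in chunks (or to pass an IDataReader) than to hold every row in memory at once.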

Can an NHibernate session have two data readers open in separate threads?

I'd like to know the correct approach for running two simultaneous queries using NHibernate. Right now, I have a single ISession object that I use for all my queries:
session = sessionFactory.OpenSession();
In one thread, I'm loading some data which takes 10-15 seconds, but I don't need it right away so I don't want to block the entire program while it's loading:
IDbCommand cmd = session.Connection.CreateCommand();
cmd.CommandType = CommandType.TableDirect;
cmd.CommandText = "RecipesForModelingGraph";
IDataReader reader = cmd.ExecuteReader();
while (reader.Read())
{
    // Do stuff
}
reader.Close();
This works fine, however in another thread I might be running a query such as:
var newBlah = new Blah();
session.Save(newBlah);
When the above transaction commits, I occasionally get an exception:
Additional information: There is already an open DataReader associated
with this Command which must be closed first.
Now, I thought maybe this was because I was running everything in the same transaction. So, I surrounded all my loading code with:
using (ITransaction transaction = session.BeginTransaction(IsolationLevel.Serializable))
{
    // Same DataReader code as above
}
However, the problem has not gone away. I'm thinking maybe I need each thread to have its own ISession object. Is this the correct approach, or am I doing something wrong? Note that I only want a single open connection to the database. Also, keep in mind the background thread is only loading data and nothing else, so I'm not worried about isolation levels or about data changing as it's being read.

The session is tied to the thread, and the commands created are linked to the session's connection object. So yes, if a commit or close is executed while an open reader exists, you will get an exception.
You could Join() your threads and wait until all are complete before closing/committing.

Can sql server queries be really cancelled/killed?

I would like to give a user the ability to cancel a running query. The query is really slow. (Query optimization is beside the point.) This is mainly out of curiosity.
MSDN says:
If there is nothing to cancel, nothing occurs. However, if there is a
command in process, and the attempt to cancel fails, no exception is
generated.
Cmd - SqlCommand
DA - DataAdapter
Conn - SqlConnection
CurrentSearch - Thread
LongQuery - Singleton
Here's what I have:
var t = new Thread(AbortThread);
t.Start();
void AbortThread()
{
    LongQuery.Current.Cmd.Cancel();
    LongQuery.Current.Cmd.Dispose();
    LongQuery.Current.DA.Dispose();
    LongQuery.Current.Conn.Close();
    LongQuery.Current.Conn.Dispose();
    LongQuery.Current.Cmd = null;
    LongQuery.Current.DA = null;
    LongQuery.Current.Conn = null;
    CurrentSearch.Abort();
    CurrentSearch.Join();
    CurrentSearch = null;
}
I noticed that CurrentSearch.Abort() was blocking, which is why I wrapped it in a thread; that probably means the search thread is still working.
Finally, is there anything else I can do to cancel a query? Is it actually possible to cancel such a long query from .NET?

If you really, absolutely want to kill it for good, use this approach:
Store the session ID right before starting the long-running query by calling SELECT @@SPID AS 'SESSIONID' on the same connection.
When you want to kill it:
Open a new DB connection.
Issue a KILL command for that session ID.
Beware: as the MSDN documentation states, you need the ALTER ANY CONNECTION permission to do this.
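The steps above could be sketched in C# roughly as follows. QueryKiller and its method names are hypothetical; the sketch assumes System.Data.SqlClient and that the caller's login has the permission mentioned above. KILL expects a literal session ID rather than a parameter, so the integer is formatted into the command text.

```csharp
using System;
using System.Data.SqlClient;
using System.Globalization;

public static class QueryKiller
{
    // Call on the (already open) connection that will run the slow query, before starting it.
    public static int GetSessionId(SqlConnection workConnection)
    {
        using (var cmd = new SqlCommand("SELECT @@SPID", workConnection))
        {
            // @@SPID comes back as a smallint; widen it to int.
            return Convert.ToInt32(cmd.ExecuteScalar());
        }
    }

    // Builds the KILL statement; the ID is an integer, so formatting it in is safe.
    public static string BuildKillCommand(int sessionId)
    {
        return "KILL " + sessionId.ToString(CultureInfo.InvariantCulture);
    }

    // Opens a second connection and kills the session running the query.
    public static void Kill(string connectionString, int sessionId)
    {
        using (var admin = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(BuildKillCommand(sessionId), admin))
        {
            admin.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```

One caveat: with connection pooling, a session ID belongs to the underlying physical connection, so it should only be killed while the original query is known to still be running on it.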

Yes, you can kill a process from .NET. Here is an example. Please note that you will need proper permissions and you have to figure out which process is in question. I don't have a quick sample for determining which process your query is running under.
Your example aborts the thread, but that does not mean the work on SQL Server was terminated. Think about it this way: when you go through a bad cell zone and the call drops, if your mom/wife/friend was droning on, do they instantly stop talking? That is an analogy for aborting the thread, at least in the case of working with a database server.

Database file is inexplicably locked during SQLite commit

I'm performing a large number of INSERTS to a SQLite database. I'm using just one thread. I batch the writes to improve performance and have a bit of security in case of a crash. Basically I cache up a bunch of data in memory and then when I deem appropriate, I loop over all of that data and perform the INSERTS. The code for this is shown below:
public void Commit()
{
    using (SQLiteConnection conn = new SQLiteConnection(this.connString))
    {
        conn.Open();
        using (SQLiteTransaction trans = conn.BeginTransaction())
        {
            using (SQLiteCommand command = conn.CreateCommand())
            {
                command.CommandText = "INSERT OR IGNORE INTO [MY_TABLE] (col1, col2) VALUES (?,?)";
                command.Parameters.Add(this.col1Param);
                command.Parameters.Add(this.col2Param);
                foreach (Data o in this.dataTemp)
                {
                    this.col1Param.Value = o.Col1Prop;
                    this.col2Param.Value = o.Col2Prop;
                    command.ExecuteNonQuery();
                }
            }
            this.TryHandleCommit(trans);
        }
        conn.Close();
    }
}
I now employ the following gimmick to get the thing to eventually work:
private void TryHandleCommit(SQLiteTransaction trans)
{
    try
    {
        trans.Commit();
    }
    catch (Exception e)
    {
        Console.WriteLine("Trying again...");
        this.TryHandleCommit(trans);
    }
}
I create my DB like so:
public DataBase(String path)
{
    //build connection string
    SQLiteConnectionStringBuilder connString = new SQLiteConnectionStringBuilder();
    connString.DataSource = path;
    connString.Version = 3;
    connString.DefaultTimeout = 5;
    connString.JournalMode = SQLiteJournalModeEnum.Persist;
    connString.UseUTF16Encoding = true;
    using (connection = new SQLiteConnection(connString.ToString()))
    {
        //check for existence of db
        FileInfo f = new FileInfo(path);
        if (!f.Exists) //build new blank db
        {
            SQLiteConnection.CreateFile(path);
            connection.Open();
            using (SQLiteTransaction trans = connection.BeginTransaction())
            {
                using (SQLiteCommand command = connection.CreateCommand())
                {
                    command.CommandText = DataBase.CREATE_MATCHES;
                    command.ExecuteNonQuery();
                    command.CommandText = DataBase.CREATE_STRING_DATA;
                    command.ExecuteNonQuery();
                    //TODO add logging
                }
                trans.Commit();
            }
            connection.Close();
        }
    }
}
I then export the connection string and use it to obtain new connections in different parts of the program.
At seemingly random intervals, though at far too great a rate to ignore or otherwise workaround this problem, I get unhandled SQLiteException: Database file is locked. This occurs when I attempt to commit the transaction. No errors seem to occur prior to then. This does not always happen. Sometimes the whole thing runs without a hitch.
No reads are being performed on these files before the commits finish.
I have the very latest SQLite binary.
I'm compiling for .NET 2.0.
I'm using VS 2008.
The db is a local file.
All of this activity is encapsulated within one thread / process.
Virus protection is off (though I think that was only relevant if you were connecting over a network?).
As per Scotsman's post I have implemented the following changes:
Journal Mode set to Persist
DB files stored in C:\Docs + Settings\ApplicationData via System.Windows.Forms.Application.AppData windows call
No inner exception
Witnessed on two distinct machines (albeit very similar hardware and software)
Have been running Process Monitor - no extraneous processes are attaching themselves to the DB files - the problem is definitely in my code...
Does anyone have any idea what's going on here?
I know I just dropped a whole mess of code, but I've been trying to figure this out for way too long. My thanks to anyone who makes it to the end of this question!
brian
UPDATES:
Thanks for the suggestions so far! I've implemented many of the suggested changes. I feel that we are getting closer to the answer...however...
The code above technically works; however, it is non-deterministic! It is not guaranteed to do anything aside from spin in neutral forever. In practice it seems to succeed somewhere between the 1st and 10th iteration. If I batch my commits at a reasonable interval the damage will be mitigated, but I really do not want to leave things in this state...
More suggestions welcome!

It looks like you failed to link the command with the transaction you've created.
Instead of:
using (SQLiteCommand command = conn.CreateCommand())
You should use:
using (SQLiteCommand command = new SQLiteCommand("<INSERT statement here>", conn, trans))
Or you can set its Transaction property after its construction.
While we are at it - your handling of failures is incorrect:
The command's ExecuteNonQuery method can also fail and you are not really protected. You should change the code to something like:
public void Commit()
{
    using (SQLiteConnection conn = new SQLiteConnection(this.connString))
    {
        conn.Open();
        SQLiteTransaction trans = conn.BeginTransaction();
        try
        {
            using (SQLiteCommand command = conn.CreateCommand())
            {
                // Now the command is linked to the transaction and doesn't try to create a new one (which is probably why your database gets locked)
                command.Transaction = trans;
                command.CommandText = "INSERT OR IGNORE INTO [MY_TABLE] (col1, col2) VALUES (?,?)";
                command.Parameters.Add(this.col1Param);
                command.Parameters.Add(this.col2Param);
                foreach (Data o in this.dataTemp)
                {
                    this.col1Param.Value = o.Col1Prop;
                    this.col2Param.Value = o.Col2Prop;
                    command.ExecuteNonQuery();
                }
            }
            trans.Commit();
        }
        catch (SQLiteException ex)
        {
            // You need to roll back in case something went wrong in command.ExecuteNonQuery() ...
            trans.Rollback();
            throw;
        }
    }
}
Another thing is that you don't need to cache anything in memory. You can depend on SQLite's journaling mechanism to store the state of an incomplete transaction.
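If some retrying is still wanted while the root cause is being tracked down, a bounded retry avoids the unbounded recursion of the question's TryHandleCommit. This is only a sketch under that assumption: Retry.Run is a hypothetical helper, and it does not address why the file is locked in the first place.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Hypothetical helper: runs `action`, retrying up to maxAttempts times with a
    // fixed delay between attempts, then lets the last exception propagate.
    public static void Run(Action action, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return; // success: stop retrying
            }
            catch (Exception)
            {
                // Out of attempts: rethrow so the caller can roll back and report.
                if (attempt >= maxAttempts) throw;
                Thread.Sleep(delay);
            }
        }
    }
}
```

A call such as Retry.Run(() => trans.Commit(), 5, TimeSpan.FromMilliseconds(200)) would then give up after five attempts instead of spinning forever; in real code the catch would likely be narrowed to the provider's exception type.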

Run Sysinternals Process Monitor and filter on the filename while running your program, to rule out any other process touching the file and to see exactly what your program is doing to it. Long shot, but it might give a clue.

We had a very similar problem using nested Transactions with the TransactionScope class. We thought all database actions occurred on the same thread...however we were caught out by the Transaction mechanism...more specifically the Ambient transaction.
Basically there was a transaction higher up the chain which, by the magic of ado, the connection automatically enlisted in. The result was that, even though we thought we were writing to the database on a single thread, the write didn't really happen until the topmost transaction was committed. At this 'indeterminate' point the database was written to causing it to be locked outside of our control.
The solution was to ensure that the sqlite database did not directly take part in the ambient transaction by ensuring we used something like:
using (TransactionScope scope = new TransactionScope(TransactionScopeOption.RequiresNew))
{
    ...
    scope.Complete();
}
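To see what a nested scope does to the ambient transaction, the following sketch records Transaction.Current inside each kind of scope. AmbientDemo is a hypothetical name; the sketch uses only System.Transactions and touches no database. Whether a given connection actually enlists in the ambient transaction depends on the provider and its connection-string settings, which is assumed rather than shown here. TransactionScopeOption.Suppress is included for comparison as another way to keep work out of an outer transaction.

```csharp
using System;
using System.Transactions;

public static class AmbientDemo
{
    // Returns the identifier of the ambient transaction, or null when there is none.
    public static string CurrentId()
    {
        Transaction current = Transaction.Current;
        return current == null ? null : current.TransactionInformation.LocalIdentifier;
    }

    // Reports what code inside each kind of nested scope sees as Transaction.Current.
    public static string[] Probe()
    {
        string outer, joined, fresh, suppressed;
        using (var outerScope = new TransactionScope())
        {
            outer = CurrentId();
            using (var s = new TransactionScope(TransactionScopeOption.Required))
            {
                joined = CurrentId();     // joins the outer transaction
                s.Complete();
            }
            using (var s = new TransactionScope(TransactionScopeOption.RequiresNew))
            {
                fresh = CurrentId();      // a separate, independent transaction
                s.Complete();
            }
            using (var s = new TransactionScope(TransactionScopeOption.Suppress))
            {
                suppressed = CurrentId(); // no ambient transaction in this scope
                s.Complete();
            }
            outerScope.Complete();
        }
        return new[] { outer, joined, fresh, suppressed };
    }
}
```

The default Required option silently joins whatever transaction is already ambient, which matches the surprise described above; work placed in a RequiresNew scope commits when that inner scope completes rather than when the topmost one does.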

Things to watch for:
don't use connections across multiple threads/processes.
I've seen it happen when a virus scanner would detect changes to the file and try to scan it. It would lock the file for a short interval and cause havoc.

I started facing this same problem today: I'm studying asp.net mvc, building my first application completely from scratch. Sometimes, when I'd write to the database, I'd get the same exception, saying the database file was locked.
I found it really strange, since I was completely sure that there was just one connection open at that time (based on process explorer's listing of active file handles).
I've also built the whole data access layer from scratch, using System.Data.SQLite .Net provider, and, when I planned it, I took special care with connections and transactions, in order to ensure no connection or transaction was left hanging around.
The tricky part was that setting a breakpoint on ExecuteNonQuery() command and running the application in debug mode would make the error disappear!
Googling, I found something interesting on this site: http://www.softperfect.com/board/read.php?8,5775. There, someone replied to the thread suggesting that the author put the database path on the anti-virus ignore list.
I added the database file to the ignore list of my anti-virus (Microsoft Security Essentials) and it solved my problem. No more database locked errors!

Is your database file on the same machine as the app or is it stored on a server?
You should create a new connection in every thread. I would simplify the creation of a connection and use this everywhere: connection = new SQLiteConnection(connString.ToString());
and use a database file on the same machine as the app and test again.
Why the two different ways of creating a connection?

These guys were having similar problems (mostly, it appears, with the journaling file being locked, maybe TortoiseSVN interactions ... check the referenced articles).
They came up with a set of recommendations (correct directories, changing journaling types from delete to persist, etc). http://sqlite.phxsoftware.com/forums/p/689/5445.aspx#5445
The journal mode options are discussed here: http://www.sqlite.org/pragma.html . You could try TRUNCATE.
Is there a stack trace into SQLite when the exception occurs?
You indicate you "batch my commits at a reasonable interval". What is the interval?

I would always use a Connection, Transaction and Command in a using clause. In your first code listing you did, but your third (creating the tables) you didn't. I suggest you do that too, because (who knows?) maybe the commands that create the table somehow continue to lock the file. Long shot... but worth a shot?

Do you have Google Desktop Search (or another file indexer) running? As previously mentioned, Sysinternals Process Monitor can help you track it down.
Also, what is the filename of the database? From PerformanceTuningWindows:
Be VERY, VERY careful what you name your database, especially the extension
For example, if you give all your databases the extension .sdb (SQLite Database, nice name hey? I thought so when I choose it anyway...) you discover that the SDB extension is already associated with APPFIX PACKAGES.
Now, here is the cute part, APPFIX is an executable/package that Windows XP recognizes, and it will, (emphasis mine) ADD THE DATABASE TO THE SYSTEM RESTORE FUNCTIONALITY
This means, stay with me here, every time you write ANYTHING to the database, the Windows XP system thinks a bloody executable has changed and copies your ENTIRE 800 meg database to the system restore directory....
I recommend something like DB or DAT.

While the lock is reported on the COMMIT, the lock is on the INSERT/UPDATE command. Check for record locks not being released earlier in your code.
