Thread abort leaves zombie transactions and broken SqlConnection - c#

I feel like this behavior should not be happening. Here's the scenario:
1. Start a long-running SQL transaction.
2. The thread that ran the SQL command gets aborted (not by our code!).
3. When the thread returns to managed code, the SqlConnection's state is "Closed", but the transaction is still open on the SQL server.
4. The SqlConnection can be re-opened, and you can try to call rollback on the transaction, but it has no effect (not that I would expect it to; the point is that there is no way to access the transaction on the db and roll it back).
The issue is simply that the transaction is not cleaned up properly when the thread aborts. This was a problem with .Net 1.1, 2.0 and 2.0 SP1. We are running .Net 3.5 SP1.
Here is a sample program that illustrates the issue.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.SqlClient;
using System.Threading;
namespace ConsoleApplication1
{
class Run
{
static Thread transactionThread;
public class ConnectionHolder : IDisposable
{
public void Dispose()
{
}
public void executeLongTransaction()
{
Console.WriteLine("Starting a long running transaction.");
using (SqlConnection _con = new SqlConnection("Data Source=<YourServer>;Initial Catalog=<YourDB>;Integrated Security=True;Persist Security Info=False;Max Pool Size=200;MultipleActiveResultSets=True;Connect Timeout=30;Application Name=ConsoleApplication1.vshost"))
{
try
{
// The connection must be open before a transaction can be started.
_con.Open();
SqlTransaction trans = _con.BeginTransaction();
SqlCommand cmd = new SqlCommand("update <YourTable> set Name = 'XXX' where ID = @0; waitfor delay '00:00:05'", _con, trans);
cmd.Parameters.Add(new SqlParameter("@0", 340));
cmd.ExecuteNonQuery();
cmd.Transaction.Commit();
Console.WriteLine("Finished the long running transaction.");
}
catch (ThreadAbortException tae)
{
Console.WriteLine("Thread - caught ThreadAbortException in executeLongTransaction - resetting.");
Console.WriteLine("Exception message: {0}", tae.Message);
}
}
}
}
static void killTransactionThread()
{
Thread.Sleep(2 * 1000);
// We're not doing this anywhere in our real code. This is for simulation
// purposes only!
transactionThread.Abort();
Console.WriteLine("Killing the transaction thread...");
}
/// <summary>
/// The main entry point for the application.
/// </summary>
[STAThread]
static void Main(string[] args)
{
using (var connectionHolder = new ConnectionHolder())
{
transactionThread = new Thread(connectionHolder.executeLongTransaction);
transactionThread.Start();
new Thread(killTransactionThread).Start();
transactionThread.Join();
Console.WriteLine("The transaction thread has died. Please run 'select * from sysprocesses where open_tran > 0' now while this window remains open. \n\n");
Console.Read();
}
}
}
}
There is a Microsoft hotfix targeted at .NET 2.0 SP1 that was supposed to address this, but we obviously have newer DLLs (.NET 3.5 SP1) that don't match the version numbers listed in this hotfix.
Can anyone explain this behavior, and why the ThreadAbort is still not cleaning up the sql transaction properly? Does .Net 3.5 SP1 not include this hotfix, or is this behavior that is technically correct?

Since you're using SqlConnection with pooling, your code is never in control of closing the connections; the pool is. On the server side, a pending transaction is rolled back when the connection is truly closed (socket closed), but with pooling the server never sees a connection close. Without the connection closing (either by a physical disconnect at the socket/pipe/LPC layer or by an sp_reset_connection call), the server cannot abort the pending transaction. So it really boils down to the fact that the connection does not get properly released/reset. I don't understand why you're complicating the code with explicit thread abort handling and an attempt to reopen a closed transaction (that will never work). You should simply wrap the SqlConnection in a using(...) block; the implied finally and the connection's Dispose will run even on thread abort.
My recommendation would be to keep things simple: ditch the fancy thread abort handling and replace it with plain 'using' blocks: using(connection) { using(transaction) { code; commit(); } }.
Of course I assume you do not propagate the transaction context into a different scope in the server (you do not use sp_getbindtoken and friends, and you do not enlist in distributed transactions).
This little program shows that the Thread.Abort properly closes a connection and the transaction is rolled back:
using System;
using System.Data.SqlClient;
using testThreadAbort.Properties;
using System.Threading;
using System.Diagnostics;
namespace testThreadAbort
{
class Program
{
static AutoResetEvent evReady = new AutoResetEvent(false);
static long xactId = 0;
static void ThreadFunc()
{
using (SqlConnection conn = new SqlConnection(Settings.Default.conn))
{
conn.Open();
using (SqlTransaction trn = conn.BeginTransaction())
{
// Retrieve our XACTID
//
SqlCommand cmd = new SqlCommand("select transaction_id from sys.dm_tran_current_transaction", conn, trn);
xactId = (long) cmd.ExecuteScalar();
Console.Out.WriteLine("XactID: {0}", xactId);
cmd = new SqlCommand(@"
insert into test (a) values (1);
waitfor delay '00:01:00'", conn, trn);
// Signal readiness and wait...
//
evReady.Set();
cmd.ExecuteNonQuery();
trn.Commit();
}
}
}
static void Main(string[] args)
{
try
{
using (SqlConnection conn = new SqlConnection(Settings.Default.conn))
{
conn.Open();
SqlCommand cmd = new SqlCommand(@"
if object_id('test') is not null
begin
drop table test;
end
create table test (a int);", conn);
cmd.ExecuteNonQuery();
}
Thread thread = new Thread(new ThreadStart(ThreadFunc));
thread.Start();
evReady.WaitOne();
Thread.Sleep(TimeSpan.FromSeconds(5));
Console.Out.WriteLine("Aborting...");
thread.Abort();
thread.Join();
Console.Out.WriteLine("Aborted");
Debug.Assert(0 != xactId);
using (SqlConnection conn = new SqlConnection(Settings.Default.conn))
{
conn.Open();
// check if xactId is still active
//
SqlCommand cmd = new SqlCommand("select count(*) from sys.dm_tran_active_transactions where transaction_id = @xactId", conn);
cmd.Parameters.AddWithValue("@xactId", xactId);
object count = cmd.ExecuteScalar();
Console.WriteLine("Active transactions with xactId {0}: {1}", xactId, count);
// Check count of rows in test (would block on row lock)
//
cmd = new SqlCommand("select count(*) from test", conn);
count = cmd.ExecuteScalar();
Console.WriteLine("Count of rows in test: {0}", count);
}
}
catch (Exception e)
{
Console.Error.Write(e);
}
}
}
}

This is a bug in Microsoft's MARS implementation. Disabling MARS in your connection string will make the problem go away.
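A minimal sketch of switching MARS off in code rather than editing the connection string by hand. `MarsSettings` and `DisableMars` are hypothetical names used only for illustration; `SqlConnectionStringBuilder.MultipleActiveResultSets` is the standard property behind the `MultipleActiveResultSets` keyword.

```csharp
using System.Data.SqlClient;

static class MarsSettings
{
    // Hypothetical helper: returns a copy of the given connection string
    // with MultipleActiveResultSets forced to false.
    public static string DisableMars(string connectionString)
    {
        var builder = new SqlConnectionStringBuilder(connectionString);
        builder.MultipleActiveResultSets = false;
        return builder.ConnectionString;
    }
}
```

Passing the question's connection string through `DisableMars` before constructing the SqlConnection would leave every other keyword as it was.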
If you require MARS, and are comfortable making your application dependent on another company's internal implementation, familiarize yourself with http://dotnet.sys-con.com/node/39040, break out .NET Reflector, and look at the connection and pool classes. You have to store a copy of the DbConnectionInternal property before the failure occurs. Later, use reflection to pass the reference to a deallocation method in the internal pooling class. This will stop your connection from lingering for 4:00 - 7:40 minutes.
There are surely other ways to force the connection out of the pool and to be disposed. Short of a hotfix from Microsoft, though, reflection seems to be necessary. The public methods in the ADO.NET API don't seem to help.

Related

Is it safe to rely on SqlConnection retry logic while using SqlCommand?

I was using Microsoft.Practice.TransientFaultHandling block for retry logic.
Now I switched my application to .NET 4.8 and use the new built-in retry logic for SqlConnection.
I was wondering if I need special retry logic for my SqlCommand (I used Polly before) or if this is also built in. There is no way to log a retry when relying on the built-in functions, which makes it really hard to test.
Microsoft states here:
"There is a subtlety. If a transient error occurs while your query is
being executed, your SqlConnection object doesn't retry the connect
operation. It certainly doesn't retry your query. However,
SqlConnection very quickly checks the connection before sending your
query for execution. If the quick check detects a connection problem,
SqlConnection retries the connect operation. If the retry succeeds,
your query is sent for execution."
I tested this by just disconnecting and reconnecting the internet within the retry time range and my command got executed after a while.
So it seems to work for this simple scenario. But is it really safe to rely on this or do I still have to implement a retry logic for my SqlCommand?
Here is my code:
SqlConnectionStringBuilder builder = new SqlConnectionStringBuilder(ConnectionString);
builder.ConnectRetryCount = 5;
builder.ConnectRetryInterval = 3;
MyDataSet m_myDataSet = new MyDataSet();
using (SqlConnection sqlConnection = new SqlConnection(builder.ConnectionString))
{
try
{
sqlConnection.Open();
}
catch (SqlException sqlEx)
{
// do some logging
return false;
}
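try
{
using (SqlCommand cmd = new SqlCommand(selectCmd, sqlConnection))
{
using (SqlDataAdapter da = new SqlDataAdapter(cmd))
{
da.Fill(m_myDataSet, tableName);
}
}
}
catch (SqlException sqlEx)
{
// the snippet as posted had no catch here; handled like the Open() failure above
return false;
}
}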
The answer to your question is to analyze why your connection to the database is open so long that it is going idle and timing out. The ConnectRetryCount and ConnectRetryInterval properties allow you to adjust reconnection attempts after the server identifies an idle connection failure. I would follow the Microsoft recommendations on this one:
Connection Pooling Recommendation
We strongly recommend that you always close the connection when you
are finished using it so that the connection will be returned to the
pool. You can do this using either the Close or Dispose methods of the
Connection object, or by opening all connections inside a using
statement in C#, or a Using statement in Visual Basic. Connections
that are not explicitly closed might not be added or returned to the
pool. For more information, see using Statement or How to: Dispose of
a System Resource for Visual Basic.
Open your connections and close them when no longer needed like this:
MyDataSet m_myDataSet = new MyDataSet();
try
{
using (SqlConnection sqlConnection = new SqlConnection(ConnectionString))
{
sqlConnection.Open();
using (SqlCommand cmd = new SqlCommand(selectCmd, sqlConnection))
{
using (SqlDataAdapter da = new SqlDataAdapter(cmd))
{
da.Fill(m_myDataSet, tableName);
}
}
}
}
catch (SqlException sqlEx)
{
// do some logging
return false;
}
Hope that helps.
Happy coding!!!
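Because the documentation quoted in the question says the query itself is never retried, command-level retry would still have to live in application code. The following is a minimal sketch of such a wrapper, not the built-in mechanism and not Polly. `CommandRetry`, `Execute`, and the `isTransient` and `onRetry` callbacks are hypothetical names. The `onRetry` callback gives a place to log each retry, which the question notes the built-in logic does not offer.

```csharp
using System;
using System.Threading;

static class CommandRetry
{
    // Runs operation up to maxAttempts times. An exception is retried only
    // when isTransient returns true for it; otherwise it is rethrown as-is.
    // onRetry (optional) is invoked before each wait, e.g. for logging.
    public static T Execute<T>(
        Func<T> operation,
        int maxAttempts,
        TimeSpan delay,
        Func<Exception, bool> isTransient,
        Action<int, Exception> onRetry)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // Give up on the last attempt or on a non-transient error.
                if (attempt >= maxAttempts || !isTransient(ex))
                {
                    throw;
                }
                if (onRetry != null)
                {
                    onRetry(attempt, ex);
                }
                Thread.Sleep(delay);
            }
        }
    }
}
```

A call might look like `CommandRetry.Execute(() => LoadDataSet(), 3, TimeSpan.FromSeconds(3), ex => ex is SqlException, (n, ex) => Log(n, ex))`, where `LoadDataSet` and `Log` are placeholders. The operation should open its own connection on each attempt, and which exceptions count as transient is a decision for the application. Retrying a command that is not idempotent can apply a change twice.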

TransactionScope breaking SqlConnection pooling?

I have an odd situation with TransactionScope and async/synchronous SQL calls that I'm having difficulty understanding. I hope that someone with a deeper understanding of the ins and outs of these kinds of operations can shed some light on the issue.
The situation:
I have an NUnit test fixture which creates a TransactionScope during [SetUp] and disposes it at [TearDown] to let each test run on the same data. I have a series of tests which kick off an asynchronous operation on the database and then execute a synchronous operation on the database. The first such test completes successfully. The second such test fails with "There is already an open DataReader associated with this Command which must be closed first.".
If I comment out the TransactionScope entirely, all the tests pass.
I tried various different TransactionScope options, and Complete / Dispose, but the same issue occurs.
I am using the ReSharper test runner on an NUnit test, .NET 4.5.1.
I realize the "correct" answer may be "make everything async await". That's not an option for me, unfortunately.
I don't want to enable MARS, as this issue only occurs in tests.
I don't want to use GetAwaiter().GetResult() due to the potential deadlocks.
What it looks like to me is that once TransactionScope.Dispose/Complete is called, the automatic SqlConnection pooling loses track of which connections have open DataReaders. It hands out the same SqlConnection to two simultaneously running operations, and the second dies.
My primary question is "what is causing this behavior (specifically)?"
My secondary question is "is there anything that can be done to safely resolve the issue?"
The replicating code below prints out the client connection Ids. On my machine, the ClientConnectionId for the ASYNC and SYNC calls in the Second test case are always the same.
Replicating Code:
[TestFixture]
public class DataReaderTests
{
private TransactionScope _scope;
private string _connString = @"my connection string";
[SetUp]
public void Setup()
{
var options = new TransactionOptions()
{
IsolationLevel = IsolationLevel.ReadCommitted,
Timeout = TimeSpan.FromMinutes(1)
};
_scope = new TransactionScope(TransactionScopeOption.RequiresNew, options, TransactionScopeAsyncFlowOption.Enabled);
}
[Test]
[TestCase("First")]
[TestCase("Second")]
public void Test(string name)
{
DoAsyncThing().ConfigureAwait(false);
using (var conn = new SqlConnection(_connString))
{
try
{
conn.Open();
Console.WriteLine("SYNC: " + conn.ClientConnectionId);
using (var cmd = conn.CreateCommand())
{
cmd.CommandText = "SELECT 1";
using (var reader = cmd.ExecuteReader())
{
while (reader.Read())
{
int id = reader.GetInt32(0);
}
}
}
}
catch (TransactionAbortedException tax)
{
Console.WriteLine("ERROR: " + ((SqlException)tax.InnerException.InnerException).ClientConnectionId);
throw;
}
}
}
private async Task DoAsyncThing()
{
using (var connection = new SqlConnection(_connString))
{
await connection.OpenAsync();
Console.WriteLine("ASYNC: " + connection.ClientConnectionId);
using (var cmd = connection.CreateCommand())
{
cmd.CommandText = "WAITFOR DELAY '00:02';";
await cmd.ExecuteNonQueryAsync();
Console.WriteLine("ASYNC COMPLETE");
}
}
}
[TearDown]
public void Teardown()
{
_scope.Dispose();
}
}
Check out this answer
I think the gist is that you cannot have two active sql commands executing over the same connection at the same time without a special connection string property. When you are operating under the transaction scope, you should find that both SqlConnection objects have the same client ID. However, if you remove the transaction scope they are different, which I believe implies that they are operating on separate connections.
Adding "MultipleActiveResultSets=true" to the connection string fixed the issue for me. Another alternative is to replace
DoAsyncThing().ConfigureAwait(false);
with
DoAsyncThing().ConfigureAwait(false).GetAwaiter().GetResult();
which will terminate the first command before starting the second command.

Cancelling SQL Query in background in .NET

I'm designing a small desktop app that fetches data from SQL server. I used BackgroundWorker to make the query execute in background. The code that fetches data generally comes down to this:
public static DataTable GetData(string sqlQuery)
{
DataTable t = new DataTable();
using (SqlConnection c = new SqlConnection(GetConnectionString()))
{
c.Open();
using (SqlCommand cmd = new SqlCommand(sqlQuery))
{
cmd.Connection = c;
using (SqlDataReader r = cmd.ExecuteReader())
{
t.Load(r);
}
}
}
return t;
}
Since a query can take up to 10-15 minutes, I want to implement a cancellation request and pass it from the GUI layer to the DAL. The cancellation procedure of BackgroundWorker won't let me cancel SqlCommand.ExecuteReader(), because it only stops when data is fetched from the server or an exception is thrown by the data provider.
I tried to use Task and async/await with SqlCommand.ExecuteReaderAsync(CancellationToken), but I am confused about where to use it in a multi-layer app (GUI -> BLL -> DAL).
Have you tried using the SqlCommand.Cancel() method?
Approach: encapsulate that GetData method in a thread/worker, and when you cancel/stop that thread, call the Cancel() method on the SqlCommand that is being executed.
Here is an example of how to use it from a thread:
using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading;
class Program
{
private static SqlCommand m_rCommand;
public static SqlCommand Command
{
get { return m_rCommand; }
set { m_rCommand = value; }
}
public static void Thread_Cancel()
{
Command.Cancel();
}
static void Main()
{
string connectionString = GetConnectionString();
try
{
using (SqlConnection connection = new SqlConnection(connectionString))
{
connection.Open();
Command = connection.CreateCommand();
Command.CommandText = "DROP TABLE TestCancel";
try
{
Command.ExecuteNonQuery();
}
catch { }
Command.CommandText = "CREATE TABLE TestCancel(co1 int, co2 char(10))";
Command.ExecuteNonQuery();
Command.CommandText = "INSERT INTO TestCancel VALUES (1, '1')";
Command.ExecuteNonQuery();
Command.CommandText = "SELECT * FROM TestCancel";
SqlDataReader reader = Command.ExecuteReader();
Thread rThread2 = new Thread(new ThreadStart(Thread_Cancel));
rThread2.Start();
rThread2.Join();
reader.Read();
System.Console.WriteLine(reader.FieldCount);
reader.Close();
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
}
static private string GetConnectionString()
{
// To avoid storing the connection string in your code,
// you can retrieve it from a configuration file.
return "Data Source=(local);Initial Catalog=AdventureWorks;"
+ "Integrated Security=SSPI";
}
}
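As an alternative to calling Cancel() from a second thread, the token-based API mentioned in the question could be threaded through the layers roughly as follows. This is a sketch under assumptions: `Dal.GetDataAsync` and its `connectionString` parameter are hypothetical, and the GUI layer is assumed to own a `CancellationTokenSource` whose token the BLL simply passes along unchanged.

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

static class Dal
{
    // Hypothetical DAL method: the token comes from the caller and is handed
    // to every async call, so cancelling it in the GUI reaches the query.
    public static async Task<DataTable> GetDataAsync(
        string connectionString, string sqlQuery, CancellationToken token)
    {
        var table = new DataTable();
        using (var connection = new SqlConnection(connectionString))
        {
            await connection.OpenAsync(token);
            using (var command = new SqlCommand(sqlQuery, connection))
            using (SqlDataReader reader = await command.ExecuteReaderAsync(token))
            // DataTable.Load is synchronous and does not observe the token,
            // so the token is also wired to SqlCommand.Cancel while loading.
            using (token.Register(command.Cancel))
            {
                table.Load(reader);
            }
        }
        return table;
    }
}
```

In this arrangement the GUI would create the `CancellationTokenSource`, call `Cancel()` on it from a button handler, and await the BLL method that forwards the token. The caller should be prepared for cancellation to surface either as an `OperationCanceledException` or as a `SqlException`, depending on where in the call it lands.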
You can only do cancellation checking and progress reporting between distinct lines of code. Usually both require that you dissect the code down to the lowest loop level, so you can do both things between or inside the loop iterations. When I took my first steps with BackgroundWorker (BGW), I had the advantage that I needed to write the loop anyway, so it was no extra work. You have one of the worse cases: pre-existing code that you can only replicate or use as is.
Ideal case:
This operation should not take nearly as long as it does. 10-15 minutes indicates that there is something rather wrong with your design.
If the bulk of the time is transmission of data, then you are probably retrieving way too much data. Retrieving everything in order to filter in the GUI is a very common mistake. Do as much filtering in the query as possible. Using a distributed database might also help with transmission performance.
If the bulk of the time is processing as part of the query operation (complex conditions), something in your general approach might have to change. There are various ways to trade complex calculation for a bit of memory on the DBMS side. Views, as far as I know, can cache the results of operations while still maintaining transactional consistency.
But it really depends on what your backend DB/DBMS and use case are. Many DBMSs use SQL as their query language, so knowing that you use SQL does not tell us which options you have.
Second best case:
If you cannot cut the time down, the second best thing would be to have the actual DB access code down at the lowest loop level and do progress reporting and cancellation checking there. That way you could use the existing cancellation mechanism built into BGW.
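A minimal sketch of what cancellation checking at the lowest loop level could look like when rows are read one at a time instead of through DataTable.Load. `RowLoop`, `ReadRows`, and the `isCancelled` callback are hypothetical names; with a BackgroundWorker the callback would presumably be `() => worker.CancellationPending`.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class RowLoop
{
    // Reads rows one by one and asks isCancelled before each row.
    // Returns the rows read so far; stops early once isCancelled() is true.
    public static List<object[]> ReadRows(IDataReader reader, Func<bool> isCancelled)
    {
        var rows = new List<object[]>();
        while (!isCancelled() && reader.Read())
        {
            var values = new object[reader.FieldCount];
            reader.GetValues(values);
            rows.Add(values);
        }
        return rows;
    }
}
```

This only helps once the server has started returning rows. It does not interrupt the wait inside ExecuteReader() itself, which is the limitation the question describes.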
Everything else
Using any other approach to cancellation is really a fallback. I wrote a lot on why it is bad, but felt that this might work better if I focus on the core issue: likely something wrong in the design of the DB and/or query, because fixing that might well eliminate the issue altogether.

How to lock a object when using load balancing

Background: I'm writing a function putting long lasting operations in a queue, using C#,
and each operation is kind of divided into 3 steps:
1. database operation (update/delete/add data)
2. long time calculation using web service
3. database operation (save the calculation result of step 2) on the same db table as in step 1, and check the consistency of the db table, e.g., that the items are the same as in step 1 (please see below for a more detailed example)
In order to avoid dirty data or corruption, I use a lock object (a static singleton object) to ensure the 3 steps are done as a whole transaction. Without this lock, when multiple users call the function, they may modify the same db table at different steps of their own operations, e.g., user2 is deleting item A in his step 1 while user1 is checking whether A still exists in his step 3. (Additional info: meanwhile I'm using TransactionScope with Entity Framework to ensure each database operation is a transaction, with repeatable read isolation.)
However, I need to deploy this to a cloud computing platform that uses load balancing, so my lock object won't take effect, because the function will be deployed on different servers.
Question: what can I do to make my lock object work under the above circumstances?
This is a tricky problem - you need a distributed lock, or some sort of shared state.
Since you already have the database, you could change your implementation from a "static C# lock" and instead use the database to manage your lock for you over the whole "transaction".
You don't say what database you are using, but if it's SQL Server, then you can use an application lock to achieve this. This lets you explicitly "lock" an object, and all other clients will wait until that object is unlocked. Check out:
http://technet.microsoft.com/en-us/library/ms189823.aspx
I've coded up an example implementation below. Start two instances to test it out.
using System;
using System.Data;
using System.Data.SqlClient;
using System.Transactions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var locker = new SqlApplicationLock("MyAceApplication",
"Server=xxx;Database=scratch;User Id=xx;Password=xxx;");
Console.WriteLine("Acquiring the lock");
using (locker.TakeLock(TimeSpan.FromMinutes(2)))
{
Console.WriteLine("Lock acquired, doing work which no one else can do. Press any key to release the lock.");
Console.ReadKey();
}
Console.WriteLine("Lock Released");
}
class SqlApplicationLock : IDisposable
{
private readonly String _uniqueId;
private readonly SqlConnection _sqlConnection;
private Boolean _isLockTaken = false;
public SqlApplicationLock(
String uniqueId,
String connectionString)
{
_uniqueId = uniqueId;
_sqlConnection = new SqlConnection(connectionString);
_sqlConnection.Open();
}
public IDisposable TakeLock(TimeSpan takeLockTimeout)
{
using (TransactionScope transactionScope = new TransactionScope(TransactionScopeOption.Suppress))
{
SqlCommand sqlCommand = new SqlCommand("sp_getapplock", _sqlConnection);
sqlCommand.CommandType = CommandType.StoredProcedure;
sqlCommand.CommandTimeout = (int)takeLockTimeout.TotalSeconds;
sqlCommand.Parameters.AddWithValue("Resource", _uniqueId);
sqlCommand.Parameters.AddWithValue("LockOwner", "Session");
sqlCommand.Parameters.AddWithValue("LockMode", "Exclusive");
sqlCommand.Parameters.AddWithValue("LockTimeout", (Int32)takeLockTimeout.TotalMilliseconds);
SqlParameter returnValue = sqlCommand.Parameters.Add("ReturnValue", SqlDbType.Int);
returnValue.Direction = ParameterDirection.ReturnValue;
sqlCommand.ExecuteNonQuery();
if ((int)returnValue.Value < 0)
{
throw new Exception(String.Format("sp_getapplock failed with errorCode '{0}'",
returnValue.Value));
}
_isLockTaken = true;
transactionScope.Complete();
}
return this;
}
public void ReleaseLock()
{
using (TransactionScope transactionScope = new TransactionScope(TransactionScopeOption.Suppress))
{
SqlCommand sqlCommand = new SqlCommand("sp_releaseapplock", _sqlConnection);
sqlCommand.CommandType = CommandType.StoredProcedure;
sqlCommand.Parameters.AddWithValue("Resource", _uniqueId);
sqlCommand.Parameters.AddWithValue("LockOwner", "Session");
sqlCommand.ExecuteNonQuery();
_isLockTaken = false;
transactionScope.Complete();
}
}
public void Dispose()
{
if (_isLockTaken)
{
ReleaseLock();
}
_sqlConnection.Close();
}
}
}
}

How to handle sql transaction in this scenario?

I am a C# programmer. I want to clear up this complex concept.
If there are 2 databases, A and B, suppose I want to insert records in both, first in A and then in B. Say an exception occurs while inserting into B. If B fails, the transaction with A should also be rolled back. What do I have to do?
I know I can use a SqlTransaction object with the SqlConnection class. Can I have some code for this?
Already asked here: Implementing transactions over multiple databases.
Best answer, from keithwarren7:
Use the TransactionScope class like this:
using(TransactionScope ts = new TransactionScope())
{
//all db code here
// if error occurs jump out of the using block and it will dispose and rollback
ts.Complete();
}
The class will automatically convert to a distributed transaction if necessary.
Edit: adding explanations to the original answer.
There is a good example on MSDN: http://msdn.microsoft.com/fr-fr/library/system.transactions.transactionscope%28v=vs.80%29.aspx.
This example shows you how to use 2 Database Connections in one TransactionScope.
// Create the TransactionScope to execute the commands, guaranteeing
// that both commands can commit or roll back as a single unit of work.
using (TransactionScope scope = new TransactionScope())
{
using (SqlConnection connection1 = new SqlConnection(connectString1))
{
try
{
// Opening the connection automatically enlists it in the
// TransactionScope as a lightweight transaction.
connection1.Open();
// Create the SqlCommand object and execute the first command.
SqlCommand command1 = new SqlCommand(commandText1, connection1);
returnValue = command1.ExecuteNonQuery();
writer.WriteLine("Rows to be affected by command1: {0}", returnValue);
// If you get here, this means that command1 succeeded. By nesting
// the using block for connection2 inside that of connection1, you
// conserve server and network resources as connection2 is opened
// only when there is a chance that the transaction can commit.
using (SqlConnection connection2 = new SqlConnection(connectString2))
try
{
// The transaction is escalated to a full distributed
// transaction when connection2 is opened.
connection2.Open();
// Execute the second command in the second database.
returnValue = 0;
SqlCommand command2 = new SqlCommand(commandText2, connection2);
returnValue = command2.ExecuteNonQuery();
writer.WriteLine("Rows to be affected by command2: {0}", returnValue);
}
catch (Exception ex)
{
// Display information that command2 failed.
writer.WriteLine("returnValue for command2: {0}", returnValue);
writer.WriteLine("Exception Message2: {0}", ex.Message);
}
}
catch (Exception ex)
{
// Display information that command1 failed.
writer.WriteLine("returnValue for command1: {0}", returnValue);
writer.WriteLine("Exception Message1: {0}", ex.Message);
}
}
// The Complete method commits the transaction. If an exception has been thrown,
// Complete is not called and the transaction is rolled back.
scope.Complete();
}
If you would rather use only one connection and manage things yourself, you can use a linked server and call a stored procedure on server A, which in turn calls a stored procedure on server B.
:)
