Recently we faced quite an interesting issue that has to do with SQL transaction timeouts. The statement that timed out does not really matter for the sake of the question, but it was a single INSERT statement without an explicit transaction, with a client-generated GUID as the key:
INSERT MyTable
(id, ...)
VALUES (<client-app-generated-guid>, ...)
We also have retry policies in place, so that if a command fails with a SqlException it will be retried. One day SQL Server (Azure SQL) did not behave normally and we faced a lot of strange PK violation errors during retries. They were caused by retrying transactions that had actually committed successfully on SQL Server (so the retry attempted an insert with an already-taken ID). I understand that a SQL timeout is a purely client-side concept, so if the client thinks a SqlCommand failed, that may or may not actually be the case.
I suspect that explicit client-side transaction control, for instance wrapping the statements in a TransactionScope as shown below, will fix 99% of such troubles, because Commit is actually quite a fast and cheap operation. However, I still see a caveat: the timeout can also happen at the commit stage. The application can again end up in a state where it's impossible to tell whether the transaction really committed or not (and therefore whether a retry is needed).
The question is how to write code in a generic fashion that is bulletproof against this kind of trouble, and to retry only when it's positively clear that the transaction was not committed.
using (var trx = new TransactionScope())
using (var con = GetOpenConnection(connectionString))
{
    con.Execute("<some-non-idempotent-query>");
    // what if Complete() times out?!
    // to retry or not to retry?!
    trx.Complete();
}
The problem is that the exception does not mean that the transaction failed. For any compensating action (like retrying) you need a definite way of telling whether it failed. There are scalability issues with what I will suggest, but it's the technique that is the important thing; the scalability issues can be solved in other ways.
My solution:
The last INSERT before the COMMIT writes a Guid to a tracking table.
If an exception occurs that indicates a network failure, SELECT @@TRANCOUNT. If it indicates you are still in a transaction (is greater than 0), which probably should never happen but is worth checking, then you can happily resubmit your COMMIT.
If @@TRANCOUNT returns 0 then you are no longer in a transaction. Selecting your Guid from the tracking table will tell you whether your COMMIT was successful.
If your commit was not successful (@@TRANCOUNT == 0 and your Guid is not present in the tracking table) then resubmit your entire batch from the BEGIN TRANSACTION onwards.
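A rough sketch of that recovery check in C# follows; the tracking table and its column (TransactionTracking, TrackingId) are made-up placeholders, and how you decide that an exception means "network failure" depends on your retry policy.

using System;
using System.Data.SqlClient;

// Called after a timeout/network exception on the connection that ran the batch.
// Returns true when it is safe to retry the whole batch.
static bool ShouldRetry(SqlConnection failedConnection, string connectionString, Guid trackingId)
{
    try
    {
        // Step 1: are we, against the odds, still inside the open transaction?
        using (var cmd = new SqlCommand("SELECT @@TRANCOUNT", failedConnection))
        {
            if ((int)cmd.ExecuteScalar() > 0)
            {
                // Still in the transaction: resubmit the COMMIT instead of retrying.
                using (var commit = new SqlCommand("COMMIT", failedConnection))
                    commit.ExecuteNonQuery();
                return false;
            }
        }
    }
    catch (SqlException)
    {
        // The connection is unusable; fall through to the tracking-table check.
    }

    // Step 2: on a fresh connection, see whether the marker row made it in.
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var check = new SqlCommand(
            "SELECT COUNT(*) FROM TransactionTracking WHERE TrackingId = @id", con))
        {
            check.Parameters.AddWithValue("@id", trackingId);
            bool committed = (int)check.ExecuteScalar() > 0;
            return !committed; // retry only if the Guid never arrived
        }
    }
}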
The general approach is: try to read back what you just tried to insert.
If you can read back the ID that you tried to insert, then the previous transaction committed successfully and there is no need to retry.
If you can't find the ID that you tried to insert, then you know that your attempt to insert failed, so you should retry.
I'm afraid there is no way to have a completely generic pattern that would work for any SQL statement. Your "checking" code needs to know what to look for.
If it is an INSERT with an ID, then you are looking for that ID.
If it is some UPDATE, then the check will be custom and depend on the nature of that UPDATE.
If it is a DELETE, then the check consists of trying to read what was meant to be deleted.
Actually, here is a generic pattern: any data modification batch that has one or more INSERT, UPDATE, or DELETE statements should include one more INSERT within that transaction that writes some GUID (an ID of the data-modifying transaction itself) into a dedicated audit table. Your checking code then tries to read that same GUID from the dedicated audit table. If the GUID is found, you know that the previous transaction committed successfully. If the GUID is not found, you know that the previous transaction was rolled back and you can retry.
Having this dedicated audit table unifies and standardizes the checks. The checks no longer depend on the internals and details of your data-changing code. Your data modification code and your verification code depend on the same agreed interface: the audit table.
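Here is a minimal sketch of the write side of that pattern; the audit table and its single column (TransactionAudit, TransactionId) are assumptions for illustration, and the verification side is the same read-back check described above.

using System;
using System.Data.SqlClient;

static void ExecuteWithAuditMarker(string connectionString, string dataModificationSql, Guid auditId)
{
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var trx = con.BeginTransaction())
        {
            // 1. The real (non-idempotent) data modification.
            using (var work = new SqlCommand(dataModificationSql, con, trx))
                work.ExecuteNonQuery();

            // 2. The marker row, written inside the same transaction.
            using (var audit = new SqlCommand(
                "INSERT INTO TransactionAudit (TransactionId) VALUES (@id)", con, trx))
            {
                audit.Parameters.AddWithValue("@id", auditId);
                audit.ExecuteNonQuery();
            }

            trx.Commit();
        }
    }
}

On a timeout, the checking code simply queries TransactionAudit for that GUID on a fresh connection: found means the batch committed (do not retry), not found means it is safe to retry.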
Related
Okay, so I've run into a rather bizarre circumstance. There are several layers to my situation. I haven't identified whether every layer is strictly required, but here's what's going on:
C# code is creating an ambient transaction, into which a SqlConnection is automatically enlisting.
C# code is using a SqlDataAdapter to insert a row into a table.
The InsertCommand is referencing a stored procedure. The stored procedure is a straightforward INSERT statement.
The table into which the INSERT is being done has an INSTEAD OF INSERT trigger on it.
The trigger obtains an exclusive lock on the table.
An error occurs within the trigger.
With this combination, the error is not raised in the C# code. However, if the trigger does not obtain an exclusive lock on the table, the error does make it up to the C# code.
The error is actually happening, though, as evidenced by the fact that on the SQL Server side the transaction has been aborted. The C# code doesn't know that the transaction has been aborted, and it only encounters an error when the disposal of the TransactionScope tries to COMMIT TRANSACTION.
I have created a minimal reproduction of this scenario:
https://github.com/logiclrd/TestErrorWhileLockedInTrigger
Does anyone have any understanding of why this might be, and how proper error handling behaviour might be restored?
So, I've done some more testing of this.
My first thought was: if holding the exclusive lock is causing the error to be squelched, maybe explicitly releasing the lock will unsquelch it? So, I put a TRY/CATCH around the statement that generates the error in my proof-of-concept, had it ROLLBACK TRANSACTION and then re-THROW, but it made no difference.
So then my next thought was: the RAISERROR statement, when used with severity levels 20-25, forcibly terminates the connection. I'm not sure this is an ideal solution, because it also writes an entry to the SQL Server event log when it happens. However, it does achieve the goal of having the SqlDataAdapter see the error during its Update command, instead of the C# code thinking the transaction is still active and trying to commit it.
Does anyone know of other potential downsides to this "sledgehammer" approach, or is it possibly going to be the only way to get the error to propagate properly in this circumstance?
I have identified the cause of the problem.
The statement in the trigger locking the table looked like this:
SELECT TOP 0 *
FROM TableToTriggerAndLock WITH (TABLOCKX, HOLDLOCK)
While this returns no data, it does return an (empty) result set. It turns out the SqlDataAdapter class only cares about the first result set it gets back on the TDS stream, so the error coming back in the second result set is completely passed over.
Take out the locking statement, and you take out that redundant result set, and now the error is in the first result set.
The solution, then, is to suppress the result set, which I did by reworking the locking statement as:
DECLARE @Dummy INT
SELECT TOP 0 @Dummy = 1
FROM TableToTriggerAndLock WITH (TABLOCKX, HOLDLOCK)
Hope this helps someone out there working with SqlDataAdapter and more complicated underlying operations. :-)
I have an ASP.NET MVC application using EF6 and SQL Server with up to 15 or so concurrent users. To ensure the consistency of data between different queries during each page request, I have everything enclosed in transactions (using System.Transactions.TransactionScope).
When I use IsolationLevel.ReadCommitted and .Serializable, I get deadlock errors like this:
Transaction (Process ID #) was deadlocked on lock resources with another process and has been chosen as the deadlock victim.
When I use IsolationLevel.Snapshot, I get errors like this:
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.#' directly or indirectly in database '#' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.
These errors are the least frequent when using IsolationLevel.Snapshot (one to three per day, roughly).
My understanding of the issue leads me to believe that the only ways to guarantee zero transaction failures are to either:
Completely serialize all database access, or
Implement some type of transaction retry functionality.
And I can't do (1) because some tasks and requests take a while to run, while other parts of the application need to stay reasonably responsive.
I'm inclined to think retry could be implemented by getting MVC to re-run the controller action, but I don't know how to go about doing such a thing.
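Something like this generic wrapper is roughly what I have in mind (just a sketch: the retry count and the SqlException numbers I treat as retryable, 1205 for a deadlock victim and 3960 for a snapshot update conflict, are my own choices, and the unit of work would be the body of the controller action):

using System;
using System.Data.SqlClient;
using System.Transactions;

static T RunWithRetry<T>(Func<T> unitOfWork, int maxAttempts = 3)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            using (var scope = new TransactionScope(
                TransactionScopeOption.Required,
                new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot }))
            {
                T result = unitOfWork(); // EF queries enlist in the ambient transaction
                scope.Complete();
                return result;
            }
        }
        catch (SqlException ex) when (
            attempt < maxAttempts &&
            (ex.Number == 1205 || ex.Number == 3960))
        {
            // Transient concurrency failure: loop and run the whole unit of work again.
        }
    }
}

In practice EF6 sometimes wraps the SqlException in a DbUpdateException or EntityCommandExecutionException, so the filter may need to walk InnerException to find the SQL error number.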
I also don't know how to reproduce the errors that my users are causing. All I get right now are rather uninformative exception logs. I could set up EF to log all SQL that gets run on the DB, now that EF6 lets you do that, but I'm not sure how helpful that would actually be.
Any ideas?
Regardless of isolation level, there are two categories of locks: exclusive locks for INSERT, DELETE and UPDATE, and shared locks for SELECT.
You should try to keep the time that exclusive locks are held to a minimum. The default isolation level is READ COMMITTED. If you are writing to and running reports against the same OLTP system, writers will block readers and you might get blocking issues.
In SQL Server 2005, READ COMMITTED SNAPSHOT ISOLATION was introduced. For readers, the version store in tempdb is used to capture a snapshot of the data to satisfy the current query. It has a lot less overhead than SNAPSHOT ISOLATION. In short, readers are no longer blocked by writers.
This should fix your blocking issues. You need to remove any table hints or isolation commands you currently have.
See this article from Brent Ozar:
http://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/
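Turning it on is a single ALTER DATABASE. Here is a rough sketch of doing that from C#; the database name is a placeholder, and WITH ROLLBACK IMMEDIATE will roll back and disconnect other sessions, so run it during a quiet period:

using System.Data.SqlClient;

static void EnableReadCommittedSnapshot(string connectionString)
{
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var cmd = new SqlCommand(
            "ALTER DATABASE [MyDatabase] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE",
            con))
        {
            cmd.ExecuteNonQuery();
        }
    }
}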
Will it fix your deadlocks? Probably not.
Deadlocks are caused by two or more processes taking exclusive locks on the same resources in opposite order.
Check out MSDN: way cooler pictures, and it mentions the deadlock trace flags.
http://technet.microsoft.com/en-us/library/ms178104(v=sql.105).aspx
Process 1
DEBIT BANK ACCOUNT
CREDIT VENDOR ACCOUNT
Process 2
CREDIT VENDOR ACCOUNT
DEBIT BANK ACCOUNT
In short, change the order of your DML so that tables are accessed in a consistent order. Turn on a trace flag to get the actual TSQL causing the issue.
Last but not least, check out application locks as a last resort. They can be used as mutexes around code that might be causing deadlocks.
http://www.sqlteam.com/article/application-locks-or-mutexes-in-sql-server-2005
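For illustration, an application lock can be taken from C# with sp_getapplock inside a transaction; the lock name "OrderPosting" and the timeout below are placeholders:

using System;
using System.Data;
using System.Data.SqlClient;

static void RunSerialized(string connectionString)
{
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var trx = con.BeginTransaction())
        {
            using (var cmd = new SqlCommand("sp_getapplock", con, trx))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.AddWithValue("@Resource", "OrderPosting");
                cmd.Parameters.AddWithValue("@LockMode", "Exclusive");
                cmd.Parameters.AddWithValue("@LockOwner", "Transaction");
                cmd.Parameters.AddWithValue("@LockTimeout", 10000); // milliseconds
                var rc = cmd.Parameters.Add("@ReturnValue", SqlDbType.Int);
                rc.Direction = ParameterDirection.ReturnValue;
                cmd.ExecuteNonQuery();

                if ((int)rc.Value < 0)
                    throw new TimeoutException("Could not acquire the application lock.");
            }

            // ... the deadlock-prone work goes here, now serialized across sessions ...

            trx.Commit(); // the application lock is released when the transaction ends
        }
    }
}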
I have implemented SqlTransaction in C# to begin, commit and roll back a transaction. Everything is going right, but I've got a problem accessing the tables involved in the transaction while it is running.
I was not able to read those tables during the transaction. While searching about this, I found that it happens due to an exclusive lock: any subsequent SELECT on that data has to wait for the exclusive lock to be released. I have gone through every isolation level provided by SqlTransaction, but it did not work.
So, I need to release the exclusive lock during the transaction so that other users can access the table and read the data.
Is there any method to achieve this?
Thanks in advance.
Here's my C# code for the transaction:
SqlTransaction transaction = null;
try
{
    using (SqlConnection connection = new SqlConnection(Connection.ConnectionString))
    {
        connection.Open();
        transaction = connection.BeginTransaction(IsolationLevel.Snapshot, "FaresheetTransaction");

        // Here all the transactional work occurs.

        if (transaction.Connection != null)
        {
            transaction.Commit();
            transaction.Dispose();
        }
    }
}
catch (Exception ex)
{
    if (transaction != null && transaction.Connection != null)
        transaction.Rollback();
    if (transaction != null)
        transaction.Dispose();
}
This code is working fine, but the problem is that when other parts of the application access the data in those tables while the transaction is in progress, the read throws an exception.
A SQL transaction is, by design, ACID. In particular, it is the "I" (isolation) that is hurting you here: it is designed to prevent other connections from seeing the inconsistent intermediate state.
An individual reading connection can elect to ignore this rule by using the NOLOCK hint or the READ UNCOMMITTED isolation level, but it sounds like what you want is for the writing connection to not take locks. Well, that isn't going to happen.
However, what might help is for the readers to use snapshot isolation, which achieves isolation without the reader taking locks (by looking at, as the name suggests, a point-in-time snapshot of the consistent state when the transaction started).
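For example, a reading connection can opt in like this (a sketch; it assumes ALLOW_SNAPSHOT_ISOLATION has already been enabled on the database, and the query text is up to you):

using System.Data;
using System.Data.SqlClient;

// Reads a consistent point-in-time view without taking shared locks.
static DataTable ReadUnderSnapshot(string connectionString, string query)
{
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var trx = con.BeginTransaction(IsolationLevel.Snapshot))
        using (var cmd = new SqlCommand(query, con, trx))
        using (var adapter = new SqlDataAdapter(cmd))
        {
            var table = new DataTable();
            adapter.Fill(table);
            trx.Commit(); // read-only, so committing just ends the snapshot
            return table;
        }
    }
}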
However, IMO you would be better advised to look at either:
multiple, more granular, transactions from the writer
performing the work in a staging table (a parallel copy of the data), then merging that into the real data in a few mass-insert/update/delete operations, minimising the transaction time
The first is simpler.
The simple fact is: if you take a long-running transaction that operates on a lot of data, yes you are going to be causing problems. Which is why you don't do that. The system is operating correctly.
Try to execute your reads within a transaction as well and use the READ UNCOMMITTED isolation level. This will prevent the read from being blocked, but it might produce invalid results:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRANSACTION
SELECT * FROM Table
COMMIT TRANSACTION
There is a misconception that dealing with transactions/isolation levels only matters when writing, when in fact it is equally important when reading.
@AKASH88, the SNAPSHOT isolation level is what you are looking for.
You say that even with SNAPSHOT it is not working as expected and exclusive locking is still happening; I can understand that, as I had the same issue.
Make sure you don't just enable SNAPSHOT in the database options; READ COMMITTED SNAPSHOT must also be turned on.
This is SQL Server 2008, so it's still uncertain if this answer will help :(
Best regards!
The problem is not at the level of writing into the database but at the level of reading values. You are trying to read values that are being inserted. Try changing your select query to the following:
select * from your_table_with_inserts with (nolock)
However, this overrides the isolation level of the current transaction and can cause dirty reads.
So the question is: are you using a transaction on all queries, or only on inserts/updates?
I just started learning NHibernate and I'm confused by transactions. I know that NHibernate tracks all changes to persistent objects in a session and that those changes get sent to the database on commit, but what is the purpose of transactions?
If I wrap code in a 'using transaction' block and call commit, does it just commit the object changes that occurred within the transaction, or does it commit all changes that occurred within the session since the last commit or flush?
The purpose of transactions is to make sure that you don't commit a session with dirty data or errors in it. Consider the very simple case of placing an order for a book.
You will probably do the following actions:
a) Check if the book exists at this moment.
b) Read the customer details and see if he has anything in the shopping cart.
c) Update the book count
d) Make an entry for the order
Now consider the case where you run into an error while the order is being entered. Obviously you want your other changes to be rolled back, and that is when you roll back the transaction.
How do you do it? Well, there are many ways. One of the ways for web apps is to monitor the HTTP error object, as follows:
if (HttpContext.Current != null && HttpContext.Current.Error != null)
    transaction.Rollback();
Ideally you should not break your unit-of-work pattern by using explicit transaction blocks. Try to avoid doing this as much as possible.
If you don't use transactions, then any time NHibernate sends a batch, that batch alone is a transaction. I'm not sure whether session.Flush() uses a single batch or not; let's suppose it does. Your first call to session.Flush() would then result in one transaction. Suppose your second call to Flush results in an error: the changes from the first flush would remain in the DB.
If, on the other hand, you're using an explicit transaction, you can call Flush a million times, but if you roll back the transaction (maybe because the million-and-first flush threw an error) then all of those flushes get rolled back.
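In code, the usual NHibernate shape is something like this (just a sketch; the session factory and the entity being saved are placeholders):

using NHibernate;

static void SaveOrder(ISessionFactory sessionFactory, object order)
{
    using (ISession session = sessionFactory.OpenSession())
    using (ITransaction tx = session.BeginTransaction())
    {
        try
        {
            session.Save(order);   // tracked by the session
            session.Flush();       // sends SQL, but still inside the transaction
            // ... more work, possibly more flushes ...
            tx.Commit();           // makes all of it permanent
        }
        catch
        {
            tx.Rollback();         // undoes every flushed statement
            throw;
        }
    }
}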
Hope that makes sense.
In a situation where I have to insert a record into a table A, and one of the fields in that table references a record in another table B: how can I make sure that, until I commit the insert statement, the record in table B referenced by the record to be inserted into table A is not tampered with?
I am thinking of including both tables in a transaction and locking all the records involved, but that may hurt concurrency, so I need your recommendation.
Thank you,
Note that even with a transaction, you'll need to get the isolation level right. The most paranoid (and hence the most accurate) is SERIALIZABLE, which takes out locks (even range locks) when you read data, so that other spids can't play with it.
If you want to make your changes to the two tables a single atomic action, then they should be performed in a single transaction. It's relatively simple in .NET: you just need to use the BeginTransaction method on a SqlConnection to create a new transaction, and then make your SqlCommands work against the transaction rather than just the connection; a minimal sketch follows. You can also do it using a TransactionScope, but you may have issues with MSDTC.
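Here is that sketch; the table and column names (TableA, TableB, B_FK, Id) are invented for illustration, and SERIALIZABLE is used so the row read from TableB stays put until the commit, per the note above:

using System;
using System.Data;
using System.Data.SqlClient;

static void InsertWithStableReference(string connectionString, int tableBId)
{
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var trx = con.BeginTransaction(IsolationLevel.Serializable))
        {
            // Reading the referenced row inside the transaction takes locks that
            // keep other sessions from modifying or deleting it until we commit.
            using (var check = new SqlCommand(
                "SELECT COUNT(*) FROM TableB WHERE Id = @id", con, trx))
            {
                check.Parameters.AddWithValue("@id", tableBId);
                if ((int)check.ExecuteScalar() == 0)
                    throw new InvalidOperationException("Referenced TableB row does not exist.");
            }

            using (var insert = new SqlCommand(
                "INSERT INTO TableA (B_FK) VALUES (@id)", con, trx))
            {
                insert.Parameters.AddWithValue("@id", tableBId);
                insert.ExecuteNonQuery();
            }

            trx.Commit();
        }
    }
}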
I wouldn't be too concerned about concurrency issues with using transactions. I would shy away from trying to deal with locking issues yourself; I would start by just making updates atomic and maintaining data integrity.
You don't need a transaction for this, just a foreign key relationship. A relationship from table A's field B_FK referencing table B's primary key will prevent the creation of the table A row if the corresponding table B row does not exist.
If by "tampered with" you mean deleted, then John's right, a foreign-key relationship is probably what you want. But, if you mean modified, then a transaction is the only way to go. Yes, it'll mean that you have a potential bottleneck, but there's no way to avoid that if you want your operation to be "atomic". To avoid any noticeable performance degradation, you'll want to keep the lifetime of the transaction to the bare minimum.
Since you're using c# (and presumably ADO.NET), you could use the transaction features built-into the framework. However, it's better to have the transaction handled by the databases server since this means that the transaction can be started, completed & committed in a single request (see above re transaction lifetime).