I'm using Dapper, but this applies the same to ADO.NET code.
I have an operation on a web app that changes a lot of state in the database. To ensure an all-or-nothing result, I use a transaction to manage this. To do this, all my Repository classes share a connection (which is instantiated per request). On my connection I can call Connection.BeginTransaction().
However, this operation can sometimes take a while (say 10 seconds), and it locks some frequently-read-from tables while it does its thing. I want to allow other repos on other threads to continue without being blocked while this is happening.
It looks like I need to do 2 things to make this happen:
1) Set the IsolationLevel to something like ReadUncommitted:
_transaction = Connection.BeginTransaction(IsolationLevel.ReadUncommitted);
2) For all other connections that don't need a transaction, I still need to enroll those connections in a transaction so that I can again set ReadUncommitted. If I don't do this, then they'll still block while they wait for the long-running operation to complete. (A sketch of my current setup for step 1 is below.)
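Roughly, the current setup looks like this (a sketch only; the connection string and SQL are made up, and in the real app the connection and transaction are shared by the repository classes for the duration of the request):

    using System.Data;
    using System.Data.SqlClient;
    using Dapper;

    // Sketch: one per-request connection, one ReadUncommitted transaction,
    // every repository call passes the shared transaction to Dapper.
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var transaction = connection.BeginTransaction(IsolationLevel.ReadUncommitted))
        {
            // Made-up statement standing in for the many state changes.
            connection.Execute(
                "UPDATE Orders SET Status = @status WHERE Id = @id",
                new { status = "Processed", id = 42 },
                transaction: transaction);

            // ... many more repository calls sharing the same transaction ...

            transaction.Commit(); // all-or-nothing
        }
    }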
So does this mean I need ALL my connections to start a transaction? This sounds expensive and sub-performant. Are there other solutions I'm missing here?
Thanks
Be aware that there is a trade-off in using locks or not: it's performance versus concurrency control. Therefore, I don't think you should use ReadUncommitted all the time.
If you use ReadUncommitted on all the other transactions that should not be blocked by this long-running transaction, they will, as a side effect, also not be blocked by any other transaction.
Generally, we use this isolation level when performance is the first priority and data accuracy is not.
I want to allow other repos on other threads to continue without
locking while this is happening.
I think you can try IsolationLevel.Snapshot on only the transaction that does the long locking work: https://msdn.microsoft.com/en-us/library/tcbchxcb(v=vs.110).aspx
Extracted from the link:
The term "snapshot" reflects the fact that all queries in the
transaction see the same version, or snapshot, of the database, based
on the state of the database at the moment in time when the
transaction begins. No locks are acquired on the underlying data rows
or data pages in a snapshot transaction, which permits other
transactions to execute without being blocked by a prior uncompleted
transaction. Transactions that modify data do not block transactions
that read data, and transactions that read data do not block
transactions that write data, as they normally would under the default
READ COMMITTED isolation level in SQL Server. This non-blocking
behavior also significantly reduces the likelihood of deadlocks for
complex transactions.
Be aware that an enormous amount of data could be generated in tempdb for version store if there are a lot of modifications.
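To make that concrete, here is a minimal sketch of running only the long operation under snapshot isolation. It assumes the database already has ALLOW_SNAPSHOT_ISOLATION turned on (ALTER DATABASE ... SET ALLOW_SNAPSHOT_ISOLATION ON); the table and SQL are made up:

    using System.Data;
    using System.Data.SqlClient;

    // Sketch: the long state-changing operation runs under snapshot isolation,
    // so readers on other connections are not blocked while it works.
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var transaction = connection.BeginTransaction(IsolationLevel.Snapshot))
        {
            using (var command = connection.CreateCommand())
            {
                command.Transaction = transaction;
                command.CommandText = "UPDATE Orders SET Status = 'Processed' WHERE BatchId = @batch";
                command.Parameters.AddWithValue("@batch", 42);
                command.ExecuteNonQuery();
            }

            // Note: a snapshot transaction that modifies rows changed by another
            // transaction after it started will fail with an update-conflict error.
            transaction.Commit();
        }
    }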
There is something that worries me about my application. I have a SQL query that does a bunch of inserts into the database across various tables. I timed how long it takes to complete the process: about 1.5 seconds. At this point I'm not even done developing the query; I still have more inserts to program into it, so I fully expect the process to take even longer, perhaps up to 3 seconds.
Now, it is important that all of this data be consistent and finish either completely or not at all. So what I'm wondering is, is it OK for a transaction to take that long? Doesn't it lock up the tables, so that selects, inserts, updates, etc. cannot be run until the transaction is finished? My concern is that if this query is run frequently, it could lock up the entire application so that certain parts of it become either incredibly slow or unusable. With a low user base I doubt this would be an issue, but if my application should gain some traction, this query could potentially be run a lot.
Should I be concerned about this, or am I missing something and the database won't behave the way I'm imagining? I'm using a SQL Server 2014 database.
To note, I timed this using the C# Stopwatch class, started immediately before the transaction begins and stopped right after the changes are committed, so it's about as accurate as it can be.
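In other words, roughly this (a sketch; connection is the already-open SqlConnection):

    using System;
    using System.Diagnostics;

    // Sketch of how the timing is taken around the transaction.
    var stopwatch = Stopwatch.StartNew();
    using (var transaction = connection.BeginTransaction())
    {
        // ... all the inserts, each enlisted in the transaction ...
        transaction.Commit();
    }
    stopwatch.Stop();
    Console.WriteLine("Transaction took {0} ms", stopwatch.ElapsedMilliseconds);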
You're right to be concerned about this: a transaction will hold locks on the rows it has written until it commits, which can certainly cause problems such as deadlocks and temporary blocking that slows the system response. But there are various factors that determine the potential impact.
For example, you probably largely don't need to worry if your users are only updating and querying their own data, and your tables have indexing to support both read and write query criteria. That way each user's row locking will largely not affect the other users--depending on how you write your code of course.
If your users share data, and you want to be able to support efficient searching across multiple user's data even with multiple concurrent updates for example, then you may need to do more.
Some general concepts:
-- Ensure your transactions write to tables in the same order
-- Keep your transactions as short as possible by preparing the data to be written as much as possible before starting the transaction.
-- If this is a new system (and even if it's not), definitely consider enabling Snapshot Isolation and/or Read Committed Snapshot Isolation on the database. SI (when explicitly set on the session) will allow your read queries not to be blocked by concurrent writes; a rough sketch of doing that from C# follows this list. RCSI will allow all your read queries, by default, not to be blocked by concurrent writes. But read this to understand both the benefits and gotchas of the two isolation levels: https://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/
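As a rough illustration of "explicitly set on the session" for SI, a read can be wrapped like this (a sketch only; it assumes ALLOW_SNAPSHOT_ISOLATION is ON for the database, and the query is made up):

    using System;
    using System.Data.SqlClient;
    using System.Transactions;

    // Sketch: a read that runs under snapshot isolation, so it is not blocked
    // by concurrent writers. Assumes ALLOW_SNAPSHOT_ISOLATION is ON.
    var options = new TransactionOptions
    {
        IsolationLevel = IsolationLevel.Snapshot,
        Timeout = TimeSpan.FromSeconds(30)
    };

    using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open(); // enlists in the ambient snapshot transaction

        using (var command = new SqlCommand("SELECT COUNT(*) FROM dbo.Orders", connection))
        {
            var count = (int)command.ExecuteScalar();
        }

        scope.Complete();
    }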
I think it depends on your code: how effectively you use loops, what your select queries look like, and the other statements.
I am stress testing my website. It uses Entity Framework 6.
I have 10 threads. This is what they are doing:
1. Fetch some data from the web.
2. Create a new database context.
3. Create/update records in the database, using Database.SqlQuery(sql).ToList() to read and Database.ExecuteSqlCommand(sql) to write (about 200 records/second).
4. Close the context.
It crashes within 2 minutes with a database deadlock exception (consistently on a read!).
I have tried wrapping steps 2-4 in a Transaction, but this did not help.
I have read that as of EF6, ExecuteSqlCommand is wrapped in a transaction by default (https://msdn.microsoft.com/en-us/data/dn456843.aspx). How do I turn this behavior off?
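From what I can tell there is an ExecuteSqlCommand overload that takes a TransactionalBehavior, so something like this is what I would try to opt out per call (just a sketch, untested; MyDbContext and the SQL are placeholders):

    using System.Data.Entity;

    // Sketch: opt out of EF6's implicit transaction for a single command.
    using (var context = new MyDbContext())
    {
        var itemId = 42; // placeholder
        context.Database.ExecuteSqlCommand(
            TransactionalBehavior.DoNotEnsureTransaction,
            "UPDATE Items SET Hits = Hits + 1 WHERE Id = @p0",
            itemId);
    }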
I don't even understand why my transactions are deadlocking; they are reading/writing independent rows.
Is there a database setting I can flip somewhere to increase the size of my pending transaction queue?
I doubt EF has anything to do with it. Even though you are reading/writing independent rows, locks can escalate and lock pages. If you are not careful with your database design, and with how you perform the reads and writes (order is important), you can deadlock with EF or any other access technique.
What transaction type is being used?
.Net's TransactionScope defaults to SERIALIZABLE, at least in my applications which admittedly do not use EF. SERIALIZABLE transactions deadlock much more easily in my experience than other types such as ReadCommitted.
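For example, where I do use TransactionScope I set the isolation level explicitly rather than taking the Serializable default, roughly like this (a sketch):

    using System.Transactions;

    // Sketch: avoid TransactionScope's Serializable default by being explicit.
    var options = new TransactionOptions
    {
        IsolationLevel = IsolationLevel.ReadCommitted,
        Timeout = TransactionManager.DefaultTimeout
    };

    using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
    {
        // ... reads and writes that enlist in the ambient transaction ...
        scope.Complete();
    }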
I have an ASP.NET MVC application using EF6 and SQL Server with up to 15 or so concurrent users. To ensure the consistency of data between different queries during each page request, I have everything enclosed in transactions (using System.Transactions.TransactionScope).
When I use IsolationLevel.ReadCommitted and .Serializable, I get deadlock errors like this:
Transaction (Process ID #) was deadlocked on lock resources with another process and has been chosen as the deadlock victim.
When I use IsolationLevel.Snapshot, I get errors like this:
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.#' directly or indirectly in database '#' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.
These errors are the least frequent when using IsolationLevel.Snapshot (one to three per day, roughly).
My understanding of the issue leads me to believe that the only ways to guarantee zero transaction failures are to either:
Completely serialize all database access, or
Implement some type of transaction retry functionality.
And I can't do 1 because some tasks and requests take a while to run, while other parts of the application need to stay reasonably responsive.
I'm inclined to think retry could be implemented by getting MVC to re-run the controller action, but I don't know how to go about doing such a thing.
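What I have in mind is a generic retry wrapper along these lines (just a sketch; 1205 is the deadlock-victim error number and 3960 the snapshot update-conflict one, and with EF the SqlException may arrive wrapped in another exception, so this is not drop-in code):

    using System;
    using System.Data.SqlClient;
    using System.Threading;
    using System.Transactions;

    // Sketch: retry a unit of work on deadlock (1205) or snapshot conflict (3960).
    static T WithRetry<T>(Func<T> work, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                var options = new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot };
                using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
                {
                    var result = work();
                    scope.Complete();
                    return result;
                }
            }
            catch (SqlException ex)
            {
                var retryable = ex.Number == 1205 || ex.Number == 3960;
                if (!retryable || attempt >= maxAttempts)
                    throw;
                Thread.Sleep(100 * attempt); // brief back-off before retrying the whole unit
            }
        }
    }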
I also don't know how to reproduce the errors that my users are causing. All I get right now are rather uninformative exception logs. I could set up EF to log all SQL that gets run on the DB, now that EF6 lets you do that, but I'm not sure how helpful that would actually be.
Any ideas?
Regardless of isolation level, there are two main categories of locks: exclusive locks for INSERT, DELETE, and UPDATE, and shared locks for SELECT.
You should try to keep the time a transaction holds exclusive locks to a minimum. The default isolation level is READ COMMITTED. If you are writing to, or running reports against, an OLTP system, writers will block readers and you might get blocking issues.
In SQL Server 2005, READ COMMITTED SNAPSHOT ISOLATION was introduced. For readers, the version store in tempdb is used to capture a snapshot of the data to satisfy the current query, with a lot less overhead than SNAPSHOT ISOLATION. In short, readers are no longer blocked by writers.
This should fix your blocking issues. You need to remove any table hints or isolation commands you currently have.
See this article from Brent Ozar:
http://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/
Will it fix your deadlock? Probably not.
Deadlocks are caused by two or more processes taking exclusive locks on the same resources in the opposite order.
Check out MSDN: way cooler pictures, and it mentions the deadlock trace flags.
http://technet.microsoft.com/en-us/library/ms178104(v=sql.105).aspx
Process 1
DEBIT BANK ACCOUNT
CREDIT VENDOR ACCOUNT
Process 2
CREDIT VENDOR ACCOUNT
DEBIT BANK ACCOUNT
In short, change the order of your DML so that tables are accessed in a consistent order. Turn on a trace flag to get the actual TSQL causing the issue.
Last but not least, check out application locks as a last resort. They can be used as mutexes around code that might be causing deadlocks.
http://www.sqlteam.com/article/application-locks-or-mutexes-in-sql-server-2005
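For example, a deadlock-prone section can be serialized with sp_getapplock inside the transaction, roughly like this (a sketch; the resource name is arbitrary, and the return value of sp_getapplock should really be checked):

    using System.Data;
    using System.Data.SqlClient;

    // Sketch: take an application lock so only one process runs this section at a time.
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var transaction = connection.BeginTransaction())
        {
            using (var command = connection.CreateCommand())
            {
                command.Transaction = transaction;
                command.CommandText = "sp_getapplock";
                command.CommandType = CommandType.StoredProcedure;
                command.Parameters.AddWithValue("@Resource", "VendorPayment"); // arbitrary name
                command.Parameters.AddWithValue("@LockMode", "Exclusive");
                command.Parameters.AddWithValue("@LockOwner", "Transaction");
                command.Parameters.AddWithValue("@LockTimeout", 10000); // milliseconds
                command.ExecuteNonQuery();
            }

            // ... the debit/credit statements, now serialized across processes ...

            transaction.Commit(); // a transaction-owned applock is released on commit/rollback
        }
    }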
I am having a challenge maintaining an incredibly large transaction using NHibernate. Say I am saving a large number of entities. If I do not flush every N entities (say 10,000), performance gets killed by an overcrowded NHibernate session. If I do flush, I take locks at the DB level which, in combination with the read committed isolation level, does affect the working application. Also note that in reality I am importing an entity whose business logic is one of the hearts of the system, and around 10 tables are affected by its import. That makes a stateless session a bad idea, because the cascades would have to be maintained manually.
Moving the BL to a stored procedure is a big challenge for two reasons:
1. there is already complicated OO business logic in the domain classes of the application;
2. duplicated BL will be introduced.
Ideally I would want to flush the session to some file and, only once the preparation of the data is complete, execute its contents. Is that possible?
Any other suggestions/best practices are more than welcome.
Your scenario is a typical ORM batch problem. In general, no ORM is meant to be used for work like that. If you want high batch-processing performance (no everlasting locks and maybe deadlocks), you should not use the ORM to insert thousands of records.
Instead, use native batch inserts, which will always be a lot faster (like SqlBulkCopy for MSSQL).
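For MSSQL that could look roughly like this (a sketch; the DataTable columns and destination table are made up):

    using System.Data;
    using System.Data.SqlClient;

    // Sketch: bulk-insert pre-prepared rows instead of pushing them through the ORM.
    var table = new DataTable();
    table.Columns.Add("Name", typeof(string));
    table.Columns.Add("Amount", typeof(decimal));
    // ... fill the DataTable from your domain objects ...

    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var bulkCopy = new SqlBulkCopy(connection))
        {
            bulkCopy.DestinationTableName = "dbo.ImportedEntities"; // made-up table
            bulkCopy.BatchSize = 1000;
            bulkCopy.WriteToServer(table);
        }
    }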
Anyway, if you want to use NHibernate for this, try to make use of the batch size setting.
Call Save or Update on all your objects and only call session.Flush() once at the end. This will keep all your objects in memory...
Depending on the batch size, NHibernate should try to create insert/update batches of that size, meaning you will have a lot fewer roundtrips to the database and therefore fewer locks, or at least they shouldn't be held for as long...
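A rough sketch of that pattern (the relevant setting is adonet.batch_size; the session factory and entity collection are placeholders):

    // Sketch: enable ADO.NET batching (in hibernate.cfg.xml or programmatic config):
    //   <property name="adonet.batch_size">100</property>

    using (var session = sessionFactory.OpenSession())
    using (var tx = session.BeginTransaction())
    {
        foreach (var entity in entitiesToImport)
        {
            session.Save(entity);
        }

        session.Flush(); // statements are sent to the server in batches
        tx.Commit();
    }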
In general, your operations should only lock the database the moment your first insert statement gets executed on the server if you use normal transactions. It might work differently if you work with TransactionScope.
Here are some additional reads of how to improve batch processing.
http://fabiomaulo.blogspot.de/2011/03/nhibernate-32-batching-improvement.html
NHibernate performance insert
http://zvolkov.com/clog/2010/07/16?s=Insert+or+Update+records+in+bulk+with+NHibernate+batching
Scenario: I want to let multiple (2 to 20, probably) server applications use a single database using ADO.NET. I want individual applications to be able to take ownership of sets of records in the database, hold them in memory (for speed) in DataSets, respond to client requests on the data, perform updates, and prevent other applications from updating those records until ownership has been relinquished.
I'm new to ADO.NET, but it seems like this should be possible using transactions with Data Adapters (ADO.NET disconnected layer).
Question part 1: Is that the right way to try and do this?
Question part 2: If that is the right way, can anyone point me at any tutorials or examples of this kind of approach (in C#)?
Question part 3: If I want to be able to take ownership of individual records and release them independently, am I going to need a separate transaction for each record, and by extension a separate DataAdapter and DataSet to hold each record, or is there a better way to do that? Each application will likely hold ownership of thousands of records simultaneously.
How long were you thinking of keeping the transaction open for?
How many concurrent users are you going to support?
These are two of the questions you need to ask yourself. If the answer for the former is a "long time" and the answer to the latter is "many" then the approach will probably run into problems.
So, my answer to question one is: no, it's probably not the right approach.
If you take the transactional lock approach then you are going to limit your scalability and response times. You could also run into database errors, e.g. SQL Server (assuming you are using SQL Server) can be very greedy with locks and could lock more resources than you request/expect. The application could request row-level locks on the records that it "owns"; however, SQL Server could escalate those row locks to a table lock. This would block and could result in timeouts or perhaps deadlocks.
I think the best way to meet the requirements as you've stated them is to write a lock manager/record checkout system. Martin Fowler calls this a Pessimistic Offline Lock.
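As a very rough sketch of the record-checkout idea (not a full implementation of Fowler's pattern; the table and column names are invented, and a primary key on RecordId makes the INSERT the atomic "take ownership" step):

    using System.Data.SqlClient;

    // Sketch: checkout table assumed to be
    //   CREATE TABLE RecordLock (RecordId INT PRIMARY KEY, Owner NVARCHAR(100), TakenAtUtc DATETIME2)
    static bool TryCheckOut(SqlConnection connection, int recordId, string owner)
    {
        using (var command = new SqlCommand(
            "INSERT INTO RecordLock (RecordId, Owner, TakenAtUtc) VALUES (@id, @owner, SYSUTCDATETIME())",
            connection))
        {
            command.Parameters.AddWithValue("@id", recordId);
            command.Parameters.AddWithValue("@owner", owner);
            try
            {
                command.ExecuteNonQuery();
                return true; // this application now owns the record
            }
            catch (SqlException ex)
            {
                if (ex.Number == 2627 || ex.Number == 2601) // key violation: already checked out
                    return false;
                throw;
            }
        }
    }

    // Releasing ownership: DELETE FROM RecordLock WHERE RecordId = @id AND Owner = @owner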
UPDATE
If you are using SQL Server 2008 you can set the lock escalation behavior on a table level:
ALTER TABLE T1 SET (LOCK_ESCALATION = DISABLE);
This will disable lock escalation in "most" situations and may help you.
You actually need concurrency control, along with transaction support.
Transactions only come into the picture when you perform multiple operations on the database; as soon as the connection is released, the transaction no longer applies.
Concurrency control lets you work with multiple updates on the same data. If two or more clients hold the same set of data and one needs to read/write that data after another client has updated it, concurrency control lets you decide which set of updates to keep and which to ignore. A full treatment of concurrency is beyond the scope of this answer; check out this article for more information.
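For example, optimistic concurrency is commonly implemented with a rowversion column: read the version along with the data, and make the UPDATE conditional on it (a sketch; the table and column names are invented):

    using System.Data.SqlClient;

    // Sketch: optimistic concurrency check against a ROWVERSION column.
    // Assumes the table has a column: RowVersion ROWVERSION
    static bool TryUpdateName(SqlConnection connection, int id, string newName, byte[] originalRowVersion)
    {
        using (var command = new SqlCommand(
            "UPDATE Records SET Name = @name WHERE Id = @id AND RowVersion = @rv",
            connection))
        {
            command.Parameters.AddWithValue("@name", newName);
            command.Parameters.AddWithValue("@id", id);
            command.Parameters.AddWithValue("@rv", originalRowVersion);

            // 0 rows affected means another client changed the record first;
            // the caller decides whether to reload and merge or discard its change.
            return command.ExecuteNonQuery() == 1;
        }
    }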