Entity Framework 6 Fails Under High Load - C#

I am stress testing my website. It uses Entity Framework 6.
I have 10 threads. This is what they are doing:
1. Fetch some data from the web.
2. Create a new database context.
3. Create/update records in the database, using Database.SqlQuery(sql).ToList() to read and Database.ExecuteSqlCommand(sql) to write (about 200 records/second).
4. Close the context.
It crashes within 2 minutes with a database deadlock exception (consistently on a read!).
I have tried wrapping steps 2-4 in a Transaction, but this did not help.
I have read that as of EF6, ExecuteSqlCommand is wrapped in a transaction by default (https://msdn.microsoft.com/en-us/data/dn456843.aspx). How do I turn this behavior off?
I don't even understand why my transactions are deadlocked; they are reading/writing independent rows.
Is there a database setting I can flip somewhere to increase the size of my pending transaction queue?
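For reference, EF6 exposes an ExecuteSqlCommand overload that opts out of the implicit transaction on a per-call basis; a minimal sketch, assuming a context variable named ctx and a SQL string named sql:

using System.Data.Entity;

// Sketch only: ask EF6 not to wrap this command in its own transaction.
// "ctx" and "sql" are placeholders for the real context and statement.
ctx.Database.ExecuteSqlCommand(TransactionalBehavior.DoNotEnsureTransaction, sql);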

I doubt EF has anything to do with it. Even though you are reading/writing independent rows, locks can escalate and lock pages. If you are not careful with your database design and with how you perform the reads and writes (order is important), you can deadlock, with EF or any other access technique.

What transaction type is being used?
.Net's TransactionScope defaults to SERIALIZABLE, at least in my applications which admittedly do not use EF. SERIALIZABLE transactions deadlock much more easily in my experience than other types such as ReadCommitted.
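If TransactionScope is involved, a minimal sketch of pinning the isolation level explicitly instead of relying on the Serializable default (the timeout value is only illustrative):

using System;
using System.Transactions;

var options = new TransactionOptions
{
    IsolationLevel = IsolationLevel.ReadCommitted,   // instead of the Serializable default
    Timeout = TimeSpan.FromSeconds(30)               // illustrative value
};

using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
{
    // ... reads/writes go here ...
    scope.Complete();
}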

Related

Using transactions during reads using Entity Framework 6+

In our dev team we have an interesting discussion regarding opening transactions during reads in Entity Framework.
The case is this: we have a unit of work in an MVC app which spans action methods - it simply opens an EF transaction before executing the action and commits if no error appears during execution. This is fine, and maybe some of you use a UoW pattern with EF in that way.
The interesting part is actions that perform only reads (no modification of any entity, for example a get by id). Should a transaction be opened for reads as well? And what is the difference if we read without a transaction while another active transaction is touching the same table? Suppose that we have set the default transaction isolation level to read committed.
I was in favour of opening a transaction, which keeps reads consistent, but there are arguments against, such as transactions slowing down reads (which is true, but I don't know by how much).
What are your thoughts? I know that some of you will answer like old architects saying "it depends", but I need strong arguments, not hate :)
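For context, a rough sketch of the per-action unit of work described above, assuming ASP.NET MVC and some way of resolving the request's DbContext (the DbContextResolver here is hypothetical):

using System.Web.Mvc;

// Sketch only: open a transaction before the action runs, commit if it
// succeeded, roll back if it threw. "DbContextResolver" is a hypothetical
// per-request resolver for the EF context.
public class UnitOfWorkAttribute : ActionFilterAttribute
{
    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        DbContextResolver.Current.Database.BeginTransaction();
    }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        var tx = DbContextResolver.Current.Database.CurrentTransaction;
        if (filterContext.Exception == null)
            tx.Commit();
        else
            tx.Rollback();
    }
}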
For SQL Server at READ COMMITTED isolation there is no difference between a SELECT inside a transaction and one outside a transaction.
With legacy READ COMMITTED the S locks are released at the end of each query even in a transaction.
With READ COMMITTED SNAPSHOT (which is the default for EF Code First) there are no S locks taken, and row versions provide only a statement-level point-in-time view of the database.
At SNAPSHOT isolation, the whole transaction would see the database at a single point-in-time, still with no locking.
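To make the comparison concrete, a minimal EF6 sketch of a read inside an explicitly scoped transaction (the context and entity names are placeholders); at READ COMMITTED this behaves the same as running the query with no transaction at all:

using System.Data;   // IsolationLevel
using System.Linq;

// Sketch only: "MyContext" and "Customers" are placeholder names.
using (var ctx = new MyContext())
using (var tx = ctx.Database.BeginTransaction(IsolationLevel.ReadCommitted))
{
    var customer = ctx.Customers.FirstOrDefault(c => c.Id == 42);
    tx.Commit();   // read-only, but commit rather than leave the transaction open
}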

Is my SQL transaction taking too long?

There is something that worries me about my application. I have a SQL query that does a bunch of inserts into the database across various tables. I timed how long it takes to complete the process, and it takes about 1.5 seconds. At this point I'm not even done developing the query; I still have more inserts to program into it. So I fully expect this process to take even longer, perhaps up to 3 seconds.
Now, it is important that all of this data be consistent and finish either completely or not at all. So what I'm wondering is: is it OK for a transaction to take that long? Doesn't it lock up the table, so selects, inserts, updates, etc. cannot be run until the transaction is finished? My concern is that if this query is being run frequently, it could lock up the entire application so that certain parts of it become either incredibly slow or unusable. With a low user base, I doubt this would be an issue, but if my application should gain some traction, this query could potentially run a lot.
Should I be concerned about this, or am I missing something about how the database will actually behave? I'm using a SQL Server 2014 database.
To note, I timed this by using the Stopwatch C# object immediately before the transaction starts, and stopped it right after the changes are committed. So it's about as accurate as can be.
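For reference, a minimal sketch of the timing approach described (Stopwatch lives in System.Diagnostics; "connection" is assumed to be an open SqlConnection):

using System;
using System.Data.SqlClient;
using System.Diagnostics;

// Sketch only: time from just before the transaction starts until just after commit.
var sw = Stopwatch.StartNew();
using (var tx = connection.BeginTransaction())
{
    // ... the various INSERTs, all enlisted in "tx" ...
    tx.Commit();
}
sw.Stop();
Console.WriteLine("Transaction took " + sw.ElapsedMilliseconds + " ms");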
You're right to be concerned about this, as a transaction will lock the rows it has written until the transaction commits, which can certainly cause problems such as deadlocks and temporary blocking that slows the system's response. But there are various factors that determine the potential impact.
For example, you probably largely don't need to worry if your users only update and query their own data, and your tables have indexing to support both read and write query criteria. That way each user's row locking will largely not affect the other users, depending on how you write your code of course.
If your users share data, and you want to support efficient searching across multiple users' data even with multiple concurrent updates, then you may need to do more.
Some general concepts:
-- Ensure your transactions write to tables in the same order
-- Keep your transactions as short as possible by preparing the data to be written as much as possible before starting the transaction (see the sketch after this list).
-- If this is a new system (and even if not new), definitely consider enabling Snapshot Isolation and/or Read Committed Snapshot Isolation on the database. SI will (when explicitly set on the session) allow your read queries not to be blocked by concurrent writes. RCSI will allow all your read queries by default not to be blocked by concurrent writes. But read this to understand both the benefits and gotchas of both isolation levels: https://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/
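As a rough sketch of the 'prepare first, then write' point (EF6 style; "context" is an EF6 DbContext, and the table, columns, and PrepareRowsToInsert helper are all hypothetical):

// Sketch only: do the slow preparation outside the transaction so locks are
// held only for the actual writes. "PrepareRowsToInsert" is a hypothetical helper.
var rows = PrepareRowsToInsert();

using (var tx = context.Database.BeginTransaction())
{
    foreach (var row in rows)
    {
        context.Database.ExecuteSqlCommand(
            "INSERT INTO dbo.MyTable (Col1, Col2) VALUES (@p0, @p1)",
            row.Col1, row.Col2);
    }
    tx.Commit();   // locks acquired by the INSERTs are released here
}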
I think it depends on your code: how effectively you use loops, your select queries, and the other statements.

ADO.NET Long running transactions

I'm using Dapper, but this applies the same to ADO.NET code.
I have an operation on a web app that changes a lot of state in the database. To ensure an all-or-nothing result, I use a transaction to manage this. To do this, all my Repository classes share a connection (which is instantiated per request). On my connection I can call Connection.BeginTransaction().
However, this operation can sometimes take a while (say 10 seconds), and it's locking some frequently-read-from tables while it does its thing. I want to allow other repos on other threads to continue without locking while this is happening.
It looks like I need to do 2 things to make this happen:
1) Set the IsolationLevel to something like ReadUncommitted:
_transaction = Connection.BeginTransaction(IsolationLevel.ReadUncommitted);
2) For all other connections that don't need a transaction, I still need to enroll those connections in a transaction, so that I can again set ReadUncommitted. If I don't do this then they'll still lock while they wait for the long running operation to complete.
So does this mean I need ALL my connections to start a transaction? This sounds expensive and sub-performant. Are there other solutions I'm missing here?
Thanks
Be aware that there is a trade-off between using locks or not: it is performance versus concurrency control. Therefore, I don't think you should use ReadUncommitted all the time.
If you use ReadUncommitted on all the other transactions that should not be blocked by this long-running transaction, they will, as a side effect, no longer be blocked by any other transactions either (and may read uncommitted data).
Generally, this isolation level is used when performance is the first priority and data accuracy is not required.
I want to allow other repos on other threads to continue without locking while this is happening.
I think you can try IsolationLevel.Snapshot on only the transaction that does the long locking work: https://msdn.microsoft.com/en-us/library/tcbchxcb(v=vs.110).aspx
Extracted from the link:
The term "snapshot" reflects the fact that all queries in the
transaction see the same version, or snapshot, of the database, based
on the state of the database at the moment in time when the
transaction begins. No locks are acquired on the underlying data rows
or data pages in a snapshot transaction, which permits other
transactions to execute without being blocked by a prior uncompleted
transaction. Transactions that modify data do not block transactions
that read data, and transactions that read data do not block
transactions that write data, as they normally would under the default
READ COMMITTED isolation level in SQL Server. This non-blocking
behavior also significantly reduces the likelihood of deadlocks for
complex transactions.
Be aware that an enormous amount of data could be generated in tempdb for version store if there are a lot of modifications.
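A minimal sketch of what that might look like with plain ADO.NET (Dapper calls would simply be passed the same transaction); it assumes ALLOW_SNAPSHOT_ISOLATION has already been enabled on the database:

using System.Data;
using System.Data.SqlClient;

// Sketch only: the long-running work runs under SNAPSHOT so readers elsewhere
// are not blocked. Note that writes under SNAPSHOT can fail with update
// conflicts, so be prepared to retry.
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var tx = connection.BeginTransaction(IsolationLevel.Snapshot))
    {
        // ... pass "tx" to every command / Dapper call in this unit of work ...
        tx.Commit();
    }
}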

SQL Server transactions and transaction isolation - getting errors that I don't know how to fix

I have an ASP.NET MVC application using EF6 and SQL Server with up to 15 or so concurrent users. To ensure the consistency of data between different queries during each page request, I have everything enclosed in transactions (using System.Transactions.TransactionScope).
When I use IsolationLevel.ReadCommitted and .Serializable, I get deadlock errors like this:
Transaction (Process ID #) was deadlocked on lock resources with another process and has been chosen as the deadlock victim.
When I use IsolationLevel.Snapshot, I get errors like this:
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.#' directly or indirectly in database '#' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.
These errors are the least frequent when using IsolationLevel.Snapshot (one to three per day, roughly).
My understanding of the issue leads me to believe that the only ways to guarantee zero transaction failures are to either:
Completely serialize all database access, or
Implement some type of transaction retry functionality.
And I can't do 1 because some tasks and requests take a while to run, while other parts of the application need to stay reasonably responsive.
I'm inclined to think retry could be implemented by getting MVC to re-run the controller action, but I don't know how to go about doing such a thing.
I also don't know how to reproduce the errors that my users are causing. All I get right now are rather uninformative exception logs. I could set up EF to log all SQL that gets run on the DB, now that EF6 lets you do that, but I'm not sure how helpful that would actually be.
Any ideas?
Regardless of isolation level, there are two main categories of locks: EXCLUSIVE for INSERT, DELETE, and UPDATE, and SHARED for SELECT.
You should try to limit the transaction time for EXCLUSIVE locks to a minimum. The default isolation level is READ COMMITTED. If you are writing/running reports against the OLTP system, writers will block readers and you might get blocking issues.
In SQL Server 2005, READ COMMITTED SNAPSHOT ISOLATION was introduced. For readers, the version store in tempdb is used to capture a snapshot of the data to satisfy the current query. It has a lot less overhead than SNAPSHOT ISOLATION. In short, readers are no longer blocked by writers.
This should fix your blocking issues. You will need to remove any table hints or isolation commands you currently have.
See article from Brent Ozar.
http://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/
Will it fix your deadlock? Probably not.
Deadlocks are caused by two or more processes taking exclusive locks on the same resources in opposite orders.
Check out MSDN - it has way cooler pictures and mentions the deadlock trace flags.
http://technet.microsoft.com/en-us/library/ms178104(v=sql.105).aspx
Process 1
DEBIT BANK ACCOUNT
CREDIT VENDOR ACCOUNT
Process 2
CREDIT VENDOR ACCOUNT
DEBIT BANK ACCOUNT
In short, change the order of your DML to have consistent access to the tables. Turn on a trace flag to get the actual TSQL causing the issue.
Last but not least, check out application locks as a last resort. They can be used as mutexes around code that might be causing deadlocks.
http://www.sqlteam.com/article/application-locks-or-mutexes-in-sql-server-2005
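On the retry idea from the question: a minimal sketch of retrying a unit of work when SQL Server reports a deadlock victim (error 1205) or a snapshot update conflict (error 3960). It assumes the SqlException surfaces directly (EF may wrap it), and the retry count and back-off are arbitrary:

using System;
using System.Data.SqlClient;
using System.Threading;

// Sketch only: re-run the whole unit of work on deadlock / snapshot conflict.
static void RunWithRetry(Action unitOfWork, int maxAttempts = 3)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            unitOfWork();
            return;
        }
        catch (SqlException ex) when ((ex.Number == 1205 || ex.Number == 3960)
                                      && attempt < maxAttempts)
        {
            Thread.Sleep(100 * attempt);   // brief back-off before retrying
        }
    }
}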

Nhibernate large transactions, flushes vs locks

I am facing the challenge of maintaining an incredibly large transaction using NHibernate. Let us say I am saving a large number of entities. If I do not flush every N entities (say 10000), performance is killed by an overcrowded NHibernate session. If I do flush, I take locks at the DB level, which, in combination with the read committed isolation level, affects the running application. Also note that in reality I import an entity whose business logic is one of the hearts of the system, and around 10 tables are affected by its import. That makes a stateless session a bad idea, because the cascades would have to be maintained manually.
Moving the BL to a stored procedure is a big challenge for two reasons:
-- there is already complicated OO business logic in the domain classes of the application,
-- duplicated BL would be introduced.
Ideally I would like to flush the session to some kind of file, and only once the preparation of the data is completed, execute its contents. Is that possible?
Any other suggestions/best practices are more than welcome.
Your scenario is a typical ORM batch problem. In general we can say that no ORM is meant to be used for stuff like that. If you want high batch-processing performance (no everlasting locks and possible deadlocks), you should not use the ORM to insert thousands of records.
Instead use native batch inserts, which will always be a lot faster (like SqlBulkCopy for MSSQL).
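For example, a minimal SqlBulkCopy sketch (the table name and the prepared DataTable are placeholders):

using System.Data;
using System.Data.SqlClient;

// Sketch only: bulk-insert a prepared DataTable in one streaming operation.
using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.MyTable";   // placeholder table
    bulk.BatchSize = 5000;                       // illustrative value
    bulk.WriteToServer(dataTable);               // "dataTable" prepared elsewhere
}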
Anyway, if you want to use NHibernate for this, try to make use of the batch size setting.
Call Save or Update on all your objects and only call session.Flush once at the end. This will create all your objects in memory...
Depending on the batch size, NHibernate should try to create insert/update batches of that size, meaning you will have far fewer roundtrips to the database and therefore fewer locks, or at least they shouldn't be held as long...
In general, your operations should only lock the database from the moment your first insert statement gets executed on the server, if you use normal transactions. It might work differently if you work with TransactionScope.
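A rough sketch of that approach, assuming "cfg" is the NHibernate Configuration used to build the session factory and "entities" is the prepared data:

// Sketch only: enable ADO.NET batching, save everything, flush once.
cfg.SetProperty("adonet.batch_size", "100");   // or set it in the XML configuration

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    foreach (var entity in entities)
        session.Save(entity);

    session.Flush();   // one flush at the end; inserts go out in batches
    tx.Commit();
}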
Here is some additional reading on how to improve batch processing:
http://fabiomaulo.blogspot.de/2011/03/nhibernate-32-batching-improvement.html
NHibernate performance insert
http://zvolkov.com/clog/2010/07/16?s=Insert+or+Update+records+in+bulk+with+NHibernate+batching
