I need to wrap some pieces of code in a TransactionScope. The code inside this using statement calls a managed C++ library, which in turn calls some unmanaged code. I also want to update my database, which uses Entity Framework.
Here is the problem: when calling SaveChanges on the DbContext inside the TransactionScope, I always get some sort of timeout exception from the database layer. I've googled this, and it seems to be a fairly common problem, but I haven't found any answers applicable to my case. This is a snippet of my code:
using (var transactionScope = new TransactionScope())
{
    try
    {
        // Call to the managed C++ library
        using (var dbContext = _dbContextFactory.Create())
        {
            // Doing some CRUD operations on the DbContext
            // Probably some more dbContext-related stuff
            dbContext.SaveChanges(); // Results in a timeout
        }
    }
    catch (Exception)
    {
        transactionScope.Dispose();
        throw;
    }
}
I'm using Entity Framework 6.1.3, so I have access to BeginTransaction on the Database, but I also need to wrap the C++ calls inside a TransactionScope.
Any suggestions?
You will need to pass in a TransactionOptions instance defining your timeout (how long to keep the transaction open). An example using the absolute upper limit of the timeout would be:
TransactionOptions to = new TransactionOptions();
to.IsolationLevel = IsolationLevel.ReadCommitted;
to.Timeout = TransactionManager.MaximumTimeout;
using (TransactionScope ts = new TransactionScope(TransactionScopeOption.Required, to)) { }
You should definitely be aware of the impact of such a long-running transaction; this isn't typically required, and I'd highly recommend using something below MaximumTimeout that reflects how long you expect the work to run. Do your best to keep the period for which the transaction is held as short as possible, and move any processing that doesn't have to be part of a single transaction outside the transaction scope.
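For instance, a minimal sketch of a bounded scope might look like this (the five-minute value is an arbitrary placeholder, not a recommendation):
TransactionOptions to = new TransactionOptions();
to.IsolationLevel = IsolationLevel.ReadCommitted;
to.Timeout = TimeSpan.FromMinutes(5); // placeholder: size this to your actual workload

using (TransactionScope ts = new TransactionScope(TransactionScopeOption.Required, to))
{
    // do the transactional work here
    ts.Complete();
}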
It's also worth noting that, depending on the underlying database, it may enforce its own limits on transaction duration if configured to do so.
Related
In our codebase, we use TransactionScope extensively to manage our transactions. In one part of it, we have code that looks like this:
// options declared elsewhere
using var transactionScope = new TransactionScope(TransactionScopeOption.Required, transactionScopeOptions, TransactionScopeAsyncFlowOption.Enabled);

await _repository.DeleteAll(cancellationToken);
// do more stuff that might trigger a call to SaveChangesAsync somewhere

transactionScope.Complete();
Then, in our repository implementation, we may have something that looks like this:
public async Task DeleteAll(CancellationToken cancellationToken)
{
    // This may not even be necessary
    if (_dbContext.Database.GetDbConnection().State != ConnectionState.Open)
    {
        await _dbContext.Database.OpenConnectionAsync(cancellationToken);
    }

    await _dbContext.Database.ExecuteSqlRawAsync("DELETE FROM ThatTable", cancellationToken);
}
The documentation of ExecuteSqlRawAsync states that no transaction is started by that method. This leads me to my question: what is the proper way to start a transaction and have it enlisted in the transaction scope so that the call to Complete will commit this transaction along with the other work we have EF do?
As I understand it, your goal is to run both DeleteAll (which uses ExecuteSqlRawAsync) and any subsequent SaveChangesAsync in the same transaction. If so, your code already achieves that.
Yes, ExecuteSqlRawAsync does not start a separate transaction, but you do not need another one: you are already inside a transaction, because you are inside a TransactionScope. SqlClient (or whichever other EF provider you use) will notice the ambient Transaction.Current when the connection is opened and will enlist in a transaction. Both ExecuteSqlRawAsync and SaveChangesAsync will run inside that transaction and commit or roll back together (I verified this to be sure).
The "doesn't start a transaction" remark in the docs is aimed more at situations like:
ExecuteSqlRawAsync("delete from sometable where id = 1; delete from sometable where id = 2;");
where you might indeed want to run your SQL inside a transaction (assuming one doesn't already exist), so the docs warn you that the method will not create one for you.
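For completeness, here is a minimal sketch of that situation, wrapping several raw statements in an explicit EF Core transaction (the table name is hypothetical):
// Explicit transaction, for when there is no ambient TransactionScope
await using var tx = await _dbContext.Database.BeginTransactionAsync(cancellationToken);

await _dbContext.Database.ExecuteSqlRawAsync(
    "DELETE FROM SomeTable WHERE Id = 1", cancellationToken);
await _dbContext.Database.ExecuteSqlRawAsync(
    "DELETE FROM SomeTable WHERE Id = 2", cancellationToken);

await tx.CommitAsync(cancellationToken); // both deletes commit or roll back together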
I believe the best approach is as follows (start a transaction and stay within the transaction scope so that the call to Complete will commit it):
using (var scope = new TransactionScope(...))
{
    ...
    scope.Complete();
}
The ambient transaction begins as soon as execution enters the using block.
Background
We are trying to archive old user data to keep our most common tables smaller.
Issue
Normal EF code for removing records works for our custom tables. The AspNetUsers table is a different story. It appears that the way to delete a user is _userManager.Delete or _userManager.DeleteAsync. These work as long as I don't try to do multiple DB calls in one transaction; when I wrap them in a TransactionScope, they time out. Here is an example:
public bool DeleteByMultipleIds(List<string> idsToRemove)
{
    try
    {
        using (var scope = new TransactionScope())
        {
            foreach (var id in idsToRemove)
            {
                var user = _userManager.FindById(id);
                // copy user data to archive table
                _userManager.Delete(user); // causes timeout
            }
            scope.Complete();
        }
        return true;
    }
    catch (TransactionAbortedException e)
    {
        Logger.Publish(e);
        return false;
    }
    catch (Exception e)
    {
        Logger.Publish(e);
        return false;
    }
}
Note that while the code is running, if I query the DB directly like:
DELETE
FROM ASPNETUSERS
WHERE Id = 'X'
it will also time out. The same SQL works before the C# code is executed. Therefore, it appears that more than one DB hit locks the table. How can I find the user (DB hit #1) and delete the user (DB hit #2) in one transaction?
For me, the problem involved the use of multiple separate DbContexts within the same transaction. The BeginTransaction() approach did not work.
Internally, UserManager.Delete() is calling an async method in a RunSync() wrapper. Therefore, using the TransactionScopeAsyncFlowOption.Enabled parameter for my TransactionScope did work:
using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
    _myContext1.Delete(organisation);
    _myContext2.Delete(orders);
    _userManager.Delete(user);
    scope.Complete();
}
Microsoft's advice is to use a different API when doing transactions with EF, due to the interaction between EF and the TransactionScope class. Implicitly, TransactionScope forces the isolation level up to Serializable, which is what causes the deadlock.
A good description of the relevant EF internal API is here: MSDN Link
For reference, you may need to check whether the user manager exposes its data context, and replace your TransactionScope with using (var dbContextTransaction = context.Database.BeginTransaction()) { /* code */ }.
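A hedged sketch of that shape, assuming the user manager and the archiving code share the same underlying DbContext (names as in the question):
using (var dbContextTransaction = context.Database.BeginTransaction())
{
    try
    {
        var user = _userManager.FindById(id);
        // copy user data to the archive table via the same context
        _userManager.Delete(user);
        context.SaveChanges();

        dbContextTransaction.Commit();
    }
    catch
    {
        dbContextTransaction.Rollback();
        throw;
    }
}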
Alternatively, looking at your scenario, you are actually quite safe finding the user by ID, trying to delete it, and simply catching an error in the unlikely case the user was deleted in the fraction of a second between finding and deleting it.
The previous version of the question is provided as added context below. The improved problem formulation could be as follows:
How does one share a transaction between multiple contexts in EF 6.1.0 database-first and .NET 4.5.2 without escalating to a distributed transaction?
For that it looks like I need to share a connection between the multiple contexts, but the code examples and tutorials I've looked at so far haven't been fruitful. The problem seems to revolve around how to define a working combination of connection object and transaction object types so that the EF database-first object metadata is also built and found when constructing the object contexts.
That is, I would like to do something akin to what is described in the EF 6.n tutorials here. Some example code could be:
int count1;
int count2;

using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
    // How to define this connection so as not to run into UnintentionalCodeFirstException?
    // Creating a dummy context to obtain the connection string like
    // dummyContext.Database.Connection.ConnectionString and then using that connection
    // will be greeted with the aforementioned exception.
    using (var conn = new SqlConnection("..."))
    {
        using (var c1 = new SomeEntities(conn, contextOwnsConnection: false))
        {
            // Use some stored procedures etc.
            count1 = await c1.SomeEntity1.CountAsync();
        }
        using (var c2 = new SomeEntities(conn, contextOwnsConnection: false))
        {
            // Use some stored procedures etc.
            count2 = await c2.SomeEntity21.CountAsync();
        }
    }
}

int count = count1 + count2;
In the examples there are also other methods for creating a shared connection and a transaction, but as written, the culprit seems to be that if, say, I provide the connection string (the "..." part in the previous snippet) as dummyContext.Database.Connection.ConnectionString, I just get the aforementioned exception.
I'm not sure if I'm reading the wrong sources or if there's something else wrong in my code when I try to share a transaction across multiple EF contexts. How could it be done?
I've read quite a few other SO posts on this (e.g. this one) and some tutorials. They did not help.
Strangely, it looks like I don't have the constructor overloads defined as in other tutorials and posts. That is, following the linked tutorial, I can't write new BloggingContext(conn, contextOwnsConnection: false) and use a shared connection and an external transaction.
Then if I write
public partial class SomeEntities : DbContext
{
    public SomeEntities(DbConnection existingConnection, bool contextOwnsConnection)
        : base(existingConnection, contextOwnsConnection) { }
}
and use it like in the tutorials, I get an exception from the following T4-template-generated code:
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    throw new UnintentionalCodeFirstException();
}
I'm using .NET 4.5.2 and EF 6.1.0. I've constructed the .edmx from an existing database and generated the code from there. In this particular situation I'm using Task Parallel Library threads to load dozens of SQL Server Master Data Services (MDS) staging tables (yes, a big model) and to call the associated procedures (provided by MDS, one per table). MDS has its own compensation logic in case staging to some of the tables fails, but rolling back a transaction should be doable too. It just looks like I have a (strange) problem with my EF.
Addendum: Steve suggested using a straight TransactionScope. Without a shared connection that would require a distributed transaction, which isn't an option I can choose. Then, if I try to provide a shared connection for the contexts (some options are shown in the tutorials, one here), I have the problem of "missing constructors". When I define one, I get the exception I refer to in the code. All in all, this feels quite strange. Maybe there's something wrong in how I go about generating the DbContext and related classes.
Note 1: It looks like the root cause is as described in the blog post by Arthur (of the EF developer team), Don't use Code First by mistake. That is, in database-first development the framework looks up the class-relational mappings as defined in the connection string. So perhaps something is fishy in my connection string?
Have you tried wrapping the calls in a transaction scope?
using (var scope = new TransactionScope(TransactionScopeOption.Required,
    new TransactionOptions() { IsolationLevel = IsolationLevel.ReadCommitted }))
{
    // Do context work here
    context1.Derp();
    context2.Derp();

    // complete the transaction
    scope.Complete();
}
Because of connection pooling, using multiple EF DbContexts with exactly the same connection string under the same transaction scope will usually not result in DTC escalation (unless you disabled pooling in the connection string), because the same connection will be reused for both from the pool. In any case, you can reuse the same connection in your scenario like this (I assume you have already added the constructor which accepts a DbConnection and a flag indicating whether the context owns the connection):
using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
    // important - use the EF connection string here,
    // the one that starts with "metadata=res://*/..."
    var efConnectionString = ConfigurationManager.ConnectionStrings["SomeEntities"].ConnectionString;

    // note EntityConnection, not SqlConnection
    using (var conn = new EntityConnection(efConnectionString))
    {
        // important to prevent escalation
        await conn.OpenAsync();

        using (var c1 = new SomeEntities(conn, contextOwnsConnection: false))
        {
            // Use some stored procedures etc.
            count1 = await c1.SomeEntity1.CountAsync();
        }
        using (var c2 = new SomeEntities(conn, contextOwnsConnection: false))
        {
            // Use some stored procedures etc.
            count2 = await c2.SomeEntity21.CountAsync();
        }
    }
    scope.Complete();
}
This works and does not throw UnintentionalCodeFirstException, because you pass an EntityConnection. That connection carries the EDMX metadata, which is exactly what database-first needs. When you pass a plain SqlConnection, EF has no idea where to look for the metadata (it doesn't even know it should look for it), so it immediately assumes you are doing code-first.
Note that I pass the EF connection string in the code above. If you have a plain SqlConnection which you obtained by some other means, outside EF, this will not work as-is, because that connection string is needed. It is still possible, though, because EntityConnection has a constructor which accepts a plain DbConnection; you then have to supply the reference to the metadata yourself.
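If you are interested in that variant, a sketch could look as follows. Note that this is an assumption-laden example: it presumes the model resources are embedded with the default "res://*/..." naming, and every name in it is a placeholder for your actual model.
// MetadataWorkspace lives in System.Data.Entity.Core.Metadata.Edm,
// EntityConnection in System.Data.Entity.Core.EntityClient.
// Assumption: the EDMX resources use the default embedded resource names.
var workspace = new MetadataWorkspace(
    new[] { "res://*/SomeModel.csdl", "res://*/SomeModel.ssdl", "res://*/SomeModel.msl" },
    new[] { typeof(SomeEntities).Assembly });

using (var sqlConn = new SqlConnection("Data Source=...;Initial Catalog=...;Integrated Security=True"))
using (var entityConn = new EntityConnection(workspace, sqlConn))
using (var ctx = new SomeEntities(entityConn, contextOwnsConnection: false))
{
    // query as usual; the context now has both the metadata and the plain connection
}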
To check that you are indeed preventing escalation in all cases, disable pooling (Pooling=false in the connection string) and stop the DTC service, then run this code; it should run fine. Then run some other code which does not share the connection, and you should observe an error indicating that escalation was about to happen but the service is not available.
I have a doubt about TransactionScope. I'd like to perform a transactional operation where I first do some CRUD operations (a transaction which inserts and updates some rows in the database) and get a result from the whole transaction (an XML document).
After I've got the XML, I send it to a web service which my customer exposes so I can integrate my system with theirs.
The point is: imagine that one day the WS my customer exposes goes down due to a weekly or monthly support task performed by their IT area. Then every time I run the whole process, the DB operations succeed, but of course an exception is thrown the moment I try to call the WS.
After searching the internet I started to think of TransactionScope. My data access method, in my data access layer, already has a TransactionScope in which I perform the inserts, updates, deletes, etc.
The following code is what I'd like to try:
public void ProcessSomething()
{
    using (TransactionScope mainScope = new TransactionScope())
    {
        FooDAL dl = new FooDAL();
        string message = dl.ProcessTransaction();

        WSClientFoo client = new WSClientFoo();
        client.SendTransactionMessage(message);

        mainScope.Complete();
    }
}

public class FooDAL
{
    public string ProcessTransaction()
    {
        string transactionMessage = null; // was undeclared in the original snippet
        using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required,
            new TransactionOptions() { IsolationLevel = IsolationLevel.ReadCommitted }))
        {
            // Do insert, update, delete and, according to the operation, generate transactionMessage
            scope.Complete();
        }
        return transactionMessage;
    }
}
The question is, is it correct to use TransactionScope to handle what I want to do ?
Thanks a lot for your time :)
TransactionScopeOption.Required in your FooDAL.ProcessTransaction method in fact means: if there is an ambient transaction available, reuse it in this scope; otherwise, create a new one.
So in short: yes, this is the correct way of doing this.
But be advised that if you don't call scope.Complete() in FooDAL.ProcessTransaction, completing and disposing mainScope will fail with a TransactionAbortedException, which makes sense: if a nested scope decides that the transaction cannot be committed, the outer scope should not be able to commit it.
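A minimal sketch of that failure mode, using the names from the question:
using (var mainScope = new TransactionScope())
{
    using (var innerScope = new TransactionScope(TransactionScopeOption.Required))
    {
        // work happens here, but innerScope.Complete() is never called,
        // so disposing innerScope dooms the shared ambient transaction
    }

    mainScope.Complete();
} // disposing mainScope throws a TransactionAbortedException here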
So I'm working on this Entity Framework project that will be used as a kind of DAL, and when running stress tests (starting a couple of updates on entities through Thread()s) I'm getting these:
_innerException = {"Transaction (Process ID 94) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction."}
Here's an example of how I implemented my classes' methods:
public class OrderController
{
    public Order Select(long orderID)
    {
        using (var ctx = new BackEndEntities())
        {
            try
            {
                var res = from n in ctx.Orders
                              .Include("OrderedServices.Professional")
                              .Include("Agency")
                              .Include("Agent")
                          where n.OrderID == orderID
                          select n;
                return res.FirstOrDefault();
            }
            catch (Exception ex)
            {
                throw ex;
            }
        }
    }

    public bool Update(Order order)
    {
        using (var ctx = new BackEndEntities())
        {
            try
            {
                order.ModificationDate = DateTime.Now;
                ctx.Orders.Attach(order);
                ctx.SaveChanges();
                return true;
            }
            catch (Exception ex)
            {
                throw ex;
            }
        }
    }
}
and:
public class AgentController
{
    public Agent Select(long agentID)
    {
        using (var ctx = new BackEndEntities())
        {
            try
            {
                var res = from n in ctx.Agents.Include("Orders")
                          where n.AgentID == agentID
                          select n;
                return res.FirstOrDefault();
            }
            catch (Exception ex)
            {
                throw ex;
            }
        }
    }

    public bool Update(Agent agent)
    {
        using (var ctx = new BackEndEntities())
        {
            try
            {
                agent.ModificationDate = DateTime.Now;
                ctx.Agents.Attach(agent);
                ctx.ObjectStateManager.ChangeObjectState(agent, System.Data.EntityState.Modified);
                ctx.SaveChanges();
                return true;
            }
            catch (Exception ex)
            {
                throw ex;
            }
        }
    }
}
Obviously, the code here could probably be better, but I'm rather an EF newbie. Still, I think my problem is a design problem with the context.
I remember someone here mentioning that if my context is NOT shared, I won't run into these deadlock issues.
This does not seem 'shared' to me, as I do a using (new BackEndEntities()) in each method, so what do I have to change to make it more robust?
This DAL will be used in a web service exposed on the internet (after code review, of course), so I have no control over how much it will be stressed, and lots of different instances might want to update the same entity.
Thanks!
The reason for those deadlocks isn't your code itself but EF's default TransactionScope isolation level: SERIALIZABLE.
SERIALIZABLE is the most restrictive locking possible. It means that by default you are opting into the most restrictive isolation level, and you can expect a lot of locking!
The solution is to specify a different isolation level depending on the action you want to perform. You can surround your EF actions with something like this:
using (var scope = new TransactionScope(TransactionScopeOption.Required,
    new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot }))
{
    // do something with EF here
    scope.Complete();
}
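One caveat worth adding: for the Snapshot level to work, snapshot isolation must first be enabled on the SQL Server database, e.g. via ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON (the database name is a placeholder); otherwise the transaction fails at runtime.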
Read more on this issue:
http://blogs.msdn.com/b/diego/archive/2012/04/01/tips-to-avoid-deadlocks-in-entity-framework-applications.aspx
http://blogs.u2u.be/diederik/post/2010/06/29/Transactions-and-Connections-in-Entity-Framework-40.aspx
http://blog.aggregatedintelligence.com/2012/04/sql-server-transaction-isolation-and.html
https://serverfault.com/questions/319373/sql-deadlocking-and-timing-out-almost-constantly
Deadlock freedom is a pretty hard problem in a big system. It has nothing to do with EF by itself.
Shortening the lifetime of your transactions reduces deadlocks but it introduces data inconsistencies. In those places where you were deadlocking previously you are now destroying data (without any notification).
So choose your context lifetime and your transaction lifetime according to the logical transaction, not according to physical considerations.
Turn on snapshot isolation. This takes reading transactions totally out of the equation.
For writing transactions you need to find a lock ordering. Often it is the easiest way to lock pessimistically and at a higher level. Example: Are you always modifying data in the context of a customer? Take an update lock on that customer as the first statement of your transactions. That provides total deadlock freedom by serializing access to that customer.
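A sketch of that pattern in EF6 DbContext style (the context, table, column, and variable names are all hypothetical):
using (var ctx = new MyContext())
using (var tx = ctx.Database.BeginTransaction())
{
    // First statement of the transaction: take an update lock on the customer row,
    // so all writers for this customer are serialized and cannot deadlock each other.
    ctx.Database.ExecuteSqlCommand(
        "SELECT CustomerID FROM Customers WITH (UPDLOCK) WHERE CustomerID = @p0",
        customerId);

    // ... modify data belonging to this customer via the context ...
    ctx.SaveChanges();
    tx.Commit();
}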
The context is what gives Entity its ability to talk to the database; without a context there's no concept of what goes where. Spinning up a context, therefore, is kind of a big deal, and it occupies a lot of resources, including external resources like the database. I believe your problem IS the 'new' statement, since you have multiple threads attempting to spin up contexts and grab the same database resource, which definitely can deadlock.
Your code as you've posted it seems to be an anti-pattern. The way it looks, your Entity context spins up and goes out of scope relatively quickly, while your repository CRUD objects persist for a much longer time.
The companies I have implemented Entity for have traditionally done it exactly the opposite way: the context is created and kept for as long as the assembly needs the database, and the repository CRUD objects are created and die in microseconds.
I can't say where you got the assertion about the context not being shared, so I don't know under what circumstances it was said, but it is absolutely true that you should not share the context across assemblies. Within the same assembly, given how many resources it takes to start up a context and how long that takes, I can't see a reason why you wouldn't. The Entity context is quite heavy, and if you were to make your current code work by going single-threaded, I suspect you would see some absolutely atrocious performance.
So what I would recommend instead is to refactor this so you have Create(BackEndEntities context) and Update(BackEndEntities context), then have your master thread (the one spawning all these child threads) create and maintain a BackEndEntities context to pass along to its children. Also be sure you get rid of your AgentControllers and OrderControllers the instant you're done with them, and never, ever reuse them outside of a method. Implementing a good inversion-of-control framework like Ninject or StructureMap can make this a lot easier.
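A sketch of that refactoring, with hypothetical signatures built from the types in the question:
public class OrderController
{
    // The context is supplied by the caller instead of being created per call.
    public bool Update(BackEndEntities ctx, Order order)
    {
        order.ModificationDate = DateTime.Now;
        ctx.Orders.Attach(order);
        ctx.ObjectStateManager.ChangeObjectState(order, System.Data.EntityState.Modified);
        return ctx.SaveChanges() > 0;
    }
}

// The master thread owns the context and hands it to short-lived controllers:
using (var ctx = new BackEndEntities())
{
    var controller = new OrderController();
    controller.Update(ctx, order);
}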