Background
We are trying to archive old user data to keep our most common tables smaller.
Issue
Normal EF code for removing records works for our custom tables, but the AspNetUsers table is a different story. It appears that the way to delete users is with _userManager.Delete or _userManager.DeleteAsync. These work as long as I don't try to do multiple db calls in one transaction; when I wrap the calls in a TransactionScope, they time out. Here is an example:
public bool DeleteByMultipleIds(List<string> idsToRemove)
{
    try
    {
        using (var scope = new TransactionScope())
        {
            foreach (var id in idsToRemove)
            {
                var user = _userManager.FindById(id);
                //copy user data to archive table
                _userManager.Delete(user); //causes timeout
            }
            scope.Complete();
        }
        return true;
    }
    catch (TransactionAbortedException e)
    {
        Logger.Publish(e);
        return false;
    }
    catch (Exception e)
    {
        Logger.Publish(e);
        return false;
    }
}
Note that while the code is running, if I call the DB directly like:
DELETE
FROM ASPNETUSERS
WHERE Id = 'X'
it will also time out. The same SQL works fine before the C# code is executed, so it appears that making more than one db hit locks the table. How can I find the user (db hit #1) and delete the user (db hit #2) in one transaction?
For me, the problem involved the use of multiple separate DbContexts within the same transaction. The BeginTransaction() approach did not work.
Internally, UserManager.Delete() is calling an async method in a RunSync() wrapper. Therefore, using the TransactionScopeAsyncFlowOption.Enabled parameter for my TransactionScope did work:
using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
    _myContext1.Delete(organisation);
    _myContext2.Delete(orders);
    _userManager.Delete(user);
    scope.Complete();
}
Microsoft's advice is to use a different API when doing transactions with EF, because of how EF interacts with the TransactionScope class: implicitly, TransactionScope escalates the isolation level to Serializable, which is what causes the deadlock.
A good description of the EF transaction API is here: MSDN Link
For reference, you may need to look into whether your user manager exposes its underlying data context; if it does, you can replace your TransactionScope with using (var dbContextTransaction = context.Database.BeginTransaction()) { //code }
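A rough sketch of what that could look like, assuming you can get hold of the same context instance the UserManager's store was built on (called _context here; that name, and the assumption that the UserManager shares it, are mine rather than from the original code):
public bool DeleteByMultipleIds(List<string> idsToRemove)
{
    // Sketch only: works if _userManager's UserStore uses this same _context instance.
    using (var dbContextTransaction = _context.Database.BeginTransaction())
    {
        try
        {
            foreach (var id in idsToRemove)
            {
                var user = _userManager.FindById(id);
                //copy user data to archive table
                _userManager.Delete(user);
            }
            dbContextTransaction.Commit();
            return true;
        }
        catch (Exception e)
        {
            dbContextTransaction.Rollback();
            Logger.Publish(e);
            return false;
        }
    }
}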
Alternatively, looking at your scenario, you are actually quite safe just finding the user, then trying to delete it, and catching the error in case the user was deleted in the fraction of a second between the find and the delete.
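Roughly, reusing the names from the question (a sketch of that alternative, not tested against the original code):
var user = _userManager.FindById(id);
if (user != null)
{
    //copy user data to archive table
    try
    {
        _userManager.Delete(user);
    }
    catch (Exception e)
    {
        // the user was removed by someone else between the find and the delete
        Logger.Publish(e);
    }
}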
Related
I'm writing a WebApi solution and the architecture is designed so that only the controller layer can commit (save changes to the db) using the unit of work. I needed to do a batch insert (fewer than 100 entities). I tried to go with the following:
foreach (var req in Reqs)
{
    await unitOfWork.GetService<IReqService>().AddNewReq(req);
    try
    {
        await unitOfWork.Commit();
    }
    catch (Exception e) { }
}
But this fails if any one of the entities cannot be committed. As I understand it, the issue is that once I have one bad entity attached to the context, every time I try to commit I am committing everything not yet committed, which includes the new entity I'm trying to add AND the ones that previously failed, meaning nothing gets added at all.
I was reading around on how to solve this and saw something about detaching the entities that fail, but I'm not sure whether that would be the way to do it with such an architecture.
I think it's relevant to post the Commit code, so here it is:
public async Task Commit()
{
    if (_tenantDatabase != null)
        await _tenantDatabase.SaveChangesAsync();
}
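For reference, the detaching approach I read about would look roughly like the sketch below. It assumes direct access to the underlying _tenantDatabase context (plus System.Linq and the EF namespace for EntityState), which is exactly the kind of access I'm not sure this architecture should allow:
foreach (var req in Reqs)
{
    await unitOfWork.GetService<IReqService>().AddNewReq(req);
    try
    {
        await unitOfWork.Commit();
    }
    catch (Exception)
    {
        // Detach the entries that failed so the next Commit() doesn't retry them.
        foreach (var entry in _tenantDatabase.ChangeTracker.Entries()
                                             .Where(e => e.State == EntityState.Added)
                                             .ToList())
        {
            entry.State = EntityState.Detached;
        }
    }
}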
I need to wrap some pieces of code in a TransactionScope. The code inside this using statement calls a managed C++ library, which in turn calls some unmanaged code. I also want to update my database, which I access using Entity Framework.
Here comes the problem: when calling SaveChanges on the DbContext inside the TransactionScope, I always get some sort of timeout exception in the database layer. I've googled this and it seems to be a fairly common problem, but I haven't found any answers applicable to my situation. This is a snippet of my code:
using (var transactionScope = new TransactionScope())
{
    try
    {
        //Do call to the managed C++ Library
        using (var dbContext = _dbContextFactory.Create())
        {
            //doing some CRUD Operations on the DbContext
            //Probably some more dbContext related stuff
            dbContext.SaveChanges(); //Results with a timeout
        }
    }
    catch (Exception)
    {
        transactionScope.Dispose();
        throw;
    }
}
I'm using Entity Framework 6.1.3, so I can access BeginTransaction on the database, but I also need to wrap the C++ calls inside a TransactionScope.
Any suggestions?
You will need to pass in a TransactionOptions instance defining your timeout (how long to keep the transaction open). An example using the absolute upper limit would be:
TransactionOptions to = new TransactionOptions();
to.IsolationLevel = IsolationLevel.ReadCommitted;
to.Timeout = TransactionManager.MaximumTimeout;
using (TransactionScope ts = new TransactionScope(TransactionScopeOption.Required, to)) { }
You should definitely be aware of the impact of such a long-running transaction; this isn't typically required, and I'd highly recommend using something below MaximumTimeout that reflects how long you actually expect the work to run. Do your best to keep the time for which the transaction is held open as small as possible, and do any processing that doesn't have to be part of a single transaction outside the transaction scope.
It's also worth noting that, depending on the underlying database, it can enforce its own limits on transaction duration if configured to do so.
Using a try-catch structure, I'm trying to figure out what to do if an exception is caught at any point of the transaction. Below is a sample of the code:
try
{
    DbContext.ExecuteSqlCommand("BEGIN TRANSACTION");                 //Line 1
    DbContext.ExecuteSqlCommand("Some Insertion/Deletion Goes Here"); //Line 2
    DbContext.ExecuteSqlCommand("COMMIT");                            //Line 3
}
catch (Exception)
{
}
If the exception was thrown while executing Line 1, nothing needs to be done besides reporting the error. If it was thrown on the second line, I don't know whether I need to try to roll back the transaction that was successfully opened, and the same question applies if something goes wrong with the third line.
Should I just send a ROLLBACK anyway? Or send all the commands to the database in a single method call?
Inside the try-catch there's a loop performing many transactions like the one in the sample (and I need lots of small transactions instead of one big one so that the SQL Server '_log' file can be reused properly and doesn't grow unnecessarily).
If any of the transactions goes wrong I'll just need to delete them all and report what happened, but I can't turn this into one big transaction and simply roll it back, otherwise the log file grows to 40GB.
I think this will help:
using (var ctx = new MyDbContext())
{
    // begin a transaction in EF – note: this returns a DbContextTransaction object
    // and will open the underlying database connection if necessary
    using (var dbCtxTxn = ctx.Database.BeginTransaction())
    {
        try
        {
            // use DbContext as normal - query, update, call SaveChanges() etc. E.g.:
            ctx.Database.ExecuteSqlCommand(
                @"UPDATE MyEntity SET Processed = 'Done' "
                + "WHERE LastUpdated < '2013-03-05T16:43:00'");

            var myNewEntity = new MyEntity() { Text = @"My New Entity" };
            ctx.MyEntities.Add(myNewEntity);
            ctx.SaveChanges();

            dbCtxTxn.Commit();
        }
        catch (Exception e)
        {
            dbCtxTxn.Rollback();
        }
    } // if DbContextTransaction opened the connection then it will close it here
}
taken from: https://entityframework.codeplex.com/wikipage?title=Improved%20Transaction%20Support
Basically, the idea is that your transaction becomes part of the using block, and within that you have a try/catch around the actual SQL. If anything fails inside the try/catch, the transaction will be rolled back.
As of Entity Framework 6, ExecuteSqlCommand is wrapped in its own transaction, as explained here: http://msdn.microsoft.com/en-gb/data/dn456843.aspx
Unless you explicitly need to roll multiple SQL commands into a single transaction, there is no need to explicitly begin a new transaction scope.
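In other words, for the single-statement case you can drop the BEGIN TRANSACTION / COMMIT pair and rely on the transaction EF wraps around the call. A minimal sketch (MyDbContext stands in for the actual context type):
using (var ctx = new MyDbContext())
{
    // In EF6 this single command is wrapped in its own transaction automatically,
    // so no explicit BEGIN TRANSACTION / COMMIT is needed.
    ctx.Database.ExecuteSqlCommand("Some Insertion/Deletion Goes Here");
}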
With respect to transaction log growth, and assuming you are targeting SQL Server, setting the database's recovery model to SIMPLE will ensure the log gets recycled between checkpoints.
Obviously, if the transaction log history is not maintained across the entire import, there is no implicit mechanism to roll back all the data in case of failure. Keeping it simple, I would probably just add a 'created' datetime field to the table and, if I needed to remove all the rows after an error, delete from the table filtered on that created field.
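A minimal sketch of that cleanup idea (MyImportRows and Created are placeholder names, dbContext stands in for your context, and it assumes the Created column is stamped with UTC time on insert):
// Capture a timestamp before the import loop starts.
var importStartedAt = DateTime.UtcNow;
try
{
    // ... run the many small per-row transactions here ...
}
catch (Exception)
{
    // Remove everything this run inserted, based on the Created stamp, then report the failure.
    dbContext.Database.ExecuteSqlCommand(
        "DELETE FROM MyImportRows WHERE Created >= @p0",
        importStartedAt);
    throw;
}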
I've got a question about TransactionScope: I'd like to perform a transactional operation where I first run some CRUD operations (a transaction which inserts and updates some rows in the database) and get a result from the whole transaction (an XML document).
After I have the XML, I send it to a web service that my customer exposes to integrate with my system.
The point is: imagine that one day the WS my customer exposes goes down because of a weekly or monthly maintenance task performed by their IT area. Every time I run the whole process, the DB operations are performed, but of course an exception is thrown the moment I try to call the WS.
After searching on the internet I started to think of TransactionScope. My data access method, which lives in my data access layer, already has a TransactionScope in which I perform inserts, updates, deletes, etc.
The following code is what I'd like to try:
public void ProcessSomething()
{
    using (TransactionScope mainScope = new TransactionScope())
    {
        FooDAL dl = new FooDAL();
        string message = dl.ProcessTransaction();

        WSClientFoo client = new WSClientFoo();
        client.SendTransactionMessage(message);

        mainScope.Complete();
    }
}

public class FooDAL
{
    public string ProcessTransaction()
    {
        string transactionMessage = null;
        using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required,
            new TransactionOptions() { IsolationLevel = IsolationLevel.ReadCommitted }))
        {
            // Do Insert, Update, Delete and, according to the operation, generate a message
            scope.Complete();
        }
        return transactionMessage;
    }
}
The question is: is it correct to use TransactionScope to handle what I want to do?
Thanks a lot for your time :)
TransactionScopeOption.Required in your FooDAL.ProcessTransaction method means in fact: if there is a transaction available, reuse it in this scope; otherwise, create a new one.
So in short: yes, this is the correct way of doing this.
But be advised that if you don't call scope.Complete() in FooDAL.ProcessTransaction, the whole transaction is doomed: you'll get a TransactionAbortedException when mainScope is disposed at the end of its using block, even though mainScope.Complete() was called. This makes sense: if a nested scope decides that the transaction cannot be committed, the outer scope should not be able to commit it.
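A minimal sketch of that failure mode, just to illustrate where the exception surfaces:
using (var mainScope = new TransactionScope())
{
    using (var innerScope = new TransactionScope(TransactionScopeOption.Required))
    {
        // ... something goes wrong, so innerScope.Complete() is never called ...
    } // disposing the uncompleted inner scope marks the ambient transaction as aborted

    mainScope.Complete(); // only the root scope can actually commit, and it no longer can
} // TransactionAbortedException is thrown here, when mainScope is disposed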
So I'm working on this Entity Framework project that'll be used as kind of a DAL, and when running stress tests (starting a couple of updates on entities through Thread()s) I'm getting these:
_innerException = {"Transaction (Process ID 94) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction."}
Here's an example of how I implemented my classes' methods:
public class OrderController
{
    public Order Select(long orderID)
    {
        using (var ctx = new BackEndEntities())
        {
            try
            {
                var res = from n in ctx.Orders
                              .Include("OrderedServices.Professional")
                              .Include("Agency")
                              .Include("Agent")
                          where n.OrderID == orderID
                          select n;
                return res.FirstOrDefault();
            }
            catch (Exception ex)
            {
                throw ex;
            }
        }
    }

    public bool Update(Order order)
    {
        using (var ctx = new BackEndEntities())
        {
            try
            {
                order.ModificationDate = DateTime.Now;
                ctx.Orders.Attach(order);
                ctx.SaveChanges();
                return true;
            }
            catch (Exception ex)
            {
                throw ex;
            }
        }
    }
}
and:
public class AgentController
{
    public Agent Select(long agentID)
    {
        using (var ctx = new BackEndEntities())
        {
            try
            {
                var res = from n in ctx.Agents.Include("Orders")
                          where n.AgentID == agentID
                          select n;
                return res.FirstOrDefault();
            }
            catch (Exception ex)
            {
                throw ex;
            }
        }
    }

    public bool Update(Agent agent)
    {
        using (var ctx = new BackEndEntities())
        {
            try
            {
                agent.ModificationDate = DateTime.Now;
                ctx.Agents.Attach(agent);
                ctx.ObjectStateManager.ChangeObjectState(agent, System.Data.EntityState.Modified);
                ctx.SaveChanges();
                return true;
            }
            catch (Exception ex)
            {
                throw ex;
            }
        }
    }
}
Obviously the code here could probably be better, but I'm rather an EF newbie. Still, I think my problem is really a design problem with the context.
I remember someone here mentioning that if my context is NOT shared, I won't run into these deadlock issues.
This does not seem 'shared' to me, since I create a new BackEndEntities() in a using block in each method, so what do I have to change to make it more robust?
This DAL will be used in a web service exposed on the internet (after code review, of course), so I have no control over how much it'll be stressed, and lots of different instances might want to update the same entity.
Thanks!
The reason for those deadlocks isn't your code; it's that the TransactionScope used with EF defaults to SERIALIZABLE as its isolation level.
SERIALIZABLE is the most restrictive locking possible. It means that by default you are opting into the most restrictive isolation level, and you can expect a lot of locking!
The solution is to specify a different isolation level for the TransactionScope depending on the action you want to perform. You can surround your EF actions with something like this:
using (var scope = new TransactionScope(TransactionScopeOption.Required,
    new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot }))
{
    // do something with EF here
    scope.Complete();
}
Read more on this issue:
http://blogs.msdn.com/b/diego/archive/2012/04/01/tips-to-avoid-deadlocks-in-entity-framework-applications.aspx
http://blogs.u2u.be/diederik/post/2010/06/29/Transactions-and-Connections-in-Entity-Framework-40.aspx
http://blog.aggregatedintelligence.com/2012/04/sql-server-transaction-isolation-and.html
https://serverfault.com/questions/319373/sql-deadlocking-and-timing-out-almost-constantly
Deadlock freedom is a pretty hard problem in a big system. It has nothing to do with EF by itself.
Shortening the lifetime of your transactions reduces deadlocks but it introduces data inconsistencies. In those places where you were deadlocking previously you are now destroying data (without any notification).
So choose your context lifetime and your transaction lifetime according to the logical transaction, not according to physical considerations.
Turn on snapshot isolation. This takes reading transactions totally out of the equation.
For writing transactions you need to find a lock ordering. Often the easiest way is to lock pessimistically and at a higher level. Example: are you always modifying data in the context of a customer? Take an update lock on that customer as the first statement of your transactions. That provides total deadlock freedom by serializing access to that customer.
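As an illustrative sketch of that last idea, assuming SQL Server and an EF6 DbContext (the Customers table, CustomerID column and customerId variable are placeholders; on an ObjectContext you would use ExecuteStoreCommand instead):
using (var scope = new TransactionScope(TransactionScopeOption.Required,
    new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted }))
using (var ctx = new BackEndEntities())
{
    // Take an update lock on the customer row up front; every other writer touching
    // this customer queues behind it, which serializes access and avoids the deadlock cycle.
    ctx.Database.ExecuteSqlCommand(
        "SELECT CustomerID FROM Customers WITH (UPDLOCK, ROWLOCK) WHERE CustomerID = @p0",
        customerId);

    // ... modify this customer's orders, agents, etc. and call SaveChanges() ...

    scope.Complete();
}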
The context is what gives Entity Framework its ability to talk to the database; without a context there's no concept of what goes where. Spinning up a context, therefore, is kind of a big deal, and it occupies a lot of resources, including external resources like the database. I believe your problem IS the 'new' keyword, since you have multiple threads attempting to spin up and grab the same database resource, which definitely would deadlock.
Your code as you've posted it seems to be an anti-pattern. The way it looks, you have your Entity Context spinning up and going out of scope relatively quickly, while your repository CRUD objects seem to be persisting for a much longer time.
The companies I have implemented Entity Framework for have traditionally done it exactly the opposite way: the context is created and kept for as long as the assembly has need of the database, and the repository CRUD objects are created and die in microseconds.
I cannot say where you got the assertion about the context not being shared, or under what circumstances it was said, but it is absolutely true that you should not share the context across assemblies. Within the same assembly, I cannot see any reason why you wouldn't, given how many resources it takes to start up a context and how long it takes to do so. The Entity context is quite heavy, and if you were to make your current code work by going single-threaded, I suspect you would see some absolutely atrocious performance.
So what I would recommend instead is to refactor this so that you have Create(BackEndEntities context) and Update(BackEndEntities context) methods, then have your master thread (the one creating all these child threads) create and maintain a BackEndEntities context to pass along to its children. Also be sure that you get rid of your AgentControllers and OrderControllers the instant you're done with them and never, ever reuse them outside of a method. Implementing a good inversion of control framework like Ninject or StructureMap can make this a lot easier.
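A rough sketch of the shape being suggested, keeping the names from the question (note that EF contexts are not thread-safe, so each child thread would still need its own instance or synchronized access to a shared one):
public class OrderController
{
    // The context is created and owned by the caller (e.g. the master thread /
    // composition root) and passed in, instead of each method spinning up its own.
    public bool Update(BackEndEntities ctx, Order order)
    {
        order.ModificationDate = DateTime.Now;
        ctx.Orders.Attach(order);
        ctx.ObjectStateManager.ChangeObjectState(order, System.Data.EntityState.Modified);
        return ctx.SaveChanges() > 0;
    }
}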