I have a C# Windows service which talks to multiple databases on an MS SQL Server instance. It is multi-threaded and has many functions, each with a long list of database operations, and each of these functions runs under its own transaction. A typical function looks like this:
public void DoSomeDBWork()
{
using (TransactionScope ts = new TransactionScope(TransactionScopeOption.RequiresNew))
{
DatabaseUpdate1();
DatabaseUpdate2();
DatabaseUpdate3();
DatabaseUpdate4();
DatabaseUpdate5();
DatabaseUpdate6();
}
}
Under heavy load we are experiencing deadlocks. My question is: if I write some C# code to automatically resubmit the DatabaseUpdate in case of a deadlock, will it hold back resources for uncommitted operations? For example, if a deadlock exception occurs in DatabaseUpdate6() and I retry it 3 times with a wait of 3 seconds, will all the uncommitted operations (DatabaseUpdates 1 to 5) hold on to their resources during that time, which might further increase the chance of more deadlocks? Is it even good practice to retry in case of deadlocks?
You are barking up the wrong tree.
A deadlock means the entire transaction scope is undone. Depending on your application, you may be able to restart from the using block, i.e. with a new TransactionScope, but this is very unlikely to be correct. The reason you are seeing a deadlock is that someone else has changed data that you were also changing. Since most of these updates apply the update to a value previously read from the database, the deadlock is a clear indication that whatever you read has changed. So applying your updates without reading again will overwrite whatever was changed by the other transaction, causing lost updates. This is why a deadlock can almost never be 'automatically' retried: the new data has to be reloaded from the database; if user action was involved (e.g. a form edit) then the user has to be notified and has to re-validate the changes, and only then can the update be tried again. Only certain types of automatic processing can be retried, and they are never retried as in 'try to write again': they always act in a loop of read-update-write, and a deadlock simply causes the loop to try again. Since each iteration always starts with a read, they are automatically self-correcting.
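The read-update-write loop described above can be sketched like this; the helper class, the delegate, and the deadlock-detection predicate are all illustrative placeholders, not a prescribed API:

```csharp
using System;
using System.Threading;

public static class DeadlockRetry
{
    // Runs a read-update-write unit of work, restarting the WHOLE unit
    // (including the read) when the supplied predicate classifies the
    // exception as a deadlock. All names here are hypothetical.
    public static void Run(Action readUpdateWrite,
                           Func<Exception, bool> isDeadlock,
                           int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                readUpdateWrite(); // re-reads fresh data on every attempt
                return;
            }
            catch (Exception ex) when (isDeadlock(ex) && attempt < maxAttempts)
            {
                // brief backoff before restarting from the read
                Thread.Sleep(attempt * 100);
            }
        }
    }
}
```

The key point is that the retried unit begins with a fresh read, so a retry never re-applies stale values; once the attempts are exhausted, the exception filter stops matching and the deadlock exception propagates to the caller.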
That being said, your code most likely deadlocks because it abuses the Serializable isolation level when it isn't required: see using new TransactionScope() Considered Harmful. You should override the transaction options to use the ReadCommitted isolation level; Serializable is almost never required and is a guaranteed way to produce deadlocks.
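As a sketch, overriding the default (Serializable) isolation level looks like this; the factory-method wrapper is my own convenience, not a required pattern:

```csharp
using System;
using System.Transactions;

public static class TxScopes
{
    // new TransactionScope() defaults to IsolationLevel.Serializable;
    // passing explicit TransactionOptions selects ReadCommitted instead.
    public static TransactionScope CreateReadCommittedScope()
    {
        var options = new TransactionOptions
        {
            IsolationLevel = IsolationLevel.ReadCommitted,
            Timeout = TransactionManager.DefaultTimeout
        };
        return new TransactionScope(TransactionScopeOption.Required, options);
    }
}
```

Usage is unchanged from the original code: wrap the database work in the scope and call scope.Complete() before disposing.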
The second issue is: why does Serializable deadlock? It deadlocks because of table scans, which indicate that you don't have the proper indexes in place for your reads and your updates.
The last issue is that you use RequiresNew, which is, again, incorrect in 99% of cases. Unless you have a real deep understanding of what's going on and a bulletproof case for requiring a standalone transaction, you should use Required and enlist in the encompassing transaction of the caller.
This doesn't cover everything in your question, but on the subject of retries: the idea of retrying transactions, database or not, is dangerous, and you should not read on if the word "idempotent" means nothing to you. (Frankly, I don't know enough about it either, but my management had the final word, and off I went to write retries for deadlocks. I spoke to a couple of the smartest guys I know in this area and they all came back with "BAD, BAD", so I don't feel good about committing that source. Disclaimer aside, I had to do it, so I may as well make it fun.) Here's something I wrote recently to retry MySQL deadlocks a specified number of times before finally rethrowing:
Using an anonymous method, you only need one receiver that can dynamically handle method signatures and generic return types. You'll also need a similar one for void returns, which just needs to use Action. For MS SQL it will look pretty much identical, I think, minus the 'My' prefix.
The handler that does the retry:
private const int DB_DEADLOCK_RETRY_COUNT = 3; // max retries before rethrowing

private T AttemptActionReturnObject<T>(Func<T> action)
{
var attemptCount = 0;
do
{
attemptCount++;
try
{
return action();
}
catch (MySqlException ex)
{
if (attemptCount <= DB_DEADLOCK_RETRY_COUNT)
{
switch (ex.Number)
{
case 1205: //(ER_LOCK_WAIT_TIMEOUT) Lock wait timeout exceeded
case 1213: //(ER_LOCK_DEADLOCK) Deadlock found when trying to get lock
Thread.Sleep(attemptCount*1000);
break;
default:
throw;
}
}
else
{
throw;
}
}
} while (true);
}
Wrap your method call with delegate or lambda
public int ExecuteNonQuery(MySqlConnection connection, string commandText, params MySqlParameter[] commandParameters)
{
try
{
return AttemptActionReturnObject( () => MySqlHelper.ExecuteNonQuery(connection, commandText, commandParameters) );
}
catch (Exception ex)
{
throw new Exception("For SQL Statement: " + commandText, ex); // keep the original as the inner exception
}
}
it may also look like this:
return AttemptActionReturnObject(delegate { return MySqlHelper.ExecuteNonQuery(connection, commandText, commandParameters); });
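For SQL Server the handler keeps the same shape; the deadlock victim surfaces as a SqlException with Number == 1205 (a lock request timeout under SET LOCK_TIMEOUT is error 1222). A hedged sketch of just the classification and backoff pieces, with names of my own choosing, kept separate so the logic stays testable without a database:

```csharp
using System;

public static class SqlRetryPolicy
{
    // SQL Server reports the deadlock victim as error 1205 and a lock
    // request timeout (SET LOCK_TIMEOUT) as error 1222.
    public static bool IsRetryable(int sqlErrorNumber) =>
        sqlErrorNumber == 1205 || sqlErrorNumber == 1222;

    // Same linear backoff as the MySQL handler above: 1s, 2s, 3s, ...
    public static TimeSpan Backoff(int attemptCount) =>
        TimeSpan.FromSeconds(attemptCount);
}
```

In the catch block you would then write catch (SqlException ex) when (SqlRetryPolicy.IsRetryable(ex.Number)) instead of switching on MySqlException.Number.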
When SQL Server detects a deadlock, it kills one thread and reports an error. If your thread is killed, it automatically rolls back any uncommitted transactions - in your case ALL of the DatabaseUpdate*() calls that had already run during this most recent transaction.
The ways to deal with this depend entirely on your environment. If you have something like a control table or a string table which is not updated but frequently read, you can use NOLOCK... cue kicking and screaming... It is actually quite useful when you aren't worried about time-sensitive or transaction-sensitive information. However, when you are dealing with volatile or stateful information you cannot use NOLOCK, because it will lead to unexpected behavior.
There are two ways to handle deadlocks that I use. Either restart the transaction from the beginning when you detect a failure, or read in your variables immediately before you use them and execute afterwards. The second is something of a resource hog and causes a significant decrease in performance, so it should not be used for high-volume functionality.
I think different database servers may respond to a deadlock differently; however, with SQL Server, if two transactions are deadlocked, one is elected by the server as the deadlock victim (error 1205) and that transaction is rolled back. This means, of course, that the other transaction is able to proceed.
If you're the deadlock victim, you will have to redo all your database updates, not just DatabaseUpdate6().
In response to comments about avoiding deadlocks with hints such as NOLOCK, I would strongly recommend against it.
Deadlocks are simply a fact of life. Imagine two users each submitting a manual journal entry into an accounting system.
The first entry credits the bank account and debits receivables.
The second entry debits receivables and credits the bank account.
Now imagine both transactions running at the same time (something that rarely, if ever, happens in testing):
Transaction 1 locks the bank account.
Transaction 2 locks the receivables account.
Transaction 1 tries to lock receivables and blocks, waiting on transaction 2.
Transaction 2 tries to lock the bank account, and a deadlock is automatically and instantly detected.
One of the transactions is elected as the deadlock victim and is rolled back. The other transaction proceeds as if nothing happened.
Deadlocks are a reality, and the way to respond to them is quite straightforward: "please hang up and try your call again."
See MSDN for more information on Handling Deadlocks with SQL Server
Related
I am calling a few stored procedures (functions in PostgreSQL) using ExecuteNonQuery inside a C# transaction. My SPs look like:
CREATE OR REPLACE FUNCTION public.setuserstatus(
par_userid integer,
par_status character varying)
RETURNS void
LANGUAGE 'plpgsql'
COST 100
VOLATILE PARALLEL UNSAFE
AS $BODY$
BEGIN
UPDATE public.user
SET status = par_status
WHERE userid = par_userid;
END;
$BODY$;
So what is the best practice for calling this function? Is the code below enough, or should I run it inside a transaction and use commit and rollback? Please suggest.
using (var conn = new NpgsqlConnection(_connectionString))
{
await conn.OpenAsync(ct);
using (var cmd = new NpgsqlCommand("irnutil.setcrdsstatusforapmaccount", conn))
{
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.AddWithValue(ApmAccountIdParameterName, accountId);
cmd.Parameters.AddWithValue(CrdsStatusParameterName, crdsStatus.ToString());
await cmd.ExecuteNonQueryAsync(ct);
}
}
Also, the Npgsql documentation says "However, for maximum portability it's recommended to set the transaction on your commands." Does that mean we should use a transaction here?
https://www.npgsql.org/doc/transactions.html
It depends on what you want to achieve.
The general idea of a transaction is a unit of work that is performed against a database.
However, you will not want to use a transaction over select queries.
For example:
select * from some_table where condition = {condition}
will not be a candidate for a proper transaction block.
Your example of update, is a good candidate.
Now comes the question: should I do it at code level or at database level?
The answer is again a question - What do you want to achieve?
If you perform it at database level, you lose the ability to log errors. Yes, the transaction, if written correctly to satisfy your conditions, will roll back the operation, but you will have no further details, meaning you will have to use a profiler to determine the cause of the failure. On the other hand, your transaction will be safe and secured from the 'outside world': if it happens that someone changes your code or a harmful attack takes place, they will not be able to change the transaction, and the data will be safer.
If you perform it at code level, you will be able to log the exception and review the cause of the failure, and it will be easier to handle programmatically; however, it will be exposed to code changes and possible malware attacks, as I mentioned above.
A deadlock situation may occur during the execution of a transaction, meaning that if you execute a block of SQL wrapped in a transaction, you cannot execute it again until the transaction has finished its job; otherwise you may encounter a deadlock and an exception will be thrown. In that case it is recommended to create a queue to monitor your calls to the transaction and check whether you are clear to make another execution or not.
Again - this is all your decision according to your needs.
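If you do decide to manage the transaction at code level, a minimal sketch of the commit/rollback pattern looks like the following. It is written against the provider-neutral System.Data.Common base classes (Npgsql's NpgsqlConnection and NpgsqlTransaction derive from these), and the helper name is my own invention:

```csharp
using System;
using System.Data.Common;
using System.Threading;
using System.Threading.Tasks;

public static class TxHelper
{
    // Wraps a single command in an explicit transaction: commit on
    // success, roll back and rethrow on any failure.
    public static async Task ExecuteInTransactionAsync(
        DbConnection conn, string commandText,
        Action<DbCommand> addParameters, CancellationToken ct = default)
    {
        await using var tx = await conn.BeginTransactionAsync(ct);
        try
        {
            await using var cmd = conn.CreateCommand();
            cmd.Transaction = tx;             // "set the transaction on your commands"
            cmd.CommandText = commandText;
            addParameters(cmd);
            await cmd.ExecuteNonQueryAsync(ct);
            await tx.CommitAsync(ct);
        }
        catch
        {
            await tx.RollbackAsync(ct);
            throw;
        }
    }
}
```

For the single-UPDATE setuserstatus function from the question, the implicit per-statement transaction already makes the call atomic; an explicit transaction only becomes necessary once several calls must succeed or fail together.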
I have a problem with two requests entering the same code simultaneously. I implemented the following solution, which I read about in this article: https://www.developerfusion.com/article/84514/deep-c-8211-avoiding-race-conditions/
Code:
public class TransactionManager
{
static readonly object MyCountLock = new object();
public void ExectuteSQLStatments()
{
try
{
Monitor.Enter(MyCountLock);
// SQL statements that must be executed by one thread (or entering SQL).
}
finally
{
Monitor.Exit(MyCountLock);
}
}
}
I am not sure that this is the best way to protect the execution of the statements, with only one thread at a time entering the 'Monitor' block. That code was recently affected by a race condition, and I want to prevent that condition from happening again.
It depends on what you are trying to achieve. If it is database locks that you are trying to achieve, there are multiple ways to do it.
The way you did it is a perfectly good example and is just fine. However, it does not protect your database from deadlocks: if there is another piece of code that handles the same tables, you may still end up in a deadlocked situation.
You can also use the lock statement; check the difference with Monitor here.
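For reference, the lock statement is just compiler sugar over Monitor.Enter/Monitor.Exit with the try/finally handled for you; the class below is a hypothetical reworking of the code from the question (the delegate stands in for the SQL statements):

```csharp
using System;

public class TransactionManager
{
    private static readonly object SqlGate = new object();

    public void ExecuteSqlStatements(Action sqlWork)
    {
        // Equivalent to Monitor.Enter / try / finally / Monitor.Exit,
        // but the compiler emits the finally block for you.
        lock (SqlGate)
        {
            sqlWork(); // only one thread at a time gets here
        }
    }
}
```

Note this only serializes threads inside this one process; it does nothing about other processes or other code paths hitting the same tables.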
You can also lock within the database itself. Acquire a lock on the rows you fear may deadlock before doing anything else. You can do this with a mock update (updating a column to its current value) of the rows you want to change; that way the next command will wait for you to commit the transaction before it can start. This has the added value that other rows, not used in the same transaction, can still be updated.
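A sketch of that mock-update idea, again using the provider-neutral ADO.NET types; the table and column names are hypothetical:

```csharp
using System.Data.Common;

public static class RowLock
{
    // Issues a no-op UPDATE inside the caller's transaction. SQL Server
    // takes and holds an exclusive lock on the matched rows until the
    // transaction commits or rolls back, so competing writers queue up
    // behind this statement instead of interleaving.
    public static void LockAccountRow(DbConnection conn, DbTransaction tx, int accountId)
    {
        using var cmd = conn.CreateCommand();
        cmd.Transaction = tx;
        cmd.CommandText = "UPDATE Accounts SET Balance = Balance WHERE AccountId = @id";
        var p = cmd.CreateParameter();
        p.ParameterName = "@id";
        p.Value = accountId;
        cmd.Parameters.Add(p);
        cmd.ExecuteNonQuery();
    }
}
```

The important design point is that every writer must acquire the lock in the same order (e.g. always the account row first), otherwise you have just moved the deadlock rather than removed it.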
As always, it depends on what you want to achieve. Provide some extra information and code for more help on this.
I have the following code, intended to break a bulk EF save into smaller chunks, ostensibly to improve performance.
var allTasks = arrayOfConfigLists
.Select(configList => Task.Run(() => SaveConfigurations(configList)))
.ToArray();
Task.WaitAll(allTasks);
Each call to SaveConfigurations creates a new context that runs to completion.
private static void SaveConfigurations(List<Configuration> configs)
{
using (var dc = new ConfigContext())
{
dc.Configuration.AutoDetectChangesEnabled = false;
dc.SaveConfigurations(configs);
}
}
As it stands, the code runs relatively efficiently, considering this might not be the optimal way of doing things. If one of the SaveConfigurations fails, however, I realized I would need to roll back any other configurations that were saved to the database.
After doing some research, I upgraded my existing frameworks to 4.5.1 and utilized the new TransactionScopeAsyncFlowOption.Enabled option to deal with async calls. I made the following change:
using (var scope =
new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
//... allTasks code snippet from above
scope.Complete();
}
At this point, I started aggregating all kinds of interesting errors:
The operation is not valid for the state of the transaction.
The underlying provider failed on Open.
Network access for Distributed Transaction Manager (MSDTC) has been disabled.
The transaction manager has disabled its support for remote/network transactions.
What I don't understand is why introducing TransactionScope would create so many issues. I assume I have a fundamental misunderstanding of how async calls interact with EF, and how TransactionScope wraps those calls, but I can't figure it out. And I really have no clue what the MSDTC exception pertains to.
Any thoughts as to how I could have rollback functionality with asynchronous calls made to the same database? Is there a better way to handle this situation?
Update:
After reviewing the documentation here, I see that Database.BeginTransaction() is the preferred EF call. However, this assumes that all of my changes will occur within the same context, which it won't. Short of creating a dummy context and passing around a transaction, I don't believe this solves my issue.
This has nothing to do with async. You are writing on multiple connections and want that to be atomic. That requires distributed transactions. There is no way around that.
You also might run into distributed deadlocks this way that will only be resolved by timeouts.
Probably the best approach is to stop using multiple connections. If performance is such a concern, consider making the writes using one of the well-known bulk DML techniques that do not involve EF.
You can use MARS to make concurrent writes on the same connection, but they are really executed serially on the server. This could provide a small speedup due to pipelining effects, but it's likely not worth the trouble.
How about this: it creates only one context and attaches all the entities to that context.
See entity framework bulk insertion.
If anything goes wrong during the insertion, the entire transaction is rolled back. If you want a more transaction-like pattern, implement the Unit of Work pattern; as far as I know, Entity Framework itself already implements Unit of Work.
public void SaveConfigurations(List<Configuration> configs)
{
try
{
using (var dc = new ConfigContext())
{
dc.Configuration.AutoDetectChangesEnabled = false;
foreach(var singleConfig in configs)
{
//Do not invoke dc.SaveChanges inside the loop.
//Assuming SaveConfiguration is your table's DbSet.
//This adds the entity to the DbSet<T>; nothing is inserted into the DB until you invoke SaveChanges.
dc.SaveConfiguration.Add(singleConfig);
}
dc.Configuration.AutoDetectChangesEnabled = true;
dc.SaveChanges();
}
}
catch (Exception)
{
throw; // rethrow without resetting the stack trace
}
}
Working with:
.NET 4.5.1
Web Forms
Entity Framework 6 (Context Per Request)
IIS 8, Windows 2012 Datacenter
Main concern: Thread safety and reliability.
The project consist of a closed system that will be used by numerous types of users which will execute various actions. In our current project I've decided to implement something that probably most developers find absolutely necessary.
There were serious problems in the past deriving from the lack of even the simplest logging system that would allow us to track some common user actions, especially manipulating data.
I know there are popular logging frameworks, but I want to achieve something relatively simple that will not block the main thread.
The idea was to pull all the data I need while on the main thread, because it turned out some of the data was harder to access from a separate thread, and then create a task to take care of the database insert. My knowledge of multi-threading is limited, and this is what I came up with.
public static class EventLogger
{
public static void LogEvent(int eventType, string descr, Exception ex = null)
{
//Get the page that is currently being executed;
var executingPage = HttpContext.Current.CurrentHandler as Page;
string sourcePage = executingPage != null ? executingPage.AppRelativeVirtualPath : string.Empty;
var eventToAdd = new Event()
{
Date = DateTime.Now,
EventTypeId = eventType,
isException = ex != null ? true : false,
ExceptionDetails = ex != null ? ex.Message : string.Empty,
Source = sourcePage,
UserId = UserHelper.GetUserGuid(),
UserIP = UserHelper.GetUserIP(),
UserName = UserHelper.GetUserName(),
Description = descr
};
Task.Factory.StartNew(() => LogEventAsync(eventToAdd));
}
private static void LogEventAsync(Event eventToAdd)
{
using (var context = new PlaceholderEntities())
{
context.Events.Add(eventToAdd);
context.SaveChanges();
}
}
}
Questions:
Would this be a good-enough way to log what I need and is it safe in a multi-user environment?
How would you do it if you didn't want to dig into logging frameworks?
Yes, it is safe in a multithreaded environment. Because you are creating a new instance of the database context every time you insert an event into the database, you are safe. EF is only unsafe if you try to reuse the same context instance across threads.
The only possible issue is that doing this in an async way possibly means that multiple connections are opened concurrently and the connection pool could be depleted. The more often you log, the higher is the possibility that this happens.
Having said this, I still recommend you use log4net or another existing logging infrastructure and just find a way to log asynchronously. People often blog about how to do this with log4net; take a look here, for example:
http://www.ben-morris.com/using-asynchronous-log4net-appenders-for-high-performance-logging
Note that issues related to async logging are also discussed there. To work around some of them, like the possibility of misordering entries, you could have a queue in such a custom appender. Sticking with an existing framework lets you reuse a lot of ideas that are already developed. Going it alone will, sooner or later, stop you for longer when requirements change.
As mentioned by @gbjbaanb, you should implement this as a queue with one or more threads actually doing the database work via EF. You can use BlockingCollection<T> backed by a ConcurrentQueue<T> to do this in a producer/consumer fashion. The basic premise is that each caller logging something simply adds to the queue (the producer). You can then have one or more threads pulling information off the queue and persisting it to the database (the consumer). You'd need a separate context for each thread.
There's a reasonable example on the BlockingCollection<T> documentation pages.
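A minimal sketch of that producer/consumer shape follows; the string payload and the persistence callback are stand-ins for the Event entity and the EF SaveChanges call from the question:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class AsyncEventLog : IDisposable
{
    private readonly BlockingCollection<string> _queue =
        new BlockingCollection<string>(new ConcurrentQueue<string>());
    private readonly Task _consumer;

    public AsyncEventLog(Action<string> persist)
    {
        // Single consumer thread: one DbContext per thread is safe with EF,
        // e.g. persist = e => { using var ctx = new PlaceholderEntities(); ... }
        _consumer = Task.Run(() =>
        {
            foreach (var entry in _queue.GetConsumingEnumerable())
                persist(entry);
        });
    }

    // Called from request threads; enqueues and returns without touching the DB.
    public void Log(string entry) => _queue.Add(entry);

    public void Dispose()
    {
        _queue.CompleteAdding();   // let the consumer drain remaining entries
        _consumer.Wait();
        _queue.Dispose();
    }
}
```

Because a single consumer drains the queue, entries are persisted in the order they were enqueued, which also addresses the misordering concern raised above.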
No, it's not. EF6 is not thread-safe (well, DbContext is not thread-safe, so if you have one DbContext per thread you should be OK), so adding your log event can (ha! will) interfere with another thread writing a different log event.
What you will need to do is synchronise all the calls to EF in your task. What I'd do is add the log entry to a collection and then have another thread that pulls these entries off the collection and writes them to the DB, one at a time. The add/remove calls on the collection must be protected with a lock to stop two threads adding (or removing) an entry simultaneously.
Or I'd use log4net, which is thread-safe, so you can simply call it in your thread to do the writing or the log entries.
My other advice is to write the logs to a file rather than the DB. File logging is quick, so you can do it on the main thread with minimal performance impact (log4net is very efficient too, so just use that; I know IIS writes its access and error logs to file anyway), which removes your threading issue.
One thing I really do know is that if you're not too hot on multi-threading, you need to stop doing it until you are. No disrespect for trying, but threading bugs can be f***** nightmares to solve: you cannot reproduce them or easily catch them in a debugger, and their cause is often unrelated to the symptoms that are reported.
Here is a quote: "So, if you are working with only one object context then you already have built-in support for database transactions when using the ObjectContext.SaveChanges method." I found it here: http://www.luisrocha.net/2011/08/managing-transactions-with-entity.html
So according to that, I don't have to use TransactionScope in a code below, right?
if (isLastCallSuccess)
{
if (condition1) //it's clear, no transaction needed
{
product.Property1 = true;
context.SaveChanges();
}
else if (condition2)
{
using (TransactionScope scope = new TransactionScope()) //do I need it?
{
context.DeleteObject(item); //deleting
context.AddObject("product", new product //adding
{
Id = oldObject.Id,
Property1 = true
});
context.SaveChanges(System.Data.Objects.SaveOptions.DetectChangesBeforeSave);
scope.Complete();
context.AcceptAllChanges();
}
}
}
What the quote means is that a single call to SaveChanges is automatically wrapped in a transaction, so it's atomic. You, however, are calling SaveChanges multiple times, so in order for the larger operation to be atomic you'll need to use the transaction as you currently have it.
So yes, you need it.
I would personally keep TransactionScope in, so everything commits as a whole unit or rolls back upon an error (i.e. your save or add fails). If concurrency is a major part of your application, using it will benefit your users by ensuring the integrity of the data.
I believe in your scenario you do need to use a transaction. SaveChanges creates an implicit transaction such that when it goes to persist a change to any of the objects, and that change cannot be persisted, it rolls back all other changes it attempted to make. But the transaction created by SaveChanges only lives as long as the call itself. If you are calling SaveChanges twice and want the actions of the first call to rollback if the second call fails, then yes, you need a transaction that wraps both calls, which the code you posted does just that.
I disagree: because you have multiple operations on your data, you would want to make sure that the operations either succeed completely or fail completely (i.e. are atomic). Ensuring atomicity is also simply good practice.
If your delete worked, but your add failed, you would be left with a database in a bad state. At least if you had a transaction, the database would be back to the original state before you attempted the first operation.
EDIT:
Just for completeness: inside a transaction, having the ability to roll back at any point is crucial once you start to manipulate multiple tables in the same method/process.
From how I am reading it, you are worried about the delete and the add not committing to the database, and, if there is a failure, rolling the transaction back.
I don't think you need to wrap your insert and delete in a transaction because, as mentioned above, it is all happening in one SaveChanges() call, which implicitly has transaction management; so if it did fail, the changes would be rolled back.