I am well familiar with transactions and what they are designed to achieve; however, when working with a large, multi-user system, I can't seem to find much guidance on when they should be employed and what the potential overheads are.
It is my understanding that when you execute a SqlCommand and do not specify a Transaction, SQL Server will implicitly create a transaction. So I assume that if I explicitly created a Transaction (with the same IsolationLevel as the default), there would be no added overhead.
TransactionScope, however, as I understand it, enlists any service that is "transaction aware" within its scope and creates a transaction for it; any subsequent services will also join that transaction unless specifically told not to, and if services use other databases, TransactionScope will escalate it to a distributed transaction.
Is there any added overhead to TransactionScope that would preclude it from being used heavily in a high-performance, multi-user system? Can I wrap all of my data access code within a TransactionScope object and not have to worry about locking or slow performance?
TransactionScope provides functionality to convert a set of operations into a transaction, so that either all complete successfully or none do. My question is: does TransactionScope apply only to certain types of operations (e.g. only SQL connections, Azure, etc.)?
For example, consider the code block below
using (TransactionScope scope = new TransactionScope())
{
    SaveToSQLserver(parameter);
    SaveToSalesForce(parameter);
    SaveToSAP(parameter);
    SaveToAzure(parameter);
    scope.Complete();
}
Now suppose an error occurs at SaveToSAP, after it has already saved to Salesforce; how is the transaction going to roll back the Salesforce change? And if everything is held in memory, how is it going to make sure that the actual save will succeed?
A TransactionScope is capable of supporting a distributed transaction across many different types of systems, but it is not automatic. This documentation provides a glimpse into that (it's worth checking out the whole of the document hierarchy on Transaction Processing).
As mentioned by Dave Anderson in the comments, a Resource Manager controls what is done during commit and rollback, so the "how is it done" is governed individually by each resource manager.
So, can other things participate in a transaction scope besides just SQL? Yes, they can. As long as a Resource Manager exists for each system (e.g. a messaging system), it can participate (enlist).
If you are working with something that can't enlist, you have to manually do a compensating transaction when you detect you need to rollback (usually when an exception occurs).
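A compensating transaction can be sketched roughly as follows. This is a minimal in-memory illustration, not a real Salesforce/SAP integration: each successful step records an "undo" action, and if a later step fails, the recorded actions run in reverse order. All names here are hypothetical placeholders.

```csharp
using System;
using System.Collections.Generic;

public static class CompensationDemo
{
    // A visible record of what happened, standing in for real side effects.
    public static List<string> Log = new List<string>();

    public static void Run(bool failAtSap)
    {
        var compensations = new Stack<Action>();
        try
        {
            // Step 1: pretend to save to Salesforce, record how to undo it.
            Log.Add("saved:salesforce");
            compensations.Push(() => Log.Add("undone:salesforce"));

            // Step 2: pretend the SAP save fails.
            if (failAtSap)
                throw new InvalidOperationException("SaveToSAP failed");

            Log.Add("saved:sap");
            compensations.Push(() => Log.Add("undone:sap"));
        }
        catch
        {
            // Manual "rollback": run compensating actions newest-first.
            while (compensations.Count > 0)
                compensations.Pop()();
            throw;
        }
    }
}
```

Note that compensation is best-effort: the undo step itself can fail, which is one reason a real resource manager is preferable whenever one exists.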
I have been using NHibernate for a while and came across the code below that uses a Transaction scope.
using (var scope = new TransactionScope(TransactionScopeOption.Required))
{
    using (var session = sessionFactory.OpenSession())
    {
        using (var transaction = session.BeginTransaction())
        {
            // do work
        }
    }
}
I generally do everything without wrapping the code in a TransactionScope. Am I doing something wrong, or am I just missing out on some beautiful functionality?
The usage is: transactions. Whether that is a benefit is more complex. There are more direct ways of achieving transactions: ADO.NET transactions. These are a little awkward to work with (you need to remember to set the transaction on every command), but they are very efficient.
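The awkwardness mentioned here looks roughly like this. This is a sketch only; the connection string and INSERT statements are hypothetical, and the point is that the transaction object must be assigned to every command:

```csharp
using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    using (SqlTransaction tx = conn.BeginTransaction())
    {
        // The transaction is passed explicitly to each command.
        using (var cmd1 = new SqlCommand("INSERT INTO Orders ...", conn, tx))
            cmd1.ExecuteNonQuery();

        // Forgetting to pass "tx" here throws an InvalidOperationException,
        // because the connection has a pending local transaction.
        using (var cmd2 = new SqlCommand("INSERT INTO OrderLines ...", conn, tx))
            cmd2.ExecuteNonQuery();

        tx.Commit(); // or tx.Rollback() on failure
    }
}
```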
Transaction scope has the advantage of an ambient transaction; this makes it easier to work with. However, it works in a different way. In particular, transaction-scope supports multiple resource transactions - which can mean multiple databases etc. This is typically done via DTC, but DTC has overheads - it is more expensive (and requires specific firewall configuration, etc). In many single-database cases it can short-cut and use the LTM instead of full DTC, but this is still more expensive than ADO.NET transactions... just not as expensive as DTC.
A powerful feature, but make sure you intend to use it before you do ;p
If you are not using any TransactionScope explicitly, every statement you execute on the database will run in a separate transaction.
With a TransactionScope you can bundle multiple statements into one big transaction and undo everything as a block.
This is necessary when you update multiple tables with multiple statements but are logically performing one big operation that has to succeed completely or not happen at all.
You are missing out on some beautiful functionality: with the transaction scope in place, your code will participate in the ambient transaction if it is invoked from inside a piece of code running in its own transaction scope. Without a transaction scope, your code will have its own transaction (from the deepest nested block), which could fail without failing the outer transaction.
In addition, anything inside your // do work block would have an easier time participating in your transaction if you put a transaction scope around it. The code would be able to abort your transaction without having to propagate an error code up the chain or throw an exception, which could potentially be ignored by code in the middle.
Note: Don't forget to call scope.Complete() on the transaction scope before the end of its using block.
Are you using transactions at all? If you aren't, then you should be.
I personally don't use TransactionScope in any of my NHibernate code, but then again I don't need it. In a web environment with a single database, it's not really necessary: my unit of work is a web request. I open my connection on BeginRequest and close it on EndRequest. I use a generic repository that starts a transaction if none is present, and I defined a TransactionAttribute that decorates controller actions so that all table updates are performed in a single transaction.
TransactionScope is just Microsoft's generic way to put all transaction-aware resources into a single transaction. This may be multiple databases, a transactional file system, and so on. The thing to worry about in these scenarios is that the transaction will most likely be promoted to the DTC to coordinate all the updates.
Also, I don't know if it's still an issue, but older versions of NHibernate used to have a memory leak associated with using TransactionScope.
Here's a link with everything you need to know about TransactionScope: http://www.codeproject.com/Articles/690136/All-About-TransactionScope
Since I have a "DB util" class with a DataSet QueryDB(string spName, DBInputParams inputParams) method which I use for all my calls to the database, I would like to reuse this method in order to support transacted calls.
So, in the end I will have a SqlDataAdapter.Fill within a SqlTransaction. Would this be bad practice? I rarely see DataAdapter.Fill used within a transaction, and more often ExecuteReader(). Is there any catch?
Edit1: The thing is that inside my transaction I often also need to retrieve some data (e.g. auto-generated IDs)... that's why I would like to get it back as a DataSet.
Edit2: Strangely, when I use this approach in a for loop (10,000 iterations) from 2 different processes, I get "Transaction (Process ID 55) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction." Is this the right behaviour?
Edit3 (answer for Edit2): I was using IDENT_CURRENT('XTable'), which was the source of the error. After I went back to SCOPE_IDENTITY(), everything was solved.
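For concreteness, a transacted overload of the helper described above might look like the sketch below. DBInputParams and the parameter-binding step are the question's own types and are only assumed here; the key detail is that the command carries the transaction, so Fill runs inside it:

```csharp
using System.Data;
using System.Data.SqlClient;

public DataSet QueryDB(string spName, DBInputParams inputParams, SqlTransaction tx)
{
    // Reuse the transaction's own connection and enlist the command in it.
    var cmd = new SqlCommand(spName, tx.Connection, tx)
    {
        CommandType = CommandType.StoredProcedure
    };
    // ... bind inputParams to cmd.Parameters here ...

    var ds = new DataSet();
    using (var adapter = new SqlDataAdapter(cmd))
        adapter.Fill(ds); // runs inside tx, so it can read back
                          // not-yet-committed rows such as new IDs
    return ds;
}
```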
It is not a bad practice. One thing to remember is that every statement uses an implicit transaction that is automatically committed when the statement ends. That is, a SELECT (like the SELECT used by Fill) will always use a transaction; the question is only whether it has to start one itself or can use an existing one.
Is there any difference between the number, type and duration of locks acquired by a SELECT in an implicit transaction vs. an explicit transaction? Under the default transaction model (READ COMMITTED isolation) NO, there is none. The behavior is identical and indistinguishable. Under other isolation levels (repeatable read, serializable) there is a difference, but that is the necessary difference for the desired higher isolation level to occur and using an explicit transaction is the only way to achieve this desired isolation level, when necessary.
In addition if the SELECT has to read the effects of a transaction that is pending (not yet committed), as in your example (read back the generated IDs) then there is no other way. The SELECT must be part of the transaction that generated the IDs, otherwise it will not be able to see those uncommitted IDs!
A note of caution, though. You have at your disposal a great tool that can make all this transaction handling much easier: System.Transactions. All ADO.NET code is System.Transactions-aware and will automatically enlist any connection and command into the pending transaction if you simply declare a TransactionScope. That is, if function Foo declares a TransactionScope and then calls function Bar, and Bar does any ADO.NET operation, that operation will automatically be part of the transaction declared in Foo, even if Bar does nothing explicitly. The TransactionScope is hooked into the thread context, and all ADO.NET calls made by Bar will check for this context automatically and use it. Note that I really mean any ADO.NET call, including Oracle provider ones.
There is a warning, alas: using new TransactionScope() Considered Harmful. The default constructor of TransactionScope creates a Serializable transaction, which is overkill. You have to use the constructor that takes a TransactionOptions object and change the isolation level to ReadCommitted. A second gotcha with TransactionScope is that you have to be very careful how you manage connections: if you open more than one connection under a scope, they will be enrolled in a distributed transaction, which is slow, requires MSDTC to be configured, and leads to all sorts of hard-to-debug errors. But overall I feel that the benefits of using TransactionScope outweigh the problems, and the resulting code is always more elegant than passing an IDbTransaction around explicitly.
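Concretely, the non-harmful construction looks like the following. This runs on its own without touching a database (no connection ever enlists, so nothing escalates), and it just demonstrates that the scope's ambient transaction really is ReadCommitted:

```csharp
using System;
using System.Transactions;

public static class ScopeDemo
{
    public static IsolationLevel RunAndReportIsolation()
    {
        // Override the Serializable default from the parameterless constructor.
        var options = new TransactionOptions
        {
            IsolationLevel = IsolationLevel.ReadCommitted,
            Timeout = TransactionManager.DefaultTimeout
        };

        using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
        {
            // Any ADO.NET connection opened here would enlist automatically.
            IsolationLevel level = Transaction.Current.IsolationLevel;
            scope.Complete();
            return level;
        }
    }
}
```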
It is a bad practice because, while the transaction is open, the records/pages/tables that you change stay locked for the duration of the transaction. The Fill just makes the whole process keep those resources locked longer. Depending on your SQL Server settings, this could block other access to those resources.
That said, if it is necessary, it is necessary, just realize the penalty for doing it.
What are the good and bad points of the TransactionScope class in C#?
Thanks.
Some advantages from MSDN:
TransactionScope Benefits
- The code inside the transactional scope is not only transactional, it is also promotable. The transaction starts with the LTM and System.Transactions will promote it as required, according to the nature of its interaction with the resources or remote objects.
- The scope is independent of the application object model: any piece of code can use the TransactionScope and thus become transactional. There is no need for a special base class or attributes.
- There is no need to enlist resources explicitly with the transaction. Any System.Transactions resource manager will detect the ambient transaction created by the scope and automatically enlist.
- Overall, it is a simple and intuitive programming model, even for the more complex scenarios that involve transaction flow and nesting.
Good side:
Can do transactions beyond the database context: insert a record into the DB and write a file to disk in one transaction.
Bad side:
Requires MSDTC access on client machine, where TransactionScope is used.
Just to add to / clarify the points Incognito makes:
TransactionScopes make the implementation of ACID transactions simple (i.e. so you don't need to write explicit "rollback" or cleanup code)
TransactionScope can coordinate resources such as Databases, Message Queues and Transactional File Systems under a transaction
Re: TransactionScopes being intuitive: resources such as SQL Server will automatically and seamlessly detect the ambient transaction and enlist when available.
The only 'bad' side is that you need to be aware that:
The default isolation level of TransactionScope is Serializable, which is usually too 'strong' and can cause blocking and deadlocking. I would recommend using ReadCommitted for most transactions.
TransactionScope will escalate a transaction to the DTC if more than one database, more than one concurrent connection, or more than one resource (e.g. SQL Server and MSMQ) is used within the scope. This can usually be avoided in single-threaded, single-database scenarios by closing each connection before opening a new one (or by keeping one connection open throughout, which isn't recommended).
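The "close before opening the next" pattern looks like this sketch. The connection string is hypothetical, and the non-escalation behaviour for sequential connections to a single database assumes SQL Server 2008 or later (earlier versions may still promote to MSDTC):

```csharp
using System.Data.SqlClient;
using System.Transactions;

using (var scope = new TransactionScope())
{
    using (var conn1 = new SqlConnection(connectionString))
    {
        conn1.Open();
        // ... first batch of work ...
    } // conn1 is closed before the next connection opens

    using (var conn2 = new SqlConnection(connectionString))
    {
        conn2.Open();
        // ... second batch of work, still in the same transaction ...
    }

    // Overlapping the two connections instead would force
    // escalation to a distributed (MSDTC) transaction.
    scope.Complete();
}
```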
I'm busy building a software update application in C#, WinForms, .NET 3.5, where I have a collection of pluggable task classes with a common base class. I would like one task, the database upgrade, to begin a transaction, and another task, the web site upgrade, to commit or roll back the same transaction, so the web site and DB are always in sync.
I vaguely remember stuff from my VB6 days where a COM+ method could enlist in a transaction if one was already running, or begin one if not, etc. I also have vague memories of this porting to .NET Enterprise Services, but that was also a while ago.
What is the current technology to use to achieve this?
I think you are looking for ambient transactions, also called implicit transactions.
using (TransactionScope scope = new TransactionScope())
{
    // Do several things in the same transaction; calls in here are
    // implicitly in the scope of the transaction. You can open several
    // independent connections, which all take part in the same transaction.

    scope.Complete(); // without this, everything rolls back on dispose
}
You might want to start here.
http://msdn.microsoft.com/en-us/library/86773566.aspx
SqlConnection.BeginTransaction returns a SqlTransaction, which implements IDbTransaction.
The two methods defined on IDbTransaction are Commit() and Rollback(). If you keep the connection object alive between calls, you should be able to pass the transaction from one place to another and perform the commit or rollback there.
If you're not using SQL Server, your database provider (OleDb, Odbc, etc.) will provide a corresponding object.
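Passing the transaction between tasks, as suggested above, could be sketched like this. The context class, task classes, and method names are illustrative assumptions, not an existing API; the essential point is that the connection object stays alive from begin to commit/rollback:

```csharp
using System.Data;
using System.Data.SqlClient;

// Shared state handed from the task that begins the transaction
// to the task that finishes it.
public class UpgradeContext
{
    public IDbConnection Connection;
    public IDbTransaction Transaction;
}

public class DatabaseUpgradeTask
{
    public UpgradeContext Begin(string connectionString)
    {
        var conn = new SqlConnection(connectionString);
        conn.Open();
        var tx = conn.BeginTransaction();
        // ... run schema upgrade commands, each with tx assigned ...
        return new UpgradeContext { Connection = conn, Transaction = tx };
    }
}

public class WebSiteUpgradeTask
{
    public void Finish(UpgradeContext ctx, bool siteUpgradeSucceeded)
    {
        if (siteUpgradeSucceeded)
            ctx.Transaction.Commit();
        else
            ctx.Transaction.Rollback();
        ctx.Connection.Dispose(); // the connection must stay open until here
    }
}
```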
Off the top of my head, I think you should be able to do this without special technology. Create an UpgradeManager class that is responsible for kicking off both the database and web upgrades. The transaction should live there and surround the calls into the two other objects.
If you have other tasks to plug in, have the UpgradeManager iterate over a collection of your Tasks.
.....or you could pass the transaction around, like harpo said (his response came in the middle of composing mine)... good to have options. ;-)
Nate