Using transactions with ADO.NET Data Adapters - C#

Scenario: I want to let multiple (2 to 20, probably) server applications use a single database using ADO.NET. I want individual applications to be able to take ownership of sets of records in the database, hold them in memory (for speed) in DataSets, respond to client requests on the data, perform updates, and prevent other applications from updating those records until ownership has been relinquished.
I'm new to ADO.NET, but it seems like this should be possible using transactions with Data Adapters (ADO.NET disconnected layer).
Question part 1: Is that the right way to try and do this?
Question part 2: If that is the right way, can anyone point me at any tutorials or examples of this kind of approach (in C#)?
Question part 3: If I want to be able to take ownership of individual records and release them independently, am I going to need a separate transaction for each record, and by extension a separate DataAdapter and DataSet to hold each record, or is there a better way to do that? Each application will likely hold ownership of thousands of records simultaneously.

How long were you thinking of keeping the transaction open for?
How many concurrent users are you going to support?
These are two of the questions you need to ask yourself. If the answer for the former is a "long time" and the answer to the latter is "many" then the approach will probably run into problems.
So, my answer to question one is: no, it's probably not the right approach.
If you take the transactional lock approach then you are going to limit your scalability and response times. You could also run into database errors. e.g. SQL Server (assuming you are using SQL Server) can be very greedy with locks and could lock more resources than you request/expect. The application could request some row level locks to lock the records that it "owns" however SQL Server could escalate those row locks to a table lock. This would block and could result in timeouts or perhaps deadlocks.
I think the best way to meet the requirements as you've stated them is to write a lock manager/record checkout system. Martin Fowler calls this a Pessimistic Offline Lock.
UPDATE
If you are using SQL Server 2008 you can set the lock escalation behavior on a table level:
ALTER TABLE T1 SET (LOCK_ESCALATION = DISABLE);
This will disable lock escalation in "most" situations and may help you.
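To make the Pessimistic Offline Lock idea concrete, here is a minimal sketch of a checkout-style lock manager. It assumes a hypothetical RecordLocks table (RecordId primary key, OwnerId, AcquiredAt) and classic System.Data.SqlClient; it is an illustration of the pattern, not a complete implementation (a real one would also need lock expiry for crashed owners):

using System;
using System.Data.SqlClient;

public class LockManager
{
    private readonly string _connectionString;

    public LockManager(string connectionString)
    {
        _connectionString = connectionString;
    }

    // Returns true if this application now owns the record,
    // false if another application already holds it.
    public bool TryAcquire(int recordId, string ownerId)
    {
        const string sql =
            "INSERT INTO RecordLocks (RecordId, OwnerId, AcquiredAt) " +
            "VALUES (@RecordId, @OwnerId, SYSUTCDATETIME());";
        using (var conn = new SqlConnection(_connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@RecordId", recordId);
            cmd.Parameters.AddWithValue("@OwnerId", ownerId);
            conn.Open();
            try
            {
                cmd.ExecuteNonQuery();
                return true;
            }
            catch (SqlException ex) when (ex.Number == 2627)
            {
                return false; // primary key violation: the record is already checked out
            }
        }
    }

    // Releases the record; only the owner that acquired it can release it.
    public void Release(int recordId, string ownerId)
    {
        const string sql = "DELETE FROM RecordLocks WHERE RecordId = @RecordId AND OwnerId = @OwnerId;";
        using (var conn = new SqlConnection(_connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@RecordId", recordId);
            cmd.Parameters.AddWithValue("@OwnerId", ownerId);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}

Because the "lock" lives in a table rather than in an open database transaction, an application can hold thousands of records checked out without keeping any transaction or connection open.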

You actually need concurrency control, along with transaction support.
Transactions only come into the picture when you perform multiple operations against the database as a single unit of work; once the connection is released, the transaction no longer applies.
Concurrency control is what lets you work with multiple updates to the same data. If two or more clients hold the same set of data and one needs to read or write it after another client has updated it, concurrency control lets you decide which set of updates to keep and which to ignore. A full treatment of concurrency is beyond the scope of this answer; check out this article for more information.
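To tie this back to the DataAdapter part of the question: the disconnected layer already gives you optimistic concurrency out of the box. A minimal sketch, assuming a hypothetical Employees table; the command builder generates an UPDATE that compares the original column values, and Update throws a DBConcurrencyException when another application has changed a row since it was read:

using System.Data;
using System.Data.SqlClient;

class DataAdapterConcurrencyDemo
{
    static void Main()
    {
        var connectionString = "...";  // placeholder: your connection string

        using (var conn = new SqlConnection(connectionString))
        using (var adapter = new SqlDataAdapter("SELECT Id, Name, Salary FROM Employees", conn))
        using (var builder = new SqlCommandBuilder(adapter))
        {
            // The generated UPDATE/DELETE commands compare all original values (optimistic concurrency).
            builder.ConflictOption = ConflictOption.CompareAllSearchableValues;

            var table = new DataTable();
            adapter.Fill(table);               // load the rows into memory; the connection is then released

            table.Rows[0]["Salary"] = 50000;   // edit offline

            try
            {
                adapter.Update(table);         // write the changes back
            }
            catch (DBConcurrencyException)
            {
                // Another application changed (or deleted) the row since we read it.
                // Decide here whether to reload, merge, or discard the change.
            }
        }
    }
}

Note that this detects conflicting updates after the fact; it does not prevent other applications from touching the rows in the meantime, which is why the lock-manager approach above is the better fit for the "ownership" requirement.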

Related

The problem of changes to a table with several different events (RabbitMQ) in a service and in ASP Core

The problem is this: when a service receives messages from several other services and wants to apply those changes to a table, can the simultaneous changes cause a problem?
To be more precise: when the service receives two different messages from two different queues and wants to apply both sets of changes to the database, that concurrency will probably cause a problem.
Suppose one message contains updated user information, and a message from another queue relates to another case where those changes also have to be applied to Mongo (assume the changes arrive at the same time or very close together). While the database is applying the changes to the author information, the information in the term collection has to be updated at the same time or a few moments later.
Dealing with concurrency conflicts usually comes in two flavors:
Pessimistic concurrency control
Pessimistic, or negative, concurrency control is when a record is locked at the time the user begins his or her edit process. In this concurrency mode, the record remains locked for the duration of the edit. The primary advantage is that no other user is able to get a lock on the record for updating, effectively informing any requesting user that they cannot update the record because it is in use.
There are several drawbacks to pessimistic concurrency control. If the user goes for a coffee break, the record remains locked, denying anyone else the ability to update the record, even if it has been untouched by the initial requestor. Also, in order to maintain record locks, a persistent connection to the database server is required. Since web applications can have hundreds or thousands of simultaneous users, a persistent connection to the database cannot be maintained without having tremendous resources on the database server. Moreover, some database tools are licensed based on the number of concurrent connections. As such, applications that use pessimistic concurrency would require additional licenses for use.
Optimistic concurrency control
Optimistic concurrency means we allow concurrency conflicts to happen, while expecting that they mostly won't; if one does happen anyway, we react to it in some manner. It's supported in Entity Framework: you get concurrency exceptions to handle, you can add a column of rowversion (or timestamp) type to the database table, and so on.
Frameworks such as Entity Framework have optimistic concurrency control built in (although it may be turned off). It’s instructive to quickly see how it works. Basically there are three steps:
Get an entity from the DB and disconnect.
Edit in memory.
Update the DB with changes using a special update clause, something like: “Update this row WHERE the current values are the same as the original values”.
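Since the answer mentions Entity Framework, here is a rough sketch of those three steps purely as an illustration (EF Core with a made-up Author entity; the [Timestamp] column is what produces the "WHERE the current values are the same as the original values" clause):

using System.ComponentModel.DataAnnotations;
using Microsoft.EntityFrameworkCore;

public class Author
{
    public int Id { get; set; }
    public string Name { get; set; }

    [Timestamp]                       // mapped to a rowversion column; EF uses it as a concurrency token
    public byte[] RowVersion { get; set; }
}

public class BlogContext : DbContext
{
    public DbSet<Author> Authors { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer("...");   // placeholder: your connection string
}

public static class Demo
{
    public static void UpdateAuthorName()
    {
        using (var db = new BlogContext())
        {
            var author = db.Authors.Find(1);   // 1. get the entity from the DB
            author.Name = "New name";          // 2. edit in memory
            try
            {
                // 3. UPDATE ... WHERE Id = @id AND RowVersion = @originalRowVersion
                db.SaveChanges();
            }
            catch (DbUpdateConcurrencyException)
            {
                // Someone else updated the row first: reload, merge, or surface the conflict.
            }
        }
    }
}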
There are some useful articles to help you with optimistic concurrency control:
OPTIMISTIC CONCURRENCY IN MONGODB USING .NET AND C#
Document-Level Optimistic Concurrency in MongoDB
I use transactions for concurrent updates, and query by ID before the update operation.

Is my SQL transaction taking too long?

There is something that worries me about my application. I have a SQL query that does a bunch of inserts into the database across various tables. I timed how long it takes to complete the process: about 1.5 seconds. At this point I'm not even done developing the query; I still have more inserts to program into it, so I fully expect the process to take even longer, perhaps up to 3 seconds.
Now, it is important that all of this data be consistent and finish either completely or not at all. So what I'm wondering is: is it OK for a transaction to take that long? Doesn't it lock up the table, so that selects, inserts, updates, etc. cannot be run until the transaction is finished? My concern is that if this query is run frequently it could lock up the entire application, so that certain parts of it become either incredibly slow or unusable. With a low user base I doubt this would be an issue, but if my application should gain some traction, this query could potentially be run a lot.
Should I be concerned about this, or am I missing something about how the database will actually behave? I'm using a SQL Server 2014 database.
To note, I timed this with the C# Stopwatch class, started immediately before the transaction begins and stopped right after the changes are committed, so it's about as accurate as it can be.
You're right to be concerned about this, as a transaction will lock the rows it's written until the transaction commits, which can certainly cause problems such as deadlocks, and temporary blocking which will slow the system response. But there are various factors that determine the potential impact.
For example, you probably largely don't need to worry if your users are only updating and querying their own data, and your tables have indexing to support both read and write query criteria. That way each user's row locking will largely not affect the other users--depending on how you write your code of course.
If your users share data, and you want to be able to support efficient searching across multiple user's data even with multiple concurrent updates for example, then you may need to do more.
Some general concepts:
-- Ensure your transactions write to tables in the same order
-- Keep your transactions as short as possible by preparing the data to be written before you start the transaction (see the sketch after this list).
-- If this is a new system (and even if not new), definitely consider enabling Snapshot Isolation and/or Read Committed Snapshot Isolation on the database. SI will (when explicitly set on the session) allow your read queries not to be blocked by concurrent writes. RCSI will allow all your read queries by default not to be blocked by concurrent writes. But read this to understand both the benefits and gotchas of both isolation levels: https://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/
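To illustrate the "keep the transaction short" point, here is a minimal sketch (the Orders/OrderLines tables are made up for the example): everything slow happens before BeginTransaction, so locks are only held for the duration of the inserts themselves.

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class OrderWriter
{
    public static void SaveOrder(string connectionString, IReadOnlyList<int> productIds)
    {
        // Do all the slow work (building the data, validation, any file or network I/O)
        // *before* the transaction, so locks are held only while the inserts run.
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction())
            {
                // Write to tables in the same order (Orders, then OrderLines) in every code path,
                // which helps avoid deadlocks between concurrent writers.
                int orderId;
                using (var insertOrder = new SqlCommand(
                    "INSERT INTO Orders (CreatedAt) OUTPUT INSERTED.Id VALUES (SYSUTCDATETIME());", conn, tx))
                {
                    orderId = (int)insertOrder.ExecuteScalar();
                }

                using (var insertLine = new SqlCommand(
                    "INSERT INTO OrderLines (OrderId, ProductId) VALUES (@OrderId, @ProductId);", conn, tx))
                {
                    insertLine.Parameters.Add("@OrderId", SqlDbType.Int).Value = orderId;
                    var productParam = insertLine.Parameters.Add("@ProductId", SqlDbType.Int);
                    foreach (var productId in productIds)
                    {
                        productParam.Value = productId;
                        insertLine.ExecuteNonQuery();
                    }
                }

                tx.Commit(); // the lock-holding window is only as long as these inserts take
            }
        }
    }
}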
I think it depends on your code: how effectively you use loops, your select queries, and the other statements.

ADO.NET Long running transactions

I'm using Dapper, but this applies equally to plain ADO.NET code.
I have an operation on a web app that changes a lot of state in the database. To ensure an all-or-nothing result, I use a transaction to manage this. To do this, all my Repository classes share a connection (which is instantiated per request). On my connection I can call Connection.BeginTransaction().
However, this operation can sometimes take a while (say 10 seconds), and it locks some frequently-read-from tables while it does its thing. I want to allow other repos on other threads to continue without locking while this is happening.
It looks like I need to do 2 things to make this happen:
1) Set the IsolationLevel to something like ReadUncommitted:
_transaction = Connection.BeginTransaction(IsolationLevel.ReadUncommitted);
2) For all other connections that don't need a transaction, I still need to enroll those connections in a transaction, so that I can again set ReadUncommitted. If I don't do this then they'll still lock while they wait for the long running operation to complete.
So does this mean I need ALL my connections to start a transaction? This sounds expensive and sub-performant. Are there other solutions I'm missing here?
Thanks
Be aware that there is a trade-off between using locks or not: it's performance versus concurrency control. So I don't think you should use ReadUncommitted all the time.
If you use ReadUncommitted on all the other transactions that must not be blocked by this long-running transaction, they will, as a side effect, also stop being blocked by every other transaction, and they can read uncommitted (dirty) data.
Generally, this isolation level is used when performance is the first priority and the data does not need to be completely accurate.
I want to allow other repos on other threads to continue without locking while this is happening.
I think you can try IsolationLevel.Snapshot on only the transaction that does the long locking work: https://msdn.microsoft.com/en-us/library/tcbchxcb(v=vs.110).aspx
Extracted from the link:
The term "snapshot" reflects the fact that all queries in the
transaction see the same version, or snapshot, of the database, based
on the state of the database at the moment in time when the
transaction begins. No locks are acquired on the underlying data rows
or data pages in a snapshot transaction, which permits other
transactions to execute without being blocked by a prior uncompleted
transaction. Transactions that modify data do not block transactions
that read data, and transactions that read data do not block
transactions that write data, as they normally would under the default
READ COMMITTED isolation level in SQL Server. This non-blocking
behavior also significantly reduces the likelihood of deadlocks for
complex transactions.
Be aware that an enormous amount of data could be generated in tempdb for version store if there are a lot of modifications.
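A minimal sketch of that suggestion with Dapper, assuming snapshot isolation has already been enabled on the database and using an illustrative Orders table; only the long-running operation opts into snapshot isolation:

using System.Data;
using System.Data.SqlClient;
using Dapper;

public static class LongRunningOperation
{
    // Assumes a DBA has run: ALTER DATABASE <your database> SET ALLOW_SNAPSHOT_ISOLATION ON;
    public static void Run(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction(IsolationLevel.Snapshot))
            {
                // Reads inside this transaction take no shared locks and see a consistent
                // snapshot of the data as of the moment the transaction started.
                var pending = conn.Query<int>(
                    "SELECT Id FROM Orders WHERE Status = @Status",
                    new { Status = "Pending" }, tx);

                foreach (var id in pending)
                {
                    // Writes still take exclusive row locks until Commit, so readers on other
                    // connections that must never block would also need snapshot or RCSI.
                    conn.Execute(
                        "UPDATE Orders SET Status = @NewStatus WHERE Id = @Id",
                        new { NewStatus = "Processed", Id = id }, tx);
                }

                tx.Commit();
            }
        }
    }
}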

Simple query regarding WCF service

I have a WCF service which has two methods exposed:
Note: The WCF service and SQL Server are deployed on the same machine.
SQL Server has one table called Employee which maintains employee information.
Read(): This method retrieves all employees from SQL Server.
Write(): This method writes (add, update, delete) employee info to the Employee table in SQL Server.
Now I have developed a desktop application through which any client can query, add, update and delete employee information by consuming the web service.
Question:
How can I handle the scenario where multiple clients want to update the employee information at the same time? Does SQL Server itself handle this by using database locks?
Please suggest the best approach!
Generally, in a disconnected environment optimistic concurrency with a rowversion/timestamp is the preferred approach. WCF does support distributed transactions, but that is a great way to introduce lengthy blocking into the system. Most ORM tools will support rowversion/timestamp out-of-the-box.
Of course, at the server you might want to use transactions (either connection-based or TransactionScope) to make individual repository methods "ACID", but I would try to avoid transactions on the wire as far as possible.
Re comments; sorry about that, I honestly didn't see those comments; sometimes stackoverflow doesn't make this easy if you get a lot of comments at once. There are two different concepts here; the waiting is a symptom of blocking, but if you have 100 clients updating the same record it is entirely appropriate to block during each transaction. To keep things simple: unless I can demonstrate a bottleneck (requiring extra work), I would start with a serializable transaction around the update operations (TransactionScope uses this by default). That way yes: you get appropriate blocking (ACID etc) for most scenarios.
However, the second issue is concurrency: if you get 100 updates for the same record, how do you know which to trust? Most systems will let the first update in and discard the rest, as they are operating on stale assumptions about the data. This is where the timestamp/rowversion comes in. By enforcing "the timestamp/rowversion must match" on the UPDATE statement, you ensure that people can only update data that hasn't changed since they took their snapshot. For this purpose, it is common to keep the rowversion alongside any interesting data you are updating.
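A minimal sketch of that UPDATE, assuming the Employee table has a rowversion column and the client sends back the value it originally read (names are illustrative):

using System.Data.SqlClient;

public static class EmployeeRepository
{
    // Returns false if somebody else updated the row since the caller read it.
    public static bool UpdateName(string connectionString, int id, string newName, byte[] originalRowVersion)
    {
        const string sql =
            "UPDATE Employee " +
            "SET Name = @Name " +
            "WHERE Id = @Id AND RowVersion = @OriginalRowVersion;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@Name", newName);
            cmd.Parameters.AddWithValue("@Id", id);
            cmd.Parameters.AddWithValue("@OriginalRowVersion", originalRowVersion);
            conn.Open();
            return cmd.ExecuteNonQuery() == 1;   // 0 rows: stale data, surface a concurrency error to the client
        }
    }
}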
Another alternative is that you could instantiate the WCF service as a singleton (InstanceContextMode.Single), which means there is only ever one instance of it running. Then you could keep a simple object in memory for the purpose of update locking, and lock on that object in your update method. When update calls come in from other sessions, they will have to wait until the lock is released.
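A rough sketch of that alternative (contract and names are illustrative, not from the question); note it only works while there is exactly one host, because the lock object lives in that single process:

using System.ServiceModel;

[ServiceContract]
public interface IEmployeeService
{
    [OperationContract] Employee[] Read();
    [OperationContract] void Write(Employee employee);
}

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class EmployeeService : IEmployeeService
{
    private readonly object _updateLock = new object();

    public Employee[] Read()
    {
        // Reads are not serialized in this sketch.
        return new Employee[0];   // ... query SQL Server here ...
    }

    public void Write(Employee employee)
    {
        // Only one caller at a time gets past this point; calls from other
        // sessions wait here until the lock is released.
        lock (_updateLock)
        {
            // ... perform the add/update/delete against SQL Server here ...
        }
    }
}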
Regards,
Steve

Parallel processing of database queue

There is a small system where a database table is used as a queue on MSSQL 2005. Several applications are writing to this table, and one application is reading from it and processing it in a FIFO manner.
I have to make it a little more advanced, so that it can become a distributed system where several processing applications can run. The result should be that 2-10 processing applications can run without interfering with each other during the work.
My idea is to extend the queue table with a column showing that a process is already working on a row. The processing application will first update the table with its identifier, and then ask for the updated records.
So something like this:
begin transaction
update top (10) queue set processing = 'myid' where processing is null
select * from queue where processing = 'myid'
commit transaction
After processing, it sets the processing column of the table to something else, like 'done', or whatever.
I have three questions about this approach.
First: can this work in this form?
Second: if it is working, is it effective? Do you have any other ideas for distributing the work like this?
Third: In MSSQL the locking is row-based, but after a certain number of rows are locked, the lock is escalated to the whole table, so the second application cannot access the table until the first application releases its transaction. How big can the selection (top x) be so that it does not lock the whole table and only creates row locks?
This will work, but you'll probably find you'll run into blocking or deadlocks where multiple processes try and read/update the same data. I wrote a procedure to do exactly this for one of our systems which uses some interesting locking semantics to ensure this type of thing runs with no blocking or deadlocks, described here.
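The write-up referred to above isn't reproduced here, but the usual ingredients for a non-blocking dequeue are the READPAST, UPDLOCK and ROWLOCK hints, so that each worker simply skips rows another worker has already locked. A minimal sketch against the queue table from the question (assuming it has an id column):

using System.Collections.Generic;
using System.Data.SqlClient;

public static class QueueReader
{
    // Claims up to batchSize unprocessed rows for this worker and returns their ids.
    // READPAST makes the statement skip rows already locked by another worker,
    // so concurrent readers don't block each other or deadlock over the same rows.
    public static List<int> ClaimBatch(string connectionString, string workerId, int batchSize)
    {
        const string sql =
            "UPDATE TOP (@BatchSize) queue WITH (ROWLOCK, READPAST, UPDLOCK) " +
            "SET processing = @WorkerId " +
            "OUTPUT INSERTED.id " +
            "WHERE processing IS NULL;";

        var claimed = new List<int>();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@BatchSize", batchSize);
            cmd.Parameters.AddWithValue("@WorkerId", workerId);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    claimed.Add(reader.GetInt32(0));
            }
        }
        return claimed;
    }
}

Because the UPDATE and the OUTPUT happen in one statement, there is no need for the separate select or for an explicit transaction around the claim step.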
This approach looks reasonable to me, and is similar to one I have used in the past - successfully.
Also, the row/table will only be locked while the update and select operations take place, so I doubt the row vs. table question is really a major consideration.
Unless the processing overhead of your app is so low as to be negligible, I'd keep the "top" value low - perhaps just 1. Of course that entirely depends on the details of your app.
Having said all that, I'm not a DBA, so I will also be interested in any more expert answers.
Regarding your question about locking: you can use a locking hint to force it to lock only rows:
update mytable with (rowlock) set x=y where a=b
The biggest problem with this approach is that you increase the number of updates to the table. Try this with just one process consuming (update + delete) and others inserting data into the table, and you will find that at around a million records it starts to crumble.
I would rather have one consumer for the DB and use message queues to deliver processing data to other consumers.
