I currently have a method which reads data to determine if an update is needed, and then pushes the update to the database (dependency injected). The method is hit very hard, and I found concurrency-related bugs, namely multiple updates, since several threads read the data before the first update.
I solved this using a lock, and it works quite nicely. How may I instead use a TransactionScope to do the same thing? Can I? Will it block another thread as a lock would? Further, can I 'lock' on a specific 'id' as I am doing with a lock (I keep a Dictionary that stores an object to lock on for each id)?
I am using Entity Framework 5, though it's hidden behind a repository and unit-of-work pattern.
Application-level locking may not be a solution for this problem. First of all, you usually need to lock only a single record or a range of records. Moreover, you may later need to lock other modifications as well, and end up with quite complex code.
This situation is usually handled with either optimistic or pessimistic concurrency.
Optimistic concurrency - you add an additional database-generated column (databases usually have a special type for this, like timestamp or rowversion). The database will automatically update that column every time you update the record. If you configure this column as a row version, EF will include the column in the WHERE condition of the UPDATE => the executed update will search for the record with the given key and row version. If the record is found, it will be updated. If the record is not found, it means either the record with that key doesn't exist or someone else has updated the record since the current process loaded its data => you will get an exception and you can try to refresh the data and save the changes again. This mode is useful for records which are not updated very often. In your case it may just cause additional trouble.
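For reference, a minimal Code First sketch of the rowversion approach (the entity, its properties, and the retry helper are illustrative, not taken from the question):

using System.ComponentModel.DataAnnotations;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using System.Linq;

public class Account
{
    public int Id { get; set; }
    public decimal Balance { get; set; }

    // Maps to a SQL Server rowversion column; EF adds it to every UPDATE's WHERE clause.
    [Timestamp]
    public byte[] RowVersion { get; set; }
}

public static class OptimisticSave
{
    // Retry loop around the concurrency exception.
    public static void SaveWithRetry(DbContext context)
    {
        bool saved = false;
        while (!saved)
        {
            try
            {
                context.SaveChanges();
                saved = true;
            }
            catch (DbUpdateConcurrencyException ex)
            {
                // Refresh the stale entity from the database; a real application would
                // reapply or merge its changes here before the next attempt.
                ex.Entries.Single().Reload();
            }
        }
    }
}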
Pessimistic concurrency - this mode uses database locking instead. When you query the record, you lock it for update, so no one else can lock it for update or update it directly. Unfortunately this mode currently doesn't have direct support in EF and you must execute it through raw SQL. I wrote an article explaining pessimistic concurrency and its usage with EF. Even pessimistic concurrency may not be a good solution for a database under heavy load.
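Roughly, issuing the lock through EF with raw SQL might look like this (MyDbContext, the Items table, and the version check are placeholders, not the article's actual code):

using System.Data.Entity;
using System.Linq;
using System.Transactions;

public class MyDbContext : DbContext { } // stand-in context for the sketch

public static class PessimisticExample
{
    public static void UpdateWithRowLock(int itemId)
    {
        using (var scope = new TransactionScope())
        using (var context = new MyDbContext())
        {
            // UPDLOCK + ROWLOCK holds the row for the duration of the transaction,
            // so other writers that also request the lock block until we commit.
            var version = context.Database
                .SqlQuery<int>("SELECT Version FROM Items WITH (UPDLOCK, ROWLOCK) WHERE Id = @p0", itemId)
                .Single();

            if (version < 10) // stand-in for "is an update actually needed?"
            {
                context.Database.ExecuteSqlCommand(
                    "UPDATE Items SET Version = Version + 1 WHERE Id = @p0", itemId);
            }

            scope.Complete();
        }
    }
}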
If you are really building a solution where a lot of concurrent processes try to update the same data all the time, you may end up with a redesign, because there will be no reliable, high-performing solution based on locking or rerunning failed updates.
Related
The problem is that when a service receives messages from several other services and wants to apply those changes to a table, can these simultaneous changes cause a problem?
To be more precise: when a service receives two different messages from two different queues and wants to apply the received changes to the database, this concurrency will probably cause a problem!
Suppose one message contains updated user information, and a message from another queue relates to another case where those changes or updates must also be applied to Mongo (assuming these changes occur at the same time or a short time apart). If the database is applying changes to the author information, the information in the term collection must be updated at the same time or a few moments later.
The table information for this service is as follows:
Dealing with a concurrency conflict usually comes in one of two flavors:
Pessimistic concurrency control
Pessimistic, or negative, concurrency control is when a record is locked at the time the user begins his or her edit process. In this concurrency mode, the record remains locked for the duration of the edit. The primary advantage is that no other user is able to get a lock on the record for updating, effectively informing any requesting user that they cannot update the record because it is in use.
There are several drawbacks to pessimistic concurrency control. If the user goes for a coffee break, the record remains locked, denying anyone else the ability to update the record, even if it has been untouched by the initial requestor. Also, in order to maintain record locks, a persistent connection to the database server is required. Since web applications can have hundreds or thousands of simultaneous users, a persistent connection to the database cannot be maintained without having tremendous resources on the database server. Moreover, some database tools are licensed based on the number of concurrent connections. As such, applications that use pessimistic concurrency would require additional licenses for use.
Optimistic concurrency control
Optimistic concurrency means we allow concurrency conflicts to happen. But we also (want to) believe that they will not happen. And if they happen anyway, we react to them in some manner. It's supported in Entity Framework - you get concurrency exceptions to handle, you can add a column of row version type (or timestamp) to the database table, and so on. It's probably a good moment to stop and come back to the subject in a separate post!
Frameworks such as Entity Framework have optimistic concurrency control built in (although it may be turned off). It’s instructive to quickly see how it works. Basically there are three steps:
Get an entity from the DB and disconnect.
Edit in memory.
Update the DB with the changes using a special update clause, something like: "Update this row WHERE the current values are the same as the original values".
There are some useful articles to help you with optimistic concurrency control:
OPTIMISTIC CONCURRENCY IN MONGODB USING .NET AND C#
Document-Level Optimistic Concurrency in MongoDB
I use transactions for concurrent updates: query by ID before the update operation.
If I do a write to a (MS) SQL database and save the changes using SaveChangesAsync(), is there a possibility that a future read on the database could read unsaved changes?
Factors could include, whether a different DbContext, thread or process is used to access the database.
Short answer: no (Tim P is misinformed).
Calling DbContext.SaveChangesAsync will automagically create a transaction for the duration of the save.
This means that if any other thread tries to access the table, one of a number of things can occur.
Generally it means that the database call on the other thread will block while the transaction is uncommitted / not rolled back.
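If you need the transaction boundary to be explicit, or to cover more than a single SaveChangesAsync call, EF6 also lets you control it yourself; a sketch with a generic DbContext:

using System.Data.Entity;
using System.Threading.Tasks;

public static class SaveExample
{
    public static async Task SaveAtomicallyAsync(DbContext context)
    {
        using (var transaction = context.Database.BeginTransaction())
        {
            // ... modify tracked entities here ...

            // Until Commit, other connections at the default READ COMMITTED level will
            // typically block on the affected rows rather than see half-saved data
            // (with READ_COMMITTED_SNAPSHOT they read the last committed version instead).
            await context.SaveChangesAsync();
            transaction.Commit();
        }
    }
}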
Short answer: Yes.
It depends on several factors, such as how much you are saving to the database. If you're saving 1,000 rows/objects, and it's a slow database server, it's possible that this time window is wide enough that another thread is reading while (for example) row #879 has yet to be saved. And this has nothing to do with it being asynchronous. This is the normal concurrency problem in dealing with multi-user relational database systems.
This might be a very dumb question, but as the saying goes, "The only dumb question is the one you don't ask"...
I've got a SQL Server 2008 database and I want to lock a record for editing. However, another user might want to see information in that record at the same time. So, I want the first person in to be able to lock the record in the sense that they are the only ones who can edit it. However, I still want other users to see the data if they want to.
This is all done from a C# front end as it's gonna be on our Intranet.
Don't do your own locking - let SQL Server handle it on its own.
As long as you only SELECT, you'll put what's called a shared lock on a row - other users who want to also read that row can do so.
Only when your code goes to update the row will it place an exclusive lock on the row, in order to be able to update it. During that period of time, no other users can read that one single row you're updating - until you commit your transaction.
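A rough ADO.NET illustration of that behaviour (the connection string and the Records table are placeholders):

using System.Data.SqlClient;

public static class LockDemo
{
    public static void UpdateRow(string connectionString, int id)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var transaction = connection.BeginTransaction())
            {
                // The SELECT only takes a shared lock; other readers are unaffected.
                using (var read = new SqlCommand(
                    "SELECT Name FROM Records WHERE Id = @id", connection, transaction))
                {
                    read.Parameters.AddWithValue("@id", id);
                    read.ExecuteScalar();
                }

                // The UPDATE takes an exclusive lock on the row; under the default
                // READ COMMITTED level, other readers of that row block until Commit.
                using (var write = new SqlCommand(
                    "UPDATE Records SET Name = 'changed' WHERE Id = @id", connection, transaction))
                {
                    write.Parameters.AddWithValue("@id", id);
                    write.ExecuteNonQuery();
                }

                transaction.Commit();
            }
        }
    }
}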
To expand on Marc_s's answer, the reader can also use the
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
statement, as described here, to force reads to ignore any locks (with the notable exception of any Sch-M, schema modification, locks) that may exist. This is also a useful setting for reports that do not require absolute reproducibility, as it can significantly enhance the performance of those reports.
In addition to the existing answers: You can enable snapshot isolation. That gives your transaction a point-in-time snapshot of the database for reads. This transaction will not take locks on data at all. It will not block.
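Once snapshot isolation has been enabled on the database (ALTER DATABASE ... SET ALLOW_SNAPSHOT_ISOLATION ON), a reader can opt into it like this (just a sketch, not tied to any of the code above):

using System.Transactions;

var options = new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot };

// Reads inside this scope see a consistent point-in-time snapshot and take no shared
// locks, so they neither block writers nor get blocked by them.
using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
{
    // ... run queries here; ADO.NET, EF and LINQ to SQL all honour the ambient transaction ...
    scope.Complete();
}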
I'm using LINQ to SQL, and having a bit of an issue incrementing a view counter cross-connection.
The teeny bit of code I'm using is:
t = this.AppManager.ForumManager.GetThread(id);
t.Views = t.Views + 1;
this.AppManager.DB.SubmitChanges();
Now in my tests, I am running this multiple times, non-concurrently. There are a total of 4 copies of the object performing this test.
That is to say, there is no locking issue or anything like that, but there are 4 data contexts.
Now, I would expect this to work like this: fetch a row, modify a field, update the row. However, this is throwing a ChangeConflictException.
Why would the change be conflicted if none of the copies of this are running concurrently?
Is there a way to ignore change conflicts on a certain table?
EDIT: Found the answer:
You can set "UpdateCheck = Never" on all columns on a table to create a last-in-wins style of update. This is what the application was using before I ported it to LINQ, so that is what I will use for now.
EDIT2: While my fix above did indeed prevent the exception from being thrown, it did not fix the underlying issue:
Since I have more than one data context, there ends up being more than one cached copy of each object. Should I be recreating my data context with every page load?
I would rather instruct the data context to forget everything. Is this possible?
I believe DataContext is intended to be relatively lightweight and short-lived. IMO, you should not cache data loaded with a DataContext longer than necessary. When it's short-lived, it remains relatively small because (as I understand it) the DataContext's memory usage is primarily associated with tracking the changes you make to objects managed by it (retrieved by it).
In the application I work on, we create the context, display data on the UI, wait for user updates and then update the data. However, that is necessary mainly because we want the update to be based on what the user is looking at (otherwise we could retrieve the data and update all at once when the user hits update). If your updates are relatively free-standing, I think it would be sensible to retrieve the row immediately before updating it.
You can also use System.Data.Linq.DataContext.Refresh() to re-sync already-retrieved data with data in the database to help with this problem.
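For example, building on the snippet from the question (t and AppManager.DB come from that code, so treat this only as a sketch):

// Overwrite the cached copy of 't' with the current values from the database,
// then apply the increment to fresh data.
this.AppManager.DB.Refresh(System.Data.Linq.RefreshMode.OverwriteCurrentValues, t);
t.Views = t.Views + 1;
this.AppManager.DB.SubmitChanges();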
To respond to your last comment about making the context forget everything: I don't think there's a way to do that, but I suspect that's because all a context really consists of is the tracked changes (and the connection), so you might as well create a new context (remember to dispose of the old one), because what you really want to throw away is everything the context is.
I have a ASP.NET C# business webapp that is used internally. One issue we are running into as we've grown is that the original design did not account for concurrency checking - so now multiple users are accessing the same data and overwriting other users changes. So my question is - for webapps do people usually use a pessimistic or optimistic concurrency system? What drives the preference to use one over another and what are some of the design considerations to take into account?
I'm currently leaning towards an optimistic concurrency check since it seems more forgiving, but I'm concerned about the potential for multiple changes being made that would be in contradiction to each other.
Thanks!
Optimistic locking.
Pessimistic locking is harder to implement and will give problems in a web environment. What action will release the lock? Closing the browser? Letting the session time out? What happens if they then save their changes anyway?
You don't specify which database you are using. MS SQL Server has a timestamp datatype. It has nothing to do with time, though; it is merely a number that changes each time the row is updated. You don't have to do anything to make sure it changes, you just need to check it. You can achieve something similar by using a last-modified date/time as @KM suggests, but this means you have to remember to change it each time you update the row. If you use a datetime, you need a data type with sufficient precision to ensure the value can't end up unchanged when it should have changed; for example, someone saves a row, then someone reads it, then another save happens but leaves the modified date/time unchanged. I would use timestamp unless there is a requirement to track the last-modified date on records.
To check it, you can do as @KM suggests and include it in the UPDATE statement's WHERE clause. Or you can begin a transaction, check the timestamp, do the update if all is well, and then commit the transaction; if not, return a failure code or error.
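The WHERE-clause flavour with a timestamp/rowversion column might look like this in ADO.NET (the Customers table and its columns are made up):

using System.Data.SqlClient;

public static class TimestampedUpdate
{
    // The update succeeds only if the row still carries the rowversion we originally read.
    public static bool TryUpdate(string connectionString, int id, byte[] originalRowVersion, string newName)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "UPDATE Customers SET Name = @name WHERE Id = @id AND RowVersion = @rv", connection))
        {
            command.Parameters.AddWithValue("@name", newName);
            command.Parameters.AddWithValue("@id", id);
            command.Parameters.AddWithValue("@rv", originalRowVersion);
            connection.Open();
            // 0 rows affected means someone else updated (or deleted) the row since we read it.
            return command.ExecuteNonQuery() == 1;
        }
    }
}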
Holding transactions open (as suggested by @le dorfier) is similar to pessimistic locking, but the amount of data locked may be more than a row. Most RDBMSs lock at the page level by default. You will also run into the same issues as with pessimistic locking.
You mention in your question that you are worried about conflicting updates. That is exactly what the locking will prevent: both optimistic and pessimistic concurrency, when properly implemented, prevent exactly that.
I agree with the first answer above: we try to use optimistic locking when the chance of collisions is fairly low. It can easily be implemented with a LastModifiedDate column or by incrementing a Version column. If you are unsure about the frequency of collisions, log occurrences somewhere so you can keep an eye on them. If your records are always in "edit" mode, having separate "view" and "edit" modes could help reduce collisions (assuming you reload the data when entering edit mode).
If collisions are still high, pessimistic locking is more difficult to implement in web apps, but it is definitely possible. We have had good success with "leasing" records (locking with a timeout), similar to the 2-minute warning you get when you buy tickets on TicketMaster. When a user goes into edit mode, we put a record into a "lock" table with a timeout of N minutes. Other users will see a message if they try to edit a record with an active lock. You could also implement a keep-alive for long forms by renewing the lease on any postback of the page, or even with an AJAX timer. There is also no reason why you couldn't back this up with the standard optimistic lock mentioned above.
Many apps will need a combination of both approaches.
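A minimal sketch of the leasing idea described above, assuming a hypothetical RecordLocks(RecordId primary key, LockedBy, ExpiresAt) table:

using System.Data.SqlClient;

public static class RecordLease
{
    // Try to take a 2-minute lease on a record; returns false if another
    // unexpired lease already exists for it.
    public static bool TryAcquireLease(string connectionString, int recordId, string userName)
    {
        const string sql = @"
            DELETE FROM RecordLocks WHERE RecordId = @id AND ExpiresAt < GETUTCDATE();
            INSERT INTO RecordLocks (RecordId, LockedBy, ExpiresAt)
            SELECT @id, @user, DATEADD(MINUTE, 2, GETUTCDATE())
            WHERE NOT EXISTS (SELECT 1 FROM RecordLocks WITH (UPDLOCK, HOLDLOCK)
                              WHERE RecordId = @id);
            SELECT @@ROWCOUNT;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@id", recordId);
            command.Parameters.AddWithValue("@user", userName);
            connection.Open();
            // 1 row inserted means we now hold the lease; 0 means someone else does.
            return (int)command.ExecuteScalar() == 1;
        }
    }
}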
Here's a simple solution for many people working on the same records:
When you load the data, get the last-changed date; we use a LastChgDate column on our tables.
When you save (update) the data, add "AND LastChgDate=previouslyLoadedLastChgDate" to the WHERE clause. If the row count is 0 on the update, raise an error along the lines of "someone else has already saved this data" and roll everything back; otherwise the data is saved.
I generally apply the above logic to header tables only and not to the detail tables, since they are all in one transaction.
I assume you're experiencing the 'lost update' problem.
To counter this, as a rule of thumb I use pessimistic locking when the chances of a collision are high (or transactions are short-lived), and optimistic locking when the chances of a collision are low (or transactions are long-lived, or your business rules encompass multiple transactions).
You really need to see what applies to your situation and make a judgment call.