Understand - Deadlock found when trying to get lock; try restarting transaction - c#

I am using a REST API that either inserts or updates a record in a MySQL table based on a unique key. When this API is called in parallel, some calls fail with the error 'Deadlock found when trying to get lock; try restarting transaction'. I don't have access to the server to verify the logs.
I am curious: even when multiple callers hit this API concurrently, if MySQL takes a row-level lock for the insert/update, then ideally it should not deadlock; the other calls should simply wait to acquire the lock.
For example, if caller A calls this API and takes a row-level lock on table 'tableA', then caller B should wait until caller A releases the lock, and no deadlock should be thrown. Please help me understand this.
Below is the query I am using:
INSERT INTO tableA (A, B, C, D) VALUES
{{INSERT_CLAUSE}}
ON DUPLICATE KEY UPDATE `D` = D + 1, E = NOW();
Unique key on columns: A, B, C.
P.S. I have gone through the other suggested answers, but nothing clears up this doubt.
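The error message's own advice ("try restarting transaction") is the standard remedy: even with row-level locking, two concurrent INSERT ... ON DUPLICATE KEY UPDATE statements on the same unique key can deadlock on InnoDB's gap and insert-intention locks rather than simply queue, so callers are expected to retry. A minimal retry sketch, where `DeadlockError` and the `upsert` function are hypothetical stand-ins for your driver's deadlock exception (MySQL error 1213) and your real transaction:

```python
import random
import time


class DeadlockError(Exception):
    """Stand-in for the driver's deadlock exception (MySQL error 1213)."""


def run_with_retry(txn, attempts=3, base_delay=0.05):
    """Run `txn` (a callable performing one transaction), retrying on deadlock.

    MySQL rolls the victim transaction back, so it is safe to re-run it
    from the start. Jittered backoff reduces the chance of re-colliding.
    """
    for attempt in range(attempts):
        try:
            return txn()
        except DeadlockError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) * random.random())


# Usage: a fake transaction that deadlocks twice, then succeeds.
calls = {"n": 0}

def upsert():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockError
    return "ok"

print(run_with_retry(upsert))  # prints "ok" on the third attempt
```

The key point is that the whole transaction is re-run from the top, not just the failed statement, since MySQL has already rolled the victim back.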

How do I block this race condition in an inventory allocation scenario? [closed]

I'm wrestling with the classic problem of inventory allocation and concurrency and I wondered if anyone could give me a steer on best practice in this situation.
My situation is that we have an order prepared with several "slots" which will be filled by unique inventory items at a certain stage in the process and at this point I want to make sure that nobody allocates the same unique unit to a slot on a different order. For example a user wants a van next Thursday so I reserve a "van" slot but at a later point in time I allocate a specific vehicle from the yard to this slot. I want to make sure that two different operators can't allocate the same van to two different customers next Thursday.
We already have a stock availability check process where we compare the aggregate of two tables within a date range, the result of summing these two tables (one is items in and the other is items out) tells me whether we have the specific item that I want to allocate to this slot on this date but I want to prevent another user from allocating the same item to their own slot at the same point in time.
I've already done some googling and research on this site and it looks like I need a "pessimistic locking" solution but I'm not sure how to put one in place effectively.
The allocation process will be called from a web API (rest api using .Net) with entity framework and I've considered the following two solutions:
Option 1 - Let the database handle it
At the point of allocation I begin a transaction and acquire an exclusive lock on the two tables used for evaluating stock availability.
The process confirms the stock availability, allocates the units to the slots and then releases the locks.
I think this would prevent the race condition of two users trying to allocate the same unique unit to two different orders, but I'm uncomfortable with locking two tables against every other process that needs to query them until the allocation completes, as this could become a bottleneck for other readers. In this scenario the second process attempting the duplicate allocation would be queued until the first releases the locks, since it can't query the availability tables; when it finally runs, it will fail the availability check and report an out-of-stock warning, effectively blocking the second order from allocating the same stock.
On paper this sounds like it would work but I have two concerns; the first is that it will hit performance and the second is that I'm overlooking something. Also I'm using Postgres for the first time on this project (I'm normally a SQL Server guy) but I think Postgres still has the features to do this.
Option 2 - Use some kind of manual locking
I think my scenario is something like ticketing websites would encounter during the sales process for concerts or cinemas and I've seen them put up timers saying things like "your tickets will expire in 5 minutes" but I don't know how they implement this kind of system in the back end. Do they create a table of "reserved" stock before the allocation process begins with some kind of expiry time on them and then "blacklist" other users attempting to allocate the same units until that timer expires?
Sorry for the long intro but I wanted to explain the problem completely as I've seen plenty of questions about similar scenarios but nothing that really helped me to make a decision on how to proceed.
My question is which of the two options (if any) are "the right way" to do this?
Edit: The closest parallel to this question I've seen is How to deal with inventory and concurrency but it doesn't discuss option 1 (possibly because it's a terrible idea)
I think option 2 is better, with some tweaks.
This is what I'd do in such a situation.
Whenever a user tries to book a vehicle for a slot, make an entry in a temporary holding area (a normal table will do, but at higher transaction volumes you may want to look into caching/database solutions). The entry should carry a unique key made up of the unique car ID plus the slot time, with no duplicate entries allowed for that combination. That way, if two users try to book the same car for the same slot at the same time, the second insert fails in your application and you can notify that user the van is already gone.
So before a second user can book a vehicle, they must check for a hold on that slot for that car (or you can use this data to show vehicles as unavailable for that slot).
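This holding-area idea is essentially a unique constraint doing the mutual exclusion for you. A minimal sketch in Python with SQLite, where the table and column names are made up; any database with unique constraints behaves the same way:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE holds (
        vehicle_id INTEGER,
        slot_start TEXT,
        held_by    TEXT,
        UNIQUE (vehicle_id, slot_start)   -- one hold per vehicle per slot
    )
""")

def try_hold(user, vehicle_id, slot_start):
    """Return True if this user got the hold, False if someone beat them."""
    try:
        db.execute(
            "INSERT INTO holds (vehicle_id, slot_start, held_by) VALUES (?, ?, ?)",
            (vehicle_id, slot_start, user),
        )
        db.commit()
        return True
    except sqlite3.IntegrityError:   # unique constraint violated
        return False

print(try_hold("alice", 7, "2024-05-02T09:00"))  # True  - slot was free
print(try_hold("bob",   7, "2024-05-02T09:00"))  # False - already held
```

For the "tickets expire in 5 minutes" behavior, you would add an expiry timestamp to the hold row and have a cleanup job (or the next insert attempt) delete expired holds.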
I'm not sure how your database is laid out, but if each inventory item is its own record in the database, just have an IsUsed flag on the table. When you go to update the record, make sure you put IsUsed = 0 in the WHERE clause. If the total rows modified comes back as 0, then you know something else updated it before you.
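A minimal sketch of this flag-in-the-WHERE-clause pattern in Python with SQLite; the `inventory` table and `is_used` column mirror the IsUsed idea above and are otherwise hypothetical:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE inventory (id INTEGER PRIMARY KEY, is_used INTEGER DEFAULT 0)")
db.execute("INSERT INTO inventory (id) VALUES (42)")

def claim(item_id):
    """Atomically claim an item; the WHERE clause is the concurrency guard."""
    cur = db.execute(
        "UPDATE inventory SET is_used = 1 WHERE id = ? AND is_used = 0",
        (item_id,),
    )
    db.commit()
    return cur.rowcount == 1   # 0 rows modified => someone else got it first

print(claim(42))  # True  - we claimed it
print(claim(42))  # False - already used
```

The update itself is atomic, so no explicit lock is needed; the rowcount check is what tells you whether you won the race.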
If you have a table storing vehicles in your DB, you can take a pessimistic no-wait lock on the vehicle to be allotted in the slot the user selected.
Once acquired, this lock is held by one transaction until it commits or rolls back. Any other transaction that tries to acquire the lock on that vehicle fails immediately, so there is no waiting in the DB.
This scales well, as there are no queues of transactions in the DB waiting for the lock on the vehicle to be allotted.
For failing transactions you can roll back immediately and ask the user to select a different vehicle or slot.
This also covers the case where you have multiple vehicles of the same type and might otherwise allot the same physical vehicle (i.e. the same registration number) to two users in the same slot: only one transaction wins and the others fail.
Below is the PostgreSQL query for this:
SELECT *
FROM vehicle
WHERE id = ?
FOR UPDATE NOWAIT;
There are different approaches to this problem and I'm just answering what I've thought about and eventually settled on when having to tackle this problem for a customer.
1. If the traffic on your INSERTs and UPDATEs against these resources is not heavy, you can lock the table completely, for example in a stored procedure like this (though it can also be done in simple client-side code):
CREATE PROCEDURE ...
AS
BEGIN
BEGIN TRANSACTION
-- lock table "a" till end of transaction
SELECT ...
FROM a
WITH (TABLOCK, HOLDLOCK)
WHERE ...
-- do some other stuff (including inserting/updating table "a")
-- release lock
COMMIT TRANSACTION
END
2. Use pessimistic locking by having your code obtain locks you yourself create. Add an extra table per resource type you want to lock, with a unique constraint on the ID of the resource being locked. You then obtain a lock by inserting a row and release it by deleting that row. Add timestamps so that a job can clean up locks that got lost. The table could look like this:
Id bigint
BookingId bigint -- the resource you want to lock on; put a unique constraint here
Creation datetime -- these two timestamps let you decide when to automatically remove a lock
Updated datetime
Username nvarchar(100) -- perhaps who obtained the lock
With this approach it's easy to decide which pieces of your code need to obtain a lock and which can tolerate reading the resource and reservation tables without one.
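A sketch of approach 2 - acquire by INSERT, release by DELETE - in Python with SQLite; the schema follows the table above, and the names are illustrative:

```python
import datetime
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE booking_locks (
        id         INTEGER PRIMARY KEY,
        booking_id INTEGER UNIQUE,     -- the resource being locked
        created    TEXT,
        username   TEXT
    )
""")

def acquire(booking_id, username):
    """Acquire the lock by inserting; the UNIQUE constraint rejects a second holder."""
    try:
        db.execute(
            "INSERT INTO booking_locks (booking_id, created, username) VALUES (?, ?, ?)",
            (booking_id, datetime.datetime.now().isoformat(), username),
        )
        db.commit()
        return True
    except sqlite3.IntegrityError:
        return False

def release(booking_id):
    """Release by deleting the row; a cleanup job can also delete stale rows by age."""
    db.execute("DELETE FROM booking_locks WHERE booking_id = ?", (booking_id,))
    db.commit()

print(acquire(1, "alice"))  # True
print(acquire(1, "bob"))    # False - alice holds it
release(1)
print(acquire(1, "bob"))    # True - lock was released
```

The `created` timestamp is what the cleanup job would compare against to reap locks whose holder crashed without releasing.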
3. If it's a resource that is allotted by a begin and end time, you could set the granularity of this timespan to e.g. 15 minutes. Each 15-minute timeslot of the day then gets a number starting from 0 (choose a reasonable starting timestamp as number 0). Next to your reservation table, create a table where start and end are expressed as timeslot numbers, and insert as many rows (one per timeslot number) as each reservation spans. You of course need a unique constraint on "Timeslot" + "ResourceId" so that any insert is rejected if that timeslot is already reserved for that resource.
This table can be maintained neatly by triggers on your reservation table: you keep real timestamps on the reservation table, and when an insert or update is performed the trigger updates the timeslot table, which raises an error on a unique-constraint violation, rolling back the transaction and preventing the change in both tables.

Can bulk insert on multiple thread produce duplicate record in mongoDB

I am performing an operation in MongoDB where I do a bulk insert (WritemanyAsync()) on multiple threads. Say there are two entities, J and C (collections). On every update of a J document I fetch all C documents, and similarly on every update of a C document I fetch all J documents, and perform an INSERT into a collection, say XXX.
For example (pseudo): say there are J1..J3 and C1..C3; if J1 updates, then an operation happens with J1 x (C1..C3).
This entire operation happens on multiple threads. I am seeing some duplicate records in collection XXX with the same J and C (I mean the same IDs).
Question: is it possible that the duplicates occur because the bulk insert happens on multiple threads? That is, while thread T1 is performing the insert for record (C1, J1), another thread performs the same insert, hence the duplicates.
I found that MongoDB takes a row-level (document-level) lock while performing an insert, so there seems to be a chance for this situation to occur, but I'm not sure.
Can anyone please suggest or shed some light on this?
This is my first time working in MongoDB, so I have no idea about it (my sole experience is with relational databases, and MongoDB concurrency doesn't work the way a relational DB does).
EDIT: From some documentation I learned that an INSERT in MongoDB acquires a write lock/latch, and that there can be only one writer lock at any point in time per DB / per collection.
If that is so, then no matter how many parallel inserts happen, they will be queued rather than performed simultaneously, and my said scenario should never occur.
Can someone please confirm, or let me know?

Concurrent task picking from database

Inside a database table, I keep a list of tasks. For simplicity, assume task is an integer and let there be 100 integers in increasing order in table. (Increasing order means if someone is working on task N, then all tasks < N are already being worked upon)
I also have 5 clients that connect and pick tasks from database and update the database when task is finished.
I do not want any two clients to pick the same task.
Whenever a task is picked, I add it to another table, 'tasksTable'. When picking a new task, I find max_int in 'tasksTable', and pick the task = max_int+1
To avoid work duplication, I serialize the process of picking tasks i.e.
getlock
read max_int
pick task
update tasksTable
releaselock
Since I had only about 10 workers, serializing was not much of an issue. But what if I have thousands of clients? How can I parallelize the task picking?
Assume two tables, one for unpicked tasks (WaitingTasks), one for picked "in-progress" tasks (WorkingTasks):
CREATE TABLE WaitingTasks (WaitingTaskID INT IDENTITY(1,1), ColA, ColB, ...)
and
CREATE TABLE WorkingTasks (WorkingTaskID INT IDENTITY(1,1), WaitingTaskID INT, ColA, ...)
Concurrent task picking can be accomplished like so:
INSERT INTO WorkingTasks (WaitingTaskID)
SELECT TOP 1 WaitingTaskID
FROM WaitingTasks
WHERE WaitingTaskID NOT IN (SELECT WaitingTaskID FROM WorkingTasks);
SELECT SCOPE_IDENTITY();
This will create a new "working task" with its own unique ID, SCOPE_IDENTITY() will return that unique ID to you.
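A runnable sketch of this claim-by-INSERT pattern in Python with SQLite, where LIMIT 1 plays the role of TOP 1 and `lastrowid` the role of SCOPE_IDENTITY(). One assumption added here: a UNIQUE constraint on WorkingTasks.WaitingTaskID, which closes the window where two concurrent picks could otherwise both select the same waiting task:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE WaitingTasks (WaitingTaskID INTEGER PRIMARY KEY AUTOINCREMENT,
                               payload TEXT);
    -- UNIQUE makes a double-claim impossible even if two picks race.
    CREATE TABLE WorkingTasks (WorkingTaskID INTEGER PRIMARY KEY AUTOINCREMENT,
                               WaitingTaskID INTEGER UNIQUE);
    INSERT INTO WaitingTasks (payload) VALUES ('a'), ('b');
""")

def pick_task():
    """Claim the next unclaimed task; return the new WorkingTaskID, or None."""
    cur = db.execute("""
        INSERT INTO WorkingTasks (WaitingTaskID)
        SELECT WaitingTaskID FROM WaitingTasks
        WHERE WaitingTaskID NOT IN (SELECT WaitingTaskID FROM WorkingTasks)
        ORDER BY WaitingTaskID LIMIT 1
    """)
    db.commit()
    return cur.lastrowid if cur.rowcount == 1 else None

print(pick_task())  # 1    - claimed waiting task 1
print(pick_task())  # 2    - claimed waiting task 2
print(pick_task())  # None - nothing left to claim
```

Each picker gets its own working-task ID back, so no two clients end up holding the same task.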
You can try using locks, but you could also solve it with an update/select:
BEGIN TRAN
UPDATE Tasks
SET clientId = @MyClientId
WHERE [PrimaryKey] = (
    SELECT TOP 1 [PrimaryKey]
    FROM Tasks
    WHERE clientId IS NULL
    ORDER BY CreationDateTime)
SELECT * FROM Tasks
WHERE clientId = @MyClientId
COMMIT TRAN
Something like this?
Or you could use the OUTPUT clause (note that UPDATE TOP(1) does not allow ORDER BY and OUTPUT ... INTO needs a table, so an updatable CTE and a table variable are used here):
DECLARE @TaskIds TABLE (TaskId INT);

WITH NextTask AS (
    SELECT TOP (1) *
    FROM Tasks
    WHERE clientId IS NULL
    ORDER BY CreationDateTime
)
UPDATE NextTask
SET clientId = @MyClientId
OUTPUT inserted.[PrimaryKey] INTO @TaskIds (TaskId);

SELECT TaskId FROM @TaskIds;
(not tested)
source: http://blog.sqlauthority.com/2007/10/01/sql-server-2005-output-clause-example-and-explanation-with-insert-update-delete/
Not quite following what you are doing here. Are you locking a record in the database table until the task is complete? If so, I don't think that's a good idea. How about instead adding a column to your task table that gives the status of a task? If it's checked out, set it to true; else set it to false. Alternately, you can add a column indicating who it's checked out to (their personID); if it's null, it's not checked out to anyone. Note that in this solution you have only one task table, not two (no separate one for completed tasks). I think that's a better (better normalized) solution.
Only display tasks to a user that haven't been assigned yet. If he picks one and someone beat him to signing up for that task before he has a chance to hit the update button, you can inform him upon the update that it's too late. Example: the WHERE clause in this SQL does the trick: update tasks set personID=233421 where personID is null. If the update comes back with zero records updated, you know what happened.
I don't think your database/program will have any problem supporting thousands of users, since only when more than one user updates the task table within the same few milliseconds is there a slight delay until the first update completes. Even then, for 1000 users, I doubt more than 10 or so will click update at the same time (a few milliseconds times 10 is still not much time).
Also, I think you should let the database generate a new task number rather than computing task = max_int + 1 programmatically. That way you don't have to lock the table to ensure it doesn't change before you update. On a side note: if you need to update more than one table per update, use a transaction; you don't need one for a single-table update.
Last note: one general solution for handling updates is to add a 'version' column (integer or long) to your tables and increment it each time someone updates the record. Then on each update, check whether the version number is the same as when the record was last read (SQL: update tasks set personID=233421 where version=4). If it's not the same, inform the user.
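The version-column idea in a minimal Python/SQLite sketch; the table layout is hypothetical, and the values 233421 and 4 echo the example above:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, person_id INTEGER, version INTEGER)")
db.execute("INSERT INTO tasks VALUES (1, NULL, 4)")

def assign(task_id, person_id, version_read):
    """Succeed only if nobody updated the row since we read version_read."""
    cur = db.execute(
        "UPDATE tasks SET person_id = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (person_id, task_id, version_read),
    )
    db.commit()
    return cur.rowcount == 1   # 0 rows => someone else bumped the version first

print(assign(1, 233421, 4))  # True  - version still matched, now bumped to 5
print(assign(1, 999999, 4))  # False - stale read, inform the user
```

This is classic optimistic concurrency: no locks are held between read and write, and the version check in the WHERE clause detects the conflict.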
I think your question is mainly about increasing concurrency. You already have a safe algorithm. Now you want scale.
Some obvious solutions have the problem of serializing access to the table by locking the same portion of it. Be careful here and test for that.
Task queueing is usually implemented with READPAST. This hint lets a query read past rows that are already locked, allowing one worker to dequeue the next task while previously dequeued tasks are still being marked as in-process.
You can achieve exclusive access to a certain row by updating it, or by using lock hints like XLOCK, ROWLOCK, HOLDLOCK.
There is a set of articles by Remus Rusanu on queue tables that answers just about everything. Feel free to ask follow-up questions in the comments below this post.

Pessimistic locking of record?

I am creating a WCF Web Service, for a Silverlight application, and I need to have a record to be Read/Write Locked when Modified.
I am using MySQL version 5.5.11.
To be more specific, I would like to prevent a request from reading data from a row while it is being modified.
The two SQL commands for UPDATE and SELECT are actually pretty simple, something like:
Update(should lock for write/read):
UPDATE user SET user = ..... WHERE id = .....
Select(should not be able to read when locked from the query above):
SELECT * FROM user WHERE id = .....
Here is what I tried, but it doesn't seem to work or lock anything at all:
START TRANSACTION;
SELECT user
FROM user
WHERE id = 'the user id'
FOR UPDATE;
UPDATE user
SET user = 'the user data'
WHERE id = 'the user id';
COMMIT;
How are you determining that it's not locking the record?
When a query is run over a table with locks on it, it will wait for the locks to be released or eventually timeout. Your update transaction would happen so fast that you'd never even be able to tell that it was locked.
The only way you'd be able to tell there was a problem is if you had a query that ran after your transaction started, but returned the original value for user instead of the updated value. Has that happened?
I would have just put this in a comment but it was too long, but I'll update this with a more complete answer based off your response.
MySQL's InnoDB uses multi-versioned concurrency control by default (a very good behavior, unlike SQL Server's default), so a plain SELECT reads a snapshot and is never blocked by your update. Use a locking read (SELECT ... LOCK IN SHARE MODE) to achieve what you want.

ThreadPool and GUI wait question

I am new to threads and in need of help. I have a data-entry app that takes an exorbitant amount of time to insert a new record (i.e. 50-75 seconds). My solution was to send the insert statement out via the ThreadPool, which returns a new record ID, and let the user begin entering the data for the record while that insert is running. My problem is that a user can hit Save before the new ID is returned from that insert.
I tried putting in a Boolean variable which gets set to true via an event from that thread when it is safe to save. I then put in:
while (safeToSave == false)
{
    Thread.Sleep(200);
}
I think that is a bad idea. If I run the save method before that thread returns, it gets stuck.
So my questions are:
Is there a better way of doing this?
What am I doing wrong here?
Thanks for any help.
Doug
Edit for more information:
It is doing an insert into a very large (approaching max size) FoxPro database. The file has about 200 fields and almost as many indexes on it.
And before you ask, no, I cannot change the structure of it, as it was here before I was and there is a ton of legacy code hitting it. The first problem is that in order to get a new ID I must first find the max(id) in the table, then increment and checksum it. That takes about 45 seconds. Then the first insert is simply an insert of that new ID and an enterdate field. This table is not / cannot be put into a DBC, so that rules out auto-generated IDs and the like.
#joshua.ewer
You have the process correct, and I think for the short term I will just disable the save button, but I will look into your idea of passing it into a queue. Do you have any references to MSMQ that I should take a look at?
1) Many :) - for example, you could disable the "save" button while the thread is inserting the object, or you can set up a worker thread that handles a queue of "save requests" (but I think the problem here is that the user wants to modify the newly created record, so disabling the button may be better).
2) I think we need some more code to be able to understand (or maybe it is a synchronization issue; I am not a big fan of threads either).
By the way, I just don't understand why an insert should take so long; I think you should check that code first! <- just as Charles stated before (sorry, didn't read the post) :)
Everyone else, including you, addressed the core problems (insert time, and why you're doing an insert, then an update), so I'll stick to the technical concerns with your proposed solution. So, if I have the flow right:
Thread 1: start data entry for the record
Thread 2: background call to the DB to retrieve the new ID
The Save button is always enabled; if the user tries to save before Thread 2 completes, you put #1 to sleep for 200 ms?
The simplest, not best, answer is to just have the button disabled, and have that thread make a callback to a delegate that enables the button. They can't start the update operation until you're sure things are set up appropriately.
Though I think a much better solution (it might be overblown if you're just building a Q&D front end to FoxPro) would be to throw those save operations into a queue. The user can key as quickly as possible, then the requests are put into something like MSMQ and complete in their own time, asynchronously.
Use a future rather than a raw ThreadPool action. Execute the future, allow the user to do whatever they want, when they hit Save on the 2nd record, request the value from the future. If the 1st insert finished already, you'll get the ID right away and the 2nd insert will be allowed to kick off. If you are still waiting on the 1st operation, the future will block until it is available, and then the 2nd operation can execute.
You're not saving any time unless the user is slower than the operation.
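The future-based flow above can be sketched with Python's concurrent.futures as a stand-in for the C# equivalent (e.g. Task<T>); `slow_insert` and the 0.2 s delay are placeholders for the real 50-75 s insert:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_insert():
    """Stand-in for the long-running INSERT that yields the new record ID."""
    time.sleep(0.2)
    return 1234

pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(slow_insert)   # kick off the insert in the background

# ... the user keys in the rest of the record here ...

# On Save: .result() returns immediately if the insert already finished,
# otherwise it blocks until the ID is available - no polling loop needed.
new_id = future.result()
print(new_id)  # 1234
pool.shutdown()
```

This replaces the sleep-and-check loop with a single blocking call that wakes up exactly when the ID arrives.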
First, you should probably find out, and fix, the reason why an insert is taking so long... 50-75 seconds is unreasonable for any modern database for a single row insert, and indicates that something else needs to be addressed, like indices, or blocking...
Secondly, why are you inserting the record before you have the data? Normally, data entry apps are coded so that the insert is not attempted until all the necessary data for the insert has been gathered from the user. Are you doing this because you are trying to get the new Id back from the database first, and then "update" the new empty record with the user-entered data later? If so, almost every database vendor has a mechanism where you can do the insert only once, without knowing the new ID, and have the database return the new ID as well... What vendor database are you using?
Is a solution like this possible:
Pre-calculate the unique IDs before a user even starts to add. Keep a list of unique IDs that are already in the table but are effectively placeholders. When a user starts an insert, reserve them one of the unique IDs; when the user presses Save, they replace the placeholder with their data.
PS: It's difficult to confirm this, but be aware of the following concurrency issue with what you are proposing (with or without threads): user A starts to add, user B starts to add, user A calculates ID 1234 as the max free ID, user B calculates ID 1234 as the max free ID. User A inserts ID 1234, user B inserts ID 1234 = boom!
