I have a table where the primary key is an incrementing int ID that I have to set manually. I know an auto-increment int (IDENTITY) would have been the best option, but I can't change the existing table design.
So I need to make the read-then-write operation atomic, something along the lines of:
Lock table
Read the MAX value of the existing IDs
Add new record with Primary Key = ID+1
Release table
What is the correct way to lock the table in a multiuser environment? I suppose it's a mix of transactions and the use of TABLOCKX. I need to ensure:
No deadlocks
If something fails, the table should not stay locked (for example, the program fails and exits while trying to write, and no COMMIT/ROLLBACK is ever called). I don't even know if this is possible.
NOTE: The database is also used by other applications, which I suppose take care of this problem themselves.
EDITED: Could this be considered atomic enough to be a solution?
INSERT INTO MYTABLE (ID, OtherFields...) VALUES ((Select Max(ID)+1 from MYTABLE), 'values'...)
Attempting to roll your own auto-increment mechanism using table locks is almost bound to fail; however, since you wrote that you can't change the existing table, I would suggest using a sequence to get the next number instead of locking the table.
CREATE SEQUENCE dbo.MySequence -- Don't use this name, please!
AS int -- note: default is bigInt
START WITH 1
INCREMENT BY 1
NO CYCLE;
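A new row can then be inserted using NEXT VALUE FOR, for instance (a minimal sketch, reusing MYTABLE and the column names from the question):

INSERT INTO MYTABLE (ID, OtherFields)
VALUES (NEXT VALUE FOR dbo.MySequence, 'values');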
This has some1 of the benefits of an identity column, without having to add an identity column to your table.
You can also use the sequence to generate a default value for a column (assuming adding a default constraint doesn't count as "changing the existing table structure", of course). See example D in the official documentation:
ALTER TABLE dbo.YourTableName
ADD CONSTRAINT YourTableName_id_default
DEFAULT NEXT VALUE FOR MySequence
FOR Id;
1 The benefits are that you don't need to take locks or calculate the next number yourself.
However, you should know that unlike an identity column, this doesn't protect you from updates to the id column, nor does it protect you from insert statements that explicitly insert a value to this column (without using next value for).
The first problem can be quite easily solved with an instead-of-update trigger on the table that will only update columns that aren't the id column, but I'm not sure how to solve the other problem.
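For the first problem, a minimal sketch of such a trigger (table and column names are assumptions; this variant rejects the update outright rather than silently skipping the id column):

CREATE TRIGGER YourTableName_protect_id
ON dbo.YourTableName
INSTEAD OF UPDATE
AS
BEGIN
    IF UPDATE(Id)
    BEGIN
        -- the Id column appears in the SET list: refuse the whole update
        RAISERROR('Updates to the Id column are not allowed.', 16, 1);
        RETURN;
    END;
    -- otherwise re-apply the update for the non-id columns only
    UPDATE t
    SET t.OtherColumn = i.OtherColumn -- repeat for every non-id column
    FROM dbo.YourTableName AS t
    INNER JOIN inserted AS i ON t.Id = i.Id;
END;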
So if the other processes are correctly handling the locking, you could do exactly what you mentioned (lock, get the last ID, insert and release) by executing something similar to the following:
DECLARE @MaxID INT

BEGIN TRY
    BEGIN TRANSACTION

    SELECT
        @MaxID = MAX(I.ID)
    FROM
        MyTable AS I WITH (TABLOCKX, HOLDLOCK) -- TABLOCKX: no other operations can be done; HOLDLOCK: hold the lock until the end of the transaction

    INSERT INTO MyTable (
        ID,
        OtherColumn)
    SELECT
        ID = ISNULL(@MaxID + 1, 1), -- falls back to 1 when the table is empty
        OtherColumn = 'Other values'

    COMMIT
END TRY
BEGIN CATCH
    -- Handle your error logging and roll back the transaction so the table locks are released; a basic example:
    DECLARE @ErrorMessage VARCHAR(MAX) = ERROR_MESSAGE()

    IF @@TRANCOUNT > 0
        ROLLBACK

    RAISERROR(@ErrorMessage, 16, 1)
END CATCH
However, you will still have to do additional work for batch inserts, or if you need the inserted ID to load other related tables.
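For instance, a batch insert run inside the same transaction (so the TABLOCKX still protects it) could reuse @MaxID and number the new rows with ROW_NUMBER(); a sketch, where SourceTable and SourceColumn are placeholders:

INSERT INTO MyTable (
    ID,
    OtherColumn)
SELECT
    ID = ISNULL(@MaxID, 0) + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
    OtherColumn = S.SourceColumn
FROM SourceTable AS S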
Also, TABLOCKX is pretty restrictive; there are other, less restrictive locks, but I believe they might leave you open to concurrency issues. You can check the other locking hints in the docs.
I want to get the new id (Identity) before inserting it, so I used this code:
select SCOPE_IDENTITY() AS NewId from tblName
but I get this:
1- Null
2- Null
COMPUTED COLUMN VERSION
You'll have to run this on the SQL server to add the column.
alter table TableName add Code as (name + cast(id as varchar(200)))
Now your result set will always have Code as the name + id value, which is nice because this column stays updated with that expression even if the fields it references (such as name) are changed.
Entity Framework Option (Less ideal)
You mentioned you are using Entity Framework. You want to concatenate the ID into a field within the same record during insert. There is no capacity in SQL (outside of triggers) or Entity Framework to do what you want in one step.
You need to do something like this:
var obj = new Thing { field1 = "some value", field2 = "" };
context.ThingTable.Add(obj);
context.SaveChanges();
obj.field2 = "bb" + obj.id; //after the first SaveChanges is when your id field would be populated
context.SaveChanges();
ORIGINAL Answer:
If you really must show this value to the user then the safe way to do it would be something like this:
begin tran
insert into test(test) values('this is something')
declare @pk int = scope_identity()
print @pk
You can now return the value in @pk and let the user decide whether it's acceptable. If it is, issue a COMMIT; otherwise issue a ROLLBACK.
This, however, is not a very good design, and I would consider it a misuse of how identity values are generated. Also, you should know that if you perform a rollback, the ID that would have been used is lost and won't be used again.
This is too verbose for a comment.
Consider how flawed this concept really is. The identity property is a running tally of the number of attempted inserts. You want to return to the user the identity of a row that does not yet exist. Consider what would happen if some values in the insert cause it to fail. You have already told the user what the identity would be, but the insert failed, so that identity has already been consumed. You should report the value to the user when the row actually exists, which is after the insert.
I can't understand why you want to show that identity to the user before the insert. I believe (as @SeanLange said) that this is neither customary nor useful, but if you insist, there are some fragile workarounds. One of them is:
1) Insert the new row, then get the ID with SCOPE_IDENTITY() and show it to the user.
2) Then, if you want to cancel the operation, delete the row and reset the identity (if necessary) with DBCC CHECKIDENT('[Table Name]', RESEED, [Identity Seed]).
The other way is to not use the Identity column and manage the id column yourself, though it must be clear that this approach can't work in concurrency scenarios.
I think perhaps you're confusing SQL Server's identity with an Oracle sequence.
They work completely differently.
With an Oracle sequence, you get the sequence value before you insert the record.
With a SQL Server identity, the last identity generated is available AFTER the insert via the SCOPE_IDENTITY() function.
If you really need to show the ID to the user before the insert, your best bet is to keep a counter in a separate table, read the current value, and increment it by one, as long as "gaps" in the numbers aren't a problem.
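A minimal sketch of that counter-table pattern (table and column names are assumptions); an UPDATE with an OUTPUT clause increments and returns the value in one atomic statement:

UPDATE dbo.KeyCounter
SET CurrentId = CurrentId + 1
OUTPUT inserted.CurrentId; -- the freshly incremented value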
I have created two threads in C# and I am calling two separate functions in parallel. Both functions read the last ID from the XYZ table and insert a new record with the value ID+1. Here the ID column is the primary key. When I execute both functions, I get a primary key violation error. Both functions run the query below:
insert into XYZ values((SELECT max(ID)+1 from XYZ),'Name')
It seems like both functions are reading the value at the same time and trying to insert the same value.
How can I solve this problem?
Let the database handle selecting the ID for you. It's obvious from your code above that what you really want is an auto-incrementing integer ID column, which the database can definitely handle doing for you. So set up your table properly and instead of your current insert statement, do this:
insert into XYZ values('Name')
If your database table is already set up, I believe you can issue a statement similar to the following (this is MySQL syntax; SQL Server does not support adding IDENTITY to an existing column this way):
alter table your_table modify column you_table_id int(size) auto_increment
Finally, if none of these solutions are adequate for whatever reason (including, as you indicated in the comments section, the inability to edit the table schema), then you can do as one of the other users suggested in the comments and create a synchronized method to find the next ID. You would basically just create a static method that returns an int, issue your select-id statement in that static method, and use the returned result to insert your next record into the table. Since this method would not guarantee a successful insert (due to external applications' ability to also insert into the same table), you would also have to catch exceptions and retry on failure.
Set the ID column to be an "Identity" column. Then you can execute your queries as:
insert into XYZ values('Name')
I think you can't use ALTER TABLE to change a column to be an Identity column after it has been created. Use Management Studio to set this column to be an Identity. If your table has many rows, this can be a long-running process, because it will actually copy your data to a new table (it will perform a table re-creation).
Most likely that option is disabled in your Management Studio. In order to enable it, open Tools->Options->Designers and uncheck the option "Prevent saving changes that require table re-creation". Depending on your table size, you will probably have to set the timeout, too. Your table will be locked during that time.
A solution for such problems is to generate the ID using some kind of sequence.
For example, in SQL Server you can create a sequence using the command below:
CREATE SEQUENCE Test.CountBy1
START WITH 1
INCREMENT BY 1 ;
GO
Then in C#, you can retrieve the next value from the sequence and assign it to the ID before inserting.
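The statement to execute from C# would be along these lines (a sketch, reusing the sequence created above):

SELECT NEXT VALUE FOR Test.CountBy1;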
It sounds like you want a higher transaction isolation level or more restrictive locking.
I don't use these features too often, so hopefully somebody will suggest an edit if I'm wrong, but you want one of these:
-- specify the strictest isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
insert into XYZ values((SELECT max(ID)+1 from XYZ),'Name')
or
-- make locks exclusive so other transactions cannot access the same rows
insert into XYZ values((SELECT max(ID)+1 from XYZ WITH (XLOCK)),'Name')
We have a table with a key field, and another table which contains the current value of that key sequence, i.e., to insert a new record you need to:
UPDATE seq SET key = key + 1
SELECT key FROM seq
INSERT INTO table (id...) VALUES (@key...)
Today I have been investigating collisions, and have found that without using transactions the above code, run in parallel, induces collisions. However, swapping the UPDATE and SELECT lines does not induce collisions, i.e.:
SELECT key + 1 FROM seq
UPDATE seq SET key = key + 1
INSERT INTO table (id...) VALUES (@key...)
Can anyone explain why? (I am not interested in better ways to do this, I am going to use transactions, and I cannot change the database design, I am just interested in why we observed what we did.)
I am running the two lines of SQL as a single string using C#'s SqlConnection, SqlCommand and SqlDataAdapter.
First off, your queries do not entirely make sense. Here's what I presume you are actually doing:
UPDATE seq SET key = key + 1
SELECT @key = key FROM seq
INSERT INTO table (id...) VALUES (@key...)
and
SELECT @key = key + 1 FROM seq
UPDATE seq SET key = @key
INSERT INTO table (id...) VALUES (@key...)
You're experiencing concurrency issues tied to the Transaction Isolation Level.
Transaction Isolation Levels represent a compromise between the need for concurrency (i.e. performance) and the need for data quality (i.e. accuracy).
By default, SQL Server uses the Read Committed isolation level, which means you can't get "dirty" reads (reads of data that has been modified by another transaction but not yet committed to the table). It does not, however, mean that you are immune from other types of concurrency issues.
In your case, the issue you are having is called a non-repeatable read.
In your first example, the first line is reading the key value, then updating it. (In order for the UPDATE to set the column to key+1 it must first read the value of key). Then the second line's SELECT is reading the key value again. In a Read Committed or Read Uncommitted isolation level, it is possible that another transaction meanwhile completes an update to the key field, meaning that line 2 will read it as key+2 instead of the expected key+1.
Now, with your second example, once the key value has been read and modified and placed in the #key variable, it is not being read again. This prevents the non-repeatable read issue, but you're still not totally immune from concurrency problems. What can happen in this scenario is a lost update, in which two or more transactions end up trying to update key to the same value, and subsequently inserting duplicate keys to the table.
To be absolutely certain of having no concurrency problems with this structure as designed, you will need to use locking hints to ensure that all reads and updates to key are serialized (i.e. not concurrent). This will have horrendous performance, but WITH (UPDLOCK, HOLDLOCK) will get you there.
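A sketch of what that looks like wrapped in a transaction (key and table are bracketed here because they read like reserved words):

BEGIN TRANSACTION
    DECLARE @key INT
    -- UPDLOCK + HOLDLOCK: take an update lock on the row and hold it until commit,
    -- so concurrent transactions reading seq queue up behind this one
    SELECT @key = [key] + 1 FROM seq WITH (UPDLOCK, HOLDLOCK)
    UPDATE seq SET [key] = @key
    INSERT INTO [table] (id) VALUES (@key)
COMMIT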
Your best solution, if you cannot change the database design, is to find someone who can. As Brian Hoover indicated, an auto-incrementing IDENTITY column is the way to do this with superb performance. The way you're doing it now reduces SQL's V-8 engine to one that is only allowed to fire on one cylinder.
When an INSERT into a table with an auto-incremented identity field fails, the identity is still incremented, thus producing gaps in the identity sequence. Is there a way to avoid this?
The only way to do this is to reseed the table or build your own identity generator. For example:
CREATE TABLE test(id INT IDENTITY, bla INT)
INSERT test VALUES(1)
INSERT test VALUES('b') --fails
DBCC CHECKIDENT(test,RESEED,1) --RESEED table
INSERT test VALUES(1)
SELECT * FROM test
DROP TABLE test
On a busy table you might get more inserts after that, and the reseed won't be correct anymore.
But why do you need this? Who cares if there are gaps?
Random idea from left field: Identify what's actually causing the failed inserts and validate against that, while of course realizing that due to the validations and the inserts not being atomic, you may still get failures (for things such as duplicates in unique fields).
It seems the identity gap is the symptom, the failure is the illness. Treat the illness.
Generally you want the key to have no meaning behind it other than identifying the record. That means if there are gaps in the numbering, it should not matter. If you are inserting a record into another related table, you can always use SCOPE_IDENTITY() in your stored proc/trigger, etc.
I have a legacy data table in SQL Server 2005 that has a PK with no identity/autoincrement, and I have no power to implement one.
As a result, I am forced to create new records in ASP.NET manually via the ol' "SELECT MAX(id) + 1 FROM table"-before-insert technique.
Obviously this creates a race condition on the ID in the event of simultaneous inserts.
What's the best way to gracefully resolve a race collision? I'm looking for VB.NET or C# code ideas along the lines of detecting a collision and then re-attempting the failed insert by getting yet another MAX(id) + 1. Can this be done?
Thoughts? Comments? Wisdom?
Thank you!
NOTE: What if I cannot change the database in any way?
Create an auxiliary table with an identity column. In a transaction, insert into the aux table, retrieve the value, and use it to insert into your legacy table. At that point you can even delete the row inserted into the aux table; the point is just to use it as a source of incremented values.
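A minimal sketch of that idea (the aux table name is an assumption, and LegacyTable stands in for your table):

CREATE TABLE dbo.IdSource (Id INT IDENTITY(1, 1) PRIMARY KEY)

BEGIN TRANSACTION
    INSERT INTO dbo.IdSource DEFAULT VALUES -- the identity hands out the next number
    DECLARE @NewId INT
    SET @NewId = SCOPE_IDENTITY()
    INSERT INTO LegacyTable (id /* , other columns... */) VALUES (@NewId)
    DELETE FROM dbo.IdSource WHERE Id = @NewId -- optional: keep the aux table small
COMMIT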
Not being able to change database schema is harsh.
If you insert an existing PK into the table, you will get a SqlException with a message indicating a PK constraint violation. Catch this exception and retry the insert a few times until you succeed. If you find that the collision rate is too high, you may try max(id) + <small-random-int> instead of max(id) + 1. Note that with this approach your ids will have gaps and the id space will be exhausted sooner.
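The retry idea, sketched in T-SQL for clarity (error 2627 is a primary key violation; table and column names are assumptions, and the equivalent C# loop would catch SqlException instead):

DECLARE @tries INT, @msg NVARCHAR(2048)
SET @tries = 0
WHILE @tries < 5
BEGIN
    BEGIN TRY
        INSERT INTO LegacyTable (id, name)
        SELECT ISNULL(MAX(id), 0) + 1, 'some value' FROM LegacyTable
        BREAK -- success, stop retrying
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() <> 2627 -- 2627 = PK violation; re-raise anything else
        BEGIN
            SET @msg = ERROR_MESSAGE()
            RAISERROR(@msg, 16, 1)
            BREAK
        END
        SET @tries = @tries + 1 -- collision: loop and try a fresh MAX(id)
    END CATCH
END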
Another possible approach is to emulate an autoincrementing id outside of the database. For instance, create a static integer, Interlocked.Increment it every time you need the next id, and use the returned value. The tricky part is to initialize this static counter to a good value. I would do it with Interlocked.CompareExchange:
class Autoincrement {
    static int id = -1;

    public static int NextId() {
        if (id == -1) {
            // not initialized - initialize
            int lastId = <select max(id) from db>;
            // only swap the initial value in if no other thread beat us to it
            Interlocked.CompareExchange(ref id, lastId, -1);
        }
        // get next id atomically
        return Interlocked.Increment(ref id);
    }
}
Obviously the latter works only if all inserted ids are obtained via Autoincrement.NextId within a single process.
The key is to do it in one statement or one transaction.
Can you do this?
INSERT table (PKcol, col2, col3, ...)
SELECT (SELECT MAX(id) + 1 FROM table WITH (HOLDLOCK, UPDLOCK)), @val2, @val3, ...
Without testing, this will probably work too:
INSERT table (PKcol, col2, col3, ...)
VALUES ((SELECT MAX(id) + 1 FROM table WITH (HOLDLOCK, UPDLOCK)), @val2, @val3, ...)
If you can't, another way is to do it in a trigger.
The trigger is part of the INSERT transaction
Use HOLDLOCK, UPDLOCK for the MAX. This holds the row lock until commit
The row being updated is locked for the duration
A second insert will wait until the first completes.
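A sketch of that trigger route (table and column names are assumptions): an INSTEAD OF INSERT trigger computes the key under HOLDLOCK/UPDLOCK, so concurrent inserts queue up behind each other:

CREATE TRIGGER table_assign_id
ON [table]
INSTEAD OF INSERT
AS
BEGIN
    INSERT INTO [table] (id, col2)
    SELECT (SELECT ISNULL(MAX(t.id), 0)
            FROM [table] AS t WITH (HOLDLOCK, UPDLOCK))
           + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), -- numbers multi-row inserts too
           i.col2
    FROM inserted AS i;
END;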
The downside is that you are changing the primary key.
An auxiliary table needs to be part of a transaction.
Or change the schema as suggested...
Note: All you need is a source of ever-increasing integers. It doesn't have to come from the same database, or even from a database at all.
Personally, I would use SQL Express because it is free and easy.
If you have a single web server:
Create a SQL Express database on the web server with a single table [ids] with a single autoincrementing field [new_id]. Insert a record into this [ids] table, get the [new_id], and pass that onto your database layer as the PK of the table in question.
If you have multiple web servers:
It's a pain to set up, but you can use the same trick by setting an appropriate seed/increment (i.e. increment = 3, and seed = 1/2/3 for three web servers).
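A sketch of that seed/increment trick, one [ids] table per server (following the single-server example above):

-- Server 1 hands out 1, 4, 7, ...
CREATE TABLE ids (new_id INT IDENTITY(1, 3) PRIMARY KEY)
-- Server 2 hands out 2, 5, 8, ...
CREATE TABLE ids (new_id INT IDENTITY(2, 3) PRIMARY KEY)
-- Server 3 hands out 3, 6, 9, ...
CREATE TABLE ids (new_id INT IDENTITY(3, 3) PRIMARY KEY)

The three streams can never collide.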
What about running the whole batch (the select for the id and the insert) in a serializable transaction?
That should get you around needing to make changes in the database.
Is the main concern concurrent access? I mean, will multiple instances of your app (or, God forbid, other apps outside your control) be performing inserts concurrently?
If not, you can probably manage the inserts through a central, synchronized module in your app, and avoid race conditions entirely.
If so, well... like Joel said, change the database. I know you can't, but the problem is as old as the hills, and it's been solved well -- at the database level. If you want to fix it yourself, you're just going to have to loop (insert, check for collisions, delete) over and over and over again. The fundamental problem is that you can't perform a transaction (I don't mean that in the SQL "TRANSACTION" sense, but in the larger data-theory sense) if you don't have support from the database.
The only further thought I have is that if you at least have control over who has access to the database (e.g., only "authorized" apps, either written or approved by you), you could implement a side-band mutex of sorts, where a "talking stick" is shared by all the apps and ownership of the mutex is required to do an insert. That would be its own hairy ball of wax, though, as you'd have to figure out policy for dead clients, where it's hosted, configuration issues, etc. And of course a "rogue" client could do inserts without the talking stick and hose the whole setup.
The best solution is to change the database. You may not be able to change the column to be an identity column, but you should be able to make sure there's a unique constraint on the column and add a new identity column seeded with your existing PKs. Then either use the new column instead, or use a trigger to make the old column mirror the new, or both.