In the table, there's an identity column incrementing by one each time we add a row (the lowest value is 1). Now, there's a new requirement - we need to implement soft deletion.
So, my idea was basically to multiply the ID of any soft-deleted row by -1. That ensures uniqueness and clearly draws a line between active and soft-deleted items.
update Things set Id = -101
where Id = 101
Wouldn't you know it, the stupid computer doesn't let me do that. The suggested work-around is to:
alter the column (drop the identity property)
perform the update
alter the column back (restore the identity property)
and to me it seems like a Q&D hack. However, the only alternative I can see is to add a new column carrying the deletion status.
I'm using EF to perform the work, with the extra quirk that when I changed the value of the identity and stored it, the software was kind enough to think for me and actually created a new row (with an incrementally set identity that was neither the original one nor its negative).
How should I tackle this issue?
I would strongly discourage you from overloading your identity column with any additional meaning. Someone who will look at your database table for the first time has no way of knowing that a negative ID means "deleted".
Introducing a new column Deleted BIT NOT NULL DEFAULT 0 does not have this disadvantage. It is self-explanatory. And it costs almost nothing: in the times of Big Data, an additional BIT column isn't going to fill your hard disk.
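For example, a minimal sketch against the table from your snippet (assuming it is dbo.Things):

-- add the flag once; existing rows default to "not deleted"
ALTER TABLE dbo.Things ADD Deleted BIT NOT NULL DEFAULT 0;

-- soft-delete a row without touching the identity
UPDATE dbo.Things SET Deleted = 1 WHERE Id = 101;

-- active rows are then simply
SELECT * FROM dbo.Things WHERE Deleted = 0;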
All of that being said, if you still want to do that, you could try to SET IDENTITY_INSERT dbo.Things ON before you UPDATE. (I cannot currently verify whether that would work, though.)
The more I read on this, the more confused I get, so I hope someone can help. I have a complex database setup, which sometimes produces the error on update:
"Concurrency violation: the UpdateCommand affected 0 of the expected 1 records"
I say sometimes, because I cannot recreate the conditions to trigger it consistently. I have a remote MySQL database connected to my app through the DataSource Wizard, which produces the dataset, tables and linked DataTableAdapters.
My reading suggests that this error is meant to occur when there is more than one open connection to the database trying to update the same record? This shouldn't happen in my instance, as the only updates are sequential from my app.
I am wondering whether it has something to do with running the update from a background worker? I have my table updates in one, for example, thusly:
Gi_gamethemeTableAdapter.Update(dbDS.gi_gametheme)
Gi_gameplaystyleTableAdapter.Update(dbDS.gi_gameplaystyle)
Gi_gameTableAdapter.Update(dbDS.gi_game)
These run serially in the BackgroundWorker, however, so I'm unsure about this. The main thread also waits for it to finish, and there are no other db operations going on before or after this is started.
I did read about going into the dataset designer view, choosing "configure" in the datatableadapter > advanced options and setting "Use optimistic concurrency" to false. This might have worked (hard to say because of the seemingly random nature of the error), however, there are drawbacks to this that I want to avoid:
I have around 60 tables. I don't want to do this for each one.
I sometimes have to re-import the mysql schema into the dataset designer, or delete a table and re-add it. This would obviously lose this setting and I would have to remember to do it on all of them again, potentially. I also can't find a way to do this automatically in code.
I'm afraid I'm not at code level in terms of the database updates etc, relying on the Visual Studio wizards. It's a bit late to change the stack as well (e.g. can't change to Entity Framework etc).
So my question is:
What is causing the error, or how can I find out?
What can I do about it?
thanks
When you have tableadapters that download data into datatables, they can be configured for optimistic concurrency.
This means that for a table like:
Person
ID Name
1 John
They might generate an UPDATE query like:
UPDATE Person SET Name = @newName WHERE ID = @oldID AND Name = @oldName
(In reality they are more complex than this but this will suffice)
DataTables track original values and current values; you download 1/"John" and then change the name to "Jane"; you (or the tableadapter) can ask the DataTable what the original value was and it will say "John"
The datatable can also feed this value into the UPDATE query and that's how we detect "if something else changed the row in the time we had it" i.e. a concurrency violation
Row was "John" when we downloaded it, we edited to "Jane", and went to save.. But someone else had been in and changed it to "Joe". Our update will fail because Name is no longer "John" that it was (and we still think it is) when we downloaded it. By dint of the tableadapter having an update query that said AND Name = #oldName, and setting #oldName parameter to the original value somedatarow["Name", DataRowVersion.Original].Value (i.e. "John") we cause the update to fail. This is a useful thing; mostly they will succeed so we can opportunistically hope our users can update our db without needing to get into locking rows while they have them open in some UI
Resolving the cases where it doesn't work is usually a case of coding up some strategy:
My changes win - don't use an optimistic query that features old values, just UPDATE and erase their changes (see the sketch after this list)
Their changes win - cancel your attempts
Re-download the latest DB state and choose what to do - auto merge it somehow (maybe the other person changed fields you didn't), or show the user so they can pick and choose what to keep etc (if both people edited the same fields)
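To make the first two strategies concrete, the only difference is whether the WHERE clause carries the old values (a sketch against the hypothetical Person table above):

-- "my changes win": no old-value check, the last writer simply wins
UPDATE Person SET Name = @newName WHERE ID = @id;

-- optimistic check: succeeds only if nobody changed the row underneath us
UPDATE Person SET Name = @newName WHERE ID = @id AND Name = @oldName;
-- 0 rows affected here is the concurrency violation: give up, or re-read and merge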
Now you're probably sitting there saying "but no one else changes my DB" - we can still get this, though, if the database changed some values upon one save and you don't have the latest ones in your dataset.
There's another option in the tableadapter wizard - "refresh the dataset" - it's supposed to run a select after a modification to import any latest database-calculated values (like auto-inc primary keys or triggers/defaults/etc). A query like INSERT INTO Person(Name) VALUES(@name) is supposed to silently have a SELECT * FROM PERSON WHERE ID = last_inserted_id() tagged on the end of it to retrieve the latest values.
Except "refresh the dataset" doesn't work :/
So, while I can't tell you exactly why you're getting your CV exception, I hope that explaining why they occur, and pointing out that there are sometimes bugs that cause them (insert new record, calculated ID is not retrieved, edit this recent record, update fails because data wasn't fresh), will arm you with what you need to find the problem. When you get one, keep the app stopped on the breakpoint and inspect the datarow: take a look at the query being run and what original/current values are being put in as parameters, inspect the original and current values held by the row (using the overload of the Item indexer that allows you to state the version you want), and look in the DB
Somewhere in all of that there will be the mismatch that explains why 0 records were updated - the db has "Joe" as the name or 174354325 as the ID, your datarow has "John" as the original name or -1 as the ID (it never refreshed), and the WHERE clause is finding 0 records as a result
Some of your tables will contain a field that is marked as a [ConcurrencyCheck] or [Timestamp] concurrency token.
When you update a record, the SQL generated will include a WHERE [ConcurrencyField]='Whatever the value was when the record was retrieved'.
If that record was updated by another thread or process or something other than the current thread, then your UPDATE will return 0 records updated, rather than the 1 (or more) that was expected.
What can you do about it? Firstly, put a try/catch(DbConcurrencyException) around your code. Then you can re-read the offending record and try and update it again.
I have a table that contains a non-primary-key column, RequestID. When I do a BulkInsert, all the records must have the same RequestID. But if I do another BulkInsert, the next inserted rows must have the RequestID incremented:
NewRequestID = PreviousRequestID + 1
The only solution I have found so far (and I don't like it, by the way) is to get the last record every time before inserting the new records.
Why don't I like this approach? Because the database is supposed to be relational, which means there is "no specific order". Besides, I don't have primary keys or dates to order with.
What is the best way to implement this?
(I've added the C# tag because I am using EF, in case there is an easy solution with EF.)
You could take a number of different approaches:
Are you guaranteed that your RequestIDs are always incremented? If so, you could query the table for the largest RequestID, which should represent the "last one inserted."
You could track state somewhere in your application, but this is likely dangerous in scenarios where service fails/restarts (unless state is tracked externally).
Assuming you have control over the schema, if you don't want to update the particular table schema you are speaking of, you could create another table to track the last RequestID used, and retrieve it from there (which would protect you against service restarts/failures); see the sketch after this list.
Those are a few that come to mind.
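For the third option, a minimal sketch (the table and column names are my own, not from your schema):

-- one-row table holding the last RequestID handed out
CREATE TABLE dbo.RequestIdTracker (LastRequestId INT NOT NULL);
INSERT INTO dbo.RequestIdTracker (LastRequestId) VALUES (0);

-- atomically increment and read the next value before each bulk insert
DECLARE @NextRequestId INT;
UPDATE dbo.RequestIdTracker
SET @NextRequestId = LastRequestId = LastRequestId + 1;
-- use @NextRequestId for every row in the new batch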
UPDATE:
Assuming RequestID isn't a particular type of identifier, you could use a timestamp, which will always increase when you do a new batch; however, I'm not sure whether you need it to always be incremented by exactly 1, which would preclude this approach.
I'm doing EF design; could someone tell me what StoreGeneratedPattern means?
I can't find an easy, straight answer online.
If you look at the enumeration of the same name, it tells you what should be done when you insert or update rows:
None: No auto generated value is generated
Identity: A new value is generated on insert, but not changed on update
Computed: A new value is generated on insert and update
The other answers are also not an easy, straight answer; they just point to or repeat the same arcane documentation that the OP is referring to.
This attribute is used when the column is computed by the database. So on inserts and updates, the value will not be written.
The value will be read back from the database after inserts and updates, though I would guess that if set to Identity, EF may not read the value after an update since it won't have changed. Whether it really makes that tiny optimisation I don't know.
An example might be an identity column or a last updated time-stamp.
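As a rough illustration (the table and column names here are made up), the database side of such columns might look like this; Id would typically map to StoreGeneratedPattern.Identity and RowVer to StoreGeneratedPattern.Computed:

CREATE TABLE dbo.Widget
(
    Id     INT IDENTITY(1,1) PRIMARY KEY,  -- generated on insert, never changes afterwards
    Name   NVARCHAR(100) NOT NULL,
    RowVer ROWVERSION                      -- regenerated by the database on every insert and update
);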
I need to be able to change the primary keys in a table. The problem is, some of the keys will be changing to existing key values, e.g. record1.ID 3=>4 and record2.ID 4=>5. I need to keep these as primary keys, as they are set as foreign keys (which cascade on update). Is there a reasonable way to accomplish this, or am I attempting SQL heresy?
As for the why, I have data from one set of tables linked by this primary key that are getting inserted/updated into another set of similarly structured tables. The insertion is in parts, as it is part of a deduping process, and if I could simply update all of the tables that are to be inserted with the new primary key, life would be easier.
One solution is to start the indexing on the destination table higher than the incoming table's row count will ever reach (the incoming table gets re-indexed every time), but I'd still like to know whether it is possible to do the above otherwise.
TIA
You are attempting SQL heresy. I'm actually pretty open-minded and know that a lot of times one must do things that seem crazy. It annoys me when people arrogantly answer with "you should do that differently" when they have no idea what the situation is. However, I must tell you that you should do this differently. Heh heh.
No, there is no way to do this elegantly with SQL/DataAdapter. You could do it through ADO.NET with a series of T-SQL commands. You have to, every time, turn on identity-overwrite mode (SET IDENTITY_INSERT theTable ON), run your query where all the values on that table are incremented by one, and then turn off identity-overwrite mode. But then you would need to increment all the other tables that use this as a foreign key. But wait, it gets worse:
You would need to do all this in a transaction, because you cannot have anything else happening to these tables during this time, and because if there were a failure you would most definitely need to roll back. This could be a good-sized chunk of processing; your tables would be locked for a good bit.
If you have any foreign key constraints between these tables, you would need to turn them off before you do this, and re-implement them afterwards.
If you find yourself starting to think about updating primary key values, alarm bells should start ringing.
It may seem easier, but I'd class it as more of a hack than a solution. Personally, I'd be having a rethink and try to address the real problem - may seem harder now, but it will be much better to maintain and reduce potential horrible issues down the line.
We have a large database with enquiries; each enquiry is referenced using a Guid. The Guid isn't very customer friendly, so we want to add an additional 5-digit "human id" (OK, as we very likely won't have more than 99999 enquiries active at any time, and it's OK if a human id references multiple enquiries, as they aren't used for anything important).
1) Is there any way to have an IDENTITY column reset to 1 after 99999?
My current workaround is to use an INT IDENTITY(1,1) NOT NULL column and, when presenting a HumanId, take HumanId % 100000.
2) Is there any way to automatically "randomly distribute" the ids over [0..99999] so that two enquiries created after each other don't get adjacent ids? I guess I'm looking for a two-way, one-to-one hash function??
... Ideally I'd like to do this in T-SQL, automatically creating these ids when an enquiry is created.
If performance and concurrency aren't too much of an issue, you can use triggers and the MAX() function to calculate a 'next human ID' value. You would probably want to keep your IDENTITY column as is, and have the 'human ID' in a separate column.
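A rough sketch of that idea (the table, column, and trigger names are assumptions, this only handles single-row inserts, and it serialises inserts while the trigger runs):

CREATE TRIGGER trg_Enquiry_AssignHumanId
ON dbo.Enquiry
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @NextHumanId INT;

    -- lock the table while computing the next value so two inserts can't pick the same one
    SELECT @NextHumanId = (ISNULL(MAX(HumanId), 0) % 99999) + 1
    FROM dbo.Enquiry WITH (TABLOCKX, HOLDLOCK);

    UPDATE e
    SET e.HumanId = @NextHumanId
    FROM dbo.Enquiry AS e
    JOIN inserted AS i ON i.EnquiryGuid = e.EnquiryGuid;  -- join on whatever key identifies the new row
END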
EDIT: On a side note, this sounds like a 'presentation layer' issue, which shouldn't be in your database. Your presentation layer of your application should have the code to worry about presenting a record in a human readable manner. Just a thought...
If you absolutely need to do this in the database, then why not derive your human-friendly value directly from the GUID column?
-- human_id doesn't have to be calculated when you retrieve the data
-- you could create a computed column on the table itself if you prefer
SELECT (CAST(your_guid_column AS BINARY(3)) % 100000) AS human_id
FROM your_table
This will give you a random-ish value between 0 and 99999, derived from the first 3 bytes of the GUID. If you want a larger, or smaller, range then adjust the divisor accordingly.
I would strongly recommend taking another look at your logic. Your approach has a few dangers, including:
It is always a bad idea to re-use IDs, even if the original record has become "obsolete" - do you lose anything by continuing to grow IDs beyond 99999? The problem here is more likely to be with long-term maintenance, especially if there is any danger of the system developing over time. Another thing to consider: is there any chance a user will take this reference number and use it to reference your system at some stage in the future?
With manually assigning a generated or random ID, you will need to ensure that multiple records are not assigned the same ID. There are a few options for handling this (for example, using transactions); however, you should ensure that the scope of the transactions does not leave you open to problems with concurrent transactions being blocked - this may cause issues, e.g. performance. You may be best served by generating your ID externally (as SQL does not do random especially well) and then enforcing a unique constraint on your DB, perhaps in the way suggested by Firoz Ansari.
If you still want to reset the identity column, this can be done with the DBCC CHECKIDENT command.
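For example (assuming the table is dbo.Enquiry; substitute your own table name):

-- the next identity value generated will be 1
DBCC CHECKIDENT ('dbo.Enquiry', RESEED, 0);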
An example of generating random seeds in SQL server can be found here:
http://weblogs.sqlteam.com/jeffs/archive/2004/11/22/2927.aspx
You can create a composite primary key with two columns, say BatchId and HumanId.
Records in these columns will look like this:
BatchId, HumanId
1, 1
1, 2
1, 3
.
.
1, 99998
1, 99999
2, 1
2, 2
2, 3
Use MAX or ORDER BY ... DESC to get the next available HumanId for the latest BatchId:
DECLARE @NextHumanId INT;

SELECT TOP 1 @NextHumanId = HumanId + 1
FROM [THAT_TABLE]
ORDER BY BatchId DESC, HumanId DESC;

IF @NextHumanId > 99999 SET @NextHumanId = 1;
Hope this helps.
You could have a table of available HumanIds; each time you add an enquiry you could randomly pull a HumanId from the table (and DELETE it), and each time you delete the enquiry you could add it back (by INSERTing).
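A rough sketch of that approach (the pool table name and the ORDER BY NEWID() random pick are my assumptions):

-- pool pre-filled with the values 1..99999
CREATE TABLE dbo.AvailableHumanId (HumanId INT NOT NULL PRIMARY KEY);

-- grab a random id and remove it from the pool in one atomic statement
DECLARE @Picked TABLE (HumanId INT);

WITH pick AS
(
    SELECT TOP (1) HumanId
    FROM dbo.AvailableHumanId WITH (UPDLOCK, READPAST, ROWLOCK)
    ORDER BY NEWID()
)
DELETE FROM pick
OUTPUT DELETED.HumanId INTO @Picked;

SELECT HumanId FROM @Picked;  -- assign this value to the new enquiry

-- when the enquiry is deleted, return the id to the pool:
-- INSERT INTO dbo.AvailableHumanId (HumanId) VALUES (@ReleasedHumanId);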